From non-RSS XML to PHP to HTML

November 20, 2007

What could be easier than having some info on, oh, let’s say… interesting travels, dropped into an RSS feed so anyone can include it in their website? For example to, oh, let’s say… get into an affiliate program. It could be easier, but it’s not. Today I got an XML file that had to be turned into some HTML for use on a travel website. I was assuming it was plain RSS, so scripting something together within a few minutes would be no trouble at all. But upon checking the actual site URL I found that they’ve made things harder. Who needs standards? Just cook up yer own XML document!

So now I’m in the middle of getting this thing to work. I assumed the following steps were all it took:

  1. Download the file;
  2. Parse the XML file into a multidimensional array structure;
  3. Take every productitem and convert it to an object;
  4. And drop every item into a nicely formatted HTML layout…
  5. And perhaps introduce some caching of the XML files (since they do not change as often as a regular RSS feed).

First I encountered a problem with getting the file to download (1). The PHP fopen() wrapper for remote URLs (the allow_url_fopen setting) was disabled on my shared host, which is understandable and very common with shared hosting services. Fortunately, I was allowed to use the cURL library in PHP to retrieve the files. I coded the download, made a local file for caching purposes, and set up the XML parsing (2).
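The download-and-cache step could look roughly like the sketch below. It is not the code from the actual script: the function names (isCacheFresh, downloadFeed, getFeed), the one-hour default cache lifetime and the 30-second timeout are all assumptions made for illustration, and it is written for PHP 7.1 or later. Only the curl_* calls are the real cURL extension API.

```php
<?php
// Hypothetical helper names; only the curl_* functions come from PHP itself.

/** True when the cached copy exists and is younger than $maxAge seconds. */
function isCacheFresh(string $cacheFile, int $maxAge): bool
{
    return is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge;
}

/** Download $url with cURL and return the body, or null on failure. */
function downloadFeed(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);          // assumed limit: give up after 30 seconds
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released when $ch goes out of scope.

    return ($body !== false && $status === 200) ? $body : null;
}

/** Return the feed from the local cache when fresh, otherwise download it. */
function getFeed(string $url, string $cacheFile, int $maxAge = 3600): ?string
{
    if (isCacheFresh($cacheFile, $maxAge)) {
        $cached = file_get_contents($cacheFile);
        if ($cached !== false) {
            return $cached;
        }
    }

    $body = downloadFeed($url);
    if ($body !== null) {
        file_put_contents($cacheFile, $body, LOCK_EX); // refresh the cache
        return $body;
    }

    // Download failed: fall back to a stale copy if one exists.
    $stale = is_file($cacheFile) ? file_get_contents($cacheFile) : false;
    return $stale === false ? null : $stale;
}
```

Falling back to a stale copy when the download fails is a design choice of this sketch: for feeds that rarely change, showing yesterday’s data is usually better than showing nothing.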

I wanted to parse the XML into a multidimensional array, as I find XML handling a little easier using arrays. I used a PHP class (clsParseXML.php by Eric Rosebrock from www.phpfreaks.com) to get this to work. Then I added a class productItem to map the XML structure into an object structure. And that should have done the trick.
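To give an idea of steps 2 to 4, here is a minimal sketch. It swaps in PHP’s built-in SimpleXML for clsParseXML.php, so it is not the setup described above, just a comparable one. Only the element name productitem comes from the feed; the root element products, the child elements title, price and url, and the names ProductItem, parseFeedToArray and renderItems are made up for this example.

```php
<?php
// Assumed feed layout: <products><productitem>...</productitem></products>.
// The child elements title, price and url are hypothetical.

class ProductItem
{
    public $title;
    public $price;
    public $url;

    public function __construct(array $data)
    {
        $this->title = $data['title'] ?? '';
        $this->price = $data['price'] ?? '';
        $this->url   = $data['url'] ?? '';
    }

    /** Render the item as an HTML list entry, escaping every feed value. */
    public function toHtml(): string
    {
        return sprintf(
            '<li><a href="%s">%s</a> (%s)</li>',
            htmlspecialchars($this->url, ENT_QUOTES, 'UTF-8'),
            htmlspecialchars($this->title, ENT_QUOTES, 'UTF-8'),
            htmlspecialchars($this->price, ENT_QUOTES, 'UTF-8')
        );
    }
}

/** Parse the feed into a list of plain arrays, one per <productitem>. */
function parseFeedToArray(string $xml): array
{
    libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
    $root = simplexml_load_string($xml);
    if ($root === false) {
        return []; // not well-formed XML
    }

    $items = [];
    foreach ($root->productitem as $node) {
        $row = [];
        foreach ($node->children() as $child) {
            $row[$child->getName()] = trim((string) $child);
        }
        $items[] = $row;
    }
    return $items;
}

/** Wrap all items in an unordered list. */
function renderItems(array $items): string
{
    $html = '';
    foreach ($items as $item) {
        $html .= $item->toHtml() . "\n";
    }
    return "<ul>\n" . $html . "</ul>\n";
}

// Usage with a tiny inline feed.
$sampleXml = '<products><productitem><title>Rome &amp; Florence</title>'
    . '<price>499</price><url>http://example.com/rome?a=1&amp;b=2</url>'
    . '</productitem></products>';

$objects = array_map(function (array $row) {
    return new ProductItem($row);
}, parseFeedToArray($sampleXml));

echo renderItems($objects);
```

Escaping with htmlspecialchars() at output time matters here because the feed comes from a third party and its values end up directly in the page.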

Of course there were some setbacks. I had developed the steps above and tested them with only one feed. After putting the whole thing live, I found out that some feeds are simply huge. Two of them were just over 3 MB in size. Downloading them to my site was no problem, but the PHP XML parsing failed miserably due to lack of memory. I tried raising the memory limit to 64 MB, but to no avail. Apparently PHP isn’t too happy about parsing large XML files. So I wrote a little search-and-replace piece of code to get rid of any unwanted material, thereby reducing the file size… It’s still not performing as well as I’d like, but I am getting there. All feeds I need to access can now be downloaded, cached, parsed and displayed the way I want.
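A search-and-replace pass of that kind might look something like this. It is a rough sketch rather than the actual code: the function name stripElements and the element names in the usage example are hypothetical. The regular expressions do not handle nested elements of the same name or attribute values containing ">", which is usually acceptable for a flat product feed.

```php
<?php
/**
 * Remove every element with one of the given names from an XML string,
 * in both the self-closing form <name/> and the paired form <name>...</name>.
 */
function stripElements(string $xml, array $names): string
{
    foreach ($names as $name) {
        $tag = preg_quote($name, '~');
        $patterns = [
            // Self-closing form first, so it is never mistaken for an opening tag.
            '~<' . $tag . '(\s[^>]*)?/>~',
            // Paired form: "s" lets "." match newlines, the lazy ".*?" stops
            // at the first closing tag.
            '~<' . $tag . '(\s[^>]*)?>.*?</' . $tag . '>~s',
        ];
        foreach ($patterns as $pattern) {
            $result = preg_replace($pattern, '', $xml);
            if ($result !== null) { // null signals a PCRE error; keep the input then
                $xml = $result;
            }
        }
    }
    return $xml;
}

// Usage: drop bulky fields that the HTML output does not need.
$feed = '<products><productitem><title>Rome</title>'
    . '<description lang="en">A very long text...</description>'
    . '<image/><price>499</price></productitem></products>';

echo stripElements($feed, ['description', 'image']);
```

The stripped string can then be handed to the XML parser, which has far less to hold in memory.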

I’m cleaning up the code now to get it online here.

It would be nice to develop this script a bit further to make it useful for other product feeds.

Entry filed under: scriptStuff.
