OpenFlights

News from OpenFlights, the site for flight logging, mapping, stats and sharing

Improvements to FlightMemory imports


I’ll be frank: the bit of code for importing flights from our friends at FlightMemory to OpenFlights has long been riddled with bugs, and that’s why we’ve at long last thrown out the bulk of the plumbing and rebuilt it with shiny new pipe. This has already fixed a number of bugs (most notably, accented characters getting lost), but there may be loose fittings somewhere, so please let us know ASAP if something’s leaking on the floor. The next roaches in the queue to be swatted are two known bugs: one where the importer can’t handle ICAO codes for airlines, and one involving duplicates in the database. And if you’d be really keen on one-step imports (that is, give the site your FM password and it’ll slurp up all your flights), now would be a good time to say so!

Encrypted transmission for ninjas: PHP being woefully short on decent DOM handling libraries (my kingdom for, say, Ruby’s Nokogiri or Python’s Beautiful Soup!), the screen-scraping for FlightMemory pages was originally implemented using PHP Simple HTML DOM Parser. Simple it may be, and the documentation is commendable, but it also turned out to be riddled with critical bugs and completely unmaintained, the maintainer apparently having gone AWOL a good two years ago. So we ripped it out and replaced it with phpQuery, which is exceedingly powerful (it’s a straight-up PHP port of jQuery), but has some of the worst documentation known to man. Here’s a typical extract from the API docs:

    method appendPHP [line 1915]
    phpQueryObject|QueryTemplatesSource|QueryTemplatesParse|QueryTemplatesSourceQuery|QueryTemplatesPhpQuery appendPHP( $content)
    Enter description here...
    Tags:
      access: public
    Parameters:
      $content

Useful, isn’t it? (I still don’t have the faintest clue what that particular function does.) But with the far superior jQuery docs, some uninspired guesswork and lashings of raw DOM handling (shudder) we got the imports working again, and the code, while still in dire need of more TLC, is considerably better than it used to be. And now that we have the beginnings of decent test coverage, it’s possible to poke around without breaking too much in the process! I’m actually looking forward to finishing this off.
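For anyone wondering what "screen-scraping" boils down to, here is a minimal sketch of the idea in Python, using only the standard library's html.parser. To be clear, this is not the OpenFlights importer (that is PHP on top of phpQuery), and the table layout, column names and sample rows are invented for illustration rather than taken from a real FlightMemory page.

```python
from html.parser import HTMLParser


class FlightTableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        # convert_charrefs=True turns entities like &eacute; into real characters
        super().__init__(convert_charrefs=True)
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None outside <tr>
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # rows with only <th> cells end up empty: skip them
                self.rows.append(self._row)
            self._row = None


def parse_flights(html):
    """Return one dict per data row; the column names are hypothetical."""
    parser = FlightTableParser()
    parser.feed(html)
    parser.close()
    columns = ("date", "src", "dst", "airline")
    return [dict(zip(columns, row)) for row in parser.rows]


# Invented sample markup, not a real FlightMemory page.
SAMPLE = """
<table>
  <tr><th>Date</th><th>From</th><th>To</th><th>Airline</th></tr>
  <tr><td>2009-03-14</td><td>HEL</td><td>GRU</td><td>Finnair</td></tr>
  <tr><td>2009-03-20</td><td>GRU</td><td>MEX</td><td>Aerom&eacute;xico</td></tr>
</table>
"""

flights = parse_flights(SAMPLE)
for flight in flights:
    print(flight)
```

The sample deliberately includes an HTML entity in an airline name, since accented characters are exactly the sort of thing that gets mangled when a scraper treats markup as plain bytes instead of letting a parser decode it.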

Self-import-antly,
-j.


One thought on “Improvements to FlightMemory imports”

  1. I’m really keen on one-step imports 😉 ! Nice improvements already!

