Wednesday, July 14, 2010

Webscraping awful JavaScript, part II

I edited the previous post to remove the link to the website I'm trying to scrape data from, in order to protect the guilty. What I am dealing with there is about 8000 lines of JavaScript, of which, I think, roughly 6000 consist of blocks of code that are identical except for a loop condition, comparison, or other minor change. It's a classic if ugly C idiom, and sometimes unavoidable in a language so primitive. In a higher-level language like Java, C++, or JavaScript, it's the mark of a tyro. That goes double for JavaScript, whose first-class functions could be passed as arguments to a single copy of the repeated code.[1]
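To make that concrete, here is a minimal sketch of the idea, not the site's actual code. The names selectRows, keep, and rows are hypothetical, and the sample data is made up. The loop is written once, and the part that would otherwise differ from copy to copy (here, the comparison) is passed in as a function.

```javascript
// Hypothetical stand-in for the near-duplicate blocks: the loop is written
// once, and the comparison that varied between copies becomes an argument.
function selectRows(rows, keep) {
  const out = [];
  for (let i = 0; i < rows.length; i++) {
    if (keep(rows[i])) {
      out.push(rows[i]);
    }
  }
  return out;
}

// Made-up sample data, not taken from the site being scraped.
const rows = [
  { name: "a", value: 3 },
  { name: "b", value: 12 },
  { name: "c", value: 7 },
];

// Two "copies" of the block, differing only in the function passed in.
const small = selectRows(rows, function (r) { return r.value < 5; });
const large = selectRows(rows, function (r) { return r.value >= 10; });
```

Each call replaces what would have been another pasted block; a new variation costs one short function rather than another few dozen duplicated lines.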


Luckily I can sidestep the mess and just figure out the data-parsing code.


  1. Douglas Crockford, in JavaScript: The Good Parts (hilariously, a little less than one-fifth the length of JavaScript: The Definitive Guide), argues that first-class functions are the best and most important thing about the language: they make it essentially "Lisp in C's clothing."
