Wednesday, September 22, 2010

"Shadow IT" is a symptom, not a disease

...specifically, it’s a symptom of an inability of an organization’s IT department to meet business needs. And not all shadow IT is created equal: the term lumps together (for instance) staff circumventing security controls on their workstations to install unauthorized software, with people creating complex Excel spreadsheets to answer their business-intelligence questions (rather than asking IT to develop or purchase a new software package). So a little bit of the handwringing by old IT salts that you’ll see if you Google the term is justified, but for the wrong reasons, as I’ll explain.


If people need to install software and have shown that they can do so without bricking their machines, then IT should relax security controls, encourage them to keep informal records of changes they’ve made to their configuration, but most of all stay out of their way and allow them to be more productive. Similarly, don’t keep people from solving their problems with Excel; if they’re solving their problems (period), ipso facto that’s a good thing and IT staff (not users) are the ones who need to adjust their attitude to fit reality.


Let’s focus on the latter example for a bit. Here’s where IT really has a good chance to support rather than obstruct. As I’ve argued before, Excel has long been less a spreadsheet program than a development environment, with more flexibility and usable power than many dedicated business intelligence tools and ERP systems. Consider Juice Analytics, my favorite gang of Excel wizards: If the sample code they give away for free in their blog is anything to go by, the stuff they charge for is a wonder of transparency, structure, and scalability—just like the best "real" software.


But if it’s really a development environment, Excel needs to be treated as such; and in turn its power users need to be treated as developers. What are the three most important support tools for developers in a mature programming shop? Source control, source control, source control. Excel source control tools exist—there’s a short but sweet discussion of them here at StackOverflow—but they certainly aren’t mature. The technical problem here is that Excel stores data and business logic (the latter in the form of formulas and VBA code) as one binary file; the same goes for the BI and ERP tools I’ve used, which make you navigate through dialog after dialog to create a query, report, or whatever, which is then stored somewhere mysterious inside the software. If all such software stored everything the user created in a text file in a standardized parsable format (XML being the obvious candidate), those files could be manipulated just like the source code files in a traditional development environment (i.e. not just versioned but interactively or programmatically edited using the user’s favorite tools), and they’d be exposed within the organization to be shared, catalogued, backed up, etc. Again, a win-win for users and IT, provided the latter opens its mind to the possibilities.
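
In the meantime, there are stopgaps. Here’s a sketch (untested, Windows-only, and assuming Perl with the Win32::OLE module plus access to the VBA project enabled in Excel’s macro security settings) of the kind of thing I mean: pull the VBA out of a workbook into plain text files that can go straight into whatever source control system the "real" developers already use.

#!/usr/bin/perl
# Sketch: export every VBA component in a workbook to its own text file,
# so the business logic can be diffed and versioned like ordinary source.
# (Everything gets a .bas extension here for simplicity; class modules
# and forms would normally get .cls and .frm.)
use strict;
use warnings;
use Win32::OLE;

my ($workbook_path, $out_dir) = @ARGV;
die "usage: $0 workbook.xls output_dir\n" unless $workbook_path and $out_dir;

my $excel = Win32::OLE->new('Excel.Application')
    or die "Couldn't start Excel: " . Win32::OLE->LastError() . "\n";
$excel->{Visible} = 0;

my $wb = $excel->Workbooks->Open($workbook_path);

my $components = $wb->VBProject->VBComponents;
foreach my $i (1 .. $components->Count) {
    my $comp = $components->Item($i);
    $comp->Export("$out_dir/" . $comp->Name . ".bas");
}

$wb->Close(0);    # close without saving
$excel->Quit();

It doesn’t touch the formulas, of course, but even versioning the VBA alone would put those power users miles ahead of where most of them are today.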


The article that sparked this post? Ironically, it's on TechRepublic, the epicenter of an awful lot of that handwringing I mentioned above. "Decade of the Developer," indeed. Open source reshapes the organization itself?

Wednesday, August 11, 2010

Not reinventing the wheel (again)

As I mentioned in two previous posts, I’ve been doing some 3-D graphics programming using the Perl interface to SDL. SDL is quite fast for 2-D graphics, but it doesn’t do 3-D on its own; so there’s a stack on my desk over a foot high of used textbooks with titles like Introduction to Computer Graphics Algorithms from which I’ve extracted the matrix transforms for various kinds of projections of a given point from 3-space to 2-space (orthographically onto a planar "window" given by a certain equation, in perspective as seen by an eye at a given point, and so on). It’s been educational, but the results I’m getting are unusably slow, and I think they would still be too slow even if I were using a faster language. So, it’s time to start using a graphics library that supports 3-D—that is, one where someone else has already done the hard part of writing optimized code to do those transforms. It’s one of the hardest lessons to learn as a new software engineer transitioning into the workforce from school: it’s OK to use other people’s code; in fact, it’s better to use other people’s code, in the form of existing libraries, than to reinvent the wheel. (Jeff Atwood wrote an interesting contrarian blog post on why reinventing the wheel is sometimes, in special cases, the right thing to do: in general, the preference for reinvention is proportional to the need for performance; but note that there’s no consideration of cost there. Ironically, reinventing the wheel in the software sense is more likely to be favored by the ill-informed or non-technical manager, who doesn’t understand the accepted wisdom that it’s "always" better to buy than to build.)
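
For the record, here’s roughly the kind of code those textbook equations boil down to (a bare-bones sketch, not the code I’m actually running): a perspective projection of a point in 3-space onto a 2-D screen, with the eye sitting on the z-axis at (0, 0, d) and the projection plane at z = 0.

#!/usr/bin/perl
use strict;
use warnings;

# Perspective projection: eye at (0, 0, $d) looking toward the origin,
# projection plane at z = 0. By similar triangles, a point (x, y, z)
# lands on the screen at (x*d/(d-z), y*d/(d-z)).
sub project_point {
    my ($x, $y, $z, $d) = @_;
    my $denom = $d - $z;
    die "Point is at or behind the eye\n" if $denom <= 0;
    return ($x * $d / $denom, $y * $d / $denom);
}

# Example: project the corners of a unit cube with the eye 5 units out.
foreach my $corner ([0,0,0], [1,0,0], [0,1,0], [0,0,1],
                    [1,1,0], [1,0,1], [0,1,1], [1,1,1]) {
    my ($sx, $sy) = project_point(@$corner, 5);
    printf "(%d,%d,%d) -> (%.3f, %.3f)\n", @$corner, $sx, $sy;
}

Doing that once per point is easy; doing it fast enough, for enough points, every frame, is exactly the arithmetic I’d rather hand off to a library that does it in optimized C.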


The obvious choice, one that’s been around for years with lots of support in the form of books, tutorials, and sample code online, is OpenGL with the GLUT libraries. The GLUT sample code I’ve skimmed seems to have (at a very high level) a two-part structure—you first set up the parameters for a 3-D rendering (the minimal case being an object and a view, with no light sources, texture mapping, or other complicated/cool stuff), and then you spawn a thread or another process to do the actual rendering on the fly until told to terminate. I’m not sure yet how this will square with the fact that I want the rendering to refresh in real time with the new data I’m collecting from the server. Let’s pull some numbers out of a hat: say I have a Perl program that runs all day long, sleeping most of the time, but waking up every five minutes to web-scrape and munge some derivatives-pricing data; say the latter takes about a minute, allowing for the bottleneck of my mediocre broadband connection. It sends that data on to the graphics code, which creates an interesting 3-D graph or visualization1 in n minutes. Then that 3-D image pops up in a window, rotating slowly so that you can see it well enough to interpret it. If n<4, then the whole thing can repeat ad infinitum. If n>4, then the rendering is going to lag behind (and more behind, and more behind...) the data collection, and become useless. Depending on how GLUT works, I might be able to kill two birds with one stone—minimize n, and minimize CPU load—by avoiding the real-time render entirely: I’m hoping that I can actually have OpenGL render the frames of a smooth 360-degree rotation of the graph from start to finish as image files. Then these could be converted to an animated GIF (if they aren’t already) and shown on a web page containing other useful data (which itself could get updated every five minutes, continuing the example, or more often). Meanwhile, the Perl code sleeps again, and then wakes up and continues the cycle...
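
To make the timing concrete, here’s the shape of the outer loop I have in mind. This is only a sketch: scrape_data, render_frames, and frames_to_gif are stand-ins for code that doesn’t exist yet, stubbed out so the skeleton at least runs.

#!/usr/bin/perl
use strict;
use warnings;

use constant CYCLE_SECONDS => 5 * 60;    # wake up every five minutes

# Stand-ins for the real work described above.
sub scrape_data   { return {} }          # ~1 minute of web-scraping and munging
sub render_frames { return () }          # the n-minute render of the 3-D graph
sub frames_to_gif { return }             # assemble the frames into graph.gif

while (1) {
    my $start = time();

    my $data   = scrape_data();
    my @frames = render_frames($data);
    frames_to_gif(@frames);

    # Sleep out the rest of the five-minute cycle. If scraping plus
    # rendering took longer than the cycle (roughly n > 4), we're already
    # late and just start the next pass immediately.
    my $elapsed = time() - $start;
    if ($elapsed < CYCLE_SECONDS) {
        sleep(CYCLE_SECONDS - $elapsed);
    }
    else {
        warn "Cycle took ${elapsed}s; rendering can't keep up.\n";
    }
}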


I want to have the possibility of running this code on OS X, where I spend most of my time, or on Linux, or on Windows 2000 or Windows XP (which I run in virtual machines here and there for various reasons). Toward that end, I’m requiring myself to write the most portable code possible: since OpenGL and GLUT are (intended to be) entirely platform-independent within the limits of the hardware they’re run on, there’s no sense in doing anything else. To refamiliarize myself with the Microsoft Visual Studio IDE,2 I’ll be coding my GLUT app as a Win32 application in C++, if necessary using some simple but not too kludgy wrapper classes around the function-pointer-callback-laden C code that’s sadly inherent to the Win32 libraries (well, I guess I can give them a break... it was the ’90s). However, if the instructions I’ve found for writing cross-platform-safe GLUT code for Linux and Windows using VS 2005 are accurate, that shouldn’t be a problem.


But I’m also going to try writing the whole thing in POGL, the Perl OpenGL/GLUT bindings, which are claimed to be just as fast as using OpenGL via C or C++ or <insert compiled language here>. That makes the parsing I referred to here easier, since I won’t have to call Perl from a compiled C++ program (obviously a performance hit to be avoided) and I won’t have to find a regular expression library for C++ (of which I'm sure there are plenty, but the differences in regexp syntax between them and Perl would probably be unpleasant).


  1. The form of which is a secret—I can’t give away all my tricks, can I? However, one of the books in that big pile is a collection of academic papers called something like Topics in Scientific Data Visualization, so you can guess I’m headed into some pretty wild territory. (Here be dragons!)[back]

  2. One of the few truly great pieces of programming to come out of Redmond, it pains me somewhat to admit, along with Excel 2003 and the old QuickBASIC compiler (the distant MS-DOS ancestor of Visual BASIC, for you young whippersnappers). [back]

Friday, July 16, 2010

Does your installer do dependency checking?

As I threatened to do a few posts ago, I've been working on some graphical applications written in Perl and using the Simple DirectMedia Layer. I got a bit of practice code written, but not tested, a few days ago—a 3-D plot of a trig function—only to find that the software that allows Perl code to talk to the SDL libraries wasn't actually installed on my machine. OK, I thought, so that's not part of the default Perl install on OS X—I'll just find it and install it.

Easier said than done! First I had to get the entire developer kit (2+ GB of downloads, and a couple of hours to install) from Apple. (I can see excluding the GUI developer tools, Xcode and friends, from a default OS X install—but why the command-line C compiler and a basic set of libraries and header files? They'd take up a negligible amount of disk space.)

Next came installing the actual SDL libraries, then the Perl-to-SDL interface. I first tried Fink, which aims to provide one-stop shopping for .deb packages à la the various GUI frontends to apt in different Linux distros. Fink presents you with a window of available packages (which updates every time you open it, naturally) and downloads the ones you choose either as an OS X binary if one exists or as source if necessary (then configuring the latter to build under Darwin, compiling it, etc.). Fink has worked well for me in the past, but in this case though the list of available packages included SDL, it didn't include SDL-Perl. Fine, I thought, I'll tell Fink to install SDL, then install SDL-Perl via the command-line CPAN installer module included with Perl. The former worked OK; the latter didn't. The CPAN installer module couldn't deal with the many modules that were prerequisites to installing SDL-Perl, so I gave up and started doing everything by hand. This quickly led me into a "dependency hell" wherein every tarball I downloaded seemed to require that something else already be installed. About here is when I started to feel like banging my head against the keyboard. THIS SHOULDN'T HAPPEN. The problem of package dependencies on Unix-like systems was solved 15 years ago when the first Debian Linux distribution was released. I remember installing Debian 1.something on my first Linux box, which had previously been running Red Hat with its markedly inferior .rpm package system, and marvelling at how smoothly the install process went despite my idiosyncratic selection of software from the packages available.

Then I tried MacPorts, a similar front end to open-source software, mostly command-line and X Window stuff configured for Darwin. After installing it from a .pkg file (a self-installing Mac OS X package format) I typed just one line in a terminal window (MacPorts so far has no GUI front end):

sudo port install p5-sdl_perl
and was rewarded with a couple of hundred lines of beee-you-dee-ful status info scrolling by: extracting this, checksumming that, and compiling the other thing, as MacPorts automagically set up the necessary prerequisites and finally SDL-Perl itself. When I tried to run my Perl script again, it Just Worked.1,2

Moral: No matter whether your software is the most user-friendly GUI application ever or some kind of arcane tool to be used only by superhackers: if you don't implement (at the very least) dependency checking or (preferably) automatic dependency resolution, you deserve forty lashes with a wet noodle, as Ann Landers used to say.

  1. Well, almost—I had to tell Perl where to find the new modules, but that was both trivial and something I needed to know how to do anyway. [back]
  2. Check it out! (Not impressed yet? Hey, it's just proof of concept.) [back]

Wednesday, July 14, 2010

Webscraping awful JavaScript, part II

I edited the previous post to remove the link to the website I'm trying to scrape data from, in order to protect the guilty. What I am dealing with there is about 8000 lines of JavaScript, of which, I think, roughly 6000 lines comprise multiple blocks of code that are identical except for a loop condition, comparison, or other minor change. It's a classic if ugly C idiom, and sometimes unavoidable in a language so primitive. In a higher-level language like Java, C++, or JavaScript, it's the mark of a tyro—especially in JavaScript, whose first-class functions could be passed as arguments to a single parameterized version of the repeated code.1
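
The fix is the same in any language with first-class functions. Here it is sketched in Perl rather than JavaScript, since Perl is what my scraper is written in; the field names and filters are invented purely for illustration.

#!/usr/bin/perl
use strict;
use warnings;

# Write the near-identical block once, and pass in the one part that
# varies (the test) as a code reference.
sub select_rows {
    my ($rows, $keep) = @_;
    return grep { $keep->($_) } @$rows;
}

my @rows = (
    { strike => 120, type => 'call' },
    { strike => 135, type => 'put'  },
    { strike => 140, type => 'put'  },
);

my @calls    = select_rows(\@rows, sub { $_[0]{type} eq 'call' });
my @far_puts = select_rows(\@rows, sub { $_[0]{type} eq 'put' && $_[0]{strike} > 130 });

printf "%d call(s), %d far-out-of-the-money put(s)\n",
       scalar @calls, scalar @far_puts;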


Luckily I can sidestep the mess and just figure out the data-parsing code.


  1. Douglas Crockford, in JavaScript: The Good Parts (hilariously, a little less than one-fifth the length of JavaScript: The Definitive Guide), argues that first-class functions are the best and most important thing about the language: they make it essentially "LISP in C's clothing." [back]

Thursday, July 8, 2010

Reverse-engineering dynamically-created JavaScript

This is interesting: a page I want to webscrape some options price data from appears to be entirely created dynamically by JavaScript code which itself is created dynamically by an unknown CGI backend (probably PHP). That seems a little bit kludgey, but I understand the reasoning behind it; the page is interactive, but there's a lot of data that can potentially be displayed. This way there is one big server hit when the page is first loaded (to get a snapshot of all the options data)—presumably that CGI code is querying a database—and the JS just displays the data or not as the user clicks show/hide for each stat or each block of options prices.


I can get the data I need by parsing the JavaScript code, if I can figure out how that code parses its data strings (i.e. the data "passed" to it by the underlying CGI code) for display; luckily, the JavaScript string-manipulation methods seem to be modeled closely on Perl's.
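
Assuming (and this is only an assumption; the real page's format is the whole puzzle) that the generated JavaScript embeds each row of options data as a delimited string in a variable assignment, the Perl side of the extraction might look something like this:

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: a line of the generated JavaScript such as
#   var optRow = "IBM100918C00130000|130.00|2.35|412";
# (contract | strike | last price | open interest). The real page's
# format will differ; this only shows the shape of the extraction.
my $js = q{var optRow = "IBM100918C00130000|130.00|2.35|412";};

if ($js =~ /var\s+optRow\s*=\s*"([^"]+)"/) {
    my ($contract, $strike, $last, $open_int) = split /\|/, $1;
    print "$contract: strike $strike, last $last, open interest $open_int\n";
}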


I'll then have the data in an elaborate Perl data structure, and can manipulate it as I see fit.

Monday, July 5, 2010

My take on "best practices" for hiring programmers

Do the majority of managers hiring for programming jobs think like Justin James?

I can't find a page to link to that sums it all up, but I have the impression that the way the wind is blowing in the programming job market right now, there are a lot more jobs than there are qualified people. The number of computer science graduates is way down from its peak ten years ago, during and shortly after the so-called dot-com boom; and while the slack has been taken up somewhat by people who, however well-intentioned, think they should be instantly hireable because they whizzed through Teach Yourself Javascript in 24 Hours in only eight, that's little consolation.

As a hiring manager, I assume you'd want first of all to screen the latter group out; optimally, you'd want to select from what remained (perhaps 25% of applicants, charitably) the tiny fraction (less than 1% overall) of truly smart, capable, competent people. But James' approach in the blog post I linked to above seems designed to hire not the great but only the good enough. What I'm bothered by in particular is James' expectations for the candidate's understanding of technologies inside and outside of the mainstream. It's a long quotation but I think it's worth it:

I am not hiring Lisp, Prolog, Erlang, APL, Scheme, Clipper, PowerBuilder, Delphi, Pascal, Perl, Ruby, Python (forgive me for including those four in this list), Fortran, Ada, Algol, PL/1, OCaml, F#, Spec#, Smalltalk, Logo, StarLogo, Haskell, ML, D, Cobra, B, or even COBOL (which is fairly mainstream) developers. If you show these on your resume, I will want to interview you just for the sake of slipping in a few questions about these items. I am serious. As part of my secret geekiness, I am really into obscure and almost obscure languages and technologies. I know that a lot of those items take better-than-industry-average intellect and experience to do; they also provide a set of experiences that gives their practitioners a great angle on problems. While you will never directly use those skills in my shop, you will be using those ways of thinking, and it will give us something to talk about on your first day.


There's more than a whiff of ambivalence here, isn't there? Anything in that list is "obscure" and "you'll never directly use those skills," although "you will be using those ways of thinking." The subtext here is that a good programmer, a hireable programmer, is one who knows the languages and technologies that are currently the most popular (prima facie the stuff that isn't in James' list in the quotation above); anything even a little bit outside of the mainstream isn't a marketable skill, though it might make for good talk around the water cooler. And James is probably about as liberal and accommodating as it gets. Another manager might be more plain-spoken: "Well, it's nice if you know LISP, 'cause that shows you're a geek who programs for fun when he goes home; but you really need to know dot-NET 'cause that's where the money is."

As I suggested a few paragraphs back, I object to this mentality because it places a premium on being a cut-'n'-paste programmer over being one who actually thinks about what he or she is doing; on being a good-enough programmer who understands enough of the current flavor-of-the-month technology to get by, rather than being something qualitatively different: the great programmer who knows that technology and its limits, but also knows others and their limits. It's like the difference—yeah, I know these programming-as-cabinetry-or-whatever-physical-craft analogies have been done to death, but bear with me—between someone who assembles furniture on an assembly line and someone who hand-crafts each piece individually; maybe it's even the distinction between knowledge and wisdom.

Grand claims? Impractical stuff? Expectations with nothing to do with the reality of the job market? Well, go read Joel Spolsky's blog post on the same subject, and get back to me.

Done? OK. What I wanted you to get from that is that Spolsky also hires programmers; and he takes one sentence to dismiss James' emphasis on job candidates' knowing the flavor-of-the-month tech. For Spolsky, being a good programmer is all about aptitude. What skills should a good programmer have? Wrong question. Distinguish "skills" from "familiarities," and discard the latter; what programmers need, among other things, is a knack for simultaneous application of logic (the real "guts" of programming whatever language or library you're using) and creativity (because the best solution to a software problem is so often outside the box). This doesn't mean lack of hands-on technical knowledge—quite the contrary: it's a wide knowledge of tools, and when to use the right one for the job, that makes that famous 10x difference in productivity that a great programmer has over a good one1 manifest. It's what distinguishes software designers from software engineers; programmers from code monkeys; IT from CS (pick your terminology). As in the old saying, it's the state of being a man who's been taught how to fish.

So, let's say I'm a hiring manager. What kind of concrete test can I use in the real world to determine whether an applicant is a great programmer, or has what it takes to become one (probably an even better case, since I'll then have a hand in their professional growth and can shape it to my needs—oh dear, that sounds more sinister than I meant it to!)? Interestingly, Spolsky in the post I linked to above dismisses out of hand the kind of brainteasers that Microsoft interviews used to be (maybe still are?) famous for (so much so that a micro-industry sprang up with its own corner in the job-searcher section at Barnes & Noble, dedicated to helping you prepare for the inevitable). He does offer a couple of positive suggestions: make the candidate write some real code on the spur of the moment (although I'd be a little less stringent about the requirements than he is, and a little more forgiving of bugs); ask the candidate to discuss (not "solve") a story problem, what Jon Bentley in Programming Pearls referred to as a back-of-the-envelope calculation.2

Those are good, but I can think of a couple of screening tests that perhaps bring Spolsky down to earth—that is, they speak to the candidate's approach to programming, but they also explicitly involve the tools of software development.

  • One would be something like—if hiring for a web frontend position—"What text editor would you use for making some quick changes in a CSS file?" The answer "What's a text editor?" would be an immediate fail. But so would "Notepad." Why? Because it indicates a lack of the programming "gene" in a few different ways: a lack of understanding of the toolset available to today's programmer (what? you mean there are other text editors?); an indifference to version control (it won't break anybody else's code if I just make this little change); a lack of sufficient creativity to see that a tool like Notepad could be vastly improved for the specific needs of programmers (source code coloring? paren matching? huh?).
  • Another would be to test the applicant's willingness to use appropriate technology specifically in their code. Again, this is job- and applicant-skill-dependent to an extent, but here's an example: Applicant has recently graduated from a well-regarded CS program. I describe to him/her a scenario in which a client, having been happy with a custom-built interactive software package for manufacturing process control sold them by our company, now wants to be able to automate the process further by integrating some kind of scripting language. Does the interviewee say, "Well, if it's a command-line tool, a shell script should do it; if it's an interactive text-based application, that's what Expect is for; if it's a graphical application, things get more complicated"? Or does he/she say, without missing a beat, "Lex and Yacc!"? Or, "Well, how complicated a language are we talking about? Lex and Yacc are probably overkill; I could build a little parser more quickly by hand, and then just traverse an abstract syntax tree in memory... [followed by a flurry of thinking aloud how that might be done]." Any of those three could be right, depending. But anything that smacks of reinventing the wheel would be wrong.3
  • A third would be to ask the applicant to describe a hobby or other spare-time interest that involves problem-solving, and prompt until you hear a sufficient level of detail to indicate that this person has his or her logic hat on not only when programming but at other times, and that it feels natural. For instance, I like to work on old cars, specifically European cars from the '80s, for fun; and if you get me started I will talk your ear off about how those kinds of cars are modern enough to have electronic engine controls but not modern enough to have the kind of computerized self-diagnostic abilities that new cars have, which presents a challenge, when the car doesn't run right, that is a lot like debugging code. But I'm not sure questions like that about the candidate's life outside the job can be presented in a non-EOE-violating way.



  1. Originally from The Psychology of Computer Programming, I believe, this number has been pretty much taken as fact since the first edition of McConnell's Code Complete. In other words, don't blame me. [back]

  2. Besides the Bentley book (and its sequel, More Programming Pearls), a great source of these is John Paulos' book Innumeracy. [back]

  3. At a former job I had in which several programmers were working on a C++/MFC app, the need arose to do some complicated pattern matching in an input string. One programmer went home and spent three hours after dinner writing some character-by-character algorithms, basically C string manipulation at its goriest. The other added a public-domain regular expression library to the project, then wrote, debugged, and thoroughly tested a regexp to do the same thing—in 15 minutes. There's your order-of-magnitude difference in productivity. [back]

Thursday, June 24, 2010

Stymied by horrible interface design, not for the last time

I had an experience last night that was a textbook example of how not to design a user interface for a consumer product. (I've been told this post is too long: I think the anecdote is revealing, but skip to the last paragraph if you just want to read the moral.)


I was trying to connect a new DVD player to a high-definition flatscreen TV. The DVD player box said that it did automatic upconversion of DVDs (which normally, of course, are not high-def) to the highest-quality HD protocol, 1080p, so I was hopeful that the HDMI connection would provide excellent video quality.


I expected to be able to just connect the HDMI cable to both devices, set the TV to accept an input from the HDMI port, tweak a few setup parameters, and be off and running. Of course it didn't work that way. No picture. The DVD player had a series of LEDs on the front indicating what type of signal was being sent over the cable at that moment, and a button labeled "HDMI", duplicated on the remote, to select one. I tried pressing that button repeatedly. Still nothing. HDMI is a handshaking (i.e., two-way) protocol which should gracefully degrade, if necessary, from 1080p down to 480p (standard DVD resolution), with stops in between if the hardware is capable of it, and I expected that at some point the DVD player and the TV would find a protocol on which they could agree; but apparently they didn't. I say "apparently" because neither device offered any kind of error indication other than a black screen!


I looked in the manual, and found that I was supposed to manually select HDMI output in a setup menu. In order to see the menu, I hooked up the player to the composite video/left-right audio jacks on the TV, and was able to get a (relatively low-quality) picture. I changed the output setting to HDMI, and changed the TV input back to match. Still no picture. I pressed the "HDMI" button on the front of the DVD player a few times, and watched the LEDs cycle through the available HDMI video options, but with no change (the TV displayed nothing but "HDMI Input," which apparently meant "I'm waiting for HDMI input" rather than "I detect it").


I went back to the composite connection on the TV and looked at the output setting in the DVD player setup menu again. "Composite" was selected—"HDMI" was still available, but was not selected as I had left it before. I made sure I was pressing the right button to save the settings when I left the setup menu, rather than exiting without actually making changes. I was. I tried the same routine a few more times, but the "HDMI" setting for DVD output just wouldn't take.


So here are a few tough questions:

  • Why couldn't the DVD player designers have included a single LED, maybe a red/green indicator of whether HDMI handshaking was successful or not?
  • Why an HDMI button on the machine (and the remote!) at all? The player ought to automatically output the highest quality signal that it can, at all times.
  • Similarly, why an output selection menu on the DVD player? Why can't all outputs on the player be active at all times?
  • Finally, why couldn't the TV, with much more real estate for messages, display something like "No compatible HDMI signal found" to assist with troubleshooting?
As it was, I made an educated guess, based on the limited evidence available, that the DVD player was faulty right out of the box and unable to actually generate an HDMI signal (not at all far-fetched given the general cheapness of consumer electronics these days): perhaps a poorly soldered connection to the output jack on the circuit board inside. But the frustrating hour spent arriving at that educated guess was a case study worthy of user-interface guru Donald Norman, whose discussions of confusing electronics in cars, door handles shaped so that you don't know whether to push or pull them, and much more, are still 100% relevant twenty years after he began publishing them (and should be required reading for anyone involved in the design of end-user software or consumer products).


Why did I tell this long story on a blog primarily about software design for financial analysis? Bottom line: design affects productivity. The way things work affects what the user can get out of them, and that's always going to be true, whether you're producing an iPhone app, a spreadsheet, a rich web application, or whatever the Next Big Thing in tech is.

Wednesday, June 23, 2010

I'm overdue to learn Java

Part of my reason for writing this blog is to document my reëducation in programming (and, reciprocally, to encourage myself to take that process seriously). I've just ordered a general Java handbook for experienced programmers and a book specifically on graphics and audio programming. I'd like to port (or write shameless copies of, depending on your point of view) some classic 8-bit games to Java, with the aim of making them sufficiently platform-agnostic that they will run on mobile devices just as well as on desktop OSs. That little trackball on the BlackBerry is just crying out to be used to play Centipede, Missile Command, or maybe something a bit less well known like Crystal Castles.



I think my first significant Java program, however, is going to be a machine that plays Terry Riley's modern-classical composition In C (differently every time, of course). And there'll be some kind of kaleidoscopic graphical accompaniment showing which notes each "musician" is playing when. Yes, I'm aware of how ridiculous this may seem, but what's wrong with it as an exercise in generating graphics and sound in sync?


The last time I was paid to write software, it was a couple of medium-sized programs in C++ that ran in user space on Windows NT but worked hand-in-hand with a custom driver running (of course) in kernel space. One of those programs that I'm particularly proud of was an interpreter for a "little language" that enabled automatic testing of some proprietary vertical-market hardware. After that job, I taught C++ to undergrads for a while, with an emphasis on OOP fundamentals and how to apply them properly, not just on the syntax of the language. So I don't expect any trouble becoming fluent in Java, since the concepts of C++ are still rattling around in my head: multiple inheritance, operator overloading, virtual functions, references, and all that cool stuff. (I'm rusty on C++ syntax, but that will come back quickly with exposure if necessary.) Still, this will be far more of a brain-stretcher than the Perl code in the previous couple of posts—if the two can even be compared: in the Java case we're talking about brushing up on the effective design of a medium-sized program, while the tiny Perl web-scrapers I presented had a trivial structure (and what little structure there was, was procedural, and I've been doing procedural programming for thirty years).

Coming soon: my resume

Just what the title says! I'm open to offers of permanent or contract positions doing challenging, interesting work in any of the areas I touch on in this blog (see the subtitle above if what I mean isn't clear), as well as in related fields such as education, training, technical writing, etc.

I am particularly interested in interdisciplinary work such as business intelligence software development, data mining, or instructional design, but the bottom line is this: if, after reading some or all of this blog, it sounds like I might be interested in doing something for you, then I probably will be.

Monday, June 21, 2010

Another financial-data-mining/web-scraping/scripting exercise

Continuing the theme of the previous couple of entries, I'm thinking of writing a program, almost certainly in Perl, to scrape the options price and open interest data for any given stock from the Morningstar web site (example here) and analyze it in various ways. It seems to me that there are more ways, and sometimes simpler ways, than the traditional "greeks" to evaluate the resulting data set as a predictor of short- and medium-term stock prices.

Given the numbers on the Morningstar page, I should be able to compute:

  • A simple put/call ratio: total open interest of puts over total open interest of calls.

  • (Here's where my own ideas start) A put/call ratio where the open interest is weighted according to time until expiration—i.e., near-term options are given more weight since (maybe) they represent traders who have a larger stake in the game and thus are paying more attention to whether their bet will pay off. (There's a rough code sketch of this and the simple ratio after this list.)

  • A similar ratio, but with the open interest weighted by how far the strike price is out of the money.

  • Change in the price of the option and of the underlying stock. (Call these delta-o and delta-s to avoid confusion with the options greek called "delta.")
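
Here's that sketch of the first two calculations. The data structure is whatever the scraper ends up producing; the list of hashes and the 30-day half-life below are invented just to show the arithmetic.

#!/usr/bin/perl
use strict;
use warnings;

# Invented sample data: one hash per option series.
my @options = (
    { type => 'put',  open_interest => 500, days_to_expiry => 10 },
    { type => 'call', open_interest => 800, days_to_expiry => 10 },
    { type => 'put',  open_interest => 300, days_to_expiry => 95 },
    { type => 'call', open_interest => 200, days_to_expiry => 95 },
);

my $half_life = 30;    # arbitrary: an option's weight halves every 30 days to expiration
my (%oi, %weighted_oi);

for my $opt (@options) {
    my $w = 0.5 ** ($opt->{days_to_expiry} / $half_life);
    $oi{ $opt->{type} }          += $opt->{open_interest};
    $weighted_oi{ $opt->{type} } += $w * $opt->{open_interest};
}

printf "Put/call ratio (simple):        %.3f\n", $oi{put} / $oi{call};
printf "Put/call ratio (time-weighted): %.3f\n",
       $weighted_oi{put} / $weighted_oi{call};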


The last of those four should be graphed over time. There's already a nice module on CPAN to build 2-D graphs (line charts, bar charts, the usual) given an array of data, and output them as image files ready to be tagged on a simple web page like I did in the previous entry. But you have undoubtedly already noticed, Careful Reader, that there are really two dependent variables in the graph I just described. (Yes, the difference and the ratio between delta-o and delta-s are interesting, but the individual stats are also interesting in themselves.) This calls for a three-dimensional graph. I don't see any relevant library on CPAN, but I'm sure a little digging will turn up something I can adapt. (In fact, I already have the book Perl Hacks, which has a nice explanation of how to do bitmapped graphics in a window with SDL; so all I really need, if I'm recalling Computer Graphics 101 correctly, are the equations for a projection of a three-dimensional point onto a two-dimensional plane from a particular relative viewpoint). Having done this, I've got another idea for which a 3-D graph—and preferably a dynamic one that you can "fly around" and look at from all sides—would be not just nice to have but mandatory: distance out of the money (y) and price (z) versus time to expiration (x). If the points are colored or shaded appropriately to indicate the sign and magnitude of delta-o, then a large solid-colored area would be a tipoff that a certain set of options with a similar strike price and time until expiration have gone up or down in asking price—quite possibly a significant leading indicator of the price trend of the underlying security.

The analysis of options is always—and this is a truism, but a deep and important one—complicated by the fact that for every buyer of options, there's also a seller. The line "Most options expire worthless" is often given as an argument that options buyers are mostly ignorant speculators whose bets don't pan out. I don't buy that. (If that were the case, the put/call ratio should always be interpreted in reverse: we ought to expect that the buyers of puts are mostly wrong in their predictions, so that the price of the underlying stock will go up, and the inverse for calls.) I'd bet that most options are in fact both bought and sold not in order to speculate on the options themselves but in order to hedge a trade of the underlying securities. If that's the case, then something would be wrong if the options mostly didn't expire worthless. Most people rarely if ever make claims against their car insurance, either.

(Hey, there's even already a Perl module to do Black-Scholes options pricing...)

Wednesday, June 2, 2010

"Now seems like a good time," I said to myself...

..."to get those rusty programming skills going."


I had found myself wanting to do some analysis in Excel of price behavior of a large list of stocks.


I glanced at the first few pages of Perl and LWP, and then at the Regular Expressions Pocket Reference; I opened Firebug on the Yahoo Finance "summary page" for a stock I was interested in, so that I could see the raw HTML I was dealing with; and wrote the following:1



#!/usr/bin/perl
use LWP::Simple;

# Expects a list of security symbols on standard input, one per line.

print("Symbol\tPrevClo\tOpen\tLast\n");

while ($sym = <>)
{
    chop $sym;    # strip the trailing newline from the symbol

    # Pull the whole quote-summary page into one string.
    $summary = get("http://finance.yahoo.com/q?s=$sym");
    die "Couldn't get Yahoo Finance Quote Summary page for symbol $sym!"
        unless defined $summary;

    # Extract the three figures from the raw HTML.
    $summary =~ m/>Prev Close:<.*?>(\d+\.\d+)</;
    $prevclose = $1;
    $summary =~ m/>Open:<.*?>(\d+\.\d+)</;
    $open = $1;
    $summary =~ m/>Last Trade:<.*?>(\d+\.\d+)</;
    $last = $1;

    print("$sym\t$prevclose\t$open\t$last\n");
}



It worked the first time—not bad for not having done any programming whatsoever for about five years and nothing of significant size for ten. (Yes, I know it's not very idiomatic Perl—combining the match regexps and doing a few other things would probably cut the line count in half.) That code took a few hours to produce, but subsequent similar programs to web-scrape other pages took much less time, now that I was in the groove.
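
Something like the following, say (untested, and functionally the same as the script above): drive the extraction with a list of field labels instead of three copy-and-pasted match blocks.

#!/usr/bin/perl
use LWP::Simple;

# Same job as the script above, with the match-and-assign blocks
# collapsed into a loop over the field labels.
my @fields = ('Prev Close', 'Open', 'Last Trade');

print "Symbol\tPrevClo\tOpen\tLast\n";

while (my $sym = <>) {
    chomp $sym;
    my $summary = get("http://finance.yahoo.com/q?s=$sym")
        or die "Couldn't get quote summary page for $sym!";

    my %quote;
    ($quote{$_}) = $summary =~ m/>\Q$_\E:<.*?>(\d+\.\d+)</ for @fields;

    print join("\t", $sym, @quote{@fields}), "\n";
}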

More exciting, for example, was the following program, which expects the same list of symbols:




#!/usr/bin/perl
use LWP::Simple;

print "<html>";

while ($sym = <>)
{
    chop $sym;

    print "<font size=5>$sym</font><br>";

    # Fetch the daily, weekly, and monthly chart images for this symbol,
    # save each one locally, and reference it from the page being built.
    foreach $period ("1d","1w","1m")
    {
        getstore("http://ichart.finance.yahoo.com/z?" .
                 "s=$sym&t=$period&q=l&l=on&z=m&p=e5,e20&" .
                 "a=p12&lang=en-US&region=US",
                 "$sym$period.png");
        print "<img src=\"$sym$period.png\"/>";
    }

    print "<br><br>\n";
}

print "</html>";



What I'm doing here, if it isn't clear, is scraping a number of security price charts from Yahoo Finance, saving the image files locally, and building a crude but effective web page to make them viewable in one place. Beats looking at each stock by hand for price trends, let me tell you!


Now, all of this may seem like "Hello World" stuff to anyone reading this who's had any programming experience beyond Computer Science 101. But what I think shouldn't be taken for granted here is the amazing ability (in the first script, to take the simpler example) to suck an entire web page into a string variable in just a few lines of code, search that string in a complex way, and output the result in a format readable by humans and other programs alike. We used to want applications to have built-in programming languages—now (and here's the takeaway!) we have programming languages with built-in applications: very-high-level functionality to do things that only applications used to be able to do. And we can do them in a scriptable, redirectable, programmatic way. Admittedly much of this is due to the straightforward API of the LWP module (Perlspeak for "library"), but I'd argue that that accessibility is a function of the design of Perl: there's obviously stuff going on behind the scenes that would be much harder to write in a language without such integral support for string manipulation (C, say).


I was weaned as a programmer on 1980s consumer 8-bit machines, the multimedia powerhouses of their day, on which even in a high-level language (built-in BASIC), to do anything interesting you had to twiddle bits. And most of the software I've been paid to write has been low-level stuff in C—device drivers and the like. So I'm easily impressed and easily seduced by VHL (very-high-level) languages that let you do so much with so little typing. Of course there's danger inherent in only knowing high-level languages. When you don't understand what's really going on at the machine level, optimization can be much more difficult, for example. Ironically, though, even as undergraduate computer science programs deemphasize C and assembler skills and move their students towards Java, C#, .NET, PHP, and so on—preparing them more effectively for the kind of web-back-end-database-interface work 9 out of 10 of them will face as new programmers—even as this huge and largely unremarked shift in what it means to be a professional computer programmer takes place, hobbyists tinker with microcontrollers, programmed at as low a level as you want, to recapture some of that early-80s frontier-machine-code feeling. Some will call this retrograde or Luddite-ish but the truth is, I think, that controlling hardware directly with one's code fulfills some kind of deep need in the engineering personality to exercise maximum control over one's immediate universe; and there's nothing wrong with the practical experience gained thus: few programmers will ever write an operating system, true, but there will always be lesser software that needs to run "close to the metal." (A $5 pocket calculator will never run a Java interpreter, for instance. I think...)


Getting back to my own programming for my own use and profit, far more complicated and wonderful things will come in time. I'm comfortable using Perl for this kind of stuff, but have never written a program of any serious size in it. What little user-level software development I've done has been fairly strictly object-oriented code in C++. I only know how to use Perl procedurally; understanding the OOP features of the language, which seem to be highly regarded, would be a good thing to have under my belt.


On the other hand, I have had a strong hankering to learn Python, thanks to what seems to me to be a very elegant syntax. And I have the book A Primer on Scientific Programming with Python, which—while I'm quite sure that somewhere on CPAN there's a module to support in Perl the same kinds of computations I need to do (numerical integration and differentiation, curve fitting, linear regression, etc.)—is an excellent tutorial for Python in general, besides describing the appropriate libraries in detail. Lastly, the Beautiful Soup library looks like an even cleaner way to do webscraping.


  1. How the heck do you format code nicely (i.e. not just in a non-proportional font but also indented correctly, lines that overrun the margin indicated clearly, and with symbols correctly escaped) in idiomatic HTML these days? Yeah, I know there's the <pre> tag, but it doesn't help you with lines that run past the right edge of your text frame (or wherever your body text is going), and you still have to festoon your code with &whatever-entity tags to escape all the non-alphanumeric characters.[back]



Monday, May 24, 2010

New blog, new start

The previous three entries (i.e., farther down on your screen) are reposted from a more-broadly-themed blog I kept briefly in 2007; hopefully this one's slightly narrower focus, if nothing else, will help it to last longer.

I had to edit the HTML of those posts a bit because Blogger, as far as I can tell, is not quite as smart as WordPress as regards footnotes in particular. Let me know if anything seems to be broken.

Let’s Retire the Forms-Based Interface

[Originally published July 11, 2007.]

I bet not too many people reading this have ever used a terminal hooked up to a minicomputer, and even fewer have used a forms-based interface on a terminal of that kind. But this kind of human-computer interaction is still a commonplace event in the business world: legacy IBM mini systems are everywhere. (If Y2K couldn’t kill them, nothing can.) People call this model “green-screen technology,” and they mean it pejoratively—it’s something archaic, clunky, and generally inferior to “modern” user interfaces.

That charge is mostly accurate. Though green-screen technology has its place (and saying what that place is would be getting off track, but it does have one), there's no denying that it's old-fashioned. But let me point out that so much of the Web consists of exactly that sort of thing. Web pages with fill-out boxes, check boxes, and radio buttons are even called "forms," in obvious acknowledgement; so are Visual Basic application windows. The metaphor works, sort of: you fill in a form and then click "submit" (or press "Enter" on your IBM terminal's keyboard—ever wonder why Windows PC keyboards say "Enter" instead of "Return"? Now you know), which is like sending in a (paper) form to an office somewhere. Then you get an answer back—the results of a computation, a database query, etc.1

None of this is shocking. What is shocking is how much modern software exists whose interface is still essentially forms-based, yet which pretends to be interactive. They’re two very, very different interface paradigms. Interactivity in software comes from more than just adding buttons & windows to a forms-based interface. I.e., if your idea of successful HCI consists of a modal window in which the user fills in a bunch of fields and presses a button, whereupon a new modal window pops up containing a report, then you’re not only putting lipstick on a pig but also being just plain dishonest: you’re selling ’70s tech as if it were something new. Way too many commercial products that are essentially prettified database frontends (which isn’t a bad thing in itself) are designed with this mentality—that all you, the user, ever do with a computer is run an offline query (and maybe a batch of them if you’re a power user). ("But I'm not a fish!")

Now think about actual interactivity, the thing that microcomputers give us (or at least were supposed to, back around 1980). This is the state where not just all the data you're working with but also the operations on that data and their results are fully accessible at all times, within reason. It's the guiding mentality behind WYSIWYG in word processors, for example, as opposed to typesetting software like nroff or TeX (in which you write your document as a text file with interpolated commands, then submit that file to a program which outputs a proof). Another great example is Excel, which is nothing like programming numerical computations in a traditional programming language (for Excel is a programming tool—it has more in common with friendly interpreted language environments like the old 8-bit BASICs than with much application software). You see all your numbers in front of you, and by clicking in a cell or pressing a magic keystroke you can see all the operations on them (i.e. formulas), or the results of those operations. And you have total freedom to change or transform the data or the operations in realtime. There's no modality to speak of.

Again, because this is the critical idea: you can’t just base an interface on pulling stuff out, changing it, and then resubmitting (putting it back in), and call it interactive. True interactivity requires non-modality of not just operations but also data: that is, all the data should be accessible all the time. Jeff Atwood wrote a great blog post about taking incremental search beyond the dataspace into the commandspace (pace Emacs). I’d like to see a lot more development of and experimentation with interfaces that use this kind of dynamic filtering to perform search, Neuromancer-style n-dimensional visualization of the dataspace, or a combination of both. Imagine this: instead of filling out a form and hitting “search,” you type (or click on) your parameters and watch a nebula of data dynamically shade itself as you type, with color and transparency indicating the sets involved and their relevance rating2—sort of a 3-D mixture of Venn diagrams and PivotTables.3 Or... remember the holodeck-furniture-database-search scene from the Star Trek episode “Schisms”?4


  1. Actually I like to think of this not so much as a “sending a form into a government office” model of computing as a “Wizard of Oz” model. You make your request of the Great and Powerful Oz and hope he gives you back something you can use.[back]
  2. C’mon, let’s use those alpha channels and all that other pretty stuff that modern graphics hardware can do for something other than another variation on Doom! [back]
  3. But please don’t call it “drilling down!” That’s not what that means, but I’ll save that for another entry. [back]
  4. Why is this not required watching for budding interaction designers and database programmers? [back]

Applications: Tools, or Just Fancy Data?

[Originally published April 6, 2007.]

In my previous post in this blog I challenged the feasibility, from the interface-design point of view, of running applications in a browser window—on the grounds that applications and data are two different things, and the browser is inherently a tool for viewing data.

Let me add a couple of thoughts to this. First, a general point: it’s important to note that this isn’t just a linguistic or even an epistemological issue, but an ontological one. That is, it’s not just a matter of what kinds of arrangements of bits we call “data” versus “applications” and what kind of tool we use to manipulate them. It’s a question of what that tool is, what it does, to what, and for whom. Think about physical tools, the kind you buy at a hardware store.1 They’re classified according to a huge variety of schemata:

Some are classified according to the raw material on which they are designed to operate: a crosscut saw for wood, versus a hacksaw for metal.

Some, according to the physical shape of the artifact they manipulate: an Allen wrench for bolts or screws with a concave hexagonal impression in the head, a Robertson driver for those with a similar but square impression.

Some, according to the purpose of the artifact they manipulate: a flare wrench for fittings on hydraulic lines.

Some, according to the operation to be performed, largely independent of the context of the object of the operation: a screw extractor for rotating fasteners whose heads are damaged.

And often there is an overlapping schema wherein tools are classified according to the general circumstances in which you would use them, hierarchically, with groups and subgroups: there are mechanics’ tools, and there are metric tools, and then there are wrenches with built-in tubing, for opening hydraulic bleed screws without a mess in brake or clutch systems with metric fasteners.2

The point is that these classification schemes aren’t something imposed from outside, as biologists impose the Linnaean taxonomy on the ever-changing and ever-being-discovered sloppiness of the natural world in order to make it a little bit more manageable. The epistemology of a tool guides its ontology: that brake-bleeding wrench was designed specifically for the task, very likely by some mechanic fed up with the inadequate tools he or she had available to do a brake job, and a crosscut saw acquired the form it has not by chance but because generations of woodworkers refined the design to cut certain pieces of lumber in a way that was useful to them. So tools evolve not only with their objects but also with the circumstances of their use, and it’s an oversimplification to say that there is a straightforward correspondence between the tool and its object and thus a clear-cut division of tools by what object they act upon. This isn’t an excuse for the browser as application platform as currently understood, though. Exactly the opposite! The tool that is used to run online applications and explore online databases must be one that is tailored to its job, rather than the clumsy square-peg-in-a-round-hole hack of the browser-as-it-stands.

And this brings us to the other point I want to add to the previous entry. Douglas Hofstadter said of the supposed form/content distinction3 that “content is just fancy form.”4 Are applications, then, just fancy data? It’s tempting to state the question and its counterpoint as opposing theses:

  • Applications are just fancy data: the more complex a data set becomes, the more operations its inherent properties suggest, until some "tipping point" of complexity is reached, at which those operations can be abstracted out of that data set and others with similar structure.
  • Applications are closely analogous to tools; data, to raw materials and the workpieces made from them: though they may evolve together, the two are fundamentally different.

I don’t think I can, or need to, disprove the second, though I think in the paragraphs above I’ve pointed the way towards some problems with the thesis that make it less appealing than it might at first be.

The first is also intuitively appealing, but it too is problematic. I think there's an implied argument there, a flawed one: there's a leap in logic between the premise "data sets inspire operations" and the conclusion "those operations comprise the application." While the premise is true, the conclusion doesn't follow: valuable and significant operations often emerge from the users of the data rather than from the data itself, and through a feedback process, these operations become commonplace in ways that the data alone never could have suggested. People made tabular data easier to understand by making graphs of it for hundreds of years before some mad genius at Microsoft came up with PivotTables, and now they're indispensable. But they sure aren't inherent in a ledger of handwritten numbers. Nicholson Baker made a perceptive point in one of his semi-autobiographical novels that the designers of sugar packets and windshield wipers didn't anticipate that people would centrifuge the first to better control the release of the contents and use the second to keep advertising flyers from blowing off of parked cars in the wind, but those behaviors have become integral facets of the use and therefore cultural significance of those artifacts.

And yet not everything you can do with a particular kind of data is something you should do. Word processors replace typewriters, in that they let you do things with paper. You can put various semantically significant symbols on a piece of paper, and you can also make an airplane out of it. Should a word processing program, then, contain a paper-airplane-design feature? Probably not.

The appeal of flawed thesis #1 above when I first started thinking hard about it a few years ago led me to embrace document-centric user interface design. For instance, I mentioned the idea approvingly in a review of Alan Cooper’s book The Inmates Are Running the Asylum in 2001 (an essay that now seems somewhat embarrassingly snarky and strident, but I’m archiving it here as-is anyway rather than trust Amazon to hold on to it for me forever). I still like the idea in theory, but I have serious doubts about the viability of the implementation. As I note in the Cooper review, Jef Raskin’s work in UI design exhibits the most extreme form of document-centricity—no applications at all. In the characteristic systems Raskin pioneered, the screen is a single window into one big document containing text, numbers, pictures, whatever; in theory, any operation can be performed at any point in the document at any time. Not only does the user not need to open a spreadsheet application to total up a column of numbers in the middle of a word-processing document, but he or she simply can’t, because there is no spreadsheet and no word processor; there’s just the numbers and the text. To add up those numbers you’d just select them and invoke a “TOTAL” command of some kind. Don't believe me? Read the description of the Canon Cat interface.

This is supposed to make life with the computer easier, because it does away with modes, the most-feared bugbear of interface design since the early days of the Macintosh. That is, you never have to worry about whether you’re in the “typing mode” or the “calculating mode” (for example), because (again) there is no spreadsheet and no word processor to switch between. But just a few sentences ago I said that there are problems with the implementation of this notion. Here’s the issue: Not all operations can be performed on all types of data. What happens when you try to invoke that “TOTAL” command after selecting a column of words? Will the computer do nothing? Will it spit back at you the total of the ASCII values of the letters in the selected words? (Let’s hope not!) Will it beep? (Ditto.) There’s no good answer. If you’re allowed to perform any operation on any type of data, cases where user input doesn’t make sense are going to be plentiful. (And any interface that makes it easier to make mistakes is obviously not an improvement.)

The alternative is to allow only those operations that make sense for the particular type of data in question. And that’s just modes again! Suddenly our noble and inspired designers of document-centric interfaces find themselves impaled on the horns of the elephant in the room.5 (Ouch!6)
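To see the dilemma in miniature, here is a hypothetical sketch in Python (nothing from Raskin’s actual systems; the function and its error handling are my own invention) of what a universal “TOTAL” command is up against: either it rejects selections it can’t make sense of, or it dispatches on the type of the data, which is a mode by another name.

    def total(selection):
        """Hypothetical 'TOTAL' command applied to an arbitrary selection."""
        values = []
        for item in selection:
            try:
                values.append(float(item))
            except ValueError:
                # The unavoidable fork in the road: beep? skip the item? sum
                # ASCII codes? Every choice is a hidden, type-dependent rule
                # the user has to learn -- in other words, a mode.
                raise TypeError(f"can't total a non-numeric selection: {item!r}")
        return sum(values)

    print(total(["3", "4.5", "10"]))            # 17.5
    print(total(["three", "little", "words"]))  # TypeError: the modeless dream leaks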


  1. The definitive discussion of this is probably somewhere in Wittgenstein, if one were to look hard enough. [back]
  2. This one is going on my birthday list for sure. [back]
  3. “Supposed” by other people than just me and Hofstadter. See, for example, this whitepaper. [back]
  4. In Metamagical Themas. And perhaps the design of a tool is an emergent property of its use—just fancy use! [back]
  5. The designers of archy, the latest system inspired by Raskin’s work, imply that they solve these problems by making the system smart enough to detect the kind of data being acted upon and perform the correct action. Way to beg the question, guys. [back]
  6. Sorry about the mixed metaphor and cliche. On the third hand, though, why exactly is modality so dreaded? I’m not the first person to notice that life itself is modal. If you pick up a pencil you’re constrained to a certain, even if fairly large, set of actions: you can write, pick your nose, stab your enemies, or stir a pot of soup with that pencil, but you can’t loosen a bolt, or do a thousand other things for which only other tools are appropriate. (And here we are back at the flawed but oh-so-seductive analogy between physical and virtual tools.) [back]

Why Web Applications Are Broken

[Originally posted April 1st, 2007. No foolin’!]

I’ve been thinking of large websites with heavy back ends (Amazon being the canonical example) as applications for a long time now. So I have a bit of a so-what reaction when I hear people talking about a paradigm shift to applications in the browser. I want to ask, don’t you remember what Scott McNealy was saying in every interview in the late ’90s—Sun’s slogan “the network is the computer”? Turns out the people promoting a web-based thin-client model ten years ago were just way ahead of their time; it took technologies like Ajax and proof-of-concept apps like GMail and Google Maps to make the idea concrete. The reason I’m underwhelmed is not so much that something old has been dressed up and called the latest thing (what else is marketing about?), but more that there’s a fundamental change that needs to happen before apps in a browser will work. This isn’t a technological barrier—more precisely, it isn’t just a technological barrier, but also (more challenging!) one of human-computer interaction and design.

The problem is this: as it stands, the web browser as an environment for applications is almost irredeemably broken. We’re used to thinking of the navigation controls (buttons, bookmarks menu, etc.) in the browser as first-class controls, while the widgets in the window are second-class. If you get somewhere you don’t want to be in the browser, you don’t hunt through the window for an escape hatch to the previous page provided by the site designer—you just click the “back” button. (But [consider this] does “forward” ever do anything useful or predictable?) But in doing that you’ve made a conscious choice between two different interfaces—that of the browser and that of the page. Which interface does what? Giving the page its own controls is like giving the road its own steering wheel.

(Actually, the “back” button has been broken since day 1 [or at least since the first time I used the Web, in 1994, via Mosaic]. Here’s an example.

Start at page A and click a link to go to page B. From B, click a link to go to page C. Then click the “back” button twice to return to the home page, A. Now click a link to go to page D, and try to return to page B via the “back” button. You won’t be able to! As the history menu will show, the only earlier page the browser still remembers is A; B and C were discarded the moment you navigated to D. The interface is broken because it’s unnecessarily confusing: the “back” button is trying to serve two different and incompatible purposes. It’s supposed to mean both “undo” and “go to a higher level in the hierarchy.” The latter doesn’t work, because a fundamental principle of Web ontology1 is that the web is a network, not a hierarchy; there’s only incidentally an “up” in hypertextspace! Further, if the browser saved your entire surfing history [for this session], and if “back” also meant “up a level,” what would it mean to click “back” while viewing a child page (e.g. C above, reached again via that full history)? Would you end up at B, the page you originally linked from, or D, the page you just jumped from?2 Clearly the only workable solution is for “back” to mean “undo,” and for the browser history to show every page visited, all on an equal footing. Or is it workable? It’d be nice for “forward” to mean “redo.” But what does it mean [just to give one of many available troubling examples] to undo the submission of a form?)
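For the morbidly curious, here is a minimal Python sketch of the bookkeeping behind that behavior. It is not any real browser’s code, just the standard cursor-over-a-list model, but it shows exactly where B and C fall off the edge of the world.

    class History:
        """Toy model of a browser history: a list of pages plus a cursor."""

        def __init__(self, start):
            self.pages = [start]
            self.index = 0

        def visit(self, page):
            # Following a link discards everything "forward" of the cursor...
            self.pages = self.pages[: self.index + 1]
            self.pages.append(page)
            self.index += 1

        def back(self):
            if self.index > 0:
                self.index -= 1
            return self.pages[self.index]

    h = History("A")
    h.visit("B")
    h.visit("C")
    h.back()        # back to B
    h.back()        # back to A
    h.visit("D")    # ...and here B and C are silently thrown away
    print(h.pages)  # ['A', 'D'] -- no amount of clicking "back" will reach B

The truncation in visit() is the whole problem: “back” moves a cursor over a single linear list, and that list is not a faithful record of where you have been.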

Perhaps the real problem is deeper. The web browser as such is a tool for accessing data. It may have grown far beyond its origins as a graphical Gopher, but it’s still, at heart, just a way to navigate a topology of discrete records (pages) in a huge (non-relational) database (the ’Net). But now we’re asked to think of the browser also as an environment in which to run applications. Applications and data, despite the promises of object-oriented programming (irrelevant anyway, since that’s a methodology of software architecture, not interface architecture3), are two entirely different kinds of entities. This means that a single program that tries to be both a data navigator and an application environment is inevitably going to have, as I just noted, an inconsistent, confusing, unfriendly interface. Blurring the distinction between applications and data under present interface standards only makes things worse. Why not remove the controls entirely and make the browser into, essentially, a terminal-emulator window for remote applications? Or why not go all the way in the other direction and make everything you work with on the computer part of a unified, modeless, totally data-centric interface, like Swyftware and the Canon Cat? (Actually, I’m less than half joking with that last rhetorical question—Jef Raskin’s legacy is the only viable hope I’ve yet seen for a truly new and truly better approach to the UI.)

Jesse James Garrett’s whitepaper that introduced the term “Ajax” posed as an important open question “Does Ajax break the back button?” I’d turn that around: Does the back button break Ajax? That is, is the Web 0.9 interface of the browser a vestigial impediment to writing applications that run well (meaning at the same usability level as traditional non-Web-based applications) in the browser window?


  1. E.g., as articulated in Chapter 1 of the Polar Bear Book. [back]
  2. The mirror image of this problem afflicts the implementation of the cd command in bash (the standard shell on Linux). If you are currently in directory X and follow symbolic link S to directory Y, then enter the command “cd ..”, you end up not in the parent of Y but in X again! Short of passing cd its -P flag or spelling out more of the path than just “..” (i.e. “parent of current”), there is no way to get to Z, Y’s actual parent. This is broken beyond belief. Look in any documentation for any command-line interface that includes the cd command (MS-DOS, VMS, Unix shells, whatever) and I guarantee you won’t find “cd ..” explained as meaning “undo.” For it to behave as such is horrifyingly inconsistent. “..” means “up one level in the hierarchy.” Symbolic links explicitly break the hierarchy, but that’s OK: they’re understood to be “hyperspace” shortcuts, like the Secret Passage across the board in Clue that takes the player on a third-dimensional trip outside the board-game Flatland. (A sketch contrasting the two behaviors follows these notes.) [back]
  3. And the dangers of the tendency of programmers, and of companies headed by programmers, to conflate the two are legion. [back]
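Here is the sketch promised in note 2 above: a minimal Python model of the scenario (directory names hypothetical, Unix-like system assumed), contrasting the physical “..” the filesystem actually provides with the logical one that bash’s default cd gives you.

    import os
    import tempfile

    # Z/Y is a real directory; X holds a symbolic link S that points at Z/Y.
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "Z", "Y"))
    os.mkdir(os.path.join(root, "X"))
    os.symlink(os.path.join(root, "Z", "Y"), os.path.join(root, "X", "S"))

    os.chdir(os.path.join(root, "X", "S"))  # "follow the symlink into Y"
    print(os.getcwd())                      # .../Z/Y -- Python reports the physical path

    os.chdir("..")                          # the kernel's "..": up one real level
    print(os.getcwd())                      # .../Z -- Y's actual parent

    # bash, by contrast, tracks a logical $PWD of .../X/S, so its default
    # "cd .." merely strips the last path component and lands you back in X
    # (the "undo" behavior complained about in note 2); only "cd -P .."
    # matches the physical traversal shown here.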