Thursday, June 24, 2010

Stymied by horrible interface design, not for the last time

I had an experience last night that was a textbook example of how not to design a user interface for a consumer product. (I've been told this post is too long: I think the anecdote is revealing, but skip to the last paragraph if you just want to read the moral.)

I was trying to connect a new DVD player to a high-definition flatscreen TV. The DVD player box said that it did automatic upconversion of DVDs (which normally, of course, are not high-def) to the highest-quality HD protocol, 1080p, so I was hopeful that the HDMI connection would provide excellent video quality.

I expected to be able to just connect the HDMI cable to both devices, set the TV to accept an input from the HDMI port, tweak a few setup parameters, and be off and running. Of course it didn't work that way. No picture. The DVD player had a series of LEDs on the front indicating what type of signal was being sent over the cable at that moment, and a button labeled "HDMI", duplicated on the remote, to select one. I tried pressing that button repeatedly. Still nothing. HDMI is a handshaking (i.e., two-way) protocol which should gracefully degrade if necessary from 1080p down to 480i (DVD quality), with stops in between if the hardware is capable of it, and I expected that at some point the DVD player and the TV would find a protocol on which they could agree; but apparently they didn't. I say "apparently" because neither device offered any kind of error indication other than a black screen!

I looked in the manual, and found that I was supposed to manually select HDMI output in a setup menu. In order to see the menu, I hooked up the player to the composite video/left-right audio jacks on the TV, and was able to get a (relatively low-quality) picture. I changed the output setting to HDMI, and changed the TV input back to match. Still no picture. I pressed the "HDMI" button on the front of the DVD player a few times, and watched the LEDs cycle through the available HDMI video options, but with no change (the TV displayed nothing but "HDMI Input," which apparently meant "I'm waiting for HDMI input" rather than "I detect it").

I went back to the composite connection on the TV and looked at the output setting in the DVD player setup menu again. "Composite" was selected—"HDMI" was still available, but was not selected as I had left it before. I made sure I was pressing the right button to save the settings when I left the setup menu, rather than exiting without actually making changes. I was. I tried the same routine a few more times, but the "HDMI" setting for DVD output just wouldn't take.

So here are a few tough questions:

  • Why couldn't the DVD player designers have included a single LED, maybe a red/green indicator of whether HDMI handshaking was successful or not?
  • Why an HDMI button on the machine (and the remote!) at all? The player ought to automatically output the highest quality signal that it can, at all times.
  • Similarly, why an output selection menu on the DVD player? Why can't all outputs on the player be active at all times?
  • Finally, why couldn't the TV, with much more real estate for messages, display something like "No compatible HDMI signal found" to assist with troubleshooting?
As it was, I made an educated guess, based on the limited evidence available, that the DVD player was faulty right out of the box and unable to actually generate an HDMI signal (not at all far-fetched given the general cheapness of consumer electronics these days): perhaps a poorly soldered connection to the output jack on the circuit board inside. But the frustrating hour spent arriving at that educated guess was worthy of user-interface guru Donald Norman, whose discussions of confusing electronics in cars, door handles shaped so that you don't know whether to push or pull them, and much more, are still 100% relevant twenty years after he began publishing them (and should be required reading for anyone involved in the design of end-user software or consumer products).

Why did I tell this long story on a blog primarily about software design for financial analysis? Bottom line: design affects productivity. The way things work affects what the user can get out of them, and that's always going to be true, whether you're producing an iPhone app, a spreadsheet, a rich web application, or whatever the Next Big Thing in tech is.

Wednesday, June 23, 2010

I'm overdue to learn Java

Part of my reason for writing this blog is to document my reëducation in programming (and, reciprocally, to encourage myself to take that process seriously). I've just ordered a general Java handbook for experienced programmers and a book specifically on graphics and audio programming. I'd like to port (or write shameless copies of, depending on your point of view) some classic 8-bit games to Java, with the aim of making them sufficiently platform-agnostic that they will run on mobile devices just as well as on desktop OSs. That little trackball on the BlackBerry is just crying out to be used to play Centipede, Missile Command, or maybe something a bit less well known like Crystal Castles.

I think my first significant Java program, however, is going to be a machine that plays Terry Riley's modern-classical composition In C (differently every time, of course). And there'll be some kind of kaleidoscopic graphical accompaniment showing which notes each "musician" is playing when. Yes, I'm aware of how ridiculous this may seem, but what's wrong with it as an exercise in generating graphics and sound in sync?

The last time I was paid to write software, it was a couple of medium-sized programs in C++ that ran in user space on Windows NT but worked hand-in-hand with a custom driver running (of course) in kernel space. One of those programs that I'm particularly proud of was an interpreter for a "little language" that enabled automatic testing of some proprietary vertical-market hardware. After that job, I taught C++ to undergrads for a while, with an emphasis on OOP fundamentals and how to apply them properly, not just on the syntax of the language. So I don't expect any trouble becoming fluent in Java, since the concepts of C++ are still rattling around in my head: multiple inheritance, operator overloading, virtual functions, references, and all that cool stuff. (I'm rusty on C++ syntax, but that will come back quickly with exposure if necessary.) Still, this will be far more of a brain-stretcher than the Perl code in the previous couple of posts—if the two can even be compared: in the Java case we're talking about brushing up on the effective design of a medium-sized program, while the tiny Perl web-scrapers I presented had a trivial structure (and what little structure there was, was procedural, and I've been doing procedural programming for thirty years).

Coming soon: my resume

Just what the title says! I'm open to offers of permanent or contract positions doing challenging, interesting work in any of the areas I touch on in this blog (see the subtitle above if what I mean isn't clear), as well as in related fields such as education, training, technical writing, etc.

I am particularly interested in interdisciplinary work such as business intelligence software development, data mining, or instructional design, but the bottom line is this: if, after reading some or all of this blog, it sounds like I might be interested in doing something for you, then I probably will be.

Monday, June 21, 2010

Another financial-data-mining/web-scraping/scripting exercise

Continuing the theme of the previous couple of entries, I'm thinking of writing a program, almost certainly in Perl, to scrape the options price and open interest data for any given stock from the Morningstar web site and analyze it in various ways. It seems to me that there are more ways, and sometimes simpler ways, than the traditional "greeks" to evaluate the resulting data set as a predictor of short- and medium-term stock prices.

Given the numbers on the Morningstar page, I should be able to compute:

  • A simple put/call ratio: total open interest of puts over total open interest of calls.

  • (Here's where my own ideas start) A put/call ratio where the open interest is weighted according to time until expiration—i.e., near-term options are given more weight since (maybe) they represent traders who have a larger stake in the game and thus are paying more attention to whether their bet will pay off.

  • A similar ratio, but with the open interest weighted by how far the strike price is out of the money.

  • Change in the price of the option and of the underlying stock. (Call these delta-o and delta-s to avoid confusion with the options greek called "delta.")
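The first three ratios in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout (type, oi, days, strike), the sample numbers, and both weighting formulas (1/days for time, and fractional distance out of the money for strike) are made up for the example, since nothing above pins down how the weights should be computed.

```perl
use strict;
use warnings;

# Hypothetical record layout, one hash per option series:
#   type   => 'put' or 'call'
#   oi     => open interest
#   days   => days until expiration
#   strike => strike price

# Weighted put/call ratio: sum(weight * oi) over puts divided by the same sum
# over calls. $weight is a code ref taking one option record.
sub put_call_ratio {
    my ($options, $weight) = @_;
    my %total = (put => 0, call => 0);
    for my $opt (@$options) {
        $total{ $opt->{type} } += $weight->($opt) * $opt->{oi};
    }
    return undef if $total{call} == 0;    # undefined with no weighted call interest
    return $total{put} / $total{call};
}

# Plain ratio: every contract counts equally.
sub simple_weight { return 1 }

# Assumed time weighting: near-term options count more (1 / days to expiration).
sub time_weight {
    my ($opt) = @_;
    return 1 / ($opt->{days} > 0 ? $opt->{days} : 1);
}

# Assumed strike weighting: the fraction by which the strike is out of the
# money, and zero for options at or in the money. Returns a closure bound to
# the current stock price.
sub otm_weight_for {
    my ($spot) = @_;
    return sub {
        my ($opt) = @_;
        my $distance = $opt->{type} eq 'call'
            ? $opt->{strike} - $spot
            : $spot - $opt->{strike};
        return $distance > 0 ? $distance / $spot : 0;
    };
}

# Made-up sample chain for a stock trading at 100.
my @chain = (
    { type => 'call', oi => 1000, days => 10, strike => 110 },
    { type => 'call', oi =>  500, days => 40, strike => 120 },
    { type => 'put',  oi =>  800, days => 10, strike =>  90 },
    { type => 'put',  oi =>  600, days => 40, strike =>  80 },
);

printf "simple:        %.3f\n", put_call_ratio(\@chain, \&simple_weight);
printf "time-weighted: %.3f\n", put_call_ratio(\@chain, \&time_weight);
printf "OTM-weighted:  %.3f\n", put_call_ratio(\@chain, otm_weight_for(100));
```

Passing the weighting in as a code ref keeps the summing logic in one place, so trying a different weighting scheme means writing one small function rather than another loop.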

The last item, the change in price of the option and of the underlying stock, should be graphed over time. There's a nice module in CPAN already to build 2-D graphs (line charts, bar charts, the usual) given an array of data, and output them as image files ready to be placed on a simple web page, as I did in the previous entry. But you have undoubtedly already noticed, Careful Reader, that there are really two dependent variables in the graph I just described. (Yes, the difference and the ratio between delta-o and delta-s are interesting, but the individual stats are also interesting in themselves.) This calls for a three-dimensional graph. I don't see any relevant library on CPAN, but I'm sure a little digging will turn up something I can adapt. (In fact, I already have the book Perl Hacks, which has a nice explanation of how to do bitmapped graphics in a window with SDL; so all I really need, if I'm recalling Computer Graphics 101 correctly, are the equations for a projection of a three-dimensional point onto a two-dimensional plane from a particular relative viewpoint.)

Having done this, I've got another idea for which a 3-D graph (preferably a dynamic one that you can "fly around" and look at from all sides) would be not just nice to have but mandatory: distance out of the money (y) and price (z) versus time to expiration (x). If the points are colored or shaded appropriately to indicate the sign and magnitude of delta-o, then a large solid-colored area would be a tipoff that a certain set of options with a similar strike price and time until expiration have gone up or down in asking price, quite possibly a significant leading indicator of the price trend of the underlying security.
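For reference, the projection of a three-dimensional point onto a two-dimensional plane can be sketched in a few lines using the simplest pinhole-camera model. The function name, the viewer position, and the focal distance below are all assumptions made for illustration. The sketch also assumes the viewer looks straight down the +z axis with no rotation, so a "fly around" view would need a rotation step applied to each point first.

```perl
use strict;
use warnings;

# Project a 3-D point onto a 2-D screen with a simple pinhole-camera model.
# Assumptions: the viewer sits at @$eye looking straight down the +z axis,
# unrotated, and the screen plane lies $focal units in front of the viewer.
sub project_point {
    my ($point, $eye, $focal) = @_;

    # Express the point relative to the viewer.
    my ($x, $y, $z) = map { $point->[$_] - $eye->[$_] } 0 .. 2;

    return if $z <= 0;    # at or behind the viewer: nothing to draw

    # Similar triangles: screen offset shrinks in proportion to depth.
    return ($focal * $x / $z, $focal * $y / $z);
}

my @eye = (0, 0, -5);     # hypothetical viewpoint
for my $p ([1, 1, 0], [1, 1, 5], [-2, 0.5, 15]) {
    my ($sx, $sy) = project_point($p, \@eye, 2) or next;
    printf "(%g, %g, %g) -> (%.3f, %.3f)\n", @$p, $sx, $sy;
}
```

Points that share x and y but lie farther away land closer to the center of the screen, which is what gives the plot its sense of depth.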

The analysis of options is always (and this is a truism, but a deep and important one) complicated by the fact that for every buyer of options, there's also a seller. The line "Most options expire worthless" is often given as an argument that options buyers are mostly ignorant speculators whose bets don't pan out. I don't buy that. (If that were the case, the usual reading of the put/call ratio would be backwards: we ought to expect that the buyers of puts are mostly wrong in their predictions, so the price of the underlying stock will go up; and the inverse for calls.) I'd bet that most options are in fact both bought and sold not in order to speculate on the options themselves but in order to hedge a trade of the underlying securities. If that's the case, then something would be wrong if most options didn't expire worthless. Most people rarely if ever make claims against their car insurance, either.

(Hey, there's even already a Perl module to do Black-Scholes options pricing...)

Wednesday, June 2, 2010

"Now seems like a good time," I said to myself...

..."to get those rusty programming skills going."

I had found myself wanting to do some analysis in Excel of price behavior of a large list of stocks.

I glanced at the first few pages of Perl and LWP, and then at the Regular Expressions Pocket Reference; I opened Firebug on the Yahoo Finance "summary page" for a stock I was interested in, so that I could see the raw HTML I was dealing with; and wrote the following [1]:

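use strict;
use warnings;
use LWP::Simple;

# Expects a list of security symbols on standard input, one per line.

# Address of the quote summary page, to which each symbol is appended.
# Left blank here; fill it in before running.
my $quote_url = "";

while (my $sym = <>) {
    chomp $sym;

    # Fetch the whole page into a string; get() returns undef on failure.
    my $summary = get($quote_url . $sym);
    die "Couldn't get Yahoo Finance Quote Summary page for symbol $sym!"
        unless defined $summary;

    # Each match captures the first decimal number that follows the label.
    my ($prevclose) = $summary =~ m/>Prev Close:<.*?>(\d+\.\d+)</;
    my ($open)      = $summary =~ m/>Open:<.*?>(\d+\.\d+)</;
    my ($last)      = $summary =~ m/>Last Trade:<.*?>(\d+\.\d+)</;

    # Assumed output format: one comma-separated line per symbol,
    # with an empty field for any value that was not found.
    print join(",", $sym, map { $_ // "" } $prevclose, $open, $last), "\n";
}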

It worked the first time—not bad for not having done any programming whatsoever for about five years and nothing of significant size for ten. (Yes, I know it's not very idiomatic Perl—combining the match regexps and doing a few other things would probably cut the line count in half.) That code took a few hours to produce, but subsequent similar programs to web-scrape other pages took much less time, now that I was in the groove.
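As a rough idea of what "combining the match regexps" might look like, here is a sketch that loops over the field labels with one parameterized pattern. The function name and the HTML fragment are invented for the example; the real page's markup may well differ, so treat the pattern as an assumption carried over from the first script rather than something tested against the live site.

```perl
use strict;
use warnings;

# Pull several labeled numbers out of a page with one parameterized pattern
# instead of one hand-written match per field.
sub extract_fields {
    my ($html, @labels) = @_;
    my %value;
    for my $label (@labels) {
        # Same shape as the patterns in the first script: ">Label:<", any
        # intervening tags, then a decimal number. A failed match leaves undef.
        ($value{$label}) = $html =~ m/>\Q$label\E:<.*?>(\d+\.\d+)</;
    }
    return %value;
}

# Made-up fragment standing in for the real page.
my $sample = '<th>Prev Close:</th><td>24.10</td>'
           . '<th>Open:</th><td>24.35</td>'
           . '<th>Last Trade:</th><td><b>24.80</b></td>';

my @labels = ('Prev Close', 'Open', 'Last Trade');
my %quote  = extract_fields($sample, @labels);
print join(',', map { $quote{$_} // 'N/A' } @labels), "\n";
```

Adding another field then means adding one more label to the list rather than another pair of lines.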

For example, more exciting was the following, which expects the same list of symbols:

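use strict;
use warnings;
use LWP::Simple;

# Expects a list of security symbols on standard input, one per line.

print "<html>";

while (my $sym = <>) {
    chomp $sym;

    print "<font size=5>$sym</font><br>";

    foreach my $period ("1d", "1w", "1m") {
        # Save each chart under the file name that the <img> tag below uses.
        # The chart address prefix is blank here; fill it in before running.
        getstore("" .
                 "s=$sym&t=$period&q=l&l=on&z=m&p=e5,e20&",
                 "$sym$period.png");
        print "<img src=\"$sym$period.png\"/>";
    }

    print "<br><br>\n";
}

print "</html>";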

What I'm doing here, if it isn't clear, is scraping a number of security price charts from Yahoo Finance, saving the image files locally, and building a crude but effective web page to make them viewable in one place. Beats looking at each stock by hand for price trends, let me tell you!

Now, all of this may seem like "Hello World" stuff to anyone reading this who's had any programming experience beyond Computer Science 101. But what I think shouldn't be taken for granted here is the amazing ability (taking the first script as the simpler example) to suck an entire web page into a string variable, search that string in a complex way, and output the result in a universally readable (i.e. by humans or other programs) format, all in just a few lines of code. We used to want applications to have built-in programming languages; now (and here's the takeaway!) we have programming languages with built-in applications: very-high-level functionality to do things that only applications used to be able to do. And we can do them in a scriptable, redirectable, programmatic way. Admittedly much of this is due to the straightforward API of the LWP module (Perlspeak for "library"); but I'd argue that that accessibility is a function of the design of Perl. There's obviously stuff going on there behind the scenes that would be much harder to write in a language without such integral support for string manipulations (C, say).

I was weaned as a programmer on 1980s consumer 8-bit machines, the multimedia powerhouses of their day, on which even in a high-level language (built-in BASIC), to do anything interesting you had to twiddle bits. And most of the software I've been paid to write has been low-level stuff in C—device drivers and the like. So I'm easily impressed and easily seduced by VHL (very-high-level) languages that let you do so much with so little typing. Of course there's danger inherent in only knowing high-level languages. When you don't understand what's really going on at the machine level, optimization can be much more difficult, for example. Ironically, though, even as undergraduate computer science programs deemphasize C and assembler skills and move their students towards Java, C#, .NET, PHP, and so on—preparing them more effectively for the kind of web-back-end-database-interface work 9 out of 10 of them will face as new programmers—even as this huge and largely unremarked shift in what it means to be a professional computer programmer takes place, hobbyists tinker with microcontrollers, programmed at as low a level as you want, to recapture some of that early-80s frontier-machine-code feeling. Some will call this retrograde or Luddite-ish but the truth is, I think, that controlling hardware directly with one's code fulfills some kind of deep need in the engineering personality to exercise maximum control over one's immediate universe; and there's nothing wrong with the practical experience gained thus: few programmers will ever write an operating system, true, but there will always be lesser software that needs to run "close to the metal." (A $5 pocket calculator will never run a Java interpreter, for instance. I think...)

Getting back to my own programming for my own use and profit, far more complicated and wonderful things will come in time. I'm comfortable using Perl for this kind of stuff, but have never written a program of any serious size in it. What little user-level software development I've done has been fairly strictly object-oriented code in C++. I only know how to use Perl procedurally; understanding the OOP features of the language, which seem to be highly regarded, would be a good thing to have under my belt.

On the other hand, I have had a strong hankering to learn Python, thanks to what seems to me to be a very elegant syntax. And I have the book A Primer on Scientific Programming with Python, which is an excellent tutorial for Python in general besides describing the appropriate libraries in detail. (I'm quite sure that somewhere on CPAN there are modules to support in Perl the same kind of computations I need to do: numerical integration and differentiation, curve fitting, linear regression, etc.) Lastly, the Beautiful Soup library looks like an even cleaner way to do webscraping.

  1. How the heck do you format code nicely (i.e. not just in a non-proportional font but also indented correctly, lines that overrun the margin indicated clearly, and with symbols correctly escaped) in idiomatic HTML these days? Yeah, I know there's the <pre> tag, but it doesn't help you with lines that run past the right edge of your text frame (or wherever your body text is going), and you still have to festoon your code with &whatever-entity tags to escape all the non-alphanumeric characters.