APIs Are for Humans, Too


When graphical user interfaces became popular, developers loved them because GUIs are cool and fun, and they make software easier to use. But creating good GUIs is not easy and often requires user interface designers, developers writing abstraction layers, and so on. From the view of a development manager this equates to Dollars (well, at the current exchange rate US Dollars don't matter that much, so let's assume it equates to a real currency like Euros or British Pounds). Naturally, it was not long before vendors decided to capitalize on that market opportunity. Why don't you create the GUI from your business objects or your database tables? If a customer has 10 fields, we can simply render a screen with text boxes and labels for people to enter that data. If an order form has multiple line items, we can create a multi-row grid input element. As expected, to a drowning man any straw seems good enough, so people jumped on the bandwagon.

When the buzz wore off, they realized that the silver bullets were in fact made from lead (and failed to kill any werewolves). Data structures (neither objects nor tables) do not make good user interface elements unless you are building a simple data entry application (despite all the jokes, Microsoft Access can claim some success in that category). I like to remind people that in the real world people do not set their address, their last name, or their bank account balance. Instead, they move, get married (or divorced), or transfer money into their account. User interfaces are all about intuition, guiding the user, and helping the user build a mental model, not about setting fields on business objects.
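To make the "people move, they don't set their address" point concrete, here is a minimal Python sketch. Everything in it (`Customer`, `Address`, `Account`, and the method names) is hypothetical and invented for illustration. The idea is only that the interface names the real-world event, and the field changes follow from it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Address:
    street: str
    city: str


class Customer:
    """Hypothetical customer that exposes life events rather than field setters."""

    def __init__(self, last_name: str, address: Address) -> None:
        self.last_name = last_name
        self.address = address
        self.history = []  # records what happened, not just the new values

    def move(self, new_address: Address) -> None:
        # The caller says "I moved"; updating the address field is a consequence.
        self.history.append(f"moved from {self.address.city} to {new_address.city}")
        self.address = new_address

    def marry(self, new_last_name: Optional[str] = None) -> None:
        # Getting married may or may not change the last name.
        self.history.append("got married")
        if new_last_name is not None:
            self.last_name = new_last_name


class Account:
    """Hypothetical account; the balance is kept in cents."""

    def __init__(self, balance: int = 0) -> None:
        self.balance = balance

    def transfer_to(self, other: "Account", amount: int) -> None:
        # Nobody "sets" a balance; money moves from one account to another.
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
        other.balance += amount
```

A screen built on top of `move` or `transfer_to` can ask for what the user actually wants to do, whereas a screen generated from the fields can only offer text boxes for `address` and `balance`.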

Soon people shuddered at the mere thought of generating user interfaces from the code (who would ever do that?!) and went on to hire interaction designers, information architects and the like to build nicely usable applications. Of course, then the Web came around and GUIs started to look worse than the stuff generated from the code. But I digress. This is supposed to be about APIs after all.

Half a decade later we discovered that we can not only have users browse over the Internet but also make machines talk to each other. Web services and SOA became the latest buzzwords, and it is all about APIs now. Once again we need to create a new type of interface, but this time it is one for machines. Instead of drop-down controls and drag-and-drop interactions we deal with schemas, WSDL, XML parsers and the like. Once again, this requires new skills and new Dollars (or British Pounds or Euros). Invariably the thought of just generating this interface from your objects springs to mind. If we could only generate this stuff, we wouldn't have to hire people specializing in all the Web services hoopla. This time it is even more appealing. Generating stuff that is meant for actual users seems kinda hard. But generating a programming interface? That must be much easier. With predictable certainty, dozens of vendor offerings and open source frameworks generate WSDL and whatnot from Java classes and just about anything else that looks like application code.
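As a rough illustration of what "just generate it" yields, the Python sketch below derives a list of operations mechanically from the fields of a business object. It is a toy and not how any particular WSDL generator works. The `Order` class and the `generate_operations` helper are made up for this example. Only `dataclasses.fields` is a real standard-library call.

```python
from dataclasses import dataclass, fields


@dataclass
class Order:
    # Hypothetical business object, invented for this sketch.
    customer_id: int
    status: str
    total_cents: int


def generate_operations(cls) -> list:
    """Derive an 'API' mechanically: one getter and one setter per field."""
    operations = []
    for field in fields(cls):
        operations.append(f"get_{field.name}")
        operations.append(f"set_{field.name}")
    return operations


if __name__ == "__main__":
    for name in generate_operations(Order):
        print(name)
```

The generated list mirrors the data structure exactly (`get_status`, `set_status`, and so on), and nothing in it tells a consuming developer how to place or cancel an order.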

Let's not forget, though, that APIs are also consumed by humans. Some developer (yes, developers are humans!) reads that documentation or sample code and builds an application that interacts with yours. Most importantly, this developer forms a mental model in much the same way a GUI user does. Anyone interacting with your system sees it as a big black box with a few knobs and levers, i.e. API methods or buttons. Based on the observed behavior (and the invariably terse documentation) they form a mental model of what goes on inside the box. They use that model to predict what will happen and are surprised (or frustrated, or aggravated) when the actual result does not match the expected result.

I was reminded of how difficult it is to build a good model when Aaron Geman came to Google to show his TinkerThink Puzzle. His puzzle is not even in a black box but a transparent one, and the controls are limited to one crank and two or three handles. Based on these controls, pinballs are routed through the maze, the goal being to channel one ball into each of the three output lanes. According to Aaron, it takes many people an hour to solve this puzzle (I managed to get 2 balls in within 10 minutes and decided to quit while I was ahead). Now compare this to an API for a complex system. The inner workings, consisting of hundreds of classes, are reduced to a few dozen levers and cranks, leaving the user wondering which lever they have to pull to get the desired effect. And they can't even look inside.

Now, I am not the first person to advocate contract-first design of public interfaces. Yet there is still surprisingly little work telling us how to build intuitive programming interfaces. Kevlin Henney and Josh Bloch have shared some of their experience. For example, Josh popularized the notion of the conceptual weight of an interface. I would like to see more concrete guidance on designing Web services interfaces that allow developers to build consistent and accurate mental models, which in turn make using the API intuitive. But first we have to get people off the "auto-generate the interface" Kool-Aid.
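For readers who have not seen contract-first design in code, here is one possible sketch in Python, using `typing.Protocol` to write the interface down before any implementation exists. The `OrderService` contract, its three operations, and the in-memory implementation are all assumptions made for illustration. A Web services contract would normally be expressed in WSDL and schemas, not in Python.

```python
from typing import Dict, List, Protocol


class OrderService(Protocol):
    """Hypothetical contract, written first and from the caller's point of view."""

    def place_order(self, customer_id: int, item_ids: List[str]) -> str:
        """Place an order and return its order ID."""
        ...

    def cancel_order(self, order_id: str) -> None:
        """Cancel a previously placed order."""
        ...

    def order_status(self, order_id: str) -> str:
        """Return 'placed' or 'cancelled'."""
        ...


class InMemoryOrderService:
    """One possible implementation; its internals are free to change later."""

    def __init__(self) -> None:
        self._status: Dict[str, str] = {}
        self._next_number = 1

    def place_order(self, customer_id: int, item_ids: List[str]) -> str:
        if not item_ids:
            raise ValueError("an order needs at least one item")
        order_id = f"order-{self._next_number}"
        self._next_number += 1
        self._status[order_id] = "placed"
        return order_id

    def cancel_order(self, order_id: str) -> None:
        if order_id not in self._status:
            raise KeyError(order_id)
        self._status[order_id] = "cancelled"

    def order_status(self, order_id: str) -> str:
        return self._status[order_id]


def client_code(service: OrderService) -> str:
    # The client depends only on the contract, never on the implementation class.
    order_id = service.place_order(customer_id=42, item_ids=["sku-1"])
    service.cancel_order(order_id)
    return service.order_status(order_id)
```

The point of the ordering is that the three operations were chosen for the developer who will call them, and the classes behind them had to adapt to the contract, not the other way around.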

P.S.: If you are wondering what Big G is doing in APIs, have a look. More on that later...