I attended the first day of Mashup Camp today. The event took place at the Computer History Museum, which is actually walking distance from my office. Ironically, on the only day of the year when something actually happens within walking distance in Silicon Valley, it had to rain. Oh well. On the upside, my walk took me past the temporary Kwiki Mart.
This was my first Mashup Camp so I mostly came to listen and get plugged into this community. It's part of my master plan to bring integration ideas to a whole new user group. To not embarrass myself completely I contributed some tutorials on Google Mashup Editor and Yahoo! Pipes to code.google.com.
The camp drew a good 100 – 150 people, which is a good size for an openspaces unconference. Apparently the Mashup University held during the past 2 days drew only about two dozen people, which is disappointing. Maybe people are intimidated by the hipster mashups crowd or it's impossible to take time off from work to learn how to plot your favorite bars onto a Google map? Or maybe building mashups has become so easy that there is no need for a university? I guess we can only speculate or do a better job advertising the event next year.
Here are my impressions of day one:
I was impressed and a bit surprised by the number of vendors at the conference. About 7 or 8 vendors had set up a table, including the usual heavyweights like IBM, Sun, Google (I was hoping for some Yahoo! Pipes folks) as well as mashup pureplays like Dapper. The overall feel I got was that the mashup community is a bit confused about whether they want to be baggy-pants-wearing renegades scraping data off big corporate sites for the greater good of mankind or whether they want to be the next generation of VC-funded Internet startups. I honestly expected more discussion and sharing as opposed to vendor demos, but some of the demos were actually interesting.
I was a little put off by the openkapow guy who claimed that they solved all issues of brittleness in HTML screen scraping without being able to say exactly how. The demo seemed to use a fully qualified XPath into the DOM node of interest, the slashes cleverly replaced with dots. Not so robust. I am sure the tool can do much better than that but I did not expect to see that much hand waving and corporate attitude at camp. They one-upped this only when someone asked who their competitors were and was told, "I don't know. Can you name some?" Come on, dude, this is not MBA Camp but is supposed to be a community event. Plus, if you don’t know your competitors you should not be running a business in the first place. Well, at least they are making money, which after some prompting turned out to be revenue money as opposed to profit money. I think the last time we brushed off the difference as a minor detail was in 2000. Besides the attitude, though, the tool looked pretty powerful.
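To illustrate why a fully qualified path is brittle, here is a minimal sketch. It is not how openkapow works (I don't know the internals); the two page snippets, the `price` id, and the `extract` helper are made up for illustration. The same value is pulled out once with an absolute path and once by looking for an element with a known id, before and after the site wraps its content in one extra div.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before and after a minor redesign (one extra wrapper div).
OLD_PAGE = "<html><body><div><span id='price'>42</span></div></body></html>"
NEW_PAGE = "<html><body><div><div><span id='price'>42</span></div></div></body></html>"

ABSOLUTE_PATH = "./body/div/span"       # spells out every step from the root
BY_ID_PATH = ".//span[@id='price']"     # only relies on the element's id


def extract(page, path):
    """Return the text of the first node matching path, or None if nothing matches."""
    node = ET.fromstring(page).find(path)
    return node.text if node is not None else None


print(extract(OLD_PAGE, ABSOLUTE_PATH))  # finds the value
print(extract(NEW_PAGE, ABSOLUTE_PATH))  # None: the path no longer matches
print(extract(NEW_PAGE, BY_ID_PATH))     # still finds the value
```

Anchoring on an id is of course only robust as long as the site keeps the id, so this reduces the brittleness rather than solving it.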
I noticed a surprising variety of voice integration, possibly spurred by the attendance of folks from Lignup. The ability to have your computer call you is intriguing because the system includes an IVR voice prompt ability. Hmmm, now I can create my own customer service menus? Press "1" if you are aggravated, "2" if you want to kill the service rep. I think voice also has liabilities because it is a synchronous medium (at least if you want the interactivity). Also, getting 50 alert e-mails is easier to deal with than 50 alert calls. Still, I like to have the option of integrating voice into my applications. Google local voice for example is great and the voice recognition is amazingly good.
Mashup Camp partly felt like a graffiti artist convention. The performers know that what they do is not quite legal but it is cool. Until they try to make a living off it. The issue of Web site end user agreements became most apparent when one of the guys decided to use his "Clipper" tool to take the Google search page and replace the Google logo with the Yahoo! logo and clip the ads off the results page. To his shock, Google had asked him to take down some of these hosted "services". While it may seem so corporate and anal for people to not allow you to scrape their screens we need to remember that some of them pay for the data, so there rarely is a free lunch. Even in the dot-com run-up someone paid for all those free services and parties. Some of my friends are still writing off their $3000 per year capital losses. It'll be interesting to see how the issue of data ownership will evolve. Ads in RSS? Micropayments? As mashups become more popular (especially the HTML scraping ones) I am certain the issue will come up. HTML screen scraping is always going to be controversial because sites that like to share their data already provide feeds (e.g., most of the Web 2.0 crowd or marketplaces like craigslist). The ones that don't probably don't like you to scrape their data off their (probably ad-funded) site. Luckily, inside the enterprise this is less of an issue.
The guys from IBM and SnapLogic showed mashup data processing tools that share some similarities with early EAI tools (TIBCO MessageBroker comes to mind). Similar to Yahoo! Pipes they allow users to compose data pipelines that extract, transform, and combine data. The data stream is then available in a variety of formats, for example RSS and Atom. The demo combined data from SalesForce and Quickbooks.
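To give a feel for what such a pipeline boils down to, here is a rough sketch in plain Python. It does not use any of the vendors' tools or APIs; the record layouts, the `join_on` and `to_rss` helpers, and the example.com link are all assumptions for illustration. Two already-extracted record sets are combined on a shared key and published as a small RSS 2.0 feed.

```python
import xml.etree.ElementTree as ET


def join_on(key, left, right):
    """Combine records from two sources that share a key field (inner join)."""
    right_by_key = {record[key]: record for record in right}
    return [{**record, **right_by_key[record[key]]}
            for record in left if record[key] in right_by_key]


def to_rss(title, link, description, items, item_title, item_description):
    """Render the combined records as an RSS 2.0 document (returned as a string)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item_title(item)
        ET.SubElement(node, "description").text = item_description(item)
    return ET.tostring(rss, encoding="unicode")


# Made-up stand-ins for data extracted from a CRM and an accounting system.
crm_accounts = [{"customer": "Acme", "owner": "Pat"},
                {"customer": "Globex", "owner": "Lee"}]
invoices = [{"customer": "Acme", "balance": 1200.0},
            {"customer": "Initech", "balance": 50.0}]

combined = join_on("customer", crm_accounts, invoices)
feed = to_rss("Open balances", "http://example.com/", "Accounts with open invoices",
              combined,
              lambda r: r["customer"],
              lambda r: f"Owner {r['owner']}, balance {r['balance']:.2f}")
print(feed)
```

Only customers that appear in both sources end up in the feed. The tools shown at camp let you wire up these steps graphically instead of writing them by hand.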
I joined a good discussion on enterprise mashups. This was not a demo but more of a brainstorm. We concluded that it is not so simple to distinguish between mashups and composite applications. We did compile the following list to help contrast them:
I think part of the big shift here is that in the past ad-hoc solutions were not integratable and integratable solutions were cumbersome to build. Mashups are often built ad-hoc but simple protocols like RSS or Atom allow them to be integrated very easily. One person rightly pointed out that enterprise mashups are the Visual Basic macro of the Internet age.
I really enjoyed talking to the guy who built a mashup across Amazon and public library Web sites. He keeps a wish list on Amazon and checks the public library to see whether they have the book. It was refreshing to hear about some of the actual issues and hurdles he had to overcome (unlike most of the vendors, who make you believe it is a piece of cake once you use their tool). For example, Amazon data for books works by ASIN, which for books equals the ISBN. As such, it cannot recognize the relationship between the same works published in different editions or as books on tape.
Another interesting hurdle was the fact that you cannot search Amazon by UPC for non-book products. However, you can get the UPC once you have the ASIN. To go from UPC to ASIN the solution uses Google search. The wide syndication of Amazon data enables a search for the UPC and the word "ASIN" to include the ASIN in the search results. To minimize errors the solution uses the discovered ASIN and retrieves the UPC from Amazon to make sure the match is correct.
I think these techniques could make for some more widely applicable mashup patterns:
Key Translation with Search
Problem: You have a key but require a different key for a database lookup.
Forces: Popular keys are likely to be stored on Web pages in proximity. Getting a key translation database will be difficult or impossible.
Solution: Search for the key you have and the name of the desired key.
I tried this approach to perform geocoding. A simple search for "San Francisco geo:long" yields the following text in the first result: geo:lat=37.759665 geo:long=-122.421509. A quick check in Google Earth confirms that the coordinates do in fact point to San Francisco. Maybe this is a trivial example but it makes me want to try this with more types of keys.
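Here is a minimal sketch of Key Translation with Search, using the geocoding example. The `translate_key` function and its parameters are names I made up, and `fake_search` is a canned stand-in for a real Web search call (no actual search API is used here), so treat this as an outline rather than a finished implementation.

```python
import re
from typing import Callable, Iterable, List, Optional


def translate_key(known_key: str, target_name: str, pattern: str,
                  search: Callable[[str], Iterable[str]]) -> Optional[str]:
    """Search for the key we have plus the name of the key we want.

    Returns the first value in the result snippets that matches `pattern`
    (its first capture group), or None if no snippet contains one.
    """
    query = f"{known_key} {target_name}"
    regex = re.compile(pattern)
    for snippet in search(query):
        match = regex.search(snippet)
        if match:
            return match.group(1)
    return None


def fake_search(query: str) -> List[str]:
    """Stand-in for a Web search; returns canned result snippets."""
    return ["San Francisco travel guide",
            "geo:lat=37.759665 geo:long=-122.421509"]


longitude = translate_key("San Francisco", "geo:long",
                          r"geo:long=(-?\d+(?:\.\d+)?)", fake_search)
print(longitude)  # -122.421509
```

Passing the search function in as a parameter keeps the pattern independent of any particular search engine. The result is only as good as the first matching page, which is where the next pattern comes in.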
Lookup Confirmation
Problem: You can find key B from key A but not the other way around. You have an unreliable method to look up A from B.
Solution: Perform the unreliable lookup from B to A. Then use the precise lookup from A to B to make sure you got the right A.
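A small sketch of Lookup Confirmation, under the assumption that the unreliable lookup may return several candidates. The function name `confirmed_lookup`, the catalog, and the UPC and ASIN values are all invented for illustration; a real version would call the search engine and the Amazon lookup instead.

```python
from typing import Callable, Iterable, Optional


def confirmed_lookup(b: str,
                     unreliable_a_from_b: Callable[[str], Iterable[str]],
                     precise_b_from_a: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return the first candidate A whose precise reverse lookup yields b again."""
    for candidate in unreliable_a_from_b(b):
        if precise_b_from_a(candidate) == b:
            return candidate
    return None


# Hypothetical precise direction: ASIN -> UPC.
CATALOG = {"B000HYPO01": "000000000017",
           "B000HYPO02": "000000000024"}


def noisy_search(upc: str):
    """Stand-in for the unreliable search; the wrong ASIN comes back first."""
    return ["B000HYPO02", "B000HYPO01"]


asin = confirmed_lookup("000000000017", noisy_search, CATALOG.get)
print(asin)  # B000HYPO01
```

The wrong first candidate is rejected because its UPC does not match, and a UPC with no confirmed candidate yields None instead of a bad guess.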
These patterns could benefit from some more elaboration but I think documenting some of the approaches will be useful for mashup developers.
Time to go to bed to be fresh for another day at camp.