Gregor's Ramblings

Mashup Camp

July 18, 2007

ABOUT ME
Gregor Hohpe
Hi, I am Gregor Hohpe, co-author of the book Enterprise Integration Patterns. I like to work on and write about asynchronous messaging systems, service-oriented architectures, and all sorts of enterprise computing and architecture topics. I am also the Chief Architect at Allianz SE, one of the largest insurance companies in the world.
RAMBLINGS
Here I share my thoughts and ideas on a semi-regular basis. I call these notes "ramblings" because they are typically based on my personal opinions and observations as opposed to official "articles".

I attended the first day of Mashup Camp today. The event took place at the Computer History Museum, which is actually within walking distance of my office. Ironically, on the one day of the year when something actually happens within walking distance in Silicon Valley, it had to rain. Oh well. On the upside, my walk took me past the temporary Kwik-E-Mart.

This was my first Mashup Camp, so I mostly came to listen and get plugged into this community. It's part of my master plan to bring integration ideas to a whole new user group. So as not to embarrass myself completely, I contributed some tutorials on Google Mashup Editor and Yahoo! Pipes to code.google.com.

The camp drew a good 100 to 150 people, which is a good size for an open space unconference. Apparently the Mashup University held during the past two days drew only about two dozen people, which is disappointing. Maybe people are intimidated by the hipster mashup crowd, or it's impossible to take time off from work to learn how to plot your favorite bars onto a Google map? Or maybe building mashups has become so easy that there is no need for a university? I guess we can only speculate or do a better job advertising the event next year.

Here are my impressions of day one:

Vendor Presence

I was impressed and a bit surprised by the number of vendors at the conference. About 7 or 8 vendors had set up a table, including the usual heavyweights like IBM, Sun, and Google (I was hoping for some Yahoo! Pipes folks) as well as mashup pure plays like Dapper. The overall feel I got was that the mashup community is a bit confused about whether they want to be baggy-pants-wearing renegades scraping data off big corporate sites for the greater good of mankind or the next generation of VC-funded Internet startups. I honestly expected more discussion and sharing as opposed to vendor demos, but some of the demos were actually interesting.

I was a little put off by the openkapow guy who claimed that they had solved all issues of brittleness in HTML screen scraping without being able to say exactly how. The demo seemed to use a fully qualified XPath into the DOM node of interest, the slashes cleverly replaced with dots. Not so robust. I am sure the tool can do much better than that, but I did not expect to see that much hand waving and corporate attitude at camp. They one-upped this only when someone asked who their competitors were and was told, "I don't know. Can you name some?" Come on, dude, this is not MBA Camp but is supposed to be a community event. Plus, if you don't know your competitors you should not be running a business in the first place. Well, at least they are making money, which after some prompting turned out to be revenue money as opposed to profit money. I think the last time we brushed off the difference as a minor detail was in 2000. Besides the attitude, though, the tool looked pretty powerful.
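To illustrate why a fully qualified path is brittle, here is a minimal Python sketch. The two page snippets and both path expressions are made up for illustration; they are not taken from the demo and say nothing about how openkapow works internally. The sketch only shows that a position-based path breaks as soon as the page gains one extra element, while a path anchored on a stable attribute survives.

```python
import xml.etree.ElementTree as ET

# Hypothetical page, before and after a small redesign.
PAGE_V1 = """<html><body>
  <div><h1>Shop</h1></div>
  <div><span id="price">19.99</span></div>
</body></html>"""

# Same page after a banner <div> was inserted at the top.
PAGE_V2 = """<html><body>
  <div>Banner</div>
  <div><h1>Shop</h1></div>
  <div><span id="price">19.99</span></div>
</body></html>"""

POSITIONAL_PATH = "./body/div[2]/span"   # fully qualified, depends on element order
ANCHORED_PATH = ".//span[@id='price']"   # depends only on a stable attribute


def extract(page, path):
    """Return the text of the first node matching path, or None if nothing matches."""
    root = ET.fromstring(page)
    node = root.find(path)
    return node.text if node is not None else None


for name, page in (("v1", PAGE_V1), ("v2", PAGE_V2)):
    print(name, "positional:", extract(page, POSITIONAL_PATH),
          "anchored:", extract(page, ANCHORED_PATH))
```

On the redesigned page the positional path lands on the wrong `div` and finds nothing, whereas the anchored path still returns the price. Real-world HTML is rarely well-formed XML, so an actual scraper would need a forgiving HTML parser; the standard-library XML parser is used here only to keep the sketch self-contained.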

Voice Integration

I noticed a surprising variety of voice integration, possibly spurred by the attendance of folks from Lignup. The ability to have your computer call you is intriguing because the system includes an IVR voice prompt ability. Hmmm, now I can create my own customer service menus? Press "1" if you are aggravated, "2" if you want to kill the service rep. I think voice also has liabilities because it is a synchronous medium (at least if you want the interactivity). Also, getting 50 alert e-mails is easier to deal with than getting 50 alert calls. Still, I like to have the option of integrating voice into my applications. Google local voice, for example, is great and the voice recognition is amazingly good.

End User License Agreements

Mashup Camp partly felt like a graffiti artist convention. The performers know that what they do is not quite legal but it is cool. Until they try to make a living off it. The issue of Web site end user agreements became most apparent when one of the guys decided to use his "Clipper" tool to take the Google search page, replace the Google logo with the Yahoo! logo, and clip the ads off the results page. To his shock, Google had asked him to take down some of these hosted "services". While it may seem so corporate and anal for sites to not allow you to scrape their screens, we need to remember that some of them pay for the data, so there rarely is a free lunch. Even in the dot-com run-up someone paid for all those free services and parties. Some of my friends are still writing off their $3,000-per-year capital losses. It'll be interesting to see how the issue of data ownership will evolve. Ads in RSS? Micropayments? As mashups become more popular (especially the HTML scraping ones) I am certain the issue will come up. HTML screen scraping is always going to be controversial because sites that like to share their data already provide feeds (e.g., most of the Web 2.0 crowd or marketplaces like craigslist). The ones that don't probably don't want you to scrape their data off their (probably ad-funded) site. Luckily, inside the enterprise this is less of an issue.

Integration Tools

The guys from IBM and SnapLogic showed mashup data processing tools that share some similarities with early EAI tools (TIBCO MessageBroker comes to mind). Similar to Yahoo! Pipes, they allow users to compose data pipelines that extract, transform, and combine data. The resulting data stream is then available in a variety of formats, for example RSS and Atom. The demo combined data from Salesforce and QuickBooks.
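The extract, transform, combine idea can be sketched in a few lines of Python. This is only an illustration of the pipeline shape, not how the IBM or SnapLogic tools are built: the two record lists are made-up stand-ins for a CRM and an accounting system, and the function names, feed title, and example.com link are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up records standing in for the two data sources.
CRM_ACCOUNTS = [
    {"account_id": "A1", "name": "Acme Corp"},
    {"account_id": "A2", "name": "Globex"},
]
INVOICES = [
    {"account_id": "A1", "amount": "120.50"},
    {"account_id": "A1", "amount": "79.50"},
    {"account_id": "A2", "amount": "300.00"},
]


def extract(source):
    """Extract step: stream records from a source one at a time."""
    yield from source


def transform(invoices):
    """Transform step: turn the amount strings into numbers."""
    for inv in invoices:
        yield {"account_id": inv["account_id"], "amount": float(inv["amount"])}


def combine(accounts, invoices):
    """Combine step: join both streams on account_id and total the amounts."""
    totals = {}
    for inv in invoices:
        totals[inv["account_id"]] = totals.get(inv["account_id"], 0.0) + inv["amount"]
    for acct in accounts:
        yield {"name": acct["name"], "total": totals.get(acct["account_id"], 0.0)}


def to_rss(records, title):
    """Output step: render the combined records as a small RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = "http://example.com/invoices"  # placeholder
    ET.SubElement(channel, "description").text = "Invoiced totals per account"
    for rec in records:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = rec["name"]
        ET.SubElement(item, "description").text = f"Invoiced total: {rec['total']:.2f}"
    return ET.tostring(rss, encoding="unicode")


feed = to_rss(
    combine(extract(CRM_ACCOUNTS), transform(extract(INVOICES))),
    "Invoices by account",
)
print(feed)
```

Each stage is a generator that consumes the previous one, which mirrors the way pipes are wired together visually in these tools. Swapping the output step for an Atom writer would leave the rest of the pipeline untouched.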

Enterprise Mashups

I joined a good discussion on enterprise mashups. This was not a demo but more of a brainstorm. We concluded that it is not so simple to distinguish between mashups and composite applications. We did compile the following list to help contrast them:

  • Mashups: REST/XML, ad-hoc, bottom-up, easy to change, low expectations, built by user
  • Composite Apps: SOA/WS-*, planned, top-down, more static, (too) high expectations, built by IT

I think part of the big shift here is that in the past ad-hoc solutions were not integratable and integratable solutions were cumbersome to build. Mashups are often built ad-hoc, but simple protocols like RSS or Atom allow them to be integrated very easily. One person rightly pointed out that enterprise mashups are the Visual Basic macro of the Internet age.

Mashup Patterns

I really enjoyed talking to the guy who built a mashup across Amazon and public library Web sites. He keeps a wish list on Amazon and checks the public library to see whether they have the book. It was refreshing to hear about some of the actual issues and hurdles he had to overcome (unlike most of the vendors, who make you believe it is a piece of cake once you use their tool). For example, Amazon data works by ASIN, which for books equals the ISBN. As such, it cannot recognize the relationship between the same work published in different editions or as a book on tape.

Another interesting hurdle was the fact that you cannot search Amazon by UPC for non-book products. However, you can get the UPC once you have the ASIN. To go from UPC to ASIN the solution uses Google search. Because Amazon data is so widely syndicated, a search for the UPC plus the word "ASIN" tends to turn up the ASIN in the search results. To minimize errors the solution then takes the discovered ASIN and retrieves the UPC from Amazon to make sure the match is correct.

I think these techniques could make for some more widely applicable mashup patterns:

Key Translation with Search

Problem: You have a key but require a different key for a database lookup.

Forces: Popular keys are likely to be stored on Web pages in proximity to each other. Getting a key translation database will be difficult or impossible.

Solution: Search for the key you have and the name of the desired key.

I tried this approach to perform geocoding. A simple search for "San Francisco geo:long" yields the following text in the first result: geo:lat=37.759665 geo:long=-122.421509. A quick check in Google Earth confirms that the coordinates do in fact point to San Francisco. Maybe this is a trivial example, but it makes me want to try this with more types of keys.
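A rough Python sketch of the pattern, using the geocoding example above. The search call is deliberately left as a pluggable function: fake_search below is a canned stand-in that returns the text quoted above, not a real search API, and the function names and the regular expression are assumptions made for this illustration.

```python
import re

# Matches text such as "geo:lat=37.759665 geo:long=-122.421509".
GEO_PATTERN = re.compile(r"geo:lat=(-?\d+(?:\.\d+)?)\s+geo:long=(-?\d+(?:\.\d+)?)")


def translate_key(known_key, wanted_key_name, search, pattern):
    """Search for the known key plus the NAME of the wanted key, then pull
    the wanted key's value out of the first result snippet that matches."""
    query = f"{known_key} {wanted_key_name}"
    for snippet in search(query):
        match = pattern.search(snippet)
        if match:
            return match.groups()
    return None


def fake_search(query):
    """Hypothetical stand-in for a search engine call; returns canned snippets."""
    if query == "San Francisco geo:long":
        return ["geo:lat=37.759665 geo:long=-122.421509"]
    return []


coords = translate_key("San Francisco", "geo:long", fake_search, GEO_PATTERN)
print(coords)
```

The extracted values come back as strings, and nothing in this step guarantees that they are correct: the first matching snippet wins, which is why the result still deserves an independent check.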

Lookup Confirmation

Problem: You can find key B from key A but not the other way around. You have an unreliable method to look up A from B.

Solution: Perform the unreliable lookup from B to A. Then use the precise lookup from A to B to make sure you got the right A.
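In code, the pattern boils down to a loop over unreliable candidates with a precise round-trip check. The following Python sketch assumes two pluggable lookup functions; the catalog, the noisy search, and all identifiers are made-up placeholders rather than real ASINs, UPCs, or API calls.

```python
def confirmed_lookup(b, guess_a_from_b, exact_b_from_a):
    """Return the first candidate A whose precise A -> B lookup gives back b,
    or None if no candidate survives the check."""
    for candidate in guess_a_from_b(b):
        if exact_b_from_a(candidate) == b:
            return candidate
    return None


# Hypothetical precise direction: A (think ASIN) -> B (think UPC).
CATALOG = {"ASIN-0001": "UPC-111", "ASIN-0002": "UPC-222"}


def noisy_search(upc):
    """Hypothetical unreliable direction: returns plausible but unverified
    candidates, here always the same list with a wrong guess first."""
    return ["ASIN-0002", "ASIN-0001", "ASIN-9999"]


print(confirmed_lookup("UPC-111", noisy_search, CATALOG.get))
print(confirmed_lookup("UPC-999", noisy_search, CATALOG.get))
```

The wrong first guess is rejected because its precise lookup returns a different B, and a B that no candidate maps back to yields None instead of a silent mismatch.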

These patterns could benefit from some more elaboration, but I think documenting some of the approaches will be useful for mashup developers.

More Later

Time to go to bed to be fresh for another day at camp.



Gregor is the Chief IT Architect of Allianz SE. He is a frequent speaker on asynchronous messaging and service-oriented architectures and co-authored Enterprise Integration Patterns (Addison-Wesley). His mission is to make integration and distributed system development easier by harvesting common patterns and best practices from many different technologies.
www.eaipatterns.com