July 27, 2007
Mashups pull data from different sources, aggregate and transform the data to be used in different contexts. EAI solutions pull data from different sources, aggregate and transform the data to be used in different contexts. Huh?
In a previous post I promised to talk a bit more about Google products, especially now that Google has a variety of offerings in the Web 2.0 space. As great as maps are, the Maps API is now just one of 34 (!) public Google APIs and developer tools. Since I've been playing with mashups recently, what could be a better fit than the Google Mashup Editor (GME)? GME provides a browser-based development environment that allows developers to create HTML pages that incorporate data from external sources. My GME tutorial on Google Code describes how to build a simple GME application that uses data from my public Google speaking calendar (the calendar is a little sparse right now as I am planning my next round of conference appearances). Google Calendar data is available as an Atom feed, which GME can read. My GME application calls the Google Maps API to enrich the feed with longitude and latitude data based on the conference location. Finally, it plots the results on a map. Not earth-shattering, but building this type of application five years ago, before the advent of geocoding, RSS, Atom, and Google Maps, would have been very time-consuming at best. The presence of standard protocols and powerful tools makes this type of integration relatively easy.
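To make the enrichment step concrete, here is a minimal sketch in Python of what "enrich the feed with longitude and latitude" means. The geocoder is a stand-in lookup table; the actual application calls the Google Maps API, and all names here are illustrative, not part of GME.

```python
# Stand-in for a geocoding service; the real application queries
# the Google Maps API. Entries here are illustrative.
GEOCODE_TABLE = {
    "San Francisco, CA": (37.7749, -122.4194),
    "Berlin, Germany": (52.5200, 13.4050),
}

def geocode(location):
    """Return (lat, lng) for a location string, or None if unknown."""
    return GEOCODE_TABLE.get(location)

def enrich_entries(entries):
    """Add 'lat'/'lng' keys to each calendar entry with a known location."""
    enriched = []
    for entry in entries:
        coords = geocode(entry.get("location", ""))
        if coords:
            entry = dict(entry, lat=coords[0], lng=coords[1])
        enriched.append(entry)
    return enriched

# Entries as they might look after parsing the Atom calendar feed.
entries = [{"title": "Conference talk", "location": "Berlin, Germany"}]
result = enrich_entries(entries)
```

The enriched entries can then be handed to whatever plots them on the map; the point is simply that enrichment is a self-contained step with the feed as input and output.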
Did I say "integration"? Yes, you heard correctly. Nowadays the term "EAI" is certainly more uncool than anything Web 1.0, but the problems of integration have not gone away. Let's look at the GME application I just described from a higher level of abstraction, say at the level of integration patterns:
The integration patterns help us decompose the behavior of our solution into individual patterns. Using the Pipes-and-Filters style, we can implement each pattern as its own component and compose the solution from the individual pieces. The composability of the pipes-and-filters approach is one of its biggest strengths, and it hinges on the availability of a universal message channel. As we all know, the Internet is not message-oriented (although I argued before that it can be considered event-driven), but not all is lost. The most common integration point on the Web is XML over HTTP, especially when using standard formats like RSS. So could we use RSS as our message channel and compose the solution from independent components? Last time I showed how to use Yahoo! Pipes to extract XML data from Amazon's e-Commerce Service and convert it into an RSS feed. Pipes can do a lot more, including geocoding an incoming RSS feed. In this role, Pipes enriches the data stream with geo data, i.e., longitude and latitude, and makes the enriched stream available as an RSS feed, our universal integration point. Naturally, GME applications can read geo-enriched RSS feeds and display the data on a map without a single line of code. The new solution looks as follows (sorry for the 90-degree rotation):
My new tutorial on Google Mashup Editor and Yahoo! Pipes describes how to build this solution step-by-step.
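The composability argument is easy to demonstrate in a few lines of Python. In this sketch each filter is just a function that takes a list of feed items and returns a new list, so any filters can be chained in any order; the filter names and the fake geocoding are illustrative, not part of the Pipes or GME APIs.

```python
def fetch(items):
    """Source end of the pipe: in a real pipeline, this would fetch RSS."""
    return list(items)

def geocode_filter(items):
    """Enrich each item with (stand-in) coordinates."""
    return [dict(i, lat=0.0, lng=0.0) for i in items]

def sort_by_title(items):
    return sorted(items, key=lambda i: i["title"])

def pipeline(*filters):
    """Compose filters left to right into a single function."""
    def run(items):
        for f in filters:
            items = f(items)
        return items
    return run

feed = [{"title": "b"}, {"title": "a"}]
process = pipeline(fetch, geocode_filter, sort_by_title)
out = process(feed)
```

Because every filter consumes and produces the same shape of data (the "universal message channel" role that RSS plays on the Web), swapping, adding, or reordering filters never requires changing the others.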
Using Pipes in this role reminds me of the VETO pattern: Validate, Enrich, Transform, Operate, which is commonly used in EAI and ESB solutions (e.g., see Dave Chappell's description of the pattern). It sounds like a number of our friends from the EAI world are still useful in the world of Web 2.0 and mashups.
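For readers who have not run into VETO, here is a toy sketch of the four stages applied to feed items. The stage contents (a hard-coded enrichment, a field rename, delivery to a list) are purely illustrative stand-ins for the real validation, geocoding, and delivery logic.

```python
def validate(item):
    """Reject items missing the fields downstream stages need."""
    return "title" in item and "location" in item

def enrich(item):
    """Stand-in for geocoding; real code would call a geocoder."""
    return dict(item, lat=52.52, lng=13.405)

def transform(item):
    """Reshape the item into the format the consumer expects."""
    return {"name": item["title"], "geo": (item["lat"], item["lng"])}

def operate(item, sink):
    """'Operate' here simply delivers the item to a target list."""
    sink.append(item)

def veto(items):
    sink = []
    for item in items:
        if not validate(item):
            continue  # invalid messages drop out of the stream
        operate(transform(enrich(item)), sink)
    return sink

out = veto([{"title": "Talk", "location": "Berlin"},
            {"title": "no location, fails validation"}])
```

Note how each stage stays independent, which is exactly what makes the pattern easy to rearrange or extend, whether inside an ESB or a mashup tool.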
In my post about Mashup Camp I speculated that the difference between mashups and composite applications is more about the approach than the technology, so it is not surprising that some of the same patterns and techniques seem familiar. This fact appears to be confirmed by vendors with extensive integration and ETL experience now dabbling in the mashup space (more about that next time). While we do not want to sound like the CORBA people in an SOA world ("we built all that 10 years ago!"), it does mean that some of the old challenges are likely to still haunt us. For example, token and ID management can already become a challenge for mashups. Semantic mismatches and incompatible data formats are a staple of data integration and are not likely to vanish.
However, I think we have learned a few things in the meantime. Because integration is difficult, mashups set expectations lower and are generally less ambitious. They provide small, high-value point solutions as opposed to trying to achieve enterprise-wide integration nirvana. Moreover, after realizing that syntactic schema definitions à la WSDL solve only a very small portion of the semantic mismatch issues, mashups put the whole notion of schema aside and just look at an example message. Yahoo! Pipes makes this very clear when you use the "Rename" module. This module is the equivalent of the omnipresent EAI transformation editor. Rather than requiring a schema, the module samples the incoming stream and gives you a selection of all the fields it saw. Is a field required or optional, a string or a number? Pipes does not care much, because it does not have to. Does this mean schemas and data formats are a dumb idea? No: if you are building a B2B integration with your business partners, a rigid definition of all message formats and protocols (beyond schema) is an excellent idea (hello, RosettaNet). Tools like Pipes highlight that the schema approach is a pretty big hammer, and that you can get pretty far with a much smaller hammer that is also easier to carry around.
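The "sample the stream instead of reading a schema" idea is simple enough to sketch. The following Python is my own illustration of the approach, not how Pipes actually implements its Rename module: inspect a handful of messages, collect whatever field names show up, and rename by example.

```python
def sample_fields(messages, limit=10):
    """Collect the field names observed in the first `limit` messages.

    No schema involved: whatever fields appear in the sample are
    offered to the user, required/optional and types be damned.
    """
    fields = set()
    for msg in messages[:limit]:
        fields.update(msg.keys())
    return fields

def rename(messages, mapping):
    """Rename fields per mapping; fields not in the mapping pass through."""
    return [{mapping.get(k, k): v for k, v in m.items()}
            for m in messages]

msgs = [{"title": "a", "loc": "Berlin"}, {"title": "b"}]
seen = sample_fields(msgs)
renamed = rename(msgs, {"loc": "location"})
```

The second message lacks a `loc` field and nothing breaks, which is precisely the smaller-hammer trade-off: no guarantees, but also no up-front schema negotiation.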
The IT world is known for its short memory (and attention span). So let's remember what we learned in EAI and not reinvent everything from scratch. Repackaging some of the old stuff can be fun and gets attention. In the EAI world I would have never written a tutorial explaining how to add two data fields to a message stream...