What Makes a Good Integration Developer?

Good Developers are Cheap

Over time, a number of statistics have shown that the top developers outperform the average developers (not even talking about the bad ones) by an order of magnitude or so. Since the top developers do not earn 10 times the average salary, these guys (and girls) are a steal. Every company should then aim to get hold of these "top quartile" developers, assuming the individuals demonstrate a minimum amount of social behavior. I have seen two main stumbling blocks in this chain of logic. First, a company may not want hot shot developers out of fear that such a person may unsettle the equilibrium of average developers (or because the boss hates people who are smarter than he is). Second, many companies do not know how to attract and evaluate hot shot developers. When it comes to assessing the quality of a developer, counting years of experience is an enticingly simple metric, but it is about as flawed as using lines of code as a metric for productivity.

Over time people have learned that years of experience with a certain technology are not a good predictor of a candidate's performance. Yes, applicable technology experience is important as a bottom bar, but once a person has 1 or 2 years of real experience using vendor X's tool or programming language Y, it is unlikely that we can find the great developer by asking for 5 years. Instead, we need to focus on how well the person understands the underlying concepts and principles, such as object-oriented design techniques. In recent years patterns have become almost the de facto standard for knowledge of these underlying principles. Interestingly, in this context patterns have almost become the victim of their own success. Every self-respecting developer can't help but spew out big words like factory, decorator, or observer. It's almost like developers brag about how many patterns they know -- an exercise not much more useful than counting lines of code. [oops -- this is starting to sound like a whole new topic -- back on track :-) ]

Skills Framework

So what can help us assess an individual's skill set, or the skills balance on a development team, more fairly? A few years ago I developed a simple skills framework to ensure the proper skills balance on an integration project. I started by creating a tiered framework that ranges from generic to specific. Here are the categories I came up with (the first six items form a hierarchy, while the last two stand on their own):

Computer Science / Algorithms
Modeling / Patterns
Languages
Development Tools
Packaged Applications
Operating Systems / Platforms
Architecture
Software Engineering / Best Practices

To illustrate the framework, a real person might know how to implement tree searches, sorts, or linked lists (or at the top of the food chain be familiar with Don Knuth's The Art of Computer Programming). On top of that the same person might understand object-oriented design principles and common design patterns, such as the proverbial Abstract Factory. He or she applies those patterns when programming in Java or C#, using the respective run-time libraries. A good knowledge of a refactoring IDE such as IntelliJ, Eclipse or Visual Studio (well, if you wait for the Whidbey release or use the Refactory add-on) certainly helps productivity. Since it is more and more likely that new applications interface with existing, packaged applications, we cannot ignore those either. So a good grasp of SAP IDocs or some other package API is useful. Lastly, being able to administer and work with the OS of choice (this should include things like file or network services) or the database platform rounds off the skill set of the developer. Architecture is a fuzzy and broad enough skill that I will dodge it for now, but most of us will agree that a good developer should understand best practices such as test-first development and refactoring.

I consider all levels of the framework important, and a good balance is certainly a strong asset. However, if I had to choose a single level it would have to be the programming models and associated patterns level. This is the skill that defines a developer's command of the underlying structures, design elements and constraints. In the object-oriented world, this would mean knowing how to achieve encapsulation and separation of concerns by using interfaces, inheritance, polymorphism, abstract base classes and the like. The interesting part is that in general the number of these constructs is relatively small, but the dos and don'ts (i.e. the common patterns) around these constructs are plentiful. These tend to be the areas of higher conceptual complexity, where 3 or 4 years of experience vs. 1 year of experience do make a big difference.
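
To make the "small number of constructs" point concrete, here is a minimal sketch in Python (chosen purely for brevity) that uses an abstract base class and polymorphism to separate what gets published from how it is formatted. The names OrderFormatter, CsvFormatter, TextFormatter and publish are made up for this illustration and do not come from any particular library.

```python
from abc import ABC, abstractmethod


class OrderFormatter(ABC):
    """Interface: callers depend on this, not on a concrete format."""

    @abstractmethod
    def format(self, order: dict) -> str:
        ...


class CsvFormatter(OrderFormatter):
    def format(self, order: dict) -> str:
        return f"{order['id']},{order['total']}"


class TextFormatter(OrderFormatter):
    def format(self, order: dict) -> str:
        return f"Order {order['id']}: {order['total']}"


def publish(order: dict, formatter: OrderFormatter) -> str:
    # Polymorphic call: publish() never needs to know which subclass it got.
    return formatter.format(order)
```

The constructs themselves take a few lines; deciding where to draw the interface boundary is the part that takes experience.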

The Integration Octopus

So how can we map all this back to integration? Like an application developer, an integration developer has to work with IDEs of sorts, do some coding and understand the packages that need to be integrated. When we examine the category "models and patterns" as it relates to integration, though, it appears to me that it requires a much broader range of knowledge. As a first attempt I tried to summarize the underlying models as follows:

Asynchronous Messaging Architectures
Process Modeling
Rule-based Systems
Network Architectures
Object-Oriented Development

To me this variety of underlying programming models is part of the reason integration is (and remains) so difficult (the other reason is the inherently large scale of the solutions). Let's look at each of those briefly.

Asynchronous Messaging Architectures

Many integration solutions are based on message queues and publish-subscribe channels. Likewise, there is also a considerable amount of energy around enabling Web services to support reliable, asynchronous messaging. Just like in the world of object-oriented systems, the basic constructs of this architecture are very simple: you deal with components, connected by message channels, which can buffer messages between sender and receiver. However, the correct use of such an architecture is a reasonably extensive topic (we managed to write a 700-page book about it!). One of the issues at the core of asynchronous messaging is, for example, the lack of a call stack, so the developer is responsible for keeping track of what needs to happen next once a method completes. For an extensive treatment of design decisions in the world of asynchronous messaging, have a close look at the patterns on this site or, better yet, get the book :-)
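
The missing call stack is easier to see in code. The following is only a rough sketch, assuming two in-memory queues as stand-ins for real message channels; the names request_channel, reply_channel, pending, send_credit_check, credit_service and handle_reply are all hypothetical. Because the sender does not block waiting for an answer, it has to store "what happens next" itself, keyed by a correlation identifier.

```python
import queue
import uuid

# In-memory stand-ins for message channels (assumption: a real solution
# would use a messaging system, not queue.Queue).
request_channel = queue.Queue()
reply_channel = queue.Queue()

# Replaces the call stack: correlation id -> state needed to continue later.
pending = {}


def send_credit_check(order_id, amount):
    """Send a request and remember what to do once the reply arrives."""
    correlation_id = str(uuid.uuid4())
    pending[correlation_id] = {"order_id": order_id, "next_step": "ship_order"}
    request_channel.put({"correlation_id": correlation_id, "amount": amount})
    return correlation_id


def credit_service():
    """Receiver: take one buffered request and publish a reply."""
    request = request_channel.get()
    reply_channel.put({
        "correlation_id": request["correlation_id"],
        "approved": request["amount"] <= 1000,
    })


def handle_reply():
    """Sender resumes: recover the stored state via the correlation id."""
    reply = reply_channel.get()
    state = pending.pop(reply["correlation_id"])
    if reply["approved"]:
        return f"{state['next_step']}:{state['order_id']}"
    return f"reject_order:{state['order_id']}"
```

Calling send_credit_check("A-17", 250), then credit_service(), then handle_reply() walks one message through the round trip. Nothing in handle_reply knows who sent the request except through the pending table.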

Process Modeling

Once multiple participants exchange messages, it is likely that each participant executes an internal process to track the state of the conversation. Most EAI suites allow us to model these processes using specialized modeling tools (which can be a lot of fun). Again the underlying constructs and constraints of process modeling are very different from traditional procedural or object-oriented development. Process models focus on parallel execution, concurrent tasks and synchronization points. Also, a process definition tends to be much more driven by data flow than by the association of data and behavior, as is common in the object-oriented world. If you want to find out more about the theoretical underpinnings of how these process languages (for example, BPEL) differ from traditional programming languages, you might want to have a look at the π calculus (hint: most traditional programming languages are based on the λ calculus).
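
As a loose illustration of parallel branches and a synchronization point, here is a small sketch that imitates a fork and join with Python's standard thread pool. It is not how a process engine works internally, and the activity names check_inventory, check_credit and order_process are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def check_inventory(order):
    # Hypothetical activity: enough stock for small quantities only.
    return {"in_stock": order["quantity"] <= 10}


def check_credit(order):
    # Hypothetical activity: orders up to 1000 pass the credit check.
    return {"credit_ok": order["total"] <= 1000}


def order_process(order):
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Fork: both activities are started without waiting for each other.
        inventory = pool.submit(check_inventory, order)
        credit = pool.submit(check_credit, order)
        # Synchronization point: result() blocks until that branch is done.
        facts = {**inventory.result(), **credit.result()}
    # The decision is driven by the data that flowed out of both branches.
    return "accept" if facts["in_stock"] and facts["credit_ok"] else "reject"
```

Note how the process is described by which data each step produces and consumes, not by objects that bundle data with behavior.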

Rule-based Systems

Rule-based systems are yet another animal with its own set of constructs, rules and constraints. Rule-based systems appear in integration in two primary spots: business rules and transformations. Business rules need little explanation -- they are commonly embedded in a process definition: "if the order is larger than $1000, it has to be approved by a supervisor first" type of stuff. As these rules become more complex, it makes sense to maintain them in a dedicated rules engine so that they are easier to test and maintain. Rules are usually not executed sequentially either. They are usually chosen by criteria such as specificity and priority. That means that the exact execution path of a rules engine might not be known at design time -- a scary thought for application developers who are used to being able to step through sequential code line by line. Some rule-based systems are based on predicates, sort of a relic from the AI days. The key is that predicate logic is declarative rather than procedural, which means that its correctness does not depend on the order of execution. The second home of rule-based systems is transformations a la XSLT. Anyone who has tried to debug an XSL stylesheet can appreciate the pitfalls of declarative programming. XSL "chooses" the next template based on pattern matching and specificity. Many EAI tools supply visual editors for transformation tasks, but as long as XSLT and XPath live underneath, developing and debugging these components will always feel unnatural and difficult to developers who are used to sequential, procedural logic.
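
To show what "chosen by specificity and priority" can look like, here is a toy sketch, not modeled on any real rules engine. The Rule class, the select_rule function and the sample rules are all hypothetical, and "specificity" is simply approximated by the number of conditions a rule carries.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    conditions: list  # predicates that must all hold for the rule to match
    action: str
    priority: int = 0


def select_rule(rules, order):
    """Pick the matching rule with the highest priority, then specificity."""
    matching = [r for r in rules if all(c(order) for c in r.conditions)]
    if not matching:
        return None
    # Declaration order plays no role; only priority and condition count do.
    return max(matching, key=lambda r: (r.priority, len(r.conditions)))


rules = [
    Rule("default", [], "auto_approve"),
    Rule("large_order", [lambda o: o["total"] > 1000], "supervisor_approval"),
    Rule("large_trusted",
         [lambda o: o["total"] > 1000, lambda o: o.get("trusted", False)],
         "auto_approve"),
    Rule("blocked", [lambda o: o.get("blocked", False)], "reject",
         priority=10),
]
```

With these rules, select_rule(rules, {"total": 1500}) picks large_order, while adding "trusted": True to the same order picks large_trusted instead. Which rule fires depends on the data, not on where the rule sits in the list, and that is exactly what makes stepping through such a system feel unfamiliar.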

Network Architectures

If all this computer science babble was not enough, integration architectures also have a lot to do with networks. Not only does a good knowledge of IP networking help with designing integration solutions (I would count that as a "platform skill"), but the underlying principles are different yet again. Now we need to concern ourselves with graphs, shortest paths, dynamic routing, and algorithms by Ford-Fulkerson, Dijkstra and friends.
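
For readers who have not looked at graph algorithms in a while, here is a compact sketch of Dijkstra's shortest-path algorithm using a binary heap. The function name shortest_path_costs and the dictionary-of-dictionaries graph layout are choices made for this example, and the edge weights are assumed to be non-negative.

```python
import heapq


def shortest_path_costs(graph, source):
    """Return the cheapest known cost from source to each reachable node.

    graph maps a node to {neighbor: non-negative edge weight}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry; a cheaper route was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


# Hypothetical four-node network with link costs.
network = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
}
```

Here the cheapest route from A to D goes through B and C (cost 4) rather than over the direct B-to-D link (cost 6), the kind of result a routing component has to get right.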

Object-Oriented Development

And last but not least, most integration solutions have significant code components that are best implemented using object-oriented languages and paradigms. So we expect the developer of these pieces to be fluent in OOAD, design patterns and related topics.

Conclusion

So does every integration developer worthy of a paycheck have to be fluent in predicates, π calculus, graph theory and asynchronous architectures? Surely not (actually, that assumption would surely put a lot of us out of work!). However, this little tour through the theoretical underpinnings of integration solutions hopefully leaves you with two thoughts. First, integration remains a complex field regardless of how many UI layers we put on top of it because the underlying programming models are inherently varied and complex. Second, the next time we look at a resume or compose a "wanted" ad we may go beyond the "X years with tool ABC" approach and see whether the candidates have at least a basic command of the underlying design principles and paradigms.