What Does It Mean to Use Messaging?

I was recently asked to help a team decide whether they should use messaging. Of course, I have not forgotten what I learned during many years in consulting: the consultant always answers "it depends." A good consultant can tell you what the answer depends on. And a truly great consultant can convince you that you are asking the wrong question in the first place.

Without hesitation, I told the team that they were asking the wrong question. Messaging is not an all-or-nothing decision, but one that happens at many different levels. With the limited time I had, I concluded that messaging has a place on at least three levels. There may well be more.

Transport Mechanism
Programming Model
Event-sourced Applications

Transport Mechanism

The first usage of messaging is as a transport mechanism in distributed systems. This is messaging as we described it in EIP. Using messaging in this context has many advantages, as we outlined in our introduction:

Remote Communication. Messaging enables separate applications to communicate and transfer data.
Platform/Language Integration. A messaging system can be a universal translator between applications that work with different languages and platforms.
Asynchronous Communication. Messaging enables a send-and-forget approach to communication. The sender does not have to wait for the receiver to receive and process the message; it does not even have to wait for the messaging system to deliver the message.
Variable Timing. Asynchronous communication allows the sender to fire off requests to the receiver at its own pace, and the receiver to consume the requests at its own, different pace. This allows both applications to run at maximum throughput and not waste time waiting on each other.
Throttling. A problem with remote procedure calls is that too many of them on a single receiver at the same time can overload the receiver. This can cause performance degradation and even cause the receiver to crash. Asynchronous communication enables the receiver to control the rate at which it consumes requests, so as not to become overloaded by too many simultaneous requests. The adverse effect on callers caused by this throttling is minimized because the communication is asynchronous, so the callers are not blocked waiting on the receiver.
Reliable Communication. Messaging provides reliable delivery because it can use a store-and-forward approach to transmitting messages with automatic retry.
Disconnected Operation. Some applications are specifically designed to run disconnected from the network, yet to synchronize with servers when a network connection is available. Messaging is ideal for these applications.
Mediation. A messaging system acts as a mediator, as in the Mediator pattern [GoF]. For example, if an application becomes disconnected from the others, it need only reconnect to the messaging system, not to all of the other messaging applications.
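
To make the store-and-forward idea from the list above a little more concrete, here is a minimal in-memory sketch in Go. The names Outbox, Forward, and flakyReceiver are made up for illustration, and a real messaging system would persist pending messages to disk rather than hold them in a slice.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a hypothetical unit of data handed to the messaging system.
type Message struct {
	ID   int
	Body string
}

// Outbox is a toy in-memory store for messages awaiting delivery.
// A real messaging system would persist them to survive a crash.
type Outbox struct {
	pending []Message
}

// Send stores the message and returns immediately (send-and-forget).
func (o *Outbox) Send(m Message) {
	o.pending = append(o.pending, m)
}

// Forward attempts to deliver every pending message, trying each one up to
// maxAttempts times. Undelivered messages stay in the store for a later call.
// It returns the number of messages delivered.
func (o *Outbox) Forward(deliver func(Message) error, maxAttempts int) int {
	delivered := 0
	var remaining []Message
	for _, m := range o.pending {
		ok := false
		for attempt := 1; attempt <= maxAttempts; attempt++ {
			if err := deliver(m); err == nil {
				ok = true
				break
			}
		}
		if ok {
			delivered++
		} else {
			remaining = append(remaining, m)
		}
	}
	o.pending = remaining
	return delivered
}

// flakyReceiver simulates a receiver that fails its first `failures` calls.
func flakyReceiver(failures int) func(Message) error {
	calls := 0
	return func(m Message) error {
		calls++
		if calls <= failures {
			return errors.New("receiver unavailable")
		}
		return nil
	}
}

func main() {
	box := &Outbox{}
	box.Send(Message{ID: 1, Body: "order created"})
	box.Send(Message{ID: 2, Body: "order shipped"})

	// The receiver is down for the first two attempts; the retry hides that
	// from the sender, which already returned from Send.
	n := box.Forward(flakyReceiver(2), 3)
	fmt.Println("delivered:", n, "still pending:", len(box.pending))
}
```

The sender only ever talks to the store. Whether the receiver is reachable at that moment is the forwarder's problem, which is also what makes disconnected operation workable.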

Programming Model

Messaging is not limited to physically distributed systems. It can be just as useful as a programming model to define the interaction between different parts of an application. Naturally, applications that include concurrent processing benefit the most from messaging. This fact is reflected by programming languages that are aimed at concurrent programming, such as Erlang (see also the Functions + Messages + Concurrency = Erlang presentation by Joe Armstrong) or, more recently, Go. Go includes channels, which can be buffered (allowing for asynchronous communication) or unbuffered (allowing for sync points). I am hopelessly behind on checking out Go, but I hope to get to play with it one of these days.
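
The difference between buffered and unbuffered channels can be shown in a few lines of Go. This is only a small sketch; the helper sendWithoutBlocking is a made-up name that uses select with a default case to probe whether a send can complete right away.

```go
package main

import "fmt"

// sendWithoutBlocking tries to put one value on ch and reports whether the
// send completed immediately, i.e. without a receiver having to be ready.
func sendWithoutBlocking(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	buffered := make(chan int, 1) // capacity 1: acts like a tiny queue
	unbuffered := make(chan int)  // capacity 0: sender and receiver must meet

	fmt.Println(sendWithoutBlocking(buffered, 1))   // true: the buffer holds the value
	fmt.Println(sendWithoutBlocking(buffered, 2))   // false: the buffer is full
	fmt.Println(sendWithoutBlocking(unbuffered, 1)) // false: no receiver is waiting

	// With an unbuffered channel, a plain send completes only when a receiver
	// takes the value, so the two goroutines synchronize at this point.
	done := make(chan struct{})
	go func() {
		fmt.Println("received", <-unbuffered)
		close(done)
	}()
	unbuffered <- 42 // blocks until the goroutine above receives
	<-done
}
```

The buffered channel behaves like a small in-process queue, while the unbuffered one forces a rendezvous between sender and receiver.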

Even though latency and unreliable communication are typically less of a concern in a non-distributed system, these systems share some of the benefits of a distributed message-based system. At the top of the list is likely the simplified interaction model. Sending (one-way) messages to other components makes for a clear separation and often provides a more convenient programming model than threads. Message-oriented APIs steer towards a data flow architecture, as opposed to threads, where the control flow tends to stand in the foreground. Which one better suits your application depends on the situation, but it's good to consider both options.
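
As a rough illustration of the data flow style, the following Go sketch wires three components together purely through one-way messages. The stage names source, upper, and collect are invented for this example; the point is that the program reads as "data flows from here to there" rather than as a sequence of calls and locks.

```go
package main

import (
	"fmt"
	"strings"
)

// source emits each input string as a one-way message, then closes the channel.
func source(items []string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, s := range items {
			out <- s
		}
	}()
	return out
}

// upper is a stage that transforms every message it receives.
func upper(in <-chan string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for s := range in {
			out <- strings.ToUpper(s)
		}
	}()
	return out
}

// collect drains the final channel into a slice.
func collect(in <-chan string) []string {
	var result []string
	for s := range in {
		result = append(result, s)
	}
	return result
}

func main() {
	// Each stage runs concurrently; the only coupling is the message type.
	fmt.Println(collect(upper(source([]string{"order", "invoice"}))))
}
```

No stage knows who produces its input or who consumes its output, so stages can be added or swapped without touching the others.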

Event-sourced Applications

So far we have looked at messaging at the infrastructure (plumbing) and API layers. But a quite different use of messaging can be found in the application domain layer. If all interactions with your domain layer occur through messages (or events – let's postpone the discussion about the difference), you can gain some unique features that would be otherwise difficult to obtain. The best description I have seen is Martin Fowler's event-sourced application.

The short version is that by representing all system updates as a stream of messages over time, you are able to go back in time and play back a modified stream of events, thus allowing you to perform "what if" calculations. A prime example is payroll computation. Let's assume an error was made during data entry, causing a worker to report an incorrect number of hours worked. The system recorded 39 hours worked, but the person actually worked 41 hours. By the time the error is detected, the pay period has already been closed and (incorrect) wages have been paid. The system is now tasked with computing how much the employee should have been paid. This may not be as simple as computing 2 hours' wages and adding them to the next payroll because a number of business rules may impact the amount to be paid. For example, by working 41 hours the employee may be eligible to be paid overtime, or to receive additional vacation or comp time. Computing the differential can therefore be quite complicated. However, resetting the system to a "snapshot" of the previous pay period and replaying the correct series of events (including the actual hours worked) can compute the correct system state, taking into account all business rules. Resolving the error is now reduced to checking the difference between the system state based on the correct message stream and the system state based on the actual, but incorrect, message stream.
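
The payroll scenario can be sketched in Go as follows. Everything specific here is an assumption made up for the example: the HoursReported event, the $20.00 hourly wage, and a single business rule that pays hours beyond 40 at 1.5 times the normal rate. Amounts are kept in cents to avoid floating-point rounding.

```go
package main

import "fmt"

// HoursReported is a hypothetical event: an employee reports hours for one day.
type HoursReported struct {
	Day   string
	Hours int
}

// State is the payroll state for one employee in one pay period.
type State struct {
	HoursWorked int
	PayCents    int
}

const (
	rateCents       = 2000 // assumed hourly wage: $20.00
	overtimeAfter   = 40   // assumed overtime threshold in hours
	overtimePercent = 150  // assumed overtime multiplier: 1.5x
)

// apply returns the state after one event, applying the assumed overtime rule.
func apply(s State, e HoursReported) State {
	s.HoursWorked += e.Hours
	regular, overtime := s.HoursWorked, 0
	if regular > overtimeAfter {
		overtime = regular - overtimeAfter
		regular = overtimeAfter
	}
	s.PayCents = regular*rateCents + overtime*rateCents*overtimePercent/100
	return s
}

// replay starts from a snapshot and applies the events in order.
func replay(snapshot State, events []HoursReported) State {
	s := snapshot
	for _, e := range events {
		s = apply(s, e)
	}
	return s
}

func main() {
	snapshot := State{} // state at the close of the previous pay period
	actual := []HoursReported{{"Mon", 8}, {"Tue", 8}, {"Wed", 8}, {"Thu", 8}, {"Fri", 7}}
	corrected := []HoursReported{{"Mon", 8}, {"Tue", 8}, {"Wed", 8}, {"Thu", 8}, {"Fri", 9}}

	paid := replay(snapshot, actual)    // what the system computed
	owed := replay(snapshot, corrected) // what it should have computed
	fmt.Println("hours:", paid.HoursWorked, "->", owed.HoursWorked)
	fmt.Println("difference in cents:", owed.PayCents-paid.PayCents)
}
```

Under these assumed rules the replay yields a difference of 5000 cents, whereas naively adding 2 hours at the normal rate would give 4000. The overtime rule never had to be handled as a special case; it was simply applied again during the replay.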

Financial systems are often based on a similar concept: instead of simply setting the account value to $100, each change (aka message) is logged in the system. This way one can review what transactions (credits and debits) led to the current system state (the account balance).
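
A minimal sketch of such a ledger in Go might look like this. The Account and Transaction types are hypothetical; the key property is that the balance is never stored or set directly, only derived from the log.

```go
package main

import "fmt"

// Transaction is one logged change to the account. Positive amounts are
// credits, negative amounts are debits (all amounts in cents).
type Transaction struct {
	Description string
	AmountCents int
}

// Account keeps only the log; the balance is derived from it.
type Account struct {
	log []Transaction
}

// Record appends a change to the log; nothing is ever overwritten.
func (a *Account) Record(description string, amountCents int) {
	a.log = append(a.log, Transaction{description, amountCents})
}

// Balance derives the current state by summing every logged change.
func (a *Account) Balance() int {
	total := 0
	for _, t := range a.log {
		total += t.AmountCents
	}
	return total
}

// History returns a copy of the log for review.
func (a *Account) History() []Transaction {
	return append([]Transaction(nil), a.log...)
}

func main() {
	acct := &Account{}
	acct.Record("opening deposit", 15000)
	acct.Record("rent payment", -5000)

	// The log explains how the account arrived at its balance of $100.
	for _, t := range acct.History() {
		fmt.Printf("%-16s %7d\n", t.Description, t.AmountCents)
	}
	fmt.Println("balance in cents:", acct.Balance())
}
```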

As we can see, using messaging is not an all-or-nothing proposition. It can span different layers, such as transport, API, or the business domain layer. But even within one layer, it's not binary either. For an interesting discussion, see Udi Dahan's blog post on CQRS, aka Command-Query Responsibility Segregation. In this architectural model, updates are performed via command messages, whereas reads are performed synchronously.
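
To give a rough feel for that split, here is a small in-process Go sketch. It is not taken from Udi Dahan's post; the Service type, the RenameCustomer command, and the use of a channel as the command queue are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// RenameCustomer is a hypothetical command message: a request to change state.
type RenameCustomer struct {
	ID      int
	NewName string
}

// Service accepts commands on a channel and answers queries directly.
type Service struct {
	commands chan RenameCustomer
	done     chan struct{}

	mu    sync.RWMutex
	names map[int]string // the state that queries read
}

// NewService starts the goroutine that processes command messages.
func NewService() *Service {
	s := &Service{
		commands: make(chan RenameCustomer, 16),
		done:     make(chan struct{}),
		names:    make(map[int]string),
	}
	go s.handleCommands()
	return s
}

// handleCommands applies command messages one at a time, in arrival order.
func (s *Service) handleCommands() {
	defer close(s.done)
	for cmd := range s.commands {
		s.mu.Lock()
		s.names[cmd.ID] = cmd.NewName
		s.mu.Unlock()
	}
}

// Send queues a command and returns as soon as it is queued; it blocks only
// if the buffer is full. It does not wait for the command to be applied.
func (s *Service) Send(cmd RenameCustomer) { s.commands <- cmd }

// Name is a synchronous query: the caller gets the answer immediately.
func (s *Service) Name(id int) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	name, ok := s.names[id]
	return name, ok
}

// Shutdown stops accepting commands and waits until all queued ones are applied.
func (s *Service) Shutdown() {
	close(s.commands)
	<-s.done
}

func main() {
	svc := NewService()
	svc.Send(RenameCustomer{ID: 1, NewName: "Ada"})
	svc.Send(RenameCustomer{ID: 1, NewName: "Ada Lovelace"})
	svc.Shutdown() // in this sketch, the simplest way to know the commands were applied

	name, ok := svc.Name(1)
	fmt.Println(name, ok)
}
```

A query issued right after Send may still see the old state, because the command has only been queued. That trade-off is part of what makes the decision non-binary even within a single layer.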