Gregor's Ramblings

Serverless Loan Broker @ AWS, Part 5: Integration Patterns with CDK

Jan 25, 2022

Hi, I am Gregor Hohpe, co-author of the book Enterprise Integration Patterns. I like to work on and write about asynchronous messaging systems, service-oriented architectures, and all sorts of enterprise computing and architecture topics. I am also an Enterprise Strategist at AWS.

My blog posts related to IT strategy, enterprise architecture, digital transformation, and cloud have moved to a new home: ArchitectElevator.com.

In Part 4 of this mini-series we automated the serverless Loan Broker using the CLI (just for kicks), AWS CloudFormation, and the Serverless Application Model (SAM). That was a solid step forward, but by using an object-oriented language and CDK, the Cloud Development Kit, we can go further and build abstractions that express the intent of our distributed composition. If that reminds you of patterns, especially asynchronous messaging patterns, you are spot on!

This is the fifth installment of the Serverless Loan Broker mini-series. If you landed on this page first, here's a quick recap:

CDK - IaC for Grown-Ups

As we saw in Part 4, traditional automation tools favor document-oriented languages, represented in YAML or JSON syntax, occasionally augmented by simple functions to reduce repetition by means of loops or templates. Still, as your application grows, the resulting structures are likely to grow unwieldy. That's why recent automation tools like Pulumi, AWS CDK, or CDK for Terraform encode cloud automation as libraries for popular object-oriented languages like Java, Python, or TypeScript.

Because these libraries correspond to the same base constructs as document-oriented languages (which in turn map to the underlying cloud APIs), the transition to tools like CDK is relatively smooth. For example, creating an EventBridge target to invoke a Lambda function looks as follows in CDK for TypeScript (when checking the CDK reference docs, make sure you're viewing the up-to-date v2):

const fn = new lambda.Function(this, ...);
const rule = new events.Rule(this, 'rule', { eventPattern: { detail: { "field": "value" } } } );
rule.addTarget(new targets.LambdaFunction(fn));

The first line creates a Lambda Function, followed by an event rule that triggers when incoming messages contain a certain field value. The last line connects the rule to the function, instructing EventBridge to invoke the function for any message that matches the rule. The rule itself has to be associated with an EventBus, which is set to the default bus if none is specified (as is the case here).

lambda.Function is part of the aws_lambda package, whereas targets.LambdaFunction is specific to defining an EventBridge target that happens to be a Lambda function and is appropriately part of the aws_events_targets package. I am wondering whether a factory method and some overloading would have saved us from having to use a separate class for each type of event target, but then different targets have different options, which might break our polymorphism or leave us with a generic and untyped props collection, which isn't any better.

CDK Constructs

Being able to use a proper IDE, auto-complete, and all the OO language features that have grown so dear to our hearts is a definite step forward. However, wouldn't it suit us to actually use those freshly rediscovered language features to provide some abstractions, like what we do when writing applications?

Sure enough, CDK defines three levels of constructs:

Although the higher-level CDK constructs provide a welcome mechanism for reducing the amount of code and options that need to be set, they don't really provide much abstraction: the new constructs either correspond to a single resource or to a combination of multiple (usually two) resources. The former means that we still code automation at the cloud resource level, with the additional constructs largely providing convenience functions. The latter significantly reduces the amount of code to be written as it can pre-wire multiple resources, but it falls short of establishing a new vocabulary for us to describe the solution architecture. It also quickly ends up as an n-squared effort, as each pairing of resources needs its own construct.

Given that it's an object-oriented library, I feel that CDK could go further towards implementing actual design patterns and establishing a pattern language that's distinct from the AWS resources.

CDK constructs use the language of the atomic resources. Although that's what chemists do by naming molecules "(h)aitch-two-oh", the rest of us prefer to call it "water". That kind of vocabulary would be a huge addition to CDK constructs.

So, let's see if we can "raise the bar" a bit.

Design Pattern: Circuit Breaker

A commonly used design pattern in distributed systems is the Circuit Breaker. A Circuit Breaker prevents a slow or partially failing component from compromising the whole system by detecting such a situation and "tripping". Once tripped, it immediately returns an error code, so consuming components no longer send repeated requests that time out and unnecessarily consume resources. While in the open state, the Circuit Breaker probes the failing component at a reasonable interval to detect the component's recovery and return to the normal state (users of asynchronous messaging might find this mechanism strangely unnecessary).
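
To make the state transitions concrete, here's a minimal in-memory sketch of the classic pattern in TypeScript. It isn't taken from any library: the class name, the failureThreshold and probeIntervalMs parameters, and the injectable clock are all assumptions made for illustration, and the protected call is synchronous for brevity.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Hypothetical, minimal Circuit Breaker; not a library API.
class CircuitBreaker<T> {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;

  constructor(
    private readonly call: () => T,            // the protected operation
    private readonly failureThreshold: number, // consecutive failures before tripping
    private readonly probeIntervalMs: number,  // wait time before probing again
    private readonly now: () => number = () => Date.now(),
  ) {}

  invoke(): T {
    if (this.state === "open") {
      if (this.now() - this.openedAt < this.probeIntervalMs) {
        // Fail fast without touching the failing component.
        throw new Error("circuit open");
      }
      // Probe interval elapsed: let a single request through.
      this.state = "half-open";
    }
    try {
      const result = this.call();
      // Success (including a successful probe) returns to normal operation.
      this.state = "closed";
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures++;
      // A failed probe or too many failures (re)opens the circuit.
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw err;
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}
```

Wrapping a call is then a matter of new CircuitBreaker(() => callService(), 3, 30000), where callService and the numbers are placeholders. Passing the clock in as a function keeps the sketch testable without waiting for real time to pass.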

There's an implementation of an EventBridge Circuit Breaker in CDK by AWS Hero Matt Coulter (the example of Google being down is rather humorous). The implementation uses DynamoDB as a simple time-series database to track the number of errors that occurred in the last 60 seconds (using Timestream might be overkill but would be an interesting exercise). If the error count exceeds a given threshold, the Circuit Breaker (a Lambda function in this case) trips and returns instant errors instead of calling the external service over and over again. The message flow is as follows:

The pattern code (GitHub) combines Python (or TypeScript) CDK code and Lambda functions written in TypeScript. The Context-Based Router pattern (the web page only shows an abridged version of the pattern; full-text in the book) that routes messages based on external state is implemented in a custom Lambda function. It'd be interesting to see if it could also be implemented in EventBridge.

I noticed that this pattern implementation doesn't actively probe the service to see if it's still broken but rather relies on the sliding time window that "expires" past errors. On a service that receives consistent traffic this might lead to the faulty service receiving a burst of ERROR_THRESHOLD requests about every 60 seconds. It could also be a nice enhancement to expose the variability points, e.g. thresholds, external service ARN, etc., as parameters. The code is open source, so I guess I am invited to submit a pull request :-) .
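
The burst effect is easy to reproduce with a small simulation. The sketch below is not Matt's code; it is a hypothetical in-memory stand-in for the DynamoDB-backed error count, with an assumed ERROR_THRESHOLD of 3, a 60-second window, one request per second, and an external service that always fails.

```typescript
// Assumed values for illustration; the real implementation's settings may differ.
const WINDOW_MS = 60000;
const ERROR_THRESHOLD = 3;

class SlidingWindowBreaker {
  private errorTimestamps: number[] = [];

  /** True if the caller should fail fast instead of calling the service. */
  isOpen(nowMs: number): boolean {
    // Errors older than the window "expire" and no longer count.
    this.errorTimestamps = this.errorTimestamps.filter(t => nowMs - t < WINDOW_MS);
    return this.errorTimestamps.length >= ERROR_THRESHOLD;
  }

  recordError(nowMs: number): void {
    this.errorTimestamps.push(nowMs);
  }
}

/**
 * Sends one request per second to a service that always fails and
 * returns the seconds at which the service was actually called.
 */
function simulate(durationSeconds: number): number[] {
  const breaker = new SlidingWindowBreaker();
  const reachedService: number[] = [];
  for (let second = 0; second < durationSeconds; second++) {
    const nowMs = second * 1000;
    if (breaker.isOpen(nowMs)) continue; // fail fast, service untouched
    reachedService.push(second);
    breaker.recordError(nowMs);          // the service is still down
  }
  return reachedService;
}

console.log(simulate(180)); // [0, 1, 2, 60, 61, 62, 120, 121, 122]
```

Without an explicit probe, every expiring error lets one more request through, so the broken service sees ERROR_THRESHOLD requests per window rather than a single test request.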

Expressing Intent - Integration Patterns

In our quest to find suitable abstractions for distributed serverless applications let's close the loop and recall our very own integration patterns. It turns out, they actually make for a very good combo.

Layers, but different

During the Loan Broker implementation, we found that many integration patterns are already built into the platform. For example, AWS SNS (or GCP Pub/Sub) is a great implementation of a Publish-Subscribe Channel. We also used a Content Filter and a Message Filter to eliminate empty bank quotes and to only pass the useful message payload. The pattern icons are nicely included in this Overview Diagram:

Now, wouldn't it be awesome to describe the solution with those patterns instead of YAML expressions? The filter expressions would then be neatly in our code as opposed to being buried deep down in a YAML document. Well, it looks like CDK might be exactly our ticket for that!

Integration Patterns in CDK

Being able to write automation code in an object-oriented language gives us the ability to form clean layers that each use a different vocabulary. Those layers are quite different from the existing CDK layers (extra points for noticing the non-arbitrary color scheme):

The bottom layer of our approach consists of the usual CDK constructs, such as the EventBridge event bus, Lambda functions, and SQS queues. Moving up the stack, our goal is to describe applications using Enterprise Integration Patterns instead of cloud resources. We achieve that with a (icon-green) middle layer that implements common integration patterns on top of the AWS serverless ecosystem. The top layer finally uses the language of our example domain, meaning banks and loan brokers, credit bureaus, etc.

The three layers each use the language of their respective domain:

The second layer is the critical one as it separates the application from the product / service / resource names.

Serverless Automation Patterns

Application automation code in the top layer now looks like this snippet (see CDK Integration Patterns GitHub Repo):

var nonEmptyQuoteMessageFilter = MessageFilter.fromDetail(this, "nonEmptyQuoteMessageFilter",
   { responsePayload: { bankId: [{ exists: true }] } }
);
var payloadContentFilter = ContentFilter.createPayloadFilter(this, "PayloadContentFilter");

new MessageContentFilter(this, "FilterMortgageQuotes", {
    sourceEventBus: mortgageQuotesEventBus,
    targetQueue: mortgageQuotesQueue,
    messageFilter: nonEmptyQuoteMessageFilter,
    contentFilter: payloadContentFilter,
});

This automation code looks oddly like... application code. It uses a language that's abstracted from the cloud resources and instead uses a vocabulary that's suitable to describing message-oriented solutions.

The nonEmptyQuoteMessageFilter is a Message Filter. As you'd expect, this Message Filter accepts a predicate, an expression that evaluates to true or false. In our case, that predicate is described in EventBridge event pattern syntax. We happily accept this leak from the underlying platform because it vastly simplifies the implementation. The expression { responsePayload: { bankId: [{ exists: true }] } } specifies that the message has to have a bankId field inside the responsePayload, just as it did when it was embedded in YAML in Part 4.
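
To get a feel for what such a predicate does, here is a simplified sketch of event pattern matching in plain TypeScript. It is not how EventBridge is implemented and covers only a small, assumed subset of the syntax (nested fields, literal values, and exists); the names Pattern and matchesPattern are made up for this illustration.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type Matcher = string | number | boolean | null | { exists: boolean };
type Pattern = { [key: string]: Pattern | Matcher[] };

function isJsonObject(value: Json | undefined): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function matchesLeaf(matcher: Matcher, actual: Json | undefined): boolean {
  if (typeof matcher === "object" && matcher !== null) {
    // { exists: true } requires the field to be present, { exists: false } to be absent.
    return matcher.exists === (actual !== undefined);
  }
  return actual === matcher; // literal comparison
}

/** All fields in the pattern must match (AND); any matcher in an array may match (OR). */
function matchesPattern(pattern: Pattern, data: Json | undefined): boolean {
  return Object.keys(pattern).every(key => {
    const expected = pattern[key];
    const actual = isJsonObject(data) ? data[key] : undefined;
    if (Array.isArray(expected)) {
      return expected.some(matcher => matchesLeaf(matcher, actual));
    }
    return matchesPattern(expected, actual); // descend into nested fields
  });
}

// The predicate from the snippet above, applied to the detail portion of two events.
const nonEmptyQuote: Pattern = { responsePayload: { bankId: [{ exists: true }] } };
const fullQuote: Json = { responsePayload: { rate: 4.2, bankId: "Premium", id: "AAAA12345" } };
const emptyQuote: Json = { responsePayload: {} };

console.log(matchesPattern(nonEmptyQuote, fullQuote));  // true
console.log(matchesPattern(nonEmptyQuote, emptyQuote)); // false
```

The real service supports many more matchers, so treat this purely as a mental model for why empty bank quotes never make it past the filter.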

Second, we create a special kind of Content Filter, a Payload Filter, that reduces incoming messages to just the payload, stripping off the metadata that might have been added by Lambda Destinations. This class doesn't require any additional parameters.

Last, the code combines both patterns into a MessageContentFilter, which additionally accepts a source and a target. The MessageContentFilter is a somewhat contrived combination of a Message Filter and a Content Filter that matches the EventBridge implementation under the covers. You'd be right to point out another small amount of leakage from the lower layers. I'll discuss our options on this below.

Automation code isn't limited to dealing with platform resources. Instead, it should be using abstractions that express the intent of your application.

Mapping Integration Patterns to AWS Serverless

The magic question now is: what does the middle layer look like? That layer exposes the Integration Pattern language and maps that to the AWS Serverless ecosystem. We are starting very simple here with just two patterns, a Message Filter and a Content Filter. Both patterns are conveniently implemented inside EventBridge, although with some nuances.

The official functional diagram of Amazon EventBridge provides a starting point:

Being a universal event bus, the diagram focuses on the variety of supported sources and targets. It also highlights the ability to work with Event Schemas, which we aren't using for the Loan Broker. The "Rules" description gives us a hint by highlighting that it can be used "to filter and send events". That's going to be our Message Filter. The Content Filter isn't really visible in this diagram, though. As we found out in Part 3, content filtering is part of the EventBridge target. The setting looked like this in CloudFormation (see Part 4):

Targets:
    - Arn: !GetAtt QuoteResponseChannel.Arn
      InputPath: $.detail.responsePayload
      Id: MortgageQuotes

InputPath is one of the options for selecting data to be sent to the target, in this case, a subset of the event.
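
Conceptually, InputPath walks the event along the given path and hands the result to the target. The following sketch mimics that for simple dot-notation paths only; selectByPath is a hypothetical helper rather than an AWS API, and the sample event is a made-up, abridged one.

```typescript
type JsonValue = string | number | boolean | null | JsonValue[] | { [key: string]: JsonValue };

/**
 * Resolves a dot-notation path such as "$.detail.responsePayload".
 * Returns undefined if any segment along the path is missing.
 */
function selectByPath(data: JsonValue, path: string): JsonValue | undefined {
  const segments = path.split(".");
  if (segments[0] !== "$") {
    throw new Error('path must start with "$": ' + path);
  }
  let current: JsonValue | undefined = data;
  for (const segment of segments.slice(1)) {
    // Only plain objects can be descended into in this simplified version.
    if (typeof current !== "object" || current === null || Array.isArray(current)) {
      return undefined;
    }
    current = current[segment];
  }
  return current;
}

const sampleEvent: JsonValue = {
  source: "lambda",
  detail: {
    requestContext: { condition: "Success" },
    responsePayload: { rate: 4.225740338234942, bankId: "Premium", id: "AAAA12345" },
  },
};

// Logs only the responsePayload object, without the surrounding metadata.
console.log(selectByPath(sampleEvent, "$.detail.responsePayload"));
```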

Coding Integration Patterns with CDK

Armed with the mapping from pattern to AWS resource, we can implement the Integration Patterns on top of CDK. The green middle layer contains two core classes, MessageFilter and ContentFilter (see Source on GitHub). Let's tackle the Message Filter first:

interface MessageFilterProps extends EventPattern {}
interface MessageFilterDetailProps { [key: string]: any; }

class MessageFilter extends Construct {
    public readonly eventPattern: EventPattern;

    constructor(scope: Construct, id: string, props: MessageFilterProps) {
        super(scope, id);
        this.eventPattern = props;
    }

    static fromDetail(scope: Construct, id: string, detailProps: MessageFilterDetailProps): MessageFilter {
        return new MessageFilter(scope, id, {
            detail: detailProps,
        });
    }
}

The MessageFilter is really just a wrapper for a CDK EventPattern. One of the EventPattern's properties is the detail parameter, which according to the documentation is "A JSON object, whose content is at the discretion of the service originating the event." Er, ah, OK... not really. What we need to understand here is the structure of the incoming events, passed along by the Lambda Destination (I omit numerous fields; this is the reason to use a Content Filter!):

{
    "version": "0",
    "id": "12345678-0054-2c61-4c9f-c3c77a599782",
    "detail-type": "Lambda Function Invocation Result - Success",
    "source": "lambda",
    "account": "1234567890",
    "time": "2022-01-18T08:02:39Z",
    "region": "us-east-2",
    "resources": [
        "arn:aws:events:us-east-2:1234567890:event-bus/LoanBroker3",
        "arn:aws:lambda:us-east-2:1234567890:function:BankSns3Premium:$LATEST"
    ],
    "detail": {
        "version": "1.0",
        "timestamp": "2022-01-18T08:02:39.190Z",
        "requestContext": {
            "requestId": "12345678-6eed-46f3-8725-6e62466588ae",
            "functionArn": "arn:aws:lambda:us-east-2:1234567890:function:BankSns3Premium:$LATEST",
            "condition": "Success",
            "approximateInvokeCount": 1
        },
        "requestPayload": {
            "Records": [ ]  ## Removed
        },
        "responseContext": {
            "statusCode": 200,
        },
        "responsePayload": {
            "rate": 4.225740338234942,
            "bankId": "Premium",
            "id": "AAAA12345"
        }
    }
}

The detail property of the CDK EventPattern corresponds to the event's detail element (you'll find other properties for account, detail-type (detailType in TypeScript CDK), etc). Now it makes sense that we accept MessageFilterDetailProps, which really can represent any JSON construct. Passing { responsePayload: { bankId: [{ exists: true }] } } specifies that a bankId element must exist under the responsePayload within the detail section of the event.

The Content Filter we are using is a very common one: we only want to retain the responsePayload. Hence we create a static method createPayloadFilter, which conveniently contains the correct JSONPath expression to the responsePayload (you'll see detail here again, albeit this time as part of the expression string—those little variations are one reason the pattern layer is so helpful):

interface ContentFilterProps { readonly jsonPath: string; }

class ContentFilter extends Construct {
    public readonly ruleTargetInput: RuleTargetInput;

    constructor(scope: Construct, id: string, props: ContentFilterProps) {
        super(scope, id);
        this.ruleTargetInput = RuleTargetInput.fromEventPath(props.jsonPath);
    }

    static createPayloadFilter(scope: Construct, id: string): ContentFilter {
        return new ContentFilter(scope, id, {
            jsonPath: "$.detail.responsePayload",
        });
    }
}

The arguments to the MessageFilter and the ContentFilter are different (JSON Object vs. JSONPath expression) because EventBridge rules support complex expressions, including "logical AND" (within limits), whereas the target's InputPath extracts a single subset of the event. At the same time, both objects are really just data holders - neither one invokes the lower-level CDK constructs directly.

Finally, the MessageContentFilter defines an event bus rule and target based on the settings from the two objects, plus an event source and a destination (our SQS Queue):

interface MessageContentFilterProps {
  sourceEventBus: EventBus;
  targetQueue: IQueue;
  messageFilter: MessageFilter;
  contentFilter: ContentFilter;
}

class MessageContentFilter extends Construct {
  constructor(scope: Construct, id: string, props: MessageContentFilterProps) {
    super(scope, id);

    const messageFilterRule = new Rule(scope, id + "Rule", {
      eventBus: props.sourceEventBus,
      ruleName: id + "Rule",
      eventPattern: props.messageFilter.eventPattern,
    });

    var queueMessageProps = props.contentFilter.ruleTargetInput ? {
      message: props.contentFilter.ruleTargetInput,
    } : {};
    messageFilterRule.addTarget(new targets.SqsQueue(props.targetQueue, queueMessageProps));
  }
}

The abstraction implements explicit composition, as described in Part 3:

Making the composition explicit in a central component might be considered more tightly coupled. However, being able to describe the composition in automation code that can freely recompose the application's anatomy allows for rapid and reliable changes. The key requirement is that you have control over all application elements, making this technique more useful for distributed applications than for application integration.

Describing an application's composition in automation code allows us to freely recompose the application's anatomy.

What about leaky abstractions?

While discussing the pattern layer, we noticed multiple leaks from the CDK layer, such as the expression syntax or the existence of a MessageContentFilter. You would rightly conclude that such leakages would make it difficult to use the same abstraction with an implementation on another cloud. There are several pragmatic reasons that I am fine with those leaks, at least at this point:

The bigger consideration is the main objective:

The primary objective is expressive automation code, not portability. Portability might be a welcome bonus.

Portability is a valuable benefit. However, it's also a future, potential benefit—it's an Option. In comparison, more expressive automation code and a smoother on-ramp are immediate benefits.

What's Next?

I firmly believe that we are just starting to realize the true potential of cloud automation. Modern cloud automation isn't just about reducing toil. It also isn't about converting CLI scripts into something slightly more elegant. Instead, it can help us blur the line between application and automation code. You surely could have implemented a Content Filter in application code. However, you can also map it to a platform service, without leaving the comfort of your programming language and without having to switch to another technical domain (that of platform services). That, in my opinion, is huge. Watch this space!



Gregor is an Enterprise Strategist with Amazon Web Services (AWS). He is a frequent speaker on asynchronous messaging, IT strategy, and cloud. He (co-)authored several books on architecture and architects.