Gregor's Ramblings

Serverless Loan Broker @ AWS, Part 4: Automation

November 30, 2021

Hi, I am Gregor Hohpe, co-author of the book Enterprise Integration Patterns. I like to work on and write about asynchronous messaging systems, service-oriented architectures, and all sorts of enterprise computing and architecture topics. I am also an Enterprise Strategist at AWS.


It's hard to imagine cloud without automation. Cloud computing has transformed the way we provision infrastructure and deploy applications because it makes functions that used to require lengthy manual processes available as an API call. Not taking advantage of this capability would seem outright silly. So, let's automate our Loan Broker application.

CLI - The Command Line

Automation is hardly a new idea: sysadmins have been replacing manual tasks with shell scripts for decades. Shell scripts make automation straightforward because they use the same commands that you'd issue by hand. For that very reason, my initial implementation uses the following shell script with calls to the AWS CLI (command-line interface) to create the bank functions (on GitHub):

account=$(aws sts get-caller-identity --query Account --output text)

# TODO: edit to reflect your role to be used
role=arn:aws:iam::$account:role/service-role/CreditBureau-role-abcdefg

zip BankSns.zip BankSns.js

aws lambda delete-function --function-name=BankSnsPawnshop
aws lambda delete-function --function-name=BankSnsUniversal
aws lambda delete-function --function-name=BankSnsPremium

aws lambda create-function --function-name=BankSnsPawnshop \
    --runtime=nodejs12.x --handler=BankSns.handler --role=$role \
    --environment="Variables={BANK_ID=PawnShop,BASE_RATE=5,MAX_LOAN_AMOUNT=500000,MIN_CREDIT_SCORE=400}" \
    --zip-file=fileb://BankSns.zip

aws lambda create-function --function-name=BankSnsUniversal \
    --runtime=nodejs12.x --handler=BankSns.handler --role=$role \
    --environment="Variables={BANK_ID=Universal,BASE_RATE=4,MAX_LOAN_AMOUNT=700000,MIN_CREDIT_SCORE=500}" \
    --zip-file=fileb://BankSns.zip

aws lambda create-function --function-name=BankSnsPremium \
    --runtime=nodejs12.x --handler=BankSns.handler --role=$role \
    --environment="Variables={BANK_ID=Premium,BASE_RATE=3,MAX_LOAN_AMOUNT=900000,MIN_CREDIT_SCORE=600}" \
    --zip-file=fileb://BankSns.zip

The script packages the Bank source code, deletes any existing functions of the same name, and deploys three Bank Lambda functions, passing the respective configuration parameters.

Although convenient, this method has a major (and well-known) drawback: it isn't designed to deal with change. How would you deploy one additional bank? If you make a separate script for just that bank, you'd end up with a large collection of scripts that depend on each other. Deleting and re-creating all Bank functions, like this script does, unnecessarily interrupts the existing service. A smarter script could determine which banks already exist and update only the ones that need changing, but it'll become excessively complex as it has to cover an ever larger variety of cases. That's why modern cloud automation doesn't use shell scripts.
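
To make the "smarter script" idea concrete, here is a minimal sketch (not part of the original script) of a helper that updates a bank function in place if it already exists and creates it otherwise. The function name deploy_bank and its arguments are hypothetical; the sketch assumes that $role is set and BankSns.zip has been built as in the script above, and it treats any failure of get-function as "function does not exist", which is a simplification.

```shell
# Hypothetical helper: update the function if it exists, create it otherwise.
# $1: function name, $2: comma-separated environment variables
deploy_bank() {
    name=$1
    vars=$2
    if aws lambda get-function --function-name="$name" > /dev/null 2>&1; then
        # Function exists: push the new code without deleting the function.
        aws lambda update-function-code --function-name="$name" \
            --zip-file=fileb://BankSns.zip
        # Wait for the code update to finish before changing the configuration.
        aws lambda wait function-updated --function-name="$name"
        aws lambda update-function-configuration --function-name="$name" \
            --environment="Variables={$vars}"
    else
        # Function is missing (or not visible to us): create it from scratch.
        aws lambda create-function --function-name="$name" \
            --runtime=nodejs12.x --handler=BankSns.handler --role="$role" \
            --environment="Variables={$vars}" \
            --zip-file=fileb://BankSns.zip
    fi
}
```

It could be called as deploy_bank BankSnsPawnshop "BANK_ID=PawnShop,BASE_RATE=5,MAX_LOAN_AMOUNT=500000,MIN_CREDIT_SCORE=400". Even this small helper already needs an existence check, a wait step, and two different update calls, and it still doesn't remove banks that are no longer wanted.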

AWS CloudFormation

Dating back to 2011, 5 years after the birth of AWS, AWS CloudFormation is one of the earliest cloud automation tools (Terraform came about in 2014 with Terraform 1.0.0 being released in 2021). CloudFormation looks more like a data structure than a script, using hierarchical YAML (or equivalent JSON) syntax to describe the resources that should be deployed. It follows a declarative approach by specifying a desired target state, e.g., three banks. By letting the tool figure out what resources to (de-)provision to reach that state from the current setup, it solves the script explosion problem from above.
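
The core of the declarative idea, comparing a desired state with the current state and deriving the required changes, can be sketched in a few lines of shell. This is only an illustration of the principle, not how CloudFormation is implemented; the function name plan_changes and the space-separated lists of bank names are assumptions made for this example.

```shell
# Hypothetical sketch: compare a desired list of banks with the current one
# and print the changes needed to reconcile them.
# $1: desired bank names, $2: currently deployed bank names (space-separated)
plan_changes() (
    desired=$(mktemp)
    current=$(mktemp)
    printf '%s\n' "$1" | tr ' ' '\n' | sed '/^$/d' | LC_ALL=C sort -u > "$desired"
    printf '%s\n' "$2" | tr ' ' '\n' | sed '/^$/d' | LC_ALL=C sort -u > "$current"
    # Names that appear only in the desired list must be created ...
    LC_ALL=C comm -23 "$desired" "$current" | sed 's/^/create /'
    # ... and names that appear only in the current list must be removed.
    LC_ALL=C comm -13 "$desired" "$current" | sed 's/^/delete /'
    rm -f "$desired" "$current"
)

plan_changes "PawnShop Universal Premium" "PawnShop Universal"
# prints: create Premium
```

Adding one bank to the desired list yields exactly one create action, and the banks that are already running are left alone.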

CloudFormation is a new language, but you can get a small head start from the CLI by describing existing resources in a YAML format via the output parameter:

$ aws lambda list-functions --output=yaml --query='Functions[?starts_with(FunctionName, `BankSns`) == `true`]'

Although the result won't fit the CloudFormation script 100%, for example because CloudFormation requires additional settings like the source file, it's a reasonable start for simple resources.

Creating a Bank in YAML

I like to start simple, so my first CloudFormation template creates just a single bank function:

AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  BankRole:
    Type: String
Resources:
  BankSnsPawnShop:
    Type: AWS::Lambda::Function
    DeletionPolicy: Delete
    Properties:
      Runtime: nodejs12.x
      Code:
        S3Bucket: loanbroker-source
        S3Key: BankSns
      Handler: BankSns.handler
      Role:
        Ref: BankRole
      FunctionName: 'BankSnsPawnShop2'
      Description: 'Pawn Shop'
      Environment:
        Variables:
          BANK_ID: PawnShop
          BASE_RATE: '5'
          MAX_LOAN_AMOUNT: '500000'
          MIN_CREDIT_SCORE: '400'

After specifying a version header and requiring the security role as a parameter, my template defines a single resource of type AWS::Lambda::Function. All parameters that we previously passed on the command line are now represented as nested YAML elements. For example, the name of the handler code can be found under Properties/Handler and the environment settings under Properties/Environment/Variables.

CloudFormation pulls the ZIP'd source code from an S3 bucket, so before we can execute the CloudFormation template, we upload (copy) it there (yes, via command line):

$ aws s3 cp BankSns.zip s3://loanbroker-source/BankSns

Creating a Stack

CloudFormation is based on the concepts of a template and a stack. The template is the file above - a description of the resources that should be provisioned. A stack is an actual deployment of the template that tracks the deployed state. You could think of it as classes and objects instantiated from a class, with the key difference that your instances are running systems, which also incur costs. Deleting a stack equates to deprovisioning all resources associated with it.

We create a stack LoanBrokerPubSub based on the template above to deploy a single bank function, specifying the required security role parameter (as always, replace it with your role).

$ aws cloudformation create-stack --stack-name LoanBrokerPubSub \
    --template-body file://LoanBrokerPubSub.yml \
    --parameters ParameterKey=BankRole,ParameterValue=arn:aws:iam::1234567890:role/service-role/CreditBureau-role-abcdef
{
    "StackId": "arn:aws:cloudformation:us-east-2:1234567890:stack/LoanBrokerPubSub/abcdef"
}

The CLI returns a StackId so we can refer to it later. Just having an ID doesn't mean the whole stack was created flawlessly, though, so it's a good idea to check the detailed event log:

$ aws cloudformation describe-stack-events --stack-name LoanBrokerPubSub  

This command will show the time and status for each resource that is created by the template (you'll fondly remember the JMESPath syntax for the query parameter from Part 2).
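
The event log can be long, so one possible way to use the query parameter here is sketched below. The helper name check_stack and the choice of columns are assumptions for this example; the helper waits for stack creation to finish and then prints a compact table of events, even if creation failed.

```shell
# Hypothetical helper: wait for the stack, then show a compact event table.
# $1: stack name
check_stack() {
    # Blocks until creation completes; returns non-zero if creation fails.
    aws cloudformation wait stack-create-complete --stack-name "$1"
    status=$?
    # Show the events either way, since they explain what went wrong.
    aws cloudformation describe-stack-events --stack-name "$1" \
        --query 'StackEvents[].[Timestamp,LogicalResourceId,ResourceStatus]' \
        --output table
    return "$status"
}
```

Calling check_stack LoanBrokerPubSub right after create-stack returns a non-zero exit code if the stack could not be created.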

Composing Multiple Banks

So far, so good, but we want to do a bit more than just deploy a single bank. Being largely a data structure, CloudFormation doesn't support loops or expansions as you would find them in regular programming languages.

Provisioning multiple banks therefore follows the proven CTRL+C / CTRL+V method. A thorough read of the CloudFormation Resource Reference and a good dose of grit enabled me to also create the QuoteRequestChannel as an SNS topic, the QuoteResponseChannel as an SQS queue, and the message filter as an EventBridge rule. The script wires everything together by subscribing the Bank functions to the SNS channel and sending responses to the SQS queue via EventBridge. Those links are accomplished with resource references, e.g., to subscribe the banks to the topic (some unrelated settings are omitted):

BankUniversalSubscription:
  Type: AWS::SNS::Subscription
  Properties:
    TopicArn: !Ref QuoteRequestChannel
    Protocol: lambda
    Endpoint: !GetAtt BankSnsUniversal.Arn

and to send responses to the EventBridge bus:

SendQuoteUniversal:
  Type: AWS::Lambda::EventInvokeConfig
  Properties:
    FunctionName: !Ref BankSnsUniversal
    DestinationConfig:
      OnSuccess:
        Destination: !GetAtt FilterMortgageQuotesBus.Arn

Even for our simple example, the code grows quickly, so you'll find the 200+ lines on GitHub.
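
One way to avoid the copy-and-paste, sketched here under the assumption that you are willing to generate the template, is a small shell loop that prints the repeated resources. The function name bank_subscriptions is hypothetical; the logical IDs follow the naming used in the snippets above. Keep in mind that such a generator is an extra layer on top of CloudFormation that you'd have to maintain and debug.

```shell
# Hypothetical generator: prints one SNS subscription resource per bank name.
bank_subscriptions() {
    for bank in "$@"; do
        cat <<EOF
  Bank${bank}Subscription:
    Type: AWS::SNS::Subscription
    Properties:
      TopicArn: !Ref QuoteRequestChannel
      Protocol: lambda
      Endpoint: !GetAtt BankSns${bank}.Arn
EOF
    done
}

bank_subscriptions PawnShop Universal Premium
```

The output would still have to be merged into the Resources section of the template before deployment.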

Security constructs such as queue and event bus policies are also represented as CloudFormation resources. For example, allowing our banks to receive quote requests and to publish responses requires the following policies:

BankPawnShopInvokePermission:
  Type: 'AWS::Lambda::Permission'
  DeletionPolicy: Delete
  Properties:
    Action: 'lambda:InvokeFunction'
    FunctionName: !Ref BankSnsPawnShop
    Principal: sns.amazonaws.com
    SourceArn: !Ref QuoteRequestChannel

AllowMessagesToResponseChannel:    
  Type: AWS::SQS::QueuePolicy
  DeletionPolicy: Delete
  Properties:
    Queues:
      - !Ref QuoteResponseChannel
    PolicyDocument:
      Statement:
        - Action:
            - "SQS:SendMessage"
            - "SQS:ReceiveMessage"
            - "SQS:DeleteMessage"
            - "SQS:ChangeMessageVisibility"
          Effect: "Allow"
          Resource: !GetAtt QuoteResponseChannel.Arn
          Principal:
            AWS: !Ref AWS::AccountId
        - Action:
            - "SQS:SendMessage"
          Effect: "Allow"
          Resource: !GetAtt QuoteResponseChannel.Arn
          Principal:
            Service: events.amazonaws.com

Once defined, we can deploy the banks and queues (the Step Functions broker and aggregator aren't yet included) with a single command:

$ aws cloudformation update-stack --stack-name LoanBrokerPubSub --template-body file://LoanBrokerPubSub.yml \
    --parameters ParameterKey=BankRole,ParameterValue=arn:aws:iam::1234567890:role/service-role/CreditBureau-role-abcdef

Proud of our creation, we can test the setup by sending a quote request to the SNS channel and looking for responses on the response queue:

$ aws sns publish --topic-arn arn:aws:sns:us-east-2:1234567890:MortgageQuoteRequest2 \
    --message '{ "SSN": "123-45-6666", "Amount": 500000, "Term": 30, "Credit": { "Score": 803, "History": 22 } }' \
    --message-attributes '{ "RequestId": { "DataType": "String", "StringValue": "ABC12345" } }'

The great news is that with a single command we can deploy a serverless solution that combines Publish-subscribe Channels, Message Queues, Message Filters, and Lambda Functions. However, composing such an automation script blends resource provisioning, application composition, component configuration, and security settings. Not only does this mix separate concerns, it also makes writing and debugging these scripts a non-trivial exercise.

Polyglot Low-Code

Automating with CloudFormation is a huge step ahead from CLI scripts. However, getting the syntax right can be challenging at times (at least for me). The following snippet that configures the event bus to filter out empty mortgage quote responses might not look too complicated but took me an extraordinary amount of time to get right:

RouteMortgageQuotes:
  Type: AWS::Events::Rule
  DeletionPolicy: Retain
  Properties:
    Name: RouteMortgageQuotes2
    Description: "Filter out empty quotes"
    EventBusName: !GetAtt FilterMortgageQuotesBus.Name
    EventPattern:
      detail:
        requestContext:
          functionArn: [{ prefix: !Sub 'arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:BankSns2' }]
        responsePayload:
          bankId: [{ exists: true }]
    Targets:
      - Arn: !GetAtt QuoteResponseChannel.Arn
        InputPath: $.detail.responsePayload
        Id: MortgageQuotes
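
When a pattern like this refuses to match, it can help to test it outside of CloudFormation. The sketch below wraps the test-event-pattern command of the AWS CLI; the helper name test_pattern and the two file arguments are assumptions for this example. The pattern file has to contain the EventPattern as plain JSON with the !Sub expression already resolved, and the command expects the sample event to carry the standard envelope fields (such as id, source, account, time, region, resources, and detail-type) in addition to detail.

```shell
# Hypothetical helper: asks EventBridge whether a sample event matches a pattern.
# $1: file with the event pattern as JSON, $2: file with a sample event as JSON
test_pattern() {
    aws events test-event-pattern \
        --event-pattern "file://$1" \
        --event "file://$2" \
        --query Result --output text
}
```

Calling test_pattern pattern.json event.json prints True if the event matches the pattern and False otherwise.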

After I had tried to assemble an expression using the Fn::Join function, the Burning Monk saved me by recommending Fn::Sub instead. Nevertheless, I found the cognitive load to be high as this single line combines syntax from multiple languages:

functionArn: [{ prefix: !Sub 'arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:BankSns2' }]

This line filters incoming events to only process those originating from our banks, identified by the name prefix BankSns2. Because EventBridge only supports prefix matching, the code constructs a full ARN (Amazon Resource Name) prefix.

  1. !Sub is syntactic sugar for Fn::Sub, a CloudFormation intrinsic function that substitutes placeholders in a string.
  2. In this case, the values to be substituted are pre-defined CloudFormation pseudo parameters. The resulting string looks like arn:aws:lambda:us-east-2:1234567890:function:BankSns2.
  3. This string is used inside the EventBridge pattern syntax to perform prefix matching, following the syntax [{ prefix: "string"}].
  4. functionArn is the field name used in the envelope (requestContext) added by the Lambda destination to form the invocation record.

This single line pulls from several topic domains, including CloudFormation syntax, functions, and pseudo parameters, EventBridge filter patterns, and Lambda Destination event formats, so it's low-code but in a polyglot flavor. Luckily, my simple use case didn't need to supply integer values.
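
The two steps that this line combines, substituting the placeholders and matching on the resulting prefix, can be mimicked in plain shell. This is only an illustration of the semantics, not how CloudFormation or EventBridge work internally; the variables region and account stand in for the pseudo parameters, and the function names BankSns2PawnShop and SomeOtherFunction used in the example are hypothetical.

```shell
# Stand-ins for the pseudo parameters AWS::Region and AWS::AccountId.
region=us-east-2
account=1234567890

# Roughly what !Sub produces once the placeholders are substituted.
prefix="arn:aws:lambda:${region}:${account}:function:BankSns2"

# Mimics EventBridge prefix matching: succeeds if $1 starts with $2.
has_prefix() {
    case "$1" in
        "$2"*) return 0 ;;
        *) return 1 ;;
    esac
}

has_prefix "${prefix}PawnShop" "$prefix" && echo "bank event: match"
has_prefix "arn:aws:lambda:${region}:${account}:function:SomeOtherFunction" "$prefix" ||
    echo "other event: no match"
```

An event is only routed if its functionArn starts with the complete string, which is why the region and account ID have to be part of the prefix.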

Defining the event bus target utilizes another syntax. This EventBridge SQS Target filters down the message content to the actual quote, removing the event envelope (a classic Content Filter):

Targets:
  - Arn: !GetAtt QuoteResponseChannel.Arn
    InputPath: $.detail.responsePayload
    Id: MortgageQuotes

These harmless-looking lines also combine syntax from multiple sources:

  1. InputPath is an EventBridge attribute to pass filtered content to the target. Alternatively, you can use the Input (JSON literal) or InputTransformer attributes (a map of JSONPath expressions plus a template in the form of a JSON literal with placeholders for the fields calculated in the map).
  2. The value $.detail.responsePayload references a specific field from the event using JSONPath syntax. Nested fields are represented with dot notation here, as opposed to nested elements in the EventPattern attribute.
  3. Arn references the MortgageQuotes SQS channel that non-empty mortgage quotes are sent to, using the intrinsic function GetAtt, which accepts the name of an attribute in dot notation, however without a $.

Looking for Abstraction: SAM

Although automation scripts are a great help, they can be verbose at times, using much boilerplate, but then contain some key lines that combine syntax from multiple sources. Knowing that architecture is defined by meaningful decisions, I'd like to find a way to amplify the meaningful text while reducing the noise. In short, I am looking for useful abstractions. In AWS-land the next stop is SAM - The Serverless Application Model.

The Serverless Application Model comprises several elements, including the ability to build and run Lambda functions locally and to speed up (non-production) deployments with SAM Accelerate. At the heart of it are additional resource types that promise to make CloudFormation templates less verbose.

In our automation template above, defining a Lambda function that consumes messages from an SNS Topic and sends responses to EventBridge via Lambda Destination required three resources (AWS::Lambda::Function, AWS::SNS::Subscription, AWS::Lambda::EventInvokeConfig) plus IAM permissions (AWS::Lambda::Permission, AWS::Events::EventBusPolicy). SAM condenses this to a single Lambda resource with Event and EventInvokeConfig properties (I omitted the environment parameters below):

BankSnsPawnShop:
  Type: AWS::Serverless::Function
  Properties:
    PackageType: Zip
    Runtime: nodejs12.x
    CodeUri: src
    Handler: BankSns.handler
    FunctionName: 'BankSns3PawnShop'
    Description: 'Pawn Shop'
    Events:
      MortgageQuoteRequest:
        Type: SNS
        Properties:
          Topic: !Ref QuoteRequestChannel
    EventInvokeConfig:
      DestinationConfig:
        OnSuccess:
          Type: EventBridge
          Destination: !GetAtt FilterMortgageQuotesBus.Arn

You'll notice the new resource type AWS::Serverless::Function. Instead of specifying the ZIP package, this resource type accepts the source code because SAM will actually build and package the function (for my trivial example it's really just ZIP-ing up the source file). Also, this resource no longer specifies a Role attribute as SAM will generate one for us, including the required permissions. The reduction from five resources to one is surely welcome. The resulting template (on GitHub) is now 129 non-empty lines, a 35% reduction.

You build and run SAM applications from the command line (CloudShell has SAM pre-installed):

$ sam build
$ sam deploy --guided

The --guided option allows you to enter required settings (like the CloudFormation stack name) on the command line and stores them in a local file (samconfig.toml by default) so you don't have to repeat the process. SAM expands the serverless resources into actual resources:

CloudFormation stack changeset
----------------------------------------------------------------------------------------------
Operation    LogicalResourceId                               ResourceType                                     
----------------------------------------------------------------------------------------------
+ Add        BankSnsPawnShopEventInvokeConfig                AWS::Lambda::EventInvokeConfig   
+ Add        BankSnsPawnShopMortgageQuoteRequestPermission   AWS::Lambda::Permission             
+ Add        BankSnsPawnShopMortgageQuoteRequest             AWS::SNS::Subscription                
+ Add        BankSnsPawnShopRole                             AWS::IAM::Role                            
+ Add        BankSnsPawnShop                                 AWS::Lambda::Function                 
+ Add        BankSnsPremiumEventInvokeConfig                 AWS::Lambda::EventInvokeConfig   
+ Add        BankSnsPremiumMortgageQuoteRequestPermission    AWS::Lambda::Permission             
+ Add        BankSnsPremiumMortgageQuoteRequest              AWS::SNS::Subscription                
+ Add        BankSnsPremiumRole                              AWS::IAM::Role                            
+ Add        BankSnsPremium                                  AWS::Lambda::Function                 
+ Add        BankSnsUniversalEventInvokeConfig               AWS::Lambda::EventInvokeConfig   
+ Add        BankSnsUniversalMortgageQuoteRequestPermission  AWS::Lambda::Permission              
+ Add        BankSnsUniversalMortgageQuoteRequest            AWS::SNS::Subscription                 
+ Add        BankSnsUniversalRole                            AWS::IAM::Role                             
+ Add        BankSnsUniversal                                AWS::Lambda::Function                 
+ Add        FilterMortgageQuotesBus                         AWS::Events::EventBus                  
+ Add        QuoteRequestChannel                             AWS::SNS::Topic                           
+ Add        QuoteResponseChannel                            AWS::SQS::Queue                         
+ Add        RouteMortgageQuotes                             AWS::Events::Rule                         

Here you can see roles created for each function, e.g. BankSnsPawnShopRole plus the required permissions. The cool part is that with just two command lines, you're ready to send quote requests and fetch the results from the MortgageQuotes3 channel (that query parameter is really coming in handy):

$ aws sns publish --topic-arn arn:aws:sns:us-east-2:1234567890:MortgageQuoteRequest3  \
    --message '{ "SSN": "123-45-6666", "Amount": 500000, "Term": 30, "Credit": { "Score": 803, "History": 22 } }' \
    --message-attributes '{ "RequestId": { "DataType": "String", "StringValue": "ABCD1234" } }'

$ aws sqs receive-message --queue-url https://sqs.us-east-2.amazonaws.com/1234567890/MortgageQuotes3 \
    --query='Messages[].Body' --max-number-of-messages=5 --wait-time-seconds=5

[
    "{\"rate\":6.134508050538122,\"bankId\":\"PawnShop\",\"id\":\"ABCD1234\"}",
    "{\"rate\":5.692421843390756,\"bankId\":\"Universal\",\"id\":\"ABCD1234\"}"
]

$ aws sqs purge-queue --queue-url https://sqs.us-east-2.amazonaws.com/1234567890/MortgageQuotes3

sqs receive-message doesn't delete messages from the queue, so for testing it's handy to purge the queue after receiving messages (it's great that SQS includes a Channel Purger). Also, be aware that a single call to sqs receive-message might not return all messages at once.
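
Because one call might not return everything, a test script can poll until a call comes back empty. The following is a minimal sketch with a hypothetical helper name drain_queue. It relies on received messages staying invisible for the queue's visibility timeout, so it is only suitable for short test runs; if polling takes longer than that timeout, messages will show up again.

```shell
# Hypothetical helper: polls the queue until a receive call returns no messages.
# $1: queue URL
drain_queue() {
    while :; do
        bodies=$(aws sqs receive-message --queue-url "$1" \
            --query 'Messages[].Body' --output text \
            --max-number-of-messages 10 --wait-time-seconds 5)
        # With text output, an empty result shows up as "None" (or nothing).
        if [ -z "$bodies" ] || [ "$bodies" = "None" ]; then
            break
        fi
        printf '%s\n' "$bodies"
    done
}
```

Each batch is printed on its own line; with text output, multiple message bodies within one batch are separated by tabs.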

What initially tripped me up is that specifying an EventBridge Target to an SQS channel doesn't automatically give the event bus permission to send messages. If you consider that the EventBridge resource type is still AWS::Events::Rule, this makes sense as all the SAM magic is reserved for resources of the AWS::Serverless types. So, the SAM template still needs an AWS::SQS::QueuePolicy to allow our event bus to send messages.

SAM certainly makes automation template development easier, especially for complex resources like API gateways. But did it really provide us with a new model, i.e. abstraction, as the name suggests?

Serverless Automation: A Reflection

Cloud without automation is just going to be a better data center, which isn't really what anyone wants. Especially with serverless applications, automation takes on a whole new meaning as it's less concerned with provisioning (the platform takes care of that) but much more with composition (which resource sends messages where) and application settings (like our bank parameters). However, these expanded responsibilities also stretch what automation languages like CloudFormation were originally designed to do. Time for us architects to zoom out.

Separation of Concerns

Even for our simple demo application, the automation script handles multiple levels of application management:

Having all these capabilities coded and version controlled is a definite asset. However, you can expect these respective settings to be made by different parties at different times and perhaps subject to different authorization rules. Parameters allow separation only for very simple cases (like the role identifier in my simple example), so you might need to resort to YAML / JSON manipulation (they're easily parsed!) or built-in mechanisms like nested stacks. The danger here is that you are building layers over layers (your templating system - SAM - CloudFormation - AWS API), which will make debugging cumbersome.

Syntax and Tooling

As shown above, using YAML (or JSON if you prefer) syntax is convenient for expressing resource definitions. However, most automation languages like CloudFormation combine elements from multiple technical domains into the same file and often the same line. For example, the following snippet nests CloudFormation-defined elements (Properties, EventPattern) with event node names that are defined by AWS resources (detail, requestContext) and event node names defined by your application (bankId):

Properties:
  EventPattern:
    detail:
      requestContext:
        functionArn: [{ prefix: !Sub 'arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:BankSns2' }]
      responsePayload:
        bankId: [{ exists: true }]

On top of this, the expression combines syntax from CloudFormation functions (e.g., !Sub or ${AWS::Region}) and that used by the respective service (e.g., prefix in this case).

The basic syntax also makes the scripts verbose: a script that implements only a subset of the Loan Broker (the most complex elements, i.e., the Step Functions orchestrator and the Aggregator are missing) already contains over 200 lines.

Abstraction

The automation scripts we created provide a huge convenience and in fact a new way of working. For example, I had made a mistake by not specifying the correct bank IDs, and all I had to do was update the SAM template, build, and deploy; SAM figured out that it had to re-provision just those functions (SAM Accelerate might even have been able to do it on the fly). However, the automation languages don't really provide abstractions over the cloud resources: we are still dealing with Functions and EventBridges and Step Functions. You might guess that I have some ideas on how to describe asynchronous, distributed systems at a higher level of abstraction. Using such abstractions isn't syntactic sugar or shortcuts but a different vocabulary that emphasizes the key design decisions. For example, I'd want to be able to express that my EventBridge Rule acts as a Message Filter and a Content Filter, highlighting the key parameters, which are the filter predicate and the content selection.

Composability and Orthogonality

Distributed solutions should be loosely coupled, meaning that a change in one component doesn't propagate to other components. Not specific to automation tools but rather the platform itself, I spotted several instances where the current implementation would struggle to meet this test. For example, the EventBridge pattern filters on the specific event header added by the Lambda Destination (detail.requestContext.functionArn). If I were to change the implementation to use a different composition mechanism (e.g. sending messages directly from the function code), this logic in the EventBridge pattern would fail.

Likewise, when a Lambda Destination calls EventBridge, the data is inside the detail element (we referenced that above). However, for Lambda-to-Lambda calls you have the option of sending the response only by specifying responseOnly in the CDK Configuration. That option isn't available for other destination targets. Making the availability of options independent from other settings makes a system orthogonal and therefore more freely composable. Unix pipes are a great example: they are freely composable because all components read from and write to standard streams.
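
As a small illustration of that kind of composability, the following sketch chains four standard tools to find the lowest rate among quote responses like the ones received earlier. The function name lowest_rate is hypothetical and the grep expression is a simplification that only works for this flat JSON format; the point is that none of the tools knows anything about the others.

```shell
# Hypothetical filter: reads quote JSON on standard input, prints the lowest rate.
lowest_rate() {
    grep -o '"rate":[0-9.]*' | cut -d: -f2 | LC_ALL=C sort -n | head -n 1
}

printf '%s\n' \
    '{"rate":6.134508050538122,"bankId":"PawnShop","id":"ABCD1234"}' \
    '{"rate":5.692421843390756,"bankId":"Universal","id":"ABCD1234"}' |
    lowest_rate
# prints 5.692421843390756
```

Any stage can be swapped out or extended without touching the others because the only contract is text on standard input and output.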

Infrastructure as actual Code

Luckily, I am not the first person to encounter these limitations. A new generation of automation tools like AWS CDK, Pulumi, or CDK for Terraform provide programming libraries for cloud automation. So, we might have a better starting point for abstraction and tooling. That'll be the perfect topic for the next post!


Gregor is an Enterprise Strategist with Amazon Web Services (AWS). He is a frequent speaker on asynchronous messaging, IT strategy, and cloud. He (co-)authored several books on architecture and architects.