
Comprehensive Testing of Serverless Solutions: Exploring Integration, E2E, and Unit Testing with AWS CDK and TypeScript (Part 1)

A detailed article on the importance of testing our serverless solutions, with in-depth diagrams and code examples which we will talk through.

Serverless Advocate
35 min read · Jul 9, 2023


Introduction

Serverless architecture has revolutionised the way we build and deploy applications, offering scalability, cost efficiency, and reduced operational overhead. However, as the complexity of serverless systems grows, ensuring their reliability becomes paramount. Testing serverless solutions thoroughly is crucial to guaranteeing their functionality, performance, and resilience.

In this series, we delve into the world of serverless testing, focusing on integration, end-to-end, and unit testing. We explore how to effectively test serverless applications using the AWS CDK and the power of TypeScript. Through practical examples and code snippets, we demonstrate how to construct robust test suites that validate the behavior of your serverless components, ensuring they meet your expectations.

Let’s embark on this journey together to uncover the intricacies of testing serverless applications and harness the full potential of AWS CDK and TypeScript.

The code for the repo can be found here:

💡 Note: The associated code repo is for the benefit of talking through the code only and is not production ready.

Contents

  • How would we define the three types of tests?
  • Ephemeral environments vs local emulation.
  • What are we building in this article?
  • E2E and Integration Testing: why do we need it in serverless development?
  • Clean code: how can hexagonal architecture help?
  • Talking through key code: Integration Tests.
  • Talking through key code: e2e Tests.

How would we define the three types of tests?

Let’s start by defining what each of these test types are below:

Integration Testing

Integration testing in serverless software development involves testing the interactions and interfaces between various serverless components, services, or systems. The primary objective is to verify that these components work together seamlessly and produce the expected results. Integration testing helps identify potential issues such as permission failures, configuration errors, or wrong target/routing setup.

“Integration tests should be used to test ‘integrations’ between services — not business logic specifically”

💡 Note: Integration tests should not use mocks. They are testing the integration between two or more services, so we want to ensure we test the real services. They test everything that your unit tests would have previously covered. Mocks give you a false sense of trust, which is why we test with the actual AWS services.

In this example repo and article we will talk through two types of integration tests:

✔️ Modules (packages). In our example we have a package called ‘@packages/aws-async-test-library’ which can help us to integration test our secondary adapters so they can be reused across multiple services within the domain (i.e. database-adapter.ts and event-adapter.ts).

✔️ Infra. This is integration testing our serverless infrastructure that we are creating through the AWS CDK code.

End-to-End Testing (Integrated)

End-to-end testing, also known as system testing, focuses on evaluating the overall behavior and performance of a serverless application from the user’s perspective. It involves simulating real-world scenarios and user interactions to validate the entire workflow or user journey, spanning multiple connected serverless components and services.

Mocks give you a false sense of trust, which is why we test with the actual AWS services.

The purpose is to ensure that the application functions correctly, data flows correctly across the system, and the desired outcomes are achieved. End-to-end testing helps identify issues like incorrect data transformations, faulty event routing, or failures in the complete execution chain.

💡 Note: E2E tests should not use mocks. They are testing the integration between many services in your process, so we want to ensure we test the real services. They cover all of the code and services that your integration and unit tests have covered, only as full end-to-end scenarios.

One thing to note is that I have explicitly called out the following:

  • Acceptance Tests. These may use a framework such as Cypress to exercise the journeys a user may take in the front end of your application (if you have one), which gives you the confidence that these UX journeys work as expected (look and feel, responsiveness etc.).
  • E2E Tests. These test the flow through a system (as shown in the diagram above), where you want to ensure that the plumbing of your serverless services is working as expected end to end. This includes adding messages directly to queues, invoking processes via APIs, publishing events etc. to kick off our journeys.

Unit Testing (Implementation Detail)

Unit testing is a foundational testing technique in serverless software development. It involves testing individual units or components of code in isolation, typically at the function level. In the context of serverless, these units can be AWS Lambda functions, or individual modules responsible for specific tasks. The objective is to validate that each unit behaves as expected and returns the correct outputs for given inputs. Unit testing helps catch bugs, ensure code correctness, and facilitates the development of modular and maintainable serverless applications.

In the world of the CDK, it is also typical to unit test some parts of the CDK stack code where it is favourable to do so, especially with L3 custom constructs which will be used/consumed many times (and potentially by many teams).

💡 Note: Unit tests should use mocks as we are running them locally, they should be fast, and we only want to test the system under test (SUT), not the external dependencies (those will be covered through the integration and e2e tests).
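
As a simple illustration, a unit test for a use case might mock its secondary adapter like the sketch below. The module paths and the createOrderUseCase name are assumptions for illustration only, loosely based on the adapters discussed later in this article:

import { createOrderUseCase } from './create-order';

// mock the secondary adapter so no real call is ever made to DynamoDB
jest.mock('../adapters/secondary/database-adapter', () => ({
  createOrder: jest
    .fn()
    .mockResolvedValue({ id: 'order-123', productId: 'PROD-123', quantity: 1, price: 1.99 }),
}));

describe('create-order-use-case', () => {
  it('should return the newly created order', async () => {
    // act - the use case runs entirely in memory with the adapter mocked
    const order = await createOrderUseCase({
      productId: 'PROD-123',
      quantity: 1,
      price: 1.99,
    });

    // assert - the business logic is verified without touching the cloud
    expect(order.productId).toEqual('PROD-123');
  });
});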

Section summary

In summary, integration testing focuses on verifying the interaction between serverless components, end-to-end testing validates the complete workflow of the application, and unit testing ensures the correctness of individual units or functions within the serverless system (typically business logic). Together, these three types of tests provide comprehensive coverage and confidence in the reliability and functionality of serverless applications.

💡 Note: e2e tests cover all aspects of the integration and unit tests, whilst integration tests also cover everything that the unit tests have covered. Each type of test essentially builds upon the other, i.e. e2e > integration > unit.

👇 Before we go any further — please connect with me on LinkedIn for future blog posts and Serverless news https://www.linkedin.com/in/lee-james-gilmore/

How do these tests fit into our SDLC? 💭

For me personally, what I want to get from testing in my SDLC is:

✔️ Fast feedback on any issues or errors; whether that is infrastructure (serverless) related, or business logic (unit).

✔️ I want confidence that my serverless infrastructure is working end to end without any issues, so I can fix these as early as possible.

✔️ ️I want to limit the number of re-deployments of my CDK stacks that I need to do.

✔️ I want the confidence that once my service is live I don’t break something which gets through a merge request.

What I typically see teams doing however is:

The engineers write a huge amount of code up front and then do a large deployment which surfaces many, many issues, such as permission issues, configuration problems, erroneous targets and filters, business logic issues etc. They fix one issue and re-deploy.

They then have another cycle of large amounts of changes and development work and another slow re-deploy, again surfacing many more issues. This cycle continues…

The testing here is typically manual through the console and checking CloudWatch logs — with low confidence and slow feedback; and they go through this process many, many times. At the end they have a completed feature; however future changes or additional features on the service have the same issues with slow speed and a lack of confidence.

With that in mind, there are some key aspects to how I need these tests to work that are non-negotiables:

✔️ I need to be able to run these infra tests locally from the IDE without going into the console or exclusively through the pipeline, but against actual cloud services on AWS for integration and e2e testing (no local emulation or mocking).

✔️ I want to limit the number of re-deploys I need to perform with the AWS CDK where possible.

✔️ I want my unit tests to run in memory and not interact with cloud services (or any other external dependencies).

This is the ideal for me personally:

Pre-deploy Stage

With this in mind, I start with defining contracts, whether that is for EventBridge events (versioned schemas) or OpenAPI contracts for our API Gateway (formerly Swagger).

We can then utilise a Spec first approach which is discussed below. We can also have a team member work on our Postman tests for our API at the same time, which includes success and failure scenarios, validation tests, as well as fuzz testing.

I am then writing unit tests that run locally and in memory without any external dependencies, which is super fast (i.e. we mock all secondary adapters); and I am also utilising frameworks such as CDK Nag here too for additional confidence in the security of the infrastructure. This includes both the CDK unit tests and the actual function unit tests.
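
As a quick example, wiring CDK Nag into a CDK app is typically just a case of adding the AwsSolutionsChecks aspect from the cdk-nag package; a minimal sketch is shown below (the OrdersStack name is illustrative):

import * as cdk from 'aws-cdk-lib';
import { AwsSolutionsChecks } from 'cdk-nag';

import { OrdersStack } from '../stateless/orders-stack'; // illustrative

const app = new cdk.App();
new OrdersStack(app, 'OrdersStack');

// run the AwsSolutions rule pack against every construct in the app at synth time
cdk.Aspects.of(app).add(new AwsSolutionsChecks({ verbose: true }));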

Post-deploy Stage

Following a successful deployment of the initial CDK code, I am then writing integration tests for my infrastructure with fast feedback on the issues discussed above, as my Jest assertions should weed them out quickly.

I am running the tests using Jest in the IDE locally, and not manually needing to play around with the console.

As these tests live with the code, I have ongoing confidence for any future changes or additional features.

The only time I re-deploy the stack at this point is if I make a change to the CDK infrastructure code based on a failed test; for a business logic error in our Lambda functions I can use the hotswapping feature (which redeploys the function code through the AWS SDK and not CloudFormation, which is super quick!).

I finally write a small amount of e2e tests for the main journeys through the system, which are typically a flow through multiple integration tests together.

If we look at the opinionated AWS Deployment Pipeline Reference Architecture (DPRA) we can see that:

✔️ Unit tests are run locally during development, and then subsequently at the build stage through the pipeline.

✔️ Integration and e2e tests are run during the Test (Beta) stage of the pipeline.

This is shown in the diagram:

What are the run times and complexities of these test types?

So a quick word on the time and complexities of these various test types which are shown below colour coded in green for unit (easy/quick to write and run), amber for integration (difficult to write and take longer to run), and red for e2e tests (more difficult to write and take the longest to run):

If we follow hexagonal architectures (more on this later) then we will find that unit testing is a breeze! In development this is both quick and easy to write, and very quick to run as they run locally in memory and we mock external dependencies.

Integration tests are typically more difficult to write than unit tests, and need to run in the cloud (more on ephemeral environments later), which makes them slower than unit tests in both development and running; but we can make them quicker to write using libraries or helper functions (more on this later!).

Finally, we have end-to-end tests, which are at least as difficult to write as integration tests and take longer to run. The difficulty to contend with is making assertions across an indeterminate amount of time: i.e. we start the flow and then need to assert, somehow, that our expectations are met at the end of the flow.

Going back to the testing honeycomb, for these reasons we want to ensure that we write only a small number of e2e tests to exercise the full stack (as they take the longest to run), get the most bang for our buck through integration tests (as this is where most of our issues derive from), and write a smaller number of unit tests for our functions.

💡 Note: If your organisation follows TDD and has heavy business logic then you may find that you need to make more use of unit tests, which is perfectly fine, as you presumably have fewer direct integrations and more functions and business logic (code).

A note on bounded contexts ⭕

One thing to ensure is that we are only testing the integrations within our own bounded context, and not that of others. This is where the notion of domain-driven design comes in. I won’t delve into this now as the following article covers it:

From our perspective in this example, we want to ensure that we only test within our own bounded context as shown below:

This means we have two ways of interacting with the domain as inputs from other domains/actors: our own orders EventBridge bus, or our orders API Gateway REST API. We don’t want to have any tests interacting directly with the central EventBridge ESB as we don’t own it — therefore we can simulate the ‘3rd Party Order Raised’ event by putting this directly onto our own Orders bus when testing the inbound events.

For publishing events via our stream handler function we can have a target rule on the central ESB for non-production environments which targets a temporary SQS queue in our own bounded context — allowing us to check the permissions and validity of the event being published to the central event bus and filtered to our queue.

💡 Note: We know what the contract is for this 3rd party synthetic event as we have the versioned event schema to ensure that we create and validate it correctly for our tests.

Ephemeral environments vs Local emulation

Before we go any further, a quick word on short-lived ephemeral environments deployed to the cloud for our development and test needs, vs local emulation of our cloud provider to allow for in memory development and testing.

“Local emulation — just don’t do it. I repeat..don’t do it!”

https://twitter.com/LeeJamesGilmore/status/1675144377881034752


When I first started out building serverless applications at scale many years ago we looked at solutions like Localstack for testing against what we believed to be a fully emulated version of AWS — with the main thinking being that it would reduce complexity, reduce costs, and increase speed. What we actually found was:

❌ It may have saved us a negligible amount in service costs as it was all local; however, the development cost of setting up Localstack or similar tools was huge (in both time and effort).

❌ These types of solutions never emulate the actual services 100%, and you end up spending more time working around this and any issues/bugs with the emulated services.

Instead, go all in on ephemeral environments which are deployed versions of your serverless code to AWS based on a specific PR or feature that you as an engineer are working on. This means:

✔️ The key benefit of serverless is that it is super cheap to run; so a fully deployed short-lived version of your serverless solution for a given PR or feature while in development will be negligible in costs (or perhaps even free if in free tier).

✔️ You have 100% confidence in running your tests as they are not being run on an emulated environment, they are being run against the actual AWS services!

This means as part of our SDLC and CI/CD pipelines we can spin up an ephemeral environment when we start development (a temporary version of our app and stacks), which will be automatically torn down when we merge into our main branch or finish our piece of work/feature.
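
A common way to achieve this with the CDK is to suffix the stack (and resource) names with a stage name taken from an environment variable, so each PR or feature branch gets its own isolated deployment. The sketch below illustrates the idea; the stack names and props are assumptions and not the repo’s exact code:

import * as cdk from 'aws-cdk-lib';

import { OrdersStatefulStack } from '../stateful/stateful-stack'; // illustrative
import { OrdersStatelessStack } from '../stateless/stateless-stack'; // illustrative

// e.g. STAGE=pr-123 for an ephemeral environment, falling back to a default stage
const stageName = process.env.STAGE || 'develop';

const app = new cdk.App();

// each ephemeral environment gets uniquely named stacks which can simply
// be destroyed again once the PR is merged or the feature is complete
new OrdersStatefulStack(app, `OrdersStatefulStack-${stageName}`, { stageName });
new OrdersStatelessStack(app, `OrdersStatelessStack-${stageName}`, { stageName });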

What are we building in this article?

Let’s now cover what we will be building in this solution to showcase how and where we would typically perform the three types of tests, taking into account the areas we have discussed so far:

✔️ We use the AWS CDK, TypeScript and Jest.
✔️ We use ephemeral environments for our testing efforts.
✔️ We cover unit, integration and e2e testing.

We are going to build the following solution which is described below:

  1. Customers can raise new orders on our system through our API Gateway REST API.
  2. We have a Lambda function integration which creates the order (and any business logic that goes with that).
  3. The new order is persisted to Amazon DynamoDB.
  4. A DynamoDB stream of table changes invokes a Lambda function.
  5. The stream handler function publishes a new event onto the central ESB which is outside of our account.
  6. Other domains can raise events, such as 3rd party systems raising new orders through a command.
  7. The 3rd party order event has a target rule to the orders domain bus.
  8. We have an SQS queue which throttles the orders as a target from the orders bus.
  9. The create 3rd party order function persists the new order to the orders table.
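
To make those moving parts concrete, a rough and heavily simplified CDK sketch of the first half of this flow might look like the following. The construct names and entry paths are illustrative only and not the repo’s exact stack code:

import * as cdk from 'aws-cdk-lib';
import * as apigw from 'aws-cdk-lib/aws-apigateway';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as nodeLambda from 'aws-cdk-lib/aws-lambda-nodejs';
import { DynamoEventSource } from 'aws-cdk-lib/aws-lambda-event-sources';
import { Construct } from 'constructs';

export class OrdersStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // (3) the orders table, with streams enabled for downstream publishing
    const table = new dynamodb.Table(this, 'OrdersTable', {
      partitionKey: { name: 'id', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
      stream: dynamodb.StreamViewType.NEW_AND_OLD_IMAGES,
    });

    // (2) the create order lambda which sits behind the rest api
    const createOrderLambda = new nodeLambda.NodejsFunction(this, 'CreateOrder', {
      entry: 'src/adapters/primary/create-order.adapter.ts', // illustrative path
      environment: { TABLE_NAME: table.tableName },
    });
    table.grantWriteData(createOrderLambda);

    // (1) the orders rest api with a POST /orders resource
    const api = new apigw.RestApi(this, 'OrdersApi');
    api.root
      .addResource('orders')
      .addMethod('POST', new apigw.LambdaIntegration(createOrderLambda));

    // (4) and (5) the stream handler which publishes events to the central ESB
    const streamLambda = new nodeLambda.NodejsFunction(this, 'StreamHandler', {
      entry: 'src/adapters/primary/stream.adapter.ts', // illustrative path
    });
    streamLambda.addEventSource(
      new DynamoEventSource(table, {
        startingPosition: lambda.StartingPosition.LATEST,
      })
    );
  }
}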

In the next section let’s discuss the need for testing in serverless solutions.

E2E and Integration Testing: why do we need it in serverless development?

So we have discussed what each of the test types are at a high level; but why do we need these tests in our SDLC?

Let’s look at the following diagram which shows some of the issues we typically face:

Day to day, the typical issues I see which are not noticeable until the stacks have actually been deployed are (but not limited to):

Target Error. We have created an event rule or filter which we believe is correct, but actually contains an error or misconfiguration which prevents events being routed correctly. We don’t see this issue until we actually deploy the stack and test it, which is time-consuming, even though the stack itself deploys fine.

Invalid Permissions. We have configured the integration, however we have not granted the correct permissions to allow the services to interact; and we don’t see this until the stack is deployed and tested. A great example would be a function which doesn’t have the IAM permissions to write to a DynamoDB table.

Transform Error. In our functions we believe we are transforming the event payload correctly, however we have an error in our logic which means the integration won’t work. We don’t notice this until fully deployed and our function errors.

For more information on the complexities of serverless see the following article:

💡 Note: This is not just about one-time setup to test the integrations, but ongoing integration tests for any failures which might be introduced by an engineer; changes which look fine in a merge request but break the integrations when deployed.

Do we get any support from the AWS CDK and TypeScript? Yes!

This article focuses specifically on TypeScript and the AWS CDK, so you may be wondering if this helps us out at all — well yes, it does, compared to working in plain JavaScript (Node.js) or with declarative IaC frameworks such as the Serverless Framework.

The first thing to mention about the CDK is the use of construct helper methods like the one below which are in abundance:

// allow the lambda to write to the table
props.table.grantWriteData(createOrderLambda);

This helper method will ensure that the createOrderLambda function can write to the DynamoDB table, and this method is fully tested already by the CDK team. Two things with this though:

  1. We may simply forget to add this line in our code in the first place! — meaning the function won’t have access to the table, and we won’t notice this until deployed. (i.e. this doesn’t help with the absence of code).
  2. Some of these helper methods give more permissions than we sometimes want, meaning we need to create the IAM policies ourselves. This is where we are likely to make a mistake in the creation of the policies.

The example below shows that we actually get permissions for BatchWriteItem, PutItem, UpdateItem, DeleteItem and DescribeTable; where we actually only need PutItem for our specific needs.

Example of over permissive helper functions in the AWS CDK.
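
If we genuinely only need PutItem, one option (sketched below) is to skip the helper and attach a narrowly scoped policy to the function’s role ourselves, which is exactly where hand-rolled mistakes tend to creep in (and another reason the integration tests earn their keep):

import * as iam from 'aws-cdk-lib/aws-iam';

// grant only dynamodb:PutItem on the specific table, rather than the
// broader set of actions that grantWriteData would add
createOrderLambda.addToRolePolicy(
  new iam.PolicyStatement({
    effect: iam.Effect.ALLOW,
    actions: ['dynamodb:PutItem'],
    resources: [props.table.tableArn],
  })
);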

TypeScript will also help with the event payloads which we need to transform when using the @types/aws-lambda typings. This is shown below:

export const createOrderAdapter = async ({
  body,
}: APIGatewayProxyEvent): Promise<APIGatewayProxyResult> => {
  try {
    if (!body) throw new ValidationError('no payload body');

    const order = JSON.parse(body) as CreateOrderDto;

    ...
  }
};

But we may still have an error in our code when we try to map the parsed JSON string payload into a custom type for our application, or when we transform the response back out (especially when mapping, filtering or transforming payloads).

Now, let’s move onto hexagonal architectures (clean code) before jumping into code examples.

Clean code: how can hexagonal architecture help?

If we want to make our overall testing easier, we should focus on the use of clean code and hexagonal architectures for any Lambda functions. This is a way of writing our code whereby we structure it with the aim of keeping the business logic devoid of technical implementation details (frameworks and 3rd party modules), and aim to ensure that code which interacts with services (API Gateway, DynamoDB etc.) is devoid of business logic. This is shown below for a lightweight version of hexagonal architectures and clean code which we will use in this article:

We can see that the Lambda function has three main facets:

✔️ Primary Adapters. These are the interfaces with the services which are input for our functions (driving side). An example could be a primary adapter for API Gateway as an event source, or in the diagram above the event source for an SQS queue. These are specific to a given function (not shared) and are devoid of any business logic.

✔️ Use Cases. The use case is the actual implementation of the function, which may include business logic, and using secondary adapters to retrieve and/or persist records into a data store, or perhaps raising domain events through Amazon EventBridge. These should be devoid of any frameworks or technical details, purely taking inputs from the driving side and using secondary adapters on the driven side.

✔️ Secondary Adapters. The secondary adapters are interfaces to other services for notifications or storing data in repositories (such as DynamoDB). These are devoid of business logic and can typically be used across many functions (driven side).

💡 Note: Secondary adapters may be shared across multiple services so can be integration tested as modules themselves which we will cover in this article too. In our example we have a secondary adapter for DynamoDB access, with one of the methods ‘createOrder’ being used in multiple Lambda functions (use cases).
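
As a rough sketch of how these three facets hang together in TypeScript (condensed into one block here; in reality each piece lives in its own file, and the names are illustrative, loosely mirroring the adapters mentioned in this article):

import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, PutCommand } from '@aws-sdk/lib-dynamodb';
import { randomUUID } from 'crypto';

type CreateOrderDto = { productId: string; quantity: number; price: number };
type OrderDto = CreateOrderDto & { id: string; created: string };

const dynamoDb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// secondary adapter (driven side) - talks to dynamodb, devoid of business logic
export async function createOrder(order: OrderDto, tableName: string): Promise<OrderDto> {
  await dynamoDb.send(new PutCommand({ TableName: tableName, Item: order }));
  return order;
}

// use case - the business logic, devoid of aws or framework specifics
export async function createOrderUseCase(newOrder: CreateOrderDto): Promise<OrderDto> {
  const order: OrderDto = {
    id: randomUUID(),
    created: new Date().toISOString(),
    ...newOrder,
  };
  return createOrder(order, process.env.TABLE_NAME as string);
}

// primary adapter (driving side) - translates the api gateway event, no business logic
export const createOrderAdapter = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  const created = await createOrderUseCase(JSON.parse(event.body as string));
  return { statusCode: 201, body: JSON.stringify(created) };
};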

I won’t go into unit testing in much more detail as the following articles take us on the journey of a lightweight clean code example with unit tests, through to fully fledged hexagonal architectures and domain-driven design (DDD).

One of the key reasons that this helps us with our unit testing is that we can mock the driven side i.e. secondary adapters; which means we can isolate our system under test (SUT) without actually calling out to other systems or services (services such as DynamoDB, EventBridge, SQS, HTTPS etc). We will walk through how we do this later in the article.

A word on the ‘aws-async-test-library’ 🪐

The most typical way people integration and e2e test their serverless services is to use the aws-sdk along with a testing library in their native language (for us TypeScript and Jest), which requires quite a bit of code and setup in the aws-sdk v3.

In our example, we also want to have functionality such as arbitrary delays and retry counts, which we don’t get natively with either.

For this reason I have created a local library called the ‘aws-async-test-library’, which is part of this solution in the packages folder here: ‘orders-domain-service/packages/aws-async-test-library’:

This library has a whole host of helper functions and utilities for our integration and e2e test needs, such as auto generating IDs, creating temporary queues, putting items in DynamoDB tables, publishing events to EventBridge, asserting values are on queues and much much more…
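
Many of these helpers simply poll and wait; for example, the delay utility used in the function below could be as simple as the following sketch (an assumption that it takes a value in seconds, as the timeoutInSeconds parameter suggests):

// resolve after the given number of seconds - used between polling attempts
export async function delay(seconds: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, seconds * 1000));
}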

Let’s have a quick look at the code for one function as an example, specifically the hasMessage function for SQS:

import {
  ReceiveMessageCommand,
  ReceiveMessageCommandOutput,
  SQSClient,
} from '@aws-sdk/client-sqs';

import { delay } from '@packages/aws-async-test-library';

export async function hasMessage(
  region: string,
  queueUrl: string,
  timeoutInSeconds: number = 2,
  maxIterations: number = 20,
  property?: string,
  propertyValue?: string | number
): Promise<string | null> {
  const client = new SQSClient({ region });

  let iteration = 1;

  while (iteration <= maxIterations) {
    const receiveMessageCommand = new ReceiveMessageCommand({
      QueueUrl: queueUrl,
      MaxNumberOfMessages: 1,
      WaitTimeSeconds: 2,
    });

    try {
      const response: ReceiveMessageCommandOutput = await client.send(
        receiveMessageCommand
      );

      const messages = response.Messages;

      if (messages && messages.length > 0) {
        const message = messages[0];

        if (!message.Body) {
          throw new Error('message has no body');
        }

        // if the optional property value is provided then filter the message
        if (property && propertyValue) {
          const parsedMessage = JSON.parse(message.Body);

          const properties = property.split('.');
          let value = parsedMessage;

          for (const property of properties) {
            if (value && value.hasOwnProperty(property)) {
              value = value[property];
            } else {
              return null;
            }
          }

          if (value === propertyValue) {
            return message.Body;
          }
        } else {
          // if not then just return the first message found
          return message.Body;
        }
      }
    } catch (error) {
      console.error('Error receiving message:', error);
      throw error;
    }

    await delay(timeoutInSeconds);

    iteration++;
  }

  return null;
}

In this basic example function it:

  1. Allows the user to specify to either grab the first message from the queue, or to refine the search based on a specific property of the message and the property value.
  2. Since our tests are async we allow for an arbitrary delay and max iterations count, so we can check for the message being on the queue multiple times.
  3. Once we find the specific message we return the message body for the tests.

Q: “Could we not do something clever with AppSync subscriptions, streams or StepFunctions here for our testing library to get the results without polling and delays?”

A: “Yes, we could; however, personally I think this is WAY more complicated than it needs to be. I don’t want to create any more bespoke code than I already have in this library, and I don’t want to introduce more AWS services and complexity than needed!”

Now let’s jump into the test code in the next section.

🪐 Note: If anybody wants to tidy it up with me (since I quickly wrote this for an article) and open source it please reach out!!

Talking through key code: ⚙️ Integration Tests

Let’s now talk through the code and examples for our three integration tests.

💡 Note: We can run these integration tests using npm run test:integration.

1. Create order API to DynamoDB table

The first integration test we have is utilising the Orders API Gateway API to create (POST) a new order on the resource /orders/; and then ensuring it is created in the DynamoDB table:

This ensures that:

✔️ API Gateway has the correct Lambda integration, resources, configuration and permissions.
✔️ The Create Order Lambda function has the correct permissions to put items into DynamoDB, it has transformed/used the event source payload correctly, and that the environment variables are setup correctly on the function.
✔️ It exercises the full function code as a unit test would, i.e. the business logic.

The code for this can be found here: orders-domain-service/tests/integration/1-create-order-api-to-dynamodb/create-order-api-to-dynamodb.integration.ts:

import {
  clearTable,
  getItem,
  httpCall,
} from '@packages/aws-async-test-library';

import { config } from '@config/config';

// set up our constants
const region = config.get('region');
const stageName = config.get('stageName');
const endpoint = config.get('apiEndpoint');

const tableName = `orders-internal-domain-table-${stageName}`;

//                 _____________       _____________       _____________
//                |   Orders    |     |   handler   |     |    Table    |
// --> post http  |    (API)    | --> |  (Lambda)   | --> |  (DynamoDB) | --> Check item exists
//                 -------------       -------------       -------------

describe('create-order-api-to-dynamodb', () => {
  beforeAll(() => {
    jest.retryTimes(2, { logErrorsBeforeRetry: false });
  });

  beforeEach(async () => {
    await clearTable(region, tableName);
  }, 20000);

  afterEach(async () => {
    await clearTable(region, tableName);
  }, 20000);

  it('should create the record successfully in the table', async () => {
    // arrange
    const payload = {
      quantity: 199,
      price: 1.99,
      productId: 'PROD-123',
    };

    const resource = 'orders';
    const method = 'POST';

    // act - call the api endpoint to create a new order
    const { id } = await httpCall(endpoint, resource, method, payload);

    // assert - get the item from the db based on the auto generated id from the api call
    const tableItem = await getItem(region, tableName, id, 10, 2);
    expect(tableItem).toMatchObject(payload);
  }, 60000);
});

We can see from the code above that:

  1. Before and after each test we clear down the DynamoDB table to ensure we don’t have old test data hanging around.
  2. We use the httpCall function to post a new order to our API Gateway on /orders/.
  3. We then use the getItem function to ensure that the item with the specific ID is in DynamoDB and that the payload is the same (i.e. the order details).

💡 Note: In Jest we have set the test timeout to 60 seconds as this is an async, eventually consistent flow through three services. We also retry the test up to two times, which covers any timeouts, cold starts or network-style issues.
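
The httpCall helper itself comes from the aws-async-test-library; a minimal sketch of what a helper like this might look like is shown below (an assumption, using the global fetch available in Node 18+, and returning the parsed JSON response body which here contains the auto-generated order id):

export async function httpCall<T = Record<string, unknown>>(
  endpoint: string,
  resource: string,
  method: string,
  payload?: Record<string, unknown>
): Promise<T> {
  // call the deployed api gateway endpoint and return the parsed json response
  const response = await fetch(`${endpoint}/${resource}`, {
    method,
    headers: { 'Content-Type': 'application/json' },
    body: payload ? JSON.stringify(payload) : undefined,
  });

  if (!response.ok) {
    throw new Error(`request failed with status ${response.status}`);
  }

  return (await response.json()) as T;
}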

2. DynamoDB table to Shared Event Bus

The second integration test ensures that when an item is added directly to the DynamoDB table that the event is published to the Shared Event Bus (via a temporary SQS queue for the test which is a target of the shared event bus).

This ensures that:

✔️ The DynamoDB table has streams enabled.
✔️ The stream Lambda function is an event source for the table.
✔️ The stream handler Lambda function has permission to put an event onto the Shared event bus.
✔️ It exercises the full function code as a unit test would.

Q: “Why not just create a queue in your CDK code for any non-production environments instead of creating it on the fly for the test and tearing it down?”

A: “Firstly, there is no problem in doing that. I personally just don’t want to risk leaving them in accidentally, giving us additional attack surface or cost in production that is not required.”

The code for this can be found here: orders-domain-service/tests/integration/2-orders-table-stream-to-shared-bus/orders-table-stream-to-shared-bus.integration.ts:

import {
  clearTable,
  createQueueForBus,
  deleteQueue,
  deleteTargetRule,
  generateRandomId,
  hasMessage,
  putItem,
} from '@packages/aws-async-test-library';

import { config } from '@config/config';

// we ensure that the event bus name never clashes with other ephemeral envs
const id = generateRandomId(7);

// set up our constants
const region = config.get('region');
const stageName = config.get('stageName');

const tableName = `orders-internal-domain-table-${stageName}`;
const sharedBusName = `shared-domains-event-bus-${stageName}`;

let queueUrl: string;

const queueName = `${id}-sqs`;
const ruleName = `${sharedBusName}-rule`;
const source = 'com.order.internal';
const detailType = 'OrderCreatedEvent';

//               _____________      _____________      _____________      _________________
//              |   Orders    |    |   stream    |    |   handler   |    |   shared bus    |
// --> put item |  (DynamoDB) |--> |  (DynamoDB) |--> |  (Lambda)   |--> |  (EventBridge)  | --> Check temp queue
//               -------------      -------------      -------------      -----------------

describe('orders-table-stream-to-shared-bus', () => {
  beforeEach(async () => {
    await clearTable(region, tableName);
  }, 20000);

  afterEach(async () => {
    await clearTable(region, tableName);
  }, 20000);

  beforeAll(async () => {
    jest.retryTimes(2, { logErrorsBeforeRetry: false });

    // create a temp sqs queue as a target for the shared event bus
    // which we will remove after the tests have ran
    queueUrl = await createQueueForBus(
      sharedBusName,
      queueName,
      region,
      source,
      detailType,
      ruleName
    );
  }, 12000);

  afterAll(async () => {
    // delete the temp queue and the temp target rule
    await deleteTargetRule(region, sharedBusName, ruleName);
    await deleteQueue(queueName);
  }, 12000);

  it('should create the event successfully on the shared bus', async () => {
    // arrange
    const id = generateRandomId();
    const order = {
      id,
      created: '2023-07-05T11:09:01.930Z',
      price: 19.66,
      productId: 'PROD-1966',
      quantity: 10,
    };

    // act - put the order directly into the dynamodb table
    await putItem(region, tableName, order);

    // assert - ensure that the event is on our temp sqs queue
    // note: this is a temp queue to ensure that the function published
    // the event to the shared event bus
    const message = (await hasMessage(
      queueUrl,
      20,
      20,
      'detail.id',
      id // ensure we only grab our own message from sqs
    )) as string;
    expect(JSON.parse(message)).toMatchObject({
      'detail-type': 'OrderCreatedEvent',
      source: 'com.order.internal',
      detail: {
        quantity: 10,
        productId: 'PROD-1966',
        created: '2023-07-05T11:09:01.930Z',
        price: 19.66,
        id,
      },
    });
  }, 60000);
});

We can see from the code above that:

  1. We clear down the table before and after the tests as we did in the test above.
  2. We use the createQueueForBus function to create a temporary SQS queue as a target for the shared event bus (which we don’t own) before the test runs.
  3. We use the putItem function to directly put the item into our DynamoDB table with an autogenerated ID.
  4. We then assert that the message with the same ID and payload is now on the temporary queue using the hasMessage function.

💡 Note: After the test we delete the rule and target to our temporary SQS queue and then remove the queue itself.
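
The createQueueForBus helper does a fair amount of work under the hood: create the temporary queue, allow EventBridge to deliver to it, create a rule on the bus for the given source and detail-type, and add the queue as a target. A rough sketch of how such a helper could be implemented with the AWS SDK v3 is shown below; this is an illustrative assumption of the approach, not the library’s exact code:

import {
  CreateQueueCommand,
  GetQueueAttributesCommand,
  SQSClient,
} from '@aws-sdk/client-sqs';
import {
  EventBridgeClient,
  PutRuleCommand,
  PutTargetsCommand,
} from '@aws-sdk/client-eventbridge';

// illustrative sketch only - the real helper lives in the aws-async-test-library
export async function createQueueForBus(
  busName: string,
  queueName: string,
  region: string,
  source: string,
  detailType: string,
  ruleName: string
): Promise<string> {
  const sqs = new SQSClient({ region });
  const eventBridge = new EventBridgeClient({ region });

  // 1. create the temporary queue, allowing eventbridge to send messages to it
  const { QueueUrl: queueUrl } = await sqs.send(
    new CreateQueueCommand({
      QueueName: queueName,
      Attributes: {
        Policy: JSON.stringify({
          Version: '2012-10-17',
          Statement: [
            {
              Effect: 'Allow',
              Principal: { Service: 'events.amazonaws.com' },
              Action: 'sqs:SendMessage',
              Resource: '*', // scoped wide here purely for the sketch
            },
          ],
        }),
      },
    })
  );

  // 2. look up the queue arn so it can be used as a rule target
  const { Attributes } = await sqs.send(
    new GetQueueAttributesCommand({
      QueueUrl: queueUrl,
      AttributeNames: ['QueueArn'],
    })
  );

  // 3. create a rule on the bus matching the given source and detail-type
  await eventBridge.send(
    new PutRuleCommand({
      Name: ruleName,
      EventBusName: busName,
      EventPattern: JSON.stringify({
        source: [source],
        'detail-type': [detailType],
      }),
    })
  );

  // 4. add the temporary queue as a target of the rule
  await eventBridge.send(
    new PutTargetsCommand({
      Rule: ruleName,
      EventBusName: busName,
      Targets: [{ Id: queueName, Arn: Attributes!.QueueArn! }],
    })
  );

  return queueUrl as string;
}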

3. Order Event Bus to DynamoDB table

The third integration test ensures that when a synthetic 3rd party order event is added to the internal orders event bus, the order is created in the DynamoDB table:

This ensures that:

✔️ The orders queue is an event target for the orders event bus through a properly configured rule, and has the permissions setup correctly.
✔️ The Lambda function is an event source for the orders queue and has the permissions to consume messages.
✔️ The Lambda function has the correct permissions to put an item into the DynamoDB table, and it has the correct environment variables configured on the function.
✔️ It exercises the full function code as a unit test would.

The code for this can be found here: orders-domain-service/tests/integration/3-orders-bus-to-dynamodb/orders-bus-to-dynamodb.integration.ts:

import { clearTable, putEvent } from '@packages/aws-async-test-library';

import { config } from '@config/config';
import { scanItems } from '@packages/aws-async-test-library/dynamo-db/scan-items';

// set up our constants
const region = config.get('region');
const stageName = config.get('stageName');

const tableName = `orders-internal-domain-table-${stageName}`;
const busName = `orders-domain-event-bus-${stageName}`;

//                _________________      _________      ____________      ______________
//               |   Orders Bus    |    |  Queue  |    |  Handler   |    |    Table     |
// --> put event |  (EventBridge)  |--> |  (SQS)  |--> |  (Lambda)  |--> |  (DynamoDB)  | --> Check item exists
//                -----------------      ---------      ------------      --------------

describe('3rd-party-orders-bus-to-dynamodb', () => {
  beforeAll(() => {
    jest.retryTimes(2, { logErrorsBeforeRetry: false });
  });

  beforeEach(async () => {
    await clearTable(region, tableName);
  }, 20000);

  afterEach(async () => {
    await clearTable(region, tableName);
  }, 20000);

  it('should create the 3rd party order successfully in the table', async () => {
    // arrange
    const event = {
      quantity: 9811,
      price: 2,
      productId: '3RD_PARTY_ORDER',
    };

    // act
    await putEvent(
      region,
      busName,
      'com.shared.bus',
      '3rdPartyOrderRaised',
      event
    );

    // assert - get the item from the db based on the auto generated id from the api call
    const tableItems = await scanItems(region, tableName, 7);

    expect(tableItems[0]).toMatchObject({
      quantity: 9811,
      price: 2,
      productId: '3RD_PARTY_ORDER',
    });
  }, 60000);
});

We can see from the code above that:

  1. Before and after each test we clear down the DynamoDB table as in the preceding tests.
  2. We create a synthetic 3rd party order event and publish this to our internal orders domain bus using the putEvent function.
  3. We use the scanItems function to ensure that the order now lives in the DynamoDB table as expected with the correct payload.

💡 Note: This has ensured the validity of configuration and permissions across four different AWS services.

Question: “But what about our typical Postman integration tests, are they not needed now??”

Answer: For me personally we still have the delineation between the two, and there is very much a need for the Postman-style tests. I see these service integration tests above as testing the ‘plumbing’ of the services and the happy paths, and the Postman tests actually checking fine-grained responses such as payloads, content types and status codes; and checking validation by using different payloads (including fuzz and smoke testing). Some solutions may not have any API as an interaction point for Postman either, so we need to perform the tests via queues, topics, event buses etc.

For more on Acceptance tests with Cypress and Postman Integration tests for our API see the below article which performs a deep dive and has full code examples:

Module integration tests 🎁

At this point, it is worth us talking about two other integration tests we have specifically for our secondary adapters:

  • Database Adapter. This is a shared secondary adapter used by many functions which allows functions to utilise DynamoDB as a service.
  • Event Adapter. This is a shared secondary adapter used by many functions for publishing events to Amazon EventBridge.

💡 Note: We can run these module integration tests using npm run test:int.

We have integration tested these two files as it means we have confidence that they work as expected against our two cloud services (Amazon DynamoDB and Amazon EventBridge) — and that they can be pulled into any of our existing use cases and work correctly (we could publish these as modules to NPM and import them if we wished to reuse them across many services in the future).

✔️ Database Adapter

The code for the database adapter integration test is shown below:

import {
  checkTableExists,
  clearTable,
  createTable,
  deleteTable,
  getItem,
} from '@packages/aws-async-test-library/dynamo-db';

import { OrderDto } from '@dto/order-dto';
import { config } from '@config/config';
import { generateRandomId } from '@packages/aws-async-test-library';

// we ensure that the table name never clashes with other ephemeral envs
const tableName = `database-adapter-${generateRandomId(7)}-table`;

// set up our constants
const region = config.get('region');

describe('database-adapter-integration-tests', () => {
  beforeAll(async () => {
    jest.retryTimes(2, { logErrorsBeforeRetry: false });

    // create the table if it doesn't already exist
    const tableExists = await checkTableExists(region, tableName);
    if (!tableExists) {
      await createTable(
        region,
        tableName,
        [{ AttributeName: 'id', KeyType: 'HASH' }],
        [{ AttributeName: 'id', AttributeType: 'S' }]
      );
    }
  }, 12000);

  afterEach(async () => {
    // after each test clear the table down to ensure we have a clean start
    await clearTable(region, tableName);
  }, 12000);

  afterAll(async () => {
    // after all tests delete the table if it exists
    const tableExists = await checkTableExists(region, tableName);
    if (tableExists) {
      await deleteTable(region, tableName);
    }
  }, 12000);

  describe('create order successfully', () => {
    it('should write the record to the table successfully', async () => {
      // arrange
      const record: OrderDto = {
        quantity: 3,
        productId: 'test-123',
        id: '4f9a8a09-b2c1-49a1-bc20-7bbc0811c0dc',
        created: '2023-07-04T09:12:30.138Z',
        price: 3.98,
      };

      // act
      const { createOrder } = await import('./database-adapter');
      await createOrder(record, tableName);

      // assert
      const item = await getItem(
        region,
        tableName,
        '4f9a8a09-b2c1-49a1-bc20-7bbc0811c0dc'
      );

      expect(item).toEqual(record);
    }, 30000);
  });
});

We can see that:

  1. Before the tests we create a new DynamoDB table with an autogenerated name using the createTable function if it doesn’t already exist (checking with the checkTableExists function), and after all tests have run we delete the table using the deleteTable function. After each test we also clear down the table using the clearTable function.
  2. We create a new item with a specific ID using the createOrder function from our adapter.
  3. We then get the item using the getItem function, and assert it is in the correct shape and has the correct properties.

✔️ Event Adapter

The code for the event adapter integration test is shown below:

import {
  checkBusExists,
  createBus,
  deleteBus,
} from '@packages/aws-async-test-library/event-bridge';
import {
  createQueueForBus,
  deleteQueue,
  hasMessage,
} from '@packages/aws-async-test-library/sqs';

import { PublishEventBody } from './event-adapter';
import { config } from '@config/config';
import { generateRandomId } from '@packages/aws-async-test-library';

// we ensure that the event bus name never clashes with other ephemeral envs
const id = generateRandomId(7);

// set up our constants
const region = config.get('region');
const bus = `${id}-bus`;
const queue = `${id}-sqs`;

// setup our test constants
const source = 'com.acme.source';
const detailType = 'createOrder';
const ruleName = `${bus}-rule`;

let queueUrl: string;

describe('event-adapter-integration-tests', () => {
  beforeAll(async () => {
    jest.retryTimes(2, { logErrorsBeforeRetry: false });

    // create the bus if it doesn't already exist
    const busExists = await checkBusExists(region, bus);
    if (!busExists) {
      await createBus(region, bus);
      queueUrl = await createQueueForBus(
        bus,
        queue,
        region,
        source,
        detailType,
        ruleName
      );
    }
  }, 12000);

  afterAll(async () => {
    // after all we delete the bus and queue
    await deleteBus(region, bus, ruleName);
    await deleteQueue(queue);
  }, 12000);

  describe('publish event successfully', () => {
    it('should publish the event successfully to the eventbridge bus', async () => {
      // arrange
      const event: PublishEventBody = {
        event: {
          quantity: 3,
          productId: 'test-123',
          id: '4f9a8a09-b2c1-49a1-bc20-7bbc0811c0dc',
          created: '2023-07-04T09:12:30.138Z',
          price: 3.98,
        },
        detailType,
        source,
        eventVersion: '1',
        eventDateTime: '2023-07-04T09:12:30.138Z',
        eventBus: bus,
      };

      // act
      const { publishEvent } = await import('./event-adapter');
      await publishEvent(event);

      // assert
      const message = (await hasMessage(queueUrl)) as string;

      expect(JSON.parse(message)).toMatchObject({
        'detail-type': detailType,
        source,
        detail: {
          quantity: 3,
          productId: 'test-123',
          price: 3.98,
        },
      });
    }, 30000);
  });
});

We can see that:

  1. Firstly we check if the bus already exists using the checkBusExists function, and if it doesn’t we create it using the createBus function. We also create a new queue and the relevant target and rule for it using the createQueueForBus function.
  2. We use our publishEvent function under test to publish a new event to the bus.
  3. We assert that the message is now on the queue using the hasMessage function and that it has the correct shape and properties.
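
For context, the publishEvent secondary adapter under test might be sketched roughly as follows; this is an assumption based on the PublishEventBody shape and the assertions above, not the repo’s exact code:

import {
  EventBridgeClient,
  PutEventsCommand,
} from '@aws-sdk/client-eventbridge';

// hypothetical sketch of the event adapter under test
export interface PublishEventBody {
  event: Record<string, unknown>;
  detailType: string;
  source: string;
  eventVersion: string;
  eventDateTime: string;
  eventBus: string;
}

const client = new EventBridgeClient({});

export async function publishEvent({
  event,
  detailType,
  source,
  eventVersion,
  eventDateTime,
  eventBus,
}: PublishEventBody): Promise<void> {
  // publish the event to the given eventbridge bus, with the version and
  // datetime carried as metadata alongside the event detail (an assumption)
  await client.send(
    new PutEventsCommand({
      Entries: [
        {
          EventBusName: eventBus,
          Source: source,
          DetailType: detailType,
          Detail: JSON.stringify({
            ...event,
            metadata: { eventVersion, eventDateTime },
          }),
        },
      ],
    })
  );
}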

Now let’s move onto e2e testing in the next section.

Talking through key code: 🚚 e2e Tests

Now we will take a look at our e2e tests, which test two full user journeys, exercising the full infrastructure path end to end and covering everything that we have already done as part of our integration and unit tests above (all plugged together):

  • 1. Customer Order Journey 🛍️
  • 2. Third Party Order Journey 🚚

💡 Note: We can run these end to end tests using npm run test:e2e.

1. Customer Order Journey

The first e2e test is performed via our Orders API with a POST request for a new order, and we then check that we ultimately have the message on a temporary SQS queue which is a target of the shared event bus. This essentially tests the full flow of our first two integration tests above.

The code for this is shown below:

import {
  createQueueForBus,
  deleteQueue,
  deleteTargetRule,
  generateRandomId,
  hasMessage,
  httpCall,
} from '@packages/aws-async-test-library';

import { config } from '@config/config';

// we ensure that the sqs queue name never clashes with other ephemeral envs
const id = generateRandomId(7);

// set up our constants
const region = config.get('region');
const stageName = config.get('stageName');
const endpoint = config.get('apiEndpoint');

const tableName = `orders-internal-domain-table-${stageName}`;
const sharedBusName = `shared-domains-event-bus-${stageName}`;

let queueUrl: string;

const queueName = `${id}-sqs`;
const ruleName = `${sharedBusName}-rule`;
const source = 'com.order.internal';
const detailType = 'OrderCreatedEvent';

//                 _____________      _____________      _____________
//                |   Orders    |    |   handler   |    |    Table    |
// --> post http  |    (API)    |--> |  (Lambda)   |--> |  (DynamoDB) |--|
//                 -------------      -------------      -------------   |
//                                                                        |
//    _____________________________________________________________________
//   |
//   V
//    _____________      _____________      _________________
//   |   Stream    |    |   Handler   |    |   shared bus    |
//   |  (DynamoDB) |--> |  (Lambda)   |--> |  (EventBridge)  | --> Check temp queue ✔️
//    -------------      -------------      -----------------

describe('customer-order-journey', () => {
  beforeAll(async () => {
    jest.retryTimes(2, { logErrorsBeforeRetry: false });

    // create a temp sqs queue as a target for the shared event bus
    // which we will remove after the tests have ran
    queueUrl = await createQueueForBus(
      sharedBusName,
      queueName,
      region,
      source,
      detailType,
      ruleName
    );
  }, 12000);

  afterAll(async () => {
    // delete the temp queue and the temp target rule
    await deleteTargetRule(region, sharedBusName, ruleName);
    await deleteQueue(queueName);
  }, 12000);

  it('should create the event successfully on the shared bus', async () => {
    // arrange
    const payload = {
      quantity: 234,
      price: 4.97,
      productId: 'PROD-E2E',
    };

    const resource = 'orders';
    const method = 'POST';

    // act - call the api endpoint to create a new order and grab the auto generated id
    const { id } = await httpCall(endpoint, resource, method, payload);

    // assert - ensure that the event is on our temp sqs queue
    // note: this is a temp queue to ensure that the function published
    // the event to the shared event bus
    const message = (await hasMessage(
      queueUrl,
      2,
      20,
      'detail.id',
      id // ensure we only grab our own message from sqs
    )) as string;
    expect(JSON.parse(message)).toMatchObject({
      'detail-type': 'OrderCreatedEvent',
      source: 'com.order.internal',
      detail: {
        quantity: 234,
        price: 4.97,
        productId: 'PROD-E2E',
        id,
      },
    });
  }, 120000);
});

Let’s talk through the code:

  1. We start by creating a new temporary SQS queue with the relevant target and rule from the shared event bus using the createQueueForBus function.
  2. We use the httpCall function to create a new order via our API Gateway API, and use the returned autogenerated ID from the POST request later for our assertion.
  3. We use the hasMessage function to ensure that the message with the given ID property resides on the temporary queue with the correct shape and properties.

💡 Note: We also delete the temporary queue and any targets following all of the tests in the suite running.

2. Third Party Order Journey

The second e2e test is performed via dropping a 3rd party order event into our Orders internal event bus, and we then check that we ultimately have the message on a temporary SQS queue which is a target of the shared event bus. This essentially tests the full flow of our third and second integration tests.

The code for this is shown below:

import {
  createQueueForBus,
  deleteQueue,
  deleteTargetRule,
  generateRandomId,
  hasMessage,
  putEvent,
} from '@packages/aws-async-test-library';

import { config } from '@config/config';

// we ensure that the sqs queue name never clashes with other ephemeral envs
const id = generateRandomId(7);

// set up our constants
const region = config.get('region');
const stageName = config.get('stageName');

const busName = `orders-domain-event-bus-${stageName}`;
const sharedBusName = `shared-domains-event-bus-${stageName}`;

let queueUrl: string;

const queueName = `${id}-sqs`;
const ruleName = `${sharedBusName}-rule`;
const source = 'com.order.internal';
const detailType = 'OrderCreatedEvent';

//                 __________      _________      ____________
//                |  Orders  |    |  Queue  |    |  Handler   |
// --> put event  |  (Bus)   |--> |  (SQS)  |--> |  (Lambda)  |--|
//                 ----------      ---------      ------------   |
//                                                                |
//    _____________________________________________________________
//   |
//   V
//    ______________      ______________      _____________      _________________
//   |    Table     |    |    Stream    |    |   Handler   |    |   shared bus    |
//   |  (DynamoDB)  |--> |  (DynamoDB)  |--> |  (Lambda)   |--> |  (EventBridge)  | --> Check temp queue ✔️
//    --------------      --------------      -------------      -----------------

describe('3rd-party-order-journey', () => {
  beforeAll(async () => {
    jest.retryTimes(2, { logErrorsBeforeRetry: false });

    // create a temp sqs queue as a target for the shared event bus
    // which we will remove after the tests have ran
    queueUrl = await createQueueForBus(
      sharedBusName,
      queueName,
      region,
      source,
      detailType,
      ruleName
    );
  }, 12000);

  afterAll(async () => {
    // delete the temp queue and the temp target rule
    await deleteTargetRule(region, sharedBusName, ruleName);
    await deleteQueue(queueName);
  }, 12000);

  it('should create the event successfully on the shared bus', async () => {
    // arrange
    const productId = generateRandomId(7);

    const event = {
      quantity: 1123,
      price: 4.99,
      productId,
    };

    // act
    await putEvent(
      region,
      busName,
      'com.shared.bus',
      '3rdPartyOrderRaised',
      event
    );

    // assert - ensure that the event is on our temp sqs queue
    // note: this is a temp queue to ensure that the function published
    // the event to the shared event bus
    const message = (await hasMessage(
      queueUrl,
      2,
      20,
      'detail.productId',
      productId // ensure we only grab our own message from sqs (i.e. our generated productId)
    )) as string;
    expect(JSON.parse(message)).toMatchObject({
      'detail-type': 'OrderCreatedEvent',
      source: 'com.order.internal',
      detail: {
        quantity: 1123,
        productId,
        price: 4.99,
      },
    });
  }, 120000);
});

We can see that:

  1. We create the temporary queue, rule and target for the shared event bus as we did in the previous test.
  2. We use the putEvent function to create a new 3rd party order with an autogenerated productId to be used in our assertions.
  3. We assert that the message is on the temporary queue using the hasMessage function by checking the productId, properties and shape.

What are we covering in Part 2?

In Part 2 we will cover:

✔️ Mocking of third party APIs outside of our bounded context.
✔️ Interacting with long-lived infrastructure — how and when?
✔️ We extend the solution to add Step Functions and S3 buckets.
✔️ We dive into unit testing; including our AWS CDK stack code.
✔️ We test our L3 Custom CDK Constructs.

🔔 Subscribe to my articles above to be alerted of Part 2, and if you enjoyed, please share to LinkedIn or Twitter!

Wrapping up

I hope you enjoyed this article, and if you did then please feel free to share and feedback!

Please go and subscribe on my YouTube channel for similar content!

I would love to connect with you also on any of the following:

https://www.linkedin.com/in/lee-james-gilmore/
https://twitter.com/LeeJamesGilmore

If you enjoyed the posts please follow my profile Lee James Gilmore for further posts/series, and don’t forget to connect and say Hi 👋

Please also use the ‘clap’ feature at the bottom of the post if you enjoyed it! (You can clap more than once!!)

About me

Hi, I’m Lee, an AWS Community Builder, Blogger, AWS certified cloud architect and Global Serverless Architect based in the UK; currently working for City Electrical Factors (UK) & City Electric Supply (US), having worked primarily in full-stack JavaScript on AWS for the past 6 years.

I consider myself a serverless advocate with a love of all things AWS, innovation, software architecture and technology.

*** The information provided represents my own personal views and I accept no responsibility for the use of the information. ***
