
How big is a Microservice in Serverless?

We delve into a common question on the minds of enterprise Serverless teams, addressing it by examining the merits and drawbacks of different sizes and patterns for microservices.

Serverless Advocate
15 min read · Oct 27, 2023


Preface

  • We cover the factors that help determine the size. 📏
  • We cover an opinionated perfect size for services. 🚀
  • We give an example company scenario to work through. 🖼️

Introduction 👋🏽

In the ever-evolving landscape of modern software development, microservices and serverless architectures have become central to building scalable, flexible, and efficient applications.

As organizations embrace these paradigms, a critical question arises:
“How big should a microservice be in a serverless environment?”

In this article, we delve into the heart of this challenge, exploring the delicate balance between granularity and cohesion in microservice design. We’ll navigate the realm of Domain-Driven Design (DDD) to uncover strategies for effectively determining the size and scope of microservices in a serverless world, ultimately paving the way for robust and maintainable architectures.

We finally give a very opinionated view of what the perfect serverless microservice looks like and walk through a fictitious company example.

Functions are not Microservices! ❌

Let’s start with something we need to call out up front: “Functions are not Microservices!”.

When it comes to engineers and architects starting out in the Serverless World, there is quite often a misconception that functions (FaaS) are microservices.

Functions in AWS (AWS Lambda) are small units of compute that scale up and down to run a small, unique piece of code, scaling back down to zero once the code has finished running.

As it happens, Lambda functions typically make up ‘part’ of our microservices and are invoked through event sources (whether that be via an API Gateway endpoint or via an EventBridge bus target rule).
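
To make this concrete, here is a minimal, hypothetical sketch of a Lambda handler that would sit behind a single API Gateway route; the function is one small piece of compute inside a microservice, not the microservice itself:

```typescript
// Hypothetical handler for a single API Gateway route within a microservice.
// The function spins up on demand, runs one small piece of code, and scales
// back down to zero when there is nothing left to process.
import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';

export const handler = async (
  event: APIGatewayProxyEvent,
): Promise<APIGatewayProxyResult> => {
  const customerId = event.pathParameters?.id;

  if (!customerId) {
    return { statusCode: 400, body: JSON.stringify({ message: 'id is required' }) };
  }

  // ...invoke the domain logic that this microservice encapsulates...

  return { statusCode: 200, body: JSON.stringify({ customerId }) };
};
```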

This led to some terrible architectural designs when Lambda functions first came out, where single functions were treated as services in their own right (commonly known today as ‘Nanoservices’). These Nanoservices very quickly became hard to reason about, increased cognitive load on teams, and were massively complex.

💡 Functions are not Microservices — Section Summary

  • The clear distinction here is that functions play a crucial part in our microservices, but they are not microservices in themselves.
  • Nanoservices actually increase overall complexity and cognitive load and are very hard to reason about.

Cohesion vs Coupling 🔗

Cohesion and Coupling are two fundamental concepts in system design and software engineering that have a huge impact on answering our questions regarding the size of microservices.

They refer to the organization and relationships between the components, modules, functions, or services within a system.

“High cohesion (elements closely related) and low coupling (modules loosely connected) are desirable for better system design.”

Cohesion is about how well the elements within a system relate to a common purpose. In other words, it measures how closely the responsibilities and functionalities within a system are related.

High cohesion is generally desirable because it leads to more maintainable, understandable, and reusable modules and systems.

Coupling is about the level of interdependence between modules or components within a system. It measures how tightly or loosely these modules are connected to each other.

High cohesion (elements closely related) and low coupling (modules loosely connected) are desirable for better system design.

Coupling in Serverless 🔗

So taking this at face value, what would this mean for services within our systems?

When we talk about ‘Coupling’ in the Serverless World we are typically talking about the following:

  • Synchronous Communication — this is typically via REST interfaces (APIs) such as Amazon API Gateway and is one of the higher forms of coupling. This is point-to-point direct communication.
  • Asynchronous Communication — this is typically through messaging or events using services such as Amazon SNS, Amazon SQS, and Amazon EventBridge, and creates the least amount of coupling between services. This is commonly known as publish/subscribe and event-driven architecture (EDA).
  • Shared Platforms — We can find some form of coupling from a shared service perspective such as a shared Identity Microservice or an SDK that all other services use. This is typically a deliberate decision to reduce duplication of effort.
  • Databases — We can have coupling at the database level, i.e. a shared database, which is typical of monoliths (or modular monoliths) and is the highest form of coupling.

Before we go any further, let’s look at one common misconception: synchronous communication is not evil! There are times when a process cannot be eventually consistent and therefore has to be synchronous (for example, getting a stock level or taking a payment).

Yes, we should strive for async communication through events as our ‘go-to’, that’s a given, but it is not always possible (especially when needing to integrate with third-party systems through APIs). Don’t beat yourself up, but take it into consideration when looking at microservice size!
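
As a rough sketch of the two styles (the service URL, bus name, and event names below are purely illustrative, and the synchronous call assumes Node 18+ for the global fetch), a synchronous request blocks while we wait on another service’s API, whereas an asynchronous publish simply emits an event for any interested service to consume later:

```typescript
// Hypothetical examples contrasting the two communication styles.
import { EventBridgeClient, PutEventsCommand } from '@aws-sdk/client-eventbridge';

const eventBridge = new EventBridgeClient({});

// Synchronous: we need the answer right now (e.g. a stock level before taking
// a payment), so we call the owning service's REST API directly - this is
// point-to-point coupling between the two services.
export const getStockLevel = async (productId: string): Promise<number> => {
  const response = await fetch(`https://stock.example.internal/stock/${productId}`);
  const { level } = (await response.json()) as { level: number };
  return level;
};

// Asynchronous: nothing needs the result immediately, so we publish an event
// to our own EventBridge bus and let any interested service subscribe to it.
export const publishOrderPlaced = async (orderId: string): Promise<void> => {
  await eventBridge.send(
    new PutEventsCommand({
      Entries: [
        {
          EventBusName: 'orders-domain-bus', // hypothetical custom bus
          Source: 'com.example.orders',
          DetailType: 'OrderPlaced',
          Detail: JSON.stringify({ orderId }),
        },
      ],
    }),
  );
};
```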

Let’s take a look at what these look like below for the different patterns:

Taking this into consideration, how would we ideally have our services communicate with each other?

If we quickly rule out the shared database, and accept that the shared platform is desirable at times (a deliberate decision point often called platform engineering), that leaves us with the ‘async’ and ‘sync’ options (and a little hint — we will always need both approaches!).

Synchronous Communication

Let's first consider the synchronous option through REST interfaces (APIs) and what this means for the size of our microservices specifically:

We can see that with smaller microservices and synchronous communication, we have more differing point-to-point connections and REST interfaces (independent APIs), and therefore more overall system complexity. Excessive communication and coordination between distributed services can slow down the system as a whole (including teams), and make it hard to scale efficiently.

“with smaller microservices and synchronous communication, we have more differing point-to-point connections and REST interfaces (independent APIs), and therefore more overall system complexity.”

If we have larger microservices (domain services as I call them), there is naturally less point-to-point communication (fewer individual APIs to understand, implement, and work with for our teams), and therefore less complexity.

Asynchronous Communication

OK, so does asynchronous communication through events help? Yes, of course, and we should always strive for it, but we will always have a need for synchronous communication too (so take this into consideration when looking at microservice size!).

“we will always have a need for synchronous communication too (so take this into consideration when looking at microservice size!).”

Even when using asynchronous communication with our microservices, we still find that on average there is more complexity with fine-grained microservices. This is typically due to more queues, event buses, resource policies, IAM roles, etc.

💡 Coupling in Serverless — Section Summary

  • We will almost always have the need for some synchronous communications via REST APIs in solutions, and we should therefore plan for this and rightsize our microservices because of it.
  • We now understand that smaller microservices, and therefore more of them in total, typically mean increased overall complexity in our system (more moving parts and higher cognitive load).

Cohesion in Serverless 🔗

Now let’s consider the cohesion aspect of our microservices, and what this means for our overall microservice size. If we go back to the start of the article, we should be aiming for ‘high cohesion’, where “elements of the system are closely related”.

Domain-Driven Design (DDD) is one way of looking at the boundaries of our services, also known as ‘bounded contexts’, which helps us understand and determine our level of cohesion; this is very much based on the business capabilities within those sub-domains (parts of the overall domain).

Note: I won’t go fully into DDD here, but you can view a previous article below which may help with certain definitions.

The high-level illustration below shows how, in an organization, you will have your overall ‘Domain’, which can subsequently be broken down into ‘Sub-domains’ through techniques such as ‘Event Storming’.

It is at the sub-domain level that we look at cohesion and coupling and what they mean for our microservices.

We can see from the diagram above that:

  • Our overall Domain can be split down into smaller Sub-domains.
  • A bounded context is the boundary of the sub-domain model (shown as a dotted line in our diagram). This helps with determining encapsulation and the common language in that part of the system.
  • It’s always desirable to have a 1:1 mapping between a sub-domain and a bounded context.
  • A sub-domain equates to the ‘problem space’ (common language, business rules, processes, etc), and the bounded context, therefore, represents the ‘solution space’ (where we build things). It’s at the solution space level where we start to build one or more microservices for that domain (see the small sketch after this list).
  • One team should build and maintain within the bounded context, to limit hand-offs and dependencies, and to encourage loose coupling and well-defined boundaries between parts of the system (think Team Topologies).
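
To make the ‘common language’ and encapsulation points a little more concrete, here is a tiny, hypothetical TypeScript sketch of the kind of contract a bounded context exposes to the rest of the system; everything else stays private to the owning team:

```typescript
// Hypothetical published event contract for a 'Payments' bounded context.
// Other bounded contexts integrate against contracts like this (and the
// context's REST API), never against its internal domain model.
export interface PaymentTakenEvent {
  source: 'com.example.payments';
  detailType: 'PaymentTaken';
  detail: {
    paymentId: string;
    orderId: string;
    amount: number; // amount in minor units, e.g. pence
    currency: string; // ISO 4217 code, e.g. 'GBP'
  };
}
```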

We can see why determining the sub-domains and bounded contexts through domain-driven design is important if we look at the illustration below:

We can see that when we have low cohesion (services are small and disparate although actually related), we have more moving parts in the system and more overall complexity and communication (almost going back to our nanoservices nonsense).

If we keep elements that are related together in our microservices, we therefore reduce the overall complexity.

“OK, so why not just go all in on monoliths and reduce the additional complexity and cognitive load entirely?”

OK, so why not just go all in on monolithic services and reduce the additional complexity and cognitive load entirely? There is a tipping point: when our services are too large, with too little respect for cohesion (monolithic), we end up with the following issues (but not limited to):

  • Deploying changes to a monolithic application can be risky. A small code change can necessitate redeploying the entire application, leading to downtime and increased risk of introducing new bugs.
  • Large monolithic codebases can lead to challenges in team collaboration. Teams may step on each other’s toes when making changes, and it can be harder to work in parallel.
  • A failure in one part of a monolithic application can potentially bring down the entire system. It lacks the fault isolation benefits of microservices.
  • Monolithic applications can have longer development cycles, making it harder to respond quickly to changing business needs.
  • In a monolithic system, you may be limited to using a single programming language or framework, restricting your ability to choose the best tool for each component.

If we then have these more monolithic services with tight coupling between them, we end up with what is known as a ‘distributed monolith’.

A distributed monolith emerges when a monolithic application is divided into multiple large services that maintain strong dependencies on one another, all without embracing the essential patterns required for distributed systems. In essence, it involves breaking down a monolith into separate services while retaining their tightly coupled nature.

This means that we need to be sensible about the size of our services and how they communicate, and domain-driven design should be our go-to for right-sizing them as discussed above.

💡 Cohesion in Serverless — Section Summary

  • Domain-driven design allows us to create sensible bounded contexts around our microservices (boundaries) — helping us to right-size them.
  • It is favorable to have a 1:1 mapping between a sub-domain and a bounded context.
  • We need to ensure that we don’t drift back to building monoliths when aiming for increased cohesion, or couple our services so tightly that we end up with a distributed monolith across our organization.

Moving Parts 🧩

Another key aspect to take into consideration is that when we have more microservices, we typically have more overall moving parts and services, and therefore more complexity. We have touched upon this multiple times in the sections above, but let’s take a moment to look at it more closely:

These additional complexities are shown in the diagram above and detailed below (but not limited to):

  • More code repositories to work with, as each microservice should ideally have its own repository (as a deployable unit).
  • More CI/CD pipelines, as each microservice should have its own pipeline as a deployable unit.
  • More OpenAPI and Event schemas for consuming service teams to find, understand, and work with.
  • More distributed logs, tracing, instrumentation, duplicated boilerplate code, IAM roles, etc.
  • Higher cognitive load on teams and more context switching…

This shows the direct correlation between a higher volume of smaller microservices and increased complexity and cognitive load.

💡 Moving Parts — Section Summary

  • There is a direct correlation between more fine-grained microservices and more overall complexity. We should aim to manage the complexities through domain services.

What is the perfect (opinionated) size?

Let’s look at an illustration below based on what we have learned so far:

We can see from the diagram above that we want to ensure:

  • We use Domain-Driven Design to ensure that we have the right level of cohesion of functionality in our services (rightsized bounded contexts); i.e. we don’t want small microservices with noisy communication between them. We also don’t want to lump all functionality into one microservice on the other hand either!
  • We aim for low coupling through the use of events; however, we pay close attention to the fact that we will almost always certainly need synchronous communication through APIs too.
  • We understand that with both nanoservices (tiny services) and monolithic services (huge services), we have a higher level of complexity and cognitive load. We need to meet somewhere in the middle using DDD.

This means we end up with ‘Domain Services’ (Macroservices) which sit roughly in the middle, taking into account all of the learnings in this article.

Walking through an example 🚶🏽

OK, now that we have covered the overall theory, let’s walk through a fictitious example company called ‘LJ Hotels’, and how they determine their microservice size along with an approach called Serverless Architecture Layers (SAL Architecture).

Taking our fictitious company, we can see that overall we have our domain model for ‘LJ Hotels’; and that through using DDD and techniques such as event storming, we can break it down into the differing sub-domains (business capabilities) shown below:

As you can see above, we determined in this example that we have sub-domains for the following:

  • Dining Management.
  • Payment Maintenance.
  • Reservation Management.
  • Customer Maintenance.
  • Spa Management.
  • Room Maintenance.

Each of these sub-domains therefore has its own ‘bounded context’, as shown in the illustration for the ‘Customer Maintenance’ sub-domain in orange above, which in turn is a logical boundary for our system design.

We can then look at what this may look like at a very high level with SAL Architecture below:

We can now see how each of these domain services plays its part in the overall architecture of “LJ Hotels” from a SAL Architecture perspective; whilst ensuring we have a high level of cohesion and a low level of coupling.

If we now look at how they communicate (looking at the coupling aspect), we can see this is fairly opinionated around the use of ‘events’ and Amazon EventBridge for async communication (each domain service has its own custom bus), and through the use of Private Amazon API Gateways for synchronous communication (via their own well-defined, versioned REST APIs).

Synchronous communication between services

How would we model this in serverless on AWS? This is shown in the diagram below based on the article links for SAL Architecture:

We can see that our services communicate privately on the AWS backbone network through the use of AWS PrivateLink and Private Amazon API Gateway APIs; and that it is only our experience layer BFFs that are exposed to the public (whether that be users on our website or B2B).
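
As a rough illustration only (stack, VPC, and API names below are hypothetical, and a real SAL implementation would include auth, stages, and observability), here is a minimal AWS CDK sketch in TypeScript of a private REST API that is only reachable through a VPC interface endpoint (PrivateLink):

```typescript
// Minimal CDK sketch (hypothetical names) of a private domain service API,
// reachable only over the AWS backbone via a VPC interface endpoint.
import { App, Stack } from 'aws-cdk-lib';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as iam from 'aws-cdk-lib/aws-iam';

const app = new App();
const stack = new Stack(app, 'CustomerMaintenanceApiStack');

const vpc = new ec2.Vpc(stack, 'ServiceVpc');

// PrivateLink interface endpoint for API Gateway - callers inside the VPC
// (or connected networks) reach the API through this endpoint.
const apiGatewayEndpoint = new ec2.InterfaceVpcEndpoint(stack, 'ApiGatewayEndpoint', {
  vpc,
  service: ec2.InterfaceVpcEndpointAwsService.APIGATEWAY,
});

// The domain service's private REST API 'front door'.
const api = new apigateway.RestApi(stack, 'CustomerMaintenanceApi', {
  endpointConfiguration: {
    types: [apigateway.EndpointType.PRIVATE],
    vpcEndpoints: [apiGatewayEndpoint],
  },
  // Only allow invocations that arrive through our VPC endpoint.
  policy: new iam.PolicyDocument({
    statements: [
      new iam.PolicyStatement({
        effect: iam.Effect.ALLOW,
        principals: [new iam.AnyPrincipal()],
        actions: ['execute-api:Invoke'],
        resources: ['execute-api:/*'],
        conditions: {
          StringEquals: { 'aws:SourceVpce': apiGatewayEndpoint.vpcEndpointId },
        },
      }),
    ],
  }),
});

// A single illustrative resource; real routes would integrate with Lambda.
api.root.addResource('customers').addMethod('GET');
```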

Asynchronous communication between services

We can now see that in our Data Layer, we have Amazon EventBridge playing the part of an ESB (Enterprise Service Bus) between our experience and domain layer services:

The diagram above shows that our services will communicate through events and that each of our services has its own Amazon EventBridge custom bus, allowing events to flow through our central shared bus (ESB) using the ‘Single bus, Multi-account’ pattern:

https://github.com/aws-samples/amazon-eventbridge-resource-policy-samples/blob/main/patterns/single-bus-multi-account-pattern/README.md
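
To give a feel for one slice of this pattern (bus names and the central bus ARN below are assumptions, and the cross-account resource policies from the linked sample are omitted), here is a minimal CDK sketch of a domain service’s custom bus forwarding its public events onto the central shared bus:

```typescript
// Minimal CDK sketch (hypothetical names/ARNs) of a domain service's custom
// EventBridge bus forwarding its public events to a central shared bus (ESB).
import { App, Stack } from 'aws-cdk-lib';
import * as events from 'aws-cdk-lib/aws-events';
import * as targets from 'aws-cdk-lib/aws-events-targets';

const app = new App();
const stack = new Stack(app, 'CustomerMaintenanceEventsStack');

// The custom bus owned by this domain service (bounded context).
const domainBus = new events.EventBus(stack, 'DomainBus', {
  eventBusName: 'customer-maintenance-bus',
});

// The central shared bus lives elsewhere (e.g. a dedicated account) - its ARN
// here is an assumption for this sketch.
const centralBus = events.EventBus.fromEventBusArn(
  stack,
  'CentralBus',
  'arn:aws:events:eu-west-1:111111111111:event-bus/central-esb',
);

// Rule on the domain bus: forward this service's public events to the ESB so
// other domain services can subscribe to them.
new events.Rule(stack, 'ForwardPublicEventsToEsb', {
  eventBus: domainBus,
  eventPattern: { source: ['com.lj-hotels.customer-maintenance'] },
  targets: [new targets.EventBus(centralBus)],
});
```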

Domain Service Makeup

Now that we have discussed the way the services communicate both asynchronously and synchronously, let’s look at the typical makeup of a domain service in this pattern:

We can see from the diagram above that our domain service encapsulates all of the cohesive business capabilities of that sub-domain, and has the following characteristics:

  • It has a well-defined REST API interface using Amazon API Gateway to allow for other experience and domain layer services to interact with it synchronously as a front door (when needed).
  • Everything that goes on behind that REST API is well encapsulated and private i.e. we only expose the functionality that we want to.
  • A dedicated domain team owns the sub-domain and bounded context; and therefore the domain services that make it up.
  • We have a custom Amazon EventBridge Bus specifically for that sub-domain or domain service. This allows the service to consume published events via the shared central bus (see the sketch after this list).
  • The domain service publishes its own public events for other domain services to consume.
  • Within the bounded context of the sub-domain; and therefore the domain service, we have one or more microservices behind the API.
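
To illustrate the consuming side referenced in the list above (names and paths are hypothetical), one of the microservices inside the bounded context can simply be a rule target on the domain service’s own custom bus, reacting to another domain’s events as they are routed in from the central bus:

```typescript
// Minimal CDK sketch (hypothetical names): one microservice inside the
// bounded context reacting to another domain's event arriving on our bus.
import { App, Stack } from 'aws-cdk-lib';
import * as events from 'aws-cdk-lib/aws-events';
import * as targets from 'aws-cdk-lib/aws-events-targets';
import * as lambda from 'aws-cdk-lib/aws-lambda';

const app = new App();
const stack = new Stack(app, 'CustomerMaintenanceConsumersStack');

// The domain service's existing custom bus (looked up by name for this sketch).
const domainBus = events.EventBus.fromEventBusName(
  stack,
  'DomainBus',
  'customer-maintenance-bus',
);

// A microservice (Lambda) within the bounded context that reacts to the event.
const onReservationCreated = new lambda.Function(stack, 'OnReservationCreated', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'index.handler',
  code: lambda.Code.fromAsset('dist/on-reservation-created'), // hypothetical build output
});

// Subscribe to 'ReservationCreated' events from the Reservation Management
// domain as they arrive on our custom bus via the central shared bus.
new events.Rule(stack, 'OnReservationCreatedRule', {
  eventBus: domainBus,
  eventPattern: {
    source: ['com.lj-hotels.reservation-management'],
    detailType: ['ReservationCreated'],
  },
  targets: [new targets.LambdaFunction(onReservationCreated)],
});
```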

This hopefully gives enough information on the opinionated perfect size of a microservice, and how it can be used alongside SAL Architectures as an approach for enterprise serverless.

Wrapping up

I hope you enjoyed this article, and if you did then please feel free to share it and give feedback!

Please go and subscribe on my YouTube channel for similar content!

I would love to connect with you also on any of the following:

https://www.linkedin.com/in/lee-james-gilmore/
https://twitter.com/LeeJamesGilmore

If you enjoyed the posts please follow my profile Lee James Gilmore for further posts/series, and don’t forget to connect and say Hi 👋

Please also use the ‘clap’ feature at the bottom of the post if you enjoyed it! (You can clap more than once!!)

About me

Hi, I’m Lee, an AWS Community Builder, Blogger, AWS certified cloud architect and Global Head of Technology & Architecture based in the UK; currently working for City Electrical Factors (UK) & City Electric Supply (US), having worked primarily in full-stack JavaScript on AWS for the past 6 years.

I consider myself a serverless advocate with a love of all things AWS, innovation, software architecture and technology.

*** The information provided reflects my own personal views and I accept no responsibility for the use of the information. ***

