Automating tasks using Amazon Bedrock Agents and AI

Increasing productivity through Amazon Bedrock Agent automation, with examples written in TypeScript and the AWS CDK

Serverless Advocate
16 min read · Feb 26, 2024

Preface

✔️ We cover what Amazon Bedrock Agents are.
✔️ We talk through the AWS architecture.
✔️ We walk through the TypeScript and AWS CDK code.
✔️ We perform some tests to see it in action.

Introduction 👋🏽

Amazon Bedrock Agents let you build and customise autonomous agents within your company that can perform tasks on your behalf. These agents help end users accomplish tasks by combining organisational data and user input through conversational chat and AI.

They act as orchestrators, managing interactions between foundation models, data sources, knowledge bases, software applications, and user conversations. Additionally, they automate API calls to execute actions and access knowledge bases to enrich information relevant to these actions.

In this article, we will cover a fictitious hotel and spa company called LJ Resorts so we can talk through the AWS architecture and code. Our customers can use our application to book a hotel stay, golf session, and spa treatment in one go, as well as ask for company information like available treatments, deals, and opening times.

The full code repository can be found here:

👇 Before we go any further — please connect with me on LinkedIn for future blog posts and Serverless news https://www.linkedin.com/in/lee-james-gilmore/

What are Amazon Bedrock Agents? 🤖

Let’s now talk through what Amazon Bedrock Agents are, and how they work; starting with understanding a couple of key acronyms.

Acronyms

Before we get started, let’s cover some acronyms and what they mean:

✔️ FMs (Foundation Models).

In Amazon Bedrock, FM stands for Foundation Model. Amazon Bedrock provides access to pre-trained deep-learning models from Amazon and other third-party model providers through a simple API.

✔️ Action Groups.

You define the actions that the agent can perform by providing two things: an OpenAPI schema that describes the API operations the agent can invoke, and a Lambda function that receives the API operation and parameters identified during orchestration and returns the result of the API invocation as output.
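To make the Lambda half of an action group a little more concrete, here is a minimal sketch of the request and response envelope, using the field names that appear in the handler code later in this article. The `ActionGroupEvent` and `ActionGroupResponse` types and the `buildResponse` helper are illustrative assumptions, not types shipped by AWS, and only a subset of the fields is shown.

```typescript
// Illustrative subset of the event an action group Lambda receives.
interface ActionGroupEvent {
  messageVersion: string;
  actionGroup: string;
  apiPath: string;
  httpMethod: string;
  inputText: string;
}

// Illustrative shape of the result handed back to the agent.
interface ActionGroupResponse {
  messageVersion: string;
  response: {
    actionGroup: string;
    apiPath: string;
    httpMethod: string;
    httpStatusCode: number;
    responseBody: { 'application/json': { body: string } };
  };
}

// Hypothetical helper: echoes the routing fields from the event and
// wraps the result as a JSON string under its content type.
function buildResponse(
  event: ActionGroupEvent,
  httpStatusCode: number,
  result: unknown
): ActionGroupResponse {
  return {
    messageVersion: event.messageVersion,
    response: {
      actionGroup: event.actionGroup,
      apiPath: event.apiPath,
      httpMethod: event.httpMethod,
      httpStatusCode,
      responseBody: { 'application/json': { body: JSON.stringify(result) } },
    },
  };
}

const example = buildResponse(
  {
    messageVersion: '1.0',
    actionGroup: 'agent-action-group',
    apiPath: '/rooms',
    httpMethod: 'GET',
    inputText: 'What rooms are available?',
  },
  200,
  [{ roomId: '101' }]
);

console.log(example.response.responseBody['application/json'].body);
// prints [{"roomId":"101"}]
```

The agent reads the `body` string and uses it to compose its reply to the customer.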

✔️ Instructions.

You write instructions that describe what the agent is designed to do. With advanced prompts, you can further customise instructions for the agent at every step of orchestration and include Lambda functions to parse each step’s output. In our example, it is:

🤖 “Please help our customers to book hotel rooms, spa sessions and golf bookings; whilst providing them with any special offers depending on the day and booking type, make them aware of any opening times or prices before they complete the booking, and also take into consideration our hotel policies.”

How do Amazon Bedrock Agents work? 🤖

We can see from the diagram below that we start with the customer’s input, which is augmented using the prompt store and the previous conversation history fetched from the session store for context, before the foundation model is invoked.

Next, the orchestration prompt parses the response and invokes an action group Lambda based on the operations that are defined in our OpenAPI specification document, as well as retrieving any additional information required from the Bedrock Knowledge Base. Once the orchestration is complete, the results are returned to the customer.

https://docs.aws.amazon.com/bedrock/latest/userguide/agents-how.html

Now in the next section let’s look at what we are building in this article so we can look through the code, deploy and test the application.

What are we building? 🛠️

OK, so now that we have taken a closer look at Agents for Amazon Bedrock and understand how they work in theory, let’s look at what we are building in this article:

Note: In a previous article we already covered how to set up Amazon Bedrock Knowledge Bases, so we won’t cover them again in this article.

An example of what we are building.

We can see from the diagram above that:

  1. The customer hits the Lambda function URL with their prompt.
  2. It invokes a query Lambda function which in turn calls the Amazon Bedrock Agent with the request.
  3. The Agent uses both the deployed Knowledge Base with company-specific data and a Lambda function based on an OpenAPI spec in an S3 bucket to route to other internal systems to complete the tasks for the customer (it will orchestrate multiple calls on our behalf).

Now that we have discussed the high-level architecture, let’s look at an example interaction in the next section.

What could an interaction look like? ⛳

Below is an example interaction between a customer and the agent. It was run through our example solution with real prompts and outputs:

Customer: “Hi, what special deals do you have on at the moment for hotels?”

Agent: “The following hotel special deals are currently available:
- Monday Staycation Special: 20% off room rates (Mondays only)
- Last Minute Getaway: 15% off same-day bookings (Tuesdays only)
- Extended Stay Discount: 20% off 3-night stays (Wednesdays only)
- Suite Upgrade: Complimentary upgrade to executive suite (Thursdays only)
- Weekend Getaway Package: 10% off 2-night stays (Fridays only)”

📘 Note: This interaction above read our Knowledge Base for additional context.

Customer: “I would like to book a hotel room on the 2024-02-25, what is available?”

Agent: “Here are the available rooms on 2024-02-25:
- Room 101 (Standard, $100): Cozy standard room with a queen-sized bed.
- Room 102 (Standard, $100): Spacious standard room with two double beds.
- Room 103 (Deluxe, $150): Luxurious deluxe room with a king-sized bed and a view.
- Room 104 (Suite, $200): Elegant suite with a separate living area and bedroom.
- Room 105 (Standard, $90): …..{rest removed for brevity}”

➡️ Note: This interaction above called our external services through our Lambda function.

Customer: “Can I book a deluxe please on 2024-02-25 for $160”

Agent: “I have booked room 109 for you on 2024-02-25. This is a deluxe room with a luxurious bathroom for $160. Please let me know if you need anything else!”

➡️ Note: This interaction above called our external services through our Lambda function.

Customer: “What spa sessions are available?”

Agent: “Here are the available spa treatments for today:
- Swedish Massage: Relaxing full-body massage to relieve tension and promote relaxation. Cost: $80
- Facial: Deep cleansing facial to rejuvenate and refresh your skin. Cost: $70
- Hot Stone Massage: Therapeutic massage using heated stones to soothe muscles and improve circulation. Cost: $90
- Aromatherapy: Gentle massage using aromatic essential oils to enhance relaxation and well-being. Cost: $85
- …..{rest removed for brevity}”

➡️ Note: This interaction above called our external services through our Lambda function.

Customer: “Can I book the Hot Stone Massage please?”

Agent: “Yes, I have booked the Hot Stone Massage for you today. This therapeutic massage uses heated stones to soothe muscles and improve circulation. The cost is $90. Please let me know if you need anything else!”

➡️ Note: This interaction above called our external services through our Lambda function.

Customer: “What are the opening hours for the golf?”

Agent: “The opening hours for golf at LJ Resorts are: Monday to Friday: 9:00 AM to 6:00 PM; Saturday & Sunday: 7:00 AM to 7:00 PM”

📘 Note: This interaction above read our Knowledge Base for additional context.

Customer: “Do you have any golf sessions on that day too in the morning?”

Agent: “Checking the available golf sessions for today, I see there is one morning session available: Morning Tee Time: Enjoy a round of golf in the morning with stunning views. Cost: $80 Please let me know if you would like me to book this morning golf session for you.”

➡️ Note: This interaction above called our external services through our Lambda function.

Customer: “Can I book the golf session for $80 on that day please?”

Agent: “Yes, I have booked the morning golf session for $80 today. This session will allow you to enjoy a round of golf in the morning with stunning views. Please let me know if you need anything else!”

➡️ Note: This interaction above called our external services through our Lambda function.

Customer: “What was the total cost of my bookings?”

Agent: “Based on my conversation history, you booked a Hot Stone Massage for $90 and a morning golf session for $80. The total cost of your bookings is $90 + $80 = $170.”

This can be summarised as shown below in this sequence diagram:

This shows how powerful Bedrock Agents are: the agent has essentially done the work of a person who would otherwise need to access many different systems and deal with the customer personally, with AI automating the full process end to end.
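One caveat on the final answer in the interaction above: the agent totalled only the spa and golf bookings ($170) and left out the $160 deluxe room booked earlier in the same conversation. If an exact total matters, it may be safer to calculate it in code than to rely on the model's summary. The sketch below is purely illustrative; the `Booking` type and `totalCost` helper are hypothetical names, and the prices are the ones quoted in the conversation.

```typescript
// Prices are taken from the example conversation above.
interface Booking {
  description: string;
  cost: number;
}

const bookings: Booking[] = [
  { description: 'Deluxe room 109', cost: 160 },
  { description: 'Hot Stone Massage', cost: 90 },
  { description: 'Morning golf session', cost: 80 },
];

// Sums the cost of every booking made in the session.
function totalCost(items: Booking[]): number {
  return items.reduce((sum, item) => sum + item.cost, 0);
}

console.log(totalCost(bookings)); // prints 330
```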

Now that we have looked at an interaction that could happen with our customer and agent, let’s look at the key code.

Talking through key code 👨‍💻

OK, so we have seen this basic example in action, now let’s look through the TypeScript and CDK code. Remember that the full solution can be found here.

Stateful Stack

Let us start with our Stateful stack where we first create the agent Lambda function:

// create the lambda for the agent - this is the lambda that determines
// what the prompt looks like with regards to mapping to the schema
const actionGroupAgentLambda: nodeLambda.NodejsFunction =
  new nodeLambda.NodejsFunction(this, 'AgentLambda', {
    functionName: 'action-group-executor',
    runtime: lambda.Runtime.NODEJS_20_X,
    entry: path.join(
      __dirname,
      './src/adapters/primary/action-group-executor/action-group-executor.adapter.ts'
    ),
    memorySize: 1024,
    handler: 'handler',
    timeout: cdk.Duration.minutes(5),
    description: 'action group lambda function',
    architecture: lambda.Architecture.ARM_64,
    tracing: lambda.Tracing.ACTIVE,
    bundling: {
      minify: true,
    },
    environment: {
      ...lambdaConfig,
    },
  });

We then create our Amazon Bedrock Agent as shown below:

// create the bedrock agent
const agent = new bedrock.Agent(this, 'BedrockAgent', {
  name: 'Agent',
  description: 'The agent for hotels, Spa and golf bookings.',
  foundationModel: bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_V2,
  instruction:
    'Please help our customers to book hotel rooms, spa sessions and golf bookings; whilst providing them with any special offers depending on the day and booking type, make them aware of any opening times or prices before they complete the booking, and also take into consideration our hotel policies.',
  idleSessionTTL: cdk.Duration.minutes(10),
  knowledgeBases: [kb],
  shouldPrepareAgent: true,
  aliasName: 'Agent',
});

We can see from the code above that we give the Agent key properties like how long a session should last, a link to our Amazon Bedrock Knowledge Base, the FM type (in our case Claude V2), and the instruction of what this agent should do.

We next create our Action Group as shown below:

// add the action group for making bookings
new bedrock.AgentActionGroup(this, 'AgentActionGroup', {
  actionGroupName: 'agent-action-group',
  description: 'The action group for making a booking',
  agent: agent,
  apiSchema: bedrock.S3ApiSchema.fromAsset(
    path.join(__dirname, './schema/api-schema.json')
  ),
  actionGroupState: 'ENABLED',
  actionGroupExecutor: actionGroupAgentLambda,
  shouldPrepareAgent: true,
});

We can see that we give it an OpenAPI schema detailing what the agent can do, as well as the Lambda function which is invoked as a proxy to the actions. Let’s now look at what the OpenAPI spec looks like:

{
"openapi": "3.0.0",
"info": {
"title": "Hotel, Spa and Golf booking API for LJ Resorts",
"version": "1.0.0",
"description": "APIs for managing hotel, spa and golf bookings for our customers."
},
"paths": {
"/rooms": {
"get": {
"summary": "Get a list of all rooms which are available",
"description": "Get the list of all available rooms for a given date",
"operationId": "getAllAvailableRooms",
"responses": {
"200": {
"description": "Gets the list of all available rooms for a given date",
"content": {
"application/json": {
"schema": {
"type": "array",
"items": {
"type": "object",
"properties": {
"roomId": {
"type": "string",
"description": "Unique ID of the room."
},
"roomType": {
"type": "string",
"description": "The room type."
},
"roomDescription": {
"type": "string",
"description": "The room description."
},
"date": {
"type": "string",
"description": "The date the room is free."
},
"cost": {
"type": "string",
"description": "The cost of the room per night."
}
}
}
}
}
}
}
}
},
"post": {
"summary": "Book an available room for a specific date",
"description": "Books a room for a specific date",
"operationId": "bookRoom",
"requestBody": {
"required": true,
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {
"roomId": {
"type": "string",
"description": "ID of the room to book"
},
"date": {
"type": "string",
"description": "Date to book the room"
}
},
"required": ["roomId", "date"]
}
}
}
},
"responses": {
"200": {
"description": "Room booked successfully"
}
}
}
},
"/spa-sessions": {
"get": {
"summary": "Get a list of all spa treatments which are available",
"description": "Get the list of all available spa treatments for a given date",
"operationId": "getAllAvailableSpaTreatments",
"responses": {
"200": {
"description": "Gets a list of all available spa treatments for a given date",
"content": {
"application/json": {
"schema": {
"type": "array",
"items": {
"type": "object",
"properties": {
"treatmentId": {
"type": "string",
"description": "Unique ID of the treatment."
},
"treatmentType": {
"type": "string",
"description": "The treatment type."
},
"treatmentDescription": {
"type": "string",
"description": "The treatment description."
},
"date": {
"type": "string",
"description": "The date the treatment session is free."
},
"cost": {
"type": "string",
"description": "The cost of the treatment."
}
}
}
}
}
}
}
}
},
"post": {
"summary": "Book an available spa treatment for a specific date",
"description": "Book an available spa treatment for a specific date",
"operationId": "bookSpaTreatment",
"requestBody": {
"required": true,
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {
"treatmentId": {
"type": "string",
"description": "ID of the spa treatment to book"
},
"date": {
"type": "string",
"description": "Date to book the spa treatment"
}
},
"required": ["treatmentId", "date"]
}
}
}
},
"responses": {
"200": {
"description": "Spa treatment booked successfully"
}
}
}
},
"/golf-sessions": {
"get": {
"summary": "Get a list of all golf sessions which are available",
"description": "Get a list of all golf sessions which are available",
"operationId": "getAllAvailableGolfSessions",
"responses": {
"200": {
"description": "Gets a list of all available golf sessions for a given date",
"content": {
"application/json": {
"schema": {
"type": "array",
"items": {
"type": "object",
"properties": {
"sessionId": {
"type": "string",
"description": "Unique ID of the golf session."
},
"sessionType": {
"type": "string",
"description": "The golf session type."
},
"sessionDescription": {
"type": "string",
"description": "The session description."
},
"date": {
"type": "string",
"description": "The date the golf session is free."
},
"cost": {
"type": "string",
"description": "The cost of the golf session."
}
}
}
}
}
}
}
}
},
"post": {
"summary": "Book an available golf session",
"description": "Books an available golf session for a specific date",
"operationId": "bookGolfSession",
"requestBody": {
"required": true,
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {
"sessionId": {
"type": "string",
"description": "ID of the golf session to book"
},
"date": {
"type": "string",
"description": "Date to book the golf session"
}
},
"required": ["sessionId", "date"]
}
}
}
},
"responses": {
"200": {
"description": "Golf session booked successfully"
}
}
}
}
}
}

The important parts to notice are the descriptions, paths, methods and operation IDs that our model will use to determine what tasks should be done. For example, for listing all hotel rooms it will use:

Description: "Get the list of all available rooms for a given date".
OperationId: "getAllAvailableRooms".
Path: "/rooms".
Method: "GET".
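To see at a glance which operations a schema exposes to the agent, a small helper can walk the `paths` object and print each method, path and operation ID. This is only a sketch: `listOperations` and the local types are hypothetical, and `schema` is a trimmed copy of the spec above (descriptions and operation IDs copied from it, everything else omitted).

```typescript
type HttpMethod = 'get' | 'post';

interface Operation {
  operationId: string;
  description: string;
}

interface OpenApiDoc {
  paths: Record<string, Partial<Record<HttpMethod, Operation>>>;
}

// A trimmed copy of the schema above, keeping only the fields we need here.
const schema: OpenApiDoc = {
  paths: {
    '/rooms': {
      get: {
        operationId: 'getAllAvailableRooms',
        description: 'Get the list of all available rooms for a given date',
      },
      post: {
        operationId: 'bookRoom',
        description: 'Books a room for a specific date',
      },
    },
    '/golf-sessions': {
      get: {
        operationId: 'getAllAvailableGolfSessions',
        description: 'Get a list of all golf sessions which are available',
      },
      post: {
        operationId: 'bookGolfSession',
        description: 'Books an available golf session for a specific date',
      },
    },
  },
};

// Returns one "METHOD path -> operationId" line per operation in the document.
function listOperations(doc: OpenApiDoc): string[] {
  const lines: string[] = [];
  for (const [apiPath, methods] of Object.entries(doc.paths)) {
    for (const [method, operation] of Object.entries(methods)) {
      if (operation) {
        lines.push(`${method.toUpperCase()} ${apiPath} -> ${operation.operationId}`);
      }
    }
  }
  return lines;
}

console.log(listOperations(schema).join('\n'));
```

A listing like this can be a quick sanity check that every operation has a distinct, descriptive ID before the schema is deployed.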

When our Lambda function is invoked it will utilise the relevant details to determine what other systems should be called:

import {
  MetricUnits,
  Metrics,
  logMetrics,
} from '@aws-lambda-powertools/metrics';
import { Tracer, captureLambdaHandler } from '@aws-lambda-powertools/tracer';
import { golfSessions, rooms, spaTreatments } from 'stateful/src/data';

import { injectLambdaContext } from '@aws-lambda-powertools/logger';
import middy from '@middy/core';
import { logger } from '@shared/index';

// Note: the Event and Response types (the action group request and
// response shapes) are not shown in this snippet.

const tracer = new Tracer();
const metrics = new Metrics();

export const adapter = async ({
  inputText,
  apiPath,
  httpMethod,
  actionGroup,
  messageVersion,
  requestBody,
  sessionAttributes,
  promptSessionAttributes,
}: Event): Promise<Response> => {
  let body: unknown;
  let httpStatusCode = 200;

  try {
    logger.info(
      `inputText: ${inputText}, apiPath: ${apiPath}, httpMethod: ${httpMethod}`
    );

    // Note: These would be calls to our Lambda FURLS or other, DBs or APIs/services in reality,
    // but we are just using fake stubbed data for the article to show how it works.
    // The POST branches always return the same stubbed record and ignore requestBody.
    switch (apiPath) {
      case '/rooms':
        if (httpMethod === 'GET') {
          body = rooms;
        } else if (httpMethod === 'POST') {
          body = rooms.find((room) => room.roomId === '109');
        }
        break;

      case '/spa-sessions':
        if (httpMethod === 'GET') {
          body = spaTreatments;
        } else if (httpMethod === 'POST') {
          body = spaTreatments.find(
            (treatment) => treatment.treatmentId === '3'
          );
        }
        break;

      case '/golf-sessions':
        if (httpMethod === 'GET') {
          body = golfSessions;
        } else if (httpMethod === 'POST') {
          body = golfSessions.find((session) => session.sessionId === '1');
        }
        break;

      default:
        httpStatusCode = 500;
        body =
          'Sorry, I am unable to help you with that. Please try asking the question in a different way perhaps.';
        break;
    }

    metrics.addMetric('SuccessfulActionGroupQuery', MetricUnits.Count, 1);

    return {
      messageVersion,
      response: {
        apiPath,
        actionGroup,
        httpMethod,
        httpStatusCode,
        responseBody: {
          // the key is the content type of the response body
          'application/json': {
            body: JSON.stringify(body),
          },
        },
      },
      // session attributes sit alongside `response` rather than inside it
      sessionAttributes,
      promptSessionAttributes,
    };
  } catch (error) {
    let errorMessage = 'Unknown error';
    if (error instanceof Error) errorMessage = error.message;
    logger.error(errorMessage);

    metrics.addMetric('ActionGroupQueryError', MetricUnits.Count, 1);

    throw error;
  }
};

export const handler = middy(adapter)
  .use(injectLambdaContext(logger))
  .use(captureLambdaHandler(tracer))
  .use(logMetrics(metrics));

You can see in our example above that we have simply hardcoded the returned data rather than calling out to other systems which we would typically do here.

An example of our hard coded data for this article in the /data/data.ts file
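The stub above always returns the same hard-coded records for bookings. In a real handler you would more likely read the values the agent extracted from the conversation. As a rough sketch, assuming the event's `requestBody` carries a `properties` array of name/type/value entries under the `application/json` content type (worth checking against the Bedrock documentation for your version), a helper could look like the following. `bodyToRecord`, `bookRoom` and the sample room data are hypothetical and exist only for illustration.

```typescript
// Assumed shape of the request body passed to an action group Lambda.
interface BodyProperty {
  name: string;
  type: string;
  value: string;
}

interface RequestBody {
  content: Record<string, { properties: BodyProperty[] }>;
}

// Flattens the name/value pairs into a plain object for easier access.
function bodyToRecord(requestBody?: RequestBody): Record<string, string> {
  const properties = requestBody?.content['application/json']?.properties ?? [];
  const result: Record<string, string> = {};
  for (const property of properties) {
    result[property.name] = property.value;
  }
  return result;
}

// Stubbed data for illustration only.
const sampleRooms = [
  { roomId: '103', roomType: 'Deluxe' },
  { roomId: '109', roomType: 'Deluxe' },
];

// Looks up the room the agent asked for instead of a hard-coded ID.
function bookRoom(requestBody?: RequestBody) {
  const { roomId } = bodyToRecord(requestBody);
  return sampleRooms.find((room) => room.roomId === roomId);
}

const booked = bookRoom({
  content: {
    'application/json': {
      properties: [
        { name: 'roomId', type: 'string', value: '103' },
        { name: 'date', type: 'string', value: '2024-02-25' },
      ],
    },
  },
});

console.log(booked); // the room with roomId '103'
```

When no matching room exists the helper returns `undefined`, so a production handler would also want to map that case to an error status for the agent.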

Now let’s look at the Stateless stack for querying our agent.

Stateless Stack

We start by creating a query Lambda function, which also has a Lambda function URL set up with streaming:

// create the lambda for querying the agent
const queryModelLambda: nodeLambda.NodejsFunction =
  new nodeLambda.NodejsFunction(this, 'QueryModelLambda', {
    functionName: 'query-model-lambda',
    runtime: lambda.Runtime.NODEJS_20_X,
    entry: path.join(
      __dirname,
      './src/adapters/primary/query-model/query-model.adapter.ts'
    ),
    memorySize: 1024,
    handler: 'handler',
    timeout: cdk.Duration.minutes(3),
    description: 'query model lambda function',
    architecture: lambda.Architecture.ARM_64,
    tracing: lambda.Tracing.ACTIVE,
    bundling: {
      minify: true,
    },
    environment: {
      AGENT_ID: agentId,
      AGENT_ALIAS_ID: agentAliasId,
      ...lambdaConfig,
    },
  });

// we add the function url for our query lambda with streamed responses
const queryModelLambdaUrl = queryModelLambda.addFunctionUrl({
  authType: lambda.FunctionUrlAuthType.NONE,
  invokeMode: lambda.InvokeMode.RESPONSE_STREAM,
  cors: {
    allowedOrigins: ['*'],
  },
});

We then give it permission to invoke the agent, as shown below:

// we allow the query lambda function to query our models/KBs/agents
queryModelLambda.addToRolePolicy(
  new iam.PolicyStatement({
    actions: [
      'bedrock:RetrieveAndGenerate',
      'bedrock:Retrieve',
      'bedrock:InvokeModel',
      'bedrock:InvokeAgent',
    ],
    resources: ['*'],
  })
);
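The policy above uses `resources: ['*']`, which keeps the example short. Outside of a demo you may prefer to scope `bedrock:InvokeAgent` to the specific agent alias. The sketch below builds such an ARN, assuming the `agent-alias/<agentId>/<agentAliasId>` resource format; please verify this against the IAM documentation for Amazon Bedrock before relying on it. The `agentAliasArn` helper and all of the IDs shown are made-up placeholders.

```typescript
interface AgentAliasRef {
  partition?: string;
  region: string;
  accountId: string;
  agentId: string;
  agentAliasId: string;
}

// Builds the ARN of a single agent alias (format assumed, see the note above).
function agentAliasArn({
  partition = 'aws',
  region,
  accountId,
  agentId,
  agentAliasId,
}: AgentAliasRef): string {
  return `arn:${partition}:bedrock:${region}:${accountId}:agent-alias/${agentId}/${agentAliasId}`;
}

// Placeholder values for illustration only.
const aliasArn = agentAliasArn({
  region: 'us-east-1',
  accountId: '123456789012',
  agentId: 'AGENT123',
  agentAliasId: 'ALIAS123',
});

console.log(aliasArn);
// prints arn:aws:bedrock:us-east-1:123456789012:agent-alias/AGENT123/ALIAS123
```

The resulting string could then be used in place of `'*'` in a policy statement that only grants `bedrock:InvokeAgent`.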

Now let’s take a look at our Query Lambda function which takes the prompt from the user via the function URL and invokes the agent:

import { MetricUnits, Metrics } from '@aws-lambda-powertools/metrics';
import {
  BedrockAgentRuntimeClient,
  InvokeAgentCommand,
  InvokeAgentRequest,
  InvokeAgentResponse,
} from '@aws-sdk/client-bedrock-agent-runtime';
import { ResponseStream, streamifyResponse } from 'lambda-stream';

import { config } from '@config';
import { ValidationError } from '@errors/validation-error';
import { logger } from '@shared/index';
import { APIGatewayProxyEventV2 } from 'aws-lambda';

const metrics = new Metrics();
const client = new BedrockAgentRuntimeClient();

const agentId = config.get('agentId');
const agentAliasId = config.get('agentAliasId');

// the agent returns each chunk as UTF-8 encoded bytes
function decodeChunk(message: Uint8Array): string {
  return Buffer.from(message).toString('utf-8');
}

export const queryModelAdapter = async (
  { body }: APIGatewayProxyEventV2,
  responseStream: ResponseStream
): Promise<void> => {
  try {
    responseStream.setContentType('application/json');

    if (!body) throw new ValidationError('no payload body');
    const request = JSON.parse(body);

    const { sessionAttributes, promptSessionAttributes, sessionId, prompt } =
      request;

    const input: InvokeAgentRequest = {
      sessionState: {
        sessionAttributes,
        promptSessionAttributes,
      },
      agentId,
      agentAliasId,
      sessionId,
      inputText: prompt,
    };

    const command: InvokeAgentCommand = new InvokeAgentCommand(input);
    const response: InvokeAgentResponse = await client.send(command);

    const chunks: string[] = [];
    const completion = response.completion || [];

    // collect every chunk of the completion before responding
    for await (const chunk of completion) {
      if (chunk.chunk && chunk.chunk.bytes) {
        chunks.push(decodeChunk(chunk.chunk.bytes));
      }
    }

    const returnMessage = {
      sessionId: response.sessionId,
      contentType: response.contentType,
      message: chunks.join(' '),
    };

    metrics.addMetric('SuccessfulQueryModel', MetricUnits.Count, 1);

    // Note: In the example we are not streaming, we are using the FURL request timeout feature
    // but we could easily write the stream during the for loop if we wanted to.
    // The stream expects a string or Buffer, so we serialise the object first.
    responseStream.write(JSON.stringify(returnMessage));
  } catch (error) {
    let errorMessage = 'Unknown error';
    if (error instanceof Error) errorMessage = error.message;
    logger.error(errorMessage);

    metrics.addMetric('QueryModelError', MetricUnits.Count, 1);

    throw error;
  } finally {
    // flush the metrics and close the stream exactly once, on success or failure
    metrics.publishStoredMetrics();
    responseStream.end();
  }
};

export const handler = streamifyResponse(queryModelAdapter);

We can see from the code above that the responses from the Agent come back in ‘chunks’ on a stream; however, in this example we don’t update the user in real time. Instead, we wait until the action is completed and respond with a single JSON object. Now let’s test this out in the next section!
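One detail worth keeping in mind when handling byte chunks: a multi-byte UTF-8 character can, in principle, be split across two chunks, in which case decoding each chunk on its own would corrupt it. The sketch below shows one way to guard against that with the standard `TextDecoder` in streaming mode. It is a standalone illustration with made-up input, not the code used in the repository, and `decodeChunks` is a hypothetical helper.

```typescript
// Decodes a sequence of byte chunks into text, tolerating multi-byte
// characters that are split across chunk boundaries.
function decodeChunks(byteChunks: Uint8Array[]): string {
  const decoder = new TextDecoder('utf-8');
  let text = '';
  for (const bytes of byteChunks) {
    // stream: true keeps any incomplete trailing bytes for the next call
    text += decoder.decode(bytes, { stream: true });
  }
  // a final call without input flushes anything still buffered
  return text + decoder.decode();
}

// Simulate a response whose emoji is split between two chunks.
const encodedMessage = new TextEncoder().encode('Booked ⛳ golf');
const splitChunks = [encodedMessage.slice(0, 8), encodedMessage.slice(8)];

console.log(decodeChunks(splitChunks)); // prints Booked ⛳ golf
```

The same decoder could be used inside the `for await` loop if you wanted to write each piece of text to the response stream as it arrives.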

Testing the App 🧪

⚠️ Note: “Before deploying the example application, please note that OpenSearch Serverless on its own costs $700 a month, without the additional costs of Bedrock, CloudWatch, Lambda, API Gateway etc.”

Testing through Postman

You can use the Postman file in postman/Bedrock Agents.postman_collection.json to test with your own Lambda function URL information.

We can test this using an example JSON payload of:

{
"agentId": "agentId",
"agentAliasId": "agentAliasId",
"sessionId": "1f6aa00e-e585-49aa-aa2d-16adb64857c6",
"prompt": "Can I please book a morning golf session on 2024-02-25"
}

And we can see that our agent responds with the following:

An example call to our agent and the relevant response.
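The query Lambda parses the request body without checking its contents, so a missing `prompt` or `sessionId` would only surface as an error from the agent call. As a hedged sketch of what a small validation step could look like (the `parseQueryRequest` helper and `QueryRequest` type are hypothetical, and plain `Error` is used here in place of the project's `ValidationError`):

```typescript
interface QueryRequest {
  sessionId: string;
  prompt: string;
}

// Parses the raw body and checks the two fields the agent call needs.
function parseQueryRequest(body: string | undefined): QueryRequest {
  if (!body) throw new Error('no payload body');

  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    throw new Error('payload is not valid JSON');
  }

  const { sessionId, prompt } = (parsed ?? {}) as Record<string, unknown>;

  if (typeof sessionId !== 'string' || sessionId.length === 0) {
    throw new Error('sessionId is required');
  }
  if (typeof prompt !== 'string' || prompt.trim().length === 0) {
    throw new Error('prompt is required');
  }

  return { sessionId, prompt };
}

const validRequest = parseQueryRequest(
  JSON.stringify({
    sessionId: '1f6aa00e-e585-49aa-aa2d-16adb64857c6',
    prompt: 'Can I please book a morning golf session on 2024-02-25',
  })
);

console.log(validRequest.sessionId);
```

Reusing the same `sessionId` across requests is what lets the agent keep the conversation history between prompts.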

Under the hood, our agent has orchestrated multiple actions, the first being checking what golf sessions are available on that day:

We can see the agent first made a call to check available sessions

We can see that the agent decided it should do a ‘GET’ on ‘/golf-sessions’ first, which returns all available golf sessions on that day.

It then made a second call, a ‘POST’ on ‘/golf-sessions’, to make the booking:

A subsequent call to create the booking for the golf session

We can see the power of conversational AI with autonomous agents here as it has orchestrated multiple actions to support the customer.

We can see that our Agent has booked our Spa, Hotel and Golf booking

We can now test this with various other scenarios, like checking deals, booking spa sessions etc — let me know how you find the solution in the comments!

Wrapping up 👋🏽

I hope you enjoyed this article, and if you did then please feel free to share it and leave feedback!

Please go and subscribe to my YouTube channel for similar content!

I would love to connect with you also on any of the following:

https://www.linkedin.com/in/lee-james-gilmore/
https://twitter.com/LeeJamesGilmore

If you enjoyed the posts please follow my profile Lee James Gilmore for further posts/series, and don’t forget to connect and say Hi 👋

Please also use the ‘clap’ feature at the bottom of the post if you enjoyed it! (You can clap more than once!!)

About me

Hi, I’m Lee, an AWS Community Builder, Blogger, AWS certified cloud architect, and Global Head of Technology & Architecture based in the UK; currently working for City Electrical Factors (UK) & City Electric Supply (US), having worked primarily in full-stack JavaScript on AWS for the past 6 years.

I consider myself a serverless advocate with a love of all things AWS, innovation, software architecture, and technology.

*** The information provided represents my own personal views and I accept no responsibility for the use of the information. ***
