Amazon Bedrock Knowledge Bases with Private Data

Using TypeScript and the AWS CDK, you can integrate Knowledge Bases into Amazon Bedrock to provide foundation models with contextual data from your private data sources.

Serverless Advocate
11 min read · Feb 19, 2024


Preface

✔️ We cover what Amazon Bedrock Knowledge Bases are.
✔️ We talk through the AWS architecture.
✔️ We walk through the TypeScript and AWS CDK code.
✔️ We perform some tests to see it in action.

Introduction 👋🏽

In this article we are going to talk through Amazon Bedrock Knowledge Bases, and how we can equip the AI models with up-to-date private company information, allowing our users to utilise AI with our own custom data. We will talk through our full code example and the associated AWS architecture.

Our fictitious company, LJ Medical Center

In our example, we will talk through a use case for a fictitious company called ‘LJ Medical Center’ where we allow our reception staff to query our AI model for company information.

The system allows the receptionist to search using AI based on private data

The reception staff can query the private data using natural language, for example, asking what the policy is for late payments for healthcare.

The full code example, written in TypeScript and the AWS CDK, can be found below:

👇 Before we go any further — please connect with me on LinkedIn for future blog posts and Serverless news https://www.linkedin.com/in/lee-james-gilmore/

What is Amazon Bedrock? 🤖

Let’s now talk through what Amazon Bedrock is and how it works, starting with a couple of key acronyms.

Acronyms

Before we get started, let’s cover some acronyms and what they mean:

✔️ FMs — Foundation Models.

In Amazon Bedrock, FM stands for Foundation Model. Amazon Bedrock provides access to pre-trained deep-learning models from Amazon and third-party model providers through a simple API; these pre-trained models are referred to as Foundation Models, or FMs.

✔️ RAG — Retrieval Augmented Generation.

Retrieval Augmented Generation (RAG) is a technique used to enhance natural language generation by incorporating retrieved information into the generation process. In simpler terms, it combines two key elements: retrieval of relevant information from a knowledge base or data source, and generation of a response based on this retrieved information.
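
To make this concrete, below is a minimal, hypothetical TypeScript sketch of the RAG flow. The retrieve and generate functions are stand-ins for the knowledge base search and the foundation model call (Amazon Bedrock handles both for us, as we will see later in the article):

async function answerWithRag(
  question: string,
  retrieve: (query: string) => Promise<string[]>,
  generate: (prompt: string) => Promise<string>
): Promise<string> {
  // 1. retrieval: semantic search over the knowledge base for relevant chunks
  const chunks = await retrieve(question);

  // 2. augmentation: enrich the prompt with the retrieved context
  const prompt = `Answer the question using only the following context:\n\n${chunks.join(
    '\n---\n'
  )}\n\nQuestion: ${question}`;

  // 3. generation: the foundation model responds based on the augmented prompt
  return generate(prompt);
}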

What are Bedrock Knowledge Bases? 🤖

To equip FMs with up-to-date and custom information, organisations and businesses use Retrieval Augmented Generation (RAG), a technique that fetches data from company data sources and enriches the prompt to provide more relevant and accurate responses.

“Knowledge Bases for Amazon Bedrock help you take advantage of Retrieval Augmented Generation (RAG), a popular technique that involves drawing information from a data store to augment the responses generated by Large Language Models (LLMs).”

A knowledge base can be used not only to answer user queries, but also to augment prompts provided to foundation models by providing context to the prompt.

Where do we store the custom data? 🤖

Knowledge Bases for Amazon Bedrock help you implement the entire RAG workflow from ingestion of data in Amazon S3 to retrieval and prompt augmentation without having to build custom integrations to data sources and manage data flows. Session context management is built in, so your app can readily support multi-turn conversations.

Once you have pointed to your custom data in Amazon S3, Knowledge Bases for Amazon Bedrock automatically fetches the data, divides it into blocks of text, converts the text into embeddings, and stores the embeddings in your vector database. In this article, we will store the embeddings in an Amazon OpenSearch Serverless vector store.

Note: “Knowledge Bases for Amazon Bedrock support popular databases for vector storage, including vector engine for Amazon OpenSearch Serverless, Pinecone, Redis Enterprise Cloud, Amazon Aurora (coming soon), and MongoDB (coming soon). If you do not have an existing vector database, Amazon Bedrock creates an Amazon OpenSearch Serverless vector store for you.”

How is the custom data stored? 🤖

Vector embeddings include the numeric representations of text data within your documents. Each embedding aims to capture the semantic or contextual meaning of the data. Amazon Bedrock takes care of creating, storing, managing, and updating your embeddings in the vector store, and it ensures your data is always in sync with your vector store.

✔️ Pre-processing

To enhance data retrieval, documents are divided into smaller segments, converted into embeddings, and stored in a vector index, maintaining a link to the original document. These embeddings enable semantic similarity comparisons for efficient query matching in data sources. This process is depicted in the accompanying image.

(Diagram source: https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html)
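
As a hypothetical illustration of what a ‘semantic similarity comparison’ is, the sketch below compares two embedding vectors using cosine similarity; the vector store performs this kind of comparison at scale using an approximate nearest-neighbour index:

// compare two embedding vectors: ~1 means very similar meaning, ~0 unrelated
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}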

✔️ Runtime execution

At runtime, the model converts the user’s query into a vector and searches the vector index for semantically similar chunks. These chunks are used to augment the user prompt, which is then sent to the model for generating a response. This process is depicted in the image below illustrating RAG’s runtime operation.

(Diagram source: https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html)
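
Continuing the hypothetical sketch from the pre-processing section, ranking stored chunks against a query embedding could look like the following, reusing the cosineSimilarity helper above. In reality, the vector store does this far more efficiently:

interface IndexedChunk {
  text: string; // the original chunk of document text
  embedding: number[]; // its vector representation
}

// return the k chunks most semantically similar to the query embedding
function topKChunks(
  queryEmbedding: number[],
  index: IndexedChunk[],
  k = 3
): string[] {
  return [...index]
    .sort(
      (a, b) =>
        cosineSimilarity(queryEmbedding, b.embedding) -
        cosineSimilarity(queryEmbedding, a.embedding)
    )
    .slice(0, k)
    .map((chunk) => chunk.text);
}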

✔️ Syncing Data

When we upload new documents to our Amazon S3 bucket as our Knowledge Base data source, we need to periodically sync the data with the Knowledge Base for indexing and querying.

Syncing updates the knowledge base incrementally, processing only newly added or modified objects in your S3 bucket since the last sync.
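
We will walk through the code that starts a sync (an ingestion job) later in the article. As a hedged aside, you can also check the status of a running job using GetIngestionJobCommand from the same @aws-sdk/client-bedrock-agent package; a minimal sketch:

import {
  BedrockAgentClient,
  GetIngestionJobCommand,
} from '@aws-sdk/client-bedrock-agent';

const client = new BedrockAgentClient();

// check the status of a previously started ingestion job (sync)
export async function getSyncStatus(
  knowledgeBaseId: string,
  dataSourceId: string,
  ingestionJobId: string
): Promise<string | undefined> {
  const response = await client.send(
    new GetIngestionJobCommand({ knowledgeBaseId, dataSourceId, ingestionJobId })
  );
  // for example STARTING, IN_PROGRESS, COMPLETE or FAILED
  return response.ingestionJob?.status;
}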

What are we building? 🛠️

OK, so now that we have done a deep dive into Knowledge Bases with Amazon Bedrock and understand how they work theoretically, let’s look at what we are building in this article:

What we are building in this article

We can see from the diagram above that:

  1. A user from the Reception Staff team makes a request to Amazon API Gateway through their application.
  2. Amazon API Gateway invokes a Lambda function based on the POST request with the query.
  3. The Lambda function calls the Bedrock Knowledge Base, which augments the user’s query with data from the OpenSearch Serverless vector store.
  4. When an object is created, modified, or deleted in the S3 bucket, we invoke an ingestion Lambda function.
  5. The Lambda function calls the Knowledge Base to sync the data from the Amazon S3 bucket, since there have been modifications.

Now that we have talked through the overall architecture, let’s look at this in action, and then talk through key code.

Talking through key code 👨‍💻

OK, so we have seen this basic example in action; now let’s look through the TypeScript and CDK code. Remember that the full solution can be found here.

Note: we use the experimental generative AI CDK constructs for Bedrock (the @cdklabs/generative-ai-cdk-constructs package) as linked below.

Stateful Stack

We start by looking at our Stateful stack, with our Amazon Bedrock Knowledge Base and our S3 bucket which will house our data:

import * as cdk from 'aws-cdk-lib';
import * as s3 from 'aws-cdk-lib/aws-s3';
import { bedrock } from '@cdklabs/generative-ai-cdk-constructs'; // experimental construct library

// create the bedrock knowledge base
const kb = new bedrock.KnowledgeBase(this, 'BedrockKnowledgeBase', {
  embeddingsModel: bedrock.BedrockFoundationModel.TITAN_EMBED_TEXT_V1,
  instruction: `Use this knowledge base to answer questions about patient records.`,
});

// create the s3 bucket which houses our patient data as a source for bedrock
this.bucket = new s3.Bucket(this, 'PatientRecordsBucket', {
  bucketName: 'lj-medical-center-patient-records',
  autoDeleteObjects: true,
  removalPolicy: cdk.RemovalPolicy.DESTROY,
});

We can see from the code above that we are using the Amazon Titan Embeddings V1 model (amazon.titan-embed-text-v1) to generate our vector embeddings.

Next, we ensure that on our first deploy, we push up the example documents from our data folder into our S3 bucket:

import * as path from 'path';
import * as s3deploy from 'aws-cdk-lib/aws-s3-deployment';

// ensure that the data is uploaded as part of the cdk deploy
new s3deploy.BucketDeployment(this, 'ClientBucketDeployment', {
  sources: [s3deploy.Source.asset(path.join(__dirname, '../../data/'))],
  destinationBucket: this.bucket,
});

“Supported data formats include .pdf, .txt, .md, .html, .doc and .docx, .csv, .xls, and .xlsx files. Files must be uploaded to Amazon S3. Simply point to the location of your data in Amazon S3, and Knowledge Bases for Amazon Bedrock takes care of the entire ingestion workflow into your vector database.” — https://aws.amazon.com/bedrock/faqs/

Lastly, we create our Data Source for the Knowledge Base, which points to our S3 bucket:

// set the data source of the s3 bucket for the knowledge base
const dataSource = new bedrock.S3DataSource(this, 'DataSource', {
  bucket: this.bucket,
  knowledgeBase: kb,
  dataSourceName: 'patients',
  chunkingStrategy: bedrock.ChunkingStrategy.DEFAULT,
  maxTokens: 500, // each chunk is up to 500 tokens
  overlapPercentage: 20, // with a 20% overlap between adjacent chunks
});

We can deploy our Stateful stack using the npm script npm run deploy:stateful, and once complete we can log into the console and hit ‘Sync’:

Clicking on the Sync button the first time to sync the Knowledge Base and the Data Source in S3

Stateless Stack

Now let’s look at our Stateless stack, starting with adding some S3 triggers to call our Ingestion Lambda function any time we have changes in the S3 bucket (new files, modifications, deletions etc):

// create s3 event sources for objects being added, modified or removed
bucket.addEventNotification(
  s3.EventType.OBJECT_CREATED, // covers puts, posts, copies and multipart uploads
  new s3n.LambdaDestination(ingestionLambda)
);
bucket.addEventNotification(
  s3.EventType.OBJECT_REMOVED,
  new s3n.LambdaDestination(ingestionLambda)
);

The Ingestion Lambda then has a secondary adapter which runs the following code to sync the data source for us:

import {
  BedrockAgentClient,
  StartIngestionJobCommand,
  StartIngestionJobCommandInput,
  StartIngestionJobCommandOutput,
} from '@aws-sdk/client-bedrock-agent';

import { config } from '@config';
import { logger } from '@shared/logger';
import { v4 as uuid } from 'uuid';

const client = new BedrockAgentClient();
const knowledgeBaseId = config.get('knowledgeBaseId');
const dataSourceId = config.get('dataSourceId');

export async function ingestionProcess(): Promise<string> {
  const input: StartIngestionJobCommandInput = {
    knowledgeBaseId: knowledgeBaseId,
    dataSourceId: dataSourceId,
    clientToken: uuid(), // idempotency token so retries don't start duplicate jobs
  };
  const command: StartIngestionJobCommand = new StartIngestionJobCommand(input);

  const response: StartIngestionJobCommandOutput = await client.send(command);
  logger.info(`response: ${JSON.stringify(response)}`);

  return JSON.stringify({
    ingestionJob: response.ingestionJob,
  });
}
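
For completeness, a hypothetical primary adapter (the Lambda handler itself) that wires the S3 event to this function might look like the one below; the import path and handler shape are assumptions, and the actual repo code may differ:

import { S3Event } from 'aws-lambda';

import { logger } from '@shared/logger';

import { ingestionProcess } from './ingestion-process'; // hypothetical import path

export const handler = async (event: S3Event): Promise<void> => {
  // any created or removed object triggers a sync, so we don't need to
  // inspect the individual records on the event
  logger.info(`received ${event.Records.length} s3 record(s)`);
  await ingestionProcess();
};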

Next, we look at the IAM policy required to allow the Lambda function to perform the sync, as shown below:

// ensure that the lambda function can start a data ingestion job
ingestionLambda.addToRolePolicy(
  new iam.PolicyStatement({
    actions: ['bedrock:StartIngestionJob'],
    resources: [knowledgeBaseArn],
  })
);

We also add a similar policy for our Query Lambda, which allows it to perform actions against Amazon Bedrock:

// we allow the query lambda function to query our models
queryModelLambda.addToRolePolicy(
  new iam.PolicyStatement({
    actions: [
      'bedrock:RetrieveAndGenerate',
      'bedrock:Retrieve',
      'bedrock:InvokeModel',
    ],
    // note: ideally scope this down to the knowledge base and model arns
    resources: ['*'],
  })
);

The code for the secondary adapter for the Query Lambda is shown below:

import {
  BedrockAgentRuntimeClient,
  RetrieveAndGenerateCommand,
  RetrieveAndGenerateCommandInput,
  RetrieveAndGenerateCommandOutput,
} from '@aws-sdk/client-bedrock-agent-runtime';

import { config } from '@config';

const client = new BedrockAgentRuntimeClient();
const knowledgeBaseId = config.get('knowledgeBaseId');

export async function queryModel(prompt: string): Promise<string> {
  const input: RetrieveAndGenerateCommandInput = {
    input: {
      text: prompt,
    },
    retrieveAndGenerateConfiguration: {
      type: 'KNOWLEDGE_BASE',
      knowledgeBaseConfiguration: {
        knowledgeBaseId: knowledgeBaseId,
        // we are using Anthropic Claude v2 in us-east-1 in this example
        modelArn: `arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2`,
      },
    },
  };
  const command: RetrieveAndGenerateCommand = new RetrieveAndGenerateCommand(
    input
  );
  const response: RetrieveAndGenerateCommandOutput = await client.send(command);
  return response.output?.text as string;
}
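
Similarly, a hypothetical primary adapter for the Query Lambda, sitting behind the API Gateway proxy integration we create next, might look like this (the request body shape and import path are assumptions):

import {
  APIGatewayProxyEvent,
  APIGatewayProxyResult,
} from 'aws-lambda';

import { queryModel } from './query-model'; // hypothetical import path

export const handler = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  // assumed request body shape: { "query": "..." }
  const { query } = JSON.parse(event.body ?? '{}');

  if (!query) {
    return {
      statusCode: 400,
      body: JSON.stringify({ message: 'query is required' }),
    };
  }

  const answer = await queryModel(query);
  return { statusCode: 200, body: JSON.stringify({ answer }) };
};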

The final part is to add our Amazon API Gateway and allow it to invoke the Query Lambda function on a POST to our /queries/ resource:

// create the api for our receptionist app to use
const api: apigw.RestApi = new apigw.RestApi(this, 'Api', {
  description: 'LJ Medical Center API',
  restApiName: 'lj-medical-center-api',
  deploy: true,
  endpointTypes: [apigw.EndpointType.REGIONAL],
  deployOptions: {
    stageName: 'prod',
    dataTraceEnabled: true,
    loggingLevel: apigw.MethodLoggingLevel.INFO,
    tracingEnabled: true,
    metricsEnabled: true,
  },
});

// create the queries resource for the api
const queries: apigw.Resource = api.root.addResource('queries');

// add the endpoint for querying our knowledge base (post) on prod/queries/
queries.addMethod(
  'POST',
  new apigw.LambdaIntegration(queryModelLambda, {
    proxy: true,
    allowTestInvoke: false,
  })
);

When we deploy the stateless stack using the npm script npm run deploy:stateless, we can test the functionality.

Testing the App 🧪

⚠️ Note: “Before deploying the example application, please note that OpenSearch Serverless on its own costs $700 a month, without the additional costs of Bedrock, CloudWatch, Lambda, API Gateway etc.”

Testing through Postman

You can use the Postman file in postman/Bedrock Knowledge Bases.postman_collection.json to test with your own URL information.
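
If you would prefer to test from a script rather than Postman, a minimal hypothetical sketch using Node’s built-in fetch is shown below; the URL is a placeholder, and the request body shape is an assumption:

// placeholder url: replace with your own deployed api gateway endpoint
const apiUrl =
  'https://your-api-id.execute-api.us-east-1.amazonaws.com/prod/queries';

async function ask(query: string): Promise<void> {
  const response = await fetch(apiUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query }), // assumed request body shape
  });
  console.log(await response.json());
}

ask('What is the policy for late payments?').catch(console.error);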

Let’s start by asking a simple query about late payments:

“What is the policy for late payments?”

Example of asking what the policy is for late payments

We can see from the screenshot above that we get the correct response:

“The policy for late payments at LJ Medical Center is that late payments will incur an added fee of $50 per treatment.”

We can then ask a query like:

“How can complaints be received?”

An example query on how complaints can be received

We can see from the query above that we successfully get the answer:

“Complaints can be received verbally, in writing, or through electronic means (email, website, etc.). Frontline staff, department heads, or designated complaint officers are responsible for receiving complaints.”

These are just two examples of queries that our reception staff could use to quickly find the information that they need from all of our policies.

Why not try this out and start adding some fictitious patient records to the mix? Just remember the costs of deploying the solution!

Wrapping up 👋🏽

I hope you enjoyed this article, and if you did then please feel free to share it and leave feedback!

Please go and subscribe to my YouTube channel for similar content!

I would love to connect with you also on any of the following:

https://www.linkedin.com/in/lee-james-gilmore/
https://twitter.com/LeeJamesGilmore

If you enjoyed the posts please follow my profile Lee James Gilmore for further posts/series, and don’t forget to connect and say Hi 👋

Please also use the ‘clap’ feature at the bottom of the post if you enjoyed it! (You can clap more than once!!)

About me

Hi, I’m Lee, an AWS Community Builder, Blogger, AWS certified cloud architect, and Global Head of Technology & Architecture based in the UK; currently working for City Electrical Factors (UK) & City Electric Supply (US), having worked primarily in full-stack JavaScript on AWS for the past 6 years.

I consider myself a serverless advocate with a love of all things AWS, innovation, software architecture, and technology.

*** The information provided represents my own personal views, and I accept no responsibility for the use of the information. ***
