On July 15, 2025, AWS added support for S3 vector stores in Bedrock knowledge bases. Bedrock currently supports multiple stores:

AWS managed:
- OpenSearch
- S3 vector store
- PostgreSQL
- Neptune

Others:
- MongoDB Atlas
- Pinecone
- Redis Enterprise Cloud

What is each one?

AWS managed:

OpenSearch: a distributed, community-driven, Apache 2.0-licensed, 100% open-source search and analytics suite used for a broad set of use cases like real-time application monitoring, log analytics, and website search. OpenSearch provides a highly scalable system for fast access and response to large volumes of data, with an integrated visualization tool, OpenSearch Dashboards, that makes it easy for users to explore their data. OpenSearch is powered by the Apache Lucene search library and supports a number of search and analytics capabilities such as k-nearest neighbors (KNN) search, SQL, Anomaly Detection, Machine Learning Commons, Trace Analytics, full-text search, and more.

S3 vector store: Amazon S3 Vectors is the first cloud object store with native support for storing and querying vectors, delivering purpose-built, cost-optimized vector storage for AI agents, AI inference, and semantic search of your content stored in Amazon S3. By reducing the cost of uploading, storing, and querying vectors by up to 90%, S3 Vectors makes it cost-effective to create and use large vector datasets to improve the memory and context of AI agents as well as the semantic search results over your S3 data.
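Since S3 Vectors is the newcomer here, it may help to see what its native API looks like. Below is a minimal sketch of writing and querying vectors directly with boto3; the bucket name, index name, and embedding values are made-up placeholders, and you need a recent boto3 release that ships the s3vectors client. When S3 Vectors backs a Bedrock knowledge base, Bedrock makes these calls for you.

import boto3

# The s3vectors client is available in recent boto3 releases
s3vectors = boto3.client('s3vectors')

# Write a vector: each entry has a key, float32 data, and optional metadata
s3vectors.put_vectors(
    vectorBucketName='my-vector-bucket',    # placeholder bucket name
    indexName='my-index',                   # placeholder index name
    vectors=[{
        'key': 'doc-001',
        'data': {'float32': [0.1] * 1024},  # stand-in for a real 1024-dim embedding
        'metadata': {'source': 'holidays.txt'}
    }]
)

# Query the index for the 3 nearest neighbours of a query embedding
response = s3vectors.query_vectors(
    vectorBucketName='my-vector-bucket',
    indexName='my-index',
    queryVector={'float32': [0.1] * 1024},
    topK=3,
    returnDistance=True,
    returnMetadata=True
)
print(response['vectors'])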
AWS Aurora PostgreSQL: Amazon Aurora PostgreSQL is a cloud-based, fully managed relational database service that is compatible with PostgreSQL. It combines the performance and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases, specifically PostgreSQL. Essentially, it's a PostgreSQL-compatible database offered as a service by Amazon Web Services.

Amazon Neptune: a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets.

And non-AWS managed:

MongoDB Atlas: a multi-cloud database service by the same people that build MongoDB. Atlas simplifies deploying and managing your databases while offering the versatility you need to build resilient and performant global applications on the cloud providers of your choice.

Pinecone: a cloud-based vector database service designed for AI applications, particularly those involving retrieval-augmented generation.

Redis Enterprise Cloud: a fully managed, on-demand, database-as-a-service (DBaaS) offering from Redis, built on the foundation of open-source Redis.

Now that we understand what each of the supported stores is, I'm going to test and compare only the AWS-managed ones. I have created a Python Lambda function:

import json
import time
import boto3


def lambda_handler(event, context):
    """Demo: Bedrock Nova Micro with Knowledge Base timing comparison"""

    # Configuration - easily change these for testing
    MODEL_ID = "amazon.nova-micro-v1:0"
    # Allow override for comparison
    KNOWLEDGE_BASE_ID = event.get('kb_id')

    # Initialize clients
    bedrock_runtime = boto3.client('bedrock-runtime')
    bedrock_agent_runtime = boto3.client('bedrock-agent-runtime')

    query = event.get(
        'query', 'Can you provide a list of bank holidays employers can have?')

    start_time = time.time()

    try:
        # 1. Retrieve from Knowledge Base
        kb_start = time.time()
        kb_response = bedrock_agent_runtime.retrieve(
            knowledgeBaseId=KNOWLEDGE_BASE_ID,
            retrievalQuery={'text': query},
            retrievalConfiguration={
                'vectorSearchConfiguration': {'numberOfResults': 3}}
        )
        kb_time = time.time() - kb_start

        # 2. Build context and prompt
        context = "\n".join([r['content']['text']
                             for r in kb_response.get('retrievalResults', [])])
        prompt = f"Context: {context}\n\nQuestion: {query}\n\nAnswer:"

        # 3. Call Bedrock model
        model_start = time.time()
        response = bedrock_runtime.converse(
            modelId=MODEL_ID,
            messages=[{"role": "user", "content": [{"text": prompt}]}],
            inferenceConfig={"maxTokens": 500, "temperature": 0.7}
        )
        model_time = time.time() - model_start

        total_time = time.time() - start_time
        answer = response['output']['message']['content'][0]['text']

        return {
            'statusCode': 200,
            'body': json.dumps({
                'kb_id': KNOWLEDGE_BASE_ID,
                'query': query,
                'answer': answer,
                'timing_ms': {
                    'kb_retrieval': round(kb_time * 1000),
                    'model_inference': round(model_time * 1000),
                    'total': round(total_time * 1000)
                },
                'chunks_found': len(kb_response.get('retrievalResults', []))
            })
        }

    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps({
                'error': str(e),
                'kb_id': KNOWLEDGE_BASE_ID
            })
        }
In this example, we use the amazon.nova-micro-v1:0 Bedrock model to compare performance. The Lambda function expects a test event in this format:

{
    "query": "Can you provide a list of bank holidays employers can have?",
    "kb_id": "AAUAL8BHQV"
}

query - our query, taken from my example text file. You can use any text file you want.
kb_id - the knowledge base ID; we will create a Bedrock knowledge base for each test.

I have created 4 different knowledge bases using different data sources. And finally, we have everything we need to run our tests: let's invoke the Lambda function, changing only the knowledge base ID, to test each store properly.
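To make the comparison repeatable, the four invocations can be driven from a small script. This is a minimal sketch, assuming the function is deployed under the hypothetical name kb-timing-demo; the knowledge base IDs other than the one above are placeholders for your own:

import json
import boto3

lambda_client = boto3.client('lambda')

# One knowledge base per vector store; most IDs below are placeholders
KNOWLEDGE_BASES = {
    'OpenSearch': 'AAUAL8BHQV',
    'Neptune': 'XXXXXXXXXX',
    'PostgreSQL': 'YYYYYYYYYY',
    'S3 vector store': 'ZZZZZZZZZZ',
}

query = 'Can you provide a list of bank holidays employers can have?'

for store, kb_id in KNOWLEDGE_BASES.items():
    response = lambda_client.invoke(
        FunctionName='kb-timing-demo',  # hypothetical function name
        Payload=json.dumps({'query': query, 'kb_id': kb_id}),
    )
    result = json.loads(response['Payload'].read())
    body = json.loads(result['body'])
    print(f"{store}: {body['timing_ms']['kb_retrieval']} ms retrieval, "
          f"{body['timing_ms']['total']} ms total")

A single invocation is noisy (cold starts, embedding latency, network jitter), so averaging a few runs per store gives fairer numbers.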
OpenSearch:

Neptune:

PostgreSQL:

S3 Vector store:

And for a better visual, ordered by execution time:

Vector store     | Execution time in ms
OpenSearch       | 1695
PostgreSQL       | 1807
Neptune          | 2236
S3 vector store  | 2284

As you can see, OpenSearch is the fastest store here. But what about the cost?

OpenSearch - you pay per OCU:

OpenSearch Compute Unit (OCU) - Indexing          | $0.24 per OCU per hour
OpenSearch Compute Unit (OCU) - Search and Query  | $0.24 per OCU per hour

The minimum OCU you can pay for is 0.5. That means $0.24 * 24 hours * 30 days * 2 (indexing, and search and query) * 0.5 (minimum OCU) = roughly $172 per month.

PostgreSQL - you pay per ACU:

Aurora Capacity Unit (ACU) | $0.12 per ACU per hour

The minimum ACU you can pay for is 0, but 1 ACU will cost you $0.12 * 24 hours * 30 days = roughly $86 per month.

Neptune - you pay per memory-optimized Neptune Capacity Unit (m-NCU):

Memory-optimized Neptune Capacity Units configuration | Cost
16 m-NCUs   | $0.48 per hour
32 m-NCUs   | $0.96 per hour
64 m-NCUs   | $1.92 per hour
128 m-NCUs  | $3.84 per hour
256 m-NCUs  | $7.68 per hour
384 m-NCUs  | $11.52 per hour

The minimal configuration is $0.48 per hour, which means it will cost you $0.48 * 24 hours * 30 days = roughly $345 per month. Wow!

For the S3 vector store, you pay for requests and storage.
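As a quick sanity check on the always-on math above (a 30-day month is 720 hours; the figures in the text are rounded):

# Monthly cost of the smallest always-on configuration, 30-day month
HOURS_PER_MONTH = 24 * 30  # 720

opensearch = 0.24 * HOURS_PER_MONTH * 2 * 0.5  # two OCU types at 0.5 OCU each -> $172.8
postgresql = 0.12 * HOURS_PER_MONTH            # one ACU -> $86.4
neptune = 0.48 * HOURS_PER_MONTH               # 16 m-NCUs -> $345.6

for name, cost in [('OpenSearch', opensearch),
                   ('PostgreSQL', postgresql),
                   ('Neptune', neptune)]:
    print(f'{name}: ${cost:.2f}/month')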
S3 Vector storage pricing - monthly logical storage of vector data, keys, and metadata:

S3 Vector storage /month | $0.06 per GB

S3 Vectors request pricing:

PUT requests (per GB)*                                 | $0.20 per GB
GET, LIST and all other requests (per 1,000 requests)  | $0.055

*PUT is subject to a minimum charge of 128 KB per PUT. To lower PUT costs, you can batch multiple vectors per PUT request.

S3 Vectors query pricing:

S3 Vectors query requests (per 1,000 requests) | $0.0025

S3 Vector data - sum of vectors per index multiplied by average vector size (vector data, key, and filterable metadata):

First 100 thousand vectors  | $0.0040 per TB
Over 100 thousand vectors   | $0.0020 per TB

TLDR - S3 Vectors storage charge:

(4 bytes * 1024 dimensions) vector data/vector + 1 KB filterable metadata/vector + 1 KB non-filterable metadata/vector + 0.17 KB key/vector = 6.17 KB logical storage per average vector.

6.17 KB/average vector * 250,000 vectors * 40 vector indexes = 59 GB logical storage.

Total monthly storage cost = 59 GB * $0.06/GB per month = $3.54

Final comparison table:

Vector store type | Retrieval time | Approx pricing per month
S3 Vector         | 2284 ms        | $3.54
Neptune           | 2236 ms        | $345
PostgreSQL        | 1807 ms        | $86
OpenSearch        | 1695 ms        | $172

If speed is not critical, I'd choose the S3 vector store. Otherwise, OpenSearch is the obvious winner and would probably be the better choice.

Which vector store are you using in your project?