## What You'll Learn

- Part 1: Evaluating vector databases against real-world filtering requirements
- Part 2: Step-by-step implementation for RAG

## Part 1: The Vector Database Evaluation Journey

You've built a chatbot powered by a sophisticated large language model (LLM). It is incredibly confident and impresses everyone in the demo. Then a customer asks: "Can I change my shipping address after placing an order?"

Your chatbot confidently responds: "No, shipping addresses cannot be changed once an order is placed. You'll need to cancel and reorder."

Oops. Your policy actually allows address changes within 2 hours of ordering. That's one frustrated customer who just posted a 1-star review about your "broken chatbot". Multiply that by the 50 others who will ask the same question.

This gets even riskier with pricing. Imagine your chatbot confidently telling customers: "Our product A costs $99/month," when you actually changed it to $79/month last quarter.

### The Search for a Reliable Solution

When I set out to build a reliable chatbot for our customer service, I evaluated several approaches:

- Option 1: Fine-tuning - too expensive, needs constant updating
- Option 2: Bigger models - higher costs with still-outdated knowledge
- Option 3: RAG (Retrieval-Augmented Generation) - promising, but with a critical catch
RAG promised to tackle three core challenges of LLMs:

- **Hallucination:** plausible-sounding but fabricated information
- **Static knowledge:** an LLM's knowledge is frozen at training time
- **Compute cost:** retraining is extremely expensive and requires extensive GPU time, eating your budget

The question isn't whether to enhance your chatbot, but how to do it cost-effectively.

### The Vector Database Dilemma

RAG promised to solve our problems, but introduced a new dilemma: which vector database should we use for our customer interactions?

Our chatbot needs to handle queries like:

- What is the return policy for electronics under $100?
- Can I upgrade shipping for orders over $200?
- I want to change my shipping address; I ordered last night.
- Show me laptops between $800-$1200 with 16GB RAM.
- Find laptops under $500.

Notice the pattern? Real user queries combine numerical filtering with semantic understanding. The goal isn't just to find similar texts, but to surface relevant information within specific constraints. When we analysed our conversation logs, we found that queries routinely mixed prices, current promotions, and policies. That's when it hit me: this wasn't an edge case; it was core functionality. Semantic search alone isn't enough for real business logic.
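The gap is easy to demonstrate with a toy example. Below is a hedged sketch (invented products and similarity scores, no real vector-database API) of the naive alternative: running a plain top-k similarity search and filtering afterwards, which can silently drop every valid result.

```python
# Toy illustration: post-filtering a top-k similarity result can return
# nothing, while filtering *during* the search keeps all eligible hits.
products = [
    {"name": "Laptop A", "price": 1100, "score": 0.95},
    {"name": "Laptop B", "price": 1500, "score": 0.93},
    {"name": "Laptop C", "price": 1900, "score": 0.91},
    {"name": "Laptop D", "price": 450,  "score": 0.70},
    {"name": "Laptop E", "price": 480,  "score": 0.65},
]

def naive_search(condition, k=3):
    # 1) take the top-k most similar items, 2) filter afterwards
    top_k = sorted(products, key=lambda p: p["score"], reverse=True)[:k]
    return [p for p in top_k if condition(p)]

def filtered_search(condition, k=3):
    # restrict candidates first, then rank by similarity
    eligible = [p for p in products if condition(p)]
    return sorted(eligible, key=lambda p: p["score"], reverse=True)[:k]

def under_500(p):
    return p["price"] < 500

print(len(naive_search(under_500)))     # 0 - every top-3 hit is over $500
print(len(filtered_search(under_500)))  # 2 - both cheap laptops survive
```

This is exactly why native filtered search was a hard requirement for us, not a nice-to-have.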
Our vector database needed:

- Semantic understanding (what are they asking?)
- Numerical filtering (within what limits?)
- Both together, seamlessly

### The Vector Database Shortlist

Four options emerged as serious contenders:

- **ChromaDB**: the simplest option, optimized for ease of use
- **Pinecone**: a fully managed solution with no infrastructure overhead
- **Milvus**: the clear choice for large-scale deployments
- **Weaviate**: a flexible platform with multiple hosting options

The question wasn't which was "best" in theory, but which was right for our specific, filter-heavy, production-ready chatbot. Let's take a glance at the setup.
### Vector Database Setup Comparison

| Feature | ChromaDB | Pinecone | Weaviate | Milvus |
|---|---|---|---|---|
| Installation | `pip install chromadb` | `pip install pinecone-client` | `pip install weaviate-client` | `pip install pymilvus` |
| Quick start | Instant | API key only | Docker or cloud | Docker Compose |
| Setup time | 5 minutes | 5-10 minutes | 3-5 minutes (cloud), 20 minutes (Docker) | 45+ minutes |
| Infrastructure | None needed | None needed | Docker/K8s/Cloud | Docker Compose / K8s |
| Free tier duration | Forever | 30 days | 14 days (cloud) | Trial (Zilliz) |
| After free tier | Still free | Pay or delete | Pay or self-host | Pay or self-host |
| Local development | Excellent | Cloud-only | Docker simple | Complex |
| Learning curve | Easy | Easy | Medium | Hard |

### Query Comparison

We have seen the setup; the code tells the true story. Let's see how each database handles the exact queries our customers ask.

**Test Query 1: "Give me laptops that are below $500"**

1. Weaviate

```python
result = products.query.near_text(
    query="laptop",
    filters=Filter.by_property("price").less_than(500),
    limit=10
)
result.objects[0].properties['name']
```

2. Pinecone

```python
result = index.query(
    vector=get_embedding("laptop"),
    filter={"price": {"$lt": 500}},
    top_k=10,
    include_metadata=True
)
result['matches'][0]['metadata']['name']
```

3. ChromaDB

```python
result = collection.query(
    query_texts=["laptop"],
    where={"price": {"$lt": 500}},
    n_results=10
)
result['metadatas'][0][0]['name']
```

4. Milvus

```python
result = collection.search(
    data=[get_embedding("laptop")],
    anns_field="embedding",
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    expr="price < 500",
    limit=10,
    output_fields=["name", "price"]
)
result[0][0].entity.get('name')
```

Now let's examine a slightly more complex query: "Show me laptops between $800-$1200 with 16GB RAM."

1. Weaviate

```python
result = products.query.hybrid(
    query="laptop 16GB RAM",
    filters=(
        Filter.by_property("price").greater_or_equal(800)
        & Filter.by_property("price").less_or_equal(1200)
        & Filter.by_property("ram").equal("16GB")
    ),
    limit=10
)

# Access results
for product in result.objects:
    print(f"{product.properties['name']}: ${product.properties['price']}")
```

2. Pinecone

```python
result = index.query(
    vector=get_embedding("laptop 16GB RAM"),
    filter={
        "price": {"$gte": 800, "$lte": 1200},
        "ram": {"$eq": "16GB"}
    },
    top_k=10,
    include_metadata=True
)

# Access results
for match in result['matches']:
    print(f"{match['metadata']['name']}: ${match['metadata']['price']}")
```

3. ChromaDB

```python
result = collection.query(
    query_texts=["laptop 16GB RAM"],
    where={
        "$and": [
            {"price": {"$gte": 800}},
            {"price": {"$lte": 1200}},
            {"ram": {"$eq": "16GB"}}
        ]
    },
    n_results=10
)

# Access results
for meta in result['metadatas'][0]:
    print(f"{meta['name']}: ${meta['price']}")
```

4. Milvus

```python
result = collection.search(
    data=[get_embedding("laptop 16GB RAM")],
    anns_field="embedding",
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    expr='price >= 800 && price <= 1200 && ram == "16GB"',
    limit=10,
    output_fields=["name", "price", "ram"]
)

# Access results
for hits in result:
    for hit in hits:
        print(f"{hit.entity.get('name')}: ${hit.entity.get('price')}")
```

### Filter Syntax in Practice: Where Developer Experience Meets Production Reality

| Database | Type safety | Readability | Developer experience | Best for |
|---|---|---|---|---|
| Weaviate | Compile-time validation (client-side) | Excellent: reads like natural language queries | Clean method chaining, intuitive API | Complex business logic with multiple filter conditions |
| Pinecone | Runtime validation (server-side) | Good: JSON/dictionary syntax | Simple dictionary-based filters | Managed infrastructure (zero ops) |
| ChromaDB | No validation (client-side only) | Okay: nested dictionary structure | Dictionary-based syntax, minimal learning curve | Prototypes and MVPs without complex filtering |
| Milvus | Runtime only (string parsing) | Complex: string expressions | String-based expressions, error-prone | High-performance, large-scale deployments |

### The Decision: How We Found Our Perfect Match

After weeks of evaluation, technical deep-dives, and real prototyping, we arrived at a clear winner. This wasn't about finding the "best" vector database; it was about finding the right partner for our specific journey.

### Our Non-Negotiables: The Filter That Filtered Our Options

We built our decision framework around what truly mattered for our team, timeline, and business goals:

1. Week-1 prototyping: we needed working code in 7 days, not 7 weeks
2. Future self-hosting: cloud today, on-prem tomorrow without API changes
3. Zero-cost experimentation: test ideas without budget approvals
4. Developer-first experience: no DevOps PhD required
5. Complex filtering + hybrid search: our chatbot's core competency
6.
Clean, predictable results: no black-box scoring mysteries

### Why It Felt Like Finding "The One"

| Our anxiety | Weaviate's answer |
|---|---|
| We'll get stuck in DevOps hell | Single Docker container or cloud instance |
| Our prototype will take months | Working in hours, production-ready in days |
| Filtering will be hacky and slow | Native, optimized filtering during search |
| We'll outgrow it quickly | Scales beautifully to 10M+ vectors |
| The learning curve will stall us | Intuitive API our junior devs mastered instantly |

## Part 2: Building Your RAG Chatbot

After that exhaustive evaluation, we've arrived at our destination: Weaviate! Yes, it was a journey of testing, comparing, and prototyping, but every step was necessary. We didn't just pick a tool; we found a solution that fits our team, our timeline, and our technical requirements. Now comes the exciting part: let's roll up our sleeves and build something. I promise the implementation is much smoother than the evaluation was!

### Step 1: Setting Up Your Weaviate Cloud Instance

Before we dive into code, let's get our cluster and credentials ready.

#### 1.1 Create Your Weaviate Cloud Account

Head over to the Weaviate Cloud Console and sign up. The free tier gives you enough resources to follow along with this tutorial.

#### 1.2 Launch a New Cluster

Click the "Create Cluster" button and configure:

- Cluster name
- Cloud provider
- Region: select the region closest to your users for low latency
- Tier: start with the Sandbox tier. It's free and perfect for prototyping without cost concerns; however, it expires after 14 days, as noted in the comparison table above.

#### 1.3 Secure Your Connection

Once your cluster is provisioned (it takes a few minutes), click "Create API Key" to generate an API key. Treat it like a password: anyone with this key can access your entire vector database.

### Step 2: Core Functions: The Engine of Our RAG System

I'll walk you through the key functions that make up our RAG system.
For the complete implementation with all imports, helper functions, and configuration, check out the notebook at the end.

#### 2.1 Loading Secret Keys

```python
user_secret = UserSecretsClient()
weaviate_key = user_secret.get_secret("weaviate_key")  # Weaviate API key: the "ADMIN" key
weaviate_url = user_secret.get_secret("weaviate_url")  # Weaviate URL: the "REST Endpoint"
```

#### 2.2 Synthetic FAQ Data

I generated a synthetic FAQ dataset that mirrors real customer service conversations. Here is the structure:

```json
[
  {
    "question": "How much does shipping cost?",
    "answer": "Shipping costs depend on your order total, shipping method, and destination. Standard shipping is free for orders over $50, otherwise it's $4.99. Express shipping costs $9.99, and overnight shipping is $19.99.",
    "category": "Shipping",
    "subcategory": "costs",
    "tags": ["pricing", "shipping delivery"]
  }
]
```

#### 2.3 The Embedding Generator

This function transforms text into 384-dimensional normalized vectors suitable for similarity search. Every FAQ gets converted into a mathematical fingerprint.
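Why normalize? For unit-length vectors, cosine similarity reduces to a plain dot product, so distance-based and inner-product rankings agree. A quick dependency-free sketch (toy 2-D vectors standing in for real 384-dimensional embeddings):

```python
import math

def normalize(v):
    """Scale a vector to unit length, as the embedding generator does."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])

# After normalization, the dot product *is* the cosine similarity
cosine = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
print(round(dot(a, b), 4))  # 0.96
print(round(cosine, 4))     # 0.96
```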
```python
def get_embedding(text: str, embedder: SentenceTransformer) -> list:
    """Generate a normalized embedding for text."""
    embedding = embedder.encode([text])[0]
    vector = np.array(embedding).astype("float32")
    return (vector / np.linalg.norm(vector)).tolist()
```

#### 2.4 Data Validation with Pydantic

```python
class FAQ(BaseModel):
    question: str = Field(description="shipping and billing questions")
    answer: str = Field(description="shipping and billing answers")
    tags: List[str] = Field(description="tags for related customer questions such as payment options, refund process, billing and shipping information")
```

#### 2.5 Schema Bridge: Pydantic to Weaviate

This function converts your Pydantic data model into Weaviate's schema:

```python
def convert_schema_to_weaviate(model: BaseModel) -> List[Property]:
    """Convert Pydantic model fields to Weaviate properties with robust type handling."""
    # Handles: str, int, bool, List[str], nested models
    # Returns Weaviate-ready Property objects
```

#### 2.6 Build the Weaviate Collection

This function creates your entire Weaviate collection, complete with vector indexing.
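Before building the collection, it helps to see what the schema bridge in 2.5 actually does. Here is a library-free, hedged sketch: plain dicts stand in for the client's `Property` objects, `typing.get_type_hints` stands in for Pydantic's field introspection, and the string data-type names are only illustrative; the notebook's real implementation differs.

```python
from typing import List, get_args, get_origin, get_type_hints

class FAQ:  # plain-class stand-in for the Pydantic model
    question: str
    answer: str
    tags: List[str]

# Strings stand in for the client library's data-type enum
TYPE_MAP = {str: "text", int: "int", bool: "boolean"}

def sketch_schema(model) -> list:
    """Map each annotated field to a {name, data_type} property dict."""
    props = []
    for field, hint in get_type_hints(model).items():
        if get_origin(hint) is list and get_args(hint) == (str,):
            data_type = "text[]"  # List[str] becomes a text array
        else:
            data_type = TYPE_MAP[hint]
        props.append({"name": field, "data_type": data_type})
    return props

print(sketch_schema(FAQ))
```

The same walk-the-type-hints idea drives the real converter; it just emits client objects instead of dicts.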
```python
def build_weaviate_collection(
    client: weaviate.WeaviateClient,
    model: BaseModel,
    class_name: str,
    class_description: str = "",
    vector_index_config=Configure.VectorIndex.hnsw()
) -> List[Property]:
    """Turn a Pydantic model into a Weaviate collection."""
    # 1. Convert the model to schema properties
    # 2. Configure vector indexing
    # 3. Create the collection in the cloud
```

#### 2.7 Data Ingestion: From JSON to Vector Search

This is where you convert your raw FAQ data into searchable vectors:

```python
def load_faqs_to_weaviate(
    collection_name: str,
    embedder: SentenceTransformer,
    file_path: str,
    client: weaviate.Client
) -> None:
    """Embed and load FAQ data into an existing Weaviate collection using a SentenceTransformer."""
    # JSON validation -> embedding generation -> batch import
    # Includes error handling and duplicate prevention
```

#### 2.8 Connecting It All Together

```python
# 1. Connect to Weaviate Cloud
client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=Auth.api_key(weaviate_key)
)

# 2. Create the collection with our schema
name = "ecommerce_faqs"
desc = "Shipping, billing, customer queries"
print(client.collections.exists(name))
build_weaviate_collection(client, FAQ, name, desc)

# 3. Initialize our embedding model
embedder = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

# 4. Load and vectorize our FAQs
load_faqs_to_weaviate(
    collection_name="ecommerce_faqs",
    embedder=embedder,
    file_path="/kaggle/input/shipping-data-set/shipping_billing_faqs.json",
    client=client
)
```

The output:

```
Loaded 20 FAQs from /kaggle/input/shipping-data-set/shipping_billing_faqs.json
Generating embeddings and importing FAQs for collection class.....ecommerce_faqs
Successfully Loaded 20 FAQs into collection ecommerce_faqs
```

We now have:

- A vector database connection to Weaviate Cloud
- Schema creation with the necessary fields
- Embedding generation
- Data ingestion with vectorization and indexing
- An FAQ search engine

### Ready for Search

We now have a semantic understanding of customer questions, grounded in accurate answers. Weaviate's hybrid search finds the 3 most relevant FAQs:

```python
# 1. Ask a natural language question
query = "Explain me about my delayed order"
collection = client.collections.get("ecommerce_faqs")

# 2. Perform hybrid search (keyword + semantic)
results = collection.query.hybrid(
    query=query,
    vector=get_embedding(query, embedder),
    limit=3,
    return_properties=["answer", "question"],
    return_metadata=["score"],
).objects
print(results)

# 3. Display the results
for i, result in enumerate(results, 1):
    score = getattr(result.metadata, "score", "N/A")
    question = result.properties.get("question", "N/A")
    answer = result.properties.get("answer", "No answer available")
    print(f"I:{i}, Score{score}, Question={question}, Answer={answer}")
    print("****************")
```

Output:

```
I:1, Score1.0, Question=Why is my order delayed?, Answer=Delays can occur due to weather conditions, carrier issues, customs processing, or incorrect address information. Please check your tracking number for detailed updates or contact our support team for assistance.
****************
I:2, Score0.6934776306152344, Question=How can I track my order?, Answer=You can track your order using the tracking link in your shipping confirmation email, or log into your account and visit the "Order History" section. Tracking updates are provided by the carrier every 24 hours.
****************
I:3, Score0.6007568836212158, Question=Can I change my payment method for an existing order?, Answer=You can change the payment method for an order that hasn't shipped yet. Please contact customer service with your order number and new payment details. Once an order ships, payment method changes are not possible.
****************
```

### The Score Explained

- 0.8+: excellent match (directly answers your question)
- 0.6-0.8: good match (related to your question)
- <0.5: weak match (might not be relevant)

This isn't just search; it's understanding. Our old approach was keyword-only, and it failed whenever a question used different words. Our RAG system beats traditional search.

For those interested, here's the link to my code, so you can follow along or adapt it for your purposes!

### What's Next?

Curious about what happens after filtering? In my next piece, I'll dive into how prompt engineering bridges filtered data with natural LLM responses. Stay tuned!