Overcome LLM Hallucinations Using Knowledge Bases

What? If you prompt an LLM “suggest a great programming language for machine learning” LLMs response would be: “One of the most recommended programming languages for machine learning is Python. Python is a high-level…” What if you want your organization to provide a verified organization specific information i.e. enhance the response with authentic organization information? Let us make it happen when interacting with LLM Why? Popular LLMs like OpenAI’s chatGPT, Google’s Gemini are trained on publicly available data. They often lack organization specific information. There are certain times where organizations would like to rely on LLMs. However, would like to enhance the response specific to a particular organization or add disclaimers when no grounding data is available. The process of doing this is known as Grounding of LLM’s response using Knowledge bases. How? While, I can just talk about it. As an engineer looking at some code snippets gives me confidence. Executing them elevates my confidence and also gives happiness. Sharing gives me satisfaction 😄 Code? Why not! → Python? Of course!! Install required libraries pip install openai faiss-cpu numpy python-dotenv openai: To interact with OpenAI’s GPT models and embeddings. faiss-cpu: A library by Facebook AI for efficient similarity search, used to store and search embeddings. numpy: For numerical operations, including handling embeddings as vectors. python-dotenv: To load environment variables (e.g., API keys) from a .env file securely. Set Up Environment variables Navigate to https://platform.openai.com/settings/organization/api-keys Click on “Create new secret key” as show in the image below. Provide details, You can use a Service Account. Provide a name for the “Service account ID” and select a project. Copy the secret key to clipboard Create a .env file in your project directory. Add your OpenAI API key to this file. OPENAI_API_KEY=your_openai_api_key_here This file keeps your API key secure and separated from the code. Initialize client and load environment variables load_dotenv() loads the .env file, and os.getenv("OPENAI_API_KEY") retrieves the API key. This setup keeps your API key secure. import os from openai import OpenAI from dotenv import load_dotenv import faiss import numpy as np # Load environment variables load_dotenv() client = OpenAI(api_key=os.getenv("OPENAI_API_KEY")) Define Grounding Data/Knowledge base This dictionary contains the grounding information for topics. In reality, this could be a larger dataset or a database. # Grounding data grounding_data = { "Python": "Python is dynamically typed, which can be a double-edged sword. While it makes coding faster and more flexible, it can lead to runtime errors that might have been caught at compile-time in statically-typed languages.", "LLMs": "Large Language Models (LLMs) are neural networks trained on large text datasets.", "Data Science": "Data Science involves using algorithms, data analysis, and machine learning to understand and interpret data.", "Java": "Java is great, it powers most of the machine learning code, and has a rich set of libraries available." } Generate Text Embeddings A function to generate embeddings for a given text using OpenAI’s embedding model. This function calls the OpenAI API to get the embedding for a text input, which is then returned as a NumPy array # Function to generate embedding for a text def get_embedding(text): response = client.embeddings.create( model="text-embedding-ada-002", input=text ) return np.array(response.data[0].embedding) FAISS Index and embeddings for Grounding Data Create a FAISS index, a structure optimized for fast similarity searches, and populate it with embeddings of the grounding data. # Create FAISS index and populate it with grounding data embeddings dimension = len(get_embedding("test")) # Dimension of embeddings index = faiss.IndexFlatL2(dimension) # L2 distance index for similarity search grounding_embeddings = [] grounding_keys = list(grounding_data.keys()) for key, text in grounding_data.items(): embedding = get_embedding(text) grounding_embeddings.append(embedding) index.add(np.array([embedding]).astype("float32")) dimension: The size of each embedding, needed to initialize the FAISS index. index = faiss.IndexFlatL2(dimension): Creates a FAISS index that uses Euclidean distance (L2) for similarity. For each entry in grounding_data, this code generates an embedding and adds it to the FAISS index. Vector search function Function searches the FAISS index for the most similar grounding data entry to a query. # Function to perform vector search on FAISS def vector_search(query_text, threshold=0.8): query_embedding = get_embedding(query_text).astype("float32").reshape(1, -1) D, I = index.search(query_embedding, 1) # Search for the closest vector if I[0][0] != -1 and D[0][0] <= threshold: return grounding_data[grounding_keys[I[0][0]]] else: return None # No similar grounding information available Query Embedding: Converts the query text into an embedding vector. FAISS Search: Searches the index for the closest vector to the query. Threshold Check: If the closest vector’s distance (D) is below the threshold, it returns the grounding information. Otherwise, it indicates no reliable grounding was found. Query the LLM We query the LLM using the OpenAI’s chatgpt api and gpt-4 model. # Query the LLM def query_llm(prompt): response = client.chat.completions.create( model="gpt-4", messages=[ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": prompt} ] ) return response.choices[0].message.content Enhanced response Appends grounding information if it’s available, or Adds a disclaimer if no relevant grounding information is found. def enhance_response(topic, llm_response): grounding_info = vector_search(llm_response) if grounding_info: # Check if the LLM's response aligns well with grounding information return f"{llm_response}\n\n(Verified Information: {grounding_info})" else: # Add a disclaimer when no grounding data is available return f"{llm_response}\n\n(Disclaimer: This information could not be verified against known data and may contain inaccuracies.)" Define the main function The main function combines everything, allowing you to input a topic, query the LLM, and check if the response aligns with the grounding data. # Main function to execute the grounding check def main(): topic = input("Enter a topic: ") llm_response = query_llm(f"What can you tell me about {topic}?") grounding_info = vector_search(llm_response, threshold=0.8) print(f"LLM Response: {llm_response}") print(f"Grounding Information: {grounding_info}") if grounding_info != "No grounding information available": print("Response is grounded and reliable.") else: print("Potential hallucination detected. Using grounded information instead.") print(f"Grounded Answer: {grounding_info}") if __name__ == "__main__": main() Outcome Execute the script Invoke this snippet using python groundin_llm.py The response: Explanation If you notice the response, although the response from LLM was “One of the most recommended programming languages for machine learning…”, the grounded response was “Java is great, it power most of the Machine learning code, it has a rich set of libraries available”. This is possible using Meta’s FAISS library for vector search based on similarity. Process: First retrieve the LLMs response. Check if a our knowledge base has any relevant information using vector search. If exists return the response from "the knowledge base” If not return the LLM response as is. Here is the code: https://github.com/sundeep110/groundingLLMs Happy Grounding!! What? If you prompt an LLM “suggest a great programming language for machine learning” LLMs response would be: “One of the most recommended programming languages for machine learning is Python. Python is a high-level…” What if you want your organization to provide a verified organization specific information i.e. enhance the response with authentic organization information? Let us make it happen when interacting with LLM Why? Popular LLMs like OpenAI’s chatGPT, Google’s Gemini are trained on publicly available data. They often lack organization specific information. There are certain times where organizations would like to rely on LLMs. However, would like to enhance the response specific to a particular organization or add disclaimers when no grounding data is available. add disclaimers The process of doing this is known as Grounding of LLM’s response using Knowledge bases. How? While, I can just talk about it. As an engineer looking at some code snippets gives me confidence. Executing them elevates my confidence and also gives happiness. Sharing gives me satisfaction 😄 Code? Why not! → Python? Of course!! Install required libraries pip install openai faiss-cpu numpy python-dotenv Install required libraries pip install openai faiss-cpu numpy python-dotenv Install required libraries Install required libraries pip install openai faiss-cpu numpy python-dotenv pip install openai faiss-cpu numpy python-dotenv openai: To interact with OpenAI’s GPT models and embeddings. faiss-cpu: A library by Facebook AI for efficient similarity search, used to store and search embeddings. numpy: For numerical operations, including handling embeddings as vectors. python-dotenv: To load environment variables (e.g., API keys) from a .env file securely. openai : To interact with OpenAI’s GPT models and embeddings. openai faiss-cpu : A library by Facebook AI for efficient similarity search, used to store and search embeddings. faiss-cpu numpy : For numerical operations, including handling embeddings as vectors. numpy python-dotenv : To load environment variables (e.g., API keys) from a .env file securely. python-dotenv .env Set Up Environment variables Navigate to https://platform.openai.com/settings/organization/api-keys Click on “Create new secret key” as show in the image below. Provide details, You can use a Service Account. Provide a name for the “Service account ID” and select a project. Copy the secret key to clipboard Create a .env file in your project directory. Add your OpenAI API key to this file. OPENAI_API_KEY=your_openai_api_key_here This file keeps your API key secure and separated from the code. Initialize client and load environment variables load_dotenv() loads the .env file, and os.getenv("OPENAI_API_KEY") retrieves the API key. This setup keeps your API key secure. Set Up Environment variables Navigate to https://platform.openai.com/settings/organization/api-keys Click on “Create new secret key” as show in the image below. Provide details, You can use a Service Account. Provide a name for the “Service account ID” and select a project. Copy the secret key to clipboard Create a .env file in your project directory. Add your OpenAI API key to this file. OPENAI_API_KEY=your_openai_api_key_here This file keeps your API key secure and separated from the code. Set Up Environment variables Set Up Environment variables Navigate to https://platform.openai.com/settings/organization/api-keys Click on “Create new secret key” as show in the image below. Provide details, You can use a Service Account. Provide a name for the “Service account ID” and select a project. Copy the secret key to clipboard Create a .env file in your project directory. Add your OpenAI API key to this file. Navigate to https://platform.openai.com/settings/organization/api-keys https://platform.openai.com/settings/organization/api-keys Click on “Create new secret key” as show in the image below. Provide details, You can use a Service Account. Provide a name for the “Service account ID” and select a project. Copy the secret key to clipboard Create a .env file in your project directory. Add your OpenAI API key to this file. .env OPENAI_API_KEY=your_openai_api_key_here OPENAI_API_KEY=your_openai_api_key_here This file keeps your API key secure and separated from the code. This file keeps your API key secure and separated from the code. This file keeps your API key secure and separated from the code. Initialize client and load environment variables load_dotenv() loads the .env file, and os.getenv("OPENAI_API_KEY") retrieves the API key. This setup keeps your API key secure. Initialize client and load environment variables Initialize client and load environment variables load_dotenv() loads the .env file, and os.getenv("OPENAI_API_KEY") retrieves the API key. This setup keeps your API key secure. load_dotenv() loads the .env file, and os.getenv("OPENAI_API_KEY") retrieves the API key. This setup keeps your API key secure. load_dotenv() .env os.getenv("OPENAI_API_KEY") import os from openai import OpenAI from dotenv import load_dotenv import faiss import numpy as np # Load environment variables load_dotenv() client = OpenAI(api_key=os.getenv("OPENAI_API_KEY")) import os from openai import OpenAI from dotenv import load_dotenv import faiss import numpy as np # Load environment variables load_dotenv() client = OpenAI(api_key=os.getenv("OPENAI_API_KEY")) Define Grounding Data/Knowledge base This dictionary contains the grounding information for topics. In reality, this could be a larger dataset or a database. Define Grounding Data/Knowledge base This dictionary contains the grounding information for topics. In reality, this could be a larger dataset or a database. Define Grounding Data/Knowledge base This dictionary contains the grounding information for topics. In reality, this could be a larger dataset or a database. This dictionary contains the grounding information for topics. In reality, this could be a larger dataset or a database. # Grounding data grounding_data = { "Python": "Python is dynamically typed, which can be a double-edged sword. While it makes coding faster and more flexible, it can lead to runtime errors that might have been caught at compile-time in statically-typed languages.", "LLMs": "Large Language Models (LLMs) are neural networks trained on large text datasets.", "Data Science": "Data Science involves using algorithms, data analysis, and machine learning to understand and interpret data.", "Java": "Java is great, it powers most of the machine learning code, and has a rich set of libraries available." } # Grounding data grounding_data = { "Python": "Python is dynamically typed, which can be a double-edged sword. While it makes coding faster and more flexible, it can lead to runtime errors that might have been caught at compile-time in statically-typed languages.", "LLMs": "Large Language Models (LLMs) are neural networks trained on large text datasets.", "Data Science": "Data Science involves using algorithms, data analysis, and machine learning to understand and interpret data.", "Java": "Java is great, it powers most of the machine learning code, and has a rich set of libraries available." } Generate Text Embeddings A function to generate embeddings for a given text using OpenAI’s embedding model. This function calls the OpenAI API to get the embedding for a text input, which is then returned as a NumPy array # Function to generate embedding for a text def get_embedding(text): response = client.embeddings.create( model="text-embedding-ada-002", input=text ) return np.array(response.data[0].embedding) FAISS Index and embeddings for Grounding Data Create a FAISS index, a structure optimized for fast similarity searches, and populate it with embeddings of the grounding data. # Create FAISS index and populate it with grounding data embeddings dimension = len(get_embedding("test")) # Dimension of embeddings index = faiss.IndexFlatL2(dimension) # L2 distance index for similarity search grounding_embeddings = [] grounding_keys = list(grounding_data.keys()) for key, text in grounding_data.items(): embedding = get_embedding(text) grounding_embeddings.append(embedding) index.add(np.array([embedding]).astype("float32")) dimension: The size of each embedding, needed to initialize the FAISS index. index = faiss.IndexFlatL2(dimension): Creates a FAISS index that uses Euclidean distance (L2) for similarity. For each entry in grounding_data, this code generates an embedding and adds it to the FAISS index. Vector search function Function searches the FAISS index for the most similar grounding data entry to a query. Generate Text Embeddings A function to generate embeddings for a given text using OpenAI’s embedding model. This function calls the OpenAI API to get the embedding for a text input, which is then returned as a NumPy array # Function to generate embedding for a text def get_embedding(text): response = client.embeddings.create( model="text-embedding-ada-002", input=text ) return np.array(response.data[0].embedding) Generate Text Embeddings Generate Text Embeddings A function to generate embeddings for a given text using OpenAI’s embedding model. This function calls the OpenAI API to get the embedding for a text input, which is then returned as a NumPy array A function to generate embeddings for a given text using OpenAI’s embedding model. This function calls the OpenAI API to get the embedding for a text input, which is then returned as a NumPy array # Function to generate embedding for a text def get_embedding(text): response = client.embeddings.create( model="text-embedding-ada-002", input=text ) return np.array(response.data[0].embedding) # Function to generate embedding for a text def get_embedding(text): response = client.embeddings.create( model="text-embedding-ada-002", input=text ) return np.array(response.data[0].embedding) FAISS Index and embeddings for Grounding Data Create a FAISS index, a structure optimized for fast similarity searches, and populate it with embeddings of the grounding data. # Create FAISS index and populate it with grounding data embeddings dimension = len(get_embedding("test")) # Dimension of embeddings index = faiss.IndexFlatL2(dimension) # L2 distance index for similarity search grounding_embeddings = [] grounding_keys = list(grounding_data.keys()) for key, text in grounding_data.items(): embedding = get_embedding(text) grounding_embeddings.append(embedding) index.add(np.array([embedding]).astype("float32")) FAISS Index and embeddings for Grounding Data FAISS Index and embeddings for Grounding Data Create a FAISS index, a structure optimized for fast similarity searches, and populate it with embeddings of the grounding data. Create a FAISS index, a structure optimized for fast similarity searches, and populate it with embeddings of the grounding data. # Create FAISS index and populate it with grounding data embeddings dimension = len(get_embedding("test")) # Dimension of embeddings index = faiss.IndexFlatL2(dimension) # L2 distance index for similarity search grounding_embeddings = [] grounding_keys = list(grounding_data.keys()) for key, text in grounding_data.items(): embedding = get_embedding(text) grounding_embeddings.append(embedding) index.add(np.array([embedding]).astype("float32")) # Create FAISS index and populate it with grounding data embeddings dimension = len(get_embedding("test")) # Dimension of embeddings index = faiss.IndexFlatL2(dimension) # L2 distance index for similarity search grounding_embeddings = [] grounding_keys = list(grounding_data.keys()) for key, text in grounding_data.items(): embedding = get_embedding(text) grounding_embeddings.append(embedding) index.add(np.array([embedding]).astype("float32")) dimension: The size of each embedding, needed to initialize the FAISS index. index = faiss.IndexFlatL2(dimension): Creates a FAISS index that uses Euclidean distance (L2) for similarity. For each entry in grounding_data, this code generates an embedding and adds it to the FAISS index. dimension: The size of each embedding, needed to initialize the FAISS index. index = faiss.IndexFlatL2(dimension): Creates a FAISS index that uses Euclidean distance (L2) for similarity. For each entry in grounding_data, this code generates an embedding and adds it to the FAISS index. dimension : The size of each embedding, needed to initialize the FAISS index. dimension index = faiss.IndexFlatL2(dimension) : Creates a FAISS index that uses Euclidean distance (L2) for similarity. index = faiss.IndexFlatL2(dimension) For each entry in grounding_data , this code generates an embedding and adds it to the FAISS index. grounding_data Vector search function Function searches the FAISS index for the most similar grounding data entry to a query. Vector search function Vector search function Function searches the FAISS index for the most similar grounding data entry to a query. Function searches the FAISS index for the most similar grounding data entry to a query. # Function to perform vector search on FAISS def vector_search(query_text, threshold=0.8): query_embedding = get_embedding(query_text).astype("float32").reshape(1, -1) D, I = index.search(query_embedding, 1) # Search for the closest vector if I[0][0] != -1 and D[0][0] <= threshold: return grounding_data[grounding_keys[I[0][0]]] else: return None # No similar grounding information available # Function to perform vector search on FAISS def vector_search(query_text, threshold=0.8): query_embedding = get_embedding(query_text).astype("float32").reshape(1, -1) D, I = index.search(query_embedding, 1) # Search for the closest vector if I[0][0] != -1 and D[0][0] <= threshold: return grounding_data[grounding_keys[I[0][0]]] else: return None # No similar grounding information available Query Embedding: Converts the query text into an embedding vector. FAISS Search: Searches the index for the closest vector to the query. Threshold Check: If the closest vector’s distance (D) is below the threshold, it returns the grounding information. Otherwise, it indicates no reliable grounding was found. Query Embedding : Converts the query text into an embedding vector. Query Embedding FAISS Search : Searches the index for the closest vector to the query. FAISS Search Threshold Check : If the closest vector’s distance (D) is below the threshold, it returns the grounding information. Otherwise, it indicates no reliable grounding was found. Threshold Check Query the LLM We query the LLM using the OpenAI’s chatgpt api and gpt-4 model. # Query the LLM def query_llm(prompt): response = client.chat.completions.create( model="gpt-4", messages=[ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": prompt} ] ) return response.choices[0].message.content Query the LLM We query the LLM using the OpenAI’s chatgpt api and gpt-4 model. # Query the LLM def query_llm(prompt): response = client.chat.completions.create( model="gpt-4", messages=[ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": prompt} ] ) return response.choices[0].message.content Query the LLM Query the LLM We query the LLM using the OpenAI’s chatgpt api and gpt-4 model. # Query the LLM def query_llm(prompt): response = client.chat.completions.create( model="gpt-4", messages=[ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": prompt} ] ) return response.choices[0].message.content # Query the LLM def query_llm(prompt): response = client.chat.completions.create( model="gpt-4", messages=[ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": prompt} ] ) return response.choices[0].message.content Enhanced response Appends grounding information if it’s available, or Adds a disclaimer if no relevant grounding information is found. def enhance_response(topic, llm_response): grounding_info = vector_search(llm_response) if grounding_info: # Check if the LLM's response aligns well with grounding information return f"{llm_response}\n\n(Verified Information: {grounding_info})" else: # Add a disclaimer when no grounding data is available return f"{llm_response}\n\n(Disclaimer: This information could not be verified against known data and may contain inaccuracies.)" Define the main function The main function combines everything, allowing you to input a topic, query the LLM, and check if the response aligns with the grounding data. # Main function to execute the grounding check def main(): topic = input("Enter a topic: ") llm_response = query_llm(f"What can you tell me about {topic}?") grounding_info = vector_search(llm_response, threshold=0.8) print(f"LLM Response: {llm_response}") print(f"Grounding Information: {grounding_info}") if grounding_info != "No grounding information available": print("Response is grounded and reliable.") else: print("Potential hallucination detected. Using grounded information instead.") print(f"Grounded Answer: {grounding_info}") if __name__ == "__main__": main() Enhanced response Appends grounding information if it’s available, or Adds a disclaimer if no relevant grounding information is found. def enhance_response(topic, llm_response): grounding_info = vector_search(llm_response) if grounding_info: # Check if the LLM's response aligns well with grounding information return f"{llm_response}\n\n(Verified Information: {grounding_info})" else: # Add a disclaimer when no grounding data is available return f"{llm_response}\n\n(Disclaimer: This information could not be verified against known data and may contain inaccuracies.)" Enhanced response Enhanced response Appends grounding information if it’s available, or Adds a disclaimer if no relevant grounding information is found. Appends grounding information if it’s available, or Adds a disclaimer if no relevant grounding information is found. def enhance_response(topic, llm_response): grounding_info = vector_search(llm_response) if grounding_info: # Check if the LLM's response aligns well with grounding information return f"{llm_response}\n\n(Verified Information: {grounding_info})" else: # Add a disclaimer when no grounding data is available return f"{llm_response}\n\n(Disclaimer: This information could not be verified against known data and may contain inaccuracies.)" def enhance_response(topic, llm_response): grounding_info = vector_search(llm_response) if grounding_info: # Check if the LLM's response aligns well with grounding information return f"{llm_response}\n\n(Verified Information: {grounding_info})" else: # Add a disclaimer when no grounding data is available return f"{llm_response}\n\n(Disclaimer: This information could not be verified against known data and may contain inaccuracies.)" Define the main function The main function combines everything, allowing you to input a topic, query the LLM, and check if the response aligns with the grounding data. # Main function to execute the grounding check def main(): topic = input("Enter a topic: ") llm_response = query_llm(f"What can you tell me about {topic}?") grounding_info = vector_search(llm_response, threshold=0.8) print(f"LLM Response: {llm_response}") print(f"Grounding Information: {grounding_info}") if grounding_info != "No grounding information available": print("Response is grounded and reliable.") else: print("Potential hallucination detected. Using grounded information instead.") print(f"Grounded Answer: {grounding_info}") if __name__ == "__main__": main() Define the main function Define the main function The main function combines everything, allowing you to input a topic, query the LLM, and check if the response aligns with the grounding data. # Main function to execute the grounding check def main(): topic = input("Enter a topic: ") llm_response = query_llm(f"What can you tell me about {topic}?") grounding_info = vector_search(llm_response, threshold=0.8) print(f"LLM Response: {llm_response}") print(f"Grounding Information: {grounding_info}") if grounding_info != "No grounding information available": print("Response is grounded and reliable.") else: print("Potential hallucination detected. Using grounded information instead.") print(f"Grounded Answer: {grounding_info}") if __name__ == "__main__": main() # Main function to execute the grounding check def main(): topic = input("Enter a topic: ") llm_response = query_llm(f"What can you tell me about {topic}?") grounding_info = vector_search(llm_response, threshold=0.8) print(f"LLM Response: {llm_response}") print(f"Grounding Information: {grounding_info}") if grounding_info != "No grounding information available": print("Response is grounded and reliable.") else: print("Potential hallucination detected. Using grounded information instead.") print(f"Grounded Answer: {grounding_info}") if __name__ == "__main__": main() Outcome Execute the script Invoke this snippet using python groundin_llm.py python groundin_llm.py The response: Explanation If you notice the response, although the response from LLM was “One of the most recommended programming languages for machine learning…”, the grounded response was “Java is great, it power most of the Machine learning code, it has a rich set of libraries available”. This is possible using Meta’s FAISS library for vector search based on similarity. Process : Process First retrieve the LLMs response. Check if a our knowledge base has any relevant information using vector search. If exists return the response from "the knowledge base” If not return the LLM response as is. First retrieve the LLMs response. Check if a our knowledge base has any relevant information using vector search. If exists return the response from "the knowledge base” If not return the LLM response as is. Here is the code: https://github.com/sundeep110/groundingLLMs Here is the code: https://github.com/sundeep110/groundingLLMs https://github.com/sundeep110/groundingLLMs Happy Grounding!! Happy Grounding!!