If you ask an LLM to "suggest a good programming language for machine learning",
the LLM's answer will be: "One of the most recommended languages for machine learning is Python. Python is a high-level…"
What if you want the answer to carry verified, organization-specific information, i.e., to enhance the response with authentic information from your organization?
Let's make that happen while interacting with an LLM
Popular LLMs such as OpenAI's ChatGPT and Google's Gemini are trained on publicly available data, so they often lack organization-specific information. There are times when organizations would like to rely on LLMs anyway. In those cases, they would want to enhance the response with information specific to the organization, or add a disclaimer when no grounding data is available.
The process of doing this is known as grounding an LLM's responses using knowledge bases.
I could just talk about it.
But as an engineer, looking at code snippets gives me confidence.
Executing them raises that confidence and also makes me happy. Sharing gives me satisfaction 😄
Install the required libraries
pip install openai faiss-cpu numpy python-dotenv
openai: To interact with OpenAI's GPT models and embeddings.
faiss-cpu: A library from Facebook AI for efficient similarity search, used to store and search embeddings.
numpy: For numerical operations, including handling embeddings as vectors.
python-dotenv: To load environment variables (for example, API keys) from a .env file securely.
Set up environment variables
Create a .env file in your project directory and add your OpenAI API key to it:
OPENAI_API_KEY=your_openai_api_key_here
This file keeps your API key secure and separate from the code.
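A small sketch of a fail-fast check, assuming you want the script to stop with a clear message when the key is missing. `require_env` is a hypothetical helper name, not part of python-dotenv or the OpenAI SDK. It only reads `os.environ`, so it should be called after `load_dotenv()` has run.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; it is not part of any library.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value
```

With this in place, `api_key = require_env("OPENAI_API_KEY")` either returns the key or fails before any API call is attempted.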
Initialize the client and load environment variables
load_dotenv() loads the .env file, and os.getenv("OPENAI_API_KEY") retrieves the API key, so the key never appears in the source code.
import os

import faiss
import numpy as np
from dotenv import load_dotenv
from openai import OpenAI

# Load environment variables from the .env file
load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
# Grounding data
grounding_data = {
    "Python": "Python is dynamically typed, which can be a double-edged sword. While it makes coding faster and more flexible, it can lead to runtime errors that might have been caught at compile-time in statically-typed languages.",
    "LLMs": "Large Language Models (LLMs) are neural networks trained on large text datasets.",
    "Data Science": "Data Science involves using algorithms, data analysis, and machine learning to understand and interpret data.",
    "Java": "Java is great, it powers most of the machine learning code, and has a rich set of libraries available.",
}
Generate text embeddings
# Function to generate embedding for a text
def get_embedding(text):
    response = client.embeddings.create(
        model="text-embedding-ada-002",
        input=text,
    )
    return np.array(response.data[0].embedding)
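The embeddings endpoint also accepts a list of strings, so several texts can be embedded in one request instead of one call per text. The sketch below assumes an OpenAI-style client is passed in as an argument; `get_embeddings_batch` is a hypothetical helper name, and the results come back as plain lists in input order.

```python
from typing import List, Sequence


def get_embeddings_batch(
    client,
    texts: Sequence[str],
    model: str = "text-embedding-ada-002",
) -> List[List[float]]:
    """Embed several texts with a single embeddings request.

    `client` is assumed to expose `embeddings.create(model=..., input=...)`,
    as the OpenAI Python client used in this post does.
    """
    response = client.embeddings.create(model=model, input=list(texts))
    # Each item carries an `index` that points back to its position in `input`.
    ordered = sorted(response.data, key=lambda item: item.index)
    return [list(item.embedding) for item in ordered]
```

Passing the client explicitly also makes the function easy to exercise with a stub object, without network access.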
FAISS index and embeddings for the grounding data
# Create FAISS index and populate it with grounding data embeddings
dimension = len(get_embedding("test"))  # Dimension of embeddings
index = faiss.IndexFlatL2(dimension)  # L2 distance index for similarity search
grounding_embeddings = []
grounding_keys = list(grounding_data.keys())  # Row i of the index maps to grounding_keys[i]

for key, text in grounding_data.items():
    embedding = get_embedding(text)
    grounding_embeddings.append(embedding)
    index.add(np.array([embedding]).astype("float32"))  # FAISS expects float32 rows
dimension: The size of each embedding, needed to initialize the FAISS index.
index = faiss.IndexFlatL2(dimension): Creates a FAISS index that uses Euclidean (L2) distance for similarity.
For each entry in grounding_data, this code generates an embedding and adds it to the FAISS index.
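To make the role of the index concrete, here is a dependency-free sketch of what a flat L2 index conceptually does: compare the query against every stored vector and return the closest one. The three-dimensional vectors and the `nearest_l2` helper are made up for illustration. FAISS does the same job far more efficiently, and like this sketch, `IndexFlatL2` reports squared distances.

```python
from typing import List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_l2(
    query: Sequence[float], vectors: List[Sequence[float]]
) -> Tuple[int, float]:
    """Return (position, squared distance) of the stored vector closest to `query`.

    Returns (-1, inf) for an empty collection, mirroring how FAISS uses -1
    as the label for "no result".
    """
    best_pos, best_dist = -1, float("inf")
    for pos, vec in enumerate(vectors):
        dist = squared_l2(query, vec)
        if dist < best_dist:
            best_pos, best_dist = pos, dist
    return best_pos, best_dist


# Toy 3-dimensional "embeddings" (illustrative values only).
stored = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
position, distance = nearest_l2([0.0, 1.0, 0.0], stored)
print(position, distance)
```

The returned position plays the same role as the index into `grounding_keys` in the FAISS version.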
Vector search function
# Function to perform vector search on FAISS
def vector_search(query_text, threshold=0.8):
    query_embedding = get_embedding(query_text).astype("float32").reshape(1, -1)
    D, I = index.search(query_embedding, 1)  # Search for the closest vector
    if I[0][0] != -1 and D[0][0] <= threshold:
        return grounding_data[grounding_keys[I[0][0]]]
    else:
        return None  # No similar grounding information available
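A note on reading `threshold=0.8`: `IndexFlatL2` returns squared L2 distances, and for unit-length vectors the squared distance d² and the cosine similarity are related by d² = 2 − 2·cos. The sketch below assumes the embeddings are normalized to length 1, which is worth verifying for the model you use. Under that assumption, a threshold of 0.8 corresponds to a cosine similarity of at least 0.6. The helper names are hypothetical.

```python
import math


def cosine_from_squared_l2(d2: float) -> float:
    """Cosine similarity implied by a squared L2 distance between unit vectors."""
    return 1.0 - d2 / 2.0


def squared_l2_from_cosine(cos_sim: float) -> float:
    """Squared L2 distance implied by a cosine similarity between unit vectors."""
    return 2.0 - 2.0 * cos_sim


# Sanity check on two explicit unit vectors separated by 60 degrees.
a = (1.0, 0.0)
b = (math.cos(math.pi / 3), math.sin(math.pi / 3))
d2 = sum((x - y) ** 2 for x, y in zip(a, b))
cos_ab = sum(x * y for x, y in zip(a, b))
assert math.isclose(cosine_from_squared_l2(d2), cos_ab)

print(round(cosine_from_squared_l2(0.8), 3))
```

Thinking in cosine terms can make it easier to pick a threshold: a stricter match means a lower distance threshold.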
Query Embedding: Converts the query text into an embedding vector.
FAISS Search: Searches the index for the vector closest to the query.
Threshold Check: If the distance to the closest vector (D) is at or below the threshold, the function returns the grounding information. Otherwise, it returns None to indicate that no reliable grounding was found.
Query the LLM
We query the LLM using OpenAI's chat completions API with the gpt-4 model.
# Query the LLM
def query_llm(prompt):
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content
Enhanced response
def enhance_response(topic, llm_response):
    grounding_info = vector_search(llm_response)
    if grounding_info:
        # Grounding data close to the LLM's response was found: attach it
        return f"{llm_response}\n\n(Verified Information: {grounding_info})"
    else:
        # Add a disclaimer when no grounding data is available
        return f"{llm_response}\n\n(Disclaimer: This information could not be verified against known data and may contain inaccuracies.)"
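To try this decision logic without calling the API, the search step can be passed in as a parameter. This is a sketch under that assumption: `annotate_response` and `fake_search` are hypothetical names, and the stub matches on a keyword and returns canned text rather than doing a real vector search.

```python
from typing import Callable, Optional


def annotate_response(
    llm_response: str,
    search: Callable[[str], Optional[str]],
) -> str:
    """Append verified information or a disclaimer, depending on `search`."""
    grounding_info = search(llm_response)
    if grounding_info is not None:
        return f"{llm_response}\n\n(Verified Information: {grounding_info})"
    return (
        f"{llm_response}\n\n(Disclaimer: This information could not be verified "
        "against known data and may contain inaccuracies.)"
    )


def fake_search(text: str) -> Optional[str]:
    """Stand-in for vector_search: matches on a keyword instead of embeddings."""
    return "Java is great." if "java" in text.lower() else None


print(annotate_response("Java is widely used.", fake_search))
print(annotate_response("Rust is memory safe.", fake_search))
```

The first call takes the verified branch and the second takes the disclaimer branch, which covers both paths of the function above.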
Define the main function
The main function ties everything together: it lets you enter a topic, queries the LLM, and checks whether the response aligns with the grounding data.
# Main function to execute the grounding check
def main():
    topic = input("Enter a topic: ")
    llm_response = query_llm(f"What can you tell me about {topic}?")
    grounding_info = vector_search(llm_response, threshold=0.8)

    print(f"LLM Response: {llm_response}")
    print(f"Grounding Information: {grounding_info}")

    # vector_search returns None when nothing in the index is close enough
    if grounding_info is not None:
        print("Response is grounded and reliable.")
        print(f"Grounded Answer: {grounding_info}")
    else:
        print("Potential hallucination detected. No grounding information is available.")


if __name__ == "__main__":
    main()
Run this snippet with
python groundin_llm.py
Response:
If you look at the response, although the LLM's answer was "One of the most recommended programming languages for machine learning...", the grounded answer was "Java is great, it powers most of the machine learning code, and has a rich set of libraries available".
This is possible thanks to Meta's FAISS library, which performs vector search based on similarity.
Process:
Here is the code: https://github.com/sundeep110/groundingLLMs
Happy Grounding!!