
Overcoming LLM Hallucinations with Knowledge Bases

by Sundeep Goud Katta | 7 min read | 2024/11/07

Too Long; Didn't Read

Hallucinations in LLMs can be prevented or reduced by enhancing the response with knowledge bases. A knowledge base can be any of the organization's data. The LLM response is grounded, and an enhanced response is served to the user.

What?

Suppose you prompt an LLM with "suggest a great programming language for machine learning".

The LLM's response would be something like: "One of the most recommended programming languages for machine learning is Python. Python is a high-level..."

What if you want the response to carry verified, organization-specific information, and to be enhanced with authentic data from your organization?

Let's make that happen while interacting with an LLM.

Why?

Popular LLMs such as OpenAI's ChatGPT and Google's Gemini are trained on publicly available data. They often lack organization-specific information. There are times when an organization wants to rely on LLMs anyway. In that case, it may want to enhance the response with its own information, or add a disclaimer when no grounding data is available.

This process is known as grounding the LLM's response using knowledge bases.

How?

I could just talk about it, but as an engineer, looking at code snippets gives me confidence. Running them raises that confidence further and makes me happy. Sharing them gives me satisfaction 😄

Code? Why not! → Python? Of course!!

  1. Install the required libraries

     pip install openai faiss-cpu numpy python-dotenv

    • openai : To interact with OpenAI's GPT models and embeddings.
    • faiss-cpu : A library from Facebook AI for efficient similarity search, used here to store and search embeddings.
    • numpy : For numerical operations, including handling embeddings as vectors.
    • python-dotenv : To load environment variables (for example, API keys) securely from a .env file.


  2. Set up environment variables

    • Go to https://platform.openai.com/settings/organization/api-keys
    • Click "Create new secret key".
    • Provide the details. You can use a Service Account: give a name for the "Service account ID" and select a project.
    • Copy the secret key to the clipboard.
    • Create a .env file in your project directory and add your OpenAI API key to it.

     OPENAI_API_KEY=your_openai_api_key_here

    • This file keeps your API key secure and separate from the code.
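A minimal sketch of a fail-fast check for the key, assuming you would rather stop with a clear message than send an empty key to the API. The helper name `require_env` is hypothetical and not part of the original code; it only wraps the standard `os.environ` lookup.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; not part of the article's script.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in the shell."
        )
    return value
```

Once the .env file has been loaded into the environment, `require_env("OPENAI_API_KEY")` could stand in for a bare `os.getenv` call.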


  3. Initialize the client and load environment variables

    • load_dotenv() loads the .env file, and os.getenv("OPENAI_API_KEY") retrieves the API key. This setup keeps your API key out of the source code.

     import os

     import faiss
     import numpy as np
     from dotenv import load_dotenv
     from openai import OpenAI

     # Load environment variables from the .env file
     load_dotenv()
     client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))


  4. Define the grounding data / knowledge base

    • This dictionary holds the grounding information for each topic. In practice, it could be a larger dataset or a database.

     # Grounding data
     grounding_data = {
         "Python": "Python is dynamically typed, which can be a double-edged sword. While it makes coding faster and more flexible, it can lead to runtime errors that might have been caught at compile-time in statically-typed languages.",
         "LLMs": "Large Language Models (LLMs) are neural networks trained on large text datasets.",
         "Data Science": "Data Science involves using algorithms, data analysis, and machine learning to understand and interpret data.",
         "Java": "Java is great, it powers most of the machine learning code, and has a rich set of libraries available."
     }
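Since the grounding data could live outside the script, here is a rough sketch of loading it from a JSON file. The function name `load_grounding_data` and the file layout (a single JSON object mapping topic names to text) are assumptions made for illustration, not something the original code defines.

```python
import json
from pathlib import Path


def load_grounding_data(path):
    """Load a {topic: text} mapping from a JSON file (assumed layout)."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object mapping topics to text")
    # Keep only non-empty string entries so every value can be embedded later
    return {
        str(topic): text
        for topic, text in data.items()
        if isinstance(text, str) and text.strip()
    }
```

The returned dictionary has the same shape as the in-code `grounding_data`, so the rest of the script would not need to change.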


  5. Generate text embeddings

    • A function that generates the embedding for a given text using OpenAI's embedding model. It calls the OpenAI API to get the embedding for the input text and returns it as a NumPy array.

     # Function to generate an embedding for a text
     def get_embedding(text):
         response = client.embeddings.create(
             model="text-embedding-ada-002",
             input=text
         )
         return np.array(response.data[0].embedding)
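When there are many grounding entries, one request per text can get slow. The sketch below embeds several texts in a single call, relying on the embeddings endpoint accepting a list of strings as `input`. The function name `get_embeddings_batch` is hypothetical, and the client is passed in as a parameter so the function can be exercised with a stand-in object.

```python
def get_embeddings_batch(client, texts, model="text-embedding-ada-002"):
    """Embed several texts in one request and return them in input order.

    Hypothetical helper; `client` is expected to behave like openai.OpenAI().
    """
    response = client.embeddings.create(model=model, input=list(texts))
    # Each returned item carries the index of the input it belongs to
    ordered = sorted(response.data, key=lambda item: item.index)
    return [item.embedding for item in ordered]
```

Each returned list can be turned into a NumPy array with `np.array(...)` before it is added to the index.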


  6. Build the FAISS index and embeddings for the grounding data

    • Create a FAISS index, a structure optimized for fast similarity searches, and populate it with the embeddings of the grounding data.

     # Create FAISS index and populate it with grounding data embeddings
     dimension = len(get_embedding("test"))  # Dimension of embeddings
     index = faiss.IndexFlatL2(dimension)  # L2 distance index for similarity search
     grounding_embeddings = []
     grounding_keys = list(grounding_data.keys())

     for key, text in grounding_data.items():
         embedding = get_embedding(text)
         grounding_embeddings.append(embedding)
         index.add(np.array([embedding]).astype("float32"))

    • dimension : The size of each embedding, needed to initialize the FAISS index.
    • index = faiss.IndexFlatL2(dimension) : Creates a FAISS index that uses Euclidean (L2) distance to measure similarity.
    • For each entry in grounding_data, this code generates an embedding and adds it to the FAISS index.
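To make the idea of a flat L2 index concrete, here is a plain-Python stand-in (not FAISS itself) that compares a query against every stored vector and keeps the closest one. The names `squared_l2` and `flat_search` are hypothetical. The sketch reports squared distances, which is also what FAISS's IndexFlatL2 returns, and uses -1 for "nothing found" to match the check in the article's search code.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_search(vectors, query):
    """Return (distance, position) of the stored vector closest to `query`.

    Exhaustive comparison against every stored vector.
    Returns (None, -1) when nothing is stored.
    """
    if not vectors:
        return None, -1
    distances = [squared_l2(v, query) for v in vectors]
    best = min(range(len(vectors)), key=distances.__getitem__)
    return distances[best], best


# Toy 2-dimensional "embeddings"; real embeddings have many more dimensions
stored = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
distance, position = flat_search(stored, [0.9, 1.2])
print(position, distance)
```

An exhaustive scan like this is exact but scales linearly with the number of stored vectors, which is fine for a handful of grounding entries.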


  7. Perform the vector search

    • The function searches the FAISS index for the grounding entry most similar to a query.

     # Function to perform vector search on FAISS
     def vector_search(query_text, threshold=0.8):
         query_embedding = get_embedding(query_text).astype("float32").reshape(1, -1)
         # Search for the closest vector; IndexFlatL2 reports squared L2 distances
         D, I = index.search(query_embedding, 1)
         if I[0][0] != -1 and D[0][0] <= threshold:
             return grounding_data[grounding_keys[I[0][0]]]
         else:
             return None  # No similar grounding information available

    • Query Embedding : Converts the query text into an embedding vector.
    • FAISS Search : Searches the index for the vector closest to the query.
    • Threshold Check : If the distance (D) of the closest vector is at or below the threshold, the function returns the grounding information. Otherwise, it returns None to signal that no reliable grounding was found.
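Choosing the threshold is easier when it is read as a cosine similarity. For unit-length vectors, squared L2 distance and cosine similarity are linked by d^2 = 2 - 2*cos. The sketch below assumes the embeddings are normalized to length 1 and that the index reports squared distances; both are worth verifying for the model and FAISS version you use. The function names are hypothetical.

```python
import math


def cosine_from_squared_l2(d2):
    """Cosine similarity implied by a squared L2 distance between unit vectors."""
    return 1.0 - d2 / 2.0


def squared_l2_from_cosine(cos_sim):
    """Squared L2 distance implied by a cosine similarity between unit vectors."""
    return 2.0 - 2.0 * cos_sim


# Sanity check on two unit vectors 60 degrees apart (cosine = 0.5)
a = (1.0, 0.0)
b = (math.cos(math.pi / 3), math.sin(math.pi / 3))
d2 = sum((x - y) ** 2 for x, y in zip(a, b))
assert abs(cosine_from_squared_l2(d2) - 0.5) < 1e-9
```

Under those assumptions, a threshold of 0.8 accepts matches whose cosine similarity is roughly 0.6 or higher, so it may be worth tuning against real queries.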
  8. Query the LLM

    We query the LLM through OpenAI's chat completions API using the gpt-4 model.

     # Query the LLM
     def query_llm(prompt):
         response = client.chat.completions.create(
             model="gpt-4",
             messages=[
                 {"role": "system", "content": "You are a helpful assistant."},
                 {"role": "user", "content": prompt}
             ]
         )
         return response.choices[0].message.content


  9. Enhance the response

    • Appends the grounding information if it is available, or
    • Adds a disclaimer if no relevant grounding information is found.

     def enhance_response(topic, llm_response):
         grounding_info = vector_search(llm_response)

         if grounding_info:
             # Grounding information close to the LLM's response was found
             return f"{llm_response}\n\n(Verified Information: {grounding_info})"
         else:
             # Add a disclaimer when no grounding data is available
             return f"{llm_response}\n\n(Disclaimer: This information could not be verified against known data and may contain inaccuracies.)"
  10. Define the main function

    The main function ties everything together: it takes a topic as input, queries the LLM, and checks whether the response lines up with the grounding data.

     # Main function to execute the grounding check
     def main():
         topic = input("Enter a topic: ")
         llm_response = query_llm(f"What can you tell me about {topic}?")
         grounding_info = vector_search(llm_response, threshold=0.8)

         print(f"LLM Response: {llm_response}")
         print(f"Grounding Information: {grounding_info}")

         # vector_search returns None when no entry is within the threshold
         if grounding_info is not None:
             print("Grounding information found. Using grounded information instead.")
             print(f"Grounded Answer: {grounding_info}")
         else:
             print("No grounding information available. Returning the LLM response as is.")

     if __name__ == "__main__":
         main()

Result

Execute the script

Run this snippet using

 python groundin_llm.py


The response:

[Image: grounded, enhanced response]

Explanation

Notice that although the LLM's response was "One of the most recommended programming languages for machine learning...", the grounded response was "Java is great, it powers most of the machine learning code, and has a rich set of libraries available."


This is possible by relying on Meta's FAISS library for similarity-based vector search.

Process:

  1. First, retrieve the LLM's response.
  2. Check whether the knowledge base has any relevant information, using vector search.
  3. If it does, return the response from the knowledge base.
  4. If it does not, return the LLM's response as is.
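The four steps above can be sketched as a single function. The names `grounded_answer`, `ask_llm`, and `search_knowledge_base` are hypothetical; the two callables stand in for `query_llm` and `vector_search`, and passing them in as arguments lets the flow be tried without an API key.

```python
def grounded_answer(question, ask_llm, search_knowledge_base):
    """Return (answer, source) following the four-step process.

    ask_llm(question) -> str
    search_knowledge_base(text) -> str or None
    Both callables are hypothetical stand-ins for query_llm and vector_search.
    """
    llm_response = ask_llm(question)                      # Step 1: get the LLM's response
    grounding_info = search_knowledge_base(llm_response)  # Step 2: look for a close match
    if grounding_info is not None:
        return grounding_info, "knowledge base"           # Step 3: prefer grounded text
    return llm_response, "llm"                            # Step 4: fall back to the LLM


# Usage with stub callables in place of real API calls
answer, source = grounded_answer(
    "What can you tell me about Java?",
    ask_llm=lambda q: "Java is a general-purpose language.",
    search_knowledge_base=lambda text: None,
)
print(source, answer)
```

Returning the source alongside the answer is a design choice of this sketch; it lets the caller decide whether to show a disclaimer.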


Here is the code: https://github.com/sundeep110/groundingLLMs

Happy Grounding!!