Communities, chats, and forums are an endless source of information on many topics. Slack often replaces technical documentation, and Telegram and Discord communities help with questions about gaming, startups, crypto, and travel. Valuable as this first-hand information is, it is usually highly unstructured, which makes it hard to search. In this article, we will explore the challenges of implementing a Telegram bot that answers questions by extracting information from the chat's message history.

Here are the challenges that await us:

- Finding relevant messages. The answer may be scattered across a conversation between several people, or sit behind a link to an external resource.
- Ignoring off-topic. There is a lot of spam and off-topic chatter, which we need to learn to identify and filter out.
- Prioritization. Information becomes outdated. How do we know which answer is correct as of today?

The basic chatbot user flow we are going to implement:

1. The user asks the bot a question.
2. The bot finds the closest answers in the message history.
3. The bot summarizes the search results with the help of an LLM.
4. The bot returns the final answer to the user, with links to the relevant messages.

Let's walk through the main stages of this user flow and highlight the main challenges we will face.

Data preparation

To prepare the message history for search, we need to create embeddings of these messages: vectorized text representations. When dealing with a wiki article or a PDF document, we would split the text into paragraphs and compute a sentence embedding for each one. With chats, however, we need to account for peculiarities that are typical of conversations rather than of well-structured text:

- Consecutive short messages from a single user.
  In such cases, it makes sense to merge the messages into larger text blocks.
- Some messages are very long and cover several different topics.
- Meaningless messages and spam, which we should filter out.
- A user may reply without tagging the original message. A question and its answer can be separated in the chat history by many other messages.
- A user may answer with a link to an external resource (for example, an article or a document).

Next, we need to choose an embedding model. There are many different models for building embeddings, and several factors should be considered when choosing the right one:

- Embedding dimension. The higher it is, the more nuance the model can capture from the data. Search becomes more accurate, but requires more memory and compute.
- The dataset on which the embedding model was trained. This determines, for example, how well the model supports the language you need.

To improve the quality of search results, we can categorize messages by topic. For example, in a chat dedicated to frontend development, users may discuss topics such as CSS, tooling, React, Vue, etc. To classify messages by topic, you can use an LLM (more expensive) or classic topic-modeling methods from libraries such as BERTopic.

We will also need a vector database to store the embeddings and meta-information (links to the original posts, categories, dates). Many vector stores exist for this purpose, such as FAISS, Milvus, or Pinecone. A regular PostgreSQL with the pgvector extension will also work.

Processing the user's question

To answer a user's question, we need to convert the question into a searchable form: compute the embedding of the question and determine its intent.

The result of a semantic search on a question may be similar questions from the chat history rather than the answers to them.
To improve this, we can use a popular optimization technique called HyDE (hypothetical document embeddings). The idea is to generate a hypothetical answer to the question using an LLM and then compute the embedding of that answer. In some cases, this makes the search for relevant messages more accurate and efficient, because it matches against answers rather than questions.

Finding the most relevant messages

Once we have the question embedding, we can search for the closest messages in the database. An LLM has a limited context window, so we may not be able to pass in all the search results if there are too many. This raises the question of how to prioritize the answers. There are several approaches:

- Recency score. Over time, information becomes outdated. To prioritize newer messages, you can compute a recency score using a simple formula: 1 / (today - date_of_message + 1).
- Metadata filtering (this requires identifying the topic of both the question and the posts). It narrows the search down to only those posts that match the topic you are looking for.
- Full-text search. Classic full-text search, which is well supported by all popular databases, can sometimes be helpful.
- Reranking. Once we have found the answers, we can sort them by how "close" they are to the question and keep only the most relevant ones. Reranking requires a CrossEncoder model, or we can use a reranking API, for example, the one from Cohere.

Generating the final answer

After the search and sorting in the previous step, we can keep the 50-100 most relevant posts that will fit into the LLM context.

The next step is to write a clear and concise prompt for the LLM using the user's original question and the search results. The prompt should tell the LLM how to answer the question, and contain the user's question and the context: the relevant messages we found.
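As a rough illustration of how the shortlist mentioned above might be chosen, the sketch below applies the recency formula from the prioritization list, 1 / (today - date_of_message + 1), and keeps the top N results. The text does not say in which unit the age is measured or how recency should be combined with semantic relevance, so whole days, the weighted sum, and all names here (`ScoredMessage`, `shortlist`, `recencyWeight`) are assumptions made for the example.

```typescript
// Hypothetical shape of a search hit; the field names are illustrative.
interface ScoredMessage {
  id: number;
  date: Date;
  similarity: number; // semantic similarity in [0, 1], higher means closer
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Recency score from the text: 1 / (today - date_of_message + 1).
// Assumption: the age is measured in whole days.
function recencyScore(messageDate: Date, today: Date): number {
  const ageMs = today.getTime() - messageDate.getTime();
  const ageInDays = Math.max(0, Math.floor(ageMs / MS_PER_DAY));
  return 1 / (ageInDays + 1);
}

// Assumption: blend similarity and recency with a fixed weight,
// then keep the topN highest-scoring hits.
function shortlist(
  hits: ScoredMessage[],
  today: Date,
  topN: number,
  recencyWeight = 0.3,
): ScoredMessage[] {
  const score = (h: ScoredMessage): number =>
    (1 - recencyWeight) * h.similarity +
    recencyWeight * recencyScore(h.date, today);
  // Copy before sorting so the caller's array is left untouched.
  return [...hits].sort((a, b) => score(b) - score(a)).slice(0, topN);
}
```

With a weight of 0.3, a message from yesterday with similarity 0.75 outranks a two-year-old message with similarity 0.80. Whether that trade-off is right depends on how quickly information goes stale in a given chat, so the weight is worth tuning.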
When writing the prompt, it is important to consider these aspects:

- System prompt: instructions to the model that explain how it should process the information. For example, you can tell the LLM to look for the answer only in the provided data.
- Context length: the maximum length of the messages we can use as input. We can count tokens using the tokenizer that corresponds to the model we use. For example, OpenAI uses Tiktoken.
- Model hyperparameters: for example, temperature controls how creative (varied) the model is in its responses.
- Choice of model. It is not always worth overpaying for the largest and most powerful model. It makes sense to run several tests with different models and compare the results. In some cases, less resource-intensive models will do the job if the task does not require high accuracy.

Implementation

Now let's try to implement these steps with NodeJS. Here is the tech stack I am going to use:

- NodeJS and TypeScript
- Grammy: a Telegram bot framework
- PostgreSQL: the primary storage for all our data
- pgvector: a PostgreSQL extension for storing text embeddings and messages
- OpenAI API: the LLM and the embedding models
- Mikro-ORM: to simplify database interactions

Let's skip the basic steps of installing dependencies and setting up the Telegram bot, and move straight on to the most important features.
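As an aside on the context-length aspect listed above, here is a minimal sketch of trimming ranked search results to a token budget. The token counter is a stand-in: a real implementation would plug in the tokenizer that matches the chosen model (the text mentions Tiktoken for OpenAI), while the characters-divided-by-four estimate and the names `TokenCounter`, `roughTokenCount`, and `fitToBudget` are assumptions made for the example.

```typescript
// Hypothetical token counter signature. Swap in a model-specific tokenizer
// for real use; the default below is only a crude estimate.
type TokenCounter = (text: string) => number;

// Assumption: roughly four characters per token.
const roughTokenCount: TokenCounter = (text) => Math.ceil(text.length / 4);

// Walk the chunks in ranked order (most relevant first) and stop as soon as
// the next chunk would push the total over the budget.
function fitToBudget(
  rankedChunks: string[],
  maxTokens: number,
  countTokens: TokenCounter = roughTokenCount,
): string[] {
  const selected: string[] = [];
  let used = 0;
  for (const chunk of rankedChunks) {
    const cost = countTokens(chunk);
    if (used + cost > maxTokens) break;
    selected.push(chunk);
    used += cost;
  }
  return selected;
}
```

Stopping at the first chunk that does not fit, rather than skipping it and trying smaller ones, keeps the selection a strict prefix of the ranking. The budget itself should leave room for the system prompt, the question, and the model's answer.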
The database schema, which we will need later:

    import {
      ArrayType, BaseEntity, DateTimeType, Entity,
      ManyToOne, PrimaryKey, Property, TextType,
    } from '@mikro-orm/core';
    // VectorType comes from the pgvector package's MikroORM integration.
    import { VectorType } from 'pgvector/mikro-orm';

    // A Telegram chat (group) whose history is indexed.
    @Entity({ tableName: 'groups' })
    export class Group extends BaseEntity {
      @PrimaryKey()
      id!: number;

      @Property({ type: 'bigint' })
      channelId!: number;

      @Property({ type: 'text', nullable: true })
      title?: string;

      @Property({ type: 'json' })
      attributes!: Record<string, unknown>;
    }

    // A single chat message as received from Telegram.
    @Entity({ tableName: 'messages' })
    export class Message extends BaseEntity {
      @PrimaryKey()
      id!: number;

      @Property({ type: 'bigint' })
      messageId!: number;

      @Property({ type: TextType })
      text!: string;

      @Property({ type: DateTimeType })
      date!: Date;

      @ManyToOne(() => Group, { onDelete: 'cascade' })
      group!: Group;

      @Property({ type: 'string', nullable: true })
      fromUserName?: string;

      @Property({ type: 'bigint', nullable: true })
      replyToMessageId?: number;

      @Property({ type: 'bigint', nullable: true })
      threadId?: number;

      @Property({ type: 'json' })
      attributes!: {
        raw: Record<any, any>;
      };
    }

    // A searchable piece of text built from one or more messages.
    @Entity({ tableName: 'content_chunks' })
    export class ContentChunk extends BaseEntity {
      @PrimaryKey()
      id!: number;

      @ManyToOne(() => Group, { onDelete: 'cascade' })
      group!: Group;

      @Property({ type: TextType })
      text!: string;

      @Property({ type: VectorType, length: 1536, nullable: true })
      embeddings?: number[];

      @Property({ type: 'int' })
      tokens!: number;

      // Ids of the original messages this chunk was built from.
      @Property({ type: new ArrayType<number>((i: string) => +i), nullable: true })
      messageIds?: number[];

      // Not persisted: filled in by similarity queries.
      @Property({ persist: false, nullable: true })
      distance?: number;
    }

Splitting user dialogs into chunks

Splitting long dialogs between multiple users into chunks is not a trivial task. Unfortunately, default approaches such as RecursiveCharacterTextSplitter, available in the Langchain library, do not account for all the peculiarities of chats. In the case of Telegram, however, we can take advantage of Telegram threads, which contain related messages and the replies sent by users.

Every time a new batch of messages arrives from the chat room, our bot needs to perform a few steps:

- Filter out short messages using a list of stop words (e.g. 'hello', 'bye', etc.)
- Merge messages from the same user if they were sent consecutively within a short period of time
- Group all messages that belong to the same thread
- Merge the resulting message groups into larger text blocks, and further split these text blocks into chunks using RecursiveCharacterTextSplitter
- Compute the embeddings for each chunk
- Persist the text chunks in the database along with their embeddings and the links to the original messages

    import { EntityDTO } from '@mikro-orm/core';
    // Older LangChain releases export this from 'langchain/text_splitter'.
    import { RecursiveCharacterTextSplitter } from '@langchain/textsplitters';
    import { differenceInMinutes } from 'date-fns';

    class ChatContentSplitter {
      constructor(
        private readonly splitter: RecursiveCharacterTextSplitter,
        private readonly longMessageLength = 200,
      ) {}

      public async split(messages: EntityDTO<Message>[]): Promise<ContentChunk[]> {
        const filtered = this.filterMessage(messages);
        const merged = this.mergeMessageSeries(filtered);
        const threads = this.toThreads(merged);
        return this.threadsToChunks(threads);
      }

      // Group messages by thread; a message without a thread forms its own group.
      toThreads(messages: EntityDTO<Message>[]): EntityDTO<Message>[][] {
        const threads = new Map<number, EntityDTO<Message>[]>();
        const orphans: EntityDTO<Message>[][] = [];

        for (const message of messages) {
          if (message.threadId) {
            let thread = threads.get(message.threadId);
            if (!thread) {
              thread = [];
              threads.set(message.threadId, thread);
            }
            thread.push(message);
          } else {
            orphans.push([message]);
          }
        }

        return [...threads.values(), ...orphans];
      }

      private async threadsToChunks(
        threads: EntityDTO<Message>[][],
      ): Promise<ContentChunk[]> {
        const result: ContentChunk[] = [];

        for (const thread of threads) {
          const content = thread.map((m) => this.dtoToString(m)).join('\n');
          const texts = await this.splitter.splitText(content);
          const messageIds = thread.map((m) => m.id);
          // Assumes ContentChunk has a (text, messageIds) constructor,
          // which the schema listing does not show.
          const chunks = texts.map((text) => new ContentChunk(text, messageIds));
          result.push(...chunks);
        }

        return result;
      }

      // Merge runs of short messages from the same user that start within
      // 10 minutes of each other. Expects messages in chronological order.
      mergeMessageSeries(messages: EntityDTO<Message>[]): EntityDTO<Message>[] {
        if (messages.length === 0) return [];

        const result: EntityDTO<Message>[] = [];
        let current = { ...messages[0] };

        for (const message of messages.slice(1)) {
          const short = message.text.length < this.longMessageLength;
          // The sender is compared by fromUserName, the field stored in the schema.
          const sameUser = current.fromUserName === message.fromUserName;
          const subsequent = differenceInMinutes(message.date, current.date) < 10;

          if (sameUser && subsequent && short) {
            current.text += `\n${message.text}`;
          } else {
            result.push(current);
            current = { ...message };
          }
        }

        result.push(current); // keep the last series as well
        return result;
      }

      // ....
    }

Embeddings

Next, we need to compute the embeddings for each of the chunks. For this we can use the OpenAI model text-embedding-3-large.

    public async getEmbeddings(chunks: ContentChunk[]): Promise<void> {
      // Send the chunks in batches of 100; groupArray is a batching helper
      // that is not shown here.
      for (const batch of groupArray(chunks, 100)) {
        const res = await this.openai.embeddings.create({
          input: batch.map((c) => c.text),
          model: 'text-embedding-3-large',
          // Match the vector(1536) column; this model returns 3072 dimensions by default.
          dimensions: 1536,
          encoding_format: 'float',
        });

        // Each result carries the index of the input it belongs to.
        for (const item of res.data) {
          batch[item.index].embeddings = item.embedding;
        }
      }

      await this.orm.em.flush();
    }

Answering user questions

To answer a user's question, we first compute the embedding of the question and then find the most relevant messages in the chat history.

    import { l2Distance } from 'pgvector/mikro-orm';

    public async similaritySearch(embeddings: number[], groupId: number): Promise<ContentChunk[]> {
      return this.orm.em.qb(ContentChunk)
        .where({
          embeddings: { $ne: null },
          group: this.orm.em.getReference(Group, groupId),
        })
        // Closest chunks first (smallest L2 distance to the query embedding).
        .orderBy({ [l2Distance('embeddings', embeddings)]: 'ASC' })
        .limit(100)
        .getResult();
    }

Then we rerank the search results with the help of Cohere's reranking model.

    public async rerank(query: string, chunks: ContentChunk[]): Promise<ContentChunk[]> {
      const { results } = await cohere.v2.rerank({
        documents: chunks.map((c) => c.text),
        query,
        model: 'rerank-v3.5',
      });

      // Results arrive sorted by relevance; index points into the documents array.
      return results.map(({ index }) => chunks[index]);
    }

Next, we ask the LLM to answer the user's question by summarizing the search results. A simplified version of processing a search query looks like this:

    public async search(query: string, group: Group) {
      // Embed the question with the same model and dimensions as the chunks.
      const { data } = await this.openai.embeddings.create({
        input: query,
        model: 'text-embedding-3-large',
        dimensions: 1536,
      });
      const queryEmbeddings = data[0].embedding;

      const chunks = await this.chunkService.similaritySearch(queryEmbeddings, group.id);
      const reranked = await this.cohereService.rerank(query, chunks);

      const completion = await this.openai.chat.completions.create({
        model: 'gpt-4-turbo',
        temperature: 0,
        messages: [
          // systemPrompt holds the instructions for the model and is defined elsewhere.
          { role: 'system', content: systemPrompt },
          { role: 'user', content: this.userPromptTemplate(query, reranked) },
        ],
      });

      return completion.choices[0].message;
    }

    // naive prompt
    public userPromptTemplate(query: string, chunks: ContentChunk[]) {
      const history = chunks
        .map((c) => `${c.text}`)
        .join('\n----------------------------\n');

      return `
        Answer the user's question: ${query}
        By summarizing the following content:
        ${history}
        Keep your answer direct and concise.
        Provide references to the corresponding messages.
      `;
    }

Further improvements

Even after all the optimizations, we may feel that the answers of the LLM-powered bot are not ideal and are incomplete. What else could be improved?

- For user posts that include links, we can also parse the content of the linked web pages and PDF documents.
- Query routing: directing user queries to the most appropriate data source, model, or index based on the intent and context of the query, in order to optimize accuracy, efficiency, and cost.
- We can include resources relevant to the topic of the chat room in the search index. At work, this could be documentation from Confluence; for visa chats, consulate websites with the rules, etc.
- RAG evaluation: we need to set up a pipeline to evaluate the quality of our bot's responses.
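
As one possible starting point for the evaluation pipeline mentioned in the last item, the sketch below scores only the retrieval step, not the generated answer. It assumes a small hand-labelled set of questions, each with the ids of messages known to answer it, plus the ranked ids that the search step returned. The metrics (hit rate and mean reciprocal rank at k) and the names `EvalCase` and `evaluateRetrieval` are choices made for this example, not something the article prescribes.

```typescript
// Hypothetical evaluation record; the field names are illustrative.
interface EvalCase {
  question: string;
  relevantIds: number[];  // hand-labelled message ids that answer the question
  retrievedIds: number[]; // ranked output of the search step, best first
}

interface RetrievalMetrics {
  hitRate: number; // share of questions with a relevant message in the top k
  mrr: number;     // mean reciprocal rank of the first relevant message
}

function evaluateRetrieval(cases: EvalCase[], k: number): RetrievalMetrics {
  if (cases.length === 0) return { hitRate: 0, mrr: 0 };

  let hits = 0;
  let reciprocalRankSum = 0;

  for (const c of cases) {
    const topK = c.retrievedIds.slice(0, k);
    // Position of the first relevant message within the top k, or -1.
    const rank = topK.findIndex((id) => c.relevantIds.includes(id));
    if (rank !== -1) {
      hits += 1;
      reciprocalRankSum += 1 / (rank + 1);
    }
  }

  return {
    hitRate: hits / cases.length,
    mrr: reciprocalRankSum / cases.length,
  };
}
```

Running this before and after a change, such as enabling reranking or HyDE, gives a rough signal of whether retrieval improved. Judging the quality of the final answers would need a separate step on top of it.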