Google recently launched Gemini File Search, and some commentators claim it spells the end of homebrew RAG (Retrieval Augmented Generation). The argument is that app developers no longer need to take care of chunking, embedding, file storage, the vector database, metadata, retrieval optimization, context management, and so on. The whole document Q&A stack, which used to be middleware and application-layer logic, is absorbed into the Gemini model and its surrounding cloud services.

In this article, we will try out Gemini File Search and compare it with a homebrew RAG system in terms of capabilities, performance, cost, flexibility, and transparency, so that you can make an informed choice for your own use case. To speed up your development, I have included my example app on GitHub. The original Google announcement has the details.

Build Your Own Agentic RAG

Traditional RAG - A Refresher

The traditional RAG architecture consists of a few sequential steps. Documents are first chunked, embedded, and inserted into a vector database. Related metadata is often stored alongside the database entries. The user query is embedded and turned into a vector DB search to retrieve the relevant chunks. Finally, the original user query and the retrieved chunks (as context) are sent to the AI model to generate the answer for the user.

Agentic RAG

The architecture of an agentic RAG system adds a reflection and react loop, in which the agent checks whether the results are relevant and complete, and then rewrites the query to improve search quality. The AI model is therefore used in several places: to rewrite the user query into a vector DB query, to judge whether the retrieval is satisfactory, and finally to generate the answer for the user.
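The reflection and react loop described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `toy_retrieve`, `CORPUS`, and the lambda stand-ins for the judge, rewriter, and generator are hypothetical placeholders for a real vector DB search and real LLM calls, and none of them come from the article's code.

```python
from typing import Callable, List

# Toy corpus standing in for chunks stored in a vector DB (hypothetical data).
CORPUS = [
    "To load film, open the back cover and pull the film leader to the take-up spool.",
    "The frame counter resets automatically when the back cover is opened.",
]


def toy_retrieve(query: str) -> List[str]:
    """Keyword overlap as a stand-in for embedding the query and searching a vector DB."""
    words = set(query.lower().split())
    hits = []
    for chunk in CORPUS:
        chunk_words = set(chunk.lower().replace(",", " ").replace(".", " ").split())
        if words & chunk_words:
            hits.append(chunk)
    return hits


def agentic_rag(
    question: str,
    retrieve: Callable[[str], List[str]],
    is_sufficient: Callable[[str, List[str]], bool],
    rewrite: Callable[[str, List[str]], str],
    generate: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Retrieve, reflect on the result, rewrite the query if needed, then answer."""
    query = question
    chunks: List[str] = []
    for _ in range(max_rounds):
        chunks = retrieve(query)
        if is_sufficient(question, chunks):  # reflection step
            break
        query = rewrite(question, chunks)    # react step: try a better query
    return generate(question, chunks)


# In a real system the three lambdas below would each be an LLM call.
answer = agentic_rag(
    "how do I put a roll in",
    retrieve=toy_retrieve,
    is_sufficient=lambda q, chunks: len(chunks) > 0,
    rewrite=lambda q, chunks: "load film",
    generate=lambda q, chunks: "Based on the manual: " + " ".join(chunks),
)
print(answer)
```

In this toy run the first retrieval finds nothing, the rewritten query matches the film-loading chunk, and the loop stops there. The `max_rounds` cap is a design choice to keep latency and token spend bounded when no rewrite helps.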
Example Use Case - Camera Manual Q&A

Many newcomers to film photography discover that old cameras have unique and sometimes quirky ways of operating, even for basics such as loading film and resetting the film frame counter. Worse, you can actually damage a camera if you do certain things in the wrong order. Precise and accurate instructions from the camera manuals are therefore essential.

One camera manual archive hosts about 9,000 old camera manuals, mostly scanned PDFs. In an ideal world, you would simply download the manuals for your cameras, study them, enjoy, and be done. But most of us are neither patient nor good at planning ahead. So we need Q&A over camera manual PDFs on the go, for example in a phone app. This falls squarely within the scope of agentic RAG. I suspect the same applies to many other hobbies (musical instruments, Hi-Fi equipment, vintage cars) that involve finding information in a large number of user manuals.

Homebrew RAG for PDF Q&A

Our homebrew RAG system was built earlier, based on the LlamaIndex RAG Workflow with heavy customization:

- Qdrant as the vector database: a good price-performance ratio and metadata support.
- The Mistral OCR API for PDF ingestion: it performs well on complex PDF files with illustrations and tables.
- Saved images of individual PDF pages, so that users can go straight to the graphic illustrations of camera operations in addition to the text instructions.
- An agentic reflection and react loop, based on the Google / LangChain agentic search example.

What about Multi-Modal LLMs?

Since 2024, multimodal LLMs have become widely available.
An alternative approach is to send the user query together with the entire PDF to the LLM and get the answer back. This is a much simpler solution that requires no vector DB or middleware to maintain. However, RAG is faster and considerably more efficient once the number of user queries exceeds about 10 per day. So "directly feeding the user query and the entire matching PDF to a multimodal LLM" really only works for prototyping or very low-volume use (a few queries per day).

At that point, our view was that homebrew RAG was clearly the way to go, until Google released Gemini File Search. I would say the decision is no longer so simple.

Gemini File Search - Example

I built an example camera manual Q&A app, based on the Google AI Studio example. It is open source on GitHub, so you can try it out very quickly.

Example Q&A with PDF using Gemini File Search: https://github.com/zbruceli/pdf_qa

The main steps in the source code:

1. Create a File Search Store and persist it across sessions.
2. Upload multiple files concurrently; the Google backend handles all the chunking and embedding. It even generates sample questions for users. In addition, you can customize the chunking strategy and attach custom metadata.
3. Run a standard generation query (RAG). Behind the scenes, it assesses the quality of the results before producing the final answer.
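The three steps above might look roughly like the following in Python. This is a hedged sketch rather than the code in the linked repository: it assumes the `google-genai` Python SDK as I understand it from the public File Search documentation, and the store name `camera-manuals`, the model name `gemini-2.5-flash`, the chunk sizes, and the helper names `build_upload_config` and `index_and_ask` are all illustrative assumptions. Method and field names may differ between SDK versions, so check the official API reference before relying on them.

```python
import time
from typing import Any, Dict


def build_upload_config(display_name: str, max_tokens_per_chunk: int = 200,
                        max_overlap_tokens: int = 20) -> Dict[str, Any]:
    """Build an upload config with a custom whitespace chunking strategy.

    Field names follow the File Search docs as far as I know (assumption).
    """
    if max_overlap_tokens >= max_tokens_per_chunk:
        raise ValueError("overlap must be smaller than the chunk size")
    return {
        "display_name": display_name,
        "chunking_config": {
            "white_space_config": {
                "max_tokens_per_chunk": max_tokens_per_chunk,
                "max_overlap_tokens": max_overlap_tokens,
            }
        },
    }


def index_and_ask(pdf_path: str, question: str,
                  model: str = "gemini-2.5-flash") -> str:
    """Create a store, upload one PDF, then ask a question grounded in it."""
    # Imported lazily so build_upload_config works without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client()  # expects an API key in the environment

    # Step 1: create a File Search Store (hypothetical display name).
    store = client.file_search_stores.create(
        config={"display_name": "camera-manuals"})

    # Step 2: upload; chunking and embedding happen on the Google backend.
    operation = client.file_search_stores.upload_to_file_search_store(
        file=pdf_path,
        file_search_store_name=store.name,
        config=build_upload_config(display_name=pdf_path),
    )
    while not operation.done:  # indexing is a long-running operation
        time.sleep(5)
        operation = client.operations.get(operation)

    # Step 3: a standard generation query with File Search enabled as a tool.
    response = client.models.generate_content(
        model=model,
        contents=question,
        config=types.GenerateContentConfig(
            tools=[types.Tool(file_search=types.FileSearch(
                file_search_store_names=[store.name]))]
        ),
    )
    return response.text
```

A call such as `index_and_ask("manual.pdf", "How do I load film?")` needs network access and a valid API key. In a real app you would create the store once and reuse its name across sessions instead of creating a new store per question.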
More Developer Resources

- Gemini File Search API documentation: https://ai.google.dev/gemini-api/docs/file-search
- Tutorial by Phil Schmid: https://www.philschmid.de/gemini-file-search-javascript

Gemini File Search Pricing

- Developers are charged for embeddings at indexing time, based on the existing embedding pricing ($0.15 per 1M tokens).
- Storage is free.
- Query-time embeddings are free.
- Retrieved document tokens are charged as regular context tokens.

So, Which One Is Better?

Since Gemini File Search is still very new, my assessment is based on only a short period of initial testing.

Capability Comparison

Gemini File Search covers all the basic features of a homebrew RAG system:

- Chunking (size and overlap are configurable)
- Embedding
- Vector DB with support for metadata filtering
- Retrieval
- Generative output

And a more advanced feature under the hood:

- Agentic capability to assess the quality of retrieval

If I had to nitpick, image output is currently missing. So far, the output of Gemini File Search is limited to text, whereas our custom-built RAG can return images from the scanned PDFs. I imagine it would not be too hard for Gemini File Search to offer multimodal output in the future.

Performance Comparison

Accuracy: on par. There is no noticeable difference in retrieval or generation quality.

Speed: Gemini File Search can be somewhat faster, since the vector DB and the LLM both live inside Google Cloud infrastructure.

Cost Comparison

Gemini File Search is a fully managed system that may well cost less than a homebrew system. Document indexing is done only once, and it costs $0.15 per 1M tokens.
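To get a feel for the one-time indexing fee, here is a back-of-the-envelope calculation. The $0.15 per 1M tokens price and the figure of 9,000 manuals come from the article; the average of 20,000 tokens per manual and the 100,000 lifetime queries are purely illustrative assumptions, so substitute measurements from your own corpus.

```python
EMBEDDING_PRICE_PER_MILLION_TOKENS_USD = 0.15  # indexing-time embedding price quoted above


def indexing_cost_usd(total_tokens: int,
                      price_per_million: float = EMBEDDING_PRICE_PER_MILLION_TOKENS_USD) -> float:
    """One-time cost of embedding a corpus at indexing time."""
    return total_tokens / 1_000_000 * price_per_million


def amortized_cost_per_query_usd(total_tokens: int, lifetime_queries: int) -> float:
    """Spread the one-time indexing cost over the expected number of queries."""
    if lifetime_queries <= 0:
        raise ValueError("lifetime_queries must be positive")
    return indexing_cost_usd(total_tokens) / lifetime_queries


manuals = 9_000              # size of the manual archive mentioned in the article
tokens_per_manual = 20_000   # ASSUMPTION: average tokens per scanned manual
lifetime_queries = 100_000   # ASSUMPTION: queries served over the app's lifetime

total_tokens = manuals * tokens_per_manual
print(f"Indexing cost: ${indexing_cost_usd(total_tokens):.2f}")
print(f"Per query: ${amortized_cost_per_query_usd(total_tokens, lifetime_queries):.6f}")
```

Under these assumed numbers the whole archive would cost on the order of tens of dollars to index, which is why the fee tends to fade once it is spread over many queries.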
This indexing fee is a fixed cost that every RAG system incurs, and it can be amortized over the lifetime of the document Q&A application. In my camera manual use case, it is a small fraction of the total cost.

Since Gemini File Search offers file storage and the database for "free", this is a saving compared with a homebrew RAG system.

Inference costs vary with the number of input tokens (the question plus the vector search results as context) and output tokens, for both Gemini File Search and a homebrew system.

Flexibility & Transparency for Tuning and Debugging

Naturally, Gemini File Search ties you to Gemini AI models for embedding and inference. In effect, you trade flexibility for convenience.

In terms of fine-tuning your RAG system, Gemini File Search provides some level of customization. For example, you can define a chunkingConfig during upload to specify parameters like maxTokensPerChunk and maxOverlapTokens, and customMetadata to attach key-value pairs to the document.

However, there appears to be little visibility into the internals of the Gemini File Search system for debugging and performance tuning. So you end up using it more or less as a black box.

Conclusions

Google's Gemini File Search is good enough for most applications and most people, at a very attractive price. It is super easy to use and has minimal operational overhead. It is not only good for quick prototyping and mock-ups, but also good enough for a production system with hundreds of users.

However, there are a few situations in which you might still prefer a homebrew RAG system:

- You do not trust Google to host your proprietary documents.
- You need to return images from the original documents to users.
- You want full control and transparency over which LLM is used for embedding and inference, how chunking is done, how the agentic flow of the RAG is controlled, and how retrieval quality issues are debugged.

So give Gemini File Search a try. You can use Google AI Studio as a playground, or you can use my example code on GitHub. Please comment below with your findings for your own use cases.