
AI for Knowledge Management: Iterating on RAG with the QE-RAG Architecture

by Shanglun Wang, 27 min read, 2023/09/12

Too Long; Didn't Read

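Retrieval-Augmented Generation (RAG) is a popular architecture for building powerful LLM apps. However, the current architecture has some real limitations. We walk through building a RAG application step by step, then look at how to improve on it with a new architecture called QE-RAG.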


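As the LLM revolution begins to take shape, the hype has given way to commercial development. As the initial wave of excitement dies down, generative AI is no longer seen as an omniscient black box, but rather as a constituent tool, albeit an extremely powerful one, in an engineer's arsenal. As a result, entrepreneurs and technologists now have an increasingly mature set of tools and techniques with which to develop LLM applications.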

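One of the most interesting use cases for LLMs is in the field of knowledge management. Specialized LLMs, based either on OpenAI's GPT technology or on open-source models such as LLaMa 2 and Flan-T5, are being used in clever ways to manage large quantities of data. Where organizations with large text datasets previously had to rely on text search techniques such as fuzzy matching or full-text indexing, they now have access to a powerful system that can not only find the information but also summarize it in a time-efficient and reader-friendly fashion.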

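Within this use case, the retrieval-augmented generation (RAG) architecture has emerged as a standout with great flexibility and performance. With this architecture, organizations can quickly index a body of work, run semantic queries against it, and generate informative and cogent answers to user-defined queries based on the corpus. Several companies and services have sprung up to support implementations of the RAG architecture, which underscores its staying power.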

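As effective as RAG can be, the architecture also has some real limitations. In this article, we will explore the RAG architecture, identify its limitations, and propose an improved architecture that addresses them.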

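As with all my other articles, I am looking to connect with other technologists and AI enthusiasts. If you have thoughts on how this architecture could be improved, or ideas about AI that you would like to discuss, please do not hesitate to reach out. You can find me on Github or LinkedIn; the links are in my profile and at the bottom of this article.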

Content Overview

Retrieval-Augmented Generation (RAG) Architecture
Limitations of the RAG Architecture
Proposing QE-RAG (Question-Enhanced RAG)
Conclusion

Retrieval-Augmented Generation (RAG) Architecture

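With names like RAG, Flan, and LLaMa, the AI community is unlikely to win awards for futuristic and stylish names anytime soon. The RAG architecture, however, certainly deserves an award for combining two extremely powerful techniques made available by the development of LLMs: contextual document embedding and prompt engineering.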

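At its simplest, the RAG architecture is a system that uses embedding vector search to find the parts of the corpus most relevant to a question, inserts those parts into a prompt, and then uses prompt engineering to ensure that the answer is based on the excerpts given in the prompt. If this all sounds a little confusing, please read on, because I will explain each component in turn. I will also include example code so you can follow along.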

The Embedding Model

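First and foremost, an effective RAG system requires a powerful embedding model. The embedding model transforms a natural-text document into a series of numbers, or a "vector", that roughly represents the semantic content of the document. Assuming the embedding model is a good one, you can use vector arithmetic to compare the semantic values of two different documents and determine whether they are semantically similar.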

To see this in action, paste the following code into a Python file and run it:

    # Written against the legacy (pre-1.0) openai Python package, which
    # provides openai.Embedding and openai.embeddings_utils.
    import openai
    from openai.embeddings_utils import cosine_similarity

    openai.api_key = "YOUR KEY"  # replace with your own API key
    EMBEDDING_MODEL = "text-embedding-ada-002"


    def get_cos_sim(input_1, input_2):
        # Embed both inputs in a single request, then compare the two vectors.
        embeds = openai.Embedding.create(model=EMBEDDING_MODEL, input=[input_1, input_2])
        return cosine_similarity(embeds['data'][0]['embedding'], embeds['data'][1]['embedding'])


    print(get_cos_sim('Driving a car', 'William Shakespeare'))
    print(get_cos_sim('Driving a car', 'Riding a horse'))

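The code above generates embeddings for the phrases "Driving a car", "William Shakespeare", and "Riding a horse", then compares them with each other using the cosine similarity algorithm. We expect cosine similarity to be higher when the phrases are semantically similar, so "Driving a car" and "Riding a horse" should be much closer, whereas "Driving a car" and "William Shakespeare" should be dissimilar.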

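You should see that, according to OpenAI's embedding model ada-002, the phrase "Driving a car" is 88% similar to the phrase "Riding a horse" and 76% similar to the phrase "William Shakespeare". This means the embedding model is performing as we expect. This determination of semantic similarity is the foundation of the RAG system.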
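To make the similarity score less of a mystery, here is a minimal pure-Python sketch of what a cosine similarity function computes. It is not the code inside `openai.embeddings_utils`, and the three-dimensional vectors are made up for illustration; real ada-002 embeddings have 1,536 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy "embeddings", invented for this example only.
driving = [0.9, 0.1, 0.3]
riding = [0.8, 0.2, 0.4]
shakespeare = [0.1, 0.9, 0.2]

print(cosine_similarity(driving, riding))       # close to 1: similar direction
print(cosine_similarity(driving, shakespeare))  # much lower: different direction
```

The score depends only on the direction of the vectors, not their length, which is why it works as a rough measure of "aboutness".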

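The idea of cosine similarity is remarkably robust when you extend it to comparisons of much larger documents. For example, take the powerful monologue from Shakespeare's Macbeth, "Tomorrow, and tomorrow, and tomorrow":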

    monologue = '''Tomorrow, and tomorrow, and tomorrow,
    Creeps in this petty pace from day to day,
    To the last syllable of recorded time;
    And all our yesterdays have lighted fools
    The way to dusty death. Out, out, brief candle!
    Life's but a walking shadow, a poor player,
    That struts and frets his hour upon the stage,
    And then is heard no more. It is a tale
    Told by an idiot, full of sound and fury,
    Signifying nothing.'''

    print(get_cos_sim(monologue, 'Riding a car'))
    print(get_cos_sim(monologue, 'The contemplation of mortality'))

You should see that the monologue is only 75% similar to the idea of "riding a car" and 82% similar to the idea of "the contemplation of mortality".

But not only can we compare monologues with ideas, we can actually compare monologues with questions. For example:

    question = 'Which Shakespearean monologue contemplates mortality?'

    # Egeus's complaint from A Midsummer Night's Dream, for contrast with
    # the Macbeth monologue defined above.
    egeus_monologue = '''Full of vexation come I, with complaint
    Against my child, my daughter Hermia.
    Stand forth, Demetrius. My noble lord,
    This man hath my consent to marry her.
    Stand forth, Lysander. And my gracious Duke,
    This man hath bewitch'd the bosom of my child.
    Thou, thou, Lysander, thou hast given her rhymes,
    And interchanged love-tokens with my child:
    Thou hast by moonlight at her window sung
    With feigning voice verses of feigning love,
    And stol'n the impression of her fantasy
    With bracelets of thy hair, rings, gauds, conceits,
    Knacks, trifles, nosegays, sweetmeats (messengers
    Of strong prevailment in unharden'd youth):
    With cunning hast thou filch'd my daughter's heart,
    Turn'd her obedience, which is due to me,
    To stubborn harshness. And, my gracious Duke,
    Be it so she will not here, before your Grace,
    Consent to marry with Demetrius,
    I beg the ancient privilege of Athens:
    As she is mine, I may dispose of her;
    Which shall be either to this gentleman,
    Or to her death, according to our law
    Immediately provided in that case.'''

    print(get_cos_sim(monologue, question))
    print(get_cos_sim(egeus_monologue, question))

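Looking at the embeddings, you should see that the Macbeth monologue is contextually much closer to the question "Which Shakespearean monologue contemplates mortality?" than the Egeus monologue, which does mention death but does not grapple directly with the concept of mortality.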

The Vector Lookup

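Now that we have the embedding, how do we use it in our RAG system? Well, suppose we want to give our RAG system knowledge of all of Shakespeare's monologues so that it can answer questions about Shakespeare. In this case, we would download all of Shakespeare's monologues and generate embeddings for them. If you are following along, you can generate an embedding like this: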

 embedding = openai.Embedding.create(model=EMBEDDING_MODEL, input=[monologue])['data'][0]['embedding']

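Once we have the embeddings, we need to store them in a way that lets us query them and compare them against a new embedding. Normally we would put them in what is called a vector database, a specialized data store that allows for fast comparison of two vectors. However, unless your corpus is extremely large, brute-force comparisons are surprisingly tolerable for most non-production, experimental use cases where performance is not critical.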

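Whether or not you choose to use a database, you will want to build a system that can find the items in your corpus that best fit the question. In our example, we will want the ability to find the monologue that is most relevant to the user's question at hand. You might want to do something like: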

    monologues_embeddings = [
        # text in the left position, embedding in the right position
        ['Tomorrow, and tomorrow, and tomorrow...', [...]],
        ['Full of vexation come I...', [...]],
        # … More monologues and their embeddings as you see fit.
    ]


    def lookup_most_relevant(question):
        embed = openai.Embedding.create(model=EMBEDDING_MODEL, input=[question])['data'][0]['embedding']
        # Rank every stored monologue by similarity to the question and keep the best one.
        top_monologue = sorted(monologues_embeddings, key=lambda x: cosine_similarity(embed, x[1]), reverse=True)[0]
        return top_monologue


    lookup_most_relevant("How does Macbeth evaluate his life when he is confronted with his mortality?")

If you run this example, you should see the Macbeth monologue being selected, with about 82% similarity to the question.
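In practice you often want more than the single best match. The sketch below shows one way the brute-force lookup could be generalized to a top-k search without any API calls. The `top_k` helper and the two-dimensional vectors are hypothetical, chosen only so the example runs on its own.

```python
import heapq
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def top_k(query_embedding, corpus, k=3):
    """Return the k (score, text) pairs most similar to the query.

    corpus is a list of (text, embedding) pairs, laid out like
    monologues_embeddings in the article.
    """
    scored = ((cosine_similarity(query_embedding, emb), text) for text, emb in corpus)
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


# Hypothetical 2-dimensional embeddings, for illustration only.
toy_corpus = [
    ("Tomorrow, and tomorrow, and tomorrow...", [0.9, 0.1]),
    ("Full of vexation come I...", [0.2, 0.8]),
    ("To be, or not to be...", [0.7, 0.3]),
]

results = top_k([1.0, 0.0], toy_corpus, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

`heapq.nlargest` avoids sorting the whole corpus when k is small, which is about as far as brute force can be pushed before a vector database starts to pay off.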

The Prompt Engineering

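The last step in the RAG model is prompt engineering. In our case, it is not too difficult. Now that we have the monologue at hand, we can construct a prompt by simply embedding the monologue in our query and then asking the question: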

    completion = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": """
    Consider the following monologue:

    Tomorrow, and tomorrow, and tomorrow,
    Creeps in this petty pace from day to day,
    To the last syllable of recorded time;
    And all our yesterdays have lighted fools
    The way to dusty death. Out, out, brief candle!
    Life's but a walking shadow, a poor player,
    That struts and frets his hour upon the stage,
    And then is heard no more. It is a tale
    Told by an idiot, full of sound and fury,
    Signifying nothing.

    How does Macbeth evaluate his life when he is confronted with his mortality?
    """},
        ]
    )

You should see an answer like:

 In the well-known "Tomorrow and tomorrow and tomorrow" monologue from Shakespeare's Macbeth, the titular character plunges into existential despair when confronted with his impending death. Macbeth evaluates his life as hollow, futile, and meaningless when facing his mortality. He sees it as a "walking shadow" and himself as a "poor player," highlighting his view of life as a fleeting, contrived performance with no lasting substance or impact. The "brief candle" is a metaphor for his own mortality, suggesting that life's flame can be extinguished suddenly and unceremoniously. By stating "It is a tale told by an idiot, full of sound and fury, signifying nothing," Macbeth is expressing his belief that life, despite all its noise and action, is ultimately empty, absurd, and void of meaning. Overall, Macbeth's examination of his life is profoundly pessimistic, revealing his deep despair and cynicism.
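Hard-coding the monologue into the prompt is fine for a demo, but a real system would assemble the prompt from whatever the lookup returned. Here is a minimal sketch of that step. `build_rag_messages` is a hypothetical helper and the prompt wording is my own assumption, not a prescribed format.

```python
def build_rag_messages(passages, question,
                       system_prompt="You are a helpful assistant."):
    """Assemble chat messages that place retrieved passages ahead of the question."""
    if not passages:
        raise ValueError("at least one retrieved passage is required")
    # Separate passages clearly so the model can tell them apart.
    context = "\n===\n".join(p.strip() for p in passages)
    user_content = (
        "Consider the following passages:\n"
        f"{context}\n"
        "====\n"
        f"Answer this question: {question}\n"
        "Answer only based on the passages provided."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]


messages = build_rag_messages(
    ["Tomorrow, and tomorrow, and tomorrow, ..."],
    "How does Macbeth evaluate his life when he is confronted with his mortality?",
)
print(messages[1]["content"])
```

The returned list has the same shape as the `messages` argument used in the chat completion call above, so it could be passed in directly.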

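Of course, this particular example is not the most powerful demonstration of the RAG architecture, since most GPT models already know Shakespeare's monologues and have been trained on the large body of Shakespeare analysis publicly available on the internet. In fact, if you ask GPT-4 this exact question without the monologue embedded, you will likely get a very good answer, though it will probably quote the soliloquy less often. However, it should be apparent that, in a commercial setting, this technique can be applied to proprietary or esoteric datasets that are not accessible to existing GPT implementations.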

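In fact, readers who are familiar with my previous article, "Building a Document Analyzer with ChatGPT, Google Cloud, and Python", may recognize that the last part of the technique is very similar to the prompt engineering I did in that article. Extending from that idea, we can very easily imagine a RAG system built on top of Japanese government publications (the sample data from that article), which would allow users to search for and ask questions about Japanese economic policy. The system would quickly retrieve the most relevant documents, summarize them, and produce an answer based on deep domain-specific knowledge not available to the base GPT models. This power and simplicity are precisely why the RAG architecture is gaining so much traction among LLM developers.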

Now that we have gone over the RAG architecture, let us look at some of its shortcomings.

Limitations of the RAG Architecture

Embedding Debuggability

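Because many RAG systems rely on document embedding and vector search to connect the question with the relevant documents, the whole system is often only as good as the embedding model used. The OpenAI embedding models are incredibly flexible, and there are many techniques for tuning embeddings. LLaMa, Meta's open-source competitor to GPT, offers fine-tunable embedding models. However, there is an inescapable black-box aspect to an embedding model. This is somewhat manageable when comparing short text strings, but it becomes hard to validate when comparing short strings with much longer documents. In our previous example, we have to take a slight leap of faith that the embedding lookup is able to connect "mortality" to the "Tomorrow, and tomorrow, and tomorrow" monologue. This can be quite uncomfortable for workloads where transparency and debuggability are critical.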

Context Overload

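Another limitation of the RAG model is the relatively limited amount of context that can be passed to it. Because the embedding model needs document-level context to work well, we need to be careful when chopping up the corpus for embedding. The Macbeth monologue may have 82% similarity to the question about mortality, but that number drops to 78% when you compare the question with the embedding for just the opening lines of the monologue: "Tomorrow, and tomorrow, and tomorrow, Creeps in this petty pace from day to day, To the last syllable of recorded time."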

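As a result, the context passed to the RAG prompt needs to be rather large. At the time of writing, the most high-context GPT models are still limited to 16,000 tokens, which is quite a lot of text, but when you are working with long interview transcripts or context-rich articles, you will be limited in how much context you can give in the final generation prompt.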
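One common way to soften the chunking trade-off, which this article does not otherwise use, is to split the corpus into overlapping windows so that each chunk keeps some of its neighbors' context. A minimal sketch, assuming whitespace-separated words are a good enough unit (real systems usually count tokens instead); `chunk_words` and its default sizes are hypothetical:

```python
def chunk_words(text, chunk_size=200, overlap=50):
    """Split text into overlapping chunks of at most chunk_size words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, chunk_size=4, overlap=2)
for chunk in chunks:
    print(chunk)
```

Larger overlaps preserve more context per chunk but produce more chunks to embed, so the sizes are something to tune against your own corpus.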

Novel Terms

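The final limitation of the RAG model is its inability to work with novel terms. People working in specific fields tend to develop terminologies and ways of speaking that are unique to that field. When these terms are not present in the training data of the embedding model, the lookup process will suffer.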

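For example, the ada-002 embedding model may not know that the "Rust programming language" is related to "LLVM" (the Rust compiler uses LLVM as its code-generation backend). In fact, it returns a relatively low cosine similarity of 78%. This means that documents talking about LLVM may not show strong similarity in a query about Rust, even though the two ideas are closely related in practice.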

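Usually, the problem of novel terms can be overcome with some prompt engineering, but in the context of an embedding search, that is relatively difficult to do. Fine-tuning an embedding model is, as mentioned earlier, possible, but teaching the embedding model the novel terminology in every context can be error-prone and time-consuming.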

Proposing QE-RAG (Question-Enhanced RAG)

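Given these limitations, I would like to propose a modified architecture that enables a new class of RAG systems, one that sidesteps many of the limitations described above. The idea is to run vector searches against frequently asked questions, in addition to the corpus, and to use an LLM to preprocess the corpus in the context of those questions. If that process sounds complicated, do not worry; this section goes over the implementation details along with code examples you can use to follow along.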

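One thing to note is that QE-RAG should be run alongside a standard RAG implementation, so that it can fall back on the other implementation when needed. As the implementation matures, there should be less and less need for the fallback, but QE-RAG is still intended to be an enhancement to, rather than a replacement of, the standard RAG architecture.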

The Architecture

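The broad strokes of the QE-RAG architecture are as follows:

Create a vector database of questions that can be, or are likely to be, asked about the corpus.
Preprocess and summarize the corpus against the questions in the vector database.
When a user query comes in, compare it with the questions in the vector database.
If a question in the database is highly similar to the user query, retrieve the version of the corpus that was summarized to answer that question.
Use the summarized corpus to answer the user's question.
If no question in the database is highly similar to the user query, fall back to the standard RAG implementation.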
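The decision at the heart of these steps, whether to use the question-specific summaries or fall back to standard RAG, can be sketched as a small routing function. This is an illustration under assumptions: `route_query`, the 0.9 threshold, and the two-dimensional vectors are all hypothetical. With real ada-002 embeddings, even unrelated texts scored around 76% in the earlier examples, so the threshold would need to be tuned empirically.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def route_query(query_embedding, question_bank, threshold=0.9):
    """Pick the stored question closest to the query, or signal a fallback.

    question_bank maps each pre-processed question to its embedding.
    Returns ("qe-rag", question) when the best match clears the threshold,
    otherwise ("fallback", None) so the caller can run standard RAG.
    """
    best_question, best_score = None, -1.0
    for question, embedding in question_bank.items():
        score = cosine_similarity(query_embedding, embedding)
        if score > best_score:
            best_question, best_score = question, score
    if best_question is not None and best_score >= threshold:
        return "qe-rag", best_question
    return "fallback", None


# Made-up embeddings for two stored questions.
bank = {
    "death and mortality": [0.9, 0.1],
    "political power": [0.1, 0.9],
}
print(route_query([0.85, 0.2], bank))  # close to the first question
print(route_query([0.7, 0.7], bank))   # not close enough to either
```

Keeping the routing decision in one place also makes it easy to log which queries fell back, which is useful when deciding what questions to add to the bank next.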

Let's go through each part in turn.

Question Embeddings

The architecture begins, much like vanilla RAG, with an embedding and a vector database. However, instead of embedding the documents, we embed a series of questions.

To illustrate this, suppose we are trying to build an LLM that is an expert on Shakespeare. We might want it to answer questions such as:

    questions = [
        "How does political power shape the way characters interact in Shakespeare's plays?",
        "How does Shakespeare use supernatural elements in his plays?",
        "How does Shakespeare explore the ideas of death and mortality in his plays?",
        "How does Shakespeare explore the idea of free will in his plays?"
    ]

We create embeddings for them like so, and either save them or use them later:

 questions_embed = openai.Embedding.create(model=EMBEDDING_MODEL, input=questions)

Preprocessing and Summarization

Now that we have the questions, we will want to download and summarize the corpus. For this example, we will download the HTML versions of Macbeth and Hamlet:

    import os

    import openai
    import requests
    from bs4 import BeautifulSoup

    plays = {
        'shakespeare_macbeth': 'https://www.gutenberg.org/cache/epub/1533/pg1533-images.html',
        'shakespeare_hamlet': 'https://www.gutenberg.org/cache/epub/1524/pg1524-images.html',
    }

    if not os.path.exists('training_plays'):
        os.mkdir('training_plays')

    for name, url in plays.items():
        print(name)
        file_path = os.path.join('training_plays', '%s.txt' % name)
        # Skip the download if the play is already cached on disk.
        if not os.path.exists(file_path):
            res = requests.get(url)
            with open(file_path, 'w') as fp_write:
                fp_write.write(res.text)

Then, we process the plays into scenes, using the HTML tags as a guide:

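    with open(os.path.join('training_plays', 'shakespeare_hamlet.txt')) as fp_file:
        # Name the parser explicitly so BeautifulSoup does not have to guess.
        soup = BeautifulSoup(''.join(fp_file.readlines()), 'html.parser')

    # Each "chapter" div holds one act; the first one is front matter.
    headers = soup.find_all('div', {'class': 'chapter'})[1:]

    scenes = []
    for header in headers:
        cur_act = None
        cur_scene = None
        lines = []
        for i in header.find_all('h2')[0].parent.find_all():
            if i.name == 'h2':
                # A new act heading.
                print(i.text)
                cur_act = i.text
            elif i.name == 'h3':
                # A new scene heading: store the scene collected so far.
                print('\t', i.text.replace('\n', ' '))
                if cur_scene is not None:
                    scenes.append({
                        'act': cur_act, 'scene': cur_scene, 'lines': lines
                    })
                    lines = []
                cur_scene = i.text
            elif (i.text != ''
                  and not i.text.strip('\n').startswith('ACT')
                  and not i.text.strip('\n').startswith('SCENE')):
                lines.append(i.text)
        # Store the final scene of the act, which has no following heading
        # to trigger the append above.
        if cur_scene is not None:
            scenes.append({
                'act': cur_act, 'scene': cur_scene, 'lines': lines
            })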

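And here is the part that makes QE-RAG unique: instead of creating embeddings for the specific scenes, we create summaries of them, targeted towards each of the questions: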

    def summarize_for_question(text, question, location):
        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo-16k",
            messages=[
                {"role": "system", "content": "You are a literature assistant that provides helpful summaries."},
                {"role": "user", "content": """Is the following excerpt from %s relevant to the following question? %s

    ===
    %s
    ===
    If so, summarize the sections that are relevant. Include references to specific passages that would be useful.
    If not, simply say: \"nothing is relevant\" without additional explanation""" % (
                    location, question, text
                )},
            ]
        )
        return completion

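This function asks ChatGPT to do two things: 1) identify whether the passage is actually useful for answering the question at hand, and 2) summarize the parts of the scene that are useful for answering the question.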

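If you try this function with a few pivotal scenes from Macbeth or Hamlet, you will see that GPT-3.5 is quite good at identifying whether a scene is relevant to the question, and the summary will be quite a bit shorter than the scene itself. This makes it much easier to embed later, at the prompt engineering step.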

Now we can do this for all of the scenes:

    for scene in scenes:
        scene_text = ''.join(scene['lines'])
        question_summaries = {}
        # One targeted summary per (scene, question) pair.
        for question in questions:
            completion = summarize_for_question(scene_text, question, "Shakespeare's Hamlet")
            question_summaries[question] = completion.choices[0].message['content']
        scene['question_summaries'] = question_summaries

In production workloads, we would put the summaries into a database, but in our case, we will just write them to disk as JSON files.
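The article does not show the save step, so here is a minimal sketch of how the scene list could be written to and read back from a JSON file using only the standard library. `save_scenes`, `load_scenes`, the file name, and the example record are hypothetical; the record simply mirrors the shape of the `scene` dictionaries built above.

```python
import json
import os
import tempfile


def save_scenes(scenes, path):
    """Write the list of scene dicts (including question_summaries) as JSON."""
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(scenes, fp, ensure_ascii=False, indent=2)


def load_scenes(path):
    """Read the scene list back from disk."""
    with open(path, encoding="utf-8") as fp:
        return json.load(fp)


# A made-up record with the same keys as the scenes built earlier.
example_scenes = [{
    "act": "ACT V",
    "scene": "SCENE V.",
    "lines": ["Tomorrow, and tomorrow, and tomorrow, ..."],
    "question_summaries": {"an example question": "nothing is relevant"},
}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "macbeth_summaries.json")
    save_scenes(example_scenes, path)
    restored = load_scenes(path)

print(restored == example_scenes)
```

Because the summaries are the expensive part of the pipeline (one LLM call per scene per question), caching them like this means they only need to be regenerated when the questions or the corpus change.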

Two-Stage Vector Search

Now suppose we receive a user question like the one below:

 user_question = "How do Shakespearean characters deal with the concept of death?"

As in vanilla RAG, we will want to create an embedding for the question:

 uq_embed = openai.Embedding.create(model=EMBEDDING_MODEL, input=[user_question])['data'][0]['embedding']

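In vanilla RAG, we would compare the user question embedding with the embeddings for the scenes in Shakespeare, but in QE-RAG, we compare it with the embeddings of the questions: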

 print([cosine_similarity(uq_embed, q['embedding']) for q in questions_embed['data']])

We see that the vector search has (correctly) identified question 3 as the most relevant question. Now, we retrieve the summary data for question 3:

    # Phrases GPT tends to use when a scene has nothing to offer for the question.
    NOT_RELEVANT_MARKERS = (
        "NOTHING IS RELEVANT",
        "NOTHING IN THIS EXCERPT",
        "NOTHING FROM THIS EXCERPT",
        "NOT DIRECTLY ADDRESSED",
    )

    relevant_texts = []
    for scene in hamlet + macbeth:  # hamlet and macbeth are the scene lists from the above code
        summary = scene['question_summaries'][questions[2]]
        if not any(marker in summary.upper() for marker in NOT_RELEVANT_MARKERS):
            relevant_texts.append(summary)

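Note that, because GPT summarization is not deterministic, you may get several different strings indicating that a scene is not relevant to the question at hand. The key is to push only the relevant excerpts into the list of relevant summaries.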

At this stage, we could do a second-level vector search to include only the most relevant summaries in our prompt, but given the size of our corpus, we can simply use the entire relevant_texts list in our prompt.
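For a larger corpus, that second-level search might look something like the sketch below: rank the surviving summaries by similarity to the user question and keep as many as fit within a budget. `select_summaries` is a hypothetical helper, the vectors are made up, and word count stands in for token count here; a real tokenizer would be more accurate.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def select_summaries(query_embedding, summaries, max_words=1500):
    """Keep the most similar summaries that fit within a rough word budget.

    summaries is a list of (text, embedding) pairs.
    """
    ranked = sorted(
        summaries,
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    selected, used = [], 0
    for text, _ in ranked:
        cost = len(text.split())
        if used + cost > max_words:
            continue  # skip summaries that would overflow the budget
        selected.append(text)
        used += cost
    return selected


# Invented summaries and embeddings, for illustration only.
toy = [
    ("Hamlet weighs death against suffering.", [0.9, 0.1]),
    ("Macbeth calls life a walking shadow.", [0.8, 0.3]),
    ("Polonius gives travel advice to Laertes at length.", [0.1, 0.9]),
]
picked = select_summaries([1.0, 0.0], toy, max_words=12)
print(picked)
```

Since the summaries are already much shorter than the scenes, this second pass can afford to be generous with the budget.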

Prompt Engineering

Now that we have all the excerpts, we can build the prompt like so:

    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-16k",
        messages=[
            {"role": "system", "content": "You are a literature assistant that answers questions based on excerpts."},
            {"role": "user", "content": """Consider the following passages:
    %s
    ====
    Answer this question: %s
    Be detailed and complete in your analysis. Make references to specific passages where helpful.
    Answer only based on passages provided.
    """ % (
                '\n===\n'.join(relevant_texts),
                user_question
            )},
        ]
    )

Run the API call and you should see something like the below:

 Based on the passages provided, Shakespearean characters deal with the concept of death in various ways. In Hamlet, for example, the characters contemplate the nature of death and its implications. Hamlet himself reflects on the brevity of life and the corrupting influence of vices. He questions the purpose of existence and contemplates suicide as a release from the burdens of life. In the passage, Hamlet says, "To die, to sleep— / To sleep, perchance to dream. Ay, there's the rub, / For in that sleep of death what dreams may come" (Lines 65-67). This shows his contemplation of the afterlife and the uncertainty surrounding it. In Macbeth, characters are confronted with the consequences of their actions and the fragility of life. Lady Macbeth, for instance, experiences guilt and remorse for her involvement in the murder of King Duncan. She is haunted by visions of blood on her hands and expresses the desire to escape the consequences of her actions. In the passage, Lady Macbeth says, "Out, damned spot! Out, I say!" (Line 30). This demonstrates her struggle to cope with the guilt and the psychological toll it takes on her. Furthermore, both Macbeth and Lady Macbeth reflect on the inevitability and the consequences of their actions. In the passage, Macbeth laments the decay of life and the loss of purpose and meaning. He reflects on the transitory nature of existence and the insignificance of worldly achievements. Macbeth states, "Life's but a walking shadow, a poor player / That struts and frets his hour upon the stage" (Lines 217-218). This shows his acceptance of mortality and his realization that material possessions and power do not bring true fulfillment. In summary, Shakespearean characters deal with the concept of death in various ways. They contemplate the nature of life and the afterlife, struggle with guilt and remorse, and question the meaning and purpose of their existence. 
Shakespeare's exploration of death and mortality through his characters provides insight into the complexities of human nature and the universal human experience of confronting one's mortality.

This is a very good summary that draws from many different parts of Hamlet and Macbeth.

For comparison, I created embeddings for entire scenes and ran a vanilla RAG lookup using the same prompt, finding the top N most relevant scenes until I reached 10,000 tokens:

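    import tiktoken

    # Assumption: encoding is a tiktoken tokenizer; cl100k_base is the
    # encoding used by the gpt-3.5-turbo models.
    encoding = tiktoken.get_encoding('cl100k_base')

    # macbeth_embed and hamlet_embed hold one embedding per scene, in the
    # same order as the macbeth and hamlet scene lists.
    macbeth_ranked = sorted(
        [(idx, cosine_similarity(uq_embed, q)) for idx, q in enumerate(macbeth_embed)],
        key=lambda x: x[1], reverse=True
    )
    hamlet_ranked = sorted(
        [(idx, cosine_similarity(uq_embed, q)) for idx, q in enumerate(hamlet_embed)],
        key=lambda x: x[1], reverse=True
    )

    relevant_texts = []
    for i in range(10):
        # Stop adding scenes once the joined context passes 10,000 tokens.
        if len(encoding.encode('\n===\n'.join(relevant_texts))) > 10000:
            break
        relevant_texts.append(''.join(macbeth[macbeth_ranked[i][0]]['lines']))
        relevant_texts.append(''.join(hamlet[hamlet_ranked[i][0]]['lines']))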

If you feed this into the prompt, you should see something like the below:

 Based on the provided passages, Shakespearean characters deal with the concept of death in various ways. In Hamlet, the characters reflect on death and mortality extensively. Hamlet himself contemplates suicide, expressing his desire to end his suffering and questioning the purpose of life. He says, "To be, or not to be: that is the question" (Hamlet, Act III, Scene I, Lines 56). This soliloquy reveals his struggle with the concept of death and his contemplation of the consequences of choosing either life or death. Additionally, in the same play, the appearance of the Ghost prompts discussions about death. The characters question the meaning and significance of encountering the dead, and Hamlet grapples with the idea of mortality in relation to his father's death. He reflects, "Oh God! a beast that wants discourse of reason / Would have mourn'd longer" (Hamlet, Act I, Scene II, Lines 152-153). This passage highlights Hamlet's grief and his pondering of the brevity of life and the transitory nature of existence. In Macbeth, the characters also confront death and its consequences. Macbeth, after killing King Duncan, reflects on the significance of the act and its implications. He says, "Will all great Neptune's ocean wash this blood Clean from my hand? No, this my hand will rather The multitudinous seas incarnadine" (Macbeth, Act II, Scene II, Lines 77-79). This passage showcases Macbeth's awareness of the guilt associated with murder and his realization that the blood on his hands cannot be cleansed. Lady Macbeth, on the other hand, shows a different perspective on death. She suppresses her guilt and tries to detach herself from the consequences of their actions. She states, "A little water clears us of this deed" (Macbeth, Act II, Scene II, Line 69). This passage reveals her denial and determination to move forward without remorse. In conclusion, Shakespearean characters deal with the concept of death in various ways. 
Some contemplate the meaning of life and question the purpose of existence, while others grapple with the guilt and consequences that come with taking a life. Shakespeare explores the complexities of death and mortality through his characters' introspection and their different responses to the inevitability of death.

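This is a very cogent analysis, but it does not engage with many of the most important passages in Hamlet and Macbeth. As you can see, QE-RAG has the distinct advantage of being able to embed more relevant context than a standard RAG system.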

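The above example, however, does not demonstrate another advantage of QE-RAG: the ability to give the developer better control over the embedding process. To see how QE-RAG achieves this, let us look at an extension of this problem, namely dealing with new terminology.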

Extending QE-RAG to New Terminology

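Where QE-RAG really comes into its own is when you are introducing a new term. For example, suppose we are introducing a new concept, the Japanese word "zetsubou", a term that sits between despair and hopelessness and specifically conveys a surrender to one's circumstances. It is not as immediately catastrophic as the English concept of despair; it is more about acquiescence to unpleasant things that are happening.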

Suppose we want to answer a question like:

user_question = "How do Shakespearean characters cope with Zetsubou?"

With vanilla RAG, we would do an embeddings lookup and then add an explainer in the final prompt engineering step:

    # Re-embed the new question so the lookup reflects it.
    uq_embed = openai.Embedding.create(model=EMBEDDING_MODEL, input=[user_question])['data'][0]['embedding']

    macbeth_ranked = sorted(
        [(idx, cosine_similarity(uq_embed, q)) for idx, q in enumerate(macbeth_embed)],
        key=lambda x: x[1], reverse=True
    )
    hamlet_ranked = sorted(
        [(idx, cosine_similarity(uq_embed, q)) for idx, q in enumerate(hamlet_embed)],
        key=lambda x: x[1], reverse=True
    )

    relevant_texts = []
    for i in range(10):
        # encoding is the same tokenizer used for the earlier vanilla RAG lookup.
        if len(encoding.encode('\n===\n'.join(relevant_texts))) > 10000:
            break
        relevant_texts.append(''.join(macbeth[macbeth_ranked[i][0]]['lines']))
        relevant_texts.append(''.join(hamlet[hamlet_ranked[i][0]]['lines']))

    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-16k",
        messages=[
            {"role": "system", "content": "You are a literature assistant that answers questions based on excerpts."},
            {"role": "user", "content": """Zetsubou is the concept of hopelessness and despair, combined with a surrender to whim of one's circumstances.
    Consider the following passages:
    %s
    ====
    Answer this question: %s
    Be detailed and complete in your analysis. Make references to specific passages where helpful.
    Answer only based on passages provided.
    """ % (
                '\n===\n'.join(relevant_texts),
                user_question
            )},
        ]
    )

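The result is a very well-written and cogent but slightly overextended answer focusing on a few scenes from Hamlet. Macbeth is not mentioned at all in this answer, because none of its scenes passed the embedding search. Looking at the embeddings, it is very clear that the semantic meaning of "zetsubou" was not captured properly, and therefore relevant texts could not be retrieved from it.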

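In QE-RAG, we are able to inject the definition for the new term at the summarization stage, dramatically improving the quality of the text accessible to the system: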

    def summarize_for_question(text, question, location, context=''):
        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo-16k",
            messages=[
                {"role": "system", "content": "You are a literature assistant that provides helpful summaries."},
                {"role": "user", "content": """%s
    Is the following excerpt from %s relevant to the following question? %s

    ===
    %s
    ===
    If so, summarize the sections that are relevant. Include references to specific passages that would be useful.
    If not, simply say: \"nothing is relevant\" without additional explanation""" % (
                    context, location, question, text
                )},
            ]
        )
        return completion


    questions = [
        "How do characters deal with Zetsubou in Shakespearean plays?"
    ]

    # The definition of the new term is passed in as extra context.
    summarize_for_question(
        ''.join(scene['lines']),
        questions[0],
        "Shakespeare's Macbeth",
        "Zetsubou is the concept of hopelessness and despair, combined with a surrender to whim of one's circumstances."
    )

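Use this summarization prompt with the passages, and you will see summaries that are extremely accurate and contextual. These summaries can then be used in the subsequent QE-RAG.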
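If you have more than one novel term, typing each definition by hand gets tedious. One possible approach, sketched below, is to keep a small glossary and build the `context` string from whichever terms appear in the question. `GLOSSARY` and `context_for_question` are hypothetical names; the definition text is the one used in the prompt above.

```python
GLOSSARY = {
    # Definition copied from the example prompt in this article.
    "zetsubou": "Zetsubou is the concept of hopelessness and despair, "
                "combined with a surrender to whim of one's circumstances.",
}


def context_for_question(question, glossary=GLOSSARY):
    """Return the definitions of all glossary terms mentioned in the question."""
    lowered = question.lower()
    return " ".join(
        definition for term, definition in glossary.items() if term in lowered
    )


context = context_for_question("How do characters deal with Zetsubou in Shakespearean plays?")
print(context)
print(repr(context_for_question("How does Shakespeare use supernatural elements?")))
```

The returned string could be passed as the `context` argument of `summarize_for_question`. Questions that use no glossary terms get an empty string, which leaves the prompt effectively unchanged.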

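Further, even though ada-002 does not know the precise meaning of "zetsubou", the presence of this word in both questions makes the question embeddings extremely similar.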

Thus, if we use QE-RAG to answer the original question about Zetsubou, we get the following answer:

 Based on the passages provided, Shakespearean characters cope with Zetsubou, or hopelessness and despair, in different ways. In "Macbeth," Lady Macbeth copes with Zetsubou through guilt and anguish. She is consumed by the guilt of her and her husband's actions, and this guilt manifests in her sleepwalking and attempts to wash her hands of imagined blood stains. This can be seen in her utterances of "Out, damned spot!" and her preoccupation with the blood on her hands. Lady Macbeth's coping mechanism is to face the consequences of her actions head-on, even if it means descending into madness. The Doctor and Gentlewoman observe her behavior and acknowledge that her condition requires divine intervention. In "Hamlet," the excerpt does not provide specific passages or examples of characters coping with Zetsubou. However, it does address Hamlet's inner turmoil and struggle to cope with his circumstances. Hamlet's state of mind, confusion, and madness are a reflection of despair. The characters in this scene are attempting to understand Hamlet's condition and find ways to deal with it, but the excerpt does not provide details on how they cope with Zetsubou. In "Macbeth," Macbeth himself copes with Zetsubou by surrendering to his dire circumstances. He becomes desensitized to fear and horror, indicating a kind of resignation to his grim circumstances. He acknowledges that there is no escape from his fate and chooses to face it head-on, even though he knows it will lead to his downfall. Macbeth's reaction to the news of his wife's death and his subsequent soliloquy reflect a deep despair and hopelessness. He sees life as meaningless, a "tale Told by an idiot, full of sound and fury, Signifying nothing." Macbeth's surrender to his circumstances and his belief in the futility of life demonstrate his coping mechanism for Zetsubou. In "Macbeth," Lady Macduff copes with Zetsubou by expressing her feelings of despair and hopelessness. 
She feels betrayed and abandoned by her husband, who has fled and left their family behind. She questions his actions and believes that they show a lack of love and wisdom. Lady Macduff experiences a sense of abandonment, which aligns with the concept of Zetsubou. She tries to make her son understand the severity of their circumstances, but he questions the definition of a traitor. Lady Macduff ultimately meets a tragic end, indicating that her coping mechanism for Zetsubou is not successful. In summary, Shakespearean characters cope with Zetsubou in different ways. Lady Macbeth copes through guilt and faces the consequences of her actions, even if it means descending into madness. Macbeth copes through surrendering to his dire circumstances and embracing his fate. Lady Macduff copes through expressing her feelings of despair and hopelessness but ultimately meets a tragic end. These characters' coping mechanisms reflect different responses to Zetsubou in Shakespearean plays

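Even though the output still struggles a bit with the concept, it should be apparent that QE-RAG is much more able to engage with the meat of the question and draw from a variety of sources.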

And with that, we have implemented QE-RAG and demonstrated its use case.

Conclusion

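In today's article, we explored the increasingly popular RAG architecture and its limitations. We then extended the RAG architecture with a new architecture called QE-RAG, which seeks to make fuller use of the capabilities of large language models. In addition to improved accuracy and contextual access, QE-RAG allows the whole system to grow as it interacts with users and becomes more familiar with the kinds of questions being asked, allowing firms to develop unique intellectual property on top of open-source or commercially available LLMs.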

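Of course, as an experimental idea, QE-RAG is not perfect. If you have ideas on how this architecture can be improved, or simply want to have a discussion about LLM technologies, please do not hesitate to reach out to me through my Github or LinkedIn.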