As the LLM revolution takes shape, the hype has given way to commercial development. With the initial wave of excitement subsiding, generative AI is no longer seen as an omniscient black box, but rather as a component, albeit an extremely powerful one, in the engineer's arsenal. As a result, entrepreneurs and technologists now have an increasingly mature set of tools and techniques for building LLM applications.
One of the most interesting use cases for LLMs is in the field of knowledge management. Specialized LLMs based on OpenAI's GPT technology or on open-source models such as LLaMa 2 and Flan-T5 are being used in clever ways to manage large amounts of data. Organizations with large text datasets that previously had to rely on text search techniques like fuzzy matching or full-text indexing now have access to powerful systems that can not only find information but also summarize it in a time-saving, reader-friendly fashion.
Within this use case, the Retrieval-Augmented Generation (RAG) architecture has emerged as a standout, offering remarkable flexibility and performance. With this architecture, organizations can quickly index a body of work, run semantic queries against it, and generate informative, convincing answers to user-defined questions based on the corpus. Several companies and services have sprung up to support implementations of the RAG architecture, underscoring its staying power.
As effective as RAG can be, the architecture has several real limitations. In this article, we will explore the RAG architecture, identify its limitations, and propose an improved architecture that addresses them.
As with all my other articles, I'm looking to connect with other technologists and AI enthusiasts. If you have ideas on how this architecture could be improved, or have thoughts about AI that you'd like to discuss, please don't hesitate to reach out! You can find me on GitHub or LinkedIn; the links are in my profile as well as at the bottom of this article.
With names like RAG, Flan, and LLaMa, the AI community is unlikely to win any awards for futuristic, stylish naming anytime soon. The RAG architecture, however, certainly deserves an award for combining two extremely powerful techniques in LLM development: contextual document embeddings and prompt engineering.
At its simplest, the RAG architecture is a system that uses embedding vector search to find the parts of a corpus most relevant to a question, inserts those parts into a prompt, and then uses prompt engineering to ensure that the answer is based on the excerpts given in the prompt. If that sounds a bit confusing, read on, because I will explain each component in turn. I will also include sample code so you can follow along.
First, an effective RAG system requires a powerful embedding model. The embedding model converts a natural-language document into a series of numbers, or a "vector", that roughly represents the document's semantic content. Assuming the embedding model is a good one, you can compare the semantic values of two different documents and determine, using vector arithmetic, whether the two documents are semantically similar.
To see this in action, paste the following code into a Python file and run it:
import openai
from openai.embeddings_utils import cosine_similarity

openai.api_key = [YOUR KEY]

EMBEDDING_MODEL = "text-embedding-ada-002"

def get_cos_sim(input_1, input_2):
    embeds = openai.Embedding.create(model=EMBEDDING_MODEL, input=[input_1, input_2])
    return cosine_similarity(embeds['data'][0]['embedding'], embeds['data'][1]['embedding'])

print(get_cos_sim('Driving a car', 'William Shakespeare'))
print(get_cos_sim('Driving a car', 'Riding a horse'))
The code above generates embeddings for the phrases "Driving a car", "William Shakespeare", and "Riding a horse", then compares them to each other using the cosine similarity algorithm. We expect the cosine similarity to be higher when the phrases are semantically similar, so "Driving a car" and "Riding a horse" should be close, while "Driving a car" and "William Shakespeare" should be dissimilar.
You should see that, according to OpenAI's ada-002 embedding model, the phrase "Driving a car" is about 88% similar to "Riding a horse" and about 76% similar to "William Shakespeare". That means the embedding model behaves as we expect. This determination of semantic similarity is the foundation of a RAG system.
The idea of cosine similarity becomes especially powerful when you extend it to comparisons of much larger documents. Take, for example, the powerful "Tomorrow, and tomorrow, and tomorrow" monologue from Shakespeare's Macbeth:
monologue = '''Tomorrow, and tomorrow, and tomorrow,
Creeps in this petty pace from day to day,
To the last syllable of recorded time;
And all our yesterdays have lighted fools
The way to dusty death. Out, out, brief candle!
Life's but a walking shadow, a poor player,
That struts and frets his hour upon the stage,
And then is heard no more. It is a tale
Told by an idiot, full of sound and fury,
Signifying nothing.'''

print(get_cos_sim(monologue, 'Riding a car'))
print(get_cos_sim(monologue, 'The contemplation of mortality'))
You should see that the monologue is only about 75% similar to the idea of "Riding a car" but 82% similar to "The contemplation of mortality".
But we don't have to compare the monologue only with ideas; we can actually compare it with questions. For example:
get_cos_sim('''Tomorrow, and tomorrow, and tomorrow,
Creeps in this petty pace from day to day,
To the last syllable of recorded time;
And all our yesterdays have lighted fools
The way to dusty death. Out, out, brief candle!
Life's but a walking shadow, a poor player,
That struts and frets his hour upon the stage,
And then is heard no more. It is a tale
Told by an idiot, full of sound and fury,
Signifying nothing.''',
    'Which Shakespearean monologue contemplates mortality?')

get_cos_sim('''Full of vexation come I, with complaint
Against my child, my daughter Hermia.
Stand forth, Demetrius. My noble lord,
This man hath my consent to marry her.
Stand forth, Lysander. And my gracious Duke,
This man hath bewitch'd the bosom of my child.
Thou, thou, Lysander, thou hast given her rhymes,
And interchanged love-tokens with my child:
Thou hast by moonlight at her window sung
With feigning voice verses of feigning love,
And stol'n the impression of her fantasy
With bracelets of thy hair, rings, gauds, conceits,
Knacks, trifles, nosegays, sweetmeats (messengers
Of strong prevailment in unharden'd youth):
With cunning hast thou filch'd my daughter's heart,
Turn'd her obedience, which is due to me,
To stubborn harshness. And, my gracious Duke,
Be it so she will not here, before your Grace,
Consent to marry with Demetrius,
I beg the ancient privilege of Athens:
As she is mine, I may dispose of her;
Which shall be either to this gentleman,
Or to her death, according to our law
Immediately provided in that case.''',
    'Which Shakespearean monologue contemplates mortality?')
You should see that the embeddings indicate the Macbeth monologue is contextually much closer to the question "Which Shakespearean monologue contemplates mortality?" than Egeus's monologue, which does mention death but does not directly grapple with the concept of mortality.
Now that we have embeddings, how do we use them in a RAG system? Well, suppose we want to give our RAG system knowledge of all of Shakespeare's monologues so that it can answer questions about Shakespeare. In that case, we would download all of Shakespeare's monologues and generate embeddings for them. If you are following along, you can generate an embedding like this:
embedding = openai.Embedding.create(model=EMBEDDING_MODEL, input=[monologue])['data'][0]['embedding']
Once we have the embeddings, we will want to store them in a way that allows us to query them and compare them against new embeddings. Normally we would put them into what's called a vector database, a specialized data store that makes comparing two vectors fast. However, unless your corpus is extremely large, brute-force comparison is surprisingly workable for most non-production, experimental use cases where performance isn't critical.
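For experimentation, a plain JSON file on disk is often enough. Below is a minimal sketch of that approach; the file name and helper functions are my own illustration rather than part of any particular library:

import json

def save_embeddings(records, path='monologue_embeddings.json'):
    # `records` is a list of [text, embedding] pairs, matching the structure used below.
    with open(path, 'w') as fp_write:
        json.dump(records, fp_write)

def load_embeddings(path='monologue_embeddings.json'):
    with open(path) as fp_read:
        return json.load(fp_read)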
Whether or not you choose to use a database, you will want to build a system that can find the items in your corpus that best fit a question. In our example, we want the ability to find the monologue most relevant to the user's current question. You might do something like this:
monologues_embeddings = [
    ['Tomorrow, and tomorrow, and tomorrow...', [...]],  # text in the left position, embedding in the right position
    ['Full of vexation come I...', [...]],
    ...  # More monologues and their embeddings as you see fit.
]

def lookup_most_relevant(question):
    embed = openai.Embedding.create(model=EMBEDDING_MODEL, input=[question])['data'][0]['embedding']
    top_monologue = sorted(monologues_embeddings, key=lambda x: cosine_similarity(embed, x[1]), reverse=True)[0]
    return top_monologue

lookup_most_relevant("How does Macbeth evaluate his life when he is confronted with his mortality?")
If you run this example, you should see the Macbeth monologue selected, with a similarity of roughly 82% to the question.
The last step in the RAG model is prompt engineering. In our case, this isn't too difficult. Now that we have the monologue in hand, we can build a prompt by simply embedding the monologue in our query and then asking the question:
completion = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": """
Consider the following monologue:
Tomorrow, and tomorrow, and tomorrow,
Creeps in this petty pace from day to day,
To the last syllable of recorded time;
And all our yesterdays have lighted fools
The way to dusty death. Out, out, brief candle!
Life's but a walking shadow, a poor player,
That struts and frets his hour upon the stage,
And then is heard no more. It is a tale
Told by an idiot, full of sound and fury,
Signifying nothing.
How does Macbeth evaluate his life when he is confronted with his mortality?
"""},
    ]
)
You should see an answer like the following:
In the well-known "Tomorrow and tomorrow and tomorrow" monologue from Shakespeare's Macbeth, the titular character plunges into existential despair when confronted with his impending death. Macbeth evaluates his life as hollow, futile, and meaningless when facing his mortality. He sees it as a "walking shadow" and himself as a "poor player," highlighting his view of life as a fleeting, contrived performance with no lasting substance or impact. The "brief candle" is a metaphor for his own mortality, suggesting that life's flame can be extinguished suddenly and unceremoniously. By stating "It is a tale told by an idiot, full of sound and fury, signifying nothing," Macbeth is expressing his belief that life, despite all its noise and action, is ultimately empty, absurd, and void of meaning. Overall, Macbeth's examination of his life is profoundly pessimistic, revealing his deep despair and cynicism.
Of course, this particular example isn't the most powerful demonstration of the RAG architecture, since most GPT models are already aware of Shakespeare's monologues and have been trained on the large body of Shakespeare analysis publicly available on the internet. In fact, if you ask GPT-4 this exact question without embedding the monologue, you will likely get a very good answer, though it will probably make fewer references to the text of the monologue itself. It should be apparent, however, that in a commercial setting this technique can be cross-applied to proprietary or esoteric datasets that existing GPT implementations cannot access.
In fact, readers familiar with my previous article, Building a Document Analyzer with ChatGPT, Google Cloud, and Python, may recognize that the last part of this technique is very similar to the prompt engineering I did in that article. Extending that idea, we can easily imagine a RAG system built on top of Japanese government publications (the sample data from that article) that would let users search for and ask questions about Japanese economic policy. The system would quickly retrieve the most relevant documents, summarize them, and produce an answer grounded in deep domain-specific knowledge that is unavailable to the base GPT model. This power and simplicity is precisely why the RAG architecture is getting so much attention among LLM developers.
Now that we have covered the RAG architecture, let's explore some of its shortcomings.
Because many RAG systems rely on document embeddings and vector search to connect questions with relevant documents, the whole system is often only as good as the embedding model being used. OpenAI's embedding model is very flexible, and there are many techniques for tuning embeddings. LLaMa, Meta's open-source competitor to GPT, offers embedding models that can be fine-tuned. However, there is an unavoidable black-box aspect to embedding models. This is somewhat manageable when comparing short text strings against longer documents; in our earlier example, we had to trust, to some extent, that the embedding lookup would connect "mortality" with the "Tomorrow, and tomorrow, and tomorrow" monologue. That can be quite uncomfortable for workloads where transparency and debuggability are essential.
Another limitation of the RAG model is the relatively limited amount of context that can be passed to it. Because the embedding model needs document-level context to work well, we have to be careful when splitting up the corpus for embedding. The Macbeth monologue may have an 82% similarity to the question about mortality, but that number drops to 78% when you compare the question against an embedding of only the first two lines of the monologue: "Tomorrow, and tomorrow, and tomorrow, / Creeps in this petty pace from day to day, / To the last syllable of recorded time."
As a result, the context passed into the RAG prompt needs to be fairly large. Currently, most high-context GPT models are still limited to around 16,000 tokens. That is quite a lot of text, but when you're working with long interview transcripts or context-rich articles, you will be limited in how much context you can supply in the final generation prompt.
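To make that trade-off concrete, here is a minimal sketch of the kind of chunking this forces on you: pieces large enough to embed meaningfully, yet small enough that several of them fit into a 16,000-token prompt. The chunk size and overlap are arbitrary values chosen for illustration only:

import tiktoken

encoding = tiktoken.encoding_for_model("gpt-3.5-turbo-16k")

def chunk_document(text, max_tokens=1500, overlap=200):
    # Split a long document into overlapping, token-bounded chunks for embedding.
    tokens = encoding.encode(text)
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(encoding.decode(tokens[start:start + max_tokens]))
        start += max_tokens - overlap
    return chunks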
A final limitation of the RAG model is its inability to work with novel terminology. People who work in a specific field tend to develop jargon and ways of speaking that are unique to that field. When those terms are not present in the embedding model's training data, the lookup process suffers.
For example, the ada-002 embedding model may not know that "the Rust programming language" is related to "LLVM". In fact, it returns a relatively low cosine similarity of 78%. That means documents discussing LLVM may not show strong similarity to a query about Rust, even though the two ideas are closely related in real life.
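You can reproduce this with the get_cos_sim helper from the first code sample; the exact score you see may differ slightly from the figure quoted above:

# Reuses the get_cos_sim helper defined in the first code sample.
print(get_cos_sim('The Rust programming language', 'LLVM'))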
Often, the problem of new terminology can be overcome with a bit of prompt engineering, but that is relatively difficult to do in the context of an embedding search. As mentioned earlier, fine-tuning the embedding model is possible, but teaching the embedding model the new terminology in all contexts can be error-prone and time-consuming.
Given these limitations, I would like to propose a modified architecture that enables a new class of RAG systems which sidestep many of the limitations described above. The idea is based on performing the vector search against commonly asked questions in addition to the corpus, and using an LLM to pre-process the corpus in the context of those questions. If that sounds complicated, don't worry; in this section we will walk through the implementation details, along with code examples you can use to follow along.
One thing to note is that QE-RAG should be run alongside a vanilla RAG implementation so that it can fall back on the other implementation when needed. As the implementation matures, it should need the fallback less and less, but QE-RAG is still intended to be an enhancement to, rather than a replacement for, the vanilla RAG architecture.
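A minimal sketch of what that fallback might look like; the helper functions and the 0.7 threshold here are hypothetical placeholders for illustration, not part of the implementation below:

SIMILARITY_THRESHOLD = 0.7  # hypothetical cutoff; tune against your own question set

def answer_question(user_question):
    # Try the QE-RAG path first: match the user question against the pre-computed questions.
    matched_question, similarity = find_closest_precomputed_question(user_question)
    if similarity >= SIMILARITY_THRESHOLD:
        return answer_with_qe_rag(user_question, matched_question)
    # Otherwise fall back to the vanilla RAG path over raw document embeddings.
    return answer_with_vanilla_rag(user_question)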
The rough flow of the QE-RAG architecture is as follows:
Let's go through each part in turn.
This architecture, much like vanilla RAG, begins with embeddings and a vector database. However, instead of embedding the documents, we embed a series of questions.
To illustrate, suppose we're trying to build an LLM that is an expert on Shakespeare. We might want it to answer questions like:
questions = [
    "How does political power shape the way characters interact in Shakespeare's plays?",
    "How does Shakespeare use supernatural elements in his plays?",
    "How does Shakespeare explore the ideas of death and mortality in his plays?",
    "How does Shakespeare explore the idea of free will in his plays?"
]
We will create embeddings for them like this, saving them for later use:
questions_embed = openai.Embedding.create(model=EMBEDDING_MODEL, input=questions)
Now that we have the questions, we need to download and summarize the corpus. For this example, we will download the HTML versions of Macbeth and Hamlet:
import openai
import os
import requests

from bs4 import BeautifulSoup

plays = {
    'shakespeare_macbeth': 'https://www.gutenberg.org/cache/epub/1533/pg1533-images.html',
    'shakespeare_hamlet': 'https://www.gutenberg.org/cache/epub/1524/pg1524-images.html',
}

if not os.path.exists('training_plays'):
    os.mkdir('training_plays')

for name, url in plays.items():
    print(name)
    file_path = os.path.join('training_plays', '%s.txt' % name)
    if not os.path.exists(file_path):
        res = requests.get(url)
        with open(file_path, 'w') as fp_write:
            fp_write.write(res.text)
We then process the plays into scenes, using the HTML tags as a guide:
with open(os.path.join('training_plays', 'shakespeare_hamlet.txt')) as fp_file:
    soup = BeautifulSoup(''.join(fp_file.readlines()))

headers = soup.find_all('div', {'class': 'chapter'})[1:]
scenes = []
for header in headers:
    cur_act = None
    cur_scene = None
    lines = []
    for i in header.find_all('h2')[0].parent.find_all():
        if i.name == 'h2':
            print(i.text)
            cur_act = i.text
        elif i.name == 'h3':
            print('\t', i.text.replace('\n', ' '))
            if cur_scene is not None:
                scenes.append({
                    'act': cur_act,
                    'scene': cur_scene,
                    'lines': lines
                })
                lines = []
            cur_scene = i.text
        elif (i.text != '' and
              not i.text.strip('\n').startswith('ACT') and
              not i.text.strip('\n').startswith('SCENE')
              ):
            lines.append(i.text)
This is the part that makes QE-RAG unique: instead of creating embeddings for the individual scenes, we create summaries of them, targeted at each of the questions:
def summarize_for_question(text, question, location):
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-16k",
        messages=[
            {"role": "system", "content": "You are a literature assistant that provides helpful summaries."},
            {"role": "user", "content": """Is the following excerpt from %s relevant to the following question? %s
===
%s
===
If so, summarize the sections that are relevant. Include references to specific passages that would be useful.
If not, simply say: \"nothing is relevant\" without additional explanation""" % (
                location, question, text
            )},
        ]
    )
    return completion
This function asks ChatGPT to do two things: 1) determine whether the passage is actually useful for answering the question at hand, and 2) summarize the parts of the scene that are useful for answering the question.
If you try this function on a few pivotal scenes from Macbeth or Hamlet, you'll see that GPT-3.5 is quite good at identifying whether a scene is relevant to a question, and that the summary is quite a bit shorter than the scene itself. This makes it much easier to embed in the prompt-engineering step later on.
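If you want to quantify the savings, a quick token count makes the point. This sketch assumes the scenes list and summarize_for_question function from above; cl100k_base is the encoding used by the GPT-3.5 family:

import tiktoken

encoding = tiktoken.get_encoding('cl100k_base')

# Reuses `scenes`, `questions`, and `summarize_for_question` from the code above.
scene_text = ''.join(scenes[0]['lines'])
summary_text = summarize_for_question(scene_text, questions[2], "Shakespeare's Hamlet").choices[0].message['content']

print('Scene tokens:  ', len(encoding.encode(scene_text)))
print('Summary tokens:', len(encoding.encode(summary_text)))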
Now we can do this for all of the scenes.
for scene in scenes:
    scene_text = ''.join(scene['lines'])
    question_summaries = {}
    for question in questions:
        completion = summarize_for_question(scene_text, question, "Shakespeare's Hamlet")
        question_summaries[question] = completion.choices[0].message['content']
    scene['question_summaries'] = question_summaries
In a production workload, we would put these summaries into a database, but in our case we will simply write them to disk as a JSON file.
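A minimal sketch of that, using the scenes list built above; the file name is just an illustrative choice:

import json

with open('hamlet_summaries.json', 'w') as fp_write:
    json.dump(scenes, fp_write)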
Now suppose we receive a user question like the following:
user_question = "How do Shakespearean characters deal with the concept of death?"
Just as with vanilla RAG, we will want to create an embedding for the question:
uq_embed = openai.Embedding.create(model=EMBEDDING_MODEL, input=[user_question])['data'][0]['embedding']
In vanilla RAG, we would compare the user question embedding against the embeddings of the Shakespeare scenes, but in QE-RAG we compare it against the embeddings of the questions:
print([cosine_similarity(uq_embed, q['embedding']) for q in questions_embed['data']])
We see that the vector search (correctly) identifies question 3 as the most relevant question. Now we retrieve the summary data for question 3:
relevant_texts = []
for scene in hamlet + macbeth:  # hamlet and macbeth are the scene lists from the above code
    if "NOTHING IS RELEVANT" not in scene['question_summaries'][questions[2]].upper() and \
            "NOTHING IN THIS EXCERPT" not in scene['question_summaries'][questions[2]].upper() and \
            'NOTHING FROM THIS EXCERPT' not in scene['question_summaries'][questions[2]].upper() and \
            "NOT DIRECTLY ADDRESSED" not in scene['question_summaries'][questions[2]].upper():
        relevant_texts.append(scene['question_summaries'][questions[2]])
Note that because the GPT summarization is not deterministic, you may get several different strings indicating that a scene is not relevant to the current question. The key is to push only the relevant excerpts into the list of relevant summaries.
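One way to keep that filtering from sprawling is to centralize it in a small helper. This is just a tidier restatement of the loop above, and the marker list is not exhaustive:

IRRELEVANCE_MARKERS = [
    "NOTHING IS RELEVANT",
    "NOTHING IN THIS EXCERPT",
    "NOTHING FROM THIS EXCERPT",
    "NOT DIRECTLY ADDRESSED",
]

def is_relevant(summary):
    # Treat a summary as relevant unless it contains one of the known "not relevant" markers.
    upper = summary.upper()
    return not any(marker in upper for marker in IRRELEVANCE_MARKERS)

relevant_texts = [
    scene['question_summaries'][questions[2]]
    for scene in hamlet + macbeth
    if is_relevant(scene['question_summaries'][questions[2]])
]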
At this stage, we could run a secondary vector search so that only the most relevant summaries are included in the prompt, but given the size of our corpus, we can simply use the entire relevant_texts list in the prompt.
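For a larger corpus, that secondary search might look something like the sketch below; top_n is an arbitrary illustrative value:

# Reuses openai, EMBEDDING_MODEL, and cosine_similarity from the earlier samples.
def select_top_summaries(summaries, question_embedding, top_n=5):
    # Embed each summary and keep only those closest to the user question.
    response = openai.Embedding.create(model=EMBEDDING_MODEL, input=summaries)
    scored = [
        (cosine_similarity(question_embedding, item['embedding']), summary)
        for item, summary in zip(response['data'], summaries)
    ]
    return [summary for _, summary in sorted(scored, key=lambda x: x[0], reverse=True)[:top_n]]

# For a larger corpus you might then do:
# relevant_texts = select_top_summaries(relevant_texts, uq_embed)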
Now that we have all of the excerpts, we can build the prompt like so:
completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-16k",
    messages=[
        {"role": "system", "content": "You are a literature assistant that answers questions based on excerpts."},
        {"role": "user", "content": """Consider the following passages:
%s
====
Answer this question: %s
Be detailed and complete in your analysis. Make references to specific passages where helpful.
Answer only based on passages provided.
""" % (
            '\n===\n'.join(relevant_texts),
            user_question
        )},
    ]
)
Let the API call run, and you should see something like the following:
Based on the passages provided, Shakespearean characters deal with the concept of death in various ways. In Hamlet, for example, the characters contemplate the nature of death and its implications. Hamlet himself reflects on the brevity of life and the corrupting influence of vices. He questions the purpose of existence and contemplates suicide as a release from the burdens of life. In the passage, Hamlet says, "To die, to sleep— / To sleep, perchance to dream. Ay, there's the rub, / For in that sleep of death what dreams may come" (Lines 65-67). This shows his contemplation of the afterlife and the uncertainty surrounding it.

In Macbeth, characters are confronted with the consequences of their actions and the fragility of life. Lady Macbeth, for instance, experiences guilt and remorse for her involvement in the murder of King Duncan. She is haunted by visions of blood on her hands and expresses the desire to escape the consequences of her actions. In the passage, Lady Macbeth says, "Out, damned spot! Out, I say!" (Line 30). This demonstrates her struggle to cope with the guilt and the psychological toll it takes on her.

Furthermore, both Macbeth and Lady Macbeth reflect on the inevitability and the consequences of their actions. In the passage, Macbeth laments the decay of life and the loss of purpose and meaning. He reflects on the transitory nature of existence and the insignificance of worldly achievements. Macbeth states, "Life's but a walking shadow, a poor player / That struts and frets his hour upon the stage" (Lines 217-218). This shows his acceptance of mortality and his realization that material possessions and power do not bring true fulfillment.

In summary, Shakespearean characters deal with the concept of death in various ways. They contemplate the nature of life and the afterlife, struggle with guilt and remorse, and question the meaning and purpose of their existence. Shakespeare's exploration of death and mortality through his characters provides insight into the complexities of human nature and the universal human experience of confronting one's mortality.
This is a very good summary that draws on many different parts of both Hamlet and Macbeth.
For comparison, I created embeddings for the entire scenes and ran a vanilla RAG lookup using the same prompt, finding the top N most relevant scenes until reaching 10,000 tokens:
# Assumes `encoding` is a tiktoken encoding (e.g. tiktoken.get_encoding('cl100k_base')) and that
# `macbeth_embed` and `hamlet_embed` are lists of per-scene embeddings for each play.
relevant_texts = []
for i in range(10):
    if len(encoding.encode('\n===\n'.join(relevant_texts))) > 10000:
        break
    relevant_texts.append(''.join(macbeth[sorted(
        [(idx, cosine_similarity(uq_embed, q)) for idx, q in enumerate(macbeth_embed)],
        key=lambda x: x[1], reverse=True
    )[i][0]]['lines']))
    relevant_texts.append(''.join(hamlet[sorted(
        [(idx, cosine_similarity(uq_embed, q)) for idx, q in enumerate(hamlet_embed)],
        key=lambda x: x[1], reverse=True
    )[i][0]]['lines']))
Feed that into the same prompt, and you should see something like the following:
Based on the provided passages, Shakespearean characters deal with the concept of death in various ways. In Hamlet, the characters reflect on death and mortality extensively. Hamlet himself contemplates suicide, expressing his desire to end his suffering and questioning the purpose of life. He says, "To be, or not to be: that is the question" (Hamlet, Act III, Scene I, Lines 56). This soliloquy reveals his struggle with the concept of death and his contemplation of the consequences of choosing either life or death.

Additionally, in the same play, the appearance of the Ghost prompts discussions about death. The characters question the meaning and significance of encountering the dead, and Hamlet grapples with the idea of mortality in relation to his father's death. He reflects, "Oh God! a beast that wants discourse of reason / Would have mourn'd longer" (Hamlet, Act I, Scene II, Lines 152-153). This passage highlights Hamlet's grief and his pondering of the brevity of life and the transitory nature of existence.

In Macbeth, the characters also confront death and its consequences. Macbeth, after killing King Duncan, reflects on the significance of the act and its implications. He says, "Will all great Neptune's ocean wash this blood Clean from my hand? No, this my hand will rather The multitudinous seas incarnadine" (Macbeth, Act II, Scene II, Lines 77-79). This passage showcases Macbeth's awareness of the guilt associated with murder and his realization that the blood on his hands cannot be cleansed.

Lady Macbeth, on the other hand, shows a different perspective on death. She suppresses her guilt and tries to detach herself from the consequences of their actions. She states, "A little water clears us of this deed" (Macbeth, Act II, Scene II, Line 69). This passage reveals her denial and determination to move forward without remorse.

In conclusion, Shakespearean characters deal with the concept of death in various ways. Some contemplate the meaning of life and question the purpose of existence, while others grapple with the guilt and consequences that come with taking a life. Shakespeare explores the complexities of death and mortality through his characters' introspection and their different responses to the inevitability of death.
This is a very persuasive piece of analysis, but it doesn't touch on many of the most important passages in Hamlet and Macbeth. As you can see, QE-RAG has a distinct advantage: it is able to embed more relevant context than a standard RAG system.
The example above, however, does not demonstrate another advantage of QE-RAG, which is giving developers more control over the embedding process. To see how QE-RAG achieves this, let's look at an extension of this problem: handling new terminology.
QE-RAG really shines when you introduce new terminology. For example, suppose you are introducing a new concept, such as the Japanese word "zetsubou", a term that sits somewhere between hopelessness and despair and specifically conveys a surrender to one's circumstances. It is not as immediately catastrophic as the English concept of despair, but more an acquiescence to the unpleasant things that are happening.
Suppose we want to answer a question like this:
user_question = "How do Shakespearean characters cope with Zetsubou?"
With vanilla RAG, we would run the embedding search and then add an explainer in the final prompt-engineering step:
# As above, this assumes `encoding` is a tiktoken encoding and that `macbeth_embed` and
# `hamlet_embed` are lists of per-scene embeddings for each play.
relevant_texts = []
for i in range(10):
    if len(encoding.encode('\n===\n'.join(relevant_texts))) > 10000:
        break
    relevant_texts.append(''.join(macbeth[sorted(
        [(idx, cosine_similarity(uq_embed, q)) for idx, q in enumerate(macbeth_embed)],
        key=lambda x: x[1], reverse=True
    )[i][0]]['lines']))
    relevant_texts.append(''.join(hamlet[sorted(
        [(idx, cosine_similarity(uq_embed, q)) for idx, q in enumerate(hamlet_embed)],
        key=lambda x: x[1], reverse=True
    )[i][0]]['lines']))

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-16k",
    messages=[
        {"role": "system", "content": "You are a literature assistant that answers questions based on excerpts."},
        {"role": "user", "content": """Zetsubou is the concept of hopelessness and despair, combined with a surrender to whim of one's circumstances.
Consider the following passages:
%s
====
Answer this question: %s
Be detailed and complete in your analysis. Make references to specific passages where helpful.
Answer only based on passages provided.
""" % (
            '\n===\n'.join(relevant_texts),
            user_question
        )},
    ]
)
The result is a well-written, convincing, but somewhat overextended answer that focuses on only a couple of scenes from Hamlet. Macbeth is not mentioned at all in this answer, because none of its scenes made it through the embedding search. Looking at the embeddings, it is clear that the semantic meaning of "zetsubou" was not captured properly, and so relevant texts could not be retrieved on its basis.
With QE-RAG, we can inject the definition of the new term at the summarization stage, dramatically improving the quality of the texts accessible to the system:
def summarize_for_question(text, question, location, context=''):
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-16k",
        messages=[
            {"role": "system", "content": "You are a literature assistant that provides helpful summaries."},
            {"role": "user", "content": """%s
Is the following excerpt from %s relevant to the following question? %s
===
%s
===
If so, summarize the sections that are relevant. Include references to specific passages that would be useful.
If not, simply say: \"nothing is relevant\" without additional explanation""" % (
                context, location, question, text
            )},
        ]
    )
    return completion

questions = [
    "How do characters deal with Zetsubou in Shakespearean plays?"
]

summarize_for_question(''.join(scene['lines']), questions[0], "Shakespeare's Macbeth",
    "Zetsubou is the concept of hopelessness and despair, combined with a surrender to whim of one's circumstances."
)
Use this summarization prompt on the passages, and you will see summaries that are extremely accurate and context-aware. Those summaries can then be used in the rest of the QE-RAG flow.
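To regenerate summaries for the new question across both plays, you would run something like the loop from earlier, now passing the definition as context. This sketch assumes the hamlet and macbeth scene lists and the questions list from the snippet above:

zetsubou_context = ("Zetsubou is the concept of hopelessness and despair, "
                    "combined with a surrender to whim of one's circumstances.")

for location, play_scenes in [("Shakespeare's Hamlet", hamlet), ("Shakespeare's Macbeth", macbeth)]:
    for scene in play_scenes:
        summary = summarize_for_question(''.join(scene['lines']), questions[0], location, zetsubou_context)
        # Store the new summary alongside any summaries generated earlier.
        scene.setdefault('question_summaries', {})[questions[0]] = summary.choices[0].message['content']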
Additionally, the question embeddings will be very similar: even though ada-002 does not know the exact meaning of "zetsubou", the presence of the word in both questions makes the question embeddings extremely similar.
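You can verify this with the get_cos_sim helper from the beginning of the article; the exact score will vary, but it should be very high:

# Reuses the get_cos_sim helper from the first code sample.
print(get_cos_sim(
    'How do Shakespearean characters cope with Zetsubou?',
    'How do characters deal with Zetsubou in Shakespearean plays?'
))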
So, using QE-RAG to answer the original question about zetsubou, we get the following answer:
Based on the passages provided, Shakespearean characters cope with Zetsubou, or hopelessness and despair, in different ways. In "Macbeth," Lady Macbeth copes with Zetsubou through guilt and anguish. She is consumed by the guilt of her and her husband's actions, and this guilt manifests in her sleepwalking and attempts to wash her hands of imagined blood stains. This can be seen in her utterances of "Out, damned spot!" and her preoccupation with the blood on her hands. Lady Macbeth's coping mechanism is to face the consequences of her actions head-on, even if it means descending into madness. The Doctor and Gentlewoman observe her behavior and acknowledge that her condition requires divine intervention.

In "Hamlet," the excerpt does not provide specific passages or examples of characters coping with Zetsubou. However, it does address Hamlet's inner turmoil and struggle to cope with his circumstances. Hamlet's state of mind, confusion, and madness are a reflection of despair. The characters in this scene are attempting to understand Hamlet's condition and find ways to deal with it, but the excerpt does not provide details on how they cope with Zetsubou.

In "Macbeth," Macbeth himself copes with Zetsubou by surrendering to his dire circumstances. He becomes desensitized to fear and horror, indicating a kind of resignation to his grim circumstances. He acknowledges that there is no escape from his fate and chooses to face it head-on, even though he knows it will lead to his downfall. Macbeth's reaction to the news of his wife's death and his subsequent soliloquy reflect a deep despair and hopelessness. He sees life as meaningless, a "tale Told by an idiot, full of sound and fury, Signifying nothing." Macbeth's surrender to his circumstances and his belief in the futility of life demonstrate his coping mechanism for Zetsubou.

In "Macbeth," Lady Macduff copes with Zetsubou by expressing her feelings of despair and hopelessness. She feels betrayed and abandoned by her husband, who has fled and left their family behind. She questions his actions and believes that they show a lack of love and wisdom. Lady Macduff experiences a sense of abandonment, which aligns with the concept of Zetsubou. She tries to make her son understand the severity of their circumstances, but he questions the definition of a traitor. Lady Macduff ultimately meets a tragic end, indicating that her coping mechanism for Zetsubou is not successful.

In summary, Shakespearean characters cope with Zetsubou in different ways. Lady Macbeth copes through guilt and faces the consequences of her actions, even if it means descending into madness. Macbeth copes through surrendering to his dire circumstances and embracing his fate. Lady Macduff copes through expressing her feelings of despair and hopelessness but ultimately meets a tragic end. These characters' coping mechanisms reflect different responses to Zetsubou in Shakespearean plays
The output still struggles a bit with this concept, but it is clear that QE-RAG is much better able to engage with the substance of the question and to draw on a variety of sources.
And with that, we have implemented QE-RAG and demonstrated its use case!
In today's article, we examined the increasingly popular RAG architecture and its limitations. We then extended the RAG architecture with a new architecture called QE-RAG, which aims to make fuller use of the capabilities of large language models. In addition to improved accuracy and contextual access, QE-RAG allows the whole system to evolve as it interacts with users and becomes more familiar with the kinds of questions being asked, allowing companies to develop unique intellectual property on top of open-source or commercially available LLMs.
Of course, as an experimental idea, QE-RAG is not perfect. If you have ideas on how this architecture could be improved, or simply want to discuss LLM technologies, please don't hesitate to drop me a line on my GitHub or LinkedIn.