Paper Reading: RAFT

Paper

Contributions

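Proposes a novel method for fine-tuning large language models for one specific RAG setting: RAG over a known, well-defined document domain;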

Builds on chain-of-thought (CoT) to further strengthen the model's reasoning ability;

Background

Current open-domain dialogue models take a "closed-book exam";

RAG takes an "open-book exam", and its results depend on the quality of the retriever;

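RAFT targets the "domain-specific open-book exam": the model first learns, on documents from the target domain, how to take an "open-book exam", which improves its RAG ability and also reduces hallucination;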

Experiments

Method

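Notation: $Q$ is the question; $A^{*}$ is the answer; $D^{*}$ is the document the answer comes from; $D_1, \dots, D_k$ are irrelevant documents.

Positive sample (the source document is in the context): $Q + D^{*} + D_1 + D_2 + \dots + D_k \Rightarrow A^{*}$

Negative sample (only irrelevant documents are in the context):
$Q + D_1 + D_2 + \dots + D_k \Rightarrow A^{*}$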

The paper runs full-parameter SFT on a dataset built from these samples.
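The construction of positive and negative samples can be sketched as follows. This is only an illustration, not the paper's code: the function name `build_raft_example`, the prompt layout, and the `p_oracle` parameter (the share of samples that keep the answer-source document) are assumptions introduced here, and the default value is arbitrary.

```python
import random


def build_raft_example(question, oracle_doc, distractors, answer,
                       k=3, p_oracle=0.8, rng=None):
    """Build one training sample (hypothetical helper, not from the paper).

    With probability p_oracle the context holds the answer-source document
    plus k irrelevant documents (positive sample); otherwise it holds only
    the k irrelevant documents (negative sample). The target is the answer
    in both cases.
    """
    if rng is None:
        rng = random.Random()
    # Pick k irrelevant documents; requires k <= len(distractors).
    docs = rng.sample(distractors, k)
    has_oracle = rng.random() < p_oracle
    if has_oracle:
        docs.append(oracle_doc)
    # Shuffle so the source document has no fixed position in the context.
    rng.shuffle(docs)
    context = "".join(f"[{doc}]" for doc in docs)
    return {
        "prompt": f"Question: {question}\ncontext: {context}",
        "completion": answer,
        "has_oracle": has_oracle,
    }
```

Setting `p_oracle=1.0` yields only positive samples and `p_oracle=0.0` only negative ones; anything in between mixes the two kinds in a single SFT dataset.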

Dataset

Question: The Oberoi family is part of a hotel company that has a head office in what city?

context: [The Oberoi family is an Indian family that is famous for its involvement in hotels, namely through The Oberoi Group]...[It is located in city center of Jakarta, near Mega Kuningan, adjacent to the sister JW Marriott Hotel. It is operated by The Ritz-Carlton Hotel Company. The complex has two towers that comprises a hotel and the Airlangga Apartment respectively]...[The Oberoi Group is a hotel company with its head office in Delhi.]

Instruction: Given the question, context and answer above, provide a logical reasoning for that answer. Please use the format of: ##Reason: {reason} ##Answer: {answer}.

CoT Answer: ##Reason: The document ##begin_quote## The Oberoi family is an Indian family that is famous for its involvement in hotels, namely through The Oberoi Group. ##end_quote## establishes that the Oberoi family is involved in the Oberoi group, and the document ##begin_quote## The Oberoi Group is a hotel company with its head office in Delhi. ##end_quote## establishes the head office of The Oberoi Group. Therefore, the Oberoi family is part of a hotel company whose head office is in Delhi. ##Answer: Delhi

The Answer field in the dataset shows that the prompt steers the model to respond in chain-of-thought style: first the Reason, then the Answer.

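The paper uses ##begin_quote## and ##end_quote## to mark passages copied verbatim from the context; the researchers found this an effective way to keep the model from hallucinating and to keep it anchored to the provided context.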
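A minimal sketch of how such an answer could be parsed and its quotes checked against the context. The helper names `parse_cot_answer` and `unsupported_quotes` are hypothetical. The exact substring check is also an assumption: it is strict, so a quote that differs from the context by a single character (for example a trailing period) is reported as unsupported.

```python
import re

# Matches the text between the quote markers used in the CoT answers.
QUOTE_RE = re.compile(r"##begin_quote##(.*?)##end_quote##", re.DOTALL)
ANSWER_RE = re.compile(r"##Reason:(.*?)##Answer:(.*)", re.DOTALL)


def parse_cot_answer(text):
    """Split a '##Reason: ... ##Answer: ...' string into its parts."""
    match = ANSWER_RE.search(text)
    if match is None:
        raise ValueError("expected '##Reason: ... ##Answer: ...'")
    reason = match.group(1).strip()
    answer = match.group(2).strip()
    quotes = [quote.strip() for quote in QUOTE_RE.findall(reason)]
    return {"reason": reason, "answer": answer, "quotes": quotes}


def unsupported_quotes(quotes, context):
    """Return the quotes that do not occur verbatim in the context."""
    return [quote for quote in quotes if quote not in context]
```

An empty list from `unsupported_quotes` means every quoted passage was found in the supplied context as written.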

Results

DSF (Domain-Specific Finetuning): fine-tuning within the specific domain

RAFT: domain-specific fine-tuning + domain-specific RAG

Summary

RAFT fine-tunes the model with chain-of-thought answers on documents from a specific domain

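  • Through domain-specific fine-tuning, the model can learn how to take an "open-book exam";
  • When the in-domain data changes somewhat, the model can still find the answer, and even when no correct answer is available it does not hallucinate;
  • In other words, a model needs dedicated fine-tuning to build its RAG ("open-book exam") ability in a specific domain;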