Introduction: Using RAG to Build Your Own Large Model
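A large language model (LLM) only knows what was in its training data. This leaves three gaps.
First, timeliness: the model's parameters are frozen at the moment training finishes, while real-world knowledge and information keep changing.
Second, non-public data: confidential company documents (such as product designs and business strategy) and personal private data are not part of the training set.
Third, domain expertise: recent advances in a specific vertical domain, or knowledge with a high professional barrier, may not be adequately covered. RAG addresses these gaps by connecting the model to an external knowledge base, which supplements what the model knows and improves the accuracy of its answers.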
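I. RAG Principles in Brief
Put simply, RAG finds the pieces of your data that are relevant to a request and injects them into the prompt before the prompt is sent to the LLM. The LLM then has that information available and can use it in its reply.
The RAG process has two distinct stages: indexing and retrieval. LangChain4j provides tools for both.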
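The injection step described above can be illustrated with plain Java. This is a minimal sketch, not LangChain4j's own prompt handling: the class name `PromptAugmenter`, the method `buildPrompt`, and the prompt wording are all hypothetical and chosen only for illustration.

```java
import java.util.List;

class PromptAugmenter {

    // Builds the augmented prompt: an instruction, the retrieved snippets
    // (numbered), and finally the user's question.
    // The wording of the instruction is an assumption, not a required format.
    static String buildPrompt(String question, List<String> snippets) {
        StringBuilder sb = new StringBuilder();
        sb.append("Answer the question using only the information below.\n\n");
        for (int i = 0; i < snippets.size(); i++) {
            sb.append("[").append(i + 1).append("] ").append(snippets.get(i)).append("\n");
        }
        sb.append("\nQuestion: ").append(question);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Snippets would normally come from the retrieval stage.
        List<String> snippets = List.of(
                "Reservations can be cancelled up to 7 days prior to the start of the booking period.",
                "If the booking period is less than 3 days, cancellations are not permitted.");
        System.out.println(buildPrompt("Can I cancel my booking?", snippets));
    }
}
```

The LLM never sees the whole knowledge base, only the few snippets placed in the prompt, which is why the quality of the retrieval stage largely determines the quality of the answer.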
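1.1 Indexing
The details of this stage depend on the information retrieval method used. For vector search, the text is split into small chunks, each chunk is converted into a vector (embedded), and the vectors are stored in a vector database.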
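The split, embed, and store steps can be sketched in plain Java. This is a toy under stated assumptions: `ToyIndexer` and its methods are hypothetical names, the "embedding" is just a word-count map standing in for a real embedding model, and the "vector database" is an in-memory list. A real setup would use LangChain4j's own splitter, embedding model, and embedding store instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyIndexer {

    // One stored item: the chunk text plus its (toy) vector.
    static final class Entry {
        final String text;
        final Map<String, Integer> vector;

        Entry(String text, Map<String, Integer> vector) {
            this.text = text;
            this.vector = vector;
        }
    }

    // Splits text into chunks of at most maxChars characters;
    // consecutive chunks share `overlap` characters.
    static List<String> split(String text, int maxChars, int overlap) {
        if (maxChars <= 0 || overlap < 0 || overlap >= maxChars) {
            throw new IllegalArgumentException("require 0 <= overlap < maxChars");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxChars - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxChars, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // last chunk reached the end of the text
            }
        }
        return chunks;
    }

    // Toy stand-in for an embedding model: lower-cased word counts.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Indexing: split the document, embed every chunk, keep both in the store.
    static List<Entry> index(String document, int maxChars, int overlap) {
        List<Entry> store = new ArrayList<>();
        for (String chunk : split(document, maxChars, overlap)) {
            store.add(new Entry(chunk, embed(chunk)));
        }
        return store;
    }

    public static void main(String[] args) {
        String doc = "Reservations can be cancelled up to 7 days prior to the start of the booking period. "
                + "If the booking period is less than 3 days, cancellations are not permitted.";
        for (Entry e : index(doc, 60, 10)) {
            System.out.println(e.text.length() + " chars, " + e.vector.size() + " distinct words");
        }
    }
}
```

Splitting by character count can cut a word or sentence in half. The overlap softens this, but splitters that respect paragraph and sentence boundaries generally produce better chunks.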
Below is a simplified diagram of the indexing stage:
1.2 Retrieval
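For vector search, the user's question is first embedded and used to run a similarity search against the vector database. The relevant content (segments of the original documents) is then injected into the prompt and sent to the LLM.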
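The similarity search can be sketched as a cosine-similarity ranking over the stored chunks. As before, this is a toy under stated assumptions: `ToyRetriever` is a hypothetical name, the vectors are word-count maps rather than output of a real embedding model, and the chunks are re-embedded on every query instead of being read from a vector database.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyRetriever {

    // Toy stand-in for an embedding model: lower-cased word counts.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    static double norm(Map<String, Integer> v) {
        double sum = 0;
        for (int c : v.values()) {
            sum += (double) c * c;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|); 0 if either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0 || nb == 0) ? 0.0 : dot / (na * nb);
    }

    // Returns the k chunks most similar to the question, best match first.
    static List<String> retrieve(String question, List<String> chunks, int k) {
        Map<String, Integer> q = embed(question);
        List<String> ranked = new ArrayList<>(chunks);
        ranked.sort(Comparator.comparingDouble((String c) -> cosine(q, embed(c))).reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(k, ranked.size())));
    }

    public static void main(String[] args) {
        List<String> chunks = List.of(
                "All bookings are subject to vehicle availability.",
                "Reservations can be cancelled up to 7 days prior to the start of the booking period.");
        System.out.println(retrieve("Can reservations be cancelled?", chunks, 1));
    }
}
```

Word counts only match identical words, so this toy would miss a question phrased with synonyms. A real embedding model maps text with similar meaning to nearby vectors, which is what makes vector search useful in practice.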
Below is a simplified diagram of the retrieval stage:
II. Hands-On Practice
2.1 Add the Maven dependencies to pom.xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-ollama-spring-boot-starter</artifactId>
    <version>1.0.0-beta3</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-spring-boot-starter</artifactId>
    <version>1.0.0-beta3</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-embeddings-all-minilm-l6-v2</artifactId>
    <version>1.0.0-beta3</version>
</dependency>
Add the following configuration to application.properties:
# Ollama endpoint
langchain4j.ollama.chat-model.base-url=http://127.0.0.1:11868
# Name of the locally running model
langchain4j.ollama.chat-model.model-name=deepseek-r1:14b
langchain4j.ollama.chat-model.log-requests=true
langchain4j.ollama.chat-model.log-responses=true
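One detail worth checking in this file: in the Java .properties format, `#` starts a comment only at the beginning of a line, so a trailing `# ...` after a value becomes part of the value. The sketch below demonstrates this with the standard `java.util.Properties` class. The class name `PropertiesCommentCheck` is hypothetical, and the assumption that Spring Boot's own loader follows the same rule should be verified against its documentation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertiesCommentCheck {

    // Parses .properties content from a string and returns the value for key.
    static String load(String content, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String key = "langchain4j.ollama.chat-model.model-name";
        // Comment on the same line as the value: it is kept as part of the value.
        String inline = key + "=deepseek-r1:14b # local model\n";
        // Comment on its own line: it is ignored.
        String ownLine = "# local model\n" + key + "=deepseek-r1:14b\n";

        System.out.println("[" + load(inline, key) + "]");  // prints [deepseek-r1:14b # local model]
        System.out.println("[" + load(ownLine, key) + "]"); // prints [deepseek-r1:14b]
    }
}
```

If the model name or base URL picks up stray text this way, requests to Ollama are likely to fail, so comments are safest on their own lines.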
2.2 Code example
The following is based on the officially provided example code. The sample document to be split is miles-of-smiles-terms-of-use.txt:
Miles of Smiles Car Rental Services Terms of Use
1. Introduction
These Terms of Service (“Terms”) govern the access or use by you, an individual, from within any country in the world, of applications, websites, content, products, and services (“Services”) made available by Miles of Smiles Car Rental Services, a company registered in the United States of America.
2. The Services
Miles of Smiles rents out vehicles to the end user. We reserve the right to temporarily or permanently discontinue the Services at any time and are not liable for any modification, suspension or discontinuation of the Services.
3. Bookings
3.1 Users may make a booking through our website or mobile application.
3.2 You must provide accurate, current and complete information during the reservation process. You are responsible for all charges incurred under your account.
3.3 All bookings are subject to vehicle availability.
4. Cancellation Policy
4.1 Reservations can be cancelled up to 7 days prior to the start of the booking period.
4.2 If the booking period is less than 3 days, cancellations are not permitted.
5. Use of Vehicle
5.1 All cars rented from Miles of Smiles must not be used:
for any illegal purpose or in connection with any criminal offense.
for teaching someone to drive.
in any race, rally or contest.
while under the influence of alcohol or drugs.
6. Liability
6.1 Users will be held liable for any damage, loss, or theft that occurs during the rental period.
6.2 We do not accept liability for any indirect or consequential loss, damage, or expense including but not limited to loss of profits.
7. Governing Law
These terms will be governed by and construed in accordance with the laws of the United States of America, and any disputes relating to these terms will be subject to the exclusive jurisdiction of the courts of United States.
8. Changes to These Terms
We may revise these terms of use at any time by amending this page. You are expected to