Getting Started with NIM: Building a Q&A System on the NVIDIA NIM Platform by Hand

2025/7/11 0:41:05

Steps

Import the dependencies

Get an NVIDIA API key

Launch the project

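Steps

Launch PyCharm. If the Import PyCharm Settings dialog appears, select Do not import settings and click OK (if the dialog does not appear, continue with the next step).

When the Welcome to PyCharm page appears, the first launch has succeeded. Click New Project to create a new project.

Usually you only need to change the project location (use an English name for the project), then click Create.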

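Import the dependencies

After the project is created, right-click the project folder and create a new Python file. Any name works, preferably in English, for example "nim_test.py".

Then right-click the project again and choose Open in > Terminal to open a terminal.

In the terminal window, install the required dependencies with the following command: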

pip install langchain_nvidia_ai_endpoints langchain-community langchain-text-splitters faiss-cpu gradio==3.50.0 setuptools beautifulsoup4

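Notes:

langchain_nvidia_ai_endpoints: the LangChain integration package for NVIDIA AI endpoints. It is not part of the Python standard library; it is published on PyPI as langchain-nvidia-ai-endpoints and provides the NVIDIAEmbeddings and ChatNVIDIA classes imported in the script.
langchain-community: community-maintained integrations for LangChain, a framework for building applications on top of language models. The script uses its WebBaseLoader and FAISS wrappers.
langchain-text-splitters: LangChain's text splitter module, used to break large blocks of text into smaller chunks.
faiss-cpu: Faiss is Facebook's open-source library for efficient similarity search and clustering. The faiss-cpu build runs on the CPU only and has no GPU acceleration.
gradio==3.50.0: Gradio is a library for quickly building user interfaces, which makes it easy to create demos for machine learning models. The version is pinned to 3.50.0 here.
setuptools: a tool for packaging Python projects.
beautifulsoup4: BeautifulSoup4 is a Python library for extracting data from HTML or XML files.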
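
Once the install finishes, it can be useful to confirm that every package in the list above is actually present. The following is a minimal sketch that uses only the standard library's importlib.metadata; the helper name installed_versions is made up for this illustration, and the distribution names are assumed to match the ones passed to pip.

```python
from importlib import metadata

# Distribution names as passed to pip in the install command.
PACKAGES = [
    "langchain-nvidia-ai-endpoints",
    "langchain-community",
    "langchain-text-splitters",
    "faiss-cpu",
    "gradio",
    "setuptools",
    "beautifulsoup4",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any line that prints NOT INSTALLED points to a package that needs to be installed again.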

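Wait for the installation to finish; depending on your network this takes roughly 10 to 30 minutes. If the installation fails because a download hits a network timeout, retry with one of the commands below.

Use a domestic mirror: from inside China, access to the overseas PyPI servers can be slow or unstable. A domestic mirror, such as the ones provided by Alibaba Cloud or Tencent Cloud, speeds up the download. Specify the mirror by adding the -i option to the pip install command, for example: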

pip install -i https://mirrors.aliyun.com/pypi/simple/ langchain_nvidia_ai_endpoints langchain-community langchain-text-splitters faiss-cpu gradio==3.50.0 setuptools beautifulsoup4

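Increase the timeout: you can also give pip more time to finish each download by setting the PIP_DEFAULT_TIMEOUT environment variable, here to 600 seconds (10 minutes). In the Windows Command Prompt, keep the set line free of trailing comments, because cmd stores everything after the equals sign as part of the value:

set PIP_DEFAULT_TIMEOUT=600
pip install langchain_nvidia_ai_endpoints langchain-community langchain-text-splitters faiss-cpu gradio==3.50.0 setuptools beautifulsoup4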

Or pass the timeout directly on the command line:

pip --default-timeout=600 install langchain_nvidia_ai_endpoints langchain-community langchain-text-splitters faiss-cpu gradio==3.50.0 setuptools beautifulsoup4
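
The mirror and timeout options above can also be combined into a single command. The sketch below assembles that command from Python so it runs with the same interpreter as the project; the helper name build_pip_command and the --run switch are assumptions made for this illustration, while the -i and --default-timeout options are the ones shown in the commands above.

```python
import subprocess
import sys

PACKAGES = [
    "langchain_nvidia_ai_endpoints",
    "langchain-community",
    "langchain-text-splitters",
    "faiss-cpu",
    "gradio==3.50.0",
    "setuptools",
    "beautifulsoup4",
]
MIRROR = "https://mirrors.aliyun.com/pypi/simple/"


def build_pip_command(packages, index_url=None, timeout=None):
    """Return the pip install command as an argument list."""
    cmd = [sys.executable, "-m", "pip"]
    if timeout is not None:
        # Global option, so it goes before the "install" subcommand.
        cmd.append(f"--default-timeout={timeout}")
    cmd.append("install")
    if index_url:
        cmd += ["-i", index_url]
    return cmd + list(packages)


if __name__ == "__main__":
    command = build_pip_command(PACKAGES, index_url=MIRROR, timeout=600)
    print(" ".join(command))
    # Only install when explicitly asked, e.g. "python install_deps.py --run".
    if "--run" in sys.argv:
        subprocess.run(command, check=True)
```

Without --run the script only prints the command, so it can be reviewed or pasted into the terminal by hand.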


Next, copy the code below into the Python file created earlier, for example "nim_test.py".

# -*- coding: utf-8 -*-

# Import the required libraries
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings, ChatNVIDIA
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import chain
import os
import gradio as gr
from datetime import datetime
# Hypothetical-answer (HyDE) prompt template.
# For a Chinese hypothetical answer, end the first sentence with "... to the below question in Chinese".
hyde_template = """Even if you do not know the full answer, generate a one-paragraph hypothetical answer to the below question:

{question}"""

# Final-answer prompt template
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""

# Answer a question about the page at the given URL
def process_question(url, api_key, model_name, question):
    # Initialize the loader and load the page
    loader = WebBaseLoader(url)
    docs = loader.load()

    # Expose the API key to the NVIDIA clients through the environment
    os.environ['NVIDIA_API_KEY'] = api_key

    # Initialize the embedding model, split the documents, and build the FAISS index
    embeddings = NVIDIAEmbeddings()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    documents = text_splitter.split_documents(docs)
    vector = FAISS.from_documents(documents, embeddings)
    retriever = vector.as_retriever()

    # Initialize the chat model
    model = ChatNVIDIA(model=model_name)

    # Build the chain that turns a question into a hypothetical answer
    hyde_prompt = ChatPromptTemplate.from_template(hyde_template)
    hyde_query_transformer = hyde_prompt | model | StrOutputParser()

    # Retrieve chunks similar to the hypothetical answer instead of the raw question
    @chain
    def hyde_retriever(question):
        hypothetical_document = hyde_query_transformer.invoke({"question": question})
        return retriever.invoke(hypothetical_document)

    # Define the final answer chain
    prompt = ChatPromptTemplate.from_template(template)
    answer_chain = prompt | model | StrOutputParser()

    @chain
    def final_chain(question):
        documents = hyde_retriever.invoke(question)
        response = ""
        for s in answer_chain.stream({"question": question, "context": documents}):
            response += s
        return response

    # Run the final chain and put a timestamp on its own line before the answer
    return f"{datetime.now()}\n{final_chain.invoke(question)}"

# Models offered in the dropdown
models = ["mistralai/mistral-7b-instruct-v0.2","meta/llama-3.1-405b-instruct"]

# Define the Gradio interface
iface = gr.Interface(
    fn=process_question,
    inputs=[
        gr.Textbox(label="URL of the page to learn from"),
        gr.Textbox(label="NVIDIA API Key"),
        gr.Dropdown(models, label="Language model"),
        gr.Textbox(label="Question")
    ],
    outputs="text",
    title="Web Page Q&A System"
)

# Launch the Gradio UI
iface.launch()
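
The script retrieves context in a slightly unusual way: it first asks the model for a hypothetical answer, then searches the vector store with that answer rather than with the raw question. The sketch below illustrates only that flow, using the standard library. The word-count embed function and the canned fake_llm are toy stand-ins assumed for this illustration, not the NVIDIAEmbeddings or ChatNVIDIA behavior, and the sample chunks are made-up text based on the package notes earlier in this article.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fake_llm(question):
    """Hypothetical stand-in for the chat model: returns a canned hypothetical answer."""
    return "Faiss is a library that performs fast similarity search and clustering of vectors."


def hyde_retrieve(question, chunks, k=1):
    """Rank chunks by similarity to a hypothetical answer, not to the question itself."""
    hypothetical = fake_llm(question)
    query_vec = embed(hypothetical)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "Gradio is a library for quickly building user interfaces.",
    "Faiss is an open-source library for efficient similarity search and clustering.",
    "BeautifulSoup4 extracts data from HTML or XML files.",
]
top = hyde_retrieve("What is Faiss used for?", chunks)
print(top[0])
```

A short question shares few words with the stored chunks, while a paragraph-length hypothetical answer tends to overlap more with the relevant one. That is the motivation for routing retrieval through hyde_query_transformer in the script.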