DeepSearch: WebThinker Ushers in a New Era of AI Search Research!


1. Project Overview

WebThinker is a deep-research agent that lets LRMs (large reasoning models) autonomously search the web, navigate pages, and write research reports in the middle of their reasoning process. The goal is ambitious: from a single query, let users search, mine, and synthesize the web's vast information in depth, dramatically cutting the time and cost of information gathering for researchers in knowledge-intensive fields such as finance, science, and engineering.

Project repository: https://github.com/RUC-NLPIR/WebThinker

Assessment: rough in its current state. It depends on Microsoft's Azure-side Bing Search, which has to be swapped out for domestic services; the output quality is middling, images and the like are not handled, and it falls far short of ByteDance's DeerFlow.

Existing open-source deep-search agents typically use Retrieval-Augmented Generation (RAG) along predefined workflows, which limits the LRM's ability to explore deeper web content and prevents tight interaction between the LRM and the search engine.

  • Traditional RAG: shallow retrieval only, lacking depth and coherence of reasoning
  • Advanced RAG: predefined workflows (query decomposition, multi-round RAG, etc.), but still inflexible
  • WebThinker: autonomously invokes tools inside one continuous deep-reasoning pass, enabling end-to-end task execution

WebThinker lets the LRM perform actions autonomously within a single generation pass, with no preset workflow, achieving true end-to-end task execution.
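Concretely, this works as a generate-search-continue loop: generation halts when the model emits a special end-of-search-query token, the query is executed, and the results are spliced back into the reasoning trace before generation resumes. A minimal sketch of that loop (token names follow the repo's END_SEARCH_QUERY / END_SEARCH_RESULT constants; generate and search are illustrative stand-ins, not the repo's exact code):

# Minimal sketch of the reasoning-search loop; illustrative only.
BEGIN_SEARCH_QUERY = "<|begin_search_query|>"
END_SEARCH_QUERY = "<|end_search_query|>"
BEGIN_SEARCH_RESULT = "<|begin_search_result|>"
END_SEARCH_RESULT = "<|end_search_result|>"

async def reason_with_search(prompt, generate, search, max_turns=10):
    """Alternate between LLM generation and web search until the model stops asking."""
    trace = prompt
    for _ in range(max_turns):
        # Generation halts when the model emits END_SEARCH_QUERY
        # (cf. stop=[END_SEARCH_QUERY] in generate_response in section 2).
        text = await generate(trace, stop=[END_SEARCH_QUERY])
        trace += text
        if BEGIN_SEARCH_QUERY not in text:
            break  # finished reasoning without requesting another search
        query = (text.rsplit(BEGIN_SEARCH_QUERY, 1)[-1]
                     .replace(END_SEARCH_QUERY, "").strip())
        results = await search(query)
        # Feed retrieved content back into the trace and continue reasoning.
        trace += f"\n{BEGIN_SEARCH_RESULT}{results}{END_SEARCH_RESULT}\n"
    return trace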

2. Replacing the Original LLM

【Replace the LLM call and result parsing】run_web_thinker.py & run_web_thinker_report.py

async def generate_response(
    client: AsyncOpenAI,
    prompt: str,
    semaphore: asyncio.Semaphore,
    generate_mode: str = "chat",
    temperature: float = 0.0,
    top_p: float = 1.0,
    max_tokens: int = 32768,
    repetition_penalty: float = 1.0,
    top_k: int = 1,
    min_p: float = 0.0,
    model_name: str = "QwQ-32B",
    stop: List[str] = [END_SEARCH_QUERY],
    retry_limit: int = 3,
    bad_words: List[str] = [f"{END_SEARCH_RESULT}\n\n{tokenizer.eos_token}"],
) -> Tuple[str, str]:
    """Generate a single response with retry logic"""
    for attempt in range(retry_limit):
        try:
            async with semaphore:
                if generate_mode == "chat":
                    messages = [{"role": "user", "content": prompt}]
                    if 'qwq' in model_name.lower() or 'deepseek' in model_name.lower() or 'r1' in model_name.lower():
                        formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
                    else:
                        formatted_prompt = aux_tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
                    if ('deepseek' in model_name.lower() or 'r1' in model_name.lower()) and "<think>\n" not in formatted_prompt:
                        formatted_prompt = formatted_prompt + "<think>\n"
                else:
                    formatted_prompt = prompt

                response = await client.completions.create(
                    model=model_name,
                    prompt=formatted_prompt,
                    temperature=temperature,
                    top_p=top_p,
                    max_tokens=max_tokens,
                    stop=stop,
                    extra_body={
                        'top_k': top_k,
                        'include_stop_str_in_output': True,
                        'repetition_penalty': repetition_penalty,
                        # 'bad_words': bad_words,
                        # 'min_p': min_p
                    },
                    timeout=3600,
                )
                return formatted_prompt, response.choices[0].text
        except Exception as e:
            print(f"Generate Response Error occurred: {e}, Starting retry attempt {attempt + 1}")
            # print(prompt)
            if "maximum context length" in str(e).lower():
                # If length exceeds limit, reduce max_tokens by half
                max_tokens = max_tokens // 2
                print(f"Reducing max_tokens to {max_tokens}")
            if attempt == retry_limit - 1:
                print(f"Failed after {retry_limit} attempts: {e}")
                return "", ""
            await asyncio.sleep(1 * (attempt + 1))
    return "", ""

Replace with:

async def generate_response(
    client: AsyncOpenAI,
    prompt: str,
    semaphore: asyncio.Semaphore,
    generate_mode: str = "chat",
    temperature: float = 0.0,
    top_p: float = 1.0,
    max_tokens: int = 32768,
    repetition_penalty: float = 1.0,
    top_k: int = 1,
    min_p: float = 0.0,
    model_name: str = "QwQ-32B",
    stop: List[str] = [END_SEARCH_QUERY],
    retry_limit: int = 3,
    bad_words: List[str] = [f"{END_SEARCH_RESULT}\n\n{tokenizer.eos_token}"],
) -> Tuple[str, str]:
    """Generate a single response with retry logic"""
    for attempt in range(retry_limit):
        try:
            async with semaphore:
                if generate_mode == "chat":
                    messages = [{"role": "user", "content": prompt}]
                    if 'qwq' in model_name.lower() or 'deepseek' in model_name.lower() or 'r1' in model_name.lower():
                        formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
                    else:
                        formatted_prompt = aux_tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
                    if ('deepseek' in model_name.lower() or 'r1' in model_name.lower()) and "<think>\n" not in formatted_prompt:
                        formatted_prompt = formatted_prompt + "<think>\n"
                else:
                    formatted_prompt = prompt

                # Swap in DeepSeek's OpenAI-compatible endpoint. Note: this shadows the
                # AsyncOpenAI `client` argument with a synchronous client, so the call
                # below blocks the event loop while it runs.
                client = OpenAI(api_key="sk-*****",
                                base_url="https://api.deepseek.com/beta")
                response = client.chat.completions.create(
                    model="deepseek-chat",
                    temperature=temperature,
                    max_tokens=min(max_tokens, 8192),  # DeepSeek caps max_tokens at 8192 (see section 5)
                    messages=[
                        {"role": "system", "content": formatted_prompt},
                    ],
                    stream=False
                )
                return formatted_prompt, response.choices[0].message.content
        except Exception as e:
            print(f"Generate Response Error occurred: {e}, Starting retry attempt {attempt + 1}")
            # print(prompt)
            if "maximum context length" in str(e).lower():
                # If length exceeds limit, reduce max_tokens by half
                max_tokens = max_tokens // 2
                print(f"Reducing max_tokens to {max_tokens}")
            if attempt == retry_limit - 1:
                print(f"Failed after {retry_limit} attempts: {e}")
                return "", ""
            await asyncio.sleep(1 * (attempt + 1))
    return "", ""
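Because the replacement uses a synchronous client inside an async function, each request blocks the event loop and the semaphore no longer buys real concurrency. It works, but if throughput matters, a small async variant (same endpoint; this is an assumed sketch, not repo code) keeps the call awaitable:

from openai import AsyncOpenAI

# Assumed async variant of the DeepSeek call above; key and base_url as before.
deepseek = AsyncOpenAI(api_key="sk-*****", base_url="https://api.deepseek.com/beta")

async def deepseek_chat(formatted_prompt: str, temperature: float = 0.0) -> str:
    response = await deepseek.chat.completions.create(
        model="deepseek-chat",
        temperature=temperature,
        max_tokens=8192,  # DeepSeek rejects values above 8192 (see section 5)
        messages=[{"role": "user", "content": formatted_prompt}],
        stream=False,
    )
    return response.choices[0].message.content

Next comes the search-result parser. The original, Bing-specific version: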

def extract_relevant_info(search_results):
    """
    Extract relevant information from Bing search results.

    Args:
        search_results (dict): JSON response from the Bing Web Search API.

    Returns:
        list: A list of dictionaries containing the extracted information.
    """
    useful_info = []

    if 'webPages' in search_results and 'value' in search_results['webPages']:
        for id, result in enumerate(search_results['webPages']['value']):
            info = {
                'id': id + 1,  # Increment id for easier subsequent operations
                'title': result.get('name', ''),
                'url': result.get('url', ''),
                'site_name': result.get('siteName', ''),
                'date': result.get('datePublished', '').split('T')[0],
                'snippet': result.get('snippet', ''),  # Remove HTML tags
                # Add context content to the information
                'context': ''  # Reserved field to be filled later
            }
            useful_info.append(info)

    return useful_info

Replace with:

def extract_relevant_info(search_results):
    """
    Extract relevant information from search results.
    Compatible with custom search result JSON (not Bing).

    Args:
        search_results (dict): JSON response containing search results.

    Returns:
        list: A list of dictionaries containing the extracted information.
    """
    useful_info = []

    if 'results' in search_results and isinstance(search_results['results'], list):
        for id, result in enumerate(search_results['results']):
            info = {
                'id': id + 1,
                'title': result.get('title', ''),
                'url': result.get('url', ''),
                'site_name': '',  # no site_name field in this result format
                'date': '',       # no datePublished field in this result format
                'snippet': result.get('content', ''),  # use the content field
                'context': ''
            }
            useful_info.append(info)

    return useful_info
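For reference, this parser expects the flat JSON shape that Tavily-style search returns: a top-level results list whose items carry title, url, and content. A quick sanity check with a hypothetical sample:

# Hypothetical sample in the shape the parser expects.
sample = {
    "results": [
        {"title": "WebThinker", "url": "https://github.com/RUC-NLPIR/WebThinker",
         "content": "A deep research agent..."},
    ]
}
print(extract_relevant_info(sample))
# [{'id': 1, 'title': 'WebThinker', 'url': 'https://github.com/RUC-NLPIR/WebThinker',
#   'site_name': '', 'date': '', 'snippet': 'A deep research agent...', 'context': ''}]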

3. Replacing Bing Search

Bing has since shut down this API, so the original approach no longer works. Modify bing_search.py:

async def bing_web_search_async(query, subscription_key, endpoint, market='en-US', language='en', timeout=20):
    """
    Perform an asynchronous search using the Bing Web Search API.

    Args:
        query (str): Search query.
        subscription_key (str): Subscription key for the Bing Search API.
        endpoint (str): Endpoint for the Bing Search API.
        market (str): Market, e.g., "en-US" or "zh-CN".
        language (str): Language of the results, e.g., "en".
        timeout (int): Request timeout in seconds.

    Returns:
        dict: JSON response of the search results. Returns empty dict if all retries fail.
    """
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key
    }
    params = {
        "q": query,
        "mkt": market,
        "setLang": language,
        "textDecorations": True,
        "textFormat": "HTML"
    }

    max_retries = 5
    retry_count = 0

    while retry_count < max_retries:
        try:
            response = session.get(endpoint, headers=headers, params=params, timeout=timeout)
            response.raise_for_status()
            search_results = response.json()
            return search_results
        except Exception as e:
            retry_count += 1
            if retry_count == max_retries:
                print(f"Bing Web Search Request Error occurred: {e} after {max_retries} retries")
                return {}
            print(f"Bing Web Search Request Error occurred, retrying ({retry_count}/{max_retries})...")
            time.sleep(1)  # Wait 1 second between retries

Replace with:

async def bing_web_search_async(query, subscription_key, endpoint, market='en-US', language='en', timeout=20):
    # Tavily stands in for Bing; the remaining parameters are kept only for
    # signature compatibility with existing call sites.
    client = TavilyClient("tvly-dev-*****")
    # TavilyClient.search() returns a parsed dict directly, so the old
    # raise_for_status()/json() steps from the requests response are dropped.
    search_results = client.search(query=query)
    print("=========", search_results)  # debug: inspect the raw Tavily response
    return search_results
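This assumes the tavily-python package (pip install tavily-python) and a from tavily import TavilyClient at the top of bing_search.py. Conveniently, Tavily's response already has the results/title/url/content shape that the rewritten extract_relevant_info() in section 2 consumes.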

4. Command-Line Startup

【Problem-Solving Mode】

python scripts/run_web_thinker.py \
  --single_question "What is OpenAI Deep Research?" \
  --bing_subscription_key "YOUR_BING_SUBSCRIPTION_KEY" \
  --api_base_url "YOUR_API_BASE_URL" \
  --model_name "QwQ-32B" \
  --tokenizer_path "PATH_TO_YOUR_TOKENIZER" \
  --aux_api_base_url "YOUR_AUX_API_BASE_URL" \
  --aux_model_name "Qwen2.5-32B-Instruct" \
  --aux_tokenizer_path "PATH_TO_YOUR_AUX_TOKENIZER"

Parameter notes (kept out of the command itself, since a comment after a trailing backslash breaks the shell line continuation):

  • --bing_subscription_key: key for the Bing Web Search API used for retrieval augmentation (RAG)
  • --api_base_url: base URL of the main model's API, e.g. http://localhost:8000/v1
  • --model_name: main model name; may be a local or remotely hosted model
  • --tokenizer_path: tokenizer path for the main model, e.g. ./tokenizers/qwq32b/
  • --aux_api_base_url: API base URL of the auxiliary model
  • --aux_model_name: auxiliary model name, used to complement, augment, and check the main model's results
  • --aux_tokenizer_path: tokenizer path for the auxiliary model

【Report-Generation Mode】

python scripts/run_web_thinker_report.py \
    --single_question "What are the models of OpenAI and what are the differences?" \
    --bing_subscription_key "YOUR_BING_SUBSCRIPTION_KEY" \
    --api_base_url "YOUR_API_BASE_URL" \
    --model_name "QwQ-32B" \
    --aux_api_base_url "YOUR_AUX_API_BASE_URL" \
    --aux_model_name "Qwen2.5-32B-Instruct" \
    --tokenizer_path "PATH_TO_YOUR_TOKENIZER" \
    --aux_tokenizer_path "PATH_TO_YOUR_AUX_TOKENIZER"

5. Web Demo Startup

cd demo
streamlit run run_demo.py

【Error】There is no current event loop in thread 'ScriptRunner.scriptThread'.

【Fix】Modify bing_search.py around line 457 to disable the lock (note that any code path still doing `async with self.lock` will then need a None guard):

self.lock = asyncio.Lock()
👇
self.lock = None
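An alternative that keeps the lock is to give Streamlit's script thread an event loop before any asyncio object is created, a common workaround for this error (placing it near the top of run_demo.py is an assumption, not the repo's code):

import asyncio

# Streamlit runs scripts in a worker thread that has no event loop by default;
# create one before any asyncio.Lock() / asyncio.run() calls.
try:
    asyncio.get_event_loop()
except RuntimeError:
    asyncio.set_event_loop(asyncio.new_event_loop())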

【Error】Generate Response Error occurred: Connection error

【Fix】Modify _load_client() in settings.py:

def _load_client(self, api_base_url, aux_api_base_url):
    self.client = AsyncOpenAI(
        api_key="empty",
        base_url=api_base_url,
    )
    self.aux_client = AsyncOpenAI(
        api_key="empty",
        base_url=aux_api_base_url,
    )
👇
def _load_client(self, api_base_url, aux_api_base_url):
    # Both clients now point at DeepSeek. Note this is the synchronous OpenAI
    # class, so it must be imported: from openai import OpenAI
    client = OpenAI(api_key="sk-***",
                    base_url="https://api.deepseek.com/beta")
    self.client = client
    self.aux_client = client

Also point the model names in bing_search.py at DeepSeek:

# bing_search.py
use_model_name='QwQ-32B',
aux_model_name='Qwen2.5-72B-Instruct',
👇
use_model_name='deepseek-chat',
aux_model_name='deepseek-chat',

【Error】Invalid max_tokens value, the valid range of max_tokens is [1, 8192]

【Fix】Modify bing_search.py:

response = await client.completions.create(
        model=env.aux_model_name,
        max_tokens=4096,
        prompt=prompt,
        timeout=3600,
    )
👇
# Sync DeepSeek client now: chat endpoint, no await; max_tokens is omitted so
# the API default (within DeepSeek's [1, 8192] range) applies.
response = client.chat.completions.create(
        model=env.aux_model_name,
        messages=[
            {"role": "system", "content": prompt},
        ],
    )
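The same 8192-token ceiling applies at every DeepSeek call site, so instead of deleting the parameter you can clamp it once wherever max_tokens is forwarded:

# Defensive clamp for any forwarded max_tokens value; 8192 is DeepSeek's ceiling.
DEEPSEEK_MAX_TOKENS = 8192
max_tokens = min(max_tokens, DEEPSEEK_MAX_TOKENS)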

【Error】Generate Response Error occurred: Error code: 400 - {'error': {'message': 'Invalid max_tokens value, the valid range of max_tokens is [1, 8192]', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_request_error'}}, Starting retry attempt 1

【Fix】Modify generate_response() in run_logit.py:

async def generate_response(
    client: AsyncOpenAI,
    prompt: str,
    temperature: float = 0.0,
    top_p: float = 1.0,
    max_tokens: int = 4096,
    repetition_penalty: float = 1.0,
    top_k: int = 1,
    min_p: float = 0.0,
    model_name: str = "QwQ-32B",
    stop: List[str] = ["<|end_search_query|>"],
    retry_limit: int = 3,
):
    """Generate a streaming response with retry logic"""
    for attempt in range(retry_limit):
        try:
            client = OpenAI(api_key="sk-****",
                            base_url="https://api.deepseek.com/beta")
            response = client.chat.completions.create(
                model="deepseek-chat",
                temperature=temperature,
                messages=[
                    {"role": "system", "content": prompt},
                ],
                stream=True
            )
            
            async for chunk in response:
                if chunk.choices[0].message.content:
                    yield chunk.choices[0].message.content
            return  

        except Exception as e:
            print(f"Generate Response Error occurred: {e}, Starting retry attempt {attempt + 1}")
            if attempt == retry_limit - 1:
                print(f"Failed after {retry_limit} attempts: {e}")
            await asyncio.sleep(0.5 * (attempt + 1))
    
    yield ""  

【Error】Generate Response Error occurred: 'async for' requires an object with __aiter__ method, got Stream, Starting

【Fix】The synchronous client returns a Stream object, which does not implement __aiter__, so `async for` fails. Modify generate_response() in run_logit.py to read the response in one piece (this also requires stream=False in the create() call above):

async for chunk in response:
    if chunk.choices[0].message.content:
        yield chunk.choices[0].message.content
return
👇
# also change stream=True to stream=False in the create() call
yield response.choices[0].message.content
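If streaming output in the demo is worth keeping, the cleaner route is the async client, whose stream does support `async for`; note that streamed chunks expose delta, not message. A sketch under the same endpoint assumptions as above:

from openai import AsyncOpenAI

async def stream_deepseek(prompt: str):
    """Yield DeepSeek output incrementally using the async client."""
    client = AsyncOpenAI(api_key="sk-*****", base_url="https://api.deepseek.com/beta")
    stream = await client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    async for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # delta is None for role-only and final chunks
            yield delta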
