InternLM Camp Session 6 L1-G3000-L1: Intern-S1-mini Local Deployment Practice
Deploying with LMDeploy

1. Choosing the dev machine

On the dev-machine creation page, select the Cuda12.2-conda image and a 30% A100 GPU. Then install the dependencies:

```shell
conda create -n lmdeploy python=3.10 -y
conda activate lmdeploy
pip install lmdeploy==0.9.2.post1 transformers==4.55.2
```

2. Launching the server

```shell
lmdeploy serve api_server /root/share/new_models/Intern-S1-mini \
  --reasoning-parser intern-s1 \
  --tool-call-parser intern-s1 \
  --cache-max-entry-count 0.1 \
  --max-batch-size 8 \
  --backend turbomind \
  --session-len 2048
```

3. Inference

infer2.py:

```python
from openai import OpenAI
import json

messages = [
    {"role": "user", "content": "who are you"},
    {"role": "assistant", "content": "I am an AI"},
    {"role": "user", "content": "AGI is?"},
]

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
# The server exposes exactly one model; take its name from the model list.
model_name = client.models.list().data[0].id
response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    max_tokens=2048,
    extra_body={
        "enable_thinking": False,
    },
)
print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))
```
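The OpenAI SDK above is a thin wrapper over LMDeploy's OpenAI-compatible HTTP endpoint, so the same request can be issued with the standard library alone, which is handy as a smoke test when the `openai` package is not installed. A minimal sketch, assuming the server from step 2 is running on the same port; `build_payload` and `chat` are hypothetical helper names, and `enable_thinking` is placed directly in the body here (the SDK version routes it through `extra_body`):

```python
import json
import urllib.request

API_BASE = "http://0.0.0.0:23333/v1"  # same endpoint as infer2.py


def build_payload(model: str, prompt: str, thinking: bool = False) -> dict:
    """Assemble the chat-completions request body by hand."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.8,
        "top_p": 0.8,
        "max_tokens": 2048,
        # In the SDK version this flag travels inside extra_body;
        # over raw HTTP it is just another top-level field.
        "enable_thinking": thinking,
    }


def chat(prompt: str) -> str:
    # Discover the served model name, mirroring client.models.list()
    with urllib.request.urlopen(f"{API_BASE}/models") as resp:
        model = json.load(resp)["data"][0]["id"]
    req = urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(chat("AGI is?"))
```

If the request hangs or fails with a connection error, check that the `lmdeploy serve api_server` process from step 2 is still running and listening on port 23333.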