# Phi-3.5-mini-instruct Deployment Tutorial: Complete Environment Setup on Ubuntu 22.04 + RTX 4090 D
## 1. Model Overview

Phi-3.5-mini-instruct is a lightweight open-source instruction-tuned large language model from Microsoft. It performs strongly on long-context code understanding (RepoQA), multilingual MMLU, and other benchmarks, clearly outperforming models of similar size and even rivaling larger models on some tasks. It is particularly well suited to local or edge deployment: on a single RTX 4090 it occupies only about 7 GB of VRAM.

## 2. Environment Preparation

### 2.1 Hardware Requirements

- GPU: NVIDIA GeForce RTX 4090 D (24 GB VRAM)
- VRAM usage: about 7.7 GB
- Storage: at least 15 GB free space

### 2.2 Software Requirements

- Operating system: Ubuntu 22.04 LTS
- CUDA version: 12.1
- Python version: 3.10

## 3. Base Environment Installation

### 3.1 Install the NVIDIA Driver

```bash
sudo apt update
sudo apt install -y nvidia-driver-535
sudo reboot
```

### 3.2 Install the CUDA Toolkit

```bash
wget https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda_12.1.0_530.30.02_linux.run
sudo sh cuda_12.1.0_530.30.02_linux.run
```

### 3.3 Install Miniconda

```bash
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
source ~/.bashrc
```

## 4. Project Deployment

### 4.1 Create a Conda Environment

```bash
conda create -n torch28 python=3.10 -y
conda activate torch28
```

### 4.2 Install Dependencies

```bash
pip install torch==2.8.0+cu121 --index-url https://download.pytorch.org/whl/cu121
pip install transformers==4.57.6 gradio==6.6.0 protobuf==7.34.1
```

### 4.3 Download the Model

```bash
mkdir -p /root/ai-models/AI-ModelScope
cd /root/ai-models/AI-ModelScope
git clone https://huggingface.co/microsoft/Phi-3.5-mini-instruct
```

Note that the weight files are stored via Git LFS, so install it first (`sudo apt install -y git-lfs && git lfs install`); otherwise the clone contains only small pointer files instead of the actual weights.
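The roughly 7.7 GB VRAM figure quoted in section 2 is easy to sanity-check: Phi-3.5-mini has about 3.8 billion parameters, and fp16 stores each parameter in 2 bytes. A quick back-of-the-envelope calculation (the parameter count is approximate):

```python
def fp16_weight_footprint_gb(n_params: float) -> float:
    """Approximate VRAM needed for the model weights alone at fp16 (2 bytes per parameter)."""
    return n_params * 2 / 1024**3

# Phi-3.5-mini has roughly 3.8 billion parameters
weights_gb = fp16_weight_footprint_gb(3.8e9)
print(round(weights_gb, 2))  # ~7.08 GiB for weights alone
```

The remaining headroom up to ~7.7 GB goes to the KV cache, activations, and the CUDA context, so the observed usage is consistent with the model size.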
## 5. Service Configuration

### 5.1 Create the Project Directory

```bash
mkdir -p /root/Phi-3.5-mini-instruct/logs
```

### 5.2 Create the WebUI Script

Create /root/Phi-3.5-mini-instruct/webui.py with the following content:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import gradio as gr
import torch

# Path of the repository cloned in step 4.3
model_path = "/root/ai-models/AI-ModelScope/Phi-3.5-mini-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto"
)

def generate(prompt, max_length=256, temperature=0.3, top_p=0.8, top_k=20, repetition_penalty=1.1):
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(
        **inputs,
        max_length=max_length,
        do_sample=True,   # required for temperature/top_p/top_k to take effect
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
        repetition_penalty=repetition_penalty,
        use_cache=False   # works around the DynamicCache bug (see section 8.1)
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

iface = gr.Interface(
    fn=generate,
    inputs=[
        gr.Textbox(label="Prompt"),
        gr.Slider(32, 1024, value=256, label="Max Length"),
        gr.Slider(0.1, 1.0, value=0.3, label="Temperature"),
        gr.Slider(0.1, 1.0, value=0.8, label="Top P"),
        gr.Slider(1, 100, value=20, label="Top K"),
        gr.Slider(1.0, 2.0, value=1.1, label="Repetition Penalty")
    ],
    outputs="text",
    title="Phi-3.5-mini-instruct Demo"
)

iface.launch(server_name="0.0.0.0", server_port=7860)
```

### 5.3 Configure Supervisor

Create /etc/supervisor/conf.d/phi-3.5-mini-instruct.conf:

```ini
[program:phi-3.5-mini-instruct]
command=/opt/miniconda3/envs/torch28/bin/python /root/Phi-3.5-mini-instruct/webui.py
directory=/root/Phi-3.5-mini-instruct
user=root
autostart=true
autorestart=true
stdout_logfile=/root/Phi-3.5-mini-instruct/logs/phi35.log
stderr_logfile=/root/Phi-3.5-mini-instruct/logs/phi35.err
environment=PATH="/opt/miniconda3/envs/torch28/bin:%(ENV_PATH)s"
```

Reload the Supervisor configuration:

```bash
sudo supervisorctl reread
sudo supervisorctl update
```

## 6. Service Management

### 6.1 Start the Service

```bash
sudo supervisorctl start phi-3.5-mini-instruct
```

### 6.2 Check Service Status

```bash
sudo supervisorctl status phi-3.5-mini-instruct
```

### 6.3 Tail the Logs

```bash
tail -f /root/Phi-3.5-mini-instruct/logs/phi35.log
```
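One caveat about the `generate()` function in webui.py: it passes the user's text to the model verbatim, but instruction-tuned Phi-3 models are trained on a specific chat template built from `<|system|>`, `<|user|>`, and `<|assistant|>` turns, each terminated by `<|end|>`. The reliable way to apply it is `tokenizer.apply_chat_template(...)`; purely as an illustration of what that template looks like, here is a hand-rolled sketch (the exact token layout here is an assumption based on the published Phi-3 format, so prefer the tokenizer's own template in real code):

```python
def format_phi3_prompt(user_msg: str, system_msg: str = "") -> str:
    """Render a single-turn prompt in the Phi-3 chat format.
    Sketch only -- prefer tokenizer.apply_chat_template in production."""
    parts = []
    if system_msg:
        parts.append(f"<|system|>\n{system_msg}<|end|>")
    parts.append(f"<|user|>\n{user_msg}<|end|>")
    parts.append("<|assistant|>\n")  # the model continues generating from here
    return "\n".join(parts)

print(format_phi3_prompt("Hello", system_msg="You are a helpful assistant."))
```

Wrapping prompts this way (or via `apply_chat_template`) generally produces noticeably better instruction-following than raw text.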
## 7. Usage Guide

### 7.1 Access the Web UI

Once the service is running, open http://<server-ip>:7860 in a browser.

### 7.2 API Call Example

```bash
curl -X POST http://localhost:7860/gradio_api/call/generate \
  -H "Content-Type: application/json" \
  -d '{"data":["Hello",256,0.3,0.8,20,1.1]}'
```

### 7.3 Parameter Reference

| Parameter | Default | Description |
|---|---|---|
| max_length | 256 | Maximum generation length |
| temperature | 0.3 | Lower values give more deterministic output |
| top_p | 0.8 | Nucleus sampling probability |
| top_k | 20 | Top-k sampling |
| repetition_penalty | 1.1 | Penalty for repeated tokens |

## 8. Troubleshooting

### 8.1 transformers 5.5.0 Bug

Problem: transformers 5.5.0 has a DynamicCache bug that raises `'DynamicCache' object has no attribute 'seen_tokens'` during generation.

Solutions:

- Downgrade transformers: `pip install transformers==5.0.0`
- Or pass `use_cache=False` to `model.generate()`

### 8.2 GPU Not Being Used

Check whether CUDA is available:

```bash
python -c "import torch; print(torch.cuda.is_available())"
```

### 8.3 Port Conflict

Check whether port 7860 is already in use:

```bash
ss -tlnp | grep 7860
```

## 9. Summary

This tutorial covered the complete steps for deploying the Phi-3.5-mini-instruct model on Ubuntu 22.04 with an RTX 4090 D GPU. The model is lightweight and efficient, making it a good fit for local development and edge-computing scenarios. Through the Gradio web interface, users can interact with the model conveniently, or call the API directly for integration work.

Want more AI images? To explore more prebuilt AI images and application scenarios, visit the CSDN 星图镜像广场 (Star Map Image Marketplace), which offers a rich set of preconfigured images covering LLM inference, image generation, video generation, model fine-tuning, and more, with one-click deployment.
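A note on the curl example in section 7.2: recent Gradio versions expose a two-step HTTP API. The POST returns a small JSON body containing an `event_id`, and the actual result is then fetched with a GET on `.../gradio_api/call/generate/<event_id>`, which streams server-sent events. The endpoint shape described here is an assumption based on Gradio's documented `/call` API; a small helper for parsing both responses might look like this:

```python
import json

def parse_event_id(post_body: str) -> str:
    """Extract the event_id from the JSON body returned by the POST step."""
    return json.loads(post_body)["event_id"]

def parse_sse_result(sse_body: str):
    """Return the payload of the last 'data:' line in the GET step's SSE stream."""
    result = None
    for line in sse_body.splitlines():
        if line.startswith("data:"):
            result = json.loads(line[len("data:"):].strip())
    return result

# Example payloads in the shape Gradio typically returns
event_id = parse_event_id('{"event_id": "abc123"}')
output = parse_sse_result('event: complete\ndata: ["Hello! How can I help?"]\n')
print(event_id, output)
```

So a full round trip is: POST the `{"data": [...]}` payload, read the `event_id`, GET `.../call/generate/<event_id>`, and take the final `data:` line as the generation result.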