CANN Triton NPU Inference Backend
ResNet Example Tutorial

The ge-backend builds on the Triton Inference Server framework to connect to the NPU ecosystem and quickly serve traditional CV/NLP models. Project page: https://gitcode.com/cann/triton-inference-server-ge-backend

Model preparation

Download the ONNX file from:
https://media.githubusercontent.com/media/onnx/models/refs/heads/main/validated/vision/classification/resnet/model/resnet18-v1-7.onnx?download=true

Create a directory named 1 under example/resnet and place the downloaded ONNX file in it. The final layout is:

```
example
└── resnet
    ├── 1
    │   └── resnet18-v1-7.onnx
    └── config.pbtxt
```

Run the inference service

Start Triton Inference Server (using the image from AscendHub is recommended):

```
/opt/tritonserver/bin/tritonserver --model-repository {/path/to/example}
```

Once startup completes, the HTTP and gRPC port information appears in the output:

```
I0301 14:17:48.002634 11040 grpc_server.cc:2519] Started GRPCInferenceService at 0.0.0.0:8001
I0301 14:17:48.002913 11040 http_server.cc:4637] Started HTTPService at 0.0.0.0:8000
I0301 14:17:48.044199 11040 http_server.cc:320] Started Metrics Service at 0.0.0.0:8002
```

Client test

Test the service by running client.py:

```
cd example
python client.py
```

On success it prints output like the following:

```
resnetv24_dense0_fwd shape (1, 1000)
resnetv24_dense0_fwd data [[-1.4480009 -0.14706227 0.71502316 0.60883063 1.0058776 1.0106554
 1.0276837 -0.89346164 -0.9704908 -0.7546704 -0.4772439 0.57412636
 -0.39269644 0.37755248 -0.4234915 -0.51555425 -1.4987887 -1.698892 ...
```
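For reference, the steps above can also be expressed as small scripts. The first sketch reproduces the model-preparation step: it downloads the ONNX file and builds the example/resnet/1 layout. The URL and paths are taken from the steps above; adjust them if your checkout differs.

```python
# Sketch of the model-preparation step: download the ONNX file and
# build the example/resnet/1 directory expected by Triton.
import os
import urllib.request

MODEL_URL = ("https://media.githubusercontent.com/media/onnx/models/"
             "refs/heads/main/validated/vision/classification/resnet/"
             "model/resnet18-v1-7.onnx?download=true")
model_dir = os.path.join("example", "resnet", "1")

os.makedirs(model_dir, exist_ok=True)  # create example/resnet/1
urllib.request.urlretrieve(MODEL_URL, os.path.join(model_dir, "resnet18-v1-7.onnx"))
print("model saved to", model_dir)
```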
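Once the server reports the ports shown in the startup log, its readiness can be checked with the standard Triton Python HTTP client. This is a sketch, not part of the example: it assumes the default HTTP port 8000 from the log, that the server process is already running, and that the model is served under the name "resnet" (the directory name), unless config.pbtxt overrides it.

```python
# Sketch: poll the running server until it and the "resnet" model report ready.
import time
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
for _ in range(30):
    if client.is_server_ready() and client.is_model_ready("resnet"):
        print("server and resnet model are ready")
        break
    time.sleep(1)
```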
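Finally, a minimal sketch of what a client such as client.py might do, using the Triton Python HTTP client. The details are assumptions, not taken from the repository: the model name "resnet", and an input tensor named "data" with shape [1, 3, 224, 224] in FP32 as in the ONNX model zoo ResNet18; only the output name "resnetv24_dense0_fwd" comes from the printout above.

```python
# Minimal inference request against the served ResNet model (assumed names/shapes).
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Dummy input instead of a real preprocessed image, just to exercise the service.
data = np.random.rand(1, 3, 224, 224).astype(np.float32)
infer_input = httpclient.InferInput("data", list(data.shape), "FP32")
infer_input.set_data_from_numpy(data)

result = client.infer(
    model_name="resnet",
    inputs=[infer_input],
    outputs=[httpclient.InferRequestedOutput("resnetv24_dense0_fwd")],
)
logits = result.as_numpy("resnetv24_dense0_fwd")
print("resnetv24_dense0_fwd shape", logits.shape)  # expect (1, 1000)
```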