每一篇文章前后都增加返回目录
回到目录
【评测】Qwen3-Embedding模型初体验
模型的介绍页面
本机配置:八代i5-8265U,16G内存,无GPU核显运行,win10操作系统
ollama可以通过下面命令拉取模型:
ollama pull modelscope.cn/Qwen/Qwen3-Embedding-8B-GGUF
下面直接使用介绍页面的sample代码体验一下模型的威力。
1. modelscope下载模型
$ modelscope download --model Qwen/Qwen3-Embedding-0.6B
$ modelscope download --model Qwen/Qwen3-Embedding-8B
0.6B模型 1.12GB 8B模型 14.1GB
2. 修改sample代码从本地加载模型
默认代码运行报错:
OSError: We couldn’t connect to ‘https://huggingface.co’ to load the files, and couldn’t find them in the cached files.
# test_qwen3-embedding.py
# Requires transformers>=4.51.0
# Requires sentence-transformers>=2.7.0
from sentence_transformers import SentenceTransformer
# Load the model
#model = SentenceTransformer("Qwen/Qwen3-Embedding-8B") 改为下面代码本地加载模型
model = SentenceTransformer("C:\\Users\\Administrator\\.cache\\modelscope\\hub\models\\Qwen\\Qwen3-Embedding-8B")
# We recommend enabling flash_attention_2 for better acceleration and memory saving,
# together with setting `padding_side` to "left":
# model = SentenceTransformer(
# "Qwen/Qwen3-Embedding-8B",
# model_kwargs={"attn_implementation": "flash_attention_2", "device_map": "auto"},
# tokenizer_kwargs={"padding_side": "left"},
# )
# The queries and documents to embed
queries = [
"What is the capital of China?",
"Explain gravity",
]
documents = [
"The capital of China is Beijing.",
"Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
]
# Encode the queries and documents. Note that queries benefit from using a prompt
# Here we use the prompt called "query" stored under `model.prompts`, but you can
# also pass your own prompt via the `prompt` argument
query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)
# Compute the (cosine) similarity between the query and document embeddings
similarity = model.similarity(query_embeddings, document_embeddings)
print(similarity)
# tensor([[0.7493, 0.0751],
# [0.0880, 0.6318]])
可能是机器配置太低问题,无法正常执行出结果
D:\workspace\test_qwen3-embedding.py:8: SyntaxWarning: invalid escape sequence ‘\m’
model = SentenceTransformer(“C:\Users\Administrator\.cache\modelscope\hub\models\Qwen\Qwen3-Embedding-8B”)
Loading checkpoint shards: 25%|██████████████▎ | 1/4 [00:14<00:42, 14.24s/it]
3. 修改sample代码为0.6B模型
# test_qwen3-embedding.py
。。。
# Load the model
#model = SentenceTransformer("Qwen/Qwen3-Embedding-8B") 改为下面代码本地加载模型
model = SentenceTransformer("C:\\Users\\Administrator\\.cache\\modelscope\\hub\models\\Qwen\\Qwen3-Embedding-8B")
。。。
(workspace) PS D:\workspace> uv run .\test_qwen3-embedding.py
D:\workspace\test_qwen3-embedding.py:8: SyntaxWarning: invalid escape sequence ‘\m’
model = SentenceTransformer(“C:\Users\Administrator\.cache\modelscope\hub\models\Qwen\Qwen3-Embedding-0.6B”)
tensor([[0.7646, 0.1414],
[0.1355, 0.6000]])
运行成功,几秒钟出结果,CPU呼呼的转,最终效果可以接受吗?
每一篇文章前后都增加返回目录
回到目录