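Main reference (Zhihu post):
"MiniGPT-4 本地部署 RTX 3090" (MiniGPT-4 local deployment on an RTX 3090), Zhihu
Deploying MiniGPT-4 is fairly involved. You first obtain the LLaMA weights, then combine them with Vicuna's delta weights to generate the Vicuna model weights, and finally prepare the pretrained MiniGPT-4 checkpoint for deployment. To make this easier to follow, I drew a flowchart: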

System: Ubuntu 20.04

My hardware: Nvidia GeForce RTX 3090 with 24 GB of VRAM

1. Set up the environment
Clone the MiniGPT-4 repository and create the environment described in environment.yml.
git clone https://github.com/Vision-CAIR/MiniGPT-4.git
cd MiniGPT-4
conda env create -f environment.yml
conda activate minigpt4

……

2. Obtain the LLaMA weights
First we need to download the model weights from Hugging Face, so install huggingface_hub with pip.
pip install huggingface_hub

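Because of GPU limits, I chose the model with the fewest parameters, llama-7b-hf. The Hugging Face pages are:
LLaMA:
decapoda-research (Decapoda Research)
Model used in this post: decapoda-research/llama-7b-hf
decapoda-research/llama-7b-hf at main
Note: every file in the repository must be downloaded. The original post used snapshot_download; I downloaded the files directly from the web page instead, because git connections tend to drop and the checkout may fail. Manual download is a workable fallback.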
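For readers who prefer a scripted download, the snapshot_download route mentioned above might look like the sketch below. It assumes huggingface_hub is installed and that the repository is reachable from your network. The models/ target folder and the local_dir_for and download_repo helpers are illustrative names, not part of the original post.

```python
from pathlib import Path


def local_dir_for(repo_id: str, root: str = "models") -> Path:
    """Map 'org/name' to '<root>/name' (hypothetical layout, adjust freely)."""
    return Path(root) / repo_id.split("/")[-1]


def download_repo(repo_id: str, root: str = "models") -> Path:
    """Download every file of a Hugging Face repo into a local folder."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(repo_id, root)
    # snapshot_download fetches all files of the repository into `local_dir`.
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target


# Example call (starts a multi-gigabyte download, so it is left commented out):
# download_repo("decapoda-research/llama-7b-hf")
```

The same helper can be pointed at the Vicuna delta repository by changing the repo_id argument.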

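3. Vicuna delta files
I used the model vicuna-7b-delta-v1.1. The Hugging Face pages are:
lmsys (Large Model Systems Organization)
lmsys/vicuna-7b-delta-v1.1 at main

Note: Vicuna weights come in two versions, v0 and v1.1, and the MiniGPT-4 authors use v0. According to the original post, generating the Vicuna weights from v0 fails (bug: tensor size mismatch), and switching to v1.1 fixes it. I also tried v0 earlier and could not get it to work, although not for that reason; the cause is still to be investigated. I therefore recommend v1.1.

4. Generate the Vicuna weights
Clone the FastChat repository: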
git clone https://github.com/lm-sys/FastChat.git
GitHub - lm-sys/FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.

Run the following command in a terminal:
python3 -m fastchat.model.apply_delta --base-model-path /home/train/mycharm/MiniGPT-4/model/llama-7b-hf/ --target-model-path /home/train/mycharm/new/vicuna --delta /home/train/mycharm/new/lmsys/lmsysvicuna-7b-delta-v1.1 --low-cpu-mem
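Notes:
--base-model-path is the original LLaMA model weights (7B parameters), --target-model-path is where the generated Vicuna weights are written, and --delta is the Vicuna delta weights. On machines with limited CPU memory, add --low-cpu-mem: it splits the large weight files into smaller pieces and uses the disk as temporary storage, which keeps peak memory below 16 GB. Without it, the Vicuna delta files cannot be loaded: CPU memory fills up and the process is killed. Green marks the existing vicuna-7b-delta weights.
This command can be confusing for beginners. Put simply, it takes the LLaMA weights, combines them with the Vicuna delta weights, and produces the Vicuna weights. The root cause is that Meta never officially released the LLaMA weights to the public; the copies that can be downloaded are simply circulating online.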
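The idea behind apply_delta can be illustrated with a toy example. The sketch below assumes the delta is additive (target = base + delta), which is how the FastChat project describes its delta weights. It uses plain Python lists in place of tensors, and apply_delta_toy is a hypothetical name, not FastChat's code. It also shows how a size mismatch between base and delta surfaces as an error, similar in spirit to the tensor size problem reported for v0.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def apply_delta_toy(base: Weights, delta: Weights) -> Weights:
    """Return base + delta, parameter by parameter (toy stand-in for tensors)."""
    target: Weights = {}
    for name, delta_values in delta.items():
        if name not in base:
            raise KeyError(f"parameter {name!r} missing from base weights")
        base_values = base[name]
        if len(base_values) != len(delta_values):
            # Analogous to a tensor size mismatch between base and delta.
            raise ValueError(
                f"size mismatch for {name!r}: "
                f"base {len(base_values)} vs delta {len(delta_values)}"
            )
        target[name] = [b + d for b, d in zip(base_values, delta_values)]
    return target


base = {"layer.weight": [1.0, 2.0, 3.0]}
delta = {"layer.weight": [0.5, -1.0, 0.0]}
print(apply_delta_toy(base, delta))  # {'layer.weight': [1.5, 1.0, 3.0]}
```

Because the released delta is useless without the base weights, distributing it does not redistribute LLaMA itself, which is why the extra step exists.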
The output of the run is as follows:

The newly generated Vicuna weights are in the target directory:
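One way to confirm the result is to list what was written to the directory passed to --target-model-path. The sketch below is a minimal helper for that. summarize_dir and print_summary are hypothetical names, and which files appear (weight shards, config, tokenizer) depends on your FastChat and transformers versions.

```python
from pathlib import Path
from typing import List, Tuple


def summarize_dir(path: str) -> List[Tuple[str, int]]:
    """Return (file name, size in bytes) for each regular file in `path`, sorted by name."""
    root = Path(path)
    if not root.is_dir():
        raise FileNotFoundError(f"not a directory: {root}")
    return sorted((p.name, p.stat().st_size) for p in root.iterdir() if p.is_file())


def print_summary(path: str) -> None:
    """Print one line per file with its size in megabytes."""
    for name, size in summarize_dir(path):
        print(f"{size / 1e6:10.1f} MB  {name}")


# Example (path taken from the command above):
# print_summary("/home/train/mycharm/new/vicuna")
```

Weight files of only a few kilobytes usually indicate an interrupted run or incomplete download rather than a finished conversion.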

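5. Prepare MiniGPT-4 for launch
This post uses the original authors' checkpoint, prerained_minigpt4_7b.pth, placed under the path of the generated Vicuna weights. Make sure it ends up in the right directory.
Download link: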
https://link.zhihu.com/?target=https%3A//drive.google.com/file/d/1RY9jV0dyqLX-o38LrumkKRh6Jtaop58R/view
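This download requires access to Google. Other versions of the checkpoint should also work, but I have not tried them.

Once the download finishes, place the file in the Vicuna directory generated above: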

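Update the model weight paths in the configuration files:
The next two steps are critical. Change the weight paths to match your own setup:
1) In MiniGPT-4/minigpt4/configs/models/minigpt4.yaml, set llama_model to the path of the vicuna-7b weights (line 16 of the original file). For example, mine is /home/train/mycharm/new/vicuna/.

2) In MiniGPT-4/eval_configs/minigpt4_eval.yaml, set ckpt to the path of prerained_minigpt4_7b.pth (line 11 of the original file).
For example, mine is /home/train/mycharm/new/vicuna/prerained_minigpt4_7b.pth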
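The two edits can also be scripted. The sketch below is a minimal helper under stated assumptions: it rewrites the first line whose key matches (llama_model or ckpt) with a regular expression instead of parsing YAML, so comments and the rest of the file are left untouched. set_yaml_value is a hypothetical name, and the sample text only mimics the layout of the real config files.

```python
import re


def set_yaml_value(text: str, key: str, value: str) -> str:
    """Replace the value of the first `key: ...` line, keeping its indentation."""
    # [ \t]* (not \s*) keeps the match on a single line.
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*:[ \t]*).*$", flags=re.MULTILINE
    )
    # A function replacement avoids backslash/group-reference surprises in `value`.
    new_text, count = pattern.subn(lambda m: f'{m.group(1)}"{value}"', text, count=1)
    if count == 0:
        raise KeyError(f"key {key!r} not found")
    return new_text


# Sample text is illustrative only; read the real file with Path(...).read_text().
sample = 'model:\n  arch: mini_gpt4\n  llama_model: "/path/to/vicuna/weights/"\n'
print(set_yaml_value(sample, "llama_model", "/home/train/mycharm/new/vicuna/"))
```

Editing by key rather than by line number keeps the script working if lines 16 and 11 move in a later revision of the repository.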

6. Launch the MiniGPT-4 demo
Change into the MiniGPT-4 directory and run:
python demo.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0

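The demo started successfully.
The output includes one warning, probably caused by mismatched PyTorch and torchvision versions. It does not affect the functionality used here. See the following post: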
Failed to load image Python extension: libtorch_cuda_cu.so_牧羊女说的博客-CSDN博客
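To check whether the installed PyTorch and torchvision builds match, one option is to read their versions from package metadata without importing them. The sketch below uses only the standard library. installed_versions is a hypothetical helper, and the pairing noted in the comment is simply the one listed in the appendix.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Return {package: version string}, with None for packages that are not installed."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# The appendix lists torch 1.13.1+cu116 alongside torchvision 0.14.1+cu116.
for name, version in installed_versions(["torch", "torchvision"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Run it inside the activated minigpt4 environment so that it reports the packages the demo actually uses.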
Below is the original author's screenshot of the demo running:

Appendix:
Names and versions of the packages in the virtual environment
| Package | Version |
| --- | --- |
| absl-py | 1.4.0 |
| accelerate | 0.15.0 |
| addict | 2.4.0 |
| aiofiles | 23.1.0 |
| aiohttp | 3.8.4 |
| aiosignal | 1.3.1 |
| altair | 5.0.1 |
| anyio | 3.7.1 |
| appdirs | 1.4.4 |
| apturl | 0.5.2 |
| argon2-cffi | 21.3.0 |
| argon2-cffi-bindings | 21.2.0 |
| arrow | 1.2.3 |
| asttokens | 2.2.1 |
| async-timeout | 4.0.2 |
| attrs | 23.1.0 |
| autopep8 | 2.0.1 |
| backcall | 0.1.0 |
| bcrypt | 3.1.7 |
| beautifulsoup4 | 4.12.2 |
| bleach | 6.0.0 |
| blinker | 1.4 |
| Brlapi | 0.7.0 |
| cachetools | 5.3.0 |
| catfish | 1.4.13 |
| certifi | 2019.11.28 |
| cffi | 1.15.1 |
| chardet | 3.0.4 |
| charset-normalizer | 3.1.0 |
| chrome-gnome-shell | 0.0.0 |
| Click | 7 |
| cmake | 3.25.2 |
| colorama | 0.4.3 |
| coloredlogs | 15.0.1 |
| comm | 0.1.3 |
| command-not-found | 0.3 |
| configobj | 5.0.6 |
| contourpy | 1.0.7 |
| cryptography | 2.8 |
| cuda | 0.0.1 |
| cupshelpers | 1 |
| cycler | 0.11.0 |
| dbus-python | 1.2.16 |
| debugpy | 1.6.7 |
| decorator | 4.4.2 |
| defer | 1.0.6 |
| defusedxml | 0.7.1 |
| distro | 1.4.0 |
| distro-info | 0.23ubuntu1 |
| docker-pycreds | 0.4.0 |
| dulwich | 0.19.15 |
| duplicity | 0.8.12.0 |
| entrypoints | 0.3 |
| exceptiongroup | 1.1.2 |
| executing | 1.2.0 |
| fairscale | 0.4.13 |
| fastapi | 0.99.1 |
| fastChat | 0.1.1 |
| fasteners | 0.14.1 |
| fastimport | 0.9.8 |
| fastjsonschema | 2.17.1 |
| ffmpy | 0.3.0 |
| filelock | 3.12.2 |
| fire | 0.5.0 |
| flatbuffers | 23.1.21 |
| fonttools | 4.38.0 |
| fqdn | 1.5.1 |
| frozenlist | 1.3.3 |
| fschat | 0.2.18 |
| fsspec | 2023.6.0 |
| future | 0.18.2 |
| gitdb | 4.0.10 |
| GitPython | 3.1.31 |
| google-auth | 2.16.1 |
| google-auth-oauthlib | 0.4.6 |
| gradio | 3.35.2 |
| gradio_client | 0.2.7 |
| graphviz | 0.8.4 |
| grpcio | 1.51.1 |
| h11 | 0.14.0 |
| h5py | 3.8.0 |
| hiq-python | 1.1.12 |
| httpcore | 0.17.3 |
| httplib2 | 0.14.0 |
| httpx | 0.24.1 |
| huggingface-hub | 0.16.3 |
| humanfriendly | 10 |
| idna | 2.8 |
| imageio | 2.22.2 |
| importlib-metadata | 6.0.0 |
| importlib-resources | 5.12.0 |
| ipykernel | 6.24.0 |
| ipython | 8.12.2 |
| ipython_genutils | 0.2.0 |
| ipywidgets | 8.0.7 |
| isoduration | 20.11.0 |
| jedi | 0.18.2 |
| Jinja2 | 3.1.2 |
| joblib | 1.2.0 |
| jsonpatch | 1.32 |
| jsonpointer | 2.3 |
| jsonschema | 4.18.0 |
| jsonschema-specifications | 2023.6.1 |
| jupyter | 1.0.0 |
| jupyter_client | 8.3.0 |
| jupyter-console | 6.6.3 |
| jupyter_core | 5.3.1 |
| jupyter-events | 0.6.3 |
| jupyter_server | 2.7.0 |
| jupyter_server_terminals | 0.4.4 |
| jupyterlab-pygments | 0.2.2 |
| jupyterlab-widgets | 3.0.8 |
| keyring | 18.0.1 |
| kiwisolver | 1.4.4 |
| labelImg | 1.8.6 |
| language-selector | 0.1 |
| launchpadlib | 1.10.13 |
| lazr.restfulclient | 0.14.2 |
| lazr.uri | 1.0.3 |
| lightdm-gtk-greeter-settings | 1.2.2 |
| linkify-it-py | 2.0.2 |
| lit | 15.0.7 |
| lockfile | 0.12.2 |
| louis | 3.12.0 |
| lxml | 4.6.2 |
| macaroonbakery | 1.3.1 |
| Mako | 1.1.0 |
| Markdown | 3.4.1 |
| markdown-it-py | 2.2.0 |
| markdown2 | 2.4.9 |
| MarkupSafe | 2.1.2 |
| matplotlib | 3.6.3 |
| matplotlib-inline | 0.1.6 |
| mdit-py-plugins | 0.3.3 |
| mdurl | 0.1.2 |
| meld | 3.20.2 |
| menulibre | 2.2.1 |
| mistune | 3.0.1 |
| mmcv | 1.7.1 |
| mmdet | 2.28.1 |
| monotonic | 1.5 |
| mpmath | 1.2.1 |
| mugshot | 0.4.2 |
| multidict | 6.0.4 |
| mxnet | 1.9.1 |
| nbclassic | 1.0.0 |
| nbclient | 0.8.0 |
| nbconvert | 7.6.0 |
| nbformat | 5.9.0 |
| nest-asyncio | 1.5.6 |
| netifaces | 0.10.4 |
| networkx | 2.8.7 |
| nh3 | 0.2.14 |
| notebook | 6.5.4 |
| notebook_shim | 0.2.3 |
| numpy | 1.24.4 |
| oauthlib | 3.1.0 |
| olefile | 0.46 |
| onboard | 1.4.1 |
| onnx | 1.13.0 |
| onnxruntime | 1.14.0 |
| opencv-python | 4.6.0.66 |
| orjson | 3.9.1 |
| overrides | 7.3.1 |
| packaging | 21.3 |
| pandas | 1.5.3 |
| pandocfilters | 1.5.0 |
| paramiko | 2.6.0 |
| parso | 0.8.3 |
| pathtools | 0.1.2 |
| peft | 0.3.0 |
| pexpect | 4.6.0 |
| pickleshare | 0.7.5 |
| Pillow | 9.0.0 |
| pip | 23.1.2 |
| pkgutil_resolve_name | 1.3.10 |
| platformdirs | 3.8.1 |
| prometheus-client | 0.17.0 |
| prompt-toolkit | 3.0.39 |
| protobuf | 3.19.0 |
| psutil | 5.9.4 |
| ptyprocess | 0.7.0 |
| pure-eval | 0.2.2 |
| py-itree | 0.0.19 |
| pyasn1 | 0.4.8 |
| pyasn1-modules | 0.2.8 |
| pycairo | 1.16.2 |
| pycocotools | 2.0.6 |
| pycodestyle | 2.10.0 |
| pycparser | 2.21 |
| pycups | 1.9.73 |
| pydantic | 1.10.11 |
| pydub | 0.25.1 |
| Pygments | 2.15.1 |
| PyGObject | 3.36.0 |
| PyJWT | 1.7.1 |
| pyllama | 0.0.8 |
| pymacaroons | 0.13.0 |
| PyNaCl | 1.3.0 |
| pyparsing | 3.0.9 |
| PyQt5 | 5.10.1 |
| PyQt5-Qt5 | 5.15.2 |
| PyQt5-sip | 12.12.1 |
| pyRFC3339 | 1.1 |
| pysvn | 1.9.9 |
| python-apt | 2.0.1 |
| python-dateutil | 2.8.2 |
| python-debian | 0.1.36ubuntu1 |
| python-json-logger | 2.0.7 |
| python-multipart | 0.0.6 |
| pytz | 2022.7.1 |
| PyWavelets | 1.4.1 |
| pyxdg | 0.26 |
| PyYAML | 5.3.1 |
| pyzmq | 25.1.0 |
| qtconsole | 5.4.3 |
| QtPy | 2.3.1 |
| rabbitvcs | 0.18 |
| referencing | 0.29.1 |
| regex | 2023.6.3 |
| reportlab | 3.5.34 |
| requests | 2.31.0 |
| requests-oauthlib | 1.3.1 |
| requests-unixsocket | 0.2.0 |
| rfc3339-validator | 0.1.4 |
| rfc3986-validator | 0.1.1 |
| rich | 13.4.2 |
| rpds-py | 0.8.8 |
| rsa | 4.9 |
| safetensors | 0.3.1 |
| scikit-image | 0.19.3 |
| scikit-learn | 1.1.2 |
| scipy | 1.9.3 |
| seaborn | 0.12.2 |
| SecretStorage | 2.3.1 |
| semantic-version | 2.10.0 |
| Send2Trash | 1.8.2 |
| sentencepiece | 0.1.97 |
| sentry-sdk | 1.15.0 |
| setproctitle | 1.3.2 |
| setuptools | 67.8.0 |
| sgt-launcher | 0.2.5 |
| shortuuid | 1.0.11 |
| simplejson | 3.16.0 |
| sip | 4.19.21 |
| six | 1.14.0 |
| sklearn | 0 |
| smmap | 5.0.0 |
| sniffio | 1.3.0 |
| soupsieve | 2.4.1 |
| ssh-import-id | 5.1 |
| stack-data | 0.6.2 |
| starlette | 0.27.0 |
| svgwrite | 1.4.3 |
| sympy | 1.11.1 |
| systemd-python | 234 |
| tensorboard | 2.12.0 |
| tensorboard-data-server | 0.7.0 |
| tensorboard-logger | 0.1.0 |
| tensorboard-plugin-wit | 1.8.1 |
| termcolor | 2.3.0 |
| terminado | 0.17.1 |
| terminaltables | 3.1.10 |
| thop | 0.1.1.post2209072238 |
| threadpoolctl | 3.1.0 |
| tifffile | 2022.10.10 |
| tiktoken | 0.4.0 |
| tinycss2 | 1.2.1 |
| tokenizers | 0.13.3 |
| tomli | 2.0.1 |
| toolz | 0.12.0 |
| torch | 1.13.1+cu116 |
| torchaudio | 0.13.1+cu116 |
| torchscope | 0.1.0 |
| torchvision | 0.14.1+cu116 |
| tornado | 6.2 |
| tqdm | 4.65.0 |
| traitlets | 5.9.0 |
| transformers | 4.28.1 |
| triton | 2.0.0 |
| typing_extensions | 4.7.1 |
| ubuntu-advantage-tools | 8001 |
| ubuntu-drivers-common | 0.0.0 |
| uc-micro-py | 1.0.2 |
| ufw | 0.36 |
| ultralytics | 8.0.109 |
| unattended-upgrades | 0.1 |
| uri-template | 1.3.0 |
| urllib3 | 1.26.14 |
| usb-creator | 0.3.7 |
| uvicorn | 0.22.0 |
| visdom | 0.2.4 |
| wadllib | 1.3.3 |
| wandb | 0.13.10 |
| wavedrom | 2.0.3.post3 |
| wcwidth | 0.1.8 |
| webcolors | 1.13 |
| webencodings | 0.5.1 |
| websocket-client | 1.5.1 |
| websockets | 11.0.3 |
| Werkzeug | 2.2.3 |
| wheel | 0.34.2 |
| widgetsnbextension | 4.0.8 |
| xcffib | 0.8.1 |
| xkit | 0.0.0 |
| xxx | 0.0.1 |
| yapf | 0.32.0 |
| yarl | 1.9.2 |
| zipp | 3.15.0 |