ChatGLM2-6B-Int4 Local Deployment

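Table of contents

- 1. A look at the result first
- 2. Local deployment
  - Deployment environment
  - Download
  - Create a virtual environment and install the libraries
  - Download the model locally
  - INT4 inference
  - `web_demo.py`
  - Problems encountered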

Original article: http://wangguo.site/posts/9d8c1768.html

ChatGLM2-6B is the second-generation version of ChatGLM-6B, an open-source Chinese-English bilingual dialogue model.
GitHub: https://github.com/THUDM/ChatGLM2-6B

1. A look at the result first

[Screenshot of the result]

2. Local deployment

Deployment environment

WSL2, Ubuntu 22.04 LTS

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.104      Driver Version: 528.79       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
| N/A   45C    P8     5W /  80W |    928MiB /  6144MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A        23      G   /Xwayland                       N/A      |
+-----------------------------------------------------------------------------+
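The card above reports 6144 MiB in total, with part of it already in use by the desktop. As a rough sketch (the helper name `parse_memory_usage` is made up for this illustration), the used and total figures can be pulled out of the nvidia-smi row to see how much memory is left for the quantized model:

```python
import re


def parse_memory_usage(smi_line: str) -> tuple[int, int]:
    """Return (used_mib, total_mib) from an nvidia-smi row such as '928MiB /  6144MiB'."""
    match = re.search(r"(\d+)MiB\s*/\s*(\d+)MiB", smi_line)
    if match is None:
        raise ValueError("no 'used / total' memory field found")
    return int(match.group(1)), int(match.group(2))


# Row copied from the nvidia-smi output above.
row = "| N/A   45C    P8     5W /  80W |    928MiB /  6144MiB |      3%      Default |"
used, total = parse_memory_usage(row)
free = total - used
print(f"used={used} MiB, total={total} MiB, free={free} MiB")
```

For the row shown here this prints `used=928 MiB, total=6144 MiB, free=5216 MiB`.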

Download

git clone https://github.com/THUDM/ChatGLM2-6B
cd ChatGLM2-6B

Create a virtual environment and install the libraries

virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
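Before installing the requirements it can help to confirm that the environment is actually active, so the packages do not land in the system Python. A minimal check, assuming a reasonably recent virtualenv or the standard `venv` module (the function name is hypothetical):

```python
import sys


def in_virtual_environment() -> bool:
    """True when the interpreter runs inside a venv/virtualenv environment."""
    # Inside an environment sys.prefix points at the environment, while
    # sys.base_prefix still points at the interpreter it was created from.
    # Old virtualenv releases set sys.real_prefix instead.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base


print("virtual environment active:", in_virtual_environment())
```

Running it with `python` right after `source venv/bin/activate` should report that the environment is active.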

Download the model locally

git clone https://huggingface.co/THUDM/chatglm2-6b-int4

Then download the corresponding model parameter files from the Tsinghua University cloud drive and copy them into the chatglm2-6b-int4 folder.
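If the clone ran without Git LFS, the large files in the folder are only small text stubs, which is one reason to copy the real parameter files in afterwards. The sketch below is an optional sanity check rather than part of the original steps: it lists the files that still begin with the Git LFS pointer header. The function name `find_lfs_pointers` is an assumption for illustration.

```python
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir: str) -> list[str]:
    """List files in model_dir that are still Git LFS pointer stubs."""
    stubs = []
    for path in sorted(Path(model_dir).iterdir()):
        if not path.is_file():
            continue
        with path.open("rb") as handle:
            head = handle.read(len(LFS_HEADER))
        if head == LFS_HEADER:
            stubs.append(path.name)
    return stubs


if __name__ == "__main__":
    model_dir = "./chatglm2-6b-int4"  # folder created by the clone step
    if Path(model_dir).is_dir():
        print("pointer stubs:", find_lfs_pointers(model_dir))
    else:
        print(model_dir, "not found; run this from the ChatGLM2-6B folder")
```

An empty list means every file in the folder holds real content.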


INT4 inference

First modify web_demo.py as follows:

# Line 7: point the model at the local parameter folder
# model = AutoModel.from_pretrained("THUDM/chatglm2-6b-int4", trust_remote_code=True).cuda()
model = AutoModel.from_pretrained("./chatglm2-6b-int4", trust_remote_code=True).cuda()
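The same local-path idea can be tried outside the web demo. The following is a sketch under a few assumptions: the tokenizer files sit in the same `./chatglm2-6b-int4` folder as the weights, a CUDA GPU is available, and the helper names `resolve_model_dir` and `load_chat_model` are invented for this example. Checking the folder first gives a clearer message than the Hugging Face lookup error when the script is started from the wrong directory.

```python
import os


def resolve_model_dir(path: str) -> str:
    """Return the absolute path of a local model folder, or raise early."""
    absolute = os.path.abspath(path)
    if not os.path.isdir(absolute):
        raise FileNotFoundError(f"model folder not found: {absolute}")
    return absolute


def load_chat_model(model_dir: str):
    """Load tokenizer and INT4 model from a local folder (needs a CUDA GPU)."""
    # Imported lazily so the path check works without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).cuda()
    return tokenizer, model.eval()


if __name__ == "__main__":
    try:
        model_dir = resolve_model_dir("./chatglm2-6b-int4")
    except FileNotFoundError as err:
        print(err)
    else:
        tokenizer, model = load_chat_model(model_dir)
        # model.chat comes from the model's own remote code, as in the project README.
        response, history = model.chat(tokenizer, "Hello", history=[])
        print(response)
```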

`web_demo.py`

Run:

python web_demo.py


Problems encountered

Problem 1

OSError: model/chatglm2-6b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.

Fix: run

huggingface-cli login

Then open the link it prints, generate a new token, and paste the token into the shell.

Problem 2

RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())]

Fix: run

sudo apt install libcudart11.0 libcublaslt11
