text-generation-webui
This GitHub project has the following notable strengths:
- Ease of use: the project provides an intuitive web UI, so users can interact with various large language models without any programming skills. With a few clicks, you can submit input text and get the model's generated output.
- Model integration: it supports many popular large language models, such as ChatGLM, RWKV-Raven, Vicuna, MOSS, LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA, so you can try many different models from a single platform.
- Local deployment: you can deploy the web UI on your own machine or server, which is especially useful when data privacy, performance, or network restrictions matter, since it removes the dependency on cloud services.
- Customizability: as an open-source project, it can be adapted to your own needs, for example by wiring in additional language model backends or adjusting the interface layout and features.
- API service: besides the web UI, it also exposes an API, which makes integration with other applications straightforward (see the sketch after this list).
- Extensibility: it supports custom scripts for special-purpose features, and functionality can be extended through the Extensions plugin mechanism.
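As a hedged illustration of the API point above, here is a minimal Python sketch that queries the OpenAI-compatible endpoint the web UI exposes when started with the `--api` flag. The URL, default port 5000, and payload fields are assumptions based on the project's API extension and may differ in your version:

```python
# A hedged sketch (not from the original post): query the OpenAI-compatible
# API that text-generation-webui exposes when launched with `--api`.
# The URL, default port 5000, and payload fields are assumptions; check
# the project's API documentation for your version.
import requests

URL = "http://127.0.0.1:5000/v1/chat/completions"  # assumed default endpoint

payload = {
    "messages": [{"role": "user", "content": "Hello! Who are you?"}],
    "max_tokens": 128,
    "temperature": 0.7,
}

resp = requests.post(URL, json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])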
Environment: Python 3.11.7, conda 23.11.0, CUDA 12.1.
Important: the versions below must match. Install exactly these versions, otherwise you will get errors.
```
exllamav2      0.0.12+cu121
torch          2.1.2+cu121
torchaudio     2.1.2+cu121
torchvision    0.16.2+cu121
flash-attn     2.3.4
ctransformers  0.2.27+cu121
transformers   4.37.2
```
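As a convenience, here is a small sketch (mine, not part of the original setup) that checks the installed packages against the pins above using only the standard library:

```python
# Sanity-check sketch: compare installed versions against the pins above.
from importlib.metadata import version, PackageNotFoundError

pins = {
    "exllamav2": "0.0.12+cu121",
    "torch": "2.1.2+cu121",
    "torchaudio": "2.1.2+cu121",
    "torchvision": "0.16.2+cu121",
    "flash-attn": "2.3.4",
    "ctransformers": "0.2.27+cu121",
    "transformers": "4.37.2",
}

for pkg, want in pins.items():
    try:
        got = version(pkg)
    except PackageNotFoundError:
        got = "<missing>"
    status = "OK " if got == want else "BAD"
    print(f"{status} {pkg}: installed={got}, expected={want}")
```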
To recreate the environment from scratch and start the server:

```
conda env remove --name text-generation-webui
conda create -n text-generation-webui python=3.11.7
conda activate text-generation-webui
pushd D:\Software\AI\text-generation-webui
python server.py
```
Miniconda is recommended if you have some experience with the command line: https://docs.conda.io/en/latest/miniconda.html
On Linux or WSL, it can be automatically installed with these two commands (source):
```
curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"
bash Miniconda3.sh
```
Create a new conda environment:

```
conda create -n textgen python=3.11
conda activate textgen
```
Then install PyTorch:

| System | GPU | Command |
|---|---|---|
| Linux/WSL | NVIDIA | `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121` |
| Linux/WSL | CPU only | `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu` |
| Linux | AMD | `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6` |
| MacOS + MPS | Any | `pip3 install torch torchvision torchaudio` |
| Windows | NVIDIA | `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121` |
| Windows | CPU only | `pip3 install torch torchvision torchaudio` |
The up-to-date commands can be found here: https://pytorch.org/get-started/locally/.
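To confirm that the PyTorch build you installed actually sees the GPU, a quick check (a sketch; the printed values will differ per machine):

```python
# Verify that the installed PyTorch build can see the GPU.
import torch

print("torch:", torch.__version__)               # e.g. 2.1.2+cu121
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    print("CUDA runtime:", torch.version.cuda)
```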
For NVIDIA, you also need to install the CUDA runtime libraries:
```
conda install -y -c "nvidia/label/cuda-12.1.1" cuda-runtime
```
If you need `nvcc` to compile some library manually, replace the command above with:
```
conda install -y -c "nvidia/label/cuda-12.1.1" cuda
```
Clone the repository and install the requirements:

```
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r <requirements file according to table below>  # e.g.: pip install -r requirements.txt
```
Here I use `pip install -r requirements.txt`. AVX2 is Intel's second-generation Advanced Vector Extensions instruction set; recent Intel CPUs, including the 12th, 13th, and 14th generations, support it (a quick way to check is sketched after the table below).
Requirements file to use:

| GPU | CPU | requirements file to use |
|---|---|---|
| NVIDIA | has AVX2 | requirements.txt |
| NVIDIA | no AVX2 | requirements_noavx2.txt |
| AMD | has AVX2 | requirements_amd.txt |
| AMD | no AVX2 | requirements_amd_noavx2.txt |
| CPU only | has AVX2 | requirements_cpu_only.txt |
| CPU only | no AVX2 | requirements_cpu_only_noavx2.txt |
| Apple | Intel | requirements_apple_intel.txt |
| Apple | Apple Silicon | requirements_apple_silicon.txt |
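Not sure whether your CPU has AVX2? A rough check (my sketch, not from the original guide): on Linux/WSL the flag appears in /proc/cpuinfo; on native Windows, use an external tool such as coreinfo or CPU-Z instead.

```python
# Rough AVX2 check (a sketch). Works on Linux/WSL via /proc/cpuinfo;
# on native Windows use an external tool (e.g. coreinfo, CPU-Z).
def has_avx2() -> bool:
    try:
        with open("/proc/cpuinfo") as f:
            return "avx2" in f.read()
    except OSError:
        return False  # /proc/cpuinfo not available (e.g. native Windows)

print("AVX2 supported:", has_avx2())
```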
Start the web UI:

```
conda activate textgen
cd text-generation-webui
python server.py
```
Then browse to:

```
http://localhost:7860/?__theme=dark
```
Problem: importing exllamav2_ext fails with a DLL load error:

```
>>> import exllamav2_ext
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: DLL load failed while importing exllamav2_ext:
```
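A quick way to narrow this down (a hedged sketch of mine, not from the original post): a DLL load failure for a compiled extension usually means the wheel was built against a different torch/CUDA build than the one installed, so compare the two versions first.

```python
# Diagnostic sketch: exllamav2 0.0.12+cu121 expects torch 2.1.2+cu121.
from importlib.metadata import version

print("torch     :", version("torch"))      # want 2.1.2+cu121
print("exllamav2 :", version("exllamav2"))  # want 0.0.12+cu121

import torch           # importing torch first also loads its CUDA DLLs
import exllamav2_ext   # still fails here if the builds do not match
print("exllamav2_ext imported OK")
```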
Things I tried first (the commented-out commands did not fix it):

```
#pip3 uninstall torch torchvision torchaudio
#conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
#pip install -r requirements.txt --upgrade
#pip install torch==2.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
conda install -y -c "nvidia/label/cuda-11.8.0" cuda-runtime
pip install -r requirements.txt
```

Even after installing `pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121` and then `pip install -r requirements.txt`, the error remained. So the fix is an older version of exllamav2, which is built on torch 2.1.2, from https://github.com/turboderp/exllamav2/releases/tag/v0.0.12:
```
# Download the wheel locally and install it, then install the pinned torch packages:
pip install exllamav2-0.0.12+cu121-cp311-cp311-win_amd64.whl
pip install torch==2.1.2 --index-url https://download.pytorch.org/whl/cu121
pip install torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121
pip install torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu121
```

The environment then looks like this, which is correct:

```
(text-generation-webui) D:\Software\AI\text-generation-webui> pip list |findstr exllamav
exllamav2          0.0.12+cu121
(text-generation-webui) D:\Software\AI\text-generation-webui> pip list |findstr torch
torch              2.1.2+cu121
torchaudio         2.1.2+cu121
torchvision        0.16.2+cu121
(text-generation-webui) D:\Software\AI\text-generation-webui> pip list |findstr flash
flash-attn         2.3.4
(text-generation-webui) D:\Software\AI\text-generation-webui> pip list |findstr transformers
ctransformers      0.2.27+cu121
transformers       4.37.2
```
`python server.py` now starts cleanly with no errors. The problem is solved.

Other notes: the relevant wheel pins from the requirements file, for reference:
```
# bitsandbytes
bitsandbytes==0.41.1; platform_system != "Windows"
https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl; platform_system == "Windows"
# llama-cpp-python (CPU only, AVX2)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.36+cpuavx2-cp311-cp311-manylinux_2_31_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.36+cpuavx2-cp310-cp310-manylinux_2_31_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.36+cpuavx2-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.36+cpuavx2-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
# llama-cpp-python (CUDA, no tensor cores)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.36+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.36+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.36+cu121-cp311-cp311-manylinux_2_31_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.36+cu121-cp310-cp310-manylinux_2_31_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
# llama-cpp-python (CUDA, tensor cores)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.36+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.36+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.36+cu121-cp311-cp311-manylinux_2_31_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.36+cu121-cp310-cp310-manylinux_2_31_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
# CUDA wheels
https://github.com/jllllll/AutoGPTQ/releases/download/v0.6.0/auto_gptq-0.6.0+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/jllllll/AutoGPTQ/releases/download/v0.6.0/auto_gptq-0.6.0+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/jllllll/AutoGPTQ/releases/download/v0.6.0/auto_gptq-0.6.0+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/jllllll/AutoGPTQ/releases/download/v0.6.0/auto_gptq-0.6.0+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/turboderp/exllamav2/releases/download/v0.0.12/exllamav2-0.0.12+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/turboderp/exllamav2/releases/download/v0.0.12/exllamav2-0.0.12+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/turboderp/exllamav2/releases/download/v0.0.12/exllamav2-0.0.12+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp/exllamav2/releases/download/v0.0.12/exllamav2-0.0.12+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/jllllll/flash-attention/releases/download/v2.3.4/flash_attn-2.3.4+cu121torch2.1cxx11abiFALSE-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/jllllll/flash-attention/releases/download/v2.3.4/flash_attn-2.3.4+cu121torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/Dao-AILab/flash-attention/releases/download/v2.3.4/flash_attn-2.3.4+cu122torch2.1cxx11abiFALSE-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/Dao-AILab/flash-attention/releases/download/v2.3.4/flash_attn-2.3.4+cu122torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/jllllll/GPTQ-for-LLaMa-CUDA/releases/download/0.1.1/gptq_for_llama-0.1.1+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/jllllll/GPTQ-for-LLaMa-CUDA/releases/download/0.1.1/gptq_for_llama-0.1.1+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/jllllll/GPTQ-for-LLaMa-CUDA/releases/download/0.1.1/gptq_for_llama-0.1.1+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/jllllll/GPTQ-for-LLaMa-CUDA/releases/download/0.1.1/gptq_for_llama-0.1.1+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/jllllll/ctransformers-cuBLAS-wheels/releases/download/AVX2/ctransformers-0.2.27+cu121-py3-none-any.whl
autoawq==0.1.8; platform_system == "Linux" or platform_system == "Windows"
```
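The `; platform_system == ...` suffixes above are PEP 508 environment markers: pip evaluates each one against the current machine and installs only the matching line. A small illustrative sketch using the `packaging` library (an assumption: `packaging` is installed, which it is in most pip-based environments):

```python
# Sketch: evaluate a PEP 508 environment marker the way pip does.
from packaging.markers import Marker

marker = Marker('platform_system == "Windows" and python_version == "3.11"')
print(marker.evaluate())  # True only on Windows with Python 3.11
```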
Question: my model is in G:\models\longchain-chatchat\Qwen\Qwen-14B-Chat, but the program looks for models under D:\Software\AI\text-generation-webui\models. How can I use it without moving the folder? Would a shortcut work? Use a symbolic link instead (a plain shortcut is not transparent to programs):
- Change into the directory where the link should be created, e.g. D:\Software\AI\text-generation-webui\models.
- Run the following command to create the symbolic link (assuming you want the link to be named Qwen-14B-Chat):
```
mklink /d Qwen-14B-Chat "G:\models\longchain-chatchat\Qwen\Qwen-14B-Chat"
```
Please note:
- This command must be run from an elevated (administrator) command prompt.
- Symbolic links are available on Windows 10; some older versions of Windows may not support them.

The two links used here (note that `/d` is only needed for directory links; file links omit it):

```
mklink rwkv-5-h-world-3B.pth "G:\models\RWKV\rwkv-5-h-world-3B.pth"
mklink /d Qwen-14B-Chat "G:\models\longchain-chatchat\Qwen\Qwen-14B-Chat"
```
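For completeness, a hedged Python equivalent of the directory link above (paths reused from the example; on Windows, creating symlinks from Python likewise requires an elevated prompt or Developer Mode):

```python
# Sketch: the same directory symlink created from Python instead of mklink.
from pathlib import Path

src = Path(r"G:\models\longchain-chatchat\Qwen\Qwen-14B-Chat")
dst = Path(r"D:\Software\AI\text-generation-webui\models\Qwen-14B-Chat")

if not dst.exists():
    dst.symlink_to(src, target_is_directory=True)
print(dst, "->", dst.resolve())
```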