text-generation-webui Deployment Tutorial (Latest)

The text-generation-webui GitHub project has the following notable strengths:

  1. Ease of use: the project provides an intuitive web UI, so users can interact with a variety of large language models without any programming skills. With a few clicks you can submit input text and get the model's generated output.
  2. Model integration: it supports many popular large language models, such as ChatGLM, RWKV-Raven, Vicuna, MOSS, LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA, so you can try a range of different models from one unified platform.
  3. Local deployment: you can run the web UI on your own machine or server, which removes the dependency on cloud services. This is especially useful where data privacy, performance, or network restrictions matter.
  4. Customizability: as an open-source project it can be tailored to your needs, for example by integrating additional model backends or adjusting the interface layout and features.
  5. API service: besides the web UI it also exposes an API, which makes integration with other applications straightforward (see the example after this list).
  6. Strong extensibility: custom scripts can implement special behavior, and the Extensions plugin mechanism allows further features to be added.
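
As a quick illustration of item 5: recent builds ship an OpenAI-compatible API that is enabled with the --api flag and listens on port 5000 by default. A minimal sketch (Linux/WSL-style quoting; the exact flags and endpoint can vary by version, so check python server.py --help):

# Start the server with the OpenAI-compatible API enabled (port 5000 by default)
python server.py --api

# Query the chat completions endpoint with a one-off prompt
curl http://localhost:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 64}'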

Environment: Python 3.11.7 in a conda (23.11.0) environment, CUDA 12.1. The crucial part is that the versions below match each other exactly; install these exact versions, otherwise you will get errors:

exllamav2      0.0.12+cu121
torch          2.1.2+cu121
torchaudio     2.1.2+cu121
torchvision    0.16.2+cu121
flash-attn     2.3.4
ctransformers  0.2.27+cu121
transformers   4.37.2
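
A quick sanity check (not part of the original setup) that the installed torch build really matches CUDA 12.1:

# Print the torch version and the CUDA version it was built against
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# Expected: 2.1.2+cu121 12.1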

The full sequence I use to recreate the environment from scratch and start the server:
# Remove any previous environment, then create a fresh one with the pinned Python
conda env remove --name text-generation-webui
conda create -n text-generation-webui python=3.11.7
conda activate text-generation-webui
# Change into the repo and start the server
pushd D:\Software\AI\text-generation-webui
python server.py

Recommended if you have some experience with the command-line.

https://docs.conda.io/en/latest/miniconda.html

On Linux or WSL, it can be automatically installed with these two commands (source):

curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"
bash Miniconda3.sh

Then create and activate a conda environment:

conda create -n textgen python=3.11
conda activate textgen
System       GPU        Command
Linux/WSL    NVIDIA     pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Linux/WSL    CPU only   pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
Linux        AMD        pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6
MacOS + MPS  Any        pip3 install torch torchvision torchaudio
Windows      NVIDIA     pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Windows      CPU only   pip3 install torch torchvision torchaudio
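
After installing torch, it's worth a quick check (not in the original guide) that it can actually see your GPU:

# Should print True on a working NVIDIA/CUDA setup
python -c "import torch; print(torch.cuda.is_available())"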

The up-to-date commands can be found here: https://pytorch.org/get-started/locally/.

For NVIDIA, you also need to install the CUDA runtime libraries:

conda install -y -c "nvidia/label/cuda-12.1.1" cuda-runtime

If you need nvcc to compile some library manually, replace the command above with

conda install -y -c "nvidia/label/cuda-12.1.1" cuda
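
To confirm the compiler is available after this step:

# nvcc should report release 12.1
nvcc --version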

Next, clone the repository and install its requirements:

git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r <requirements file according to table below>   # for example: pip install -r requirements.txt

Here I use pip install -r requirements.txt. AVX2 is the second generation of Intel's Advanced Vector Extensions instruction set; recent CPUs, including the 12th/13th/14th-gen Intel parts, support it. If you are unsure whether your CPU has it, see the check below.
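
A quick way to check for AVX2 from a shell (Linux/WSL shown; on Windows, Sysinternals Coreinfo reports the same flag):

# Prints the first CPU flags line containing avx2; no output means no AVX2
grep avx2 /proc/cpuinfo | head -1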

Requirements file to use:

GPU        CPU            Requirements file to use
NVIDIA     has AVX2       requirements.txt
NVIDIA     no AVX2        requirements_noavx2.txt
AMD        has AVX2       requirements_amd.txt
AMD        no AVX2        requirements_amd_noavx2.txt
CPU only   has AVX2       requirements_cpu_only.txt
CPU only   no AVX2        requirements_cpu_only_noavx2.txt
Apple      Intel          requirements_apple_intel.txt
Apple      Apple Silicon  requirements_apple_silicon.txt

Once the requirements are installed, start the web UI:

conda activate textgen
cd text-generation-webui
python server.py

Then browse to

http://localhost:7860/?__theme=dark
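
By default the UI is only reachable from the local machine. To expose it on your LAN, recent versions support --listen (and --listen-port); treat the exact flags as version-dependent and check python server.py --help:

# Bind to 0.0.0.0 so other machines on the network can reach the UI
python server.py --listen --listen-port 7860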

Problem:

>>> import exllamav2_ext
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: DLL load failed while importing exllamav2_ext:

These attempts did not fix the error:

--- attempt 1
# pip3 uninstall torch torchvision torchaudio
# conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
# pip install -r requirements.txt --upgrade

--- attempt 2
# pip install torch==2.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
conda install -y -c "nvidia/label/cuda-11.8.0" cuda-runtime
pip install -r requirements.txt

--- attempt 3 (still failed after both installs)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

The fix is to fall back to an older version of exllamav2, which is built against torch 2.1.2, from https://github.com/turboderp/exllamav2/releases/tag/v0.0.12:
# Download exllamav2-0.0.12+cu121-cp311-cp311-win_amd64.whl from the release page, install it locally, then install the packages below
pip install exllamav2-0.0.12+cu121-cp311-cp311-win_amd64.whl

pip install torch==2.1.2 --index-url https://download.pytorch.org/whl/cu121
pip install torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121
pip install torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu121

The environment then looks like this, and everything works:
(text-generation-webui) D:\Software\AI\text-generation-webui> pip list |findstr exllamav
exllamav2                         0.0.12+cu121

(text-generation-webui) D:\Software\AI\text-generation-webui> pip list |findstr torch
torch                             2.1.2+cu121
torchaudio                        2.1.2+cu121
torchvision                       0.16.2+cu121

(text-generation-webui) D:\Software\AI\text-generation-webui> pip list |findstr flash
flash-attn                        2.3.4

(text-generation-webui) D:\Software\AI\text-generation-webui> pip list |findstr transformers
ctransformers                     0.2.27+cu121
transformers                      4.37.2

python server.py now starts cleanly with no errors.

Problem solved.
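
To double-check, you can repeat the exact import that failed originally:

# This raised "DLL load failed" before the downgrade; it should now succeed
python -c "import exllamav2_ext; print('exllamav2_ext loaded ok')"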

Other notes:

For reference, these are the prebuilt-wheel pins from the project's requirements.txt. Each line carries a pip environment marker (the part after the semicolon), so pip only installs the wheel matching your OS, architecture, and Python version:

# bitsandbytes
bitsandbytes==0.41.1; platform_system != "Windows"
https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl; platform_system == "Windows"

# llama-cpp-python (CPU only, AVX2)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.36+cpuavx2-cp311-cp311-manylinux_2_31_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.36+cpuavx2-cp310-cp310-manylinux_2_31_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.36+cpuavx2-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.36+cpuavx2-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"

# llama-cpp-python (CUDA, no tensor cores)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.36+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.36+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.36+cu121-cp311-cp311-manylinux_2_31_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.36+cu121-cp310-cp310-manylinux_2_31_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"

# llama-cpp-python (CUDA, tensor cores)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.36+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.36+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.36+cu121-cp311-cp311-manylinux_2_31_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.36+cu121-cp310-cp310-manylinux_2_31_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"

# CUDA wheels
https://github.com/jllllll/AutoGPTQ/releases/download/v0.6.0/auto_gptq-0.6.0+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/jllllll/AutoGPTQ/releases/download/v0.6.0/auto_gptq-0.6.0+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/jllllll/AutoGPTQ/releases/download/v0.6.0/auto_gptq-0.6.0+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/jllllll/AutoGPTQ/releases/download/v0.6.0/auto_gptq-0.6.0+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/turboderp/exllamav2/releases/download/v0.0.12/exllamav2-0.0.12+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/turboderp/exllamav2/releases/download/v0.0.12/exllamav2-0.0.12+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/turboderp/exllamav2/releases/download/v0.0.12/exllamav2-0.0.12+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp/exllamav2/releases/download/v0.0.12/exllamav2-0.0.12+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/jllllll/flash-attention/releases/download/v2.3.4/flash_attn-2.3.4+cu121torch2.1cxx11abiFALSE-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/jllllll/flash-attention/releases/download/v2.3.4/flash_attn-2.3.4+cu121torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/Dao-AILab/flash-attention/releases/download/v2.3.4/flash_attn-2.3.4+cu122torch2.1cxx11abiFALSE-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/Dao-AILab/flash-attention/releases/download/v2.3.4/flash_attn-2.3.4+cu122torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/jllllll/GPTQ-for-LLaMa-CUDA/releases/download/0.1.1/gptq_for_llama-0.1.1+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/jllllll/GPTQ-for-LLaMa-CUDA/releases/download/0.1.1/gptq_for_llama-0.1.1+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/jllllll/GPTQ-for-LLaMa-CUDA/releases/download/0.1.1/gptq_for_llama-0.1.1+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/jllllll/GPTQ-for-LLaMa-CUDA/releases/download/0.1.1/gptq_for_llama-0.1.1+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/jllllll/ctransformers-cuBLAS-wheels/releases/download/AVX2/ctransformers-0.2.27+cu121-py3-none-any.whl
autoawq==0.1.8; platform_system == "Linux" or platform_system == "Windows"
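
The suffixes after the semicolons are standard PEP 508 environment markers. A small sketch of how they evaluate against the current machine, using the packaging library (assumed to be importable; pip vendors it internally):

# True on 64-bit Windows with Python 3.11, False elsewhere
python -c "from packaging.markers import Marker; print(Marker('platform_system == \"Windows\" and python_version == \"3.11\"').evaluate())"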

My model lives at G:\models\longchain-chatchat\Qwen\Qwen-14B-Chat, but the program looks for models under D:\Software\AI\text-generation-webui\models. How can I use it without moving the folder? Would a shortcut work? A Windows .lnk shortcut is not transparent to programs, but a directory symbolic link is:

  1. Change into the directory where the link should live, e.g. D:\Software\AI\text-generation-webui\models.
  2. Run the following command to create the symbolic link (here the link is named Qwen-14B-Chat):

mklink /d Qwen-14B-Chat "G:\models\longchain-chatchat\Qwen\Qwen-14B-Chat"

Note:

  • This command must be run from an elevated (administrator) prompt.
  • Symbolic links are available on Windows 10; some older Windows versions may not support them.

mklink rwkv-5-h-world-3B.pth "G:\models\RWKV\rwkv-5-h-world-3B.pth"

mklink /d Qwen-14B-Chat "G:\models\longchain-chatchat\Qwen\Qwen-14B-Chat"
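
If you would rather avoid symlinks entirely, recent versions of text-generation-webui also accept a --model-dir flag that points the server at an external models folder (check python server.py --help for your version):

# Point the web UI at the external models directory directly (assumed flag)
python server.py --model-dir "G:\models\longchain-chatchat\Qwen"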

https://wd-jishu.oss-cn-hangzhou.aliyuncs.com/img/image-20240205161359928.png@!full
