6 Large Language Models - Topline

6.1 Setup

6.1.1 Hardware

CPU: Intel Xeon E-2286M @ 2.4 GHz
Memory: 64 GB, 2400 MHz
GPU: NVIDIA Quadro RTX 5000, 16 GB VRAM

6.1.2 Software (on Windows)

6.1.2.1 Install the NVIDIA Driver

https://www.nvidia.com/download/index.aspx

6.1.2.2 Install WSL - Ubuntu

Enable the Windows Subsystem for Linux (run in PowerShell as Administrator)

Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux

Enable Virtual Machine feature

dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart

Download the Linux kernel update package

https://learn.microsoft.com/en-us/windows/wsl/install-manual#step-4---download-the-linux-kernel-update-package

Set WSL 2 as your default version

wsl --set-default-version 2

Install Ubuntu 22.04 from the Microsoft Store.

6.1.3 Software (on WSL)

Open Windows Terminal and add an Ubuntu bash tab. Run all of the following steps inside Ubuntu.
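Since the remaining steps assume a WSL 2 environment, it may help to confirm from inside Ubuntu that the shell really runs under WSL 2. The sketch below is only a heuristic: it assumes that WSL kernels report a release string containing "microsoft" and that WSL 2 kernels additionally contain "WSL2" (custom kernels may not). `classify_wsl` is a hypothetical helper name, not part of any tool.

```python
import platform


def classify_wsl(release: str) -> str:
    """Heuristically classify a kernel release string as 'wsl2', 'wsl1', or 'not-wsl'."""
    lowered = release.lower()
    if "microsoft" not in lowered:
        return "not-wsl"
    # Assumption: stock WSL 2 kernels carry "WSL2" in the release string,
    # while WSL 1 only carries "Microsoft".
    return "wsl2" if "wsl2" in lowered else "wsl1"


if __name__ == "__main__":
    # Classify the kernel this interpreter is currently running on.
    print(classify_wsl(platform.uname().release))
```

If this prints `wsl1`, the distribution was probably installed before WSL 2 became the default version.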

6.1.3.1 Install Git Large File Storage (Git LFS)

Reference: https://github.com/git-lfs/git-lfs/blob/main/INSTALLING.md

Adding the packagecloud repository
- curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
Installing packages
- sudo apt-get install git-lfs

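6.1.3.2 Clone the Code and the Large Language Model (LLM) Repositories

cd ~
# Set up the Git LFS hooks for this user (safe to run more than once)
git lfs install
git clone https://github.com/KMnO4-zx/xfg-paper.git
git clone https://huggingface.co/THUDM/chatglm2-6b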
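When Git LFS is not active during a clone, large weight files are checked out as small text pointer files rather than the real data. The following is a minimal sketch for spotting that situation, assuming the model was cloned to `~/chatglm2-6b` as above. `find_lfs_pointers` is a hypothetical helper; the prefix it looks for is the first line of a Git LFS pointer file.

```python
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(repo_dir):
    """Return files under repo_dir that are still Git LFS pointer stubs."""
    root = Path(repo_dir)
    pointers = []
    for path in sorted(root.rglob("*")):
        # Skip directories and Git's own metadata.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        with path.open("rb") as handle:
            head = handle.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            pointers.append(path)
    return pointers


if __name__ == "__main__":
    repo = Path.home() / "chatglm2-6b"  # clone location assumed from the steps above
    if repo.is_dir():
        for stub in find_lfs_pointers(repo):
            print(f"still a pointer file: {stub}")
```

If any files are listed, running `git lfs pull` inside the repository should fetch the actual contents.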

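6.1.3.3 Set Up the Python Environment

Install Miniconda

mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm -rf ~/miniconda3/miniconda.sh
# Make the conda command available in bash, then reopen the terminal
~/miniconda3/bin/conda init bash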

Update Conda

conda update --all

Create a virtual environment

conda create -n datawhale python
conda activate datawhale

Install PyTorch

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Installing PyTorch on WSL

I recommend installing PyTorch with pip rather than conda. When I installed PyTorch with conda, it could not find the CUDA device.
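To check whether an installation has the problem described here, one option is to ask PyTorch directly whether it sees the GPU. This is a minimal sketch; `cuda_report` is a hypothetical helper name, and it uses only `torch.cuda.is_available()` and `torch.cuda.get_device_name()`. It also returns a sensible result when PyTorch is not installed at all.

```python
import importlib.util


def cuda_report():
    """Summarize whether PyTorch is installed and can see a CUDA device."""
    report = {"torch_installed": False, "cuda_available": False, "device_name": None}
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is missing from the active environment
    import torch

    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    if report["cuda_available"]:
        # Index 0 is the first visible GPU.
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    print(cuda_report())
```

Run it inside the activated `datawhale` environment. If `cuda_available` is `False` while `torch_installed` is `True`, the installed build cannot reach the GPU, which matches the symptom seen with the conda package.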

Install the other Python packages

cd ~/xfg-paper
pip install -r requirements.txt

6.2 Running the Large Language Model

6.2.1 Fine-tuning

cd ~/xfg-paper
bash ./xfg_train.sh

On my hardware, fine-tuning takes about 4 hours 30 minutes (the run had not finished by the time I submitted the third check-in).
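Because the run is long, it can be convenient to record the wall-clock time automatically instead of estimating it. The sketch below wraps the training script in a timer. `run_timed` and `format_duration` are hypothetical helper names, and the script location is assumed to be `~/xfg-paper/xfg_train.sh` as in the commands above.

```python
import subprocess
import time
from pathlib import Path


def run_timed(command, cwd=None):
    """Run a command and return (exit code, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(command, cwd=cwd)
    return completed.returncode, time.perf_counter() - start


def format_duration(seconds):
    """Format a duration in seconds as 'Hh MMm SSs'."""
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


if __name__ == "__main__":
    repo = Path.home() / "xfg-paper"  # clone location assumed from the steps above
    if (repo / "xfg_train.sh").is_file():
        code, elapsed = run_timed(["bash", "./xfg_train.sh"], cwd=repo)
        print(f"exit code {code}, elapsed {format_duration(elapsed)}")
```

The timer measures the whole script, including model loading, so it is an upper bound on the pure training time.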