Installation (on Apple Silicon)

PyTorch uses the new Metal Performance Shaders (MPS) backend for GPU training acceleration. This MPS backend extends the PyTorch framework, providing scripts and capabilities to set up and run operations on Mac. The MPS framework optimizes compute performance with kernels that are fine-tuned for the unique characteristics of each Metal GPU family. The new mps device maps machine learning computational graphs and primitives on the MPS Graph framework and tuned kernels provided by MPS.

Installing Conda

Miniforge is a community-maintained version of Miniconda. It supported the Mac M1 earlier than Miniconda did, so I chose Miniforge:

cd ~/Downloads
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
sh Miniforge3-MacOSX-arm64.sh
conda update conda

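Installing Torch and the d2l package

As of 2023-03-21, the environment configured for Dive into Deep Learning (PyTorch edition) is Python 3.9, torch 1.12.0, and torchvision 0.13.0. However, PyTorch 2.0 was released on 2023-03-15. Compared with version 1.13.1, it offers the following advantages: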

  • torch.compile is the main API for PyTorch 2.0, which wraps your model and returns a compiled model. It is a fully additive (and optional) feature and hence 2.0 is 100% backward compatible by definition.
  • Metal Performance Shaders (MPS) backend provides GPU accelerated PyTorch training on Mac platforms with added support for Top 60 most used ops, bringing coverage to over 300 operators.

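The following therefore describes how to set up Torch 2.0 on a Mac M1.

We can install PyTorch as follows (on Apple Silicon, the same package provides both the CPU and the MPS GPU backends):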

conda create --name d2l-torch2 python=3.10 -y
conda activate d2l-torch2
conda install pytorch torchvision torchaudio -c pytorch

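Our next step is to install the d2l package, which gives convenient access to the functions and classes used frequently in this book: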

conda install matplotlib pandas scikit-learn requests jupyter jupyterlab
conda install gym=0.21.0 
pip install d2l==1.0.0b0

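As of 2023-03-21:

  • d2l 1.0.0b0 is a pre-release version; it is the version used by the English edition of the textbook
  • d2l 1.0.0b0 strictly requires gym=0.21.0
  • gym<=0.26.1 (every released version of gym) is incompatible with the latest Python release (3.11)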
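
To double-check the constraints listed above after installing, a small script along these lines can report what actually ended up in the environment. This is only a sketch: the package list and the helper names (installed_version, python_supports_gym, report) are my own choices, and treating every Python from 3.11 onward as too new for gym is an assumption extrapolated from the note above.

```python
import sys
from importlib import metadata

# Packages named in this guide's install commands.
PACKAGES = ("torch", "torchvision", "torchaudio", "d2l", "gym")


def installed_version(name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def python_supports_gym(version_info=sys.version_info):
    """Assumption from the note above: released gym versions do not work on Python 3.11+."""
    return tuple(version_info[:2]) < (3, 11)


def report():
    """Collect one human-readable line for Python plus one per package."""
    py = f"python {sys.version_info.major}.{sys.version_info.minor}"
    if not python_supports_gym():
        py += " (too new for gym, per the note above)"
    lines = [py]
    for name in PACKAGES:
        lines.append(f"{name}: {installed_version(name) or 'not installed'}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Run it inside the activated d2l-torch2 environment; any line reading "not installed" points at a step that needs repeating.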

Downloading the D2L Notebooks

Next, download the code for this book.

mkdir d2l-zh && cd d2l-zh
curl https://zh-v2.d2l.ai/d2l-zh-2.0.0.zip -o d2l-zh.zip
unzip d2l-zh.zip && rm d2l-zh.zip
cd pytorch

Once installation is complete, we can open the Jupyter notebooks by running the following commands.

conda activate d2l-torch2
jupyter-lab

Verifying the installation

You can verify mps support using a simple Python script:

import torch
if torch.backends.mps.is_available():
    mps_device = torch.device("mps")
    x = torch.ones(1, device=mps_device)
    print(x)
else:
    print("MPS device not found.")

The output should show:

tensor([1.], device='mps:0')
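
If the script prints "MPS device not found." instead, it helps to know why. The sketch below separates the two usual causes using torch.backends.mps.is_built() alongside is_available(). The function names explain_mps_status and check_mps are hypothetical helpers of mine, not part of PyTorch.

```python
def explain_mps_status(is_available, is_built):
    """Turn the two MPS flags into a human-readable diagnosis."""
    if is_available:
        return "MPS is available."
    if not is_built:
        return "MPS unavailable: this PyTorch install was not built with MPS enabled."
    return ("MPS unavailable: macOS is older than 12.3 and/or "
            "this machine has no MPS-enabled device.")


def check_mps():
    """Query PyTorch for both flags and print the diagnosis."""
    import torch  # lazy import: only needed when actually querying PyTorch

    print(explain_mps_status(torch.backends.mps.is_available(),
                             torch.backends.mps.is_built()))
```

Calling check_mps() in the d2l-torch2 environment should print the first message on a correctly configured Apple Silicon Mac.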

Performance test

In jupyter-lab, open chapter_convolutional-modern/lenet.ipynb

In the final function call of the notebook, replace d2l.try_gpu() with torch.device("mps") to enable GPU acceleration on Apple Silicon. The modified code cell looks like this:

lr, num_epochs = 0.9, 10
train_ch6(net, train_iter, test_iter, num_epochs, lr, torch.device("mps"))
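
Hard-coding torch.device("mps") fails on machines without MPS. As an alternative, a fallback helper could be sketched as follows. Both pick_device_name and try_mps are hypothetical names of mine (they are not part of d2l or PyTorch), and the MPS, then CUDA, then CPU preference order is an assumption.

```python
def pick_device_name(mps_available, cuda_available=False):
    """Pure selection rule: prefer MPS, then CUDA, then CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def try_mps():
    """Hypothetical helper: return the best available torch.device."""
    import torch  # lazy import so the selection rule stays usable without torch

    name = pick_device_name(
        torch.backends.mps.is_available(),
        torch.cuda.is_available(),
    )
    return torch.device(name)
```

With this in place, the last argument of the call above could be try_mps() instead of torch.device("mps"), and the same notebook would still run on a machine without an Apple GPU.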

On a MacBook Air with an Apple M1 chip, 8 GB of memory, and 512 GB of storage (macOS Ventura 13.2.1), I obtained the following throughput:

loss 0.468, train acc 0.826, test acc 0.808
24814.3 examples/sec on mps

While the computation is running, you can open GPU History in the macOS Activity Monitor to watch GPU usage.
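
For a rough examples/sec figure outside the notebook, a timing helper like the one below can be used. This is a generic sketch, not how train_ch6 computes its number. The name examples_per_sec and its parameters are my own, and the stand-in workload exists only so the block runs without PyTorch.

```python
import time


def examples_per_sec(step, num_examples, repeats=10, sync=None):
    """Time `repeats` calls to `step` and return examples processed per second.

    `sync` is an optional callable that blocks until queued device work finishes
    (GPU backends run asynchronously, so timing without it can look too fast).
    """
    if sync is not None:
        sync()  # drain earlier work before starting the clock
    start = time.perf_counter()
    for _ in range(repeats):
        step()
    if sync is not None:
        sync()  # wait for the device before stopping the clock
    elapsed = time.perf_counter() - start
    return num_examples * repeats / elapsed


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without PyTorch installed.
    batch = list(range(256))
    rate = examples_per_sec(lambda: sum(v * v for v in batch), len(batch))
    print(f"{rate:.1f} examples/sec")
```

To time a real model on the mps device, step would run one batch through the network and sync would be torch.mps.synchronize, assuming PyTorch 2.0 or later where that function is available.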