A Beginner's Path to Mastery, Part 3: AI from Basics to Proficiency, PyTorch Computer Vision Explained (Part 2)

We continue our walkthrough of computer vision.

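So far we have covered where computer vision is used, how to handle data with PyTorch, and how to set up some basic configuration and a baseline model. Below we cover the remaining parts, which include the following: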

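Topic	Contents
Model 1: Adding non-linearity	Experimenting is a large part of machine learning; let's try to improve on our baseline model by adding non-linear layers.
Model 2: Convolutional neural network (CNN)	Time to get computer-vision specific and introduce the powerful convolutional neural network architecture.
Comparing our models	We will have built three different models; let's compare them.
Evaluating our best model	Let's make some predictions on random images and evaluate our best model.
Making a confusion matrix	A confusion matrix is a great way to evaluate a classification model; let's see how to make one.
Saving and loading the best-performing model	Since we may want to use our model later, let's save it and make sure it loads back in correctly.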

Let's get started:

6 Model 1: Building a better model with non-linearity

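We learned about the power of non-linearity in the second article.

Looking at the data we have been working with, do you think it needs non-linear functions?

Remember, linear means straight lines and non-linear means non-straight lines.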
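As a quick illustration of what "non-straight" means here, the sketch below applies nn.ReLU() to a handful of made-up values. Negative inputs become zero and positive inputs pass through unchanged, so the output bends at zero instead of following one straight line. The sample values are assumptions chosen only for illustration.

```python
import torch
from torch import nn

# A few made-up sample values, negative and positive
x = torch.tensor([-2.0, -0.5, 0.0, 1.5, 3.0])

relu = nn.ReLU()
y = relu(x)  # element-wise max(0, x)

print(y.tolist())  # [0.0, 0.0, 0.0, 1.5, 3.0]
```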

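Let's find out.

We will do so by recreating a model similar to the previous one, except this time we will put a non-linear function (nn.ReLU()) after each linear layer.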

# Create a model with non-linear and linear layers
class FashionMNISTModelV1(nn.Module):
    def __init__(self, input_shape: int, hidden_units: int, output_shape: int):
        super().__init__()
        self.layer_stack = nn.Sequential(
            nn.Flatten(), # flatten inputs into single vector
            nn.Linear(in_features=input_shape, out_features=hidden_units),
            nn.ReLU(),
            nn.Linear(in_features=hidden_units, out_features=output_shape),
            nn.ReLU()
        )
    
    def forward(self, x: torch.Tensor):
        return self.layer_stack(x)

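Looks good.

Now let's instantiate it with the same settings we used before.

We need input_shape=784 (equal to the number of features in our image data), hidden_units=10 (starting small, the same as the baseline model) and output_shape=len(class_names) (one output unit per class).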
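To see where 784 comes from, here is a minimal sketch that flattens a dummy tensor. It assumes a single grayscale 28x28 image, which is the FashionMNIST image shape; the random tensor stands in for a real sample.

```python
import torch
from torch import nn

# Dummy batch: 1 image, 1 color channel, 28x28 pixels (assumed image shape)
dummy_image = torch.rand(1, 1, 28, 28)

flatten = nn.Flatten()  # keeps the batch dimension, flattens everything else
flat = flatten(dummy_image)

print(flat.shape)  # torch.Size([1, 784]) because 1 * 28 * 28 = 784
```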

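[!TIP]

Note: Notice how we keep most of our model's settings the same except for one change: adding non-linear layers. This is standard practice when running a series of machine learning experiments: change one thing, see what happens, then do it again, and again.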

torch.manual_seed(42)
model_1 = FashionMNISTModelV1(input_shape=784, # number of input features
    hidden_units=10,
    output_shape=len(class_names) # number of output classes desired
).to(device) # send model to GPU if it's available
print(next(model_1.parameters()).device) # check model device

Output:

cuda:0

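6.1 Setting up the loss, optimizer, and evaluation metric

As usual, we will set up a loss function, an optimizer, and an evaluation metric (we could use several evaluation metrics, but for now we will stick with accuracy).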

from helper_functions import accuracy_fn
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(params=model_1.parameters(), 
                            lr=0.1)
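The accuracy_fn imported above comes from a helper file that is not shown in this article. As a rough idea of what such a function might do, here is a hypothetical stand-in named accuracy_sketch (an assumption, not the helper's actual code), used together with nn.CrossEntropyLoss() on made-up logits and labels.

```python
import torch
from torch import nn

def accuracy_sketch(y_true: torch.Tensor, y_pred: torch.Tensor) -> float:
    """Hypothetical stand-in for accuracy_fn: percentage of matching labels."""
    correct = torch.eq(y_true, y_pred).sum().item()
    return correct / len(y_pred) * 100

# Made-up logits for 4 samples and 3 classes
logits = torch.tensor([[2.0, 0.1, 0.1],
                       [0.1, 2.0, 0.1],
                       [0.1, 0.1, 2.0],
                       [2.0, 0.1, 0.1]])
labels = torch.tensor([0, 1, 2, 1])  # the last sample is misclassified

# CrossEntropyLoss takes raw logits and integer class labels
loss = nn.CrossEntropyLoss()(logits, labels)
acc = accuracy_sketch(y_true=labels, y_pred=logits.argmax(dim=1))

print(f"loss: {loss.item():.4f} | accuracy: {acc:.2f}%")
```

Three of the four predictions match, so the accuracy here is 75%.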

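6.2 Functionizing the training and test loops

So far we have been writing train and test loops over and over.

Let's write them once more, but this time we will put them in functions so they can be called again and again.

Because we are now using device-agnostic code, we will make sure to call .to(device) on the feature (X) and target (y) tensors.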
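For reference, a minimal sketch of the device-agnostic pattern looks like the following. The random tensors are made-up stand-ins for one batch of features and targets, and the way device is chosen here is an assumption about how it was set up earlier in this series.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Made-up batch standing in for features (X) and targets (y)
X = torch.rand(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))

# .to() returns a tensor on the target device, so reassign to keep the result
X, y = X.to(device), y.to(device)
print(X.device, y.device)
```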

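For the training loop, we will create a function called train_step() that takes a model, a DataLoader, a loss function, and an optimizer.

The testing loop will be similar, but it will be called test_step() and will take a model, a DataLoader, a loss function, and an evaluation function.

[!TIP]

Note: Since these are functions, you can customize them any way you like. What we are building here can be considered barebones training and testing functions for our specific classification use case.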

def train_step(model: torch.nn.Module,
               data_loader: torch.utils.data.DataLoader,
               loss_fn: torch.nn.Module,
               optimizer: torch.optim.Optimizer,
               accuracy_fn,
               device: torch.device = device):
    train_loss, train_acc = 0, 0
    model.to(device)
    model.train() # put model in training mode (test_step() switches it to eval mode)
    for batch, (X, y) in enumerate(data_loader):
        # Send data to GPU
        X, y = X.to(device), y.to(device)

        # 1. Forward pass
        y_pred = model(X)

        # 2. Calculate loss
        loss = loss_fn(y_pred, y)
        train_loss += loss
        train_acc += accuracy_fn(y_true=y,
                                 y_pred=y_pred.argmax(dim=1)) # Go from logits -> pred labels

        # 3. Optimizer zero grad
        optimizer.zero_grad()

        # 4. Loss backward
        loss.backward()

        # 5. Optimizer step
        optimizer.step()

    # Calculate loss and accuracy per epoch and print out what's happening
    train_loss /= len(data_loader)
    train_acc /= len(data_loader)
    print(f"Train loss: {train_loss:.5f} | Train accuracy: {train_acc:.2f}%")

def test_step(data_loader: torch.utils.data.DataLoader,
              model: torch.nn.Module,
              loss_fn: torch.nn.Module,
              accuracy_fn,
              device: torch.device = device):
    test_loss, test_acc = 0, 0
    model.to(device)
    model.eval() # put model in eval mode
    # Turn on inference context manager
    with torch.inference_mode(): 
        for X, y in data_loader:
            # Send data to GPU
            X, y = X.to(device), y.to(device)
            
            # 1. Forward pass
            test_pred = model(X)
            
            # 2. Calculate loss and accuracy
            test_loss += loss_fn(test_pred, y)
            test_acc += accuracy_fn(y_true=y,
                y_pred=test_pred.argmax(dim=1) # Go from logits -> pred labels
            )
        
        # Adjust metrics and print out
        test_loss /= len(data_loader)
        test_acc /= len(data_loader)
        print(f"Test loss: {test_loss:.5f} | Test accuracy: {test_acc:.2f}%\n")
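Both functions above add up the loss and accuracy of every batch and then divide by len(data_loader). That length is the number of batches, not the number of samples. The sketch below shows this with a made-up dataset and hypothetical per-batch loss values.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Made-up dataset: 100 samples with batch size 32 gives 4 batches (the last one is smaller)
dataset = TensorDataset(torch.rand(100, 4), torch.randint(0, 2, (100,)))
loader = DataLoader(dataset, batch_size=32)

print(len(loader))  # number of batches, not number of samples

# Hypothetical per-batch losses, averaged the same way as in the functions above
batch_losses = [0.9, 0.7, 0.6, 0.4]
epoch_loss = sum(batch_losses) / len(loader)
print(f"{epoch_loss:.2f}")
```

Note that this average gives a smaller final batch the same weight as a full batch, so it is a close approximation of the per-sample average rather than an exact one.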

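Woohoo!

Now that we have functions for training and testing our model, let's run them.

We will do so inside another loop over the epochs.

That way, for each epoch we go through one training step and one testing step.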

[!TIP]

Note: You can customize how often you run the testing step. Sometimes people do it every 5 or 10 epochs; in our case, we do it every epoch.
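As a small sketch of testing less often, the loop below only reaches the testing branch on every 5th epoch. The names test_every and tested_epochs are hypothetical, and the calls to train_step() and test_step() are left as comments so the snippet runs on its own.

```python
epochs = 10
test_every = 5  # hypothetical setting: evaluate on every 5th epoch

tested_epochs = []
for epoch in range(epochs):
    # train_step(...) would run here on every epoch
    if (epoch + 1) % test_every == 0:
        # test_step(...) would run here
        tested_epochs.append(epoch)

print(tested_epochs)  # [4, 9]
```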

We will also time things to see how long our code takes to run on the GPU.

# Import tqdm for progress bar
from tqdm.auto import tqdm

torch.manual_seed(42)

# Measure time
from timeit import default_timer as timer
train_time_start_on_gpu = timer()

epochs = 3
for epoch in tqdm(range(epochs)):
    print(f"Epoch: {epoch}\n---------")
    train_step(data_loader=train_dataloader, 
        model=model_1, 
        loss_fn=loss_fn,
        optimizer=optimizer,
        accuracy_fn=accuracy_fn
    )
    test_step(data_loader=test_dataloader,
        model=model_1,
        loss_fn=loss_fn,
        accuracy_fn=accuracy_fn
    )

train_time_end_on_gpu = timer()
total_train_time_model_1 = print_train_time(start=train_time_start_on_gpu,
                                            end=train_time_end_on_gpu,
                                            device=device)

Output:

  0%|                                                                                            | 0/3 [00:00<?, ?it/s]Epoch: 0
---------
Train loss: 1.09199 | Train accuracy: 61.34%
Test loss: 0.95636 | Test accuracy: 65.00%

 33%|████████████████████████████                                                        | 1/3 [00:05<00:11,  5.79s/it]Epoch: 1
---------
Train loss: 0.78101 | Train accuracy: 71.93%
Test loss: 0.72227 | Test accuracy: 73.91%

 67%|████████████████████████████████████████████████████████                            | 2/3 [00:12<00:06,  6.26s/it]Epoch: 2
---------
Train loss: 0.67027 | Train accuracy: 75.94%
Test loss: 0.68500 | Test accuracy: 75.02%

100%|████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:19<00:00,  6.39s/it]
Train time on cuda: 19.161 seconds

Excellent!

Our model trained, but the training took longer?

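[!TIP]

Note: Training time on CUDA versus CPU depends largely on the quality of the CPU/GPU you are using. Read on for a more detailed answer.

Question: "I used a GPU but my model didn't train any faster. Why might that be?"

Answer: One reason could be that your dataset and model are both small (like the ones we are working with), so the benefit of using a GPU is outweighed by the time it takes to actually transfer the data there.

There is a small bottleneck in copying data from CPU memory (the default) to GPU memory.

So for smaller models and datasets, the CPU might actually be the best place to compute.

But for larger datasets and models, the speed of computation the GPU offers usually far outweighs the cost of getting the data there.

However, this depends heavily on the hardware you are using. With practice, you will get used to where the best place to train your models is.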
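If you want a rough feel for this trade-off on your own machine, a sketch along the following lines times the copy to the target device separately from a small computation. The tensor sizes are made-up, single measurements like these are noisy, and the first GPU operations include one-off startup cost, so treat the numbers as a hint rather than a benchmark.

```python
import torch
from timeit import default_timer as timer

device = "cuda" if torch.cuda.is_available() else "cpu"

def sync():
    # CUDA calls are asynchronous; wait for them to finish before reading the clock
    if device == "cuda":
        torch.cuda.synchronize()

x_cpu = torch.rand(64, 784)               # made-up small batch on the CPU
weights = torch.rand(784, 10).to(device)  # made-up weights on the target device
sync()

start = timer()
x_dev = x_cpu.to(device)  # host-to-device copy (no copy happens when device is "cpu")
sync()
transfer_time = timer() - start

start = timer()
out = x_dev @ weights     # a small matrix multiply on the target device
sync()
compute_time = timer() - start

print(f"transfer: {transfer_time:.6f}s | compute: {compute_time:.6f}s on {device}")
```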

Let's evaluate our trained model_1 using the eval_model() function and see how it went.

# Note: This will error due to `eval_model()` not using device agnostic code 
model_1_results = eval_model(model=model_1, 
    data_loader=test_dataloader,
    loss_fn=loss_fn, 
    accuracy_fn=accuracy_fn) 
print(model_1_results )

Output:

...
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

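Oh no!

It looks like our eval_model() function errored out with:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

This is because we set up our data and model to use device-agnostic code, but not our evaluation function.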
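One way to spot this kind of mismatch is to compare the device of the model's parameters with the device of the data before the forward pass. The sketch below uses a stand-in nn.Linear model and a made-up batch, both assumptions for illustration.

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Linear(in_features=784, out_features=10).to(device)  # stand-in model
X = torch.rand(8, 784)  # freshly created tensors live on the CPU by default

model_device = next(model.parameters()).device
print(f"model on {model_device}, data on {X.device}")

# Move the data to wherever the model lives before the forward pass
X = X.to(model_device)
y_pred = model(X)
print(y_pred.shape)
```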

How about we fix this by passing a target device parameter to our eval_model() function?

Then we will try calculating the results again.

# Move values to device
torch.manual_seed(42)
def eval_model(model: torch.nn.Module, 
               data_loader: torch.utils.data.DataLoader, 
               loss_fn: torch.nn.Module, 
               accuracy_fn, 
               device: torch.device = device):
    """Evaluates a given model on a given dataset.

    Args:
        model (torch.nn.Module): A PyTorch model capable of making predictions on data_loader.
        data_loader (torch.utils.data.DataLoader): The target dataset to predict on.
        loss_fn (torch.nn.Module): The loss function of model.
        accuracy_fn: An accuracy function to compare the model's predictions to the truth labels.
        device (str, optional): Target device to compute on. Defaults to device.

    Returns:
        (dict): Results of model making predictions on data_loader.
    """
    loss, acc = 0, 0
    model.eval()
    with torch.inference_mode():
        for X, y in data_loader:
            # Send data to the target device
            X, y = X.to(device), y.to(device)
            y_pred = model(X)
            loss += loss_fn(y_pred, y)
            acc += accuracy_fn(y_true=y, y_pred=y_pred.argmax(dim=1))
        
        # Scale loss and acc
        loss /= len(data_loader)
        acc /= len(data_loader)
    return {"model_name": model.__class__.__name__, # only works when model was created with a class
            "model_loss": loss.item(),
            "model_acc": acc}

# Calculate model 1 results with device-agnostic code 
model_1_results = eval_model(model=model_1, data_loader=test_dataloader,
    loss_fn=loss_fn, accuracy_fn=accuracy_fn,
    device=device
)
print(model_1_results)

Output:

{'model_name': 'FashionMNISTModelV1', 'model_loss': 2.302107095718384, 'model_acc': 10.75279552715655}

# Check baseline results
print(model_0_results)

Output:

{'model_name': 'FashionMNISTModelV0', 'model_loss': 2.3190648555755615, 'model_acc': 10.85263578