PyTorch Deep Learning Practice, Lecture 4: Backpropagation

Backpropagation

Computational Graph

A two-layer neural network:

$\hat{y}=W_2(W_1 \cdot X + b_1) + b_2$


The expression above can actually be simplified: a stack of purely linear layers always has an equivalent single-layer network.

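Expanding the two-layer expression makes the collapse explicit:

$\hat{y}=W_2(W_1 \cdot X + b_1) + b_2 = (W_2 W_1) X + (W_2 b_1 + b_2) = W \cdot X + b$, where $W = W_2 W_1$ and $b = W_2 b_1 + b_2$.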

If a nonlinear activation function is applied at the end of each layer, the expression can no longer be collapsed in this way.
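For example, with a nonlinearity $\sigma$ (such as a sigmoid) applied after the first layer, the weight matrices no longer merge:

$\hat{y}=W_2 \, \sigma(W_1 \cdot X + b_1) + b_2$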

The Chain Rule

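In general, if the loss depends on a parameter $w$ only through an intermediate value $g$, i.e. $loss = f(g(w))$, the chain rule gives

$\frac{\partial loss}{\partial w} = \frac{\partial loss}{\partial g} \cdot \frac{\partial g}{\partial w}$

The backward pass applies this rule node by node along the computational graph: each node receives the gradient of the loss with respect to its output and multiplies it by its own local derivative.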

The Backward Pass


A concrete example consists of a forward pass followed by a backward pass.


The loss is differentiated with respect to each parameter that needs to be updated.
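As a worked illustration (using the same model as the code below, $\hat{y} = x \cdot w$ with $loss = (\hat{y} - y)^2$, and sample values $x = 2$, $y = 4$, initial $w = 1$):

Forward: $\hat{y} = x \cdot w = 2$, so $loss = (2 - 4)^2 = 4$.

Backward: $\frac{\partial loss}{\partial \hat{y}} = 2(\hat{y} - y) = -4$ and $\frac{\partial \hat{y}}{\partial w} = x = 2$, so by the chain rule $\frac{\partial loss}{\partial w} = -4 \cdot 2 = -8$.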

PyTorch Implementation

In PyTorch, a Tensor holds both the data (its value) and grad (the gradient of the loss with respect to it).
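A minimal sketch of this mechanism (the numbers here are only illustrative): calling .backward() on a scalar result fills in the .grad attribute of every Tensor created with requires_grad=True.

import torch

w = torch.tensor([3.0], requires_grad=True)
loss = (2 * w - 4) ** 2    # a one-element result built from w
loss.backward()            # autograd computes d(loss)/dw along the graph

print(w.data)   # tensor([3.])  -- the value stored in the Tensor
print(w.grad)   # tensor([8.])  -- d(loss)/dw = 2 * (2w - 4) * 2 = 8 at w = 3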

The full training code is as follows:

import torch

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = torch.Tensor([1.0])
w.requires_grad = True  # If autograd is needed for a Tensor,
                        # its requires_grad attribute must be set to True

def forward(x):
    return x * w

def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) ** 2

print("predict (before training)", 4, forward(4).item())

for epoch in range(100):
    for x, y in zip(x_data, y_data):
        l = loss(x, y)
        l.backward()  # backward pass: compute grad for every Tensor whose requires_grad is True
        print("\tgrad:", x, y, w.grad.item())
        w.data = w.data - 0.01 * w.grad.data  # Note: operations on Tensors build a computational graph;
        # assign w.data - 0.01 * w.grad.data back to w.data rather than to a new variable,
        # otherwise a new computational graph would be created

        w.grad.data.zero_()  # the grad computed by .backward() is accumulated,
                             # so remember to reset it to zero after each update!

    print("progress:", epoch, l.item())

print("predict (after training)", 4, forward(4).item())

Run it on Colab.

Course source: https://www.bilibili.com/video/BV1Y7411d7Ys?p=4