PyTorch Deep Learning Practice, Lecture 4: Backpropagation

Backpropagation

Computational Graph

A two-layer neural network:

$\hat{y}=W_2(W_1 \cdot X + b_1) + b_2$


The expression above can actually be simplified: a stack of purely linear layers always has an equivalent single-layer network.

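Expanding the two-layer expression makes the collapse explicit:

$\hat{y}=W_2(W_1 \cdot X + b_1) + b_2 = (W_2 W_1) X + (W_2 b_1 + b_2) = W \cdot X + b$, where $W = W_2 W_1$ and $b = W_2 b_1 + b_2$.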

If a nonlinear activation function is applied at the end of each layer, the expression can no longer be collapsed in this way.
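For example, with a nonlinearity $\sigma$ (such as a sigmoid) applied after the first layer, the weight matrices no longer merge:

$\hat{y}=W_2 \, \sigma(W_1 \cdot X + b_1) + b_2$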

The Chain Rule

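In general, if the loss depends on a parameter $w$ only through an intermediate value $g$, i.e. $loss = f(g(w))$, the chain rule gives

$\frac{\partial loss}{\partial w} = \frac{\partial loss}{\partial g} \cdot \frac{\partial g}{\partial w}$

The backward pass applies this rule node by node along the computational graph: each node receives the gradient of the loss with respect to its output and multiplies it by its own local derivative.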

The Backward Pass


A concrete example consists of a forward pass followed by a backward pass.


The loss is differentiated with respect to each parameter that needs to be updated.
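As a worked illustration (using the same model as the code below, $\hat{y} = x \cdot w$ with $loss = (\hat{y} - y)^2$, and sample values $x = 2$, $y = 4$, initial $w = 1$):

Forward: $\hat{y} = x \cdot w = 2$, so $loss = (2 - 4)^2 = 4$.

Backward: $\frac{\partial loss}{\partial \hat{y}} = 2(\hat{y} - y) = -4$ and $\frac{\partial \hat{y}}{\partial w} = x = 2$, so by the chain rule $\frac{\partial loss}{\partial w} = -4 \cdot 2 = -8$.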

PyTorch Implementation

In PyTorch, a Tensor holds both the data (its value) and grad (the gradient of the loss with respect to it).
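A minimal sketch of this mechanism (the numbers here are only illustrative): calling .backward() on a scalar result fills in the .grad attribute of every Tensor created with requires_grad=True.

import torch

w = torch.tensor([3.0], requires_grad=True)
loss = (2 * w - 4) ** 2    # a one-element result built from w
loss.backward()            # autograd computes d(loss)/dw along the graph

print(w.data)   # tensor([3.])  -- the value stored in the Tensor
print(w.grad)   # tensor([8.])  -- d(loss)/dw = 2 * (2w - 4) * 2 = 8 at w = 3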

The full training code is as follows:

import torch

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = torch.Tensor([1.0])
w.requires_grad = True  # If autograd is needed for a Tensor,
                        # its requires_grad attribute must be set to True

def forward(x):
    return x * w

def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) ** 2

print("predict (before training)", 4, forward(4).item())

for epoch in range(100):
    for x, y in zip(x_data, y_data):
        l = loss(x, y)
        l.backward()  # backward pass: compute grad for every Tensor whose requires_grad is True
        print("\tgrad:", x, y, w.grad.item())
        w.data = w.data - 0.01 * w.grad.data  # Note: operations on Tensors build a computational graph;
        # assign w.data - 0.01 * w.grad.data back to w.data rather than to a new variable,
        # otherwise a new computational graph would be created

        w.grad.data.zero_()  # the grad computed by .backward() is accumulated,
                             # so remember to reset it to zero after each update!

    print("progress:", epoch, l.item())

print("predict (after training)", 4, forward(4).item())

Run it on Colab.

Course source: https://www.bilibili.com/video/BV1Y7411d7Ys?p=4