Think of PyTorch as a data-processing pipeline that lets us filter information in several passes and check the result after each pass until we accept the final one.
A simple example makes this easier to picture.
Imagine a child at Christmas (PyTorch) unwrapping a gift packed in many layers until the final product appears: the desired gift.
Unwrapping the package involves two smart moves called the forward and backward passes.
The child's feedback corresponds to the loss and to backpropagation.
The child keeps unwrapping until he is satisfied, that is, until the loss is small enough (the loss and backpropagation steps).
When autograd computes the backward pass, every call that backpropagates into a variable adds to its stored gradient instead of resetting and replacing it (many network designs call backward more than once).
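A minimal sketch of this accumulation, assuming autograd is used. The tensor `w` and its values are made up for illustration and do not come from the program later in this post:

```python
import torch

# Hypothetical two-element parameter, used only for illustration.
w = torch.tensor([1.0, 2.0], requires_grad=True)

# First backward pass: d(sum(3 * w))/dw is 3 for every element.
(3 * w).sum().backward()
first = w.grad.clone()

# Second backward pass without a reset: the new gradient is added on top.
(3 * w).sum().backward()
accumulated = w.grad.clone()

# Reset the stored gradient in place before the next update step.
w.grad.zero_()
after_reset = w.grad.clone()

print(first, accumulated, after_reset)
```

This is why training loops that rely on autograd usually zero the gradients once per iteration.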
PyTorch comes with many loss functions.
Most example code creates a mean squared error loss function and then backpropagates the gradients based on that loss.
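As a rough sketch of that pattern, the snippet below uses `torch.nn.MSELoss` with summed squared errors and calls `backward()` on the result. The shapes and the names `prediction` and `target` are assumptions chosen for illustration:

```python
import torch

torch.manual_seed(0)

# Illustrative shapes only: 4 samples with 3 outputs each.
prediction = torch.randn(4, 3, requires_grad=True)
target = torch.randn(4, 3)

# reduction="sum" adds up the squared errors instead of averaging them.
loss_fn = torch.nn.MSELoss(reduction="sum")
loss = loss_fn(prediction, target)

# Backpropagate: fills prediction.grad with d(loss)/d(prediction).
loss.backward()

# For a summed squared error the gradient is 2 * (prediction - target).
expected_grad = 2.0 * (prediction.detach() - target)
print(loss.item())
```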
You may ask what shape the gift has. It can hold anything from a St. Nicholas rod (one-dimensional) to complex (multidimensional) structures; the simplest and most common is the square one (a two-dimensional matrix).
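In PyTorch terms these shapes are tensors with different numbers of dimensions. A small sketch, with variable names invented to match the analogy:

```python
import torch

rod = torch.tensor([1.0, 2.0, 3.0])  # one-dimensional: a vector
square = torch.zeros(2, 2)           # two-dimensional: a matrix
box = torch.zeros(2, 3, 4)           # three-dimensional tensor

# dim() gives the number of dimensions, shape gives the size of each one.
for item in (rod, square, box):
    print(item.dim(), tuple(item.shape))
```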
The gift is wrapped in mathematical functions, and these let the child understand what is inside.
But this child is special: he recognizes forms (matrices, shapes, simple formulas), and this allows him to open parts of the gift.
He can rotate these parts of the gift (mm).
The mm function is a matrix multiplication.
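A short illustration of `mm` on two small matrices whose values are chosen arbitrarily. The inner dimensions must match, and the result takes its row count from the first matrix and its column count from the second:

```python
import torch

a = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])            # shape (3, 2)
b = torch.tensor([[1.0, 0.0, 2.0, 0.0],
                  [0.0, 1.0, 0.0, 2.0]])  # shape (2, 4)

# (3, 2) times (2, 4) gives a (3, 4) matrix.
c = a.mm(b)
print(c.shape)
print(c)
```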
He can see the corners he can get from the gift.
ReLU stands for "rectified linear unit" and is a type of activation function.
Mathematically, it is defined as y = max(0, x).
He can see which parts of the gift are bigger or smaller so he can understand the gift.
The clamp function clamps all elements of the input into the range [min, max] and returns the resulting tensor.
The clamp should only affect gradients for values outside the min and max range.
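A minimal sketch of both statements, using autograd on made-up values. Elements inside the range pass their gradient through unchanged, while clamped elements get a zero gradient:

```python
import torch

x = torch.tensor([-2.0, 0.5, 3.0], requires_grad=True)

# Values below -1 become -1, values above 1 become 1.
y = x.clamp(min=-1.0, max=1.0)

# Backpropagate the sum so each element's gradient is easy to read.
y.sum().backward()

print(y)       # clamped values
print(x.grad)  # 1 inside the range, 0 where clamping took effect
```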
The pow function raises each element to the given exponent.
The clone function returns a copy of the self tensor. The copy has the same size and data type as self.
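Both functions can be tried on a small tensor. The values below are arbitrary and serve only to show the behavior:

```python
import torch

a = torch.tensor([1.0, -2.0, 3.0])

# pow(2) squares every element.
squared = a.pow(2)

# clone() makes an independent copy with the same size and data type.
b = a.clone()
b[0] = 100.0  # changing the copy leaves the original untouched

print(squared)
print(a, b)
```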
A common example: clamp(min=0) is exactly ReLU().
PyTorch provides ReLU and its variants through the torch.nn module.
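The equivalence is easy to check on a small tensor. This sketch compares `clamp(min=0)` with the `torch.nn.ReLU` module and the functional `torch.relu`, using values picked for illustration:

```python
import torch

h = torch.tensor([[-1.5, 0.0, 2.0],
                  [3.0, -0.1, 0.7]])

via_clamp = h.clamp(min=0)       # y = max(0, x) written as a clamp
via_module = torch.nn.ReLU()(h)  # module form from torch.nn
via_function = torch.relu(h)     # functional form

print(via_clamp)
print(torch.equal(via_clamp, via_module), torch.equal(via_clamp, via_function))
```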
If you run the program and look at the output, you will see that the child performs only five operations and is already pleased with the result.
The source code is based on one example from
here:
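import torch
dtype = torch.float
device = torch.device("cpu")
# batch size, input features, hidden units, output features
batch, input, hidden, output = 2, 10, 2, 5
# Random input data and targets (the targets have one row per sample)
x = torch.randn(batch, input, device=device, dtype=dtype)
y = torch.randn(batch, output, device=device, dtype=dtype)
# Randomly initialized weights of the two layers
w1 = torch.randn(input, hidden, device=device, dtype=dtype)
w2 = torch.randn(hidden, output, device=device, dtype=dtype)
l_r = 1e-6  # learning rate
for t in range(5):
    # Forward pass: compute the predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)
    # Sum of squared errors as a Python number
    loss = (y_p - y).pow(2).sum().item()
    print("t=", t, "loss=", loss, "\n")
    # Backward pass: gradients of the loss with respect to w1 and w2
    g_y_p = 2.0 * (y_p - y)
    g_w2 = h_r.t().mm(g_y_p)
    g_h_r = g_y_p.mm(w2.t())
    g_h = g_h_r.clone()
    g_h[h < 0] = 0
    g_w1 = x.t().mm(g_h)
    # Gradient descent update of the weights
    w1 -= l_r * g_w1
    w2 -= l_r * g_w2
    print("w1=", w1, "w2=", w2, "\n")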
The child's result after five operations.
...
t= 4 loss= 25.40263557434082
w1= tensor([[ 1.5933, 0.3818],
[-1.0043, -1.3362],
[ 0.5841, -1.9811],
[ 2.3483, 0.5748],
[ 0.5904, -0.2521],
[-0.6612, 2.7945],
[ 0.4841, -0.5894],
[-1.4434, -0.1421],
[-1.2712, -1.4269],
[ 0.7929, 0.2040]]) w2= tensor([[ 1.7389, 0.4337, 0.4557, 1.3704, 0.3819],
[ 0.2937, 0.0212, -0.4604, -1.0564, -1.5403]])