PyTorch Basic Operations

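While interviewing, I realized just how weak my hands-on coding skills with deep learning frameworks are.
When asked whether I was more familiar with PyTorch or TensorFlow, I had no confidence in either answer.
Lesson learned: I decided to catch up on the PyTorch basics.
This post focuses on the features I use day to day and fills in the gaps.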

Common tensor attributes

x = torch.rand(2, 2, 4)

Number of dimensions: x.dim()
Shape: x.shape, which is an alias of x.size(); the two are equivalent
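As a quick check of the attributes above, here is a minimal sketch. The variable names (num_dims, shape_attr, and so on) are only illustrative and are not from the original notes.

```python
import torch

x = torch.rand(2, 2, 4)

num_dims = x.dim()      # number of dimensions of the tensor
shape_attr = x.shape    # a torch.Size, here torch.Size([2, 2, 4])
shape_call = x.size()   # same information as x.shape
last_dim = x.size(-1)   # size of one dimension, returned as a plain int

print(num_dims, shape_attr, shape_call, last_dim)
```

Passing a dimension index to size() is handy when only one extent is needed, for example the sequence length.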

Tensor operations

repeat_interleave

Purpose: repeat each element of a tensor a given number of times, producing a new tensor

a = torch.tensor([[2.0, 3.0], [1.0, 4.0]])
torch.repeat_interleave(a, 2)

The result is tensor([2., 2., 3., 3., 1., 1., 4., 4.]). Because no dim is given, the input is flattened before the elements are repeated.
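repeat_interleave also accepts a dim argument, which keeps the tensor's layout instead of flattening it. The sketch below reuses the same a; the names flat, rows and cols are just for illustration.

```python
import torch

a = torch.tensor([[2.0, 3.0], [1.0, 4.0]])

# No dim: the input is flattened, then every element is repeated twice.
flat = torch.repeat_interleave(a, 2)

# dim=0: every row is repeated twice, giving shape [4, 2].
rows = torch.repeat_interleave(a, 2, dim=0)

# dim=1: every column is repeated twice, giving shape [2, 4].
cols = torch.repeat_interleave(a, 2, dim=1)

print(flat)
print(rows)
print(cols)
```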

Adding a dimension

You can add a new dimension of size 1 to a tensor ("unsqueeze" a dimension).
Indexing with None inserts such a dimension.
For example, given valid_len = tensor([2, 2, 3, 3]),
valid_len[:, None] gives tensor([[2],[2],[3],[3]])
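For comparison, a small sketch showing that indexing with None and calling unsqueeze give the same result. The names col_a, col_b and row are illustrative only.

```python
import torch

valid_len = torch.tensor([2, 2, 3, 3])

col_a = valid_len[:, None]       # shape [4, 1], new axis inserted at the end
col_b = valid_len.unsqueeze(1)   # same result through the explicit method
row = valid_len[None, :]         # shape [1, 4], new axis inserted at the front

print(col_a.shape, col_b.shape, row.shape)
```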

reshape

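Purpose: change the shape of a tensor
A common use is a.reshape(-1), which flattens the whole tensor into a 1-D vector
Another example is a.reshape([1, 2, 2]), which passes the target shape as a list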
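A short sketch of the reshape calls described above, using a small 2x2 tensor as an assumed input. The -1 placeholder lets PyTorch infer one dimension from the total number of elements.

```python
import torch

a = torch.tensor([[2.0, 3.0], [1.0, 4.0]])

flat = a.reshape(-1)           # shape [4]: all elements in one vector
cube = a.reshape([1, 2, 2])    # shape [1, 2, 2]: target shape passed as a list
column = a.reshape(4, -1)      # shape [4, 1]: the -1 is inferred as 1

print(flat.shape, cube.shape, column.shape)
```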

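Building a mask matrix with broadcasting

Suppose we have a tensor X of shape [x, 4] and a valid_len tensor with one entry per row of X (shape [4] in this example, so x = 4).
Combining the add-a-dimension trick above with broadcasting, we can build a mask matrix and set everything beyond valid_len to 0.
It works like this: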

maxlen = X.shape[-1]
mask = torch.arange(maxlen, dtype=torch.float32)[None, :] < valid_len[:, None]

The left-hand side is tensor([[0., 1., 2., 3.]])
The right-hand side is tensor([[2],[2],[3],[3]])
The comparison broadcasts both sides to the same shape and gives

tensor([[ True,  True, False, False],
        [ True,  True, False, False],
        [ True,  True,  True, False],
        [ True,  True,  True, False]])

One more step, X[~mask] = 0, sets everything beyond valid_len to 0
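Putting the pieces together, here is a minimal end-to-end sketch. It assumes X is a 4x4 tensor of ones, chosen so the masked positions are easy to see; the name positions is illustrative.

```python
import torch

X = torch.ones(4, 4)
valid_len = torch.tensor([2, 2, 3, 3])

maxlen = X.shape[-1]
# Column positions 0..maxlen-1 as a [1, 4] row vector.
positions = torch.arange(maxlen, dtype=torch.float32)[None, :]
# Comparing [1, 4] with [4, 1] broadcasts to a [4, 4] boolean mask.
mask = positions < valid_len[:, None]
# Zero out every position at or beyond the valid length of its row.
X[~mask] = 0

print(X)
```

With these inputs, the first two rows keep two ones each and the last two rows keep three.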

gather

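Purpose: pick values out of the source tensor along a given dim at given indices
Use case: conveniently pull the data at specified indices out of a batched tensor; the indices are fully customizable and may be in any order
The full semantics are not easy to grasp, so only the basic usage is covered here
When computing cross-entropy, we need to take the predicted probability at each sample's label out of the batch to compute the loss
Given y_hat = torch.tensor([[0.1, 0.3, 0.6], [0.3, 0.2, 0.5]]) and labels y = torch.LongTensor([0, 2]),
we need element 0 of the first row and element 2 of the second row, which is exactly what gather does
y_hat.gather(1, y.reshape(-1, 1)) returns tensor([[0.1000],[0.5000]])
The first argument is dim; dim=1 means that for each row, the element at the given index is taken
A few more examples to help build intuition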

import torch
tensor_0 = torch.arange(3, 12).view(3, 3)
print(tensor_0)
# tensor([[ 3,  4,  5],
#         [ 6,  7,  8],
#         [ 9, 10, 11]])
index = torch.tensor([[2, 1, 0]])
print(tensor_0.gather(0, index))
# tensor([[9, 7, 5]])
print(tensor_0.gather(1, index))
# tensor([[5, 4, 3]])
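The cross-entropy use case can be sketched as follows. This is only an illustration of picking label probabilities with gather, not a replacement for the built-in loss functions; the names picked, same and loss are illustrative.

```python
import torch

y_hat = torch.tensor([[0.1, 0.3, 0.6], [0.3, 0.2, 0.5]])
y = torch.LongTensor([0, 2])

# The index must have the same number of dimensions as y_hat, hence the reshape to [2, 1].
picked = y_hat.gather(1, y.reshape(-1, 1))

# Equivalent selection with advanced indexing, returned as a 1-D tensor.
same = y_hat[torch.arange(len(y_hat)), y]

# Per-sample negative log-likelihood of the true label.
loss = -torch.log(picked)

print(picked, same, loss)
```

Here picked holds 0.1 for the first sample and 0.5 for the second, the probabilities assigned to labels 0 and 2.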

Softmax function

Import

import torch.nn.functional as F

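Purpose: map the values of a tensor into the range 0 to 1 so that they sum to 1 along the chosen dimension, which makes them valid probabilities
Usage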

x = torch.rand(2, 2, 4)
F.softmax(x, dim=-1)

Original x

tensor([[[0.9074, 0.9394, 0.3391, 0.6002],
         [0.0155, 0.6998, 0.3552, 0.4390]],
        [[0.4515, 0.6038, 0.4322, 0.6438],
         [0.0143, 0.0564, 0.3362, 0.3688]]])

x after softmax

tensor([[[0.2999, 0.3097, 0.1699, 0.2206],
         [0.1691, 0.3352, 0.2375, 0.2582]],
        [[0.2295, 0.2673, 0.2251, 0.2782],
         [0.2063, 0.2151, 0.2846, 0.2940]]])

softmax can also be called directly on a tensor, with similar syntax

a.softmax(dim=-1)

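Note that softmax takes a dim argument that specifies the dimension to normalize over
In the vast majority of cases it can be -1, meaning softmax is taken over the vectors along the last dimension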
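To see what dim controls, the sketch below checks along which axis the outputs sum to 1. The input is random, so only the sums are checked, not individual values; the names last and first are illustrative.

```python
import torch
import torch.nn.functional as F

x = torch.rand(2, 2, 4)

last = F.softmax(x, dim=-1)   # normalize each length-4 vector along the last axis
first = F.softmax(x, dim=0)   # normalize across the first axis instead

# Each slice along the chosen dim sums to 1 (up to floating-point error).
print(last.sum(dim=-1))   # shape [2, 2], all close to 1
print(first.sum(dim=0))   # shape [2, 4], all close to 1
```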

pad function

Import

import torch.nn.functional as F

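Purpose: pad the input tensor before and after, as needed
def pad(input, pad, mode='constant', value=0)
input: the input tensor
pad: a tuple specifying which dimensions to pad and by how much
mode: the padding mode
value: the fill value
For a tensor x,
pad(x, (1, 4)) pads the last dimension of x with 1 element in front and 4 elements at the end
More dimensions can be padded by appending further pairs to the pad tuple; the pairs apply from the last dimension backwards
For example, take the following tensor t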

tensor([[[-0.6197,  0.5243],
         [-1.1442,  1.2088],
         [ 0.0783, -0.2526]]])

Applying F.pad(t, (1,1)) returns the padded tensor

tensor([[[ 0.0000, -0.6197,  0.5243,  0.0000],
         [ 0.0000, -1.1442,  1.2088,  0.0000],
         [ 0.0000,  0.0783, -0.2526,  0.0000]]])

Padding more than one dimension with F.pad(t, (1,1,1,2)) gives

tensor([[[ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000, -0.6197,  0.5243,  0.0000],
         [ 0.0000, -1.1442,  1.2088,  0.0000],
         [ 0.0000,  0.0783, -0.2526,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000]]])
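The padding calls above can be checked with a small sketch. It assumes a tensor of ones with shape [1, 3, 2] in place of the random t, so that the padded positions stand out; the names p1, p2 and p3 are illustrative.

```python
import torch
import torch.nn.functional as F

t = torch.ones(1, 3, 2)

# One pair: pad the last dimension by 1 on each side, giving shape [1, 3, 4].
p1 = F.pad(t, (1, 1))

# Two pairs: the last dim gets (1, 1), the second-to-last gets (1, 2), giving shape [1, 6, 4].
p2 = F.pad(t, (1, 1, 1, 2))

# A custom fill value in constant mode: 1 in front, 4 at the end, giving shape [1, 3, 7].
p3 = F.pad(t, (1, 4), mode="constant", value=-1.0)

print(p1.shape, p2.shape, p3.shape)
```

Reading the tuple from left to right, the first pair always refers to the last dimension, the next pair to the one before it, and so on.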