The code in this article is based on PyTorch 1.0 and requires the following packages:
import collections
import os
import shutil
import tqdm

import numpy as np
import PIL.Image
import torch
import torchvision
1 Basic configuration
1-1 Check the PyTorch version
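For example, the following standard attributes and helpers report the relevant versions and the GPU type:
torch.__version__               # PyTorch version
torch.version.cuda              # Corresponding CUDA version
torch.backends.cudnn.version()  # Corresponding cuDNN version
torch.cuda.get_device_name(0)   # GPU type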
1-2 Update PyTorch
PyTorch will be installed in the anaconda3/lib/python3.7/site-packages/torch/ directory.
conda update pytorch torchvision -c pytorch
1-3 Fix random seeds
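A minimal way to do this with standard NumPy/PyTorch calls (the seed value 0 is arbitrary):
np.random.seed(0)
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)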
1-4 Run the program on specific GPU cards
Specify the environment variable on the command line:
CUDA_VISIBLE_DEVICES=0,1 python train.py
Or specify it in code:
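For example, set the environment variable before any CUDA call (os is among the imports listed above):
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'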
1-5 Check whether CUDA is available
torch.cuda.is_available()
1-6 Enable cuDNN benchmark mode
Benchmark mode speeds up computation, but because of randomness in the computation, the forward results of the network differ slightly between runs.
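Enable it with:
torch.backends.cudnn.benchmark = True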
If you want to avoid this fluctuation, set
torch.backends.cudnn.deterministic = True
1-7 Clear GPU memory
Sometimes, after the run is interrupted with Control-C, GPU memory is not released in time and needs to be cleared manually. Within PyTorch this can be done with the following call.
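torch.cuda.empty_cache() releases the unused memory held by PyTorch's caching allocator:
torch.cuda.empty_cache()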
Or, on the command line, first use ps to find the program's PID, then use kill to terminate the process:
ps aux | grep python
kill -9 [pid]
Or directly reset the GPU whose memory was not cleared:
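For example, with nvidia-smi (this usually requires administrator privileges; [gpu_id] is a placeholder for the card index):
nvidia-smi --gpu-reset -i [gpu_id]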
2 Tensor processing
2-1 Basic tensor information
tensor.type()  # Data type
tensor.size()  # Shape of the tensor. It is a subclass of Python tuple
tensor.dim()   # Number of dimensions.
2-2 Data type conversion
Set the default tensor type. Float in PyTorch is much faster than double.
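For example:
torch.set_default_tensor_type(torch.FloatTensor)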
Type conversions.
tensor = tensor.cuda()
tensor = tensor.cpu()
tensor = tensor.float()
tensor = tensor.long()
2-3 Conversion between torch.Tensor and np.ndarray
torch.Tensor -> np.ndarray.
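For example, move the tensor to the CPU first and then convert:
ndarray = tensor.cpu().numpy()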
np.ndarray -> torch.Tensor.
tensor = torch.from_numpy(ndarray).float()
tensor = torch.from_numpy(ndarray.copy()).float()  # If ndarray has negative stride
2-4 Conversion between torch.Tensor and PIL.Image
Tensors in PyTorch use the N×D×H×W order by default, with values in the range [0, 1], so transposition and rescaling are needed.
torch.Tensor -> PIL.Image.
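A sketch for a single image tensor of shape D×H×W with values in [0, 1]:
image = PIL.Image.fromarray(
    torch.clamp(tensor * 255, min=0, max=255).byte().permute(1, 2, 0).cpu().numpy())
image = torchvision.transforms.functional.to_pil_image(tensor)  # Equivalent way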
PIL.Image -> torch.Tensor.
tensor = torch.from_numpy(np.asarray(PIL.Image.open(path))).permute(2, 0, 1).float() / 255
tensor = torchvision.transforms.functional.to_tensor(PIL.Image.open(path))  # Equivalent way
2-5 Conversion between np.ndarray and PIL.Image
np.ndarray -> PIL.Image.
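For example, assuming ndarray holds an H×W×D image with values in [0, 255]:
image = PIL.Image.fromarray(ndarray.astype(np.uint8))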
PIL.Image -> np.ndarray.
ndarray = np.asarray(PIL.Image.open(path))
2-6 Extract the value from a tensor containing a single element
This is especially useful when tracking the loss during training. Otherwise the computation graph accumulates and GPU memory usage keeps growing.
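Using .item() returns a plain Python number detached from the computation graph:
value = tensor.item()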
2-7 Reshaping tensors
Reshaping is often needed when feeding convolutional features into a fully connected layer. Compared with torch.view, torch.reshape automatically handles the case where the input tensor is not contiguous.
tensor = torch.reshape(tensor, shape)
2-8 Shuffle order
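For example, shuffle along the first dimension with a random permutation:
tensor = tensor[torch.randperm(tensor.size(0))]  # Shuffle the first dimension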
2-9 Horizontal flipping
PyTorch does not support negative-stride operations such as tensor[::-1]; horizontal flipping can be implemented with tensor indexing.
tensor = tensor[:, :, :, torch.arange(tensor.size(3) - 1, -1, -1).long()]  # Assume N*D*H*W
2-10 Copying tensors
There are three ways to copy a tensor, corresponding to different needs.
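A sketch of the three options and their trade-offs:
# Operation               | New / shared memory | Still in computation graph
tensor.clone()            # New                 | Yes
tensor.detach()           # Shared              | No
tensor.detach().clone()   # New                 | No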
2-11 Concatenating tensors
Note the difference between torch.cat and torch.stack: torch.cat concatenates along a given dimension, while torch.stack adds a new dimension. For example, given three 10×5 tensors, torch.cat produces a 30×5 tensor, while torch.stack produces a 3×10×5 tensor.
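For example (list_of_tensors is a placeholder for a list of tensors):
tensor = torch.cat(list_of_tensors, dim=0)
tensor = torch.stack(list_of_tensors, dim=0)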
2-12 Convert integer labels into one-hot encoding
Labels in PyTorch start from 0 by default.
N = tensor.size(0)
one_hot = torch.zeros(N, num_classes).long()
one_hot.scatter_(dim=1, index=torch.unsqueeze(tensor, dim=1),
                 src=torch.ones(N, num_classes).long())
2-13 Get non-zero/zero elements
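For example:
torch.nonzero(tensor)               # Indices of non-zero elements
torch.nonzero(tensor == 0)          # Indices of zero elements
torch.nonzero(tensor).size(0)       # Number of non-zero elements
torch.nonzero(tensor == 0).size(0)  # Number of zero elements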
2-14 Expanding tensors
Expand a tensor of shape 64×512 to shape 64×512×7×7.
torch.reshape(tensor, (64, 512, 1, 1)).expand(64, 512, 7, 7)
2-15 Matrix multiplication
Matrix multiplication: (m×n) * (n×p) -> (m×p).
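For example:
result = torch.mm(tensor1, tensor2)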
Batch matrix multiplication: (b×m×n) * (b×n×p) -> (b×m×p).
result = torch.bmm(tensor1, tensor2)
Element-wise multiplication.
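For example:
result = tensor1 * tensor2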
2-16 Compute pairwise Euclidean distances between two sets of data
X1 is of shape m×d.
X1 = torch.unsqueeze(X1, dim=1).expand(m, n, d)
X2 is of shape n×d.
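By symmetry with X1 above, X2 can be expanded to the same m×n×d shape (a sketch):
X2 = torch.unsqueeze(X2, dim=0).expand(m, n, d)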
dist is of shape m×n, where dist[i][j] is the Euclidean distance between X1[i, :] and X2[j, :].
dist = torch.sqrt(torch.sum((X1 - X2) ** 2, dim=2))
3 Model definition
3-1 Convolutional layers
The most commonly used convolutional layer configurations are
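For example, two commonly used configurations are a 3×3 convolution that preserves spatial size and a 1×1 convolution (in_channels and out_channels are placeholders):
conv = torch.nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1,
                       padding=1, bias=True)
conv = torch.nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=1,
                       padding=0, bias=True)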
If the convolutional layer configuration is complex and the output size is inconvenient to compute by hand, a convolution visualization tool can help.
3-2 GAP (Global average pooling) layer
gap = torch.nn.AdaptiveAvgPool2d(output_size=1)
3-3 Bilinear pooling
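A minimal sketch of bilinear pooling for a feature map X of shape N×D×H×W, followed by signed-sqrt and L2 normalization (a common post-processing choice):
N, D, H, W = X.size()
X = torch.reshape(X, (N, D, H * W))
X = torch.bmm(X, torch.transpose(X, 1, 2)) / (H * W)  # Bilinear pooling
assert X.size() == (N, D, D)
X = torch.reshape(X, (N, D * D))
X = torch.sign(X) * torch.sqrt(torch.abs(X) + 1e-5)   # Signed-sqrt normalization
X = torch.nn.functional.normalize(X)                  # L2 normalization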
3-4 Synchronized BN (batch normalization) across multiple GPUs
When torch.nn.DataParallel runs code on multiple GPU cards, PyTorch's BN layers by default compute the mean and standard deviation independently from the data on each card. Synchronized BN uses the data on all cards together to compute the BN layer's mean and standard deviation, which mitigates the inaccurate estimates that occur when the batch size is small, and is an effective trick for improving performance in tasks such as object detection.
Link: https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
3-5 BN-style moving average
To implement a moving average similar to BN's, use in-place operations to update the moving average in the forward function.
class BN(torch.nn.Module):
    def __init__(self):
        ...
        self.register_buffer('running_mean', torch.zeros(num_features))

    def forward(self, X):
        ...
        self.running_mean += momentum * (current - self.running_mean)
3-6 Count the total number of model parameters
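For example:
num_parameters = sum(torch.numel(parameter) for parameter in model.parameters())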
Print model information similar to Keras's model.summary()
3-7 Model weight initialization
Note the difference between model.modules() and model.children(): model.modules() iterates recursively over all sub-modules of the model, while model.children() only iterates over the model's immediate children.
# Common practice for initialization.
for layer in model.modules():
    if isinstance(layer, torch.nn.Conv2d):
        torch.nn.init.kaiming_normal_(layer.weight, mode='fan_out',
                                      nonlinearity='relu')
        if layer.bias is not None:
            torch.nn.init.constant_(layer.bias, val=0.0)
    elif isinstance(layer, torch.nn.BatchNorm2d):
        torch.nn.init.constant_(layer.weight, val=1.0)
        torch.nn.init.constant_(layer.bias, val=0.0)
    elif isinstance(layer, torch.nn.Linear):
        torch.nn.init.xavier_normal_(layer.weight)
        if layer.bias is not None:
            torch.nn.init.constant_(layer.bias, val=0.0)

# Initialization with a given tensor.
layer.weight = torch.nn.Parameter(tensor)
3-8 Use a pre-trained model for some layers
Note that if the saved model was wrapped in torch.nn.DataParallel, the current model also needs to be wrapped in torch.nn.DataParallel. torch.nn.DataParallel(model).module == model.
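A common way to load only the layers whose names and shapes match ('model.pth' is a placeholder path):
model.load_state_dict(torch.load('model.pth'), strict=False)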
3-9 Load a model saved on GPU onto the CPU
model.load_state_dict(torch.load('model.pth', map_location='cpu'))
4 Data preparation, feature extraction and fine-tuning
4-1 Get basic information about video data
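A sketch using OpenCV (the cv2 package, which is not among the imports listed above; video_path is a placeholder):
import cv2
video = cv2.VideoCapture(video_path)
height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
num_frames = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
fps = int(video.get(cv2.CAP_PROP_FPS))
video.release()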
4-2 TSN: sample one frame from each segment of a video
K = self._num_segments
if is_train:
    if num_frames > K:
        # Random index for each segment.
        frame_indices = torch.randint(
            high=num_frames // K, size=(K,), dtype=torch.long)
        frame_indices += num_frames // K * torch.arange(K)
    else:
        frame_indices = torch.randint(
            high=num_frames, size=(K - num_frames,), dtype=torch.long)
        frame_indices = torch.sort(torch.cat((
            torch.arange(num_frames), frame_indices)))[0]
else:
    if num_frames > K:
        # Middle index for each segment.
        frame_indices = num_frames / K // 2
        frame_indices += num_frames // K * torch.arange(K)
    else:
        frame_indices = torch.sort(torch.cat((
            torch.arange(num_frames), torch.arange(K - num_frames))))[0]
assert frame_indices.size() == (K,)
return [frame_indices[i] for i in range(K)]
4-3 Extract the convolutional features of a specific layer from an ImageNet pre-trained model
VGG-16 relu5-3 feature.
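One way is to truncate the features sub-network just before the final max-pooling layer (a sketch):
model = torchvision.models.vgg16(pretrained=True).features[:-1]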
VGG-16 pool5 feature.
model = torchvision.models.vgg16(pretrained=True).features
VGG-16 fc7 feature.
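A sketch that truncates the classifier so that the second fully connected layer (fc7) is the last module:
model = torchvision.models.vgg16(pretrained=True)
model.classifier = torch.nn.Sequential(*list(model.classifier.children())[:-3])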
ResNet GAP feature.
model = torchvision.models.resnet18(pretrained=True)
model = torch.nn.Sequential(collections.OrderedDict(
    list(model.named_children())[:-1]))
with torch.no_grad():
    model.eval()
    conv_representation = model(image)
4-4 Extract the convolutional features of multiple layers from an ImageNet pre-trained model
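A sketch of a helper module that runs the input through the model's children and collects the outputs of the requested layers (FeatureExtractor and layers_to_extract are illustrative names):
class FeatureExtractor(torch.nn.Module):
    """Extract several convolutional feature maps from a pre-trained model."""

    def __init__(self, model, layers_to_extract):
        super(FeatureExtractor, self).__init__()
        self.model = model
        self.layers_to_extract = set(layers_to_extract)

    def forward(self, x):
        with torch.no_grad():
            self.model.eval()
            conv_representation = []
            for name, layer in self.model.named_children():
                x = layer(x)
                if name in self.layers_to_extract:
                    conv_representation.append(x)
        return conv_representation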
4-5 Other pre-trained models
Link: https://github.com/Cadene/pretrained-models.pytorch
4-6 Fine-tune the fully connected layer
model = torchvision.models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
model.fc = torch.nn.Linear(512, 100)  # Replace the last fc layer
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9,
                            weight_decay=1e-4)
4-7 Fine-tune the fully connected layer with a larger learning rate and the convolutional layers with a smaller learning rate
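A sketch using two parameter groups with different learning rates (the values are illustrative):
model = torchvision.models.resnet18(pretrained=True)
finetuned_parameters = list(map(id, model.fc.parameters()))
conv_parameters = [p for p in model.parameters() if id(p) not in finetuned_parameters]
parameters = [{'params': conv_parameters, 'lr': 1e-3},
              {'params': model.fc.parameters()}]
optimizer = torch.optim.SGD(parameters, lr=1e-2, momentum=0.9, weight_decay=1e-4)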
5 Model training
5-1 Common training and validation data preprocessing
The ToTensor operation converts a PIL.Image, or an np.ndarray of shape H×W×D with values in [0, 255], into a torch.Tensor of shape D×H×W with values in [0.0, 1.0].
train_transform = torchvision.transforms.Compose([
    torchvision.transforms.RandomResizedCrop(size=224, scale=(0.08, 1.0)),
    torchvision.transforms.RandomHorizontalFlip(),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                     std=(0.229, 0.224, 0.225)),
])
val_transform = torchvision.transforms.Compose([
    torchvision.transforms.Resize(224),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                     std=(0.229, 0.224, 0.225)),
])
5-2 Basic training code framework
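A minimal sketch (train_loader, loss_function and num_epochs are placeholders):
for t in range(num_epochs):
    model.train()
    for images, labels in tqdm.tqdm(train_loader, desc='Epoch %3d' % (t + 1)):
        images, labels = images.cuda(), labels.cuda()
        scores = model(images)
        loss = loss_function(scores, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()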
5-3 Label smoothing
for images, labels in train_loader:
    images, labels = images.cuda(), labels.cuda()
    N = labels.size(0)
    # C is the number of classes.
    smoothed_labels = torch.full(size=(N, C), fill_value=0.1 / (C - 1)).cuda()
    smoothed_labels.scatter_(dim=1, index=torch.unsqueeze(labels, dim=1), value=0.9)

    score = model(images)
    log_prob = torch.nn.functional.log_softmax(score, dim=1)
    loss = -torch.sum(log_prob * smoothed_labels) / N

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
5-4 Mixup
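A sketch of mixup training that draws the mixing coefficient from a Beta(alpha, alpha) distribution (alpha and loss_function are placeholders):
beta_distribution = torch.distributions.beta.Beta(alpha, alpha)
for images, labels in train_loader:
    images, labels = images.cuda(), labels.cuda()

    # Mixup images.
    lambda_ = beta_distribution.sample([]).item()
    index = torch.randperm(images.size(0)).cuda()
    mixed_images = lambda_ * images + (1 - lambda_) * images[index, :]

    # Mixup loss.
    scores = model(mixed_images)
    loss = (lambda_ * loss_function(scores, labels)
            + (1 - lambda_) * loss_function(scores, labels[index]))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()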
5-5 L1 regularization
l1_regularization = torch.nn.L1Loss(reduction='sum')
loss = ...  # Standard cross-entropy loss
for param in model.parameters():
    loss += torch.sum(torch.abs(param))
loss.backward()
5-6 Do not apply L2 regularization / weight decay to bias terms
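A sketch that puts bias parameters into a separate parameter group with weight_decay=0 (selecting them by parameter name is one common convention):
bias_params = [p for name, p in model.named_parameters() if name.endswith('bias')]
other_params = [p for name, p in model.named_parameters() if not name.endswith('bias')]
parameters = [{'params': bias_params, 'weight_decay': 0},
              {'params': other_params}]
optimizer = torch.optim.SGD(parameters, lr=1e-2, momentum=0.9, weight_decay=1e-4)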
5-7 Gradient clipping
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=20)
5-8 Compute the accuracy of Softmax outputs
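For example, taking the argmax over the class dimension (images and labels are a batch from the data loader):
score = model(images)
prediction = torch.argmax(score, dim=1)
num_correct = torch.sum(prediction == labels).item()
accuracy = num_correct / labels.size(0)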
5-9 Visualize the forward computation graph of a model
Link: https://github.com/szagoruyko/pytorchviz
5-10 Visualize learning curves
There are two options: Visdom, developed by Facebook itself, and Tensorboard.
# Example using Visdom.
vis = visdom.Visdom(env='Learning curve', use_incoming_socket=False)
assert vis.check_connection()
vis.close()
options = collections.namedtuple('Options', ['loss', 'acc', 'lr'])(
    loss={'xlabel': 'Epoch', 'ylabel': 'Loss', 'showlegend': True},
    acc={'xlabel': 'Epoch', 'ylabel': 'Accuracy', 'showlegend': True},
    lr={'xlabel': 'Epoch', 'ylabel': 'Learning rate', 'showlegend': True})

for t in epoch(80):
    train(...)
    val(...)
    vis.line(X=torch.Tensor([t + 1]), Y=torch.Tensor([train_loss]), name='train',
             win='Loss', update='append', opts=options.loss)
    vis.line(X=torch.Tensor([t + 1]), Y=torch.Tensor([val_loss]), name='val',
             win='Loss', update='append', opts=options.loss)
    vis.line(X=torch.Tensor([t + 1]), Y=torch.Tensor([train_acc]), name='train',
             win='Accuracy', update='append', opts=options.acc)
    vis.line(X=torch.Tensor([t + 1]), Y=torch.Tensor([val_acc]), name='val',
             win='Accuracy', update='append', opts=options.acc)
    vis.line(X=torch.Tensor([t + 1]), Y=torch.Tensor([lr]), win='Learning rate',
             update='append', opts=options.lr)
5-11 Get the current learning rate
If there is one global learning rate (which is the common case).
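For example, read it from the optimizer's first parameter group:
lr = optimizer.param_groups[0]['lr']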
If there are multiple learning rates for different layers.
all_lr = []
for param_group in optimizer.param_groups:
    all_lr.append(param_group['lr'])
5-12 Learning rate decay
Reduce the learning rate when validation accuracy plateaus.
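For example, with ReduceLROnPlateau (the patience value is illustrative; call scheduler.step(val_acc) after each epoch's validation):
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='max', patience=5)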
Cosine annealing learning rate.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=80)
Reduce the learning rate by a factor of 10 at given epochs.
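For example, with MultiStepLR (the milestone epochs are illustrative):
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50, 70], gamma=0.1)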
Learning rate warmup over the first 10 epochs.
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda t: t / 10)
for t in range(0, 10):
    scheduler.step()
    train(...); val(...)
5-13 Save and load checkpoints
Note that to be able to resume training, we need to save the state of both the model and the optimizer, as well as the current epoch number.
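A sketch of saving a checkpoint whose keys and path mirror the loading code below (best_acc and t are placeholders maintained by the training loop):
checkpoint = {
    'best_acc': best_acc,
    'epoch': t + 1,
    'model': model.state_dict(),
    'optimizer': optimizer.state_dict(),
}
model_path = os.path.join('model', 'checkpoint.pth.tar')
torch.save(checkpoint, model_path)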
Load checkpoint.
if resume:
    model_path = os.path.join('model', 'checkpoint.pth.tar')
    assert os.path.isfile(model_path)
    checkpoint = torch.load(model_path)
    best_acc = checkpoint['best_acc']
    start_epoch = checkpoint['epoch']
    model.load_state_dict(checkpoint['model'])
    optimizer.load_state_dict(checkpoint['optimizer'])
    print('Load checkpoint at epoch %d.' % start_epoch)
5-14 Compute accuracy, precision and recall
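A sketch using NumPy, assuming labels and predictions are 1-D integer arrays of ground-truth and predicted labels and num_classes is the number of classes:
accuracy = np.mean(labels == predictions) * 100

# Per-class precision and recall.
for c in range(num_classes):
    tp = np.sum((labels == c) & (predictions == c))
    tp_fp = np.sum(predictions == c)
    tp_fn = np.sum(labels == c)
    precision = tp / tp_fp * 100
    recall = tp / tp_fn * 100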
6 Other PyTorch notes
6-1 Model definition
6-2 PyTorch performance and debugging
Or run on the command line:
python -m torch.utils.bottleneck main.py
This article is reproduced from: PyTorch Cookbook (a collection of commonly used code snippets) https://zhuanlan.zhihu.com/p/59205847