Although PyTorch is already quite easy to use, a fair amount of boilerplate code remains, such as the loop that drives network training. Borrowing the API design philosophy of Keras, we can wrap the commonly used PyTorch code snippets into a reusable base class.
The BaseModel implementation

BaseModel inherits from nn.Module.
compile receives the optimizer and the loss function. Note that they are only instantiated when compile is called.
fit is an sklearn-style training interface. In loss = self.loss_fn(y_pred, y_train) the prediction must be passed first and y_train second; the loss function checks that the target (the second argument) carries no gradient function.
The following three steps are the core of training: in every iteration the gradients must be zeroed first, otherwise they accumulate; loss.backward() then computes the gradients of the loss; finally self.optimizer.step() updates the parameters using those gradients.

self.optimizer.zero_grad()
loss.backward()
self.optimizer.step()

predict is used once training is finished; before the model is applied it has to be switched to evaluation mode by calling .eval().
import torch
from torch import nn
from torch import optim


class BaseModel(nn.Module):
    def __init__(self):
        super(BaseModel, self).__init__()

    def predict(self, x_test):
        # switch to evaluation mode before using the model
        self.eval()
        with torch.no_grad():
            y_pred = self(x_test)
        return y_pred

    def save(self, path):
        torch.save(self.state_dict(), path)

    def compile(self, optimizer, loss, metrics=None):
        # the optimizer and loss are only instantiated here, once the
        # subclass has created its layers (and therefore its parameters)
        self.optimizer = optimizer(self.parameters(), lr=1e-4)
        self.loss_fn = loss()
        if metrics is not None:
            self.metrics = metrics()

    def fit(self, x_train, y_train):
        # start training
        num_epochs = 1000
        for epoch in range(num_epochs):
            y_pred = self(x_train)
            self.optimizer.zero_grad()
            # the argument order must not be swapped:
            # the first is the prediction, the second is the target
            loss = self.loss_fn(y_pred, y_train)
            # backward
            loss.backward()
            self.optimizer.step()
            if (epoch + 1) % 20 == 0:
                print('Epoch[{}/{}], loss: {:.6f}'
                      .format(epoch + 1, num_epochs, loss.item()))
Refactoring the linear model

Building on this BaseModel, the linear model from the previous section can be rewritten very concisely.
The code that defines the model does not need to change:
# linear model
class LinearRegression(BaseModel):
    def __init__(self):
        super(LinearRegression, self).__init__()
        self.linear = nn.Linear(1, 1)  # input and output are 1-dimensional

    def forward(self, x):
        out = self.linear(x)
        return out
Initializing the model now takes only two lines of code:
model = LinearRegression()
model.compile(optimizer=optim.SGD, loss=nn.MSELoss)
Prepare the data and train; training is just a single call to fit, which is remarkably concise:
import numpy as np

x_train = np.array([[3.3], [4.4], [5.5], [6.71], [6.93], [4.168],
                    [9.779], [6.182], [7.59], [2.167], [7.042],
                    [10.791], [5.313], [7.997], [3.1]])
y_train = np.array([[1.7], [2.76], [2.09], [3.19], [1.694], [1.573],
                    [3.366], [2.596], [2.53], [1.221], [2.827],
                    [3.465], [1.65], [2.904], [1.3]])

model.fit(torch.Tensor(x_train), torch.Tensor(y_train))
Finally, apply the model and plot the result:
y_pred = model.predict(torch.Tensor(x_train))

import matplotlib.pyplot as plt

plt.plot(x_train, y_pred.data.numpy(), label='Fitting Line')
plt.legend()
plt.show()

model.save('./mymodel.pth')
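BaseModel only provides save; to reuse the trained weights later, the standard state_dict workflow applies. A minimal sketch (torch.load and load_state_dict are regular PyTorch calls; the file name is the one saved above):

# recreate the model and load the weights written by model.save()
model2 = LinearRegression()
model2.load_state_dict(torch.load('./mymodel.pth'))
model2.eval()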
Implementing logistic regression

Logistic regression is a generalized linear model, so it has much in common with multiple linear regression. The model form is essentially the same: both contain wx+b, where w and b are the parameters to be estimated. The difference lies in the dependent variable. Multiple linear regression uses wx+b directly as the dependent variable, i.e. y = wx+b, whereas logistic regression maps wx+b to a hidden state p through a function L, p = L(wx+b), and then decides the value of the dependent variable by comparing p with 1-p. If L is the logistic function, this is logistic regression; if L is a polynomial function, it is polynomial regression.
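In the binary case L is the sigmoid (logistic) function. A tiny sketch of the decision rule described above, with made-up parameter values used purely for illustration:

import torch

w, b = torch.tensor(0.8), torch.tensor(-1.5)  # illustrative values for w and b
x = torch.tensor(3.0)
p = torch.sigmoid(w * x + b)        # p = L(wx + b), the hidden state
label = 1 if p > (1 - p) else 0     # compare p with 1 - p, i.e. p > 0.5
print(p.item(), label)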
Although the name of the logistic regression model contains the word "regression", it is a classification algorithm. We will use the MNIST image set to check its classification performance.
The implementation is almost identical to the linear model; the difference lies in the loss function: loss=nn.CrossEntropyLoss, the cross-entropy loss.
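nn.CrossEntropyLoss applies log-softmax internally and expects raw scores (logits) together with integer class labels, which is why the forward pass below returns the linear output without a softmax. A minimal sketch with made-up tensors:

import torch
from torch import nn

loss_fn = nn.CrossEntropyLoss()
logits = torch.randn(32, 10)            # raw outputs for a batch of 32 samples
targets = torch.randint(0, 10, (32,))   # integer class labels, not one-hot
print(loss_fn(logits, targets).item())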
Input parameters: in_dim is the input dimension (the input is N×784), and the number of output classes n_class is 10.
# define the Logistic Regression model
class LogisticRegression(BaseModel):
    def __init__(self, in_dim, n_class):
        super(LogisticRegression, self).__init__()
        self.logistic = nn.Linear(in_dim, n_class)

    def forward(self, x):
        out = self.logistic(x)
        return out
Preparing the MNIST dataset: the DataLoader under torch.utils.data provides a good abstraction for data preprocessing; of course, we could also implement and extend it ourselves.
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

train_dataset = datasets.MNIST(
    root='./data', train=True, transform=transforms.ToTensor(), download=True)
test_dataset = datasets.MNIST(
    root='./data', train=False, transform=transforms.ToTensor())

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)
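Each element yielded by train_loader is an (images, labels) pair; with batch_size=32 and ToTensor, the image tensor has shape [32, 1, 28, 28] and the labels have shape [32]. A quick check of the shapes used by the training code below:

img, y = next(iter(train_loader))
print(img.size())                          # torch.Size([32, 1, 28, 28])
print(y.size())                            # torch.Size([32])
print(img.view(img.size(0), -1).size())    # flattened to [32, 784] for the linear layer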
To train the model from a DataLoader we add a new fit_dataloader method. This is mainly for datasets that are too large to process at once: the MNIST training set has 60,000 images, which cannot all be computed in memory in a single pass. With batch_size=32, each step reads 32 images as one training batch.
def fit_dataloader(self, loader):
    num_epochs = 3
    for epoch in range(num_epochs):
        for i, data in enumerate(loader):
            img, y_train = data
            # flatten each image from [1, 28, 28] to a 784-dim vector
            y_pred = self(img.view(img.size()[0], -1))
            # backward
            self.optimizer.zero_grad()
            loss = self.loss_fn(y_pred, y_train)
            loss.backward()
            self.optimizer.step()
            if i % 300 == 0:
                print('Epoch[{}/{}], loss: {:.6f}'
                      .format(epoch + 1, num_epochs, loss.item()))
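Putting the pieces together, the logistic regression model is trained the same way as before; a minimal usage sketch (assuming fit_dataloader has been added to BaseModel as shown above):

model = LogisticRegression(28 * 28, 10)
model.compile(optimizer=optim.SGD, loss=nn.CrossEntropyLoss)
model.fit_dataloader(train_loader)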
Implementing a multilayer perceptron (MLP)

An MLP (Multi-Layer Perceptron) is a feed-forward artificial neural network that maps a set of input vectors to a set of output vectors. An MLP can be viewed as a directed graph consisting of several layers of nodes, each layer fully connected to the next. Apart from the input nodes, every node is a neuron (processing unit) with a nonlinear activation function. Trained with the supervised learning method known as backpropagation, an MLP with hidden layers overcomes the perceptron's inability to handle linearly non-separable data.
Model implementation:
import torch
from torch import nn
from torch import optim


class MLP(BaseModel):
    def __init__(self, n_input, n_hidden_1, n_hidden_2, n_output):
        super(MLP, self).__init__()
        self.layer1 = nn.Linear(n_input, n_hidden_1)
        self.layer2 = nn.Linear(n_hidden_1, n_hidden_2)
        self.layer3 = nn.Linear(n_hidden_2, n_output)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        return out
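Note that the forward pass above chains three nn.Linear layers with nothing in between, so the composition is still a linear map and the network has the same expressive power as the logistic regression above. A variant with the nonlinear ReLU activations mentioned in the MLP definition (my addition, not part of the original code) would look like this:

class MLPWithReLU(BaseModel):
    def __init__(self, n_input, n_hidden_1, n_hidden_2, n_output):
        super(MLPWithReLU, self).__init__()
        self.layer1 = nn.Linear(n_input, n_hidden_1)
        self.layer2 = nn.Linear(n_hidden_1, n_hidden_2)
        self.layer3 = nn.Linear(n_hidden_2, n_output)

    def forward(self, x):
        out = torch.relu(self.layer1(x))    # nonlinear activation after the first hidden layer
        out = torch.relu(self.layer2(out))  # and after the second
        out = self.layer3(out)              # raw logits, suitable for CrossEntropyLoss
        return out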
Invoking the model:
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

train_dataset = datasets.MNIST(
    root='./data', train=True, transform=transforms.ToTensor(), download=True)
test_dataset = datasets.MNIST(
    root='./data', train=False, transform=transforms.ToTensor())

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

model = MLP(28 * 28, 300, 100, 10)
model.compile(optimizer=optim.SGD, loss=nn.CrossEntropyLoss)
model.fit_dataloader(train_loader)
Next we improve the fit_dataloader function, accumulating the loss of each batch and computing accuracy statistics.
def fit_dataloader(self, loader):
    num_epochs = 10
    for epoch in range(num_epochs):
        batch_loss = 0.0
        batch_acc = 0.0
        step = 300
        for i, data in enumerate(loader):
            img, y_train = data
            y_pred = self(img.view(img.size()[0], -1))
            # accuracy of the current batch (torch.max over dim 1 returns the
            # maximum per row and, as the second value, its index)
            _, pred = torch.max(y_pred, 1)
            num_correct = (pred == y_train).sum()
            batch_acc += num_correct.item()
            # backward
            self.optimizer.zero_grad()
            loss = self.loss_fn(y_pred, y_train)
            # accumulate the loss of the current batch
            batch_loss += loss.item()
            loss.backward()
            self.optimizer.step()
            if (i + 1) % step == 0:
                print('Epoch[{}/{}],batch:{}, avg loss: {:.6f},train acc:{:.6f}'
                      .format(epoch + 1, num_epochs, i + 1,
                              batch_loss / step, batch_acc / (step * 32)))
                batch_loss = 0.0
                batch_acc = 0.0
This makes it easy to see that the training loss gradually decreases while the accuracy steadily rises.
Epoch[1/10],batch:300, avg loss: 2.286891,train acc:0.163438
Epoch[1/10],batch:600, avg loss: 2.283830,train acc:0.160000
Epoch[1/10],batch:900, avg loss: 2.276441,train acc:0.180312
......
Epoch[10/10],batch:300, avg loss: 1.938921,train acc:0.677188
Epoch[10/10],batch:600, avg loss: 1.928737,train acc:0.679896
Epoch[10/10],batch:900, avg loss: 1.920588,train acc:0.690208
Epoch[10/10],batch:1200, avg loss: 1.912090,train acc:0.691562
Epoch[10/10],batch:1500, avg loss: 1.908464,train acc:0.689583
Epoch[10/10],batch:1800, avg loss: 1.892407,train acc:0.696667
To evaluate the model, we extend BaseModel with a new predict_dataloader method:
def predict_dataloader(self, loader):
    self.eval()
    total_loss = 0.0
    acc = 0.0
    for data in loader:
        img, y_train = data
        img = img.view(img.size(0), -1)
        y_pred = self(img)
        loss = self.loss_fn(y_pred, y_train)
        total_loss += loss.item()
        _, pred = torch.max(y_pred, 1)
        num_correct = (pred == y_train).sum()
        acc += num_correct.item()
    # average loss per batch and accuracy (assuming batch_size=32)
    print(total_loss / len(loader), acc / 32 / len(loader))
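Evaluating on the test loader, and optionally saving the trained weights, then takes one line each (a usage sketch; the file name is illustrative, not from the original):

model.predict_dataloader(test_loader)
model.save('./mlp_mnist.pth')   # illustrative file name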
The result shows that after 10 epochs of training, the model reaches an overall accuracy of 67.16% on the test set.
1.9051747440149227 0.6716253993610224
About the author: Wei Jiabin (魏佳斌), Internet product/technology director, MBA from Peking University's Guanghua School of Management, Chartered Financial Analyst (CFA), senior product manager and programmer. Prefers Python and closely follows Internet trends, artificial intelligence, and AI-driven quantitative finance, aiming to use the most advanced cognitive technologies to understand this complex world. AI quantitative open-source project: