DL / DNN Optimization Techniques: Case-Study Understanding and Chart-Based Comparison of the Four Parameter-Update Methods (SGD/Momentum/AdaGrad/Adam) Used by DNN Optimizers
Overview of the Four Optimization Methods
See the companion article: DL / DNN Optimization Techniques: A Detailed Guide to GD/SGD (the BP Algorithm): Introduction, Intuition, Code Implementation, and SGD's Drawbacks and Improvements (Momentum / NAG / the Ada Family / RMSProp)
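For reference, the update rules that the four classes below implement can be written as follows (standard forms matching the code, where W is a parameter, g = ∂L/∂W its gradient, η the learning rate, and ε a small constant such as 1e-7):

$$\textbf{SGD:}\quad W \leftarrow W - \eta\, g$$

$$\textbf{Momentum:}\quad v \leftarrow \alpha v - \eta\, g, \qquad W \leftarrow W + v$$

$$\textbf{AdaGrad:}\quad h \leftarrow h + g \odot g, \qquad W \leftarrow W - \eta\, \frac{g}{\sqrt{h} + \epsilon}$$

$$\textbf{Adam:}\quad m \leftarrow m + (1 - \beta_1)(g - m), \quad v \leftarrow v + (1 - \beta_2)(g^2 - v), \quad W \leftarrow W - \eta_t\, \frac{m}{\sqrt{v} + \epsilon}, \quad \eta_t = \eta\,\frac{\sqrt{1 - \beta_2^{\,t}}}{1 - \beta_1^{\,t}}$$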
Understanding the Optimizers Through a Case Study
Output Results
Design Approach
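In outline: all four optimizers expose the same update(params, grads) interface and mutate the parameter dict in place; each lazily allocates its internal state (the velocity v, the squared-gradient sum h, or the moment estimates m and v) on the first call, with arrays shaped to match the parameters.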
Core Code
# T1: SGD
import numpy as np   # used by the optimizers below

class SGD:
    """Vanilla stochastic gradient descent: step each parameter against its gradient."""
    def __init__(self, lr=0.01):
        self.lr = lr   # learning rate

    def update(self, params, grads):
        for key in params.keys():
            params[key] -= self.lr * grads[key]
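All four optimizers share this update(params, grads) interface: params and grads are dicts of NumPy arrays keyed by parameter name, and update mutates params in place. A minimal usage sketch, continuing from the SGD class above (the toy values are illustrative, not from the original post):

params = {'W1': np.array([1.0, 2.0]), 'b1': np.array([0.5])}
grads  = {'W1': np.array([0.1, -0.2]), 'b1': np.array([0.05])}

optimizer = SGD(lr=0.1)
optimizer.update(params, grads)
print(params['W1'])   # [0.99 2.02]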
#T2、Momentum算法
import numpy as np
class Momentum:
'……'
def update(self, params, grads):
if self.v is None:
self.v = {}
for key, val in params.items():
self.v[key] = np.zeros_like(val)
for key in params.keys():
self.v[key] = self.momentum*self.v[key] - self.lr*grads[key]
params[key] += self.v[key]
# T3: AdaGrad
class AdaGrad:
    """AdaGrad: per-parameter learning rates decay as squared gradients accumulate in h."""
    def __init__(self, lr=0.01):
        self.lr = lr
        self.h = None   # running sum of squared gradients

    def update(self, params, grads):
        if self.h is None:
            self.h = {}
            for key, val in params.items():
                self.h[key] = np.zeros_like(val)
        for key in params.keys():
            self.h[key] += grads[key] * grads[key]
            # the 1e-7 term guards against division by zero
            params[key] -= self.lr * grads[key] / (np.sqrt(self.h[key]) + 1e-7)
# T4: Adam
class Adam:
    """Adam: Momentum-style first moment m plus RMSProp-style second moment v,
    with bias correction folded into the effective learning rate lr_t."""
    def __init__(self, lr=0.001, beta1=0.9, beta2=0.999):
        self.lr = lr
        self.beta1 = beta1   # decay rate of the first moment m
        self.beta2 = beta2   # decay rate of the second moment v
        self.iter = 0
        self.m = None
        self.v = None

    def update(self, params, grads):
        if self.m is None:
            self.m, self.v = {}, {}
            for key, val in params.items():
                self.m[key] = np.zeros_like(val)
                self.v[key] = np.zeros_like(val)
        self.iter += 1
        # bias-corrected step size: lr * sqrt(1 - beta2^t) / (1 - beta1^t)
        lr_t = self.lr * np.sqrt(1.0 - self.beta2**self.iter) / (1.0 - self.beta1**self.iter)
        for key in params.keys():
            # incremental form of m = beta1*m + (1-beta1)*grad (likewise for v)
            self.m[key] += (1 - self.beta1) * (grads[key] - self.m[key])
            self.v[key] += (1 - self.beta2) * (grads[key]**2 - self.v[key])
            params[key] -= lr_t * self.m[key] / (np.sqrt(self.v[key]) + 1e-7)
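To produce the kind of chart comparison the title refers to, one can run the four optimizers on a simple two-dimensional test function and plot their trajectories. A minimal sketch, assuming the commonly used bowl-shaped function f(x, y) = x**2/20 + y**2 (gradient (x/10, 2y)); the starting point and the per-method learning rates are hand-tuned assumptions for this toy problem, not values from the original post:

import matplotlib.pyplot as plt

def grad_f(p):
    """Gradient of f(x, y) = x**2 / 20 + y**2."""
    return {'x': p['x'] / 10.0, 'y': 2.0 * p['y']}

optimizers = {
    'SGD':      SGD(lr=0.95),
    'Momentum': Momentum(lr=0.1),
    'AdaGrad':  AdaGrad(lr=1.5),
    'Adam':     Adam(lr=0.3),
}

for name, opt in optimizers.items():
    p = {'x': -7.0, 'y': 2.0}          # common starting point
    xs, ys = [], []
    for _ in range(30):                # 30 update steps per optimizer
        xs.append(float(p['x'])); ys.append(float(p['y']))
        opt.update(p, grad_f(p))
    plt.plot(xs, ys, 'o-', label=name)

plt.xlabel('x'); plt.ylabel('y'); plt.legend()
plt.title('Trajectories on f(x, y) = x**2/20 + y**2')
plt.show()

SGD zigzags across the narrow valley, Momentum damps the oscillation, AdaGrad shrinks the steps along the steep y direction, and Adam behaves like a blend of the latter two; exact trajectories depend on the learning rates chosen above.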
Related Articles
DL / DNN: Training a Custom Five-Layer DNN (5×100 Units, ReLU, with the Four Optimizers SGD/Momentum/AdaGrad/Adam) on MNIST to Compare the Methods' Performance