TensorFlow-Slim

TF-Slim is a library open-sourced in 2016, aimed mainly at "slimming down" model code: it makes model definition concise, and it ships with a number of image-analysis models. TF-Slim is a lightweight library for defining, training, and evaluating complex models in TensorFlow. [tensorflow/contrib/slim]

Module import:

```python
import tensorflow.contrib.slim as slim
```

<h2>1. TF-Slim Features</h2>

TF-Slim supports building, training, and evaluating neural networks, and aims to make each of these steps simpler and less repetitive.
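A note on imports: TF-Slim ships inside TensorFlow 1.x under `tf.contrib`, so the import above can equivalently be written as an alias, which is the form used by the examples later in this article:

```python
import tensorflow as tf

# Equivalent way to access TF-Slim in TensorFlow 1.x:
slim = tf.contrib.slim
```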
<h2>2. TF-Slim Components</h2>

TF-Slim consists of several independently usable parts, the main ones being: variables, layers, scopes (arg_scope), losses, metrics, learning (training loops), evaluation, regularizers, and nets (reference model implementations such as VGG). The sections below walk through each of them.
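As a quick orientation, here is where each of those parts lives on the slim namespace (a sketch; all of these attributes are exercised later in this article):

```python
import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim

slim.variable, slim.model_variable   # variables
slim.conv2d, slim.fully_connected    # layers
slim.arg_scope                       # scopes
slim.losses                          # loss functions
slim.metrics                         # evaluation metrics
slim.learning, slim.evaluation       # training / evaluation loops
slim.l2_regularizer                  # regularizers
nets.vgg                             # reference models (e.g. VGG)
```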
<h2>3. Defining Models with TF-Slim</h2>

TF-Slim defines models by combining variables, layers, and scopes.

<h3>3.1 Variables</h3>

Creating a raw TensorFlow Variable requires either a predefined value or an initialization mechanism, e.g. random sampling such as Gaussian sampling. TF-Slim provides a thinner wrapper for variable creation in variables.py. For example, to create a weight variable, initialize it with a truncated normal distribution, regularize it with an L2 loss, and place it on the CPU:

```python
weights = slim.variable('weights',
                        shape=[10, 10, 3, 3],
                        initializer=tf.truncated_normal_initializer(stddev=0.1),
                        regularizer=slim.l2_regularizer(0.05),
                        device='/CPU:0')
```

In native TensorFlow there are two kinds of variables: regular variables and local (transient) variables. TF-Slim further distinguishes model variables. Model variables are the parameters of a model: they are trained or fine-tuned during learning and loaded from a checkpoint for evaluation or inference. Non-model variables are all the other variables used during learning or evaluation (such as a global step counter) that are not needed for actual inference. With TF-Slim, both model variables and regular variables are easy to create and retrieve:

```python
# Model variables
weights = slim.model_variable('weights',
                              shape=[10, 10, 3, 3],
                              initializer=tf.truncated_normal_initializer(stddev=0.1),
                              regularizer=slim.l2_regularizer(0.05),
                              device='/CPU:0')
model_variables = slim.get_model_variables()

# Regular variables
my_var = slim.variable('my_var',
                       shape=[20, 1],
                       initializer=tf.zeros_initializer())
regular_variables_and_model_variables = slim.get_variables()
```

How does this work? When a model variable is created via TF-Slim's layers or via slim.model_variable, TF-Slim adds it to the tf.GraphKeys.MODEL_VARIABLES collection. What if you have a custom layer or variable creation routine, but still want TF-Slim to manage your model variables?

```python
my_model_variable = CreateViaCustomCode()

# Letting TF-Slim know about the additional variable.
slim.add_model_variable(my_model_variable)
```

<h3>3.2 Layers</h3>

TensorFlow Ops are extremely broad in scope, while neural-network developers think about models at a higher level of abstraction, in terms of Layers, Losses, Metrics, and Networks. A layer, such as a convolutional layer, a fully connected layer, or a BatchNorm layer, is more abstract than a single TensorFlow Op and typically involves several Ops.
A convolutional layer implemented with the raw TensorFlow API is rather verbose:

```python
input = ...
with tf.name_scope('conv1_1') as scope:
    kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128],
                                             dtype=tf.float32,
                                             stddev=1e-1), name='weights')
    conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                         trainable=True, name='biases')
    bias = tf.nn.bias_add(conv, biases)
    conv1 = tf.nn.relu(bias, name=scope)
```

To remove this kind of duplication, TF-Slim provides convenient Ops defined at the more abstract level of a layer:

```python
input = ...
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')
```

TF-Slim supplies many standard layers for building networks, including slim.conv2d, slim.fully_connected, slim.max_pool2d, slim.avg_pool2d, slim.batch_norm, slim.dropout, slim.flatten, and slim.one_hot_encoding.
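As a quick illustration (a minimal sketch with assumed input shapes and scope names, not taken from the original article), these standard layers compose naturally into a small classifier:

```python
import tensorflow as tf

slim = tf.contrib.slim

# A tiny example network composed from standard TF-Slim layers.
images = tf.placeholder(tf.float32, [None, 28, 28, 1])
net = slim.conv2d(images, 32, [3, 3], scope='conv1')
net = slim.max_pool2d(net, [2, 2], scope='pool1')
net = slim.flatten(net, scope='flatten')
net = slim.fully_connected(net, 128, scope='fc1')
net = slim.dropout(net, 0.5, scope='dropout1')
logits = slim.fully_connected(net, 10, activation_fn=None, scope='logits')
```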
TF-Slim also provides two meta-operations, slim.repeat and slim.stack, which let you apply the same operation repeatedly. Consider this snippet from a VGG-style network, where several identical layers are applied back to back between pooling layers:

```python
net = ...
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
```

The duplication can be reduced with a plain loop:

```python
net = ...
for i in range(3):
    net = slim.conv2d(net, 256, [3, 3], scope='conv3_%d' % (i + 1))
net = slim.max_pool2d(net, [2, 2], scope='pool2')
```

or, more cleanly, with TF-Slim's slim.repeat:

```python
net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
```

Note that slim.repeat does not merely apply the same arguments in a loop; it also unrolls the scopes, appending an underscore and the iteration number to each consecutive call, so the three convolutions above receive the scopes 'conv3/conv3_1', 'conv3/conv3_2', and 'conv3/conv3_3'.
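A quick way to verify that scoping behavior (a minimal sketch with an assumed input shape; the printed names are the defaults slim.conv2d uses for its variables):

```python
import tensorflow as tf

slim = tf.contrib.slim

# Rebuild the repeated block from above and inspect what it created.
net = tf.placeholder(tf.float32, [None, 56, 56, 256])
net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')

for v in slim.get_model_variables('conv3'):
    print(v.name)
# Prints (with slim.conv2d's default variable names):
#   conv3/conv3_1/weights:0, conv3/conv3_1/biases:0,
#   conv3/conv3_2/weights:0, ..., conv3/conv3_3/biases:0
```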
TF-Slim's slim.stack is similar, but it repeatedly applies the same operation with *different* arguments to build a stack of layers, creating a new scope for each call. For example, a simplified multi-layer perceptron:

```python
# Verbose way:
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')

# Equivalent, TF-Slim way using slim.stack:
slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')
```

In this example, slim.stack calls slim.fully_connected three times, feeding the output of each call into the next, each time with a different number of hidden units. Similarly, a stack of convolutions with mixed kernel shapes:

```python
# Verbose way:
x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')
x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')
x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')
x = slim.conv2d(x, 64, [1, 1], scope='core/core_4')

# Using stack:
slim.stack(x, slim.conv2d,
           [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])],
           scope='core')
```

<h3>3.3 Scopes</h3>

In addition to TensorFlow's built-in scope types, name_scope and variable_scope, TF-Slim adds a new one: arg_scope. An arg_scope specifies one or more Ops together with a set of arguments that will be passed to every one of those Ops defined inside the scope. For example:

```python
net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')
net = slim.conv2d(net, 128, [11, 11], padding='VALID',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')
net = slim.conv2d(net, 256, [11, 11], padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')
```

The three conv layers share many of the same hyperparameters: two use the same padding, and all three use the same weights_initializer and weights_regularizer. One way to reduce the repetition is to hoist the shared values into variables:

```python
padding = 'SAME'
initializer = tf.truncated_normal_initializer(stddev=0.01)
regularizer = slim.l2_regularizer(0.0005)
net = slim.conv2d(inputs, 64, [11, 11], 4,
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv1')
net = slim.conv2d(net, 128, [11, 11],
                  padding='VALID',
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv2')
net = slim.conv2d(net, 256, [11, 11],
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv3')
```

This guarantees that the three conv layers share the same parameter values, but it does not fully remove the clutter. Using an arg_scope both ensures each layer gets the same defaults and simplifies the code:

```python
with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')
    net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
    net = slim.conv2d(net, 256, [11, 11], scope='conv3')
```

As the example shows, defaults set by the arg_scope can still be overridden locally: the second conv layer overrides the padding with 'VALID', while everything else is inherited from the scope.
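A common design pattern built on this is to construct the arg_scope once in a helper and reuse it wherever the model is built (a sketch: the helper name my_conv_arg_scope is hypothetical, though the same idiom is used by TF-Slim's reference nets, e.g. vgg_arg_scope):

```python
import tensorflow as tf

slim = tf.contrib.slim

def my_conv_arg_scope(weight_decay=0.0005):
    # Build the arg_scope once; the captured scope object can be reused.
    with slim.arg_scope([slim.conv2d],
                        padding='SAME',
                        weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                        weights_regularizer=slim.l2_regularizer(weight_decay)) as sc:
        return sc

inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
with slim.arg_scope(my_conv_arg_scope()):
    net = slim.conv2d(inputs, 64, [11, 11], 4, scope='conv1')
```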
arg_scopes can also be nested, and a single arg_scope can cover several Ops at once:

```python
with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
        net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
        net = slim.conv2d(net, 256, [5, 5],
                          weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                          scope='conv2')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')
```

In this example, the first (outer) arg_scope applies the same weights_initializer and weights_regularizer to both the conv2d and fully_connected layers, while the second (nested) arg_scope additionally sets default stride and padding for conv2d only; both levels of defaults can still be overridden per layer, as conv1 and conv2 do.

<h3>3.4 VGG16 Example</h3>

Combining TF-Slim's Variables, Operations, and Scopes, a full VGG16 network can be defined compactly:

```python
def vgg16(inputs):
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                        weights_regularizer=slim.l2_regularizer(0.0005)):
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        net = slim.max_pool2d(net, [2, 2], scope='pool5')
        net = slim.fully_connected(net, 4096, scope='fc6')
        net = slim.dropout(net, 0.5, scope='dropout6')
        net = slim.fully_connected(net, 4096, scope='fc7')
        net = slim.dropout(net, 0.5, scope='dropout7')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
    return net
```

<h2>4. Training Models</h2>

Training a TensorFlow model requires a model, a loss function, gradient computation, and a training routine that iteratively computes the gradients of the loss with respect to the model weights and updates the weights accordingly. TF-Slim provides common loss functions as well as a set of helper functions for running training and evaluation.

<h3>4.1 Losses</h3>

The loss function defines the quantity we want to minimize. Some models, such as multi-task learning models, need several losses at once; that is, the loss ultimately being minimized is the sum of the individual loss functions. TF-Slim offers an easy-to-use mechanism for defining and tracking losses via its losses module:

```python
import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

vgg = nets.vgg

# Load the images and labels.
images, labels = ...

# Create the model.
predictions, _ = vgg.vgg_16(images)

# Define the loss functions and get the total loss.
loss = slim.losses.softmax_cross_entropy(predictions, labels)
```

In this example, we first create the model (using TF-Slim's VGG implementation) and then add the standard classification loss. Now consider the multi-task case, where the model has multiple outputs:

```python
# Load the images and labels.
images, scene_labels, depth_labels = ...

# Create the model.
scene_predictions, depth_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)

# The following two lines have the same effect:
total_loss = classification_loss + sum_of_squares_loss
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)
```

In this example there are two losses, a softmax cross-entropy loss and a sum-of-squares loss. Every loss created through TF-Slim is added to a special loss-function collection, which is what allows slim.losses.get_total_loss to find them, so the total loss can be obtained either by adding the losses directly or by calling get_total_loss. But what if you have a custom loss function: how can TF-Slim manage it too?
TF-Slim can manage hand-crafted losses as well: register the custom loss with slim.losses.add_loss, and it becomes part of the tracked total loss:

```python
# Load the images and labels.
images, scene_labels, depth_labels, pose_labels = ...

# Create the model.
scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)
slim.losses.add_loss(pose_loss)  # Letting TF-Slim know about the additional loss.

# The following two ways to compute the total loss are equivalent:
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss

# (Regularization Loss is included in the total loss by default.)
total_loss2 = slim.losses.get_total_loss()
```

<h3>4.2 Training Loop</h3>

TF-Slim provides a simple yet effective set of training tools in learning.py. Once the model, loss function, and optimization scheme are defined, call slim.learning.create_train_op and slim.learning.train to run the optimization:

```python
g = tf.Graph()

# Create the model and specify the losses...
...

total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate)

# create_train_op ensures that each time we ask for the loss, the update_ops
# are run and the gradients being computed are applied too.
train_op = slim.learning.create_train_op(total_loss, optimizer)

logdir = ...  # Where checkpoints are stored.

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,
    save_summaries_secs=300,
    save_interval_secs=600)
```

In this example, slim.learning.train is given the train_op, which both computes the loss and applies the gradient step, and logdir, the directory where checkpoints and event files are written. The number of gradient steps is capped at 1000, summaries are saved to disk every 5 minutes (300 s), and a model checkpoint is saved every 10 minutes (600 s).
<h3>4.3 VGG Training Example</h3>

```python
import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

...

train_log_dir = ...
if not tf.gfile.Exists(train_log_dir):
    tf.gfile.MakeDirs(train_log_dir)

with tf.Graph().as_default():
    # Set up the data loading:
    images, labels = ...

    # Define the model:
    predictions = vgg.vgg_16(images, is_training=True)

    # Specify the loss function:
    slim.losses.softmax_cross_entropy(predictions, labels)
    total_loss = slim.losses.get_total_loss()
    tf.summary.scalar('losses/total_loss', total_loss)

    # Specify the optimization scheme:
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)

    # create_train_op ensures that when we evaluate it to get the loss,
    # the update_ops are done and the gradient updates are computed.
    train_tensor = slim.learning.create_train_op(total_loss, optimizer)

    # Actually runs training.
    slim.learning.train(train_tensor, train_log_dir)
```

<h2>5. Fine-Tuning Models</h2>

<h3>5.1 Brief Recap: Restoring Model Variables from a Checkpoint</h3>

After a model has been trained, its variables can be restored from a checkpoint with tf.train.Saver:

```python
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...

# Add ops to restore all the variables.
restorer = tf.train.Saver()

# Add ops to restore some variables.
restorer = tf.train.Saver([v1, v2])

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
    # Do some work with the model
    ...
```

See the TensorFlow guides "Restoring Variables" and "Choosing which Variables to Save and Restore" for more details.

<h3>5.2 Partially Restoring a Model</h3>

On a new dataset or a new task, it is common to fine-tune a pre-trained model, restoring only a subset of its variables. TF-Slim offers several equivalent ways to select them:

```python
# Create some variables.
v1 = slim.variable(name="v1", ...)
v2 = slim.variable(name="nested/v2", ...)
...

# Get list of variables to restore (which contains only 'v2'). These are all
# equivalent methods:
variables_to_restore = slim.get_variables_by_name("v2")
# or
variables_to_restore = slim.get_variables_by_suffix("2")
# or
variables_to_restore = slim.get_variables(scope="nested")
# or
variables_to_restore = slim.get_variables_to_restore(include=["nested"])
# or
variables_to_restore = slim.get_variables_to_restore(exclude=["v1"])

# Create the saver which will be used to restore the variables.
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
    # Do some work with the model
    ...
```

<h3>5.3 Restoring Models with Different Variable Names</h3>

When restoring variables from a checkpoint, the Saver locates the variable names in the checkpoint file and maps them to variables in the current graph. Above, we created a saver by passing it a list of variables; in that case, the name each variable is looked up under in the checkpoint is derived implicitly from its var.op.name. This works well when the names in the checkpoint match those in the graph. Sometimes, however, the checkpoint was produced by a model whose variable names differ from the current graph's. In that case the Saver must be given a dictionary that maps each checkpoint variable name to the corresponding graph variable. For example:

```python
# Assuming that 'conv1/weights' should be restored from 'vgg16/conv1/weights'
def name_in_checkpoint(var):
    return 'vgg16/' + var.op.name

# Assuming that 'conv1/weights' and 'conv1/bias' should be restored from
# 'conv1/params1' and 'conv1/params2'
def name_in_checkpoint(var):
    if "weights" in var.op.name:
        return var.op.name.replace("weights", "params1")
    if "bias" in var.op.name:
        return var.op.name.replace("bias", "params2")

variables_to_restore = slim.get_model_variables()
variables_to_restore = {name_in_checkpoint(var): var for var in variables_to_restore}
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
```

<h3>5.4 Fine-Tuning a Model on a Different Task</h3>

Suppose we have a pre-trained VGG16 model, trained on the ImageNet dataset as a 1000-class classifier, and we want to apply it to the 20-class Pascal VOC dataset. We can initialize training from the pre-trained model while excluding the final layers:

```python
# Load the Pascal VOC data
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)

# Create the model
predictions = vgg.vgg_16(images)

train_op = slim.learning.create_train_op(...)

# Specify where the Model, trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'

# Specify where the new model will live:
log_dir = '/path/to/my_pascal_model_dir/'

# Restore only the convolutional layers:
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)

# Start training.
slim.learning.train(train_op, log_dir, init_fn=init_fn)
```

<h2>6. Evaluating Models</h2>

Once a model has been trained (or even while it is training), we usually want to see how well it performs in practice.

<h3>6.1 Metrics</h3>

A metric is a performance measure used to evaluate a model; it is not a loss function (losses are optimized directly during training), but it is still of interest for evaluation. TF-Slim provides a set of metric Ops that make model evaluation easy.
For example, suppose we want to compute mean absolute error. Computing a metric over a dataset naturally splits into initializing counters, aggregating over batches, and finalizing the result, so TF-Slim represents each metric as two ops: a value_op, an idempotent op that returns the current value of the metric, and an update_op, which performs the aggregation step and also returns the metric value. For example:

```python
images, labels = LoadTestData(...)
predictions = MyModel(images)

mae_value_op, mae_update_op = slim.metrics.streaming_mean_absolute_error(predictions, labels)
mre_value_op, mre_update_op = slim.metrics.streaming_mean_relative_error(predictions, labels)
# percentage_less assumes a tensor of per-example relative errors computed elsewhere:
pl_value_op, pl_update_op = slim.metrics.percentage_less(mean_relative_errors, 0.3)
```

Since keeping track of many value/update op pairs by hand quickly becomes laborious, TF-Slim also provides two convenience functions, aggregate_metrics and aggregate_metric_map:

```python
# Aggregates the value and update ops in two lists:
value_ops, update_ops = slim.metrics.aggregate_metrics(
    slim.metrics.streaming_mean_absolute_error(predictions, labels),
    slim.metrics.streaming_mean_squared_error(predictions, labels))

# Aggregates the value and update ops in two dictionaries:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})
```

<h3>6.2 Tracking Multiple Metrics: Example</h3>

```python
import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

# Load the data
images, labels = load_data(...)

# Define the network
predictions = vgg.vgg_16(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

# Evaluate the model using 1000 batches of data:
num_batches = 1000

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())

    for batch_id in range(num_batches):
        sess.run(list(names_to_updates.values()))

    metric_values = sess.run(list(names_to_values.values()))
    for metric, value in zip(names_to_values.keys(), metric_values):
        print('Metric %s has value: %f' % (metric, value))
```

Note that metric_ops.py can be used in isolation, without layers.py or loss_ops.py.

<h3>6.3 Evaluation Loop</h3>

TF-Slim provides an evaluation module, evaluation.py, which contains helper functions for writing evaluation scripts using metrics from metric_ops.py. These include functions for running evaluation periodically, computing metrics over batches of data, and printing and summarizing metric results. For example:

```python
import math

import tensorflow as tf

slim = tf.contrib.slim

# Load the data
images, labels = load_data(...)

# Define the network
predictions = MyModel(images)

# Choose the metrics to compute (streaming metrics return the
# (value_op, update_op) pairs that aggregate_metric_map expects):
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'accuracy': slim.metrics.streaming_accuracy(predictions, labels),
    'precision': slim.metrics.streaming_precision(predictions, labels),
    'recall': slim.metrics.streaming_recall(predictions, labels),
})

# Create the summary ops such that they also print out to std output:
summary_ops = []
for metric_name, metric_value in names_to_values.items():
    op = tf.summary.scalar(metric_name, metric_value)
    op = tf.Print(op, [metric_value], metric_name)
    summary_ops.append(op)

num_examples = 10000
batch_size = 32
num_batches = int(math.ceil(num_examples / float(batch_size)))

# Setup the global step.
slim.get_or_create_global_step()

checkpoint_dir = ...  # Where the model checkpoints are stored.
log_dir = ...  # Where the summaries are stored.
eval_interval_secs = ...  # How often to run the evaluation.

slim.evaluation.evaluation_loop(
    'local',
    checkpoint_dir,
    log_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()),
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=eval_interval_secs)
```
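For a one-off evaluation of a single checkpoint, rather than a periodic loop, slim's evaluation module also offers evaluate_once. A minimal sketch, reusing the metric ops and the placeholder directories from the loop example above:

```python
# Evaluate the latest checkpoint once; final_op returns the finalized values.
metric_values = slim.evaluation.evaluate_once(
    master='',
    checkpoint_path=tf.train.latest_checkpoint(checkpoint_dir),
    logdir=log_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()),
    final_op=list(names_to_values.values()))

for metric, value in zip(names_to_values.keys(), metric_values):
    print('Metric %s has value: %f' % (metric, value))
```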