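In earlier posts we got a rough picture of the RNN model. As a sequential model, the RNN's strengths need no repeating here. Today, let's look at another special and equally well-known neural network model: the CNN. The discussion is based mainly on TensorFlow's CIFAR10 tutorial; for comparison, we will also walk through the model from TensorFlow's MNIST tutorial. You will soon see that, logically, the two are much the same. Because their targets differ, the data handling differs slightly. We use CIFAR10 as the baseline here, and interested readers are encouraged to study the MNIST tutorial as well. The English CIFAR10 tutorial and its code are available on the TensorFlow website.

Introduction to CNNs

The CNN is a remarkable deep learning architecture, and something of an outlier in the field. From the late 1990s to the early 2000s, a period often called an AI winter, most researchers had walked away from neural networks, yet the CNN only grew more useful, thanks to Yann LeCun! The architecture lives up to its name, Convolutional Neural Network. It begins by applying convolution, combined with operations such as pooling. After the image has been convolved several times into multiple feature maps, the result is flattened and fed into a fully connected multi-layer network, and a softmax on the output decides the image's class. The history is interesting too: as early as the late 1990s, LeNet-5, named after LeCun, was already well known. Once deep learning took off, many variants followed, notably AlexNet from the University of Toronto, Google's GoogLeNet, Oxford's VGG networks (also called OxfordNet, for example VGG16), and Network in Network (NIN). More recently, object recognition research produced the R-CNN architecture. Even with deep learning moving this fast, the CNN remains a research topic for many well-known groups, and it also appears inside AlphaGo, which says a lot about its power. Resources on the basic structure of CNNs are plentiful, so they are not listed here.

CIFAR10 code walkthrough

To run the CIFAR10 code, download it, cd into the code directory, and run python cifar10_train.py. The default number of steps is 1,000,000. Each step is reported to take about 3~4 seconds, and about 5 hours of running is said to complete close to 100,000 steps; actual speed depends heavily on hardware. According to the description in cifar10_train.py, accuracy is about 86% after 100,000 steps, so roughly 5 hours is enough and there is no need to run all 1,000,000 steps. To check the results, run python cifar10_eval.py. Because the model is saved under the tmp directory, the eval script can find the most recently saved model and run it, which is convenient. Once trained, the system can recognize 10 kinds of objects in photos, airplanes among them. Let's see how such a fun system is built.

First, look at cifar10_train.py. Its core is the train function, shown below:

def train():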
    '''Train CIFAR-10 for a number of steps.'''
    with tf.Graph().as_default():
        global_step = tf.Variable(0, trainable=False)
        # Get images and labels for CIFAR-10.
        # The input comes from the distorted_inputs function.
        images, labels = cifar10.distorted_inputs()
        # Build a Graph that computes the logits predictions from the
        # inference model.
        logits = cifar10.inference(images)
        # Calculate loss.
        loss = cifar10.loss(logits, labels)
        # Build a Graph that trains the model with one batch of examples and
        # updates the model parameters.
        train_op = cifar10.train(loss, global_step)
        # Create a saver.
        saver = tf.train.Saver(tf.all_variables())
        # Build the summary operation based on the TF collection of Summaries.
        summary_op = tf.merge_all_summaries()
        # Build an initialization operation to run below.
        init = tf.initialize_all_variables()
        # Start running operations on the Graph.
        sess = tf.Session(config=tf.ConfigProto(
            log_device_placement=FLAGS.log_device_placement))
        sess.run(init)
        # Start the queue runners.
        tf.train.start_queue_runners(sess=sess)
        summary_writer = tf.train.SummaryWriter(FLAGS.train_dir, sess.graph)
        # Loop up to the maximum number of steps.
        for step in xrange(FLAGS.max_steps):
            start_time = time.time()
            _, loss_value = sess.run([train_op, loss])
            duration = time.time() - start_time
            assert not np.isnan(loss_value), 'Model diverged with loss = NaN'
            # Every 10 steps, print the step, loss, timing and other statistics.
            if step % 10 == 0:
                num_examples_per_step = FLAGS.batch_size
                examples_per_sec = num_examples_per_step / duration
                sec_per_batch = float(duration)
                format_str = ('%s: step %d, loss = %.2f (%.1f examples/sec; %.3f '
                              'sec/batch)')
                print(format_str % (datetime.now(), step, loss_value,
                                    examples_per_sec, sec_per_batch))
            # Every 100 steps, write the state of the network to the summary.
            if step % 100 == 0:
                summary_str = sess.run(summary_op)
                summary_writer.add_summary(summary_str, step)
            # Save the model checkpoint periodically:
            # every 1000 steps, and once more at the final step.
            if step % 1000 == 0 or (step + 1) == FLAGS.max_steps:
                checkpoint_path = os.path.join(FLAGS.train_dir, 'model.ckpt')
                saver.save(sess, checkpoint_path, global_step=step)
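The periodic bookkeeping inside the training loop above can be isolated from TensorFlow. The following is a minimal sketch, assuming the same three intervals (10, 100 and 1000 steps); the function name actions_for_step and the action labels are hypothetical and only serve the illustration.

```python
def actions_for_step(step, max_steps):
    """Return which periodic actions the training loop would take at `step`."""
    actions = []
    if step % 10 == 0:
        actions.append("log")         # print step, loss and timing
    if step % 100 == 0:
        actions.append("summary")     # write summaries
    if step % 1000 == 0 or (step + 1) == max_steps:
        actions.append("checkpoint")  # save the model
    return actions


for step in (0, 10, 100, 1000, 2499):
    print(step, actions_for_step(step, max_steps=2500))
```

Note that the last step always triggers a checkpoint, even when it is not a multiple of 1000, so the final model is never lost.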
The train function in cifar10_train.py is clear in its logic. Besides relying heavily on functions from cifar10.py, one thing worth noting is that the input comes from the distorted_inputs function. This is interesting because, according to the literature, applying certain transformations to the input data yields new data, which is a simple way to increase the amount of training data. How exactly is this done? Let's look at distorted_inputs. In cifar10.py, distorted_inputs is essentially a wrapper around the distorted_inputs() function in cifar10_input.py. Its logic is as follows:

def distorted_inputs(data_dir, batch_size):
    '''Construct distorted input for CIFAR training using the Reader ops.

    Args:
        data_dir: Path to the CIFAR-10 data directory.
        batch_size: Number of images per batch.
    Returns:
        images: Images. 4D tensor of [batch_size, IMAGE_SIZE, IMAGE_SIZE, 3] size.
        labels: Labels. 1D tensor of [batch_size] size.
    '''
    filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i)
                 for i in xrange(1, 6)]
    for f in filenames:
        if not tf.gfile.Exists(f):
            raise ValueError('Failed to find file: ' + f)
    # Create a queue that produces the filenames to read.
    filename_queue = tf.train.string_input_producer(filenames)
    # Read examples from files in the filename queue.
    read_input = read_cifar10(filename_queue)
    reshaped_image = tf.cast(read_input.uint8image, tf.float32)
    height = IMAGE_SIZE
    width = IMAGE_SIZE
    # Image processing for training the network. Note the many random
    # distortions applied to the image.
    # Randomly crop a [height, width] section of the image.
    # Step 1: randomly crop a [height, width] patch out of the image.
    distorted_image = tf.random_crop(reshaped_image, [height, width, 3])
    # Randomly flip the image horizontally.
    # Step 2: randomly flip the image left to right, with 50% probability.
    distorted_image = tf.image.random_flip_left_right(distorted_image)
    # Because these operations are not commutative, consider randomizing
    # the order their operation.
    # Step 3: randomly change the image's brightness and contrast.
    distorted_image = tf.image.random_brightness(distorted_image,
                                                 max_delta=63)
    distorted_image = tf.image.random_contrast(distorted_image,
                                               lower=0.2, upper=1.8)
    # Subtract off the mean and divide by the variance of the pixels.
    float_image = tf.image.per_image_whitening(distorted_image)
    # Ensure that the random shuffling has good mixing properties.
    min_fraction_of_examples_in_queue = 0.4
    min_queue_examples = int(NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN *
                             min_fraction_of_examples_in_queue)
    print('Filling queue with %d CIFAR images before starting to train. '
          'This will take a few minutes.' % min_queue_examples)
    # Generate a batch of images and labels by building up a queue of examples.
    return _generate_image_and_label_batch(float_image, read_input.label,
                                           min_queue_examples, batch_size,
                                           shuffle=True)
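To make the three distortion steps above concrete without depending on TensorFlow, here is a rough pure-Python sketch that operates on an image stored as a nested list of shape [height][width][channels]. The helper names mirror the TensorFlow ops but are my own re-implementations, written under the assumption that brightness adds a random offset and contrast scales each channel around its mean. The 32x32 source size and the 24x24 crop size are also assumptions, as the post does not state IMAGE_SIZE.

```python
import random


def random_crop(image, height, width, rng):
    """Step 1: cut a random [height, width] window out of the image."""
    top = rng.randint(0, len(image) - height)      # randint bounds are inclusive
    left = rng.randint(0, len(image[0]) - width)
    return [row[left:left + width] for row in image[top:top + height]]


def random_flip_left_right(image, rng):
    """Step 2: mirror the image horizontally with probability 0.5."""
    return [row[::-1] for row in image] if rng.random() < 0.5 else image


def random_brightness(image, max_delta, rng):
    """Step 3a: add one random offset in [-max_delta, max_delta] to every value."""
    delta = rng.uniform(-max_delta, max_delta)
    return [[[v + delta for v in pixel] for pixel in row] for row in image]


def random_contrast(image, lower, upper, rng):
    """Step 3b: scale each channel around its own mean by a random factor."""
    factor = rng.uniform(lower, upper)
    channels = len(image[0][0])
    count = len(image) * len(image[0])
    means = [sum(pixel[c] for row in image for pixel in row) / count
             for c in range(channels)]
    return [[[(pixel[c] - means[c]) * factor + means[c] for c in range(channels)]
             for pixel in row] for row in image]


def distort(image, rng, size=24):
    """Apply the steps in the same order as the function above."""
    image = random_crop(image, size, size, rng)
    image = random_flip_left_right(image, rng)
    image = random_brightness(image, 63, rng)
    return random_contrast(image, 0.2, 1.8, rng)


rng = random.Random(0)
source = [[[float(y), float(x), float(x + y)] for x in range(32)] for y in range(32)]
sample = distort(source, rng)
print(len(sample), len(sample[0]), len(sample[0][0]))  # 24 24 3
```

Because every call draws fresh random numbers, the same source image produces a different training example each time it passes through the pipeline.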
In distorted_inputs, each image is randomly cropped, flipped left to right with 50% probability, and then has its brightness and contrast randomly changed. The last part of the function means that training can only start once the queue holds at least 40% of an epoch's examples. Do we need these steps at test time? The answer is no. In cifar10_input.py, just below distorted_inputs, a function named inputs holds the input logic used during eval. It keeps the parameters of distorted_inputs and adds one named eval_data, a bool that indicates whether to use the training data or the test data. Let's take a quick look at its logic.

def inputs(eval_data, data_dir, batch_size):
    '''Construct input for CIFAR evaluation using the Reader ops.

    Args:
        eval_data: bool, indicating if one should use the train or eval data set.
        data_dir: Path to the CIFAR-10 data directory.
        batch_size: Number of images per batch.
    Returns:
        images: Images. 4D tensor of [batch_size, IMAGE_SIZE, IMAGE_SIZE, 3] size.
        labels: Labels. 1D tensor of [batch_size] size.
    '''
    if not eval_data:
        filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i)
                     for i in xrange(1, 6)]
        num_examples_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN
    else:
        filenames = [os.path.join(data_dir, 'test_batch.bin')]
        num_examples_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_EVAL
    for f in filenames:
        if not tf.gfile.Exists(f):
            raise ValueError('Failed to find file: ' + f)
    # Create a queue that produces the filenames to read.
    filename_queue = tf.train.string_input_producer(filenames)
    # Read examples from files in the filename queue.
    read_input = read_cifar10(filename_queue)
    reshaped_image = tf.cast(read_input.uint8image, tf.float32)
    height = IMAGE_SIZE
    width = IMAGE_SIZE
    # Image processing for evaluation.
    # Crop the central [height, width] of the image.
    # The op takes (image, target_height, target_width); both are IMAGE_SIZE here.
    resized_image = tf.image.resize_image_with_crop_or_pad(reshaped_image,
                                                           height, width)
    # Subtract off the mean and divide by the variance of the pixels.
    # This normalizes the pixel values of each image.
    float_image = tf.image.per_image_whitening(resized_image)
    # Ensure that the random shuffling has good mixing properties.
    min_fraction_of_examples_in_queue = 0.4
    min_queue_examples = int(num_examples_per_epoch *
                             min_fraction_of_examples_in_queue)
    # Generate a batch of images and labels by building up a queue of examples.
    return _generate_image_and_label_batch(float_image, read_input.label,
                                           min_queue_examples, batch_size,
                                           shuffle=False)
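The two evaluation-time steps above, a central crop followed by per-image whitening, can be sketched in plain Python as follows. This is my own approximation rather than the TensorFlow implementation. To the best of my knowledge, and despite the code comment mentioning the variance, the whitening op subtracts the mean and divides by the standard deviation, with a lower bound of 1/sqrt(N) on the divisor to protect uniform images; treat that detail as an assumption. The 32x32 input and 24x24 crop sizes are assumptions as well.

```python
import math


def center_crop(image, height, width):
    """Keep the central [height, width] window of a [H][W][C] nested list."""
    top = (len(image) - height) // 2
    left = (len(image[0]) - width) // 2
    return [row[left:left + width] for row in image[top:top + height]]


def per_image_whitening(image):
    """Shift the image to zero mean and scale it to (roughly) unit variance."""
    values = [v for row in image for pixel in row for v in pixel]
    count = len(values)
    mean = sum(values) / count
    variance = sum((v - mean) ** 2 for v in values) / count
    # Lower-bound the divisor so a constant image does not divide by zero.
    adjusted_stddev = max(math.sqrt(variance), 1.0 / math.sqrt(count))
    return [[[(v - mean) / adjusted_stddev for v in pixel] for pixel in row]
            for row in image]


source = [[[float(y), float(x), float(x + y)] for x in range(32)] for y in range(32)]
cropped = center_crop(source, 24, 24)
whitened = per_image_whitening(cropped)
print(len(whitened), len(whitened[0]), cropped[0][0] == source[4][4])  # 24 24 True
```

Unlike the training pipeline, nothing here is random, so the same test image always yields the same network input.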
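In inputs(), only the center of the image is cropped, and the only other processing is the whitening step. The astute reader will see the benefit of the training-time distortions. Suppose a test picture of an airplane is centered on the nose, while every airplane picture in the training set is centered on the wings. The distorted_inputs function then has a chance of cropping the region around the nose, which gives the network information similar to the test picture. In addition, the random brightness and contrast ranges include the unmodified image (a brightness shift of 0 and a contrast factor of 1), so the training set effectively covers a broader and more varied set of possibilities and can reasonably be expected to give better results.

Having covered these input tricks, it is time to look at the actual CNN model. How do we build one? Let's start with a simple version, the model from the MNIST tutorial:

# The variables below hold all the trainable weights. They are passed an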
# initial value which will be assigned when we call:
# {tf.initialize_all_variables().run()}
conv1_weights = tf.Variable(
    tf.truncated_normal([5, 5, NUM_CHANNELS, 32],  # 5x5 filter, depth 32.
                        stddev=0.1,
                        seed=SEED, dtype=data_type()))
conv1_biases = tf.Variable(tf.zeros([32], dtype=data_type()))
conv2_weights = tf.Variable(tf.truncated_normal(
    [5, 5, 32, 64], stddev=0.1,
    seed=SEED, dtype=data_type()))
conv2_biases = tf.Variable(tf.constant(0.1, shape=[64], dtype=data_type()))
fc1_weights = tf.Variable(  # fully connected, depth 512.
    tf.truncated_normal([IMAGE_SIZE // 4 * IMAGE_SIZE // 4 * 64, 512],
                        stddev=0.1,
                        seed=SEED,
                        dtype=data_type()))
fc1_biases = tf.Variable(tf.constant(0.1, shape=[512], dtype=data_type()))
fc2_weights = tf.Variable(tf.truncated_normal([512, NUM_LABELS],
                                              stddev=0.1,
                                              seed=SEED,
                                              dtype=data_type()))
fc2_biases = tf.Variable(tf.constant(
    0.1, shape=[NUM_LABELS], dtype=data_type()))

# We will replicate the model structure for the training subgraph, as well
# as the evaluation subgraphs, while sharing the trainable parameters.
def model(data, train=False):
    '''The Model definition.'''
    # 2D convolution, with 'SAME' padding (i.e. the output feature map has
    # the same size as the input). Note that {strides} is a 4D array whose
    # shape matches the data layout: [image index, y, x, depth].
    conv = tf.nn.conv2d(data,
                        conv1_weights,
                        strides=[1, 1, 1, 1],
                        padding='SAME')
    # Bias and rectified linear non-linearity.
    relu = tf.nn.relu(tf.nn.bias_add(conv, conv1_biases))
    # Max pooling. The kernel size spec {ksize} also follows the layout of
    # the data. Here we have a pooling window of 2, and a stride of 2.
    pool = tf.nn.max_pool(relu,
                          ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1],
                          padding='SAME')
    conv = tf.nn.conv2d(pool,
                        conv2_weights,
                        strides=[1, 1, 1, 1],
                        padding='SAME')
    relu = tf.nn.relu(tf.nn.bias_add(conv, conv2_biases))
    pool = tf.nn.max_pool(relu,
                          ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1],
                          padding='SAME')
    # Reshape the feature map cuboid into a 2D matrix to feed it to the
    # fully connected layers.
    pool_shape = pool.get_shape().as_list()
    reshape = tf.reshape(
        pool,
        [pool_shape[0], pool_shape[1] * pool_shape[2] * pool_shape[3]])
    # Fully connected layer. Note that the '+' operation automatically
    # broadcasts the biases.
    hidden = tf.nn.relu(tf.matmul(reshape, fc1_weights) + fc1_biases)
    # Add a 50% dropout during training only. Dropout also scales
    # activations such that no rescaling is needed at evaluation time.
    if train:
        hidden = tf.nn.dropout(hidden, 0.5, seed=SEED)
    return tf.matmul(hidden, fc2_weights) + fc2_biases

# Training computation: logits + cross-entropy loss.
logits = model(train_data_node, True)
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits, train_labels_node))
# L2 regularization for the fully connected parameters.
regularizers = (tf.nn.l2_loss(fc1_weights) + tf.nn.l2_loss(fc1_biases) +
                tf.nn.l2_loss(fc2_weights) + tf.nn.l2_loss(fc2_biases))
# Add the regularization term to the loss.
loss += 5e-4 * regularizers
# Optimizer: set up a variable that's incremented once per batch and
# controls the learning rate decay.
batch = tf.Variable(0, dtype=data_type())
# Decay once per epoch, using an exponential schedule starting at 0.01.
learning_rate = tf.train.exponential_decay(
    0.01,                # Base learning rate.
    batch * BATCH_SIZE,  # Current index into the dataset.
    train_size,          # Decay step.
    0.95,                # Decay rate.
    staircase=True)
# Use simple momentum for the optimization.
optimizer = tf.train.MomentumOptimizer(learning_rate,
                                       0.9).minimize(loss,
                                                     global_step=batch)
# Predictions for the current training minibatch.
train_prediction = tf.nn.softmax(logits)
# Predictions for the test and validation, which we'll compute less often.
eval_prediction = tf.nn.softmax(model(eval_data))
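A useful check on the MNIST model above is to follow the feature-map shape through the layers and confirm that it matches the first dimension of fc1_weights. The sketch below assumes IMAGE_SIZE is 28 and NUM_CHANNELS is 1, values the post does not state explicitly. It uses the rule that a 'SAME'-padded convolution or pooling of stride s maps a side of length n to ceil(n / s). The helper names are hypothetical.

```python
def same_padding_size(size, stride):
    """Output side length for 'SAME' padding: ceil(size / stride)."""
    return -(-size // stride)


def trace_shapes(image_size, channels, conv_depths):
    """Follow (height, width, depth) through conv (stride 1) + pool (stride 2) pairs."""
    height = width = image_size
    shapes = [("input", (height, width, channels))]
    for index, depth in enumerate(conv_depths, start=1):
        height, width = same_padding_size(height, 1), same_padding_size(width, 1)
        shapes.append(("conv%d" % index, (height, width, depth)))
        height, width = same_padding_size(height, 2), same_padding_size(width, 2)
        shapes.append(("pool%d" % index, (height, width, depth)))
    return shapes


shapes = trace_shapes(image_size=28, channels=1, conv_depths=[32, 64])
for name, shape in shapes:
    print(name, shape)
height, width, depth = shapes[-1][1]
print("fc1 input size:", height * width * depth)  # fc1 input size: 3136
```

Under these assumptions the flattened size 7 * 7 * 64 = 3136 agrees with the expression IMAGE_SIZE // 4 * IMAGE_SIZE // 4 * 64 used for fc1_weights, since each of the two pooling layers halves the side length.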
The MNIST code is straightforward. After defining the weight and bias parameters for the conv1, conv2, fc1 and fc2 layers, the model function applies conv2d, relu and max_pool twice, then reshapes the result and feeds it into the fully connected network, i.e. matmul(reshape, fc1_weights) + fc1_biases. A second fully connected layer then produces the logits. After the usual definitions of the loss and the optimizer, the predictions are obtained through softmax. We will see the same logic in CIFAR10, where it is expressed as follows:

def inference(images):
    '''Build the CIFAR-10 model.

    Args:
        images: Images returned from distorted_inputs() or inputs().
    Returns:
        Logits.
    '''
    # We instantiate all variables using tf.get_variable() instead of
    # tf.Variable() in order to share variables across multiple GPU training runs.
    # If we only ran this model on a single GPU, we could simplify this function
    # by replacing all instances of tf.get_variable() with tf.Variable().
    #
    # conv1
    with tf.variable_scope('conv1') as scope:
        # The input images are in color and have three channels, so the
        # kernel's input depth is 3; we set the output to a 64-channel
        # feature map.
        kernel = _variable_with_weight_decay('weights', shape=[5, 5, 3, 64],
                                             stddev=1e-4, wd=0.0)
        conv = tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME')
        biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.0))
        bias = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(bias, name=scope.name)
        _activation_summary(conv1)
    # pool1
    pool1 = tf.nn.max_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1],
                           padding='SAME', name='pool1')
    # norm1
    norm1 = tf.nn.lrn(pool1, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75,
                      name='norm1')
    # conv2
    with tf.variable_scope('conv2') as scope:
        # The previous output has 64 channels and is the input here, so the
        # kernel's input depth is 64; we also set the output depth to 64.
        kernel = _variable_with_weight_decay('weights', shape=[5, 5, 64, 64],
                                             stddev=1e-4, wd=0.0)
        conv = tf.nn.conv2d(norm1, kernel, [1, 1, 1, 1], padding='SAME')
        biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.1))
        bias = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(bias, name=scope.name)
        _activation_summary(conv2)
    # norm2
    norm2 = tf.nn.lrn(conv2, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75,
                      name='norm2')
    # pool2
    pool2 = tf.nn.max_pool(norm2, ksize=[1, 3, 3, 1],
                           strides=[1, 2, 2, 1], padding='SAME', name='pool2')
    # local3
    with tf.variable_scope('local3') as scope:
        # Move everything into depth so we can perform a single matrix multiply.
        reshape = tf.reshape(pool2, [FLAGS.batch_size, -1])
        dim = reshape.get_shape()[1].value
        # The -1 in the reshape above is inferred from the tensor's size: the
        # first dimension is batch_size and the rest holds everything that
        # belongs to one image. We map that onto 384 units.
        weights = _variable_with_weight_decay('weights', shape=[dim, 384],
                                              stddev=0.04, wd=0.004)
        biases = _variable_on_cpu('biases', [384], tf.constant_initializer(0.1))
        local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name)
        _activation_summary(local3)
    # local4
    with tf.variable_scope('local4') as scope:
        # The previous layer has 384 units; here we reduce further to 192.
        weights = _variable_with_weight_decay('weights', shape=[384, 192],
                                              stddev=0.04, wd=0.004)
        biases = _variable_on_cpu('biases', [192], tf.constant_initializer(0.1))
        local4 = tf.nn.relu(tf.matmul(local3, weights) + biases, name=scope.name)
        _activation_summary(local4)
    # softmax, i.e. softmax(WX + b)
    with tf.variable_scope('softmax_linear') as scope:
        # This is the linear layer that feeds the softmax. It maps the 192
        # units onto the number of output classes; with 10 classes,
        # NUM_CLASSES is 10.
        weights = _variable_with_weight_decay('weights', [192, NUM_CLASSES],
                                              stddev=1 / 192.0, wd=0.0)
        biases = _variable_on_cpu('biases', [NUM_CLASSES],
                                  tf.constant_initializer(0.0))
        softmax_linear = tf.add(tf.matmul(local4, weights), biases, name=scope.name)
        _activation_summary(softmax_linear)
    return softmax_linear
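Note that inference() returns raw logits from the softmax_linear layer rather than probabilities. As an illustration of what a sparse softmax cross-entropy does with such logits and an integer class label, here is a small pure-Python sketch. The function names are my own; it assumes the usual definition, loss = log(sum(exp(logits))) - logits[label], and uses the max-subtraction trick for numerical stability.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)                       # subtract the max to avoid overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_softmax_cross_entropy(logits, label):
    """Negative log-probability of the true class, given an integer label."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[label]


def mean_loss(batch_logits, labels):
    """Average the per-example losses over a batch."""
    losses = [sparse_softmax_cross_entropy(l, y) for l, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)


batch_logits = [[2.0, 0.5, 0.1], [0.2, 0.3, 1.5]]
labels = [0, 2]
print(softmax(batch_logits[0]))
print(mean_loss(batch_logits, labels))
```

With ten classes and all logits equal, the loss is ln(10), roughly 2.30, which is about what an untrained 10-class classifier should report at the start of training.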
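The logic of inference() follows basically the same framework as the MNIST model. So what differs? First, the input is now a color image. Anyone who has studied image processing knows that a color image has 3 channels, whereas MNIST images are single-channel grayscale. In MNIST the first set of feature maps therefore goes from 1 channel to 32 (note that NUM_CHANNELS is 1 there). Here the 3 is simply written directly in place of NUM_CHANNELS, and the first convolution maps 3 channels to 64. We also use variable scopes, a good way of controlling when and which variables are shared. They also spare us from inventing distinct names for every weight and bias.

The loss is defined in the loss function, which essentially applies sparse_softmax_cross_entropy_with_logits. The flow is basically the same as in MNIST and is not described in detail here. Finally, although the train function in cifar10.py is logically simple, it has a few points worth noting. The code is as follows:

def train(total_loss, global_step):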
    '''Train CIFAR-10 model.

    Create an optimizer and apply to all trainable variables. Add moving
    average for all trainable variables.

    Args:
        total_loss: Total loss from loss().
        global_step: Integer Variable counting the number of training steps
            processed.
    Returns:
        train_op: op for training.
    '''
    # Variables that affect learning rate.
    num_batches_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN / FLAGS.batch_size
    decay_steps = int(num_batches_per_epoch * NUM_EPOCHS_PER_DECAY)
    # Decay the learning rate exponentially based on the number of steps.
    lr = tf.train.exponential_decay(INITIAL_LEARNING_RATE,
                                    global_step,
                                    decay_steps,
                                    LEARNING_RATE_DECAY_FACTOR,
                                    staircase=True)
    tf.scalar_summary('learning_rate', lr)
    # Generate moving averages of all losses and associated summaries.
    loss_averages_op = _add_loss_summaries(total_loss)
    # Compute gradients.
    # Use of control dependencies: the gradient descent step is only
    # computed once loss_averages_op has run.
    with tf.control_dependencies([loss_averages_op]):
        opt = tf.train.GradientDescentOptimizer(lr)
        grads = opt.compute_gradients(total_loss)
    # Apply gradients.
    apply_gradient_op = opt.apply_gradients(grads, global_step=global_step)
    # Add histograms for trainable variables.
    for var in tf.trainable_variables():
        tf.histogram_summary(var.op.name, var)
    # Add histograms for gradients.
    for grad, var in grads:
        if grad is not None:
            tf.histogram_summary(var.op.name + '/gradients', grad)
    # Track the moving averages of all trainable variables.
    variable_averages = tf.train.ExponentialMovingAverage(
        MOVING_AVERAGE_DECAY, global_step)
    variables_averages_op = variable_averages.apply(tf.trainable_variables())
    with tf.control_dependencies([apply_gradient_op, variables_averages_op]):
        train_op = tf.no_op(name='train')
    return train_op
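The staircase learning-rate schedule set up at the top of this function can be reproduced with a few lines of arithmetic. The sketch below assumes the standard formula, rate = initial_rate * decay_rate ^ (step / decay_steps), with the exponent floored when staircase is on. The constant values are my assumptions about the tutorial's defaults and are not given in the post, and the sketch runs under Python 3 division semantics.

```python
def exponential_decay(initial_rate, global_step, decay_steps, decay_rate,
                      staircase=False):
    """Learning rate after `global_step` steps of exponential decay."""
    if staircase:
        exponent = global_step // decay_steps   # drop in discrete jumps
    else:
        exponent = global_step / decay_steps    # decay smoothly
    return initial_rate * decay_rate ** exponent


# Assumed constants (hypothetical here; check them against cifar10.py).
NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN = 50000
BATCH_SIZE = 128
NUM_EPOCHS_PER_DECAY = 350.0
INITIAL_LEARNING_RATE = 0.1
LEARNING_RATE_DECAY_FACTOR = 0.1

num_batches_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN / BATCH_SIZE
decay_steps = int(num_batches_per_epoch * NUM_EPOCHS_PER_DECAY)
print(decay_steps)  # 136718

for step in (0, 100000, decay_steps, 2 * decay_steps):
    rate = exponential_decay(INITIAL_LEARNING_RATE, step, decay_steps,
                             LEARNING_RATE_DECAY_FACTOR, staircase=True)
    print(step, rate)
```

Under these assumed values the learning rate would stay at its initial value for the whole of a 100,000-step run, because the first drop only happens after decay_steps steps.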
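The extra pieces in the train function of cifar10.py collect intermediate results of the network computation, such as loss_averages_op = _add_loss_summaries(total_loss), which records all the losses, and the parameter histograms written by tf.histogram_summary(var.op.name, var). Worth noting is the repeated use of control dependencies: ops created inside a control_dependencies block do not run until the listed dependencies have run. This concept matters a great deal in TensorFlow, and this function is a good practical illustration of it, so interested readers are encouraged to study it further. That concludes the training side.

So how does the eval file judge how good the model is? Let's take a brief look. In the evaluate function we first obtain the input images and their labels from cifar10.inputs. We then pass the images through the inference function described earlier, i.e. the CNN, to get the logits. Finally we use TensorFlow's in_top_k function to check whether the true label is among the k classes with the highest logits. Here k is set to 1, and the result is displayed and logged. Interested readers can read that code closely; it is not explained in detail here.

With that, the system is complete, and we have a first understanding of how to build a CNN system.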
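The top-k check used during evaluation is easy to sketch in plain Python. The version below is my own approximation of the idea rather than the TensorFlow implementation: a prediction counts as a hit when fewer than k classes score strictly higher than the true class, so ties are resolved in the label's favor. The names in_top_k and precision_at_k are used here only for illustration.

```python
def in_top_k(logits, label, k=1):
    """True if the true class is among the k highest-scoring classes."""
    target_score = logits[label]
    better = sum(1 for score in logits if score > target_score)
    return better < k


def precision_at_k(batch_logits, labels, k=1):
    """Fraction of examples whose true label lands in the top k."""
    hits = sum(in_top_k(logits, label, k)
               for logits, label in zip(batch_logits, labels))
    return hits / len(labels)


batch_logits = [[2.0, 0.5, 0.1], [0.2, 0.3, 1.5], [0.9, 1.1, 0.4]]
labels = [0, 2, 0]
print(precision_at_k(batch_logits, labels, k=1))
print(precision_at_k(batch_logits, labels, k=2))
```

With k set to 1, as in the eval script, this is ordinary classification accuracy: two of the three toy examples above are counted as correct.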