At the time of writing, I was a student who had just been admitted to a graduate program and had never touched machine learning before. At my advisor's request I started working on CNN-based face gender classification (which is also when I started using CSDN). Here I record the pitfalls I ran into and a little of what I learned.
My advisor believes in learning by doing, so I took on this task with no background at all in Python, TensorFlow, or CNNs. My first program was the famous MNIST handwritten digit example; anyone who has been through the same thing will know the difficulty and frustration involved. I won't rehash Python basics here, as there are already plenty of excellent posts on that. This post is mainly a record of the problems I hit while doing this task.
When I first built the network, I borrowed part of a teacher's MNIST code and modified it to define my own basic network and parameters (I'm a beginner, so please forgive any imprecise wording):
#placeholder x: (input data)
xs = tf.placeholder(tf.float32, shape=[None, 92*112])
ys = tf.placeholder(tf.float32, shape=[None, 2])
x_image = tf.reshape(xs, [-1, 112, 92, 1])

#get w
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

#get bias
def biases_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

#convolutional layer
def conv2d(x, w):
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')

#pooling layer: the pooling window is 2x2, so the output height and width are half the input
def max_pool(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')


#the first convolutional layer
w_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = biases_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)  #output size 112x92x32
h_pool1 = max_pool(h_conv1)  #output size 56x46x32


#the second convolutional layer: each 5x5 patch yields 64 features
w_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = biases_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)  #output size 56x46x64
h_pool2 = max_pool(h_conv2)  #output size 28x23x64


#the third convolutional layer
w_conv3 = weight_variable([5, 5, 64, 128])
b_conv3 = biases_variable([128])
h_conv3 = tf.nn.relu(conv2d(h_pool2, w_conv3) + b_conv3)
h_pool3 = max_pool(h_conv3)  #output size 14x12x128

#fully connected layer (implemented as a VALID convolution over the whole 14x12 feature map)
w_fc1 = weight_variable([14, 12, 128, 1024])
b_fc1 = biases_variable([1024])
h_fc11 = tf.nn.relu(tf.nn.conv2d(h_pool3, w_fc1, strides=[1, 1, 1, 1], padding='VALID') + b_fc1)
h_fc1 = tf.reshape(h_fc11, [-1, 1024])
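One note on the fully connected layer above: it is written as a tf.nn.conv2d with a 14x12 kernel and 'VALID' padding, which covers the whole final feature map in one step; mathematically this is the same as flattening the feature map and multiplying by a weight matrix. A minimal sketch of the more usual flatten-plus-matmul form (the *_flat / *_alt names are only for illustration, reusing the helper functions defined above):

#equivalent fully connected layer written as flatten + matmul
h_pool3_flat = tf.reshape(h_pool3, [-1, 14 * 12 * 128])   #flatten each 14x12x128 map into one vector
w_fc1_alt = weight_variable([14 * 12 * 128, 1024])
b_fc1_alt = biases_variable([1024])
h_fc1_alt = tf.nn.relu(tf.matmul(h_pool3_flat, w_fc1_alt) + b_fc1_alt)   #shape [batch, 1024]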
(The complete code is pasted at the end.)
With this network, my training data is 800 face images of 112x92 pixels (400 male and 400 female), and my test data is about 1031 face images of the same size (591 of them male; don't ask why there is more test data than training data, it's just what I downloaded online). Everything else is commented in the code. With that, my network was basically finished, even if it doesn't have many layers.
Here are some of the problems I ran into along the way:
The first was the shape of the input images, which took me several days to sort out. The fix is what the code shows: reshape the input into a (-1, 112, 92, 1) tensor. If that isn't clear, see my complete code.
The next problem was that the fully connected layer needs the exact output size of the last max_pool, and my 112x92 input does not divide evenly through all of these pooling layers (it breaks between the second and third). In the end I asked a senior student, who said to just try it and see what comes out (my own fault for being slow), and that settled it; the small calculation below shows how the size can also be worked out directly.
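For reference, with 'SAME' padding and stride 2, each pooling layer rounds its output size up, so the final feature map size can be computed instead of found by trial and error. A small plain-Python sketch, assuming the 112x92 input and the three 2x2/stride-2 pooling layers used above:

import math

def pooled_size(size, num_pools=3, stride=2):
    #with 'SAME' padding, each pool gives out = ceil(in / stride)
    for _ in range(num_pools):
        size = math.ceil(size / stride)
    return size

print(pooled_size(112), pooled_size(92))   #prints 14 12, so the last feature map is 14x12x128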
With the network framework in place, I moved on to the input images themselves: how to load and store the original pictures, and how to use them for training and testing.
The processing code is as follows:
#image_train: training data
image_train = np.zeros((800, 112, 92))

for i in range(800):
    # path = ' H:\Python\train_sample\'
    m = str(i + 1)
    filename = "face" + m + '.bmp'
    image_train[i] = img.imread(filename)   #no TF session is needed just to read an image


#image_test: test data
image_test = np.zeros((1031, 112, 92))

for i in range(1031):
    m = str(i + 1)
    filename = r"H:\Python\test_sample\face" + m + '.bmp'
    image_test[i] = img.imread(filename)
This is the approach a senior student showed me, and it worked well: read all the data into a single numpy array, where the first dimension is the number of images and the remaining dimensions are the image size.
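Since everything ends up in one numpy array indexed by sample, it is also easy to sanity-check the loaded data before building the graph; a tiny check like this (using the arrays defined above) is all it takes:

#quick sanity check of the loaded data
print(image_train.shape, image_train.dtype)   #expected: (800, 112, 92), float64
print(image_test.shape)                       #expected: (1031, 112, 92)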
With the images handled, the next thing was the labels. Since the first part of my data is male faces and the second part is female faces, I simply built the label arrays myself in numpy rather than reading them from some other file (I was too lazy to learn that part). A more compact way to build the same arrays is sketched right after the code:
#test label
label_test = np.ones((1031, 2))
for i in range(591):
    label_test[i, 0] = 0
    label_test[i, 1] = 1
for i in range(591, 1031):
    label_test[i, 0] = 1
    label_test[i, 1] = 0
#print(label_test[591][0])

#train label
label_train = np.ones((800, 2))
for i in range(400):
    label_train[i, 0] = 0
    label_train[i, 1] = 1
for i in range(400, 800):
    label_train[i, 0] = 1
    label_train[i, 1] = 0
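The loops above just build one-hot rows: [0, 1] for the first block of faces and [1, 0] for the second. For what it's worth, the same arrays can be built more compactly with numpy indexing (the *_alt names below are only for illustration):

#same one-hot labels built with numpy fancy indexing
test_classes = np.array([1] * 591 + [0] * (1031 - 591))   #class 1 = first block (male), class 0 = second block (female)
label_test_alt = np.eye(2)[test_classes]                  #shape (1031, 2)

train_classes = np.array([1] * 400 + [0] * 400)
label_train_alt = np.eye(2)[train_classes]                #shape (800, 2)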
Here I ran into a big problem that took a long time and a lot of asking around to solve. I suppose that's the downside of not learning things systematically.
Because my classifier is softmax and I had set the number of output classes to 1 (in my mind, male was 1 and female was 0), I initially built labels of shape (1031, 1) and (800, 1). That produced all kinds of errors, and I had no idea where the problem was. This is probably the kind of mistake only a beginner like me makes.
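The underlying reason is that tf.nn.softmax_cross_entropy_with_logits expects one label column per class, so with a 2-way softmax both the labels and the network output need shape (N, 2). If you would rather keep single-column integer labels (0/1), TensorFlow has a separate op for exactly that case; a minimal sketch, with labels_int and logits_2 as illustrative stand-ins for the real tensors:

#sketch: 2-class cross entropy with integer class ids instead of one-hot rows
labels_int = tf.placeholder(tf.int64, shape=[None])       #0 or 1 per image
logits_2 = tf.placeholder(tf.float32, shape=[None, 2])    #stand-in for the raw 2-unit network output
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels_int, logits=logits_2))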
With the labels done, the next step was the loss function and the optimizer. I took these directly from a paper, so there isn't much to say; here is the code:
#training and evaluation
#note: softmax_cross_entropy_with_logits expects the raw pre-softmax outputs
#(y_logits, defined in the complete code below), not the already-softmaxed y_out
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=ys, logits=y_logits))
#cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(y_out), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_out, 1), tf.argmax(ys, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.run(tf.global_variables_initializer())
#w1 = tf.zeros(shape = w_conv3.shape)
#w2 = tf.zeros(shape = w_conv3.shape)
#w3 = tf.zeros(shape = w_conv3.shape)
for i in range(1000):
    train_accuracy = sess.run(accuracy, feed_dict={xs: image_train, ys: label_train, keep_prob: 0.6})
    print("step %d, training accuracy %g" % (i, train_accuracy))
    sess.run(train_step, feed_dict={xs: image_train, ys: label_train, keep_prob: 0.5})
    print(sess.run(cross_entropy, feed_dict={xs: image_train, ys: label_train, keep_prob: 0.5}))
    test_accuracy = sess.run(accuracy, feed_dict={xs: image_test, ys: label_test, keep_prob: 1})
    print("step %d, testing accuracy %g" % (i, test_accuracy))
The commented-out w1, w2, w3 lines in the middle were just something I was experimenting with at the time; they have no particular meaning.
That basically completed every part of my network, so I first trained for 10 steps to see what would happen (in that experiment my learning rate was 0.1). I found that both training accuracy and test accuracy stopped changing after two steps. After some searching I learned that with ReLU activations the learning rate must not be too large, otherwise you get exactly this kind of behavior (dying ReLUs), sometimes along with a "kernel died, restarting" message partway through. For details, see this post, which covers the various activation functions: http://blog.csdn.net/cyh_24/article/details/50593400.
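Related to the learning rate: the complete code below contains a commented-out block for a decaying learning rate. For completeness, this is roughly how it would be wired up (same starting rate of 0.001 and the same decay settings as those commented lines, plus a global_step variable passed to the optimizer so the decay actually advances):

#sketch: exponentially decaying learning rate instead of a fixed one
global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(0.001, global_step,
                                           decay_steps=100, decay_rate=0.98, staircase=True)
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy,
                                                                       global_step=global_step)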
I'm still running this model as I write. I don't expect the accuracy to be very high: the network has too few layers and I'm not yet good at choosing parameters. I'll keep recording things as I improve it; if anything here is wrong, please point it out.
Complete code:
# -*- coding: utf-8 -*-
"""
Created on Mon Aug 14 09:38:53 2017

@author: Administrator
"""
import matplotlib.image as img
import tensorflow as tf
import numpy as np
sess = tf.InteractiveSession()

#image_train: training data
image_train = np.zeros((800, 112, 92))

for i in range(800):
    # path = ' H:\Python\train_sample\'
    m = str(i + 1)
    filename = "face" + m + '.bmp'
    image_train[i] = img.imread(filename)   #no TF session is needed just to read an image


#image_test: test data
image_test = np.zeros((1031, 112, 92))

for i in range(1031):
    m = str(i + 1)
    filename = r"H:\Python\test_sample\face" + m + '.bmp'
    image_test[i] = img.imread(filename)

#test label
label_test = np.ones((1031, 2))
for i in range(591):
    label_test[i, 0] = 0
    label_test[i, 1] = 1
for i in range(591, 1031):
    label_test[i, 0] = 1
    label_test[i, 1] = 0
#print(label_test[591][0])

#train label
label_train = np.ones((800, 2))
for i in range(400):
    label_train[i, 0] = 0
    label_train[i, 1] = 1
for i in range(400, 800):
    label_train[i, 0] = 1
    label_train[i, 1] = 0

#print(label_train[700,0])
#print(label_train[700,1])


#print(label_train)
#placeholder x: (input data)
xs = tf.placeholder(tf.float32, shape=[None, 112, 92])
ys = tf.placeholder(tf.float32, shape=[None, 2])
keep_prob = tf.placeholder(tf.float32)   #dropout keep probability
x_image = tf.reshape(xs, [-1, 112, 92, 1])   #reshape into [batch, height, width, channels]
#get w
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

#get bias
def biases_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

#convolutional layer
def conv2d(x, w):
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')

#pooling layer: the pooling window is 2x2, so the output height and width are half the input
def max_pool(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

#the first convolutional layer
w_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = biases_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)  #output size 112x92x32
h_pool1 = max_pool(h_conv1)  #output size 56x46x32


#the second convolutional layer: each 5x5 patch yields 64 features
w_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = biases_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)  #output size 56x46x64
h_pool2 = max_pool(h_conv2)  #output size 28x23x64


#the third convolutional layer
w_conv3 = weight_variable([5, 5, 64, 128])
b_conv3 = biases_variable([128])
h_conv3 = tf.nn.relu(conv2d(h_pool2, w_conv3) + b_conv3)
h_pool3 = max_pool(h_conv3)  #output size 14x12x128

#fully connected layer
w_fc1 = weight_variable([14, 12, 128, 1024])
b_fc1 = biases_variable([1024])
h_fc11 = tf.nn.relu(tf.nn.conv2d(h_pool3, w_fc1, strides=[1, 1, 1, 1], padding='VALID') + b_fc1)
h_fc1 = tf.reshape(h_fc11, [-1, 1024])

#prevent overfitting (dropout); keep_prob is the placeholder defined above
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)


#optional: a decaying learning rate
#global_step = tf.Variable(0)
#learning_rate = tf.train.exponential_decay(0.001, global_step, 100, 0.98, staircase=True)

#softmax layer
w_fc2 = weight_variable([1024, 2])
b_fc2 = biases_variable([2])
y_logits = tf.matmul(h_fc1_drop, w_fc2) + b_fc2   #raw pre-softmax outputs
y_out = tf.nn.softmax(y_logits)                   #class probabilities
#print(y_out)
#training and evaluation
#softmax_cross_entropy_with_logits applies softmax itself, so it is given the raw y_logits
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=ys, logits=y_logits))
#cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(y_out), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_out, 1), tf.argmax(ys, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.run(tf.global_variables_initializer())
#w1 = tf.zeros(shape = w_conv3.shape)
#w2 = tf.zeros(shape = w_conv3.shape)
#w3 = tf.zeros(shape = w_conv3.shape)
for i in range(1000):
    train_accuracy = sess.run(accuracy, feed_dict={xs: image_train, ys: label_train, keep_prob: 0.6})
    print("step %d, training accuracy %g" % (i, train_accuracy))
    sess.run(train_step, feed_dict={xs: image_train, ys: label_train, keep_prob: 0.5})
    print(sess.run(cross_entropy, feed_dict={xs: image_train, ys: label_train, keep_prob: 0.5}))
    test_accuracy = sess.run(accuracy, feed_dict={xs: image_test, ys: label_test, keep_prob: 1})
    print("step %d, testing accuracy %g" % (i, test_accuracy))