Gesture Recognition Using a CNN (Convolutional Neural Network) and OpenCV

taotao_2016 2019-03-23

To build an SLR (sign language recognition) system, we need three things:

A machine learning dataset
A machine learning model (we will use a CNN)
A platform to apply the model on (we will use OpenCV)

1) Dataset

The gesture dataset can be downloaded here (https://www./datamunge/sign-language-mnist).

Our machine learning dataset contains many images for 24 letters of the American Sign Language alphabet (J and Z are excluded). Each image is 28x28 pixels, which means each image has 784 pixels in total.
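The numbers above can be checked with a few lines of Python. This is only a small sketch; the names `IMAGE_SIDE`, `letters` and `pixels_per_image` are illustrative and do not come from the dataset itself.

```python
import string

IMAGE_SIDE = 28  # each image is 28x28 pixels

# The 24 letters in the dataset: the alphabet without J and Z.
letters = [c for c in string.ascii_uppercase if c not in ("J", "Z")]

# One flattened image holds side * side pixel values.
pixels_per_image = IMAGE_SIDE * IMAGE_SIDE

print(len(letters))       # 24
print(pixels_per_image)   # 784
```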

Loading the machine learning dataset

To load the dataset, use the following Python code:

import numpy as np
import pandas as pd
import cv2
from matplotlib import pyplot as plt
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout
from keras.optimizers import SGD

# Each CSV row holds one label followed by the pixel values of one image.
train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')

# Separate the labels from the pixel data.
y_train = train['label'].values
y_test = test['label'].values
X_train = train.drop(['label'], axis=1)
X_test = test.drop(['label'], axis=1)

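Our dataset is in CSV (comma-separated values) format. X_train and X_test contain the value of each pixel. y_train and y_test contain the image labels. You can inspect the dataset with the following Python code:

# display() is available in Jupyter/IPython notebooks.
# info() prints its summary directly, so it does not need display().
X_train.info()
X_test.info()
display(X_train.head(n=2))
display(X_test.head(n=2))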

Preprocessing

X_train and X_test contain arrays of all the pixel values. We create an image from these values. Our images are 28x28, so we must split each array into 28x28 pixel groups. To do this, we use the following code:

# Reshape each row of 784 pixel values into a 28x28 image.
X_train = np.array(X_train).reshape(-1, 28, 28)
X_test = np.array(X_test).reshape(-1, 28, 28)

# One-hot encode the integer labels.
num_classes = 26
y_train = np.array(y_train).reshape(-1)
y_test = np.array(y_test).reshape(-1)
y_train = np.eye(num_classes)[y_train]
y_test = np.eye(num_classes)[y_test]

# Add the channel dimension expected by Conv2D.
X_train = X_train.reshape((27455, 28, 28, 1))
X_test = X_test.reshape((7172, 28, 28, 1))
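To see what the reshape and the one-hot encoding do without downloading the dataset, here is a minimal sketch on synthetic data. The arrays `flat_pixels` and `labels` are made-up stand-ins for the CSV contents, not real samples.

```python
import numpy as np

# Synthetic stand-in for the CSV contents: 5 rows of 784 pixel values each.
rng = np.random.default_rng(0)
flat_pixels = rng.integers(0, 256, size=(5, 784))
labels = np.array([0, 1, 2, 24, 10])  # made-up integer class labels

# Reshape every row into a 28x28 image with one channel.
# Passing -1 lets NumPy infer the number of images.
images = flat_pixels.reshape(-1, 28, 28, 1)

# One-hot encode: row i of the identity matrix is the vector for class i.
num_classes = 26
one_hot = np.eye(num_classes)[labels]

print(images.shape)    # (5, 28, 28, 1)
print(one_hot.shape)   # (5, 26)
```

Using -1 instead of a hard-coded image count keeps the same code working if the number of rows in the CSV files changes.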

Now we can use this dataset to train our machine learning model.

2) Building and training the model

We will use a CNN (convolutional neural network) to recognize the letters. We use Keras.

The Python implementation of the machine learning model is as follows:

classifier = Sequential()
# First convolution block: 8 filters on the 28x28 grayscale input.
classifier.add(Conv2D(filters=8, kernel_size=(3, 3), strides=(1, 1), padding='same',
                      input_shape=(28, 28, 1), activation='relu',
                      data_format='channels_last'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
# Second convolution block, with Dropout for regularization.
classifier.add(Conv2D(filters=16, kernel_size=(3, 3), strides=(1, 1), padding='same',
                      activation='relu'))
classifier.add(Dropout(0.5))
classifier.add(MaxPooling2D(pool_size=(4, 4)))
# Flatten the feature maps before the fully connected layers.
classifier.add(Flatten())
classifier.add(Dense(128, activation='relu'))
classifier.add(Dense(26, activation='softmax'))

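Our model consists of Conv2D and MaxPooling layers, followed by some fully connected (Dense) layers.

The first Conv2D (convolution) layer takes an input image of shape (28, 28, 1). The last fully connected layer gives us an output for 26 letters.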
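The effect of the convolution and pooling layers on the image shape can be traced by hand. The sketch below uses hypothetical helper functions (`conv_same`, `max_pool`, `conv_params`) and assumes the Keras defaults for MaxPooling2D, that is, a stride equal to the pool size and 'valid' padding.

```python
def conv_same(h, w, filters):
    # 'same' padding with stride 1 keeps height and width unchanged.
    return h, w, filters

def max_pool(h, w, c, pool):
    # With stride == pool size and 'valid' padding, leftover border
    # rows and columns are dropped (floor division).
    return h // pool, w // pool, c

def conv_params(kernel, in_channels, filters):
    # kernel x kernel x in_channels weights per filter, plus one bias per filter.
    return kernel * kernel * in_channels * filters + filters

shape = (28, 28, 1)
shape = conv_same(shape[0], shape[1], 8)    # (28, 28, 8)
shape = max_pool(*shape, pool=2)            # (14, 14, 8)
shape = conv_same(shape[0], shape[1], 16)   # (14, 14, 16)
shape = max_pool(*shape, pool=4)            # (3, 3, 16)

print(shape)                                        # (3, 3, 16)
print(conv_params(3, 1, 8), conv_params(3, 8, 16))  # 80 1168
```

Note that the 4x4 pooling does not divide 14 evenly, so the last two rows and columns of each feature map are discarded.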

We use Dropout after the second Conv2D layer to regularize training.
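As a rough illustration of what a Dropout layer does during training, here is a NumPy sketch of "inverted" dropout. The function `dropout` is written for this example and is not the Keras implementation; it only shows the idea of zeroing a random fraction of activations and rescaling the rest.

```python
import numpy as np

def dropout(x, rate, rng):
    # Keep each activation with probability (1 - rate) ...
    keep_mask = rng.random(x.shape) >= rate
    # ... and scale the survivors by 1 / (1 - rate) so the expected value is unchanged.
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(42)
activations = np.ones((1000, 16))
dropped = dropout(activations, rate=0.5, rng=rng)

print(np.unique(dropped))   # only 0.0 and 2.0 occur
print(dropped.mean())       # close to 1.0
```

At prediction time dropout is switched off, so all activations pass through unchanged.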

We use the softmax activation function in the last layer.
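Softmax turns the raw scores of the last layer into probabilities that sum to 1, so the largest score becomes the most probable class. A minimal NumPy sketch, with made-up scores for three classes instead of 26:

```python
import numpy as np

def softmax(logits):
    # Subtracting the maximum first improves numerical stability
    # and does not change the result.
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.1])  # hypothetical raw scores for three classes
probs = softmax(logits)

print(probs.round(3))
print(int(np.argmax(probs)))  # 0, the class with the largest score
```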

In the end, our model looks like this:

[Figure: model summary]

We have to compile and fit the machine learning model. To do this, we use the following Python code:

classifier.compile(optimizer='SGD', loss='categorical_crossentropy', metrics=['accuracy'])
classifier.fit(X_train, y_train, epochs=50, batch_size=100)

We compile our model with the SGD optimizer. You can also reduce the number of epochs to 25.
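To get a feeling for what the epoch count means, the sketch below counts the weight updates for the training set size and batch size used here. The `sgd_step` function is a simplified illustration of a single plain SGD update; the learning rate of 0.01 is an assumption, not a value given in this article.

```python
import math

num_samples = 27455   # training images
batch_size = 100

# The last batch of an epoch is smaller, so round up.
steps_per_epoch = math.ceil(num_samples / batch_size)

for epochs in (50, 25):
    print(epochs, "epochs ->", steps_per_epoch * epochs, "weight updates")

def sgd_step(weight, gradient, learning_rate=0.01):
    # Plain SGD: move against the gradient, scaled by the learning rate.
    return weight - learning_rate * gradient

print(sgd_step(1.0, 0.5))
```

Halving the epochs halves the number of updates and the training time, at the possible cost of some accuracy.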

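Finally, to check the accuracy:

# evaluate() returns [loss, accuracy] because 'accuracy' was passed as a metric.
accuracy = classifier.evaluate(x=X_test, y=y_test, batch_size=32)
print('Accuracy: ', accuracy[1])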
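The accuracy metric is simply the fraction of images whose most probable class matches the true label. A small NumPy sketch with made-up predictions for 4 images and 3 classes:

```python
import numpy as np

# Hypothetical softmax outputs for 4 images and 3 classes.
pred_probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
# One-hot ground truth for the same 4 images.
true_one_hot = np.eye(3)[[0, 1, 2, 1]]

# Compare the most probable class with the true class for each image.
pred_classes = pred_probs.argmax(axis=1)
true_classes = true_one_hot.argmax(axis=1)
accuracy = float((pred_classes == true_classes).mean())

print(accuracy)  # 0.75
```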

To save the trained machine learning model, we can use:

classifier.save('CNNmodel.h5')

3）OpenCV

The following Python implementation is an example; adjust it to your own needs.

Import the Python libraries and load the model

import cv2
import numpy as np
from keras.models import load_model

model = load_model('CNNmodel.h5')

Helper functions

def crop_image(image, x, y, width, height):
    # Rows are indexed by y, columns by x.
    return image[y:y + height, x:x + width]

# Class indices of the letters this example reports (0 = A, 1 = B, ...).
LETTERS = {0: 'A', 1: 'B', 2: 'C', 3: 'D', 8: 'I', 11: 'L', 14: 'O',
           20: 'U', 21: 'V', 22: 'W', 24: 'Y'}

def prediction(pred):
    # Print the letter for a predicted class; other classes are ignored.
    if pred in LETTERS:
        print(LETTERS[pred])

def keras_process_image(img):
    image_x = 28
    image_y = 28
    # cv2.resize expects the target size as (width, height).
    img = cv2.resize(img, (image_x, image_y), interpolation=cv2.INTER_AREA)
    return img
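One thing to watch with slice-based cropping is that NumPy silently clips a slice that runs past the edge of the frame. The sketch below assumes a 640x480 webcam frame, which is a common default but not something this article specifies; the zero-filled array stands in for a real frame.

```python
import numpy as np

def crop_image(image, x, y, width, height):
    # Rows are indexed by y, columns by x.
    return image[y:y + height, x:x + width]

# Stand-in for a 640x480 webcam frame (480 rows, 640 columns, 3 channels).
frame = np.zeros((480, 640, 3), dtype=np.uint8)

# Asking for rows 300..599 on a 480-row frame yields only rows 300..479.
roi = crop_image(frame, 300, 300, 300, 300)
print(roi.shape)     # (180, 300, 3)

# Starting higher up keeps the whole 300x300 region inside the frame.
roi_ok = crop_image(frame, 300, 100, 300, 300)
print(roi_ok.shape)  # (300, 300, 3)
```

If your camera delivers frames smaller than 600x600, choose the region of interest so that it fits inside the frame, otherwise the crop will not be square.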

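Prediction

We have to predict the letter from the input image. Our model outputs an integer rather than a letter, because the labels are given as integers (A is 0, B is 1, C is 2, and so on).

def keras_predict(model, image):
    data = np.asarray(image, dtype='int32')
    # predict() returns one row of class probabilities per input image.
    pred_probab = model.predict(data)[0]
    # The predicted class is the index of the highest probability.
    pred_class = int(np.argmax(pred_probab))
    return max(pred_probab), pred_class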
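Converting the predicted integer back into a letter can be done with character arithmetic. This sketch assumes zero-based labels (0 for A), as used by the prediction helper; `label_to_letter` is a hypothetical name introduced for this example.

```python
def label_to_letter(label):
    # Assumes zero-based labels: 0 -> 'A', 1 -> 'B', ..., 25 -> 'Z'.
    if not 0 <= label <= 25:
        raise ValueError(f"label out of range: {label}")
    return chr(ord("A") + label)

print([label_to_letter(i) for i in (0, 1, 2, 11, 24)])  # ['A', 'B', 'C', 'L', 'Y']
```

Unlike an explicit if/elif chain, this covers every class index, including letters the chain does not list.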

Creating the window

We have to create a window to take input from our webcam. The image we feed to the model should be a 28x28 grayscale image, because we trained our model on 28x28 images. Example code:

def main():
    # Open the webcam once, before the loop.
    cam_capture = cv2.VideoCapture(0)
    while True:
        ok, image_frame = cam_capture.read()
        if not ok:
            break
        # Select the ROI (region of interest).
        im2 = crop_image(image_frame, 300, 300, 300, 300)
        image_grayscale = cv2.cvtColor(im2, cv2.COLOR_BGR2GRAY)
        image_grayscale_blurred = cv2.GaussianBlur(image_grayscale, (15, 15), 0)
        im3 = cv2.resize(image_grayscale_blurred, (28, 28), interpolation=cv2.INTER_AREA)
        # Add the channel and batch dimensions: (28, 28) -> (1, 28, 28, 1).
        im4 = np.reshape(im3, (28, 28, 1))
        im5 = np.expand_dims(im4, axis=0)
        pred_probab, pred_class = keras_predict(model, im5)
        prediction(pred_class)
        # Display the cropped image and the blurred grayscale image.
        cv2.imshow('Image2', im2)
        cv2.imshow('Image3', image_grayscale_blurred)
        # Press 'q' to quit.
        if cv2.waitKey(25) & 0xFF == ord('q'):
            break
    # Release the camera and close the windows on exit.
    cam_capture.release()
    cv2.destroyAllWindows()

# Run one prediction on a blank image before starting the loop.
keras_predict(model, np.zeros((1, 28, 28, 1), dtype=np.uint8))

if __name__ == '__main__':
    main()
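The shape handling in the loop can be tried without a camera or OpenCV. The following NumPy-only sketch is an approximation: block averaging stands in for cv2.resize with INTER_AREA and will not give identical pixel values, and `to_model_input` is a hypothetical helper, not part of the code above.

```python
import numpy as np

def to_model_input(roi_bgr, side=28):
    # Grayscale via the usual luminance weights, in BGR channel order.
    gray = roi_bgr @ np.array([0.114, 0.587, 0.299])
    # Block-average down to side x side (ROI size must be a multiple of side).
    h, w = gray.shape
    small = gray.reshape(side, h // side, side, w // side).mean(axis=(1, 3))
    # Add the batch axis and the channel axis: (28, 28) -> (1, 28, 28, 1).
    return small[np.newaxis, :, :, np.newaxis]

roi = np.full((280, 280, 3), 255, dtype=np.uint8)  # hypothetical all-white ROI
batch = to_model_input(roi)

print(batch.shape)  # (1, 28, 28, 1)
```

The leading 1 is the batch size: Keras models expect a batch of images, even when predicting on a single frame.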

Our machine learning model reaches an accuracy of about 94%, so it should recognize the letters without much trouble.
