
[Code attached] How to implement handwritten digit recognition with HOG+SVM

2022-07-19 15:09:00 【DeepDriving】


This article was first published on the WeChat official account 【DeepDriving】. Reply 【Handwritten digit recognition】 to the official account to get the link to the code for this article.

Preface

Handwritten digit recognition is a well-known entry-level image recognition project in machine learning and deep learning, and many people enter the field of image recognition through it. Although deep learning has become the dominant approach in image recognition and has achieved remarkable results, classical machine learning methods are still useful. This article shows how to implement handwritten digit recognition with the classic approach of extracting HOG features and classifying them with an SVM.

Downloading the dataset

The MNIST dataset is used for handwritten digit recognition. It can be downloaded from the official website: http://yann.lecun.com/exdb/mnist/, or from the Graviti website: https://gas.graviti.cn/dataset/data-decorators/MNIST. The dataset consists of the following 4 compressed files:

train-images-idx3-ubyte.gz: Training set image data
train-labels-idx1-ubyte.gz: Training set label data
t10k-images-idx3-ubyte.gz: Test set image data
t10k-labels-idx1-ubyte.gz: Test set label data

The training set contains 60000 samples and the test set contains 10000 samples; each sample is a 28x28 grayscale image. After downloading the dataset, we can visualize a few samples to see what they look like.
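The four files use the IDX binary format: a big-endian header followed by raw unsigned bytes. The following is a minimal sketch of a loader, not the code from the original project. The function names `load_idx_images` and `load_idx_labels` are illustrative, and the files are assumed to be in the working directory.

```python
import gzip
import struct

import numpy as np


def load_idx_images(path):
    """Read a gzip-compressed IDX3 image file into an (N, rows, cols) uint8 array."""
    with gzip.open(path, 'rb') as f:
        # Header: magic number, image count, rows, cols (big-endian uint32 each)
        magic, count, rows, cols = struct.unpack('>IIII', f.read(16))
        if magic != 2051:
            raise ValueError('not an IDX3 image file: magic=%d' % magic)
        data = np.frombuffer(f.read(count * rows * cols), dtype=np.uint8)
    return data.reshape(count, rows, cols)


def load_idx_labels(path):
    """Read a gzip-compressed IDX1 label file into an (N,) uint8 array."""
    with gzip.open(path, 'rb') as f:
        # Header: magic number, label count (big-endian uint32 each)
        magic, count = struct.unpack('>II', f.read(8))
        if magic != 2049:
            raise ValueError('not an IDX1 label file: magic=%d' % magic)
        return np.frombuffer(f.read(count), dtype=np.uint8)
```

With the real files, `load_idx_images('train-images-idx3-ubyte.gz')` should give an array of shape (60000, 28, 28), and `load_idx_labels('train-labels-idx1-ubyte.gz')` an array of 60000 labels.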

[Figure: a few sample images from the MNIST dataset]

Extracting HOG features

Unlike deep learning, classical machine learning requires us to extract and process features by hand and then feed them to a classifier for training or prediction. HOG (Histogram of Oriented Gradients) is a feature descriptor commonly used in computer vision and image processing. In OpenCV, we can call cv2.HOGDescriptor() to create a HOGDescriptor object:

import cv2

def CreateHOGDescriptor():
    winSize = (28, 28)       # one whole MNIST image per window
    blockSize = (14, 14)     # 2 * cellSize
    blockStride = (7, 7)     # 1/2 of blockSize
    cellSize = (7, 7)
    nbins = 9                # orientation bins per cell histogram
    derivAperture = 1
    winSigma = -1.
    histogramNormType = 0
    L2HysThreshold = 0.2
    gammaCorrection = 1
    nlevels = 64
    signedGradient = True    # keep the sign of the gradient orientation

    hog = cv2.HOGDescriptor(winSize, blockSize, blockStride, cellSize, nbins, derivAperture,
                            winSigma, histogramNormType, L2HysThreshold, gammaCorrection, nlevels, signedGradient)
    return hog

Several parameters need to be set when creating the HOGDescriptor object:

winSize: Set here to the size of the sample images.
cellSize: Determines the size of the extracted feature vector; the smaller the cellSize, the larger the feature vector.
blockSize: Blocks are mainly used to handle illumination changes. A larger blockSize makes the algorithm less sensitive to local changes in the image. blockSize is usually set to 2*cellSize.
blockStride: Determines the overlap between adjacent blocks and controls the degree of contrast normalization. blockStride is usually set to 1/2 of blockSize.
nbins: The number of bins in the gradient histogram. The HOG authors recommend 9, which covers gradients between 0 and 180 degrees in 20-degree increments.
signedGradient: Whether the gradients are signed or unsigned.
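To make the effect of these parameters on the feature vector size concrete, here is a small arithmetic sketch. The helper name `hog_descriptor_length` is hypothetical; it counts the blocks that fit in the window, the cells in each block, and the bins in each cell.

```python
def hog_descriptor_length(win_size, block_size, block_stride, cell_size, nbins):
    """Number of values in the HOG descriptor of a single window."""
    # How many block positions fit horizontally and vertically
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    # Each block holds a grid of cells, each cell one histogram of nbins values
    cells_per_block = (block_size[0] // cell_size[0]) * (block_size[1] // cell_size[1])
    return blocks_x * blocks_y * cells_per_block * nbins


# Settings used in CreateHOGDescriptor(): 3 x 3 blocks, 4 cells each, 9 bins
print(hog_descriptor_length((28, 28), (14, 14), (7, 7), (7, 7), 9))  # 324
```

With the settings used in this article, each 28x28 image is therefore described by 324 values.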

After creating the HOGDescriptor object, call its compute() method to compute the HOG features of an image.

Training the SVM model

In OpenCV, we can call cv2.ml.SVM_create() directly to create an SVM classifier. Some parameters need to be set when creating the model, such as the SVM type, the kernel type, and the regularization coefficient.

def InitSVM(C=12.5, gamma=0.50625):
    model = cv2.ml.SVM_create()
    model.setGamma(gamma)
    model.setC(C)
    model.setKernel(cv2.ml.SVM_RBF)
    model.setType(cv2.ml.SVM_C_SVC)
    return model

Choosing good SVM hyperparameters is difficult, but fortunately OpenCV provides a trainAuto() function that searches for the optimal parameters using K-fold cross-validation. After creating the model, we can call this function to train it.

def TrainSVM(model, samples, responses, kFold=10):
    model.trainAuto(samples, cv2.ml.ROW_SAMPLE, responses, kFold)
    return model
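As a rough picture of what K-fold cross-validation involves: the samples are split into K folds, and each candidate parameter set is trained K times, each time holding out a different fold for validation. The sketch below only illustrates the splitting in NumPy. It is not OpenCV's internal code, and `kfold_indices` is a hypothetical helper.

```python
import numpy as np


def kfold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) index arrays for K-fold cross-validation."""
    rng = np.random.default_rng(seed)
    # Shuffle the sample indices once, then cut them into k folds
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx


for train_idx, val_idx in kfold_indices(n_samples=20, k=5):
    print(len(train_idx), len(val_idx))  # 16 4 on every iteration
```

Every parameter combination costs K training runs, which is why the search grows expensive on 60000 samples.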

Because K-fold cross-validation has to be performed, training with trainAuto() takes a long time. If you don't want to spend that much time, you can reduce the value of K, or train the model with the train() function directly instead.

After training, we can save the SVM model and the HOG descriptor to XML files for later use.

svm_model.save('svm.xml')
hog.save('hog_descriptor.xml')

Testing the model

After training the model, we can measure its accuracy on the test set. First extract the HOG features of every image in the test set, then feed the features to the SVM model for prediction and compute the accuracy.

def EvaluateSVM(model, samples, labels):
    predictions = SVMPredict(model, samples)
    accuracy = (labels == predictions).mean()
    print('Accuracy: %.2f %%' % (accuracy*100))

    confusion = np.zeros((10, 10), np.int32)
    for i, j in zip(labels, predictions):
        confusion[int(i), int(j)] += 1
    print('confusion matrix:')
    print(confusion)
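EvaluateSVM() relies on an SVMPredict() helper that is not listed in this article. A plausible minimal version is sketched below, assuming the model is a cv2.ml SVM whose predict() returns a (retval, results) pair with results as an N x 1 float32 column:

```python
import numpy as np


def SVMPredict(model, samples):
    """Return the predicted labels as a flat 1-D array, one entry per input row."""
    _, results = model.predict(samples)
    # Flatten the N x 1 column so it lines up with a 1-D labels array
    return results.ravel()
```

The flattening matters: comparing an N x 1 column with a 1-D labels array would broadcast to an N x N matrix and give a wrong accuracy.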

The model I trained reaches 99.46% accuracy on the test set. The confusion matrix is as follows:

confusion matrix:
[[ 978    0    0    0    0    0    1    1    0    0]
 [   0 1132    1    0    0    0    1    0    1    0]
 [   1    0 1027    0    0    0    0    4    0    0]
 [   0    0    2 1006    0    1    0    0    1    0]
 [   0    0    0    0  976    0    1    0    0    5]
 [   1    0    0    2    0  888    1    0    0    0]
 [   4    2    1    0    0    1  949    0    1    0]
 [   0    2    3    0    0    0    0 1022    0    1]
 [   2    0    0    1    0    1    0    1  967    2]
 [   0    1    0    1    3    0    0    1    2 1001]]
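The confusion matrix also gives a per-digit view. In EvaluateSVM() the rows are indexed by the true label and the columns by the prediction, so the diagonal divided by the row sums is the fraction of each digit that was recognized correctly. A short sketch using the numbers above:

```python
import numpy as np

# Confusion matrix reported above: rows are true digits, columns are predictions
confusion = np.array([
    [978, 0, 0, 0, 0, 0, 1, 1, 0, 0],
    [0, 1132, 1, 0, 0, 0, 1, 0, 1, 0],
    [1, 0, 1027, 0, 0, 0, 0, 4, 0, 0],
    [0, 0, 2, 1006, 0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 976, 0, 1, 0, 0, 5],
    [1, 0, 0, 2, 0, 888, 1, 0, 0, 0],
    [4, 2, 1, 0, 0, 1, 949, 0, 1, 0],
    [0, 2, 3, 0, 0, 0, 0, 1022, 0, 1],
    [2, 0, 0, 1, 0, 1, 0, 1, 967, 2],
    [0, 1, 0, 1, 3, 0, 0, 1, 2, 1001],
])

# Overall accuracy: correct predictions (diagonal) over all samples
accuracy = np.trace(confusion) / confusion.sum()
# Per-digit recall: correct predictions over the number of true samples per row
recall = np.diag(confusion) / confusion.sum(axis=1)

print('Accuracy: %.2f %%' % (accuracy * 100))
for digit, r in enumerate(recall):
    print('digit %d: %.2f %%' % (digit, r * 100))
```

The diagonal sums to 9946 out of 10000 samples, which matches the reported 99.46%. The digit with the lowest recall is 6, with 949 of 958 samples recognized correctly.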

Using the model

After training a model, we naturally want to apply it to real problems. Now that we have a handwritten digit recognition model, let's have it recognize some handwritten digits and see how well it works.

First, write some digits on a sheet of white paper, then use image processing to extract the region of each digit, and finally run the HOG feature extraction + SVM classification pipeline described above. Here are some of my test results:

[Figure: recognition results for the digits written on paper]

As the figure above shows, recognition accuracy is fairly high for the columns 0 to 8, but every digit in the 9 column is recognized as 7. Perhaps the 9s I wrote look more like the 7s in the training set and differ considerably from its 9s. Interested readers can try it themselves.

References

https://towardsdatascience.com/mnist-handwritten-digits-classification-from-scratch-using-python-numpy-b08e401c4dab
https://learnopencv.com/handwritten-digits-classification-an-opencv-c-python-tutorial/
https://opencv24-python-tutorials.readthedocs.io/en/latest/py_tutorials/py_ml/py_svm/py_svm_opencv/py_svm_opencv.html

You are welcome to follow my official account 【DeepDriving】, where I share content on computer vision, machine learning, deep learning, autonomous driving, and related fields from time to time.
