[Code attached] How to implement handwritten digit recognition with HOG + SVM
2022-07-19 15:09:00 【DeepDriving】

This article was first published on the WeChat official account 【DeepDriving】. Reply 【Handwritten digit recognition】 to the official account to get the link to the code for this article.
Preface
Handwritten digit recognition is a well-known entry-level image recognition project in machine learning and deep learning, and many people get their start in image recognition with it. Deep learning is now very popular in image recognition and has achieved remarkable results, but classical machine learning methods are undeniably still useful. This article shows how to recognize handwritten digits with a classic pipeline: extract HOG features, then classify them with an SVM.
Download the dataset
The MNIST dataset is used for handwritten digit recognition. It can be downloaded from the official website, http://yann.lecun.com/exdb/mnist/, or from the Graviti website, https://gas.graviti.cn/dataset/data-decorators/MNIST. The dataset consists of the following 4 compressed files:
- train-images-idx3-ubyte.gz: training set image data
- train-labels-idx1-ubyte.gz: training set label data
- t10k-images-idx3-ubyte.gz: test set image data
- t10k-labels-idx1-ubyte.gz: test set label data
The training set contains 60000 samples and the test set contains 10000 samples. Every sample is a 28x28 grayscale image. After downloading the dataset, we can visualize a few samples to see what they look like.
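The four files use the IDX binary format: a big-endian header followed by raw unsigned bytes. Below is a minimal sketch of how they could be read with NumPy. The function names load_idx_images and load_idx_labels are hypothetical helpers, not part of the article's code, and the magic numbers 2051 and 2049 are the ones defined for MNIST image and label files.

```python
import gzip
import struct

import numpy as np


def load_idx_images(path):
    """Read a gzip-compressed IDX3 image file into a (N, rows, cols) uint8 array."""
    with gzip.open(path, 'rb') as f:
        # Header: magic number, image count, rows, cols (big-endian uint32 each).
        magic, count, rows, cols = struct.unpack('>IIII', f.read(16))
        if magic != 2051:
            raise ValueError('not an IDX3 image file (magic=%d)' % magic)
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(count, rows, cols)


def load_idx_labels(path):
    """Read a gzip-compressed IDX1 label file into a (N,) uint8 array."""
    with gzip.open(path, 'rb') as f:
        # Header: magic number, label count (big-endian uint32 each).
        magic, count = struct.unpack('>II', f.read(8))
        if magic != 2049:
            raise ValueError('not an IDX1 label file (magic=%d)' % magic)
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(count)
```

With the downloaded files in the working directory, load_idx_images('train-images-idx3-ubyte.gz') should return an array of shape (60000, 28, 28), and any of its slices can be displayed with an image viewer of your choice.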

Extract HOG features
Unlike deep learning, classical machine learning requires us to extract and process features by hand, and then feed the processed features to a classifier for training or prediction. HOG (Histogram of Oriented Gradients) is a feature descriptor commonly used in computer vision and image processing. In OpenCV, we can call cv2.HOGDescriptor() to create a HOGDescriptor object:
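import cv2

def CreateHOGDescriptor():
    winSize = (28, 28)        # window size: the full 28x28 sample image
    blockSize = (14, 14)      # normalization block: 2x2 cells
    blockStride = (7, 7)      # adjacent blocks overlap by half a block
    cellSize = (7, 7)         # one gradient histogram per 7x7 cell
    nbins = 9                 # orientation bins per histogram
    derivAperture = 1
    winSigma = -1.            # negative value: OpenCV derives it from blockSize
    histogramNormType = 0     # 0 = L2-Hys block normalization
    L2HysThreshold = 0.2
    gammaCorrection = 1
    nlevels = 64
    signedGradient = True     # keep the sign of the gradient direction
    hog = cv2.HOGDescriptor(winSize, blockSize, blockStride, cellSize, nbins, derivAperture,
                            winSigma, histogramNormType, L2HysThreshold, gammaCorrection,
                            nlevels, signedGradient)
    return hog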
When creating the HOGDescriptor object, some parameters need to be set:
- winSize: set here to the size of the sample image.
- cellSize: determines the size of the extracted feature vector. The smaller the cellSize, the longer the feature vector.
- blockSize: blocks are mainly used to cope with illumination changes. A larger blockSize makes the algorithm less sensitive to local changes in the image. blockSize is usually set to 2*cellSize.
- blockStride: determines the overlap between adjacent blocks and controls the degree of contrast normalization. blockStride is usually set to 1/2 of blockSize.
- nbins: the number of bins in the gradient histogram. The HOG authors recommend 9, which captures unsigned gradients between 0 and 180 degrees in 20-degree increments. With signed gradients, as in the code above, the same 9 bins span 0 to 360 degrees in 40-degree increments.
- signedGradient: whether the gradient is signed or unsigned.
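How these parameters determine the length of the feature vector can be checked with a little arithmetic. The sketch below assumes the standard HOG layout (one histogram per cell, cells grouped into overlapping blocks). The helper name hog_descriptor_length is hypothetical and not part of OpenCV.

```python
def hog_descriptor_length(win_size, block_size, block_stride, cell_size, nbins):
    """Number of values in the HOG descriptor of a single window."""
    # Number of block positions along each axis of the window.
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    # Number of cells inside one block along each axis.
    cells_x = block_size[0] // cell_size[0]
    cells_y = block_size[1] // cell_size[1]
    # Every cell contributes one histogram with nbins values.
    return blocks_x * blocks_y * cells_x * cells_y * nbins


# Settings used in CreateHOGDescriptor(): 3x3 blocks, 2x2 cells per block, 9 bins.
length = hog_descriptor_length((28, 28), (14, 14), (7, 7), (7, 7), 9)
print(length)  # 324
```

In OpenCV the same number should be reported by the descriptor's getDescriptorSize() method, so each 28x28 sample is represented by 324 values instead of 784 raw pixels.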
After creating the HOGDescriptor object, call its compute() method to calculate the HOG features of an image.
Train the SVM model
In OpenCV, we can call cv2.ml.SVM_create() directly to create an SVM classifier model. When creating the model, some parameters need to be set, such as the SVM type, the kernel type, and the regularization coefficient.
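def InitSVM(C=12.5, gamma=0.50625):
    model = cv2.ml.SVM_create()
    model.setGamma(gamma)              # RBF kernel parameter
    model.setC(C)                      # regularization coefficient
    model.setKernel(cv2.ml.SVM_RBF)    # kernel type
    model.setType(cv2.ml.SVM_C_SVC)    # C-support vector classification
    return model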
Choosing good SVM hyperparameters is difficult. Fortunately, OpenCV provides the trainAuto() function, which searches for the optimal parameters using K-fold cross-validation. After creating the model, we can call this function to train it.
def TrainSVM(model, samples, responses, kFold=10):
    # samples: one feature vector per row; responses: the class labels
    model.trainAuto(samples, cv2.ml.ROW_SAMPLE, responses, kFold)
    return model
Because K-fold cross-validation has to be performed, training the model with trainAuto() takes a long time. If you don't want to spend that much time on training, you can reduce the number of folds K, or call train() directly instead of this function.
After training, we can save the HOG descriptor and the SVM model to XML files for later use.
svm_model.save('svm.xml')
hog.save('hog_descriptor.xml')
Test the model
After training the model, we can measure its accuracy on the test set. First extract the HOG features of each image in the test set, then feed the features into the SVM model for prediction and calculate the accuracy.
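import numpy as np

def EvaluateSVM(model, samples, labels):
    # SVMPredict is a helper that is not shown in this article; it is expected
    # to return one predicted label per sample.
    predictions = SVMPredict(model, samples)
    accuracy = (labels == predictions).mean()
    print('Accuracy: %.2f %%' % (accuracy*100))

    # Rows are the true labels, columns are the predicted labels.
    confusion = np.zeros((10, 10), np.int32)
    for i, j in zip(labels, predictions):
        confusion[int(i), int(j)] += 1
    print('confusion matrix:')
    print(confusion)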
The model I trained reaches an accuracy of 99.46% on the test set. The confusion matrix is as follows:
confusion matrix:
[[ 978 0 0 0 0 0 1 1 0 0]
[ 0 1132 1 0 0 0 1 0 1 0]
[ 1 0 1027 0 0 0 0 4 0 0]
[ 0 0 2 1006 0 1 0 0 1 0]
[ 0 0 0 0 976 0 1 0 0 5]
[ 1 0 0 2 0 888 1 0 0 0]
[ 4 2 1 0 0 1 949 0 1 0]
[ 0 2 3 0 0 0 0 1022 0 1]
[ 2 0 0 1 0 1 0 1 967 2]
[ 0 1 0 1 3 0 0 1 2 1001]]
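The confusion matrix also gives per-digit figures, not only the overall accuracy. A small sketch of that calculation with NumPy, assuming (as the row sums suggest) that rows are true labels and columns are predictions. The values are copied from the matrix above.

```python
import numpy as np

confusion = np.array([
    [978, 0, 0, 0, 0, 0, 1, 1, 0, 0],
    [0, 1132, 1, 0, 0, 0, 1, 0, 1, 0],
    [1, 0, 1027, 0, 0, 0, 0, 4, 0, 0],
    [0, 0, 2, 1006, 0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 976, 0, 1, 0, 0, 5],
    [1, 0, 0, 2, 0, 888, 1, 0, 0, 0],
    [4, 2, 1, 0, 0, 1, 949, 0, 1, 0],
    [0, 2, 3, 0, 0, 0, 0, 1022, 0, 1],
    [2, 0, 0, 1, 0, 1, 0, 1, 967, 2],
    [0, 1, 0, 1, 3, 0, 0, 1, 2, 1001],
])

correct = np.diag(confusion)
accuracy = correct.sum() / confusion.sum()
recall = correct / confusion.sum(axis=1)      # per true digit
precision = correct / confusion.sum(axis=0)   # per predicted digit

print('accuracy: %.4f' % accuracy)  # accuracy: 0.9946
for digit in range(10):
    print('%d: recall %.4f, precision %.4f' % (digit, recall[digit], precision[digit]))
```

For this matrix the lowest recall belongs to the digit 6, with 949 of 958 samples classified correctly.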
Using the model
After training a model, we naturally want to apply it to real-world problems. Now that we have a handwritten digit recognition model, let's have it recognize some handwritten digits and see how well it works.
First, take a sheet of white paper and write some digits on it. Then use image processing to extract the region of each digit on the paper, and run the HOG feature extraction and SVM classification described above on each region. Here are some of my test results:

As the figure above shows, the recognition accuracy for the columns 0~8 is fairly high, but every digit in the 9 column is recognized as 7. Perhaps the way I write "9" looks more like the 7s in the training set and differs considerably from its 9s. Interested readers can try it themselves.
