Neural Networks and Deep Learning 6: Support Vector Machines 1 (PyTorch)
2022-07-26 09:25:00 【Bai Xiaosheng in Ming Dynasty】
Preface
SVM (support vector machine) is a classic binary classification model in machine learning.
It is defined as the linear classifier with the largest margin in feature space.
Topics: linear and nonlinear support vector machines, hard margin and soft margin, and SMO (the sequential minimal optimization algorithm).
Reference: 《统计学习方法》 (Statistical Learning Methods).
See also: "On positive definite and positive semidefinite matrices" (Zhihu).
Note that a Gram matrix is always positive semidefinite.
Contents
- Linearly separable support vector machines and hard-margin maximization
- Linear support vector machines and soft-margin maximization
- Nonlinear support vector machines and kernel functions
1. Linearly separable support vector machines and hard-margin maximization

1.1 Definition of the linearly separable support vector machine
Given a linearly separable training data set, maximizing the margin (or, equivalently, solving the corresponding convex quadratic programming problem) yields the separating hyperplane

$$w^{*} \cdot x + b^{*} = 0$$

and the corresponding classification decision function

$$f(x) = \operatorname{sign}(w^{*} \cdot x + b^{*})$$
1.2 Functional margin and geometric margin

The distance from a point to the hyperplane indicates how confident the classification prediction is.

Functional margin: the functional margin of a hyperplane $(w, b)$ with respect to a sample point $(x_i, y_i)$ is defined as

$$\hat{\gamma}_i = y_i (w \cdot x_i + b)$$

and the functional margin with respect to the whole training set is the minimum over all sample points (i.e. the point closest to the hyperplane):

$$\hat{\gamma} = \min_{i=1,\dots,N} \hat{\gamma}_i$$

Geometric margin: the geometric margin of a hyperplane $(w, b)$ with respect to a sample point $(x_i, y_i)$ is defined as

$$\gamma_i = y_i \left( \frac{w}{\|w\|} \cdot x_i + \frac{b}{\|w\|} \right), \qquad \gamma = \min_{i=1,\dots,N} \gamma_i$$

When $\|w\| = 1$, the functional margin and the geometric margin are equal.
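As a quick numeric illustration (the hyperplane and sample point below are assumed toy values, not from the original article), the two margins differ only by the factor $\|w\|$:

```python
import numpy as np

# Toy check of functional vs. geometric margin; w, b, x, y are assumed values.
w, b = np.array([1.0, 2.0]), -3.0
x, y = np.array([3.0, 3.0]), 1.0

functional = y * (w @ x + b)                  # y (w . x + b) = 6.0
geometric = functional / np.linalg.norm(w)    # divide by ||w|| = sqrt(5)
print(functional, geometric)                  # 6.0  2.683...
```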
1.3 Margin maximization

Finding the hyperplane with the largest geometric margin for the training data set means not only separating the positive and negative examples, but also separating the hardest (closest) instance points with sufficient confidence.

Margin maximization: the learning problem of the linearly separable SVM is

$$\max_{w,b} \ \gamma \qquad \text{s.t.} \quad y_i \left( \frac{w}{\|w\|} \cdot x_i + \frac{b}{\|w\|} \right) \ge \gamma, \quad i = 1, \dots, N$$

This is equivalent to the convex quadratic programming problem

$$\min_{w,b} \ \frac{1}{2}\|w\|^2 \qquad \text{s.t.} \quad y_i (w \cdot x_i + b) - 1 \ge 0, \quad i = 1, \dots, N$$

The objective $f(w) = \frac{1}{2}\|w\|^2$ is a convex function, and the constraints are affine functions, so this is a convex quadratic program.
1.4 The maximum-margin algorithm

Input: a linearly separable training data set $T = \{(x_1, y_1), \dots, (x_N, y_N)\}$ with $y_i \in \{-1, +1\}$
Output: the maximum-margin separating hyperplane and the classification decision function

Step 1: construct and solve the constrained optimization problem

$$\min_{w,b} \ \frac{1}{2}\|w\|^2 \qquad \text{s.t.} \quad y_i (w \cdot x_i + b) - 1 \ge 0, \quad i = 1, \dots, N$$

obtaining the solution $w^{*}, b^{*}$.
Step 2: the separating hyperplane is $w^{*} \cdot x + b^{*} = 0$.
Step 3: the classification decision function is $f(x) = \operatorname{sign}(w^{*} \cdot x + b^{*})$.
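As a minimal sketch of Step 1 (assuming the cvxpy library as a generic convex-QP solver; the three points are an assumed toy data set), the primal problem can be solved directly. For these particular points the optimum works out to $w^{*} = (0.5, 0.5)$, $b^{*} = -2$:

```python
import cvxpy as cp
import numpy as np

# Assumed toy data: two positive points and one negative point in R^2.
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1.0, 1.0, -1.0])

w = cp.Variable(2)
b = cp.Variable()
# min (1/2)||w||^2  subject to  y_i (w . x_i + b) >= 1 for all i
prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w)),
                  [cp.multiply(y, X @ w + b) >= 1])
prob.solve()
print("w* =", w.value, "b* =", b.value)   # hyperplane: w* . x + b* = 0
```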
1.5 The dual algorithm

Define the Lagrangian

$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{N} \alpha_i y_i (w \cdot x_i + b) + \sum_{i=1}^{N} \alpha_i, \qquad \alpha_i \ge 0$$

Solution: first minimize $L$ over $w, b$, then maximize over the Lagrange multipliers $\alpha$.

1: setting the partial derivatives of $L$ with respect to $w$ and $b$ to zero gives

$$w = \sum_{i=1}^{N} \alpha_i y_i x_i, \qquad \sum_{i=1}^{N} \alpha_i y_i = 0$$

and substituting these back into $L$:

$$\min_{w,b} L(w, b, \alpha) = -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^{N} \alpha_i$$

2: maximizing $\min_{w,b} L$ over $\alpha$ and flipping the sign yields the dual problem

$$\min_{\alpha} \ \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N} \alpha_i \qquad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad \alpha_i \ge 0$$

3: using the KKT conditions, the primal solution is recovered from a dual solution $\alpha^{*}$ as

$$w^{*} = \sum_{i=1}^{N} \alpha_i^{*} y_i x_i, \qquad b^{*} = y_j - \sum_{i=1}^{N} \alpha_i^{*} y_i (x_i \cdot x_j)$$

where $j$ is any index with $\alpha_j^{*} > 0$. The training sample points with $\alpha_i^{*} > 0$ are called support vectors; only they play a role in the classification decision.
Linearly separable support vector machine learning algorithm (dual form)

Input: a linearly separable training data set $T$
Output: the maximum-margin separating hyperplane and the classification decision function

Step 1: construct and solve the constrained optimization problem

$$\min_{\alpha} \ \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N} \alpha_i \qquad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad \alpha_i \ge 0$$

obtaining the solution $\alpha^{*}$.
Step 2: compute $w^{*} = \sum_i \alpha_i^{*} y_i x_i$; then choose a positive component $\alpha_j^{*} > 0$ of $\alpha^{*}$ and compute $b^{*} = y_j - \sum_i \alpha_i^{*} y_i (x_i \cdot x_j)$.
Step 3: the separating hyperplane is $w^{*} \cdot x + b^{*} = 0$, and the classification decision function is $f(x) = \operatorname{sign}(w^{*} \cdot x + b^{*})$.
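Under the same assumptions (cvxpy and the same toy points), a sketch of the dual route: solve for $\alpha^{*}$ first, then recover $w^{*}$ and $b^{*}$ as in Steps 2 and 3:

```python
import cvxpy as cp
import numpy as np

X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1.0, 1.0, -1.0])
N = len(y)

# Q_ij = y_i y_j (x_i . x_j); a tiny ridge keeps the PSD check numerically happy.
M = y[:, None] * X
Q = M @ M.T + 1e-9 * np.eye(N)

alpha = cp.Variable(N)
prob = cp.Problem(cp.Minimize(0.5 * cp.quad_form(alpha, Q) - cp.sum(alpha)),
                  [alpha >= 0, alpha @ y == 0])
prob.solve()

a = alpha.value
w = (a * y) @ X                       # w* = sum_i alpha_i* y_i x_i
j = int(np.argmax(a))                 # any index with alpha_j* > 0
b = y[j] - (a * y) @ (X @ X[j])       # b* = y_j - sum_i alpha_i* y_i (x_i . x_j)
print("w* =", w, "b* =", b)
```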
2. Linear support vector machines and soft-margin maximization

Some special points in the training data set cannot satisfy the constraint that the functional margin be greater than or equal to 1, so slack variables $\xi_i \ge 0$ are introduced.

2.1 The primal convex quadratic programming problem

$$\min_{w,b,\xi} \ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}\xi_i \qquad \text{s.t.} \quad y_i (w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, N$$
2.2 The dual algorithm

The Lagrangian of the primal problem is

$$L(w, b, \xi, \alpha, \mu) = \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}\xi_i - \sum_{i=1}^{N}\alpha_i \big( y_i (w \cdot x_i + b) - 1 + \xi_i \big) - \sum_{i=1}^{N}\mu_i \xi_i$$

First set the partial derivatives of $L$ with respect to $w$, $b$, and $\xi$ to zero; substituting back, exactly as before, yields the dual problem

$$\min_{\alpha} \ \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N} \alpha_i \qquad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C$$
Theorem: suppose $\alpha^{*}$ is a solution of the dual problem. If there exists a component $\alpha_j^{*}$ of $\alpha^{*}$ with $0 < \alpha_j^{*} < C$, then a solution of the primal problem is given by

$$w^{*} = \sum_{i=1}^{N} \alpha_i^{*} y_i x_i, \qquad b^{*} = y_j - \sum_{i=1}^{N} y_i \alpha_i^{*} (x_i \cdot x_j)$$

Proof sketch: the primal problem is a convex quadratic program, so its solution satisfies the KKT conditions, and the expressions above follow directly from them.
2.3 The learning algorithm

Input: a training data set $T = \{(x_1, y_1), \dots, (x_N, y_N)\}$, where $x_i \in \mathbb{R}^n$ and $y_i \in \{-1, +1\}$
Output: the separating hyperplane and the classification decision function

1: choose a penalty parameter $C > 0$, then construct and solve the convex quadratic programming problem

$$\min_{\alpha} \ \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N} \alpha_i \qquad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C$$

obtaining the solution $\alpha^{*}$.
2: compute $w^{*} = \sum_i \alpha_i^{*} y_i x_i$; select a component $\alpha_j^{*}$ with $0 < \alpha_j^{*} < C$ and compute $b^{*} = y_j - \sum_i y_i \alpha_i^{*} (x_i \cdot x_j)$.
3: the separating hyperplane is $w^{*} \cdot x + b^{*} = 0$.
4: the classification decision function is $f(x) = \operatorname{sign}(w^{*} \cdot x + b^{*})$.
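A sketch of the same computation for the soft-margin case (cvxpy again assumed; the extra mislabeled point and $C = 10$ are assumptions for illustration). Compared with the hard-margin dual, the only change is the box constraint $0 \le \alpha_i \le C$:

```python
import cvxpy as cp
import numpy as np

# Assumed toy data with one mislabeled (noisy) point.
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0], [3.5, 3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
N, C = len(y), 10.0

M = y[:, None] * X
Q = M @ M.T + 1e-9 * np.eye(N)

alpha = cp.Variable(N)
prob = cp.Problem(cp.Minimize(0.5 * cp.quad_form(alpha, Q) - cp.sum(alpha)),
                  [alpha >= 0, alpha <= C, alpha @ y == 0])
prob.solve()

a = alpha.value
w = (a * y) @ X
# Prefer an index with 0 < alpha_j* < C for the intercept, as the theorem requires.
interior = np.where((a > 1e-6) & (a < C - 1e-6))[0]
j = int(interior[0]) if len(interior) else int(np.argmax(a))
b = y[j] - (a * y) @ (X @ X[j])
print("w* =", w, "b* =", b, "alpha* =", a)
```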
2.4 Support vectors

In the soft-margin case, the support vectors are the sample points with $\alpha_i^{*} > 0$; depending on the values of $\alpha_i^{*}$ and $\xi_i$, a support vector lies on the margin boundary, inside the margin, or on the misclassified side of the separating hyperplane.
2.5 The hinge loss function

There is another interpretation: the linear soft-margin SVM is equivalent to minimizing the loss function

$$\min_{w,b} \ \sum_{i=1}^{N} \big[ 1 - y_i (w \cdot x_i + b) \big]_+ + \lambda \|w\|^2$$

The first term is the hinge loss, $L(y, f(x)) = [1 - y f(x)]_+$, where $[z]_+ = \max(z, 0)$.
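Since the series title mentions PyTorch, here is a minimal PyTorch sketch of this hinge-loss view (the toy data, learning rate, and $\lambda$ value are assumptions): train $w, b$ by gradient descent on the hinge loss plus an L2 penalty:

```python
import torch

# Assumed toy data; labels are in {-1, +1}.
X = torch.tensor([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = torch.tensor([1.0, 1.0, -1.0])

w = torch.zeros(2, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
opt = torch.optim.SGD([w, b], lr=0.05)
lam = 0.01                                   # assumed regularization strength

for step in range(2000):
    opt.zero_grad()
    margin = y * (X @ w + b)                 # y_i (w . x_i + b)
    loss = torch.clamp(1 - margin, min=0).sum() + lam * (w @ w)
    loss.backward()
    opt.step()

print("w =", w.detach().numpy(), "b =", b.detach().numpy())
```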
3. Nonlinear support vector machines and kernel functions

3.1 The nonlinear classification problem

First map the data from the original space into a new space via a transformation; then, in the new space, learn a classification model from the training data using linear classification methods.
3.2 Definition of a kernel function

Let $\mathcal{X}$ be the input space (a subset of Euclidean space or a discrete set) and let $\mathcal{H}$ be the feature space (a Hilbert space). If there exists a mapping $\phi(x) \colon \mathcal{X} \to \mathcal{H}$ such that for all $x, z \in \mathcal{X}$ the function $K(x, z)$ satisfies

$$K(x, z) = \phi(x) \cdot \phi(z)$$

then $K(x, z)$ is called a kernel function, and $\phi(x)$ is called the mapping function.
3.3 Positive definite kernels

Necessary and sufficient condition for a symmetric function to be a positive definite kernel: suppose $K(x, z)$ is a symmetric function defined on $\mathcal{X} \times \mathcal{X}$. Then $K(x, z)$ is a positive definite kernel if and only if for any $x_i \in \mathcal{X}$, $i = 1, 2, \dots, m$, the Gram matrix of $K(x, z)$, $[K(x_i, x_j)]_{m \times m}$, is positive semidefinite. The sufficiency proof constructs a Hilbert space $\mathcal{H}$ from $K(x, z)$.

Steps:
1. first define a mapping and form a vector space $S$;
2. then define an inner product on $S$, making it an inner product space;
3. finally complete $S$ into a Hilbert space.
1: first define the mapping and form the vector space $S$

Define the mapping $\phi \colon x \mapsto K(\cdot, x)$. From this mapping, for any $x_i \in \mathcal{X}$ and $\alpha_i \in \mathbb{R}$, $i = 1, 2, \dots, m$, define the linear combination

$$f(\cdot) = \sum_{i=1}^{m} \alpha_i K(\cdot, x_i)$$

Consider the set $S$ of all elements given by such linear combinations. Since $S$ is closed under addition and scalar multiplication, $S$ forms a vector space.

2: define an inner product on $S$, making it an inner product space

On $S$ define the operation $*$: for any $f = \sum_{i=1}^{m} \alpha_i K(\cdot, x_i)$ and $g = \sum_{j=1}^{l} \beta_j K(\cdot, z_j)$ in $S$,

$$f * g = \sum_{i=1}^{m} \sum_{j=1}^{l} \alpha_i \beta_j K(x_i, z_j)$$

One then verifies that $*$ is an inner product, so $S$ is an inner product space.

3: complete the vector space $S$ into a Hilbert space.
4: common kernel functions

- Polynomial kernel: $K(x, z) = (x \cdot z + 1)^p$
- Gaussian (RBF) kernel: $K(x, z) = \exp\left( -\dfrac{\|x - z\|^2}{2\sigma^2} \right)$
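A minimal sketch of these two kernels (the hyperparameter values $p$ and $\sigma$ below are assumptions):

```python
import numpy as np

def polynomial_kernel(x, z, p=2):
    # K(x, z) = (x . z + 1)^p
    return (x @ z + 1.0) ** p

def gaussian_kernel(x, z, sigma=1.0):
    # K(x, z) = exp(-||x - z||^2 / (2 sigma^2))
    return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))

x, z = np.array([1.0, 2.0]), np.array([2.0, 0.5])
print(polynomial_kernel(x, z), gaussian_kernel(x, z))
```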
The kernel trick is simply a technique for handling such problems: data that are not linearly separable in a low-dimensional space may become linearly separable in a high-dimensional space, but computing directly in the high-dimensional space is very expensive. The kernel function lets us obtain the high-dimensional inner product through a computation in the low-dimensional space plus some linear transformation.
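A small demonstration of this point, using the standard textbook feature map for the polynomial kernel $K(x, z) = (x \cdot z)^2$ on $\mathbb{R}^2$ (the specific vectors are assumed toy values):

```python
import numpy as np

def phi(v):
    # Explicit 3-D feature map whose inner product equals (x . z)^2 in R^2.
    return np.array([v[0] ** 2, np.sqrt(2.0) * v[0] * v[1], v[1] ** 2])

x, z = np.array([1.0, 2.0]), np.array([3.0, 4.0])
print((x @ z) ** 2)     # kernel value computed in the low-dimensional space: 121.0
print(phi(x) @ phi(z))  # same value via the high-dimensional feature map: 121.0
```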
Review of Linear Algebra
1: Definition of a positive definite matrix (positive definite and positive semi-definite)

Given an $n \times n$ real symmetric matrix $A$: if for every nonzero vector $x$ of length $n$ we always have $x^{T} A x > 0$, then $A$ is a positive definite matrix.

1.1 Property: every eigenvalue of $A$ is greater than 0. Conversely, a real symmetric matrix whose eigenvalues are all greater than 0 must be positive definite.
1.2 Example: for $A = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$, we have $x^{T} A x = x_1^2 + x_2^2 + (x_1 - x_2)^2 > 0$ for every $x \ne 0$, so $A$ is positive definite (its eigenvalues are 1 and 3).
2: Positive semidefinite matrices

Given an $n \times n$ real symmetric matrix $A$: if for every vector $x$ of length $n$ we have $x^{T} A x \ge 0$, then $A$ is a positive semidefinite matrix; equivalently, all eigenvalues of $A$ are greater than or equal to 0.
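A small numeric check of the eigenvalue criterion (the two matrices are assumed toy examples):

```python
import numpy as np

A = np.array([[2.0, -1.0], [-1.0, 2.0]])   # eigenvalues 1 and 3 -> positive definite
B = np.array([[1.0, 1.0], [1.0, 1.0]])     # eigenvalues 0 and 2 -> positive semidefinite

for name, M in (("A", A), ("B", B)):
    vals = np.linalg.eigvalsh(M)           # eigenvalues of a real symmetric matrix
    print(name, vals, "PD:", bool(np.all(vals > 0)), "PSD:", bool(np.all(vals >= 0)))
```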