Neural Networks and Deep Learning - 6 - Support Vector Machines (Part 1) - PyTorch
2022-07-26 09:25:00 【Bai Xiaosheng in Ming Dynasty】
Preface
SVM (support vector machine) is a classic binary classification model in machine learning.
It is defined as the linear model with the largest margin in the feature space.
This series covers linear and nonlinear support vector machines, hard margins and soft margins, and SMO (the sequential minimal optimization algorithm).
References:
- *Statistical Learning Methods*
- "On positive definite and positive semidefinite matrices" - Zhihu
Note that a Gram matrix is always positive semidefinite.
Contents
- Linearly separable support vector machines and hard-margin maximization
- Linear support vector machines and soft-margin maximization
- Nonlinear support vector machines and kernel functions
1 Linearly separable support vector machines and hard-margin maximization
1.1 Definition
Given a linearly separable training set, the separating hyperplane obtained by maximizing the margin (or, equivalently, by solving the corresponding convex quadratic program) is

    w^* \cdot x + b^* = 0

and the corresponding classification decision function is

    f(x) = \mathrm{sign}(w^* \cdot x + b^*)
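The decision function above is straightforward to evaluate once (w*, b*) are known. A minimal sketch, using hypothetical learned parameters chosen only for illustration:

```python
import numpy as np

# Hypothetical learned parameters (w*, b*), for illustration only.
w_star = np.array([0.5, 0.5])
b_star = -2.0

def decide(x):
    """Classification decision function f(x) = sign(w* . x + b*)."""
    return int(np.sign(w_star @ x + b_star))

print(decide(np.array([3.0, 3.0])))  # point on the positive side -> 1
print(decide(np.array([1.0, 1.0])))  # point on the negative side -> -1
```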
1.2 Functional margin and geometric margin
The distance between a point and the hyperplane indicates the confidence of the classification prediction.

Functional margin:
The functional margin of a hyperplane (w, b) with respect to a sample point (x_i, y_i) is

    \hat{\gamma}_i = y_i (w \cdot x_i + b)

The functional margin with respect to the whole training set is the minimum over all sample points (the point closest to the hyperplane):

    \hat{\gamma} = \min_{i=1,\dots,N} \hat{\gamma}_i

Geometric margin:
The geometric margin of a hyperplane (w, b) with respect to a sample point (x_i, y_i) is

    \gamma_i = y_i \left( \frac{w}{\|w\|} \cdot x_i + \frac{b}{\|w\|} \right)

When \|w\| = 1, the functional margin and the geometric margin are equal.
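The two margins differ only by the factor \|w\|. A small numpy sketch computing both on a toy hyperplane and three toy points (the numbers are illustrative only):

```python
import numpy as np

# Toy hyperplane and sample points; numbers are illustrative only.
w = np.array([0.5, 0.5])
b = -2.0
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])

functional = y * (X @ w + b)                 # hat{gamma}_i = y_i (w . x_i + b)
geometric = functional / np.linalg.norm(w)   # gamma_i = hat{gamma}_i / ||w||

print(functional)       # per-point functional margins
print(geometric.min())  # geometric margin of the training set
```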
1.3 Margin maximization
Finding the hyperplane with the largest geometric margin for the training data set means not just separating the positive and negative examples, but separating even the hardest instance points (those closest to the hyperplane) with the greatest possible confidence.

The optimization problem for a linearly separable SVM is

    \min_{w,b} \ \frac{1}{2}\|w\|^2
    s.t. \ y_i(w \cdot x_i + b) - 1 \geq 0, \quad i = 1, 2, \dots, N

This is a convex quadratic programming problem:
- the objective \frac{1}{2}\|w\|^2 is a convex function;
- each constraint is an affine function of (w, b).
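Because the primal is a small convex QP, it can be handed to a generic constrained solver. A sketch using SciPy's SLSQP on the three-point example from *Statistical Learning Methods* (positives (3,3) and (4,3), negative (1,1)), whose known solution is w* = (1/2, 1/2), b* = -2:

```python
import numpy as np
from scipy.optimize import minimize

# Three-point worked example; decision variables are z = (w1, w2, b).
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1.0, 1.0, -1.0])

objective = lambda z: 0.5 * (z[0] ** 2 + z[1] ** 2)     # (1/2) ||w||^2
constraints = [{"type": "ineq",                          # y_i (w . x_i + b) - 1 >= 0
                "fun": lambda z, i=i: y[i] * (X[i] @ z[:2] + z[2]) - 1.0}
               for i in range(3)]

res = minimize(objective, x0=np.zeros(3), constraints=constraints, method="SLSQP")
w_opt, b_opt = res.x[:2], res.x[2]
print(w_opt, b_opt)  # approximately [0.5 0.5] and -2.0
```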
1.4 Maximum-margin algorithm
Input: a linearly separable data set T = {(x_1, y_1), ..., (x_N, y_N)}
Output: the maximum-margin separating hyperplane and the classification decision function
Step 1: construct and solve the constrained optimization problem above, obtaining the solution (w*, b*).
Step 2: the separating hyperplane is w^* \cdot x + b^* = 0.
Step 3: the classification decision function is f(x) = \mathrm{sign}(w^* \cdot x + b^*).
1.5 The dual algorithm
Define the Lagrangian

    L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{N} \alpha_i y_i (w \cdot x_i + b) + \sum_{i=1}^{N} \alpha_i, \quad \alpha_i \geq 0

Solution: first minimize L over w and b, then maximize over the Lagrange multipliers \alpha.

1: Minimize over w and b. Setting the partial derivatives with respect to w and b to zero gives

    w = \sum_{i=1}^{N} \alpha_i y_i x_i, \quad \sum_{i=1}^{N} \alpha_i y_i = 0

Substituting back into L:

    \min_{w,b} L(w, b, \alpha) = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^{N} \alpha_i

2: Maximize over \alpha. This yields the dual problem (stated equivalently as a minimization):

    \min_{\alpha} \ \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N} \alpha_i
    s.t. \ \sum_{i=1}^{N} \alpha_i y_i = 0, \quad \alpha_i \geq 0, \ i = 1, 2, \dots, N

3: Recover (w*, b*) using the KKT conditions.
The sample points (x_i, y_i) whose multipliers satisfy \alpha_i^* > 0 are called support vectors; only they play a role in classification. For a support vector the constraint is active:

    y_i (w^* \cdot x_i + b^*) = 1
Linearly separable SVM learning algorithm (dual form)
Input: a linearly separable data set T = {(x_1, y_1), ..., (x_N, y_N)}
Output: the maximum-margin separating hyperplane and the classification decision function
Step 1: construct and solve the constrained optimization problem

    \min_{\alpha} \ \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N} \alpha_i
    s.t. \ \sum_{i=1}^{N} \alpha_i y_i = 0, \quad \alpha_i \geq 0

obtaining the optimal \alpha^* = (\alpha_1^*, \dots, \alpha_N^*).
Step 2: compute

    w^* = \sum_{i=1}^{N} \alpha_i^* y_i x_i

then choose a positive component \alpha_j^* > 0 of \alpha^* and compute

    b^* = y_j - \sum_{i=1}^{N} \alpha_i^* y_i (x_i \cdot x_j)

Step 3: the separating hyperplane is w^* \cdot x + b^* = 0.
Step 4: the classification decision function is f(x) = \mathrm{sign}(w^* \cdot x + b^*).
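Step 2 can be checked on the worked example from *Statistical Learning Methods*: for positives (3,3), (4,3) and negative (1,1), the dual solution is \alpha^* = (1/4, 0, 1/4). A numpy sketch recovering w* and b* from that \alpha^*:

```python
import numpy as np

# Worked example: positives (3,3), (4,3), negative (1,1);
# the dual solution is alpha* = (1/4, 0, 1/4).
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1.0, 1.0, -1.0])
alpha = np.array([0.25, 0.0, 0.25])

# Step 2: w* = sum_i alpha_i* y_i x_i
w = (alpha * y) @ X

# Choose a positive component alpha_j* > 0 (here j = 0) and compute b*.
j = 0
b = y[j] - (alpha * y) @ (X @ X[j])

print(w, b)  # [0.5 0.5] -2.0
```

Note that only the two support vectors (\alpha_i^* > 0) contribute to w* and b*; the point (4,3) plays no role.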
2 Linear support vector machines and soft-margin maximization
Some special points in the training data cannot satisfy the constraint that the functional margin be at least 1, so slack variables \xi_i \geq 0 are introduced.
2.1 The primal convex quadratic programming problem

    \min_{w,b,\xi} \ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i
    s.t. \ y_i(w \cdot x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0, \ i = 1, 2, \dots, N
2.2 The dual algorithm
The Lagrangian of the primal problem is

    L(w, b, \xi, \alpha, \mu) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i - \sum_{i=1}^{N} \alpha_i \left( y_i (w \cdot x_i + b) - 1 + \xi_i \right) - \sum_{i=1}^{N} \mu_i \xi_i

First take the partial derivatives of L with respect to w, b, and \xi_i and set them to zero:

    w = \sum_{i=1}^{N} \alpha_i y_i x_i
    \sum_{i=1}^{N} \alpha_i y_i = 0
    C - \alpha_i - \mu_i = 0

Substituting back, exactly as in the hard-margin case, gives the dual problem

    \min_{\alpha} \ \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N} \alpha_i
    s.t. \ \sum_{i=1}^{N} \alpha_i y_i = 0, \quad 0 \leq \alpha_i \leq C, \ i = 1, 2, \dots, N

Theorem: suppose \alpha^* = (\alpha_1^*, \dots, \alpha_N^*) is a solution of the dual problem. If there exists a component \alpha_j^* with 0 < \alpha_j^* < C, then the solution of the primal problem is

    w^* = \sum_{i=1}^{N} \alpha_i^* y_i x_i
    b^* = y_j - \sum_{i=1}^{N} \alpha_i^* y_i (x_i \cdot x_j)

Proof: the primal problem is a convex quadratic program, so the KKT conditions hold:

    \nabla_w L = w^* - \sum_i \alpha_i^* y_i x_i = 0
    \nabla_b L = -\sum_i \alpha_i^* y_i = 0
    \nabla_{\xi_i} L = C - \alpha_i^* - \mu_i^* = 0
    \alpha_i^* \left( y_i (w^* \cdot x_i + b^*) - 1 + \xi_i^* \right) = 0
    \mu_i^* \xi_i^* = 0, \quad \alpha_i^* \geq 0, \ \mu_i^* \geq 0, \ \xi_i^* \geq 0

For a component with 0 < \alpha_j^* < C we have \mu_j^* = C - \alpha_j^* > 0, hence \xi_j^* = 0, so y_j (w^* \cdot x_j + b^*) = 1, which yields the expression for b^*.
2.3 Learning algorithm
Input: a training data set T = {(x_1, y_1), ..., (x_N, y_N)}, where x_i \in R^n and y_i \in \{+1, -1\}
Output: the separating hyperplane and the classification decision function
1: Choose a penalty parameter C > 0, then construct and solve the convex quadratic programming problem

    \min_{\alpha} \ \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{N} \alpha_i
    s.t. \ \sum_{i=1}^{N} \alpha_i y_i = 0, \quad 0 \leq \alpha_i \leq C

obtaining \alpha^*.
2: Compute w^* = \sum_i \alpha_i^* y_i x_i; select a component \alpha_j^* with 0 < \alpha_j^* < C and compute b^* = y_j - \sum_i \alpha_i^* y_i (x_i \cdot x_j).
3: The separating hyperplane is w^* \cdot x + b^* = 0.
4: The classification decision function is f(x) = \mathrm{sign}(w^* \cdot x + b^*).
2.4 Support vectors
In the soft-margin case the support vectors are the sample points with \alpha_i^* > 0. Depending on the slack \xi_i, a support vector either lies exactly on the margin boundary (\xi_i = 0), lies between the margin boundary and the hyperplane (0 < \xi_i < 1), lies on the hyperplane (\xi_i = 1), or is misclassified (\xi_i > 1).
2.5 Hinge loss function
There is another interpretation: the linear soft-margin SVM minimizes the loss function

    L = \sum_i [1 - y_i (w \cdot x_i + b)]_+ + \lambda \|w\|^2

The first term is the hinge loss:

    [1 - y (w \cdot x + b)]_+

where

    [z]_+ = \begin{cases} z, & z > 0 \\ 0, & z \leq 0 \end{cases}
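This loss view leads directly to gradient-style training, which is presumably where the series' PyTorch code comes in. A minimal numpy sketch of minimizing the hinge-loss objective by subgradient descent; the toy data and hyperparameters are invented for illustration:

```python
import numpy as np

# Sketch: train a linear SVM by subgradient descent on
#   L = sum_i [1 - y_i (w . x_i + b)]_+ + lam * ||w||^2
# Two well-separated Gaussian blobs; data and hyperparameters are illustrative.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 0.5, (20, 2)), rng.normal(-2.0, 0.5, (20, 2))])
y = np.array([1.0] * 20 + [-1.0] * 20)

w, b, lam, lr = np.zeros(2), 0.0, 0.01, 0.1
for _ in range(200):
    margins = y * (X @ w + b)
    active = margins < 1                      # points with nonzero hinge loss
    # Subgradient of the objective with respect to (w, b).
    grad_w = -(y[active][:, None] * X[active]).sum(axis=0) + 2 * lam * w
    grad_b = -y[active].sum()
    w -= lr * grad_w
    b -= lr * grad_b

accuracy = (np.sign(X @ w + b) == y).mean()
print(accuracy)  # training accuracy on this separable toy set
```

For only-positive hinge terms the subgradient is the sum over "active" points with margin below 1, which is exactly what the mask computes.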
3 Nonlinear support vector machines and kernel functions
3.1 The nonlinear classification problem
First use a transformation to map the data from the original input space to a new feature space; then, in the new space, apply a linear classification learning method to learn the classification model from the training data.
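The idea above can be sketched on a 1-D toy problem: points whose labels alternate around the origin are not linearly separable on the line, but become separable after a hand-picked mapping \phi(x) = (x, x^2) (the data, mapping, and threshold are illustrative only):

```python
import numpy as np

# 1-D points whose negatives sit between positives: not linearly separable
# in the original space. Data are illustrative only.
x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
y = np.array([1, -1, -1, -1, 1])

# Map to a new space with phi(x) = (x, x^2); there the classes are separated
# by the line x2 = 2, i.e. w = (0, 1), b = -2.
phi = np.column_stack([x, x ** 2])
preds = np.sign(phi @ np.array([0.0, 1.0]) - 2.0)
print(preds)  # [ 1. -1. -1. -1.  1.]
```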
3.2 Kernel function definition
Let \mathcal{X} be the input space (a subset of Euclidean space, or a discrete set), and let H be the feature space (a Hilbert space). If there exists a mapping

    \phi(x): \mathcal{X} \to H

such that for all x, z \in \mathcal{X}

    K(x, z) = \phi(x) \cdot \phi(z)

then K(x, z) is called a kernel function and \phi(x) its mapping function.
3.3 Positive definite kernels
Necessary and sufficient condition for K(x, z) to be a positive definite kernel:
Suppose K(x, z) is a symmetric function defined on \mathcal{X} \times \mathcal{X}. Then K(x, z) is a positive definite kernel if and only if, for any x_1, \dots, x_m \in \mathcal{X}, the Gram matrix of K(x, z),

    K = [K(x_i, x_j)]_{m \times m}

is positive semidefinite. The sufficiency proof constructs a Hilbert space H from K(x, z) in three steps:
1. define a mapping and form a vector space S;
2. define an inner product on S, making it an inner product space;
3. complete S into a Hilbert space.

Step 1: define the mapping and form the vector space S.
Define the mapping

    \phi: x \mapsto K(\cdot, x)

According to this mapping, for any x_i \in \mathcal{X} and \alpha_i \in R, i = 1, 2, \dots, m, define the linear combination

    f(\cdot) = \sum_{i=1}^{m} \alpha_i K(\cdot, x_i)

Consider the set S of elements formed by such linear combinations. Since S is closed under addition and scalar multiplication, S forms a vector space.

Step 2: define an inner product on S, making it an inner product space.
Define an operation * on S: for any f, g \in S with

    f(\cdot) = \sum_{i=1}^{m} \alpha_i K(\cdot, x_i), \quad g(\cdot) = \sum_{j=1}^{l} \beta_j K(\cdot, z_j)

define

    f * g = \sum_{i=1}^{m} \sum_{j=1}^{l} \alpha_i \beta_j K(x_i, z_j)

One then verifies that * satisfies the axioms of an inner product; in particular, f * f \geq 0 follows from the positive semidefiniteness of the Gram matrix.

Step 3: complete the vector space S into a Hilbert space.
Completing S with respect to the norm \|f\| = \sqrt{f * f} yields a Hilbert space H.

3.4 Common kernel functions
The kernel method is a technique for handling such problems: data that are not linearly separable in a low-dimensional space can become linearly separable in a high-dimensional space, but computing directly in the high-dimensional space is very expensive. The kernel trick completes the high-dimensional computation through a computation in the low-dimensional space plus a transformation, i.e. by evaluating K(x, z) directly without constructing \phi. Common kernels include the polynomial kernel K(x, z) = (x \cdot z + 1)^p and the Gaussian (RBF) kernel K(x, z) = \exp(-\|x - z\|^2 / (2\sigma^2)).
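The positive-semidefiniteness condition above can be checked numerically. A sketch building the Gram matrix of the Gaussian (RBF) kernel on a few random points and verifying that all its eigenvalues are nonnegative (the data and \sigma are arbitrary):

```python
import numpy as np

# Gram matrix of the RBF kernel K(x, z) = exp(-||x - z||^2 / (2 sigma^2))
# on a few random points; data and sigma are arbitrary.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))
sigma = 1.0

sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
gram = np.exp(-sq_dists / (2 * sigma ** 2))

eigvals = np.linalg.eigvalsh(gram)  # eigvalsh: eigenvalues of a symmetric matrix
print(np.all(eigvals >= -1e-10))    # True: the Gram matrix is PSD
```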
Review of linear algebra
1: Positive definite matrix (positive definite and positive semi-definite)
Given an [n, n] real symmetric matrix A, if for any nonzero vector x of length n

    x^T A x > 0

holds, then A is a positive definite matrix.
1.1 Property: all eigenvalues of A are greater than 0.

    Ax = \lambda x, \ x \neq 0 \ \Rightarrow \ x^T A x = \lambda \|x\|^2 > 0 \ \Rightarrow \ \lambda > 0

Conversely, a symmetric matrix whose eigenvalues are all greater than 0 must be positive definite.
1.2 Example: the identity matrix I is positive definite, since x^T I x = \|x\|^2 > 0 for any x \neq 0.
2: Positive semidefinite matrix
Given an [n, n] real symmetric matrix A, if for any vector x of length n

    x^T A x \geq 0

holds, then A is a positive semidefinite matrix.
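By the eigenvalue characterization above, definiteness can be tested numerically. A small sketch (the example matrices are chosen for illustration):

```python
import numpy as np

# Classify a real symmetric matrix via its eigenvalues; example matrices only.
def classify(A, tol=1e-10):
    eigvals = np.linalg.eigvalsh(A)  # eigenvalues of a symmetric matrix
    if np.all(eigvals > tol):
        return "positive definite"
    if np.all(eigvals >= -tol):
        return "positive semidefinite"
    return "indefinite"

print(classify(np.array([[2.0, -1.0], [-1.0, 2.0]])))  # eigenvalues 1, 3
print(classify(np.array([[1.0, 1.0], [1.0, 1.0]])))    # eigenvalues 0, 2
print(classify(np.array([[0.0, 1.0], [1.0, 0.0]])))    # eigenvalues -1, 1
```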