Wu Enda Machine Learning: Chapters 6-7
2022-07-19 06:42:00 【Watermelon that loves programming】
Wu Enda Machine Learning: Chapters 6-7
Since Chapter 5 mainly covers Octave syntax, and nowadays we mostly use Python for AI programming, Chapter 5 is not summarized here. Interested readers can look it up on their own.
Chapter 6
6-1 Classification
Classification appears in many real-life settings, such as spam filtering, online fraud detection, and tumor prediction. The positive class is usually labeled 1 and the negative class 0. For multi-class problems the labels can be 0, 1, 2, 3, and so on.
For tumor prediction, if the output of the hypothesis is greater than 0.5, we predict the positive class; otherwise we predict the negative class.
Sometimes our prediction algorithm (for example, linear regression) may output values greater than 1 or less than 0, which is clearly odd for a classification problem. This is where the logistic regression algorithm comes in: it keeps the prediction between 0 and 1 (despite the word "regression" in its name, it is a classification algorithm).
6-2 Hypothesis Representation
The logistic function is defined as follows (the logistic function is generally considered the same as the sigmoid function):
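The original post shows the formula as an image; for reference, the standard form from the course is:

$$ h_\theta(x) = g(\theta^{T}x), \qquad g(z) = \frac{1}{1 + e^{-z}} $$

g(z) maps any real number into the interval (0, 1), so hθ(x) can be read as the estimated probability that y = 1 for input x.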
We can illustrate this with a simple tumor example.
6-3 Decision Boundaries
In logistic regression, when the prediction is greater than 0.5 we classify the example as positive; otherwise as negative. From the graph of the sigmoid function we can see that when z is greater than 0 the prediction is greater than 0.5, and when z is less than 0 the prediction is less than 0.5.
In the following illustration, we build a hypothesis and find that when -3 + x1 + x2 >= 0, y is predicted to be 1, and when -3 + x1 + x2 < 0, y is predicted to be 0. So the decision boundary is the line x1 + x2 = 3.
In another example we can add higher-order terms. Let θ0 = -1, θ1 = 0, θ2 = 0, θ3 = 1, θ4 = 1. The decision boundary is then the circle where x1 squared plus x2 squared equals 1.
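As a small illustrative sketch of the two boundaries above (the function and variable names are my own, not from the original post), the following Python snippet checks on which side of each decision boundary a point falls:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_linear(x1, x2):
    # theta = [-3, 1, 1]: predict y = 1 when -3 + x1 + x2 >= 0, i.e. x1 + x2 >= 3
    return int(sigmoid(-3 + x1 + x2) >= 0.5)

def predict_circular(x1, x2):
    # theta = [-1, 0, 0, 1, 1]: predict y = 1 when x1^2 + x2^2 >= 1
    return int(sigmoid(-1 + x1**2 + x2**2) >= 0.5)

print(predict_linear(2, 2))        # x1 + x2 = 4 >= 3  -> 1
print(predict_circular(0.5, 0.5))  # 0.25 + 0.25 < 1   -> 0
```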
6-4 Cost Function
Previously we introduced the cost function. Here we use a slightly different notation, cost: when the predicted value differs from the true value, we want the hypothesis to pay a price.
With this cost (defined below), gradient descent can be guaranteed to converge, because the resulting cost function is convex.
The cost function is defined as follows:
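The definition appears as an image in the original post; written out for reference, the piecewise form from the lecture is:

$$
\mathrm{Cost}\big(h_\theta(x), y\big) =
\begin{cases}
-\log\big(h_\theta(x)\big) & \text{if } y = 1 \\
-\log\big(1 - h_\theta(x)\big) & \text{if } y = 0
\end{cases}
$$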
From this we can see how the cost relates to the hypothesis: if y = 1 and hθ(x) = 1, then cost = 0; but if y = 1 and hθ(x) approaches 0, the cost approaches infinity.

6-5 Simplified Cost Function and Gradient Descent
In this section we combine a simplified form of the cost function with gradient descent to obtain the complete logistic regression algorithm.
First, let us write down the cost function again.
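For reference, the simplified (combined) form taught in the course is:

$$
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \Big[\, y^{(i)} \log h_\theta\big(x^{(i)}\big) + \big(1 - y^{(i)}\big) \log\big(1 - h_\theta(x^{(i)})\big) \Big]
$$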
To fit the parameters, we need to drive this cost function down to its minimum.
Then we apply gradient descent to find the values of θ that minimize the cost.
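A minimal NumPy sketch of this loop (the function and variable names are my own; X is assumed to already include a leading column of ones for the intercept term):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iterations=1000):
    """Fit logistic regression parameters theta with batch gradient descent."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        h = sigmoid(X @ theta)         # hypothesis for all m training examples
        gradient = X.T @ (h - y) / m   # partial derivatives of J(theta)
        theta -= alpha * gradient      # simultaneous update of every theta_j
    return theta
```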
6-6 Advanced Optimization
There are more sophisticated optimization algorithms (conjugate gradient, BFGS, L-BFGS) that can bring the cost function to convergence more efficiently. Their advantages and disadvantages are as follows. For the rest of the course, the teacher suggests that we avoid implementing these low-level algorithms ourselves and reinventing the wheel: understanding the theory and calling a well-tested library is usually enough.
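In that spirit, a common way to do this in Python is to hand the cost and its gradient to an off-the-shelf optimizer such as scipy.optimize.minimize. This is only a sketch under my own naming; BFGS is one of the advanced methods mentioned in the lecture:

```python
import numpy as np
from scipy.optimize import minimize

def cost_and_grad(theta, X, y):
    """Return J(theta) and its gradient for logistic regression."""
    m = X.shape[0]
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    h = np.clip(h, 1e-10, 1 - 1e-10)        # avoid log(0)
    cost = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    grad = X.T @ (h - y) / m
    return cost, grad

# toy data with some overlap between the classes; the first column of ones is the intercept term
X = np.array([[1.0, 0.5], [1.0, 2.5], [1.0, 2.0], [1.0, 4.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

result = minimize(cost_and_grad, x0=np.zeros(X.shape[1]),
                  args=(X, y), method='BFGS', jac=True)
print(result.x)  # the optimized theta
```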
6-7 Multi-class Classification: One-vs-All
In this lesson, we will discuss how to use logical regression to solve multi classification problems . In real life , Multiple classification problems make it very common , Such as weather conditions .
For a multi-class problem, say with three classes, we treat one class as the positive class and lump the other two together as the negative class, then train a binary classifier. Repeating this three times, once per class, gives us three classifiers; to classify a new example we pick the class whose classifier outputs the highest probability.
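A minimal sketch of the one-vs-all scheme in Python (the names are my own; train_binary stands for any routine that fits a single logistic regression classifier, such as the gradient descent sketch above):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def one_vs_all(X, y, num_classes, train_binary):
    """Train one logistic regression classifier per class."""
    all_theta = []
    for c in range(num_classes):
        y_binary = (y == c).astype(float)     # current class -> 1, everything else -> 0
        all_theta.append(train_binary(X, y_binary))
    return np.array(all_theta)

def predict_one_vs_all(all_theta, X):
    """For each example, pick the class whose classifier gives the highest probability."""
    probabilities = sigmoid(X @ all_theta.T)  # shape (m, num_classes)
    return np.argmax(probabilities, axis=1)
```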
Chapter 7
7-1 The Problem of Overfitting
When the hypothesis fits the training data so well that the curve passes through every training point perfectly, this is overfitting. It is not actually a good thing: with too many features the cost function may be driven close to 0 on the training set, but the model cannot generalize to new examples and cannot predict well. Underfitting is the opposite problem: the model has almost no predictive power and cannot even fit the training points.
In the figure below, the first plot is underfitting, the second is just right, and the third is overfitting.
When overfitting occurs, we have two options. First, reduce the number of features, that is, delete some of the feature variables. Second, regularization: keep all the features but shrink the values of some parameters.
7-2 Cost Function with Regularization
We add a penalty term to the cost function to reduce the possibility of overfitting.
For example, in housing price prediction, if the samples have 100 features, we can add a regularization term to the cost function that penalizes all of the parameters.
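The regularized cost function shown in the accompanying image can be written out as follows (this is the linear regression form used in the lecture):

$$
J(\theta) = \frac{1}{2m}\left[\, \sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)^2 + \lambda \sum_{j=1}^{n}\theta_j^{2} \right]
$$

Note that the regularization sum starts at j = 1, so the intercept term θ0 is not penalized.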

7-3 Regularized Linear Regression
For linear regression we previously derived two algorithms, one based on gradient descent and the other on the normal equation. We now extend both to regularized linear regression.
This is the cost function with the regularization term added.
This is the gradient descent update without regularization.
Now we add the regularization term 
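For reference, the regularized update for j ≥ 1 from the lecture (θ0 keeps its unregularized update) is:

$$
\theta_j := \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)\,x_j^{(i)}
$$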
For the normal equation we can also add a regularization term. An interesting side effect: once the regularization term is added (with λ > 0), the matrix that has to be inverted is guaranteed to be invertible, even when it would not be otherwise.
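The regularized normal equation from the lecture takes the form below, where the matrix next to λ is the (n+1)×(n+1) identity with its top-left entry set to 0 so that θ0 is not regularized:

$$
\theta = \left(X^{T}X + \lambda
\begin{bmatrix}
0 & & & \\
 & 1 & & \\
 & & \ddots & \\
 & & & 1
\end{bmatrix}\right)^{-1} X^{T} y
$$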