Week 4 Data analysis algorithms-Linear models for regression-Bias-Variance Analysis(Part B)
2022-07-17 23:25:00 【Jinzhou hungry bully】
Catalog
One. The relationship between bias and variance (measuring generalization performance)
Two. The generalization error (Generalization error)
Three. The relationship between the regularization parameter λ and bias and variance
Four. Bias-Variance in Regression (Tutorial)
One. The relationship between bias (Bias) and variance (Variance) (measuring the generalization performance of the algorithm)
1. Definitions
(1) Bias: the error between the model's predicted output and the label (it indicates the accuracy of the model). Bias measures whether we have found the best model, or how close we are to it; the larger the bias, the more likely the model is underfitting.
(2) Variance: the model's sensitivity to small fluctuations in the dataset (it indicates how consistent the models are). As the data changes, the model's predictions spread around the best model's predictions; that spread is the variance. The larger the variance, the more likely the model is overfitting.
2. Relationships
(1) High bias and low variance = underfitting model
(2) Low bias and high variance = overfitting model (a highly complex model)
(3) Low bias and low variance = best-fitting model (the ideal model)
(4) High training accuracy but low test accuracy (out-of-sample accuracy) = high variance = overfitting = the model is too complex
(5) Low training accuracy and low test accuracy (out-of-sample accuracy) = high bias = underfitting


3. How to address large bias (Bias) or large variance (Variance)
(1) Large bias (underfitting):
- Add feature data to improve the fit and avoid underfitting.
- Increase the complexity of the model (e.g. increase the polynomial degree M) to improve the fit.
- Try to obtain more features.
- Try adding polynomial features (similar to the previous point).
- Try reducing the degree of regularization.
(2) Large variance (overfitting):
- Add data, especially large amounts of data; this effectively constrains the model, improves its predictions on unseen data, and helps avoid overfitting.
- Add regularization (Regularisation): minimize a loss function that penalizes w. The smaller w is, the smoother the fitted curve, and the better the model generalizes.
- Try reducing the number of features.
- Reduce the complexity of the model, for example by pruning a decision tree or reducing the number of layers in a neural network.
4. Two machine-learning algorithms that reduce bias and variance (Bagging and Boosting)
(1) Boosting reduces model bias
Method:
- Whereas Bagging can train its K sub-models in parallel, Boosting is an iterative method.
- Each round of Boosting pays more attention to the examples the previous round misclassified, giving those examples greater weight so that the next model can identify them more easily.
(2) Bagging reduces model variance
Method:
- Draw K bootstrap samples (sampling with replacement) and train K sub-models (one model per resample).
- Fuse the K models' results by Voting (classification) or Averaging (regression).
Two. The generalization error (Generalization error)
In machine learning, the metric used to measure a model's accuracy on unseen data is called the generalization error (Generalization error). The generalization error Error(f; D) of a model f on an unknown dataset D is jointly determined by the variance (var), the bias (bias), and the noise (ε). The variance is determined by the stability of the model, the bias by how well the model fits the training set, and the noise is beyond our control. The smaller the generalization error, the better the model.

For a test sample x, let y_D be the label of x in the dataset (noise may make the label differ from the true value), let y be the true value of x, and let f(x; D) be the prediction at x of a model f trained on dataset D. Taking regression as an example:

The expected prediction (the average over all predicted values) is:

\bar{f}(x) = E_D[f(x; D)]

The variance (var) is computed as:

var(x) = E_D[(f(x; D) - \bar{f}(x))^2]

The variance is measured on the test data: it relates the predictions to their own average and does not involve the true value.

The noise (ε), i.e. the error between the true value and the actual label, is computed as:

\varepsilon^2 = E_D[(y_D - y)^2]

the expectation of the squared difference between the label and the true value; the noise term is usually ignored.
The bias (bias) is computed as:

bias^2(x) = (\bar{f}(x) - y)^2

Derivation of the generalization-error formula (assuming the noise has zero expectation, i.e. E_D[y_D - y] = 0):

E(f; D) = E_D[(f(x; D) - y_D)^2]
        = E_D[(f(x; D) - \bar{f}(x) + \bar{f}(x) - y_D)^2]
        = E_D[(f(x; D) - \bar{f}(x))^2] + E_D[(\bar{f}(x) - y_D)^2]
          + 2 E_D[(f(x; D) - \bar{f}(x))(\bar{f}(x) - y_D)]

Because E_D[f(x; D)] = \bar{f}(x) (\bar{f}(x) is a fixed number, not a random quantity) and the label noise does not depend on the trained model, the first cross term is 0. Expanding the remaining expectation:

E_D[(\bar{f}(x) - y_D)^2] = E_D[(\bar{f}(x) - y + y - y_D)^2]
                          = (\bar{f}(x) - y)^2 + E_D[(y - y_D)^2]
                            + 2 E_D[(\bar{f}(x) - y)(y - y_D)]

Since the noise does not depend on f and has zero expectation, the second cross term is also 0. Therefore:

E(f; D) = (\bar{f}(x) - y)^2 + E_D[(f(x; D) - \bar{f}(x))^2] + E_D[(y_D - y)^2]
        = bias^2(x) + var(x) + \varepsilon^2
The bias measures the gap between the expected prediction of the learning algorithm and the true result; it describes the algorithm's ability to fit the data, i.e. how well the trained model matches the training samples. The variance measures the change in learning performance caused by changes in the training set; it describes the effect of perturbations in the data. The noise expresses a lower bound on the generalization error achievable by any learning algorithm; it describes the difficulty of the learning problem itself.
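The decomposition E(f; D) = bias²(x) + var(x) + ε² can be checked numerically with a Monte-Carlo sketch. This is an illustration, not the course's code: the sin(2πx) ground truth, the Gaussian label noise, and the cubic polynomial model are all assumptions. At a fixed test point, we train the model on many fresh datasets D, then compare the measured expected squared error against the sum of the three terms.

```python
import numpy as np

rng = np.random.default_rng(1)
noise_std = 0.2
x0 = 0.5                                  # a fixed test point x
y_true = np.sin(2 * np.pi * x0)           # true value y at x

preds, sq_errors = [], []
for _ in range(4000):
    # Draw a fresh training set D with noisy labels y_D.
    x = rng.uniform(0.0, 1.0, 25)
    y_d = np.sin(2 * np.pi * x) + rng.normal(0.0, noise_std, 25)
    coeffs = np.polyfit(x, y_d, 3)        # model f trained on D (cubic, assumed)
    f_x = np.polyval(coeffs, x0)          # prediction f(x; D)
    y_label = y_true + rng.normal(0.0, noise_std)  # noisy test label y_D
    preds.append(f_x)
    sq_errors.append((f_x - y_label) ** 2)

preds = np.array(preds)
f_bar = preds.mean()                      # \bar{f}(x) = E_D[f(x; D)]
bias_sq = (f_bar - y_true) ** 2           # bias^2(x)
var = preds.var()                         # var(x)
eps_sq = noise_std ** 2                   # \varepsilon^2
total = float(np.mean(sq_errors))         # measured E_D[(f(x; D) - y_D)^2]

print(f"bias^2 + var + eps^2 = {bias_sq + var + eps_sq:.4f}")
print(f"measured error       = {total:.4f}")
```

Up to Monte-Carlo error, the two printed numbers agree, because the two cross terms in the derivation vanish in expectation.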
The generalization-error curves relate as follows:
[Figure: bias, variance, and generalization error versus model complexity]
Three. The relationship between the regularization parameter λ and bias and variance
1. When λ takes a smaller value, the complexity of the model increases: bias becomes smaller and variance becomes larger.
2. When λ becomes larger, the complexity of the model decreases and the model becomes simpler: bias becomes larger and variance becomes smaller. The test error follows a trend similar to the generalization error.
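The effect of λ can be seen directly in a regularized least-squares fit. The section above does not specify a model, so the following sketch assumes L2-regularized (ridge) polynomial regression with the closed-form solution w = (XᵀX + λI)⁻¹Xᵀy; the sin(2πx) data and the degree-9 features are likewise illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def design(x, degree=9):
    """Polynomial feature matrix [1, x, x^2, ..., x^degree]."""
    return np.vander(x, degree + 1, increasing=True)

def ridge_fit(x, y, lam, degree=9):
    """Closed-form regularized least squares: w = (X^T X + lam * I)^(-1) X^T y."""
    X = design(x, degree)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

x = rng.uniform(0.0, 1.0, 20)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, 20)

w_small_lam = ridge_fit(x, y, lam=1e-8)  # small lambda: complex, wiggly fit
w_large_lam = ridge_fit(x, y, lam=1.0)   # large lambda: simple, smooth fit

print(f"||w|| with small lambda: {np.linalg.norm(w_small_lam):.2f}")
print(f"||w|| with large lambda: {np.linalg.norm(w_large_lam):.2f}")
```

A larger λ shrinks the weight vector w, and smaller weights give a smoother curve: the model is effectively simpler, so bias rises while variance falls, matching points 1 and 2 above.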

Four. Bias-Variance in Regression (Tutorial)
1. The relationship between polynomial model complexity M (polynomial degree) and underfitting/overfitting
In a polynomial model, the smaller M is, the simpler the model and the more likely it is to underfit (high bias, low variance); for example, Figure 1 underfits. The larger M is, the more complex the model and the more likely it is to overfit (high variance, low bias); for example, Figure 10 overfits, while Figure 5 is the best fit (M = 5):

[Figure 1: underfitting]
[Figure 5: best fit (M = 5)]
[Figure 10: overfitting]
2. The test error follows a trend similar to the generalization error; both are jointly determined by bias and variance, as shown in the diagram below. As model complexity grows, the bias gradually decreases while the variance gradually increases, so the generalization error first falls and then rises.
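This trend can be reproduced with a small degree sweep. The sketch below is illustrative rather than the tutorial's code: it assumes sin(2πx) data with Gaussian noise, a small training set, and a large held-out test set, and fits unregularized polynomials of degree M = 1, 5, 10 (matching the three figures above).

```python
import warnings
import numpy as np

rng = np.random.default_rng(3)

def make_data(n, noise=0.25):
    """Assumed data source: y = sin(2*pi*x) + Gaussian noise."""
    x = rng.uniform(0.0, 1.0, n)
    return x, np.sin(2 * np.pi * x) + rng.normal(0.0, noise, n)

def rmse(pred, target):
    return float(np.sqrt(np.mean((pred - target) ** 2)))

x_train, y_train = make_data(15)   # small training set
x_test, y_test = make_data(200)    # large held-out test set

train_err, test_err = {}, {}
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # high-degree polyfit may warn
    for M in (1, 5, 10):
        coeffs = np.polyfit(x_train, y_train, M)
        train_err[M] = rmse(np.polyval(coeffs, x_train), y_train)
        test_err[M] = rmse(np.polyval(coeffs, x_test), y_test)

for M in (1, 5, 10):
    print(f"M={M:2d}  train RMSE={train_err[M]:.3f}  test RMSE={test_err[M]:.3f}")
```

Training error keeps falling as M grows, but test error behaves like the generalization error: M = 1 underfits (high bias), M = 10 fits the training set almost perfectly yet generalizes poorly (high variance), and the intermediate degree does best.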
