Detailed explanation of multiple linear regression
2022-07-19 11:00:00 【TT ya】
I'm a beginner; I hope these notes record what I've learned and help others who are just getting started. Corrections from more experienced readers are very welcome. If anything here infringes a copyright, please contact me and I will remove it.
Contents
I. Problem description
II. Problem analysis
III. Solving the problem: finding w and b
    1. Converting to vector form
    2. The objective function
    3. Setting the derivative to zero
    4. The final model
IV. Hidden problem: X^T X may not be a full-rank matrix
V. Solution to the hidden problem: regularization
    1. L1 regularization: Lasso regression
    2. L2 regularization: ridge regression
VI. Variants and applications of linear regression
VII. Python implementation
VIII. Linear models: regression and classification problems
I. Problem description
We have a data set $D = \{(\boldsymbol{x}_1, y_1), (\boldsymbol{x}_2, y_2), \ldots, (\boldsymbol{x}_m, y_m)\}$. Each sample is described by $d$ attributes, i.e. $\boldsymbol{x} = (x_1; x_2; \ldots; x_d)$, where $x_i$ is the value of sample $\boldsymbol{x}$ on the $i$-th attribute, and each sample $\boldsymbol{x}_i$ has a corresponding result value $y_i \in \mathbb{R}$.
Now a new sample $\boldsymbol{x}$ arrives, and we want to know its result value $y$.
II. Problem analysis
From the data set $D$ we need to learn a linear model $f(\boldsymbol{x}) = \boldsymbol{w}^{\mathrm{T}}\boldsymbol{x} + b$ to predict $y$; that is, we need to find $\boldsymbol{w}$ and $b$ such that $f(\boldsymbol{x}_i) \simeq y_i$.
III. Solving the problem: finding w and b
We can solve this with the least squares method.
1. Converting to vector form
First combine $\boldsymbol{w}$ and $b$ into a single vector $\hat{\boldsymbol{w}} = (\boldsymbol{w}; b)$ of size $(d+1) \times 1$.
Then rewrite the data matrix $X$ with one row per sample and a trailing column of ones:
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1d} & 1 \\ x_{21} & x_{22} & \cdots & x_{2d} & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{md} & 1 \end{pmatrix},$$
whose size is $m \times (d+1)$.
Finally write the labels in vector form as well: $\boldsymbol{y} = (y_1; y_2; \ldots; y_m)$.
2. The objective function
$$\hat{\boldsymbol{w}}^* = \arg\min_{\hat{\boldsymbol{w}}} \, (\boldsymbol{y} - X\hat{\boldsymbol{w}})^{\mathrm{T}} (\boldsymbol{y} - X\hat{\boldsymbol{w}})$$
Let $E_{\hat{\boldsymbol{w}}} = (\boldsymbol{y} - X\hat{\boldsymbol{w}})^{\mathrm{T}} (\boldsymbol{y} - X\hat{\boldsymbol{w}})$.
3. Setting the derivative to zero
Differentiating $E_{\hat{\boldsymbol{w}}}$ with respect to $\hat{\boldsymbol{w}}$ and setting the result to zero gives
$$\frac{\partial E_{\hat{\boldsymbol{w}}}}{\partial \hat{\boldsymbol{w}}} = 2 X^{\mathrm{T}} (X\hat{\boldsymbol{w}} - \boldsymbol{y}) = 0 \;\Rightarrow\; \hat{\boldsymbol{w}}^* = (X^{\mathrm{T}} X)^{-1} X^{\mathrm{T}} \boldsymbol{y},$$
provided $X^{\mathrm{T}} X$ is invertible.
4. The final model
$$f(\hat{\boldsymbol{x}}_i) = \hat{\boldsymbol{x}}_i^{\mathrm{T}} (X^{\mathrm{T}} X)^{-1} X^{\mathrm{T}} \boldsymbol{y}, \quad \text{where } \hat{\boldsymbol{x}}_i = (\boldsymbol{x}_i; 1).$$
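To make the normal equation concrete, here is a minimal NumPy sketch; it is my addition rather than the post's own code, and the small data set in it is made up purely for illustration.

import numpy as np

# Made-up data: m = 5 samples, d = 2 attributes (illustrative assumption)
X_raw = np.array([[1.0, 2.0], [2.0, 0.5], [3.0, 1.5], [4.0, 3.0], [5.0, 2.5]])
y = np.array([5.0, 4.5, 8.0, 12.0, 13.5])

# Append a column of ones so that w_hat = (w; b)
X = np.hstack([X_raw, np.ones((X_raw.shape[0], 1))])  # shape: m x (d+1)

# Normal equation: w_hat* = (X^T X)^{-1} X^T y
w_hat = np.linalg.inv(X.T @ X) @ X.T @ y
print('w =', w_hat[:-1], 'b =', w_hat[-1])

# Prediction for a new sample x: f(x) = (x; 1)^T w_hat
x_new = np.array([2.5, 1.0, 1.0])
print('prediction:', x_new @ w_hat)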
IV. Hidden problem: $X^{\mathrm{T}}X$ may not be a full-rank matrix
If $X^{\mathrm{T}}X$ is not a full-rank matrix, there is more than one optimal $\hat{\boldsymbol{w}}$, so which solution should be chosen as $\hat{\boldsymbol{w}}^*$?
For instance: when the number of samples is small and the number of feature attributes is large, even exceeding the number of samples, $X^{\mathrm{T}}X$ is not full rank and multiple solutions $\hat{\boldsymbol{w}}$ can be found.
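A minimal sketch of the symptom, again my addition with made-up dimensions: with more attributes than samples, $X^{\mathrm{T}}X$ is singular, and NumPy's pseudo-inverse picks one particular solution (the minimum-norm one) out of the many minimizers.

import numpy as np

# 3 samples but 5 attributes (plus the bias column): X^T X is singular here
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(3, 5)), np.ones((3, 1))])
y = rng.normal(size=3)

# Rank is at most 3 < 6, so X^T X has no ordinary inverse
print('rank of X^T X:', np.linalg.matrix_rank(X.T @ X))

# The pseudo-inverse returns the minimum-norm solution among the minimizers
w_hat = np.linalg.pinv(X) @ y
print('one of the optimal solutions:', w_hat)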
V. Solution to the hidden problem: regularization
Regularization selects a model that keeps both the empirical risk and the model complexity small.
1. L1 regularization: Lasso regression
Add the term $\lambda \lVert \boldsymbol{w} \rVert_1$ to the objective function.
The objective function then becomes
$$\min_{\boldsymbol{w}} \sum_{i=1}^{m} \left(y_i - \boldsymbol{w}^{\mathrm{T}}\boldsymbol{x}_i\right)^2 + \lambda \lVert \boldsymbol{w} \rVert_1 .$$
The first term is the empirical risk mentioned above; the second term controls the complexity of the model. Here $\lambda > 0$ controls the strength of the penalty: the larger $\lambda$ is, the heavier the penalty.
This is called Lasso regression.
As shown in the figure below (assuming only two attributes), the contour lines of the squared-error term and of the L1 regularizer often intersect on a coordinate axis. Such an intersection sets one attribute's weight to zero, so L1 performs a kind of feature selection and more readily yields a sparse solution than the L2 regularization below: the resulting $\boldsymbol{w}$ vector has fewer non-zero entries.
(Figure: contours of the squared-error term and the L1 regularizer in the two-attribute case.)
2. L2 regularization: ridge regression
Add the term $\lambda \lVert \boldsymbol{w} \rVert_2^2$ to the objective function instead.
The objective function then becomes
$$\min_{\boldsymbol{w}} \sum_{i=1}^{m} \left(y_i - \boldsymbol{w}^{\mathrm{T}}\boldsymbol{x}_i\right)^2 + \lambda \lVert \boldsymbol{w} \rVert_2^2 .$$
This is known as ridge regression.
L2 regularization shrinks the parameters evenly: it does not set coefficients exactly to zero, so it fails to reduce the number of terms, but it keeps all the fitted coefficients small and balanced. This is the difference in principle from L1 regularization.
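To see the sparsity difference in practice, here is a short sketch of mine (the synthetic data set is an assumption for illustration) comparing how many non-zero coefficients Lasso and Ridge learn on the same data.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data in which only 5 of 30 attributes are truly informative
x, y = make_regression(n_samples=60, n_features=30, n_informative=5,
                       noise=5.0, random_state=1)

lasso = Lasso(alpha=1.0).fit(x, y)
ridge = Ridge(alpha=1.0).fit(x, y)

# Lasso drives most coefficients exactly to zero; Ridge only shrinks them
print('Lasso non-zero coefficients:', int(np.sum(lasso.coef_ != 0)))
print('Ridge non-zero coefficients:', int(np.sum(ridge.coef_ != 0)))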
VI. Variants and applications of linear regression
If the quantity to be predicted is not itself a linear function of the attributes, we can instead make the model's prediction approximate some transformation of $y$.
For example, log-linear regression: $\ln y = \boldsymbol{w}^{\mathrm{T}}\boldsymbol{x} + b$.
More generally, $y = g^{-1}(\boldsymbol{w}^{\mathrm{T}}\boldsymbol{x} + b)$ for a monotone differentiable link function $g(\cdot)$; this is called the generalized linear model.
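As a sketch of the log-linear case (my addition, with made-up data): fit the linear model to $\ln y$, then apply the inverse link $\exp(\cdot)$ to predict $y$.

import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data roughly following y = exp(2x + 1) with noise
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=(50, 1))
y = np.exp(2.0 * x[:, 0] + 1.0 + rng.normal(scale=0.1, size=50))

# Log-linear regression: fit w^T x + b to ln y ...
model = LinearRegression().fit(x, np.log(y))
print('w =', model.coef_, 'b =', model.intercept_)

# ... then invert the link to predict y itself
y_pred = np.exp(model.predict(x))
print(y_pred[:5])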
VII. Python implementation
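The three snippets below assume that the feature data x and the label data y already exist. Purely as my own stand-in for obtaining test data, sklearn's built-in diabetes data set works:

from sklearn.datasets import load_diabetes

# Stand-in data so the snippets below can run; any feature/label arrays work
x, y = load_diabetes(return_X_y=True)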
1. Multiple linear regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# x, y are the attribute (feature) data and the label data respectively
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size=0.3, random_state=1)
model = LinearRegression()
model.fit(X_train, Y_train)
score = model.score(X_test, Y_test)
print('Model test score: ' + str(score))
Y_pred = model.predict(X_test)
print(Y_pred)
2. Ridge regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# x, y are the attribute (feature) data and the label data respectively
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size=0.3, random_state=1)
model = Ridge(alpha=1)
model.fit(X_train, Y_train)
score = model.score(X_test, Y_test)
print('Model test score: ' + str(score))
Y_pred = model.predict(X_test)
print(Y_pred)
3. Lasso regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

# x, y are the attribute (feature) data and the label data respectively
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size=0.3, random_state=1)
model = Lasso(alpha=0.1)
model.fit(X_train, Y_train)
score = model.score(X_test, Y_test)
print('Model test score: ' + str(score))
Y_pred = model.predict(X_test)
print(Y_pred)

VIII. Linear models: regression and classification problems
All of the above are linear models used to solve regression problems. In fact, linear models can also solve classification problems, via logistic regression (log-odds regression).
See: Logistic Regression (Logistic Regression), TT ya's blog on CSDN.
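For completeness, here is a minimal logistic-regression sketch in the same sklearn style as the snippets above; it is my addition, and the breast-cancer data set is just an illustrative choice.

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# x, y are the attribute data and the (binary) class labels respectively
x, y = load_breast_cancer(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size=0.3, random_state=1)
model = LogisticRegression(max_iter=5000)
model.fit(X_train, Y_train)
print('Model test accuracy: ' + str(model.score(X_test, Y_test)))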
Corrections and comments are welcome. Thank you!