Relationship between standardization, normalization and regularization
2022-07-19 10:07:00 【A Sheng】
First, some terminology: standardization is "Standardization" in English, normalization is "Normalization", and regularization is "Regularization". Standardization is a way of scaling features; note that standardized data is not necessarily normally distributed, because standardization does not change the shape of the original distribution. The purpose of normalization is to eliminate the influence of different dimensions and units and to improve comparability between features. The purpose of regularization is to prevent overfitting. The code for this article can be downloaded from reference [5].
I. Standardization
1. Standardized equation
The most commonly used standardization is Z-Score standardization. Simply put: subtract the mean, then divide by the standard deviation. The equation is as follows:

z = (x − μ) / σ

where μ is the sample mean and σ is the sample standard deviation.
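As a quick check, the equation can be computed directly with NumPy (a minimal sketch):

```python
import numpy as np

# Z-Score standardization by hand: z = (x - mu) / sigma, per column
X = np.array([[1., -1., 2.],
              [2.,  0., 0.],
              [0.,  1., -1.]])

mu = X.mean(axis=0)       # column means
sigma = X.std(axis=0)     # column standard deviations (population std, as sklearn uses)
Z = (X - mu) / sigma

print(Z.mean(axis=0))     # ~[0. 0. 0.]
print(Z.std(axis=0))      # ~[1. 1. 1.]
```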
2. Standardized visualization
The standardization process has 2 steps: step 1 subtracts the mean, step 2 divides by the standard deviation. Visually: the left panel shows the original data, the middle panel shows the data after subtracting the mean, and the right panel shows the data after also dividing by the standard deviation.
3. Standardized implementation
from sklearn import preprocessing
import numpy as np
X_train = np.array([[1., -1., 2.], [2., 0., 0.], [0., 1., -1.]])
X_train_result = preprocessing.scale(X_train, axis=0)  # axis=0: standardize by column
print(X_train_result)
print(X_train_result.mean(axis=0))  # column means are 0
print(X_train_result.std(axis=0))   # column standard deviations are 1
The output is as follows :
[[ 0. -1.22474487 1.33630621]
[ 1.22474487 0. -0.26726124]
[-1.22474487 1.22474487 -1.06904497]]
[0. 0. 0.]
[1. 1. 1.]
If you are training a machine learning model and want to apply the scaling parameters learned from the training set to the test set, use the StandardScaler class. The code is as follows:
from sklearn import preprocessing
import numpy as np
X_train = np.array([[1., -1., 2.], [2., 0., 0.], [0., 1., -1.]])
scaler = preprocessing.StandardScaler().fit(X_train)
X_scaled = scaler.transform(X_train)
print(X_scaled.mean(axis=0))  # column means are 0
print(X_scaled.std(axis=0))   # column standard deviations are 1
X_test = [[-1., 1., 0.]]  # standardize the test set using the training set's statistics, i.e. the column means and standard deviations of the training set
print(scaler.transform(X_test))
The output is as follows :
[0. 0. 0.]
[1. 1. 1.]
[[-2.44948974 1.22474487 -0.26726124]]
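What transform() does on the test set can be verified by hand: the fitted scaler stores the training-set statistics in its mean_ and var_ attributes (a small sketch):

```python
import numpy as np
from sklearn import preprocessing

X_train = np.array([[1., -1., 2.], [2., 0., 0.], [0., 1., -1.]])
scaler = preprocessing.StandardScaler().fit(X_train)

X_test = np.array([[-1., 1., 0.]])

# transform() uses the TRAINING set's mean and std, stored on the fitted scaler
manual = (X_test - scaler.mean_) / np.sqrt(scaler.var_)
print(np.allclose(manual, scaler.transform(X_test)))  # True
```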
II. Normalization
Normalization compresses the data into an interval, such as [0,1] or [-1,1]. The 2 most commonly used methods are as follows:
1.Min-Max Normalization
The equation is as follows:

x' = (x − x_min) / (x_max − x_min)

The normalized range is [0,1]. The code implementation is as follows:
import numpy as np
from sklearn import preprocessing as pp
X_train = np.array([[ 1., -5., 8.], [ 2., -3., 0.], [ 0., -1., 1.]])
scaler = pp.MinMaxScaler().fit(X_train) # The default data compression range is [0,1]
print(scaler.transform(X_train))
The output is as follows :
[[0.5 0. 1. ]
[1. 0.5 0. ]
[0. 1. 0.125]]
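The same result can be reproduced by hand from the Min-Max equation (a minimal sketch):

```python
import numpy as np

# Min-Max normalization by hand: x' = (x - min) / (max - min), per column
X = np.array([[1., -5., 8.], [2., -3., 0.], [0., -1., 1.]])
X_min, X_max = X.min(axis=0), X.max(axis=0)
X_scaled = (X - X_min) / (X_max - X_min)
print(X_scaled)   # matches the MinMaxScaler output above
```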
2.Mean Normalization
The equation is as follows:

x' = (x − μ) / (x_max − x_min)

where μ is the column mean. The normalized range is [-1,1]. (Strictly speaking, the code below uses MinMaxScaler with feature_range=(-1, 1), which linearly rescales each column to [-1,1] rather than applying this exact equation.)
import numpy as np
from sklearn import preprocessing as pp
X_train = np.array([[ 1., -5., 8.], [ 2., -3., 0.], [ 0., -1., 1.]])
scaler = pp.MinMaxScaler(feature_range=(-1, 1)) # Set the data compression range to [-1,1]
scaler = scaler.fit(X_train)
print(scaler.transform(X_train))
The output is as follows :
[[ 0. -1. 1. ]
[ 1. 0. -1. ]
[-1. 1. -0.75]]
The code above uses the MinMaxScaler class provided by scikit-learn's preprocessing submodule to implement normalization. MinMaxScaler has an important parameter, feature_range, which sets the range the data is compressed into; its default value is [0,1].
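Mean normalization is usually defined as x' = (x − mean) / (max − min), while MinMaxScaler(feature_range=(-1, 1)) computes 2·(x − min)/(max − min) − 1; the two are related but not identical. A sketch comparing them:

```python
import numpy as np

X = np.array([[1., -5., 8.], [2., -3., 0.], [0., -1., 1.]])
rng = X.max(axis=0) - X.min(axis=0)

# Strict mean normalization: x' = (x - mean) / (max - min), per column
mean_norm = (X - X.mean(axis=0)) / rng

# What MinMaxScaler(feature_range=(-1, 1)) computes: 2*(x - min)/(max - min) - 1
minmax_pm1 = 2 * (X - X.min(axis=0)) / rng - 1

print(mean_norm)
print(minmax_pm1)   # matches the MinMaxScaler output above
```

Both compress the data into (at most) [-1,1], but mean normalization centers each column around 0 rather than mapping min/max exactly to -1/1.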
III. Regularization
Simply put, the purpose of regularization is to prevent overfitting. Of course, there are other ways to prevent overfitting, such as reducing the feature dimension. First, what is the essence of overfitting? It is modeling noise as if it were a genuine characteristic of the data. For example, suppose a bird is injured and temporarily cannot fly; if a bird classifier is trained on such data, the noise "cannot fly" may be learned as a feature of the model, so that a normal bird that can fly is judged not to be a bird. This is admittedly a far-fetched example of overfitting, but it illustrates the problem. So how does regularization prevent overfitting?
For a given data set (x_i, y_i), i = 1, …, m, consider the simplest linear regression model with squared error as the loss function. The optimization objective is as follows:

min_w Σ_{i=1}^{m} (y_i − wᵀx_i)²
Introducing L2-norm regularization gives ridge regression, as shown below:

min_w Σ_{i=1}^{m} (y_i − wᵀx_i)² + λ‖w‖₂²,  λ > 0
Introducing L1-norm regularization gives LASSO (Least Absolute Shrinkage and Selection Operator) regression, as shown below:

min_w Σ_{i=1}^{m} (y_i − wᵀx_i)² + λ‖w‖₁,  λ > 0
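Ridge regression has a closed-form solution, w = (XᵀX + λI)⁻¹Xᵀy, which follows from the L2-penalized objective. A small sketch on synthetic data (the true weights and noise level are made up for illustration):

```python
import numpy as np

rng = np.random.RandomState(42)
X = rng.randn(50, 3)
y = X @ np.array([2., -1., 0.5]) + 0.1 * rng.randn(50)  # synthetic targets

lam = 1.0  # regularization strength (lambda)
# Closed-form ridge solution: w = (X^T X + lam * I)^(-1) X^T y
w = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
print(w)   # close to [2, -1, 0.5], shrunk slightly toward 0
```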
Suppose the data is 2-dimensional, with weights w = (w₁, w₂). From the equations above (the derivation is omitted here; if interested see [3]) one can draw the following figure, where the left panel shows the L1 case and the right panel the L2 case. It can be seen that as λ increases, more and more parameters become 0:
The goal is to find the weight values corresponding to the minimum of the loss function. The blue ellipses in the figure below are the contour lines of the squared-error term; at the center of the ellipses, the loss function (here excluding the regularization term) takes its minimum. Minimizing the total loss function (including the regularization term) means finding the minimum of the sum of the blue (error) term and the red (regularization) term.
The regularization coefficient λ is a constant that limits the complexity of the model, so that a complex model can be trained on a limited data set without severe overfitting. The smaller λ is, the smaller the penalty; in the extreme case λ = 0, overfitting can occur. The larger λ is, the heavier the penalty, and the model becomes prone to underfitting.
From the figure above you can see intuitively that with the L1 norm, the intersection of the squared-error contours and the regularization contours often lies on a coordinate axis, i.e. w₁ = 0 or w₂ = 0. With the L2 norm, the intersection usually lies inside a quadrant, i.e. w₁ ≠ 0 and w₂ ≠ 0. Thus L1 regularization yields sparse solutions more easily than L2. The advantage of L2 regularization is that it is differentiable everywhere; L1 regularization has corner points and is not differentiable everywhere, but it produces sparser solutions.
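This sparsity effect is easy to observe with scikit-learn's Ridge and Lasso estimators; the data below is synthetic and alpha plays the role of λ (note that sklearn's Lasso scales the error term by 1/(2m)):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Synthetic data: 10 features, only the first 3 actually matter
rng = np.random.RandomState(0)
X = rng.randn(100, 10)
true_w = np.zeros(10)
true_w[:3] = [3., -2., 1.5]
y = X @ true_w + 0.1 * rng.randn(100)

ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: shrinks all weights
lasso = Lasso(alpha=0.5).fit(X, y)   # L1 penalty: drives some weights to exactly 0

print("Ridge exact zeros:", np.sum(ridge.coef_ == 0))
print("Lasso exact zeros:", np.sum(lasso.coef_ == 0))
```

Lasso sets the coefficients of the 7 uninformative features to exactly 0, while Ridge merely shrinks them toward 0, which matches the geometric picture above.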
References:
[1] Preprocessing data: https://scikit-learn.org/stable/modules/preprocessing.html
[2] What does the "regularization" often mentioned in machine learning mean: https://www.zhihu.com/question/20924039
[3] Pattern Recognition and Machine Learning [Chinese edition]: https://url39.ctfile.com/f/2501739-616549609-5fe8d7?p=2096 (access password: 2096)
[4] Pattern Recognition and Machine Learning [English edition]: https://url39.ctfile.com/f/2501739-616549614-9897ab?p=2096 (access password: 2096)
[5] Relationship between standardization, normalization and regularization.py: https://url39.ctfile.com/f/2501739-616549789-16fc6a?p=2096 (access password: 2096)