当前位置:网站首页>Freshman task-5
Freshman task-5
2022-07-19 04:49:00 【Speaking of mountains and rivers】
List of articles
One 、Funk-SVD decomposition algorithm
For a recommendation system , The relationship between users and items can be organized into a matrix as follows .
| User-Item | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | x | 4.5 | 2.0 | x |
| 2 | 4.0 | x | 3.5 | x |
| 3 | x | 5.0 | x | 2.0 |
| 4 | x | 3.5 | 4.0 | 1.0 |
Each row in the matrix represents a user , And each column represents an item . If the user has rated the item , Then the position where the row corresponding to the user crosses the column corresponding to the article in the matrix represents the user's rating value of the article . In the matrix ‘x’ The item is not rated on behalf of the user . This matrix is called User-Item Scoring matrix , After the actual statistics, most of the numbers in this matrix are now displayed as question marks .
What the recommendation system needs to do is for any user , Predict the score of all non rated items , And recommend the corresponding items to users in the order of score from high to low .
What we need to do is find out the matrix ‘x’ Value .
and S V D SVD SVD The algorithm requires three different matrices , F u n k − S V D Funk-SVD Funk−SVD The algorithm only needs two matrices . As shown in the figure below

At this point we have a formula
R m × n ≈ P m × k Q k × n = M ^ m × n R_{m \times n} \approx P_{m \times k} Q_{k \times n} = \hat{M}_{m \times n} Rm×n≈Pm×kQk×n=M^m×n
Scoring matrix R R R It's a m × n m \times n m×n Matrix , Altogether m m m Users , n n n Items . Through a series of operations, the matrix R R R Into two matrices P P P and Q Q Q, matrix U U U Its size is m × k m \times k m×k, matrix Q Q Q Its size is k × n k \times n k×n.
Because the matrix R R R Unknown in , We are just fitting this matrix , So it is approximately equal to .
This method is based on the assumption : Users' preference for an item is mainly determined by k k k Three factors determine , P n i P_{ni} Pni It means the first one n n n Users to i i i The degree of preference for factors , and Q i x Q_{ix} Qix It means the first one x x x Items meet the requirements of i i i Degree of factors , R n x R_{nx} Rnx Represent user n n n For items x x x Final preference .
At this time, there are several problems , Indicators for evaluating the degree of fitting ? How to get P , Q P,Q P,Q Two matrix ?
First answer the first question , The higher the degree of fitting, the better P Q P \ Q P Q The closer the product of two matrices is to the matrix R R R. Here we use S S E SSE SSE ( Sum squared ) To express , Then there is the formula .
S S E = E U , I 2 = ∑ U , I ( R U , I − R ^ U , I ) 2 SSE = E_{U,I}^2 = {\textstyle \sum_{U,I}}(R_{U,I} \ - \ \hat{R}_{U,I})^2 SSE=EU,I2=∑U,I(RU,I − R^U,I)2
Now the problem is to seek loss S S E SSE SSE The smallest matrix P P P and Q Q Q. This also answers how to get P , Q P,Q P,Q The problem of two matrices .
Two 、 gradient descent
For functions with multidimensional variables , The gradient of 0 There are three situations in the point of —— Maximum 、 Minimum 、 saddle point . The minimum is the most stable fixed point of the gradient descent process . The iterative process can refer to the flow direction of water when it rains , Water always collects in the pit ( Minimum ) Inside .
But what we are looking for is a negative gradient . for instance
function f ( x ) = x 2 f(x)=x^2 f(x)=x2 It's a convex function , Satisfy f ( x 1 + x 2 2 ) ≤ f ( x 1 ) + f ( x 2 ) 2 f(\frac{x_1+x_2}{2}) \le \frac{f(x_1)+f(x_2)}{2} f(2x1+x2)≤2f(x1)+f(x2). The image is as follows :

What if (-5,25) Move towards the lowest point , Its gradient at this time is -10, Should go to x Moving in the positive direction makes the gradient 0.
What if (5,25) Move towards the lowest point , Its gradient at this time is 10, Should go to x Moving in the negative direction makes the gradient 0.
Let's go back to the loss function , There are two steps
1. Solve the gradient of the loss function
S S E = E U , I 2 = ∑ U , I ( R U , I − ∑ k = 1 K P U , k Q k , I ) 2 SSE = E_{U,I}^2 = {\textstyle \sum_{U,I}}(R_{U,I} \ - \ {\textstyle \sum_{k=1}^{K}}P_{U,k} \ Q_{k,I} )^2 SSE=EU,I2=∑U,I(RU,I − ∑k=1KPU,k Qk,I)2
S S E SSE SSE It's about P P P and Q Q Q Multivariate function of , When randomly selected U U U and I I I after , You need to enumerate all k k k, And right P U , k P_{U,k} PU,k as well as Q k , I Q_{k,I} Qk,I Find the partial derivative .
∂ ∂ P U , k E U , I 2 = 2 E U , I ∂ E U , I ∂ P U , k = − 2 E U , I Q k , I \frac{\partial}{\partial P_{U,k}}{E_{U,I}}^2 = 2 E_{U,I}\frac{\partial E_{U,I}}{\partial P_{U,k}} = -2E_{U,I}Q_{k,I} ∂PU,k∂EU,I2=2EU,I∂PU,k∂EU,I=−2EU,IQk,I
∂ ∂ Q k , I E U , I 2 = 2 E U , I ∂ E U , I ∂ Q k , I = − 2 E U , I P U , k \frac{\partial}{\partial Q_{k,I}}{E_{U,I}}^2 = 2 E_{U,I}\frac{\partial E_{U,I}}{\partial Q_{k,I}} = -2E_{U,I}P_{U,k} ∂Qk,I∂EU,I2=2EU,I∂Qk,I∂EU,I=−2EU,IPU,k
2. Update variables according to negative gradient changes
P U , k = P U , k − α ( − 2 E U , I Q k , I ) = P u , k + 2 α E U , I Q k , I P_{U,k} = P_{U,k} - \alpha (-2E_{U,I}Q_{k,I}) = P_{u,k} + 2 \alpha E_{U,I}Q_{k,I} PU,k=PU,k−α(−2EU,IQk,I)=Pu,k+2αEU,IQk,I
Q k , I = Q k , I − α ( − 2 E U , I P U , k ) = Q k , I + 2 α E U , I P U , k Q_{k,I} = Q_{k,I} - \alpha (-2E_{U,I}P_{U,k}) = Q_{k,I} + 2 \alpha E_{U,I}P_{U,k} Qk,I=Qk,I−α(−2EU,IPU,k)=Qk,I+2αEU,IPU,k
This concludes the derivation of the formula , P Q P \ Q P Q The element values initialized by the two matrices are set to random numbers .
3、 ... and 、 Code instance
from math import *
import numpy as np
def matrix_factorization(R, P, Q, steps = 5000, alpha = 0.0002):
Q = Q.T
for _ in range(steps):
for i in range(len(R)):
for j in range(len(R[i])):
eij = R[i][j] - np.dot(P[i,:],Q[:,j])
for k in range(K):
if R[i][j] > 0:
P[i][k] = P[i][k] + 2 * alpha * eij * Q[k][j]
Q[k][j] = Q[k][j] + 2 * alpha * eij * P[i][k]
# SSE
e = 0
for i in range(len(R)):
for j in range(len(R[i])):
if R[i][j]>0:
e = e + pow(R[i][j] - np.dot(P[i,:],Q[:,j]),2)
# Is result convergence?
if e < 0.001:
break
return P,Q.T
if __name__ == '__main__':
R = [
[0,4.5,2,0],
[4,0,3.5,0],
[0,5,0,2],
[0,3.5,4,1]
]
R = np.array(R)
# The Row of Matrix R
M = len(R)
# The Column of Matrix R
N = len(R[0])
# The Hidden factor number
K = 2
# Get a random matrix P : M rows K columns
P = np.random.rand(M,K)
# Get a random matrix Q : N rows K columns
Q = np.random.rand(N,K)
new_P, new_Q = matrix_factorization(R,P,Q)
print("The original matrix is : ")
print(R)
print("The new matrix is : ")
R_MF = np.dot(new_P,new_Q.T)
print(R_MF)
Input : Primitive matrix and hidden factor K. As shown below
[
[0,4.5,2,0],
[4,0,3.5,0],
[0,5,0,2],
[0,3.5,4,1]
]
The default setting in the code example K by 2.
Output : The matrix fitted by matrix decomposition . From this, we can get the unknown score .

边栏推荐
- minio安装部署及使用
- Warriors of the Visual Studio, Assemble! (Visual Studio的勇士们,汇编吧!) 原创 2009年07月12日 19:40:00 标签:汇编 /mic
- [Unity] 交互之双击
- Record a stored procedure to batch modify the table structure
- Notes for Resume Writing
- 【Lipschitz】基于matlab的Lipschitz李氏指数仿真
- 数据库学习笔记(一)检索数据
- 使用Everything清理垃圾文件
- [TA frost wolf \u may- hundred people plan] Introduction to basic rendering lighting (I)
- Simple UI funny text conversion Emoji expression wechat applet supports sentence word conversion_ Source code
猜你喜欢
![[Unity] Input. Index of gettouch[index]](/img/9d/ec0e4c9e6b1bb25976357469fe037f.png)
[Unity] Input. Index of gettouch[index]

数据库与开源的未来
![[FPGA tutorial case 26] realize the basic operation of decimals through Verilog in FPGA](/img/94/a123d212ccde207395d72e9563012c.png)
[FPGA tutorial case 26] realize the basic operation of decimals through Verilog in FPGA

Only when the data analysis report is written in this way can we really understand the data

复旦微FMQL(国产Zynq) 【IAR裸机开发之PS】——非字节对齐访问

mysql主从架构和读写分离、以及高可用架构
![Fudan micro fmql (domestic zynq) [PS of IAR bare metal development] - non byte aligned access](/img/e7/8349da2c240ac8f51524e5e51bd4d6.png)
Fudan micro fmql (domestic zynq) [PS of IAR bare metal development] - non byte aligned access

C list set object de duplication LINQ de duplication with time de duplication

Problems encountered by Sphinx

赚钱大师小程序【最新版5.9.9】商城/佣金即时提现/分销推广/话费充值/美团饿了么外卖
随机推荐
[论文精读]BERT
Unit UMP Packaging Black Screen issue Summary
【TA-霜狼_may-《百人计划》】图形2.5 Bump Mapping
Money making master applet [latest version 5.9.9] mall / instant withdrawal of commission / distribution promotion / phone recharge / is meituan hungry for takeout
Usage scenarios and usage of judgment and rounding down in MySQL
Redis 集群面试题
[TA frost wolf \u may - hundred people plan] Figure 2.5 bump mapping
数据库学习笔记(一)检索数据
EasyExcel简单使用
minio安装部署及使用
使用小丸工具箱进行极限视频压缩
Deleting snapshot: error deleting snapshot: Dictionary problem
'ionic' is not an internal or external command, nor is it a runnable program or batch file.
数据库与开源的未来
PowerDesigner displays comment comments
高等数学笔记:关于等价无穷小替换的一个猜想
Common PostgreSQL data operation notes (updated from time to time)
Wildfly: how to call EJBs from EJBs located in another application
[FPGA tutorial case 26] realize the basic operation of decimals through Verilog in FPGA
And predicts that nerf will eventually replace deepfake