
Freshman task-5

2022-07-19 04:49:00 【Speaking of mountains and rivers】

Table of contents

I. Funk-SVD decomposition algorithm
II. Gradient descent
III. Code example

I. Funk-SVD decomposition algorithm

For a recommendation system, the relationship between users and items can be organized into a matrix like the following.

User-Item	1	2	3	4
1	x	4.5	2.0	x
2	4.0	x	3.5	x
3	x	5.0	x	2.0
4	x	3.5	4.0	1.0

Each row of the matrix represents a user, and each column represents an item. If a user has rated an item, the entry where that user's row crosses that item's column holds the user's rating of the item. An 'x' in the matrix means the user has not rated the item. This matrix is called the User-Item rating matrix. In real data, most entries of this matrix are unrated, that is, they would be shown as 'x'.

What the recommendation system needs to do is, for any user, predict the scores of all the items that user has not rated, and recommend those items to the user in order of predicted score from high to low.

In other words, we need to find the values of the 'x' entries in the matrix.

The classical $SVD$ algorithm decomposes a matrix into three matrices, whereas the $Funk$-$SVD$ algorithm needs only two, as shown in the figure below.

This gives us the formula

$R_{m \times n} \approx P_{m \times k} Q_{k \times n} = \hat{R}_{m \times n}$

The rating matrix $R$ is an $m \times n$ matrix, with $m$ users and $n$ items in total. Through a series of operations, the matrix $R$ is factored into two matrices $P$ and $Q$: the matrix $P$ has size $m \times k$, and the matrix $Q$ has size $k \times n$.

Because some entries of the matrix $R$ are unknown, we can only fit this matrix, which is why the relation is an approximate equality.

This method rests on the following assumption: a user's preference for an item is mainly determined by $k$ factors. $P_{ni}$ is the degree to which user $n$ prefers factor $i$, $Q_{ix}$ is the degree to which item $x$ satisfies factor $i$, and $R_{nx}$ is user $n$'s final preference for item $x$.
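The factor assumption can be made concrete with a small sketch. The factor values below are invented purely for illustration (they do not come from the rating table above), and the helper name `predict` is hypothetical; `P` is users × factors and `Q` is factors × items.

```python
import numpy as np

# Hypothetical factor matrices with k = 2 factors (values invented for illustration)
P = np.array([[1.0, 0.5],        # user 0's preference for factor 0 and factor 1
              [0.2, 2.0]])       # user 1's preference for factor 0 and factor 1
Q = np.array([[3.0, 1.0, 0.0],   # how strongly each of 3 items satisfies factor 0
              [1.0, 0.5, 2.0]])  # how strongly each of 3 items satisfies factor 1

def predict(P, Q, n, x):
    """Predicted preference of user n for item x: sum over factors i of P[n, i] * Q[i, x]."""
    return float(np.dot(P[n, :], Q[:, x]))

# The full predicted matrix is the matrix product of P and Q
R_hat = P @ Q

print(predict(P, Q, 0, 0))  # 1.0*3.0 + 0.5*1.0 = 3.5
print(R_hat.shape)          # (2, 3): one row per user, one column per item
```

Each predicted score is a dot product of one row of `P` with one column of `Q`, so a user who weights a factor heavily scores items that satisfy that factor highly.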

Two questions arise at this point. What metric evaluates how good the fit is? And how do we obtain the two matrices $P, Q$?

To answer the first question: the better the fit, the closer the product of the two matrices $P, Q$ is to the matrix $R$. Here we measure this with $SSE$ (the sum of squared errors), which gives the following formula.

$SSE = {\textstyle \sum_{U,I}} E_{U,I}^2 = {\textstyle \sum_{U,I}}(R_{U,I} \ - \ \hat{R}_{U,I})^2$

The problem now is to find the matrices $P$ and $Q$ that minimize the loss $SSE$. This also answers the question of how to obtain the two matrices $P, Q$.
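As a rough sketch, the loss can be computed as follows. It assumes, as the code later in this post does, that a 0 in `R` marks an unrated entry, so only rated entries contribute to the sum; the function name `sse` is chosen here for illustration.

```python
import numpy as np

def sse(R, P, Q):
    """Sum of squared errors between R and P @ Q, counted only where R has a rating.

    Assumption: a 0 in R marks an unrated entry.
    """
    mask = R > 0                  # True where a rating exists
    diff = (R - P @ Q)[mask]      # errors on the rated entries only
    return float(np.sum(diff ** 2))

R = np.array([[0, 4.5, 2, 0],
              [4, 0, 3.5, 0],
              [0, 5, 0, 2],
              [0, 3.5, 4, 1]], dtype=float)

# With all-zero factors every prediction is 0, so the SSE is the sum of squared ratings
P0 = np.zeros((4, 2))
Q0 = np.zeros((2, 4))
print(sse(R, P0, Q0))  # 110.75
```

Restricting the sum to rated entries matters: otherwise the fit would be pulled toward predicting 0 for every unrated item.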

II. Gradient descent

For a function of several variables, a point where the gradient is 0 can be one of three kinds: a maximum, a minimum, or a saddle point. A minimum is the most stable fixed point of the gradient descent process. The iteration can be compared to the way water flows when it rains: water always collects in a pit (a minimum).

What we follow, however, is the negative gradient. For example:

The function $f(x)=x^2$ is a convex function: it satisfies $f(\frac{x_1+x_2}{2}) \le \frac{f(x_1)+f(x_2)}{2}$. Its graph is as follows:

Suppose we start at (-5, 25) and want to move toward the lowest point. The gradient there is -10, so we should move in the positive x direction until the gradient becomes 0.

Suppose we start at (5, 25) and want to move toward the lowest point. The gradient there is 10, so we should move in the negative x direction until the gradient becomes 0.
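The two cases above can be sketched in a few lines. The learning rate `alpha = 0.1` and the step count are assumptions chosen for illustration, not values from the post.

```python
def gradient_descent(x0, alpha=0.1, steps=100):
    """Minimize f(x) = x**2 by repeatedly stepping against its gradient f'(x) = 2*x."""
    x = x0
    for _ in range(steps):
        grad = 2 * x           # gradient of x**2 at the current point
        x = x - alpha * grad   # move in the direction of the negative gradient
    return x

# One step from x = -5: the gradient is -10, so x increases (moves in the positive direction)
print(gradient_descent(-5.0, steps=1))
# One step from x = 5: the gradient is 10, so x decreases (moves in the negative direction)
print(gradient_descent(5.0, steps=1))
# After many steps both starting points end up very close to the minimum at x = 0
print(gradient_descent(-5.0), gradient_descent(5.0))
```

Subtracting `alpha * grad` handles both cases with one rule: a negative gradient pushes `x` up, and a positive gradient pushes it down.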

Let's return to the loss function. There are two steps.

1. Compute the gradient of the loss function

$SSE = {\textstyle \sum_{U,I}} E_{U,I}^2 = {\textstyle \sum_{U,I}}(R_{U,I} \ - \ {\textstyle \sum_{k=1}^{K}}P_{U,k} \ Q_{k,I} )^2$

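$SSE$ is a multivariate function of $P$ and $Q$. After randomly selecting a pair $U$ and $I$, we enumerate every $k$ and take the partial derivative of that pair's squared error $E_{U,I}^2$ with respect to $P_{U,k}$ and $Q_{k,I}$. Here $E_{U,I} = R_{U,I} - {\textstyle \sum_{k=1}^{K}}P_{U,k} \ Q_{k,I}$, so $\frac{\partial E_{U,I}}{\partial P_{U,k}} = -Q_{k,I}$ and $\frac{\partial E_{U,I}}{\partial Q_{k,I}} = -P_{U,k}$.

$\frac{\partial}{\partial P_{U,k}}{E_{U,I}}^2 = 2 E_{U,I}\frac{\partial E_{U,I}}{\partial P_{U,k}} = -2E_{U,I}Q_{k,I}$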

$\frac{\partial}{\partial Q_{k,I}}{E_{U,I}}^2 = 2 E_{U,I}\frac{\partial E_{U,I}}{\partial Q_{k,I}} = -2E_{U,I}P_{U,k}$

2. Update the variables along the negative gradient

$P_{U,k} = P_{U,k} - \alpha (-2E_{U,I}Q_{k,I}) = P_{U,k} + 2 \alpha E_{U,I}Q_{k,I}$

$Q_{k,I} = Q_{k,I} - \alpha (-2E_{U,I}P_{U,k}) = Q_{k,I} + 2 \alpha E_{U,I}P_{U,k}$

This concludes the derivation of the formulas. The elements of the two matrices $P, Q$ are initialized to random numbers.
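The two update rules can be sketched for a single rating as below. The function name `sgd_step`, the tiny 1 × 1 example and `alpha = 0.05` are assumptions made for illustration; `P` is $m \times k$ and `Q` is $k \times n$, as in the formulas above.

```python
import numpy as np

def sgd_step(R, P, Q, U, I, alpha):
    """Apply one gradient update for the single rating R[U, I] and return the error before it."""
    e = R[U, I] - P[U, :] @ Q[:, I]      # E_{U,I} = R_{U,I} - sum_k P_{U,k} Q_{k,I}
    p_old = P[U, :].copy()               # keep old P so both updates use the same gradient point
    P[U, :] += 2 * alpha * e * Q[:, I]   # P_{U,k} <- P_{U,k} + 2*alpha*E*Q_{k,I} for every k
    Q[:, I] += 2 * alpha * e * p_old     # Q_{k,I} <- Q_{k,I} + 2*alpha*E*P_{U,k} for every k
    return float(e)

# Tiny example: one user, one item, k = 2 factors, true rating 4
R = np.array([[4.0]])
P = np.array([[1.0, 1.0]])
Q = np.array([[1.0], [1.0]])

error_before = sgd_step(R, P, Q, 0, 0, alpha=0.05)
error_after = float(R[0, 0] - P[0, :] @ Q[:, 0])
print(error_before, error_after)  # the error shrinks after the update
```

Copying the old row of `P` before updating it keeps the step faithful to the derivation, where both partial derivatives are evaluated at the same point.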

III. Code example

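import numpy as np

def matrix_factorization(R, P, Q, steps = 5000, alpha = 0.0002):
    """Fit R ~ P * Q^T by gradient descent on the rated (non-zero) entries of R."""
    # Q arrives as N x K; work on its K x N transpose (a view, so Q is updated in place)
    Q = Q.T
    # Number of hidden factors, read from P instead of relying on a global K
    K = P.shape[1]

    for _ in range(steps):
        for i in range(len(R)):
            for j in range(len(R[i])):
                # A 0 in R marks an unrated item, so it is skipped
                if R[i][j] > 0:
                    eij = R[i][j] - np.dot(P[i,:],Q[:,j])
                    for k in range(K):
                        # Keep the old P value so both updates use the same gradient point
                        p_ik = P[i][k]
                        P[i][k] = p_ik + 2 * alpha * eij * Q[k][j]
                        Q[k][j] = Q[k][j] + 2 * alpha * eij * p_ik

        # SSE over the rated entries
        e = 0
        for i in range(len(R)):
            for j in range(len(R[i])):
                if R[i][j]>0:
                    e = e + (R[i][j] - np.dot(P[i,:],Q[:,j])) ** 2

        # Stop early once the fit has converged
        if e < 0.001:
            break

    return P,Q.T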
 
if __name__ == '__main__':
    R = [
        [0,4.5,2,0],
        [4,0,3.5,0],
        [0,5,0,2],
        [0,3.5,4,1]
    ]
    R = np.array(R)
    # The Row of Matrix R
    M = len(R)
    # The Column of Matrix R
    N = len(R[0])
    # The Hidden factor number
    K = 2
    # Get a random matrix P : M rows K columns
    P = np.random.rand(M,K)
    # Get a random matrix Q : N rows K columns
    Q = np.random.rand(N,K)
    new_P, new_Q = matrix_factorization(R,P,Q)

    print("The original matrix is : ")
    print(R)

    print("The new matrix is : ")
    R_MF = np.dot(new_P,new_Q.T)
    print(R_MF)

Input: the original matrix and the number of hidden factors K, as shown below.
[
    [0,4.5,2,0],
    [4,0,3.5,0],
    [0,5,0,2],
    [0,3.5,4,1]
]

The code example sets K to 2.

Output: the matrix fitted by matrix factorization, from which the unknown scores can be read off.
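Once a fitted matrix is available, the recommendation step described at the start (rank each user's unrated items by predicted score) might look like the sketch below. The values in `R_hat` are made up for illustration only and are not the output of an actual run; the function name `recommend` and the parameter `top_n` are likewise hypothetical.

```python
import numpy as np

def recommend(R, R_hat, user, top_n=2):
    """Return the indices of the user's unrated items, highest predicted score first."""
    unrated = np.where(R[user] == 0)[0]         # a 0 in R marks an unrated item
    order = np.argsort(-R_hat[user, unrated])   # sort by predicted score, descending
    return [int(j) for j in unrated[order][:top_n]]

R = np.array([[0, 4.5, 2, 0],
              [4, 0, 3.5, 0],
              [0, 5, 0, 2],
              [0, 3.5, 4, 1]], dtype=float)

# Made-up predictions for illustration; a real run would use np.dot(new_P, new_Q.T)
R_hat = np.array([[3.9, 4.4, 2.1, 1.5],
                  [4.0, 4.8, 3.4, 1.8],
                  [4.2, 5.0, 3.6, 2.0],
                  [3.1, 3.6, 3.9, 1.1]])

print(recommend(R, R_hat, 0))  # [0, 3]: user 0 has not rated items 0 and 3
```

Only unrated items are ranked, since items the user has already rated do not need to be recommended.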