Session-Based Recommendations with Recurrent Neural Networks
2022-07-26 09:58:00 【qq_53430003】
1. Paper introduction
Venue: ICLR (International Conference on Learning Representations)
Year: 2016
First author: Balázs Hidasi, Gravity R&D Inc., Budapest, Hungary
Corresponding author: Linas Baltrunas, Netflix, Los Gatos, CA, USA
Paper link: https://arxiv.org/pdf/1511.06939.pdf
2. Content
2.1 Abstract
The abstract states the core proposal of the paper: "We therefore propose an RNN-based approach for session-based recommendations. Our approach also considers practical aspects of the task and introduces several modifications to classic RNNs such as a ranking loss function that make it more viable for this specific problem."
2.2 Introduction
The introduction first points out that session-based recommendation is a relatively neglected problem, because session behavior is hard to collect: many e-commerce platforms do not track the IDs of users who visit them over long periods, and even where tracking is possible, many users leave only one or two sessions on the site. Cookies can provide a certain degree of user identification, but they raise privacy concerns. In some domains, such as classified-ad sites, user behavior tends to be session-based anyway, so consecutive sessions of the same user should be handled independently. Consequently, most session-based recommender systems use relatively simple methods that cannot exploit user profiles, e.g. item-to-item similarity.
The common approaches in recommender systems are factor models (i.e. matrix factorization) and neighborhood methods. Meanwhile, deep neural networks have been highly successful on images and speech, and RNNs are the method of choice for modeling sequential data.
The authors therefore propose applying RNNs to the recommendation setting, specifically to session-based recommendation, and report success with it.
The model treats the first item a user clicks after entering the website as the initial input of the RNN, and recommendations are queried based on this initial input; each consecutive click of the user then produces recommendations that depend on all of the previous clicks.
The paper trains the RNN with a ranking loss function.
2.3 Related work
2.3.1 Session-based recommendation
This subsection surveys existing approaches. The first family is matrix factorization and neighborhood methods. Neighborhood methods mainly use an item-to-item similarity matrix: items that are often clicked together are considered similar, and items similar to the one the user just clicked are recommended. This is simple and effective, but it only considers the user's last click and ignores the earlier click history.
Another approach is Markov decision processes (MDPs). In its simplest form this is a first-order Markov chain, where the next recommendation can be computed directly from the transition probabilities between items. However, as the number of possible user selection sequences grows, the state space quickly becomes unmanageable.
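As a toy illustration of the first-order Markov chain idea (my own sketch, not code from the paper; the sessions and item IDs are made up), one can count item-to-item transitions and recommend the most probable successors:

from collections import Counter, defaultdict

# Hypothetical click sessions; each inner list is one session's item IDs.
sessions = [[1, 2, 3], [1, 2, 4], [2, 3], [1, 2, 3, 4]]

# Count first-order transitions item_a -> item_b.
transitions = defaultdict(Counter)
for s in sessions:
    for a, b in zip(s, s[1:]):
        transitions[a][b] += 1

def recommend(last_item, k=2):
    # Rank candidate next items by empirical transition frequency.
    return [item for item, _ in transitions[last_item].most_common(k)]

print(recommend(2))  # -> [3, 4]

The state-space problem is visible even here: a first-order chain keys only on the last item, and keying on longer histories makes the table grow combinatorially.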
Finally, an extended version of the General Factorization Framework can make use of session data, but it does not consider any ordering within the session.
2.3.2 Deep learning in recommenders
This subsection surveys deep learning methods in recommenders. One of the first was collaborative filtering with restricted Boltzmann machines (RBMs); later work used deep networks to extract features from item content. Such methods are especially useful when there is not enough user-item interaction data.
2.4 Recommendations with RNNs
The paper first gives the forward-propagation formula of a standard RNN, and then that of the GRU (gated recurrent unit), because the RNN actually used in the paper is a GRU.
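For reference, these are the standard definitions, which match the form used in the paper ($x_t$ is the input at step $t$, $h_t$ the hidden state, $\sigma$ the logistic sigmoid, $\odot$ the element-wise product):

Standard RNN:
$$h_t = g(W x_t + U h_{t-1})$$

GRU:
$$z_t = \sigma(W_z x_t + U_z h_{t-1}) \qquad \text{(update gate)}$$
$$r_t = \sigma(W_r x_t + U_r h_{t-1}) \qquad \text{(reset gate)}$$
$$\hat{h}_t = \tanh\!\big(W x_t + U (r_t \odot h_{t-1})\big) \qquad \text{(candidate state)}$$
$$h_t = (1 - z_t)\, h_{t-1} + z_t\, \hat{h}_t$$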
The figure shows the structure of the whole network: the input is the one-hot encoding of the actual item of the session, and the output is the predicted preference score for every item. The input passes through an embedding layer into the multi-layer GRU stack, and a feedforward layer can be added between the last GRU layer and the output.
Later the author notes that the code uses only a single GRU layer, because that works best.
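A minimal PyTorch sketch of this architecture (my own illustration, not the paper's code; the class and parameter names are made up, and the dimensions follow the experiment settings mentioned later):

import torch
import torch.nn as nn

class GRU4RecSketch(nn.Module):
    def __init__(self, n_items, embed_dim=100, hidden_dim=100, n_layers=1):
        super().__init__()
        self.embedding = nn.Embedding(n_items, embed_dim)   # embedding layer on the item ID
        self.gru = nn.GRU(embed_dim, hidden_dim, n_layers)  # a single GRU layer works best
        self.out = nn.Linear(hidden_dim, n_items)           # feedforward layer -> item scores

    def forward(self, item_ids, hidden):
        # item_ids: (batch,) current item of each parallel session
        # hidden:   (n_layers, batch, hidden_dim), zeroed whenever a session starts
        emb = self.embedding(item_ids).unsqueeze(0)         # (1, batch, embed_dim)
        output, hidden = self.gru(emb, hidden)              # one step per click
        scores = self.out(output.squeeze(0))                # (batch, n_items) preference scores
        return scores, hidden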
Next comes how the input data is formed.
Since the collected sessions have different lengths, a sliding window over the sequence, as used in NLP, cannot be applied directly. The paper instead uses session-parallel mini-batches: the sessions are first ordered, then the first event of the first X sessions forms the input of the first mini-batch (X is 3 in the figure), and the desired output is the next event of each of those sessions. Whenever one of the sessions ends, the first event of the next available session takes its place. A session's last event is never used as input, because it has no corresponding prediction target (e.g. i2,3 in the figure is not used as input). A sketch of this batching follows below.
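A rough sketch of session-parallel mini-batching (my own illustration; it assumes every session has at least two events, which the preprocessing guarantees, and it simply stops when the sessions run out, whereas the real code masks finished slots instead):

import numpy as np

def session_parallel_batches(sessions, batch_size):
    # sessions: list of item-ID lists, one list per session, already ordered
    next_session = batch_size              # index of the next session to pull in
    slots = list(range(batch_size))        # which session each slot currently holds
    pos = [0] * batch_size                 # current position inside each slot's session
    while True:
        inp = np.array([sessions[slots[i]][pos[i]] for i in range(batch_size)])
        tgt = np.array([sessions[slots[i]][pos[i] + 1] for i in range(batch_size)])
        yield inp, tgt                     # input events and their next-event targets
        for i in range(batch_size):
            pos[i] += 1
            if pos[i] + 1 >= len(sessions[slots[i]]):
                # Session exhausted: pull in the next available session.
                # In the real code the GRU hidden state of this slot is reset to zero here.
                if next_session >= len(sessions):
                    return
                slots[i] = next_session
                pos[i] = 0
                next_session += 1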
When there are many items, negative sampling is needed to reduce the dimensionality of the output, because computing scores for every item at every step becomes very expensive, which is impractical. The negative samples are the items appearing in the other training examples of the same mini-batch; since those are real click events, this implicitly samples negatives in proportion to item popularity.
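As a side note, sampling negatives explicitly in proportion to popularity (a sketch of the general idea, not the paper's in-batch trick; the counts are made up) could look like:

import numpy as np

item_support = np.array([50, 30, 15, 5])       # hypothetical click counts per item
probs = item_support / item_support.sum()      # popularity-proportional probabilities
negatives = np.random.choice(len(item_support), size=2, replace=False, p=probs)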
Next come the two pairwise ranking losses used in the paper (the formula images are reproduced here from the paper). Let $\hat{r}_{s,i}$ be the score of the positive (next) item, $\hat{r}_{s,j}$ the scores of the $N_S$ sampled negative items, and $\sigma$ the sigmoid.

1. BPR (Bayesian Personalized Ranking):
$$L_s = -\frac{1}{N_S} \sum_{j=1}^{N_S} \log \sigma(\hat{r}_{s,i} - \hat{r}_{s,j})$$

2. TOP1, a regularized approximation of the relative rank of the positive item, devised by the authors for this task:
$$L_s = \frac{1}{N_S} \sum_{j=1}^{N_S} \sigma(\hat{r}_{s,j} - \hat{r}_{s,i}) + \sigma(\hat{r}_{s,j}^2)$$
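A minimal PyTorch sketch of the two losses under the mini-batch sampling scheme described above, where the score matrix is square and its diagonal holds the positive scores (my own illustration; the GRU4REC-pytorch reimplementation used below structures its losses the same way, including the small constant bias contributed by the diagonal term):

import torch

def bpr_loss(scores):
    # scores: (batch, batch); scores[i, i] is the score of example i's positive item,
    # the rest of row i are its in-batch negative samples
    diff = scores.diag().unsqueeze(1) - scores          # r_pos - r_neg for every pair
    return -torch.mean(torch.nn.functional.logsigmoid(diff))

def top1_loss(scores):
    diff = scores - scores.diag().unsqueeze(1)          # r_neg - r_pos
    return torch.mean(torch.sigmoid(diff) + torch.sigmoid(scores ** 2))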
3. Experiments
The evaluation is done by providing the events of a session one-by-one and checking the rank of the item of the next event. The hidden state of the GRU is reset to zero after a session finishes.
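The reported metrics are recall@20 and MRR@20. A sketch of how this rank check can be computed for one batch of next-item scores (illustrative only; function and variable names are my own):

import torch

def rank_metrics(scores, targets, k=20):
    # scores:  (batch, n_items) predicted preference scores
    # targets: (batch,) index of the true next item of each session
    target_scores = scores.gather(1, targets.unsqueeze(1))   # score of the true item
    ranks = (scores > target_scores).sum(dim=1) + 1          # 1-based rank of the true item
    recall = (ranks <= k).float().mean().item()              # recall@k
    mrr = ((1.0 / ranks.float()) * (ranks <= k).float()).mean().item()  # MRR@k; rank > k counts as 0
    return recall, mrr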
The paper then states: "The best performing parametrizations are summarized in table 2. Weight matrices were initialized by random numbers drawn uniformly from [−x, x]." Note that the weight matrices initialized this way are all of the model's weight parameters, not only the weights inside the GRU.
Here is the corresponding code .
# weight initialization if it was defined
def init_model(model):
    if args.sigma is not None:
        for p in model.parameters():
            if args.sigma != -1 and args.sigma != -2:
                # a fixed sigma was supplied: uniform in [-sigma, sigma]
                sigma = args.sigma
                p.data.uniform_(-sigma, sigma)
            elif len(list(p.size())) > 1:
                # sigma is -1 or -2: derive the range from the matrix dimensions (Xavier-style)
                sigma = np.sqrt(6.0 / (p.size(0) + p.size(1)))
                if args.sigma == -1:
                    p.data.uniform_(-sigma, sigma)  # uniform in [-sigma, sigma]
                else:
                    p.data.uniform_(0, sigma)       # uniform in [0, sigma]
4. Reproduction
Code used: GitHub - hungthanhpham94/GRU4REC-pytorch: An other implementation of GRU4REC using PyTorch
Dataset used: RecSys Challenge 2015 | Kaggle
If you plan to run it locally, just follow the repository's README; what follows describes how to run it on the Kaggle platform.
Pulling the code into Kaggle directly from the GitHub link is not recommended, because the paths in the code need to be modified and Kaggle does not support editing added files online.
First open Kaggle and create a new notebook, then click Add data in the upper right corner to add the RecSys Challenge 2015 dataset to the notebook's input.
Download the code locally, open preprocessing.py, and modify the following paths:
dataBefore = '../input/recsys-challenge-2015/yoochoose-clicks.dat' #Path to Original Training Dataset "Clicks" File
dataTestBefore = '../input/recsys-challenge-2015/yoochoose-test.dat' #Path to Original Testing Dataset "Clicks" File
dataAfter = './dataafter/' #Path to Processed Dataset Folder../input/dataafter
Change these three paths to the corresponding file locations in the Kaggle notebook; note that the third one must end with a "/". The values above have already been modified and can be used as a reference: dataAfter is the folder that will hold the processed data, dataBefore is the path of the clicks file in the dataset, and dataTestBefore is the path of the test file. We create the output folder in Kaggle with the following command:
!mkdir dataafter
Running this command in a Kaggle notebook cell creates the folder; refresh the file panel after it runs to see it. (Since preprocessing.py had already been run in my case, the folder already contains the processed data.)
After adding the dataset, hover over the far right of each file to reveal the Copy file path option, and paste the path of each file into the corresponding place in the code. The figure above shows the imported raw dataset.
Then we need to modify the paths in main.py:
parser.add_argument('--data_folder', default='./dataafter/', type=str)#./dataafter
parser.add_argument('--train_data', default='recSys15TrainOnly.txt', type=str)
parser.add_argument('--valid_data', default='recSys15Valid.txt', type=str)
These three items need to be modified. The first is the folder we created for the processed dataset (again note the trailing "/"); the other two are the file names of the training set and the validation set. For train_data and valid_data you must not copy the full Kaggle file paths: the ./dataafter prefix from data_folder would then appear twice in the assembled path, the dataset would not be found, and an error would be raised. Just fill in the file names directly:
print('Training Set has', len(trainTR), 'Events, ', trainTR.SessionID.nunique(), 'Sessions, and', trainTR.ItemID.nunique(), 'Items\n\n')
trainTR.to_csv(dataAfter + 'recSys15TrainOnly.txt', sep=',', index=False)
print('Validation Set has', len(trainVD), 'Events, ', trainVD.SessionID.nunique(), 'Sessions, and', trainVD.ItemID.nunique(), 'Items\n\n')
trainVD.to_csv(dataAfter + 'recSys15Valid.txt', sep=',', index=False)
As the snippet from preprocessing.py above shows, its author has already named the processed files, so those names can be filled in directly.
Finally, one more change is needed:
# torch.Tensor.item() to get a Python number from a tensor containing a single value
losses.append(loss.item())
recalls.append(recall)
mrrs.append(mrr.cpu().numpy())
Append .cpu().numpy() to mrr: during training, numpy cannot operate directly on data that lives on the GPU, so it must first be copied to the CPU. For details see the blog post "can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first." by BooTurbo on cnblogs.
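A minimal illustration of the error and the fix (assumes a CUDA device is available):

import torch

t = torch.tensor([0.5], device='cuda')
# t.numpy()              # raises: can't convert cuda:0 device type tensor to numpy
value = t.cpu().numpy()  # copy the tensor to host memory first, then convert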
The next step is to upload the source code: inside Add data choose Upload, enter a name, drag the code folder in, and click Create in the lower right corner. If you upload a modified copy of the code, Kaggle warns about duplicate files; in that case click the arrow in the lower right and choose the "include" option. After uploading, the dataset panel looks like the figure below.
Then the code can be executed. First run the script that processes the data:
!python ../input/gru4rec9/GRU4REC-pytorch-master/GRU4REC-pytorch-master/preprocessing.py
The path to preprocessing.py must be the full Kaggle input path, copied the same way as above. After it finishes, the processed files appear in the dataafter folder.
!python ../input/gru4rec13/GRU4REC-pytorch-master/GRU4REC-pytorch-master/main.py
This runs main.py, again using its full Kaggle input path.
Finally, we get the training result:
For time reasons, epoch is set to 1 here; a single epoch takes nearly an hour.
Then run the evaluation:
!python ../input/gru4rec13/GRU4REC-pytorch-master/GRU4REC-pytorch-master/main.py --is_eval --load_model ./checkpoint/07240927/model_00000.pt
The path to main.py must again be the Kaggle input path used above; the .pt file at the end is the checkpoint written to the checkpoint folder after main.py finishes, so copy its path from there.
Validation results:
Finally, the results in the paper are given :
The run here uses the TOP1 loss, a hidden dimension of 100, batch_size of 32, and n_epoch of 1; all other parameters are the defaults in the code.