Spark-movie-lens - An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

Overview

A scalable on-line movie recommender using Spark and Flask

This Apache Spark tutorial will guide you step-by-step into how to use the MovieLens dataset to build a movie recommender using collaborative filtering with Spark's Alternating Least Saqures implementation. It is organised in two parts. The first one is about getting and parsing movies and ratings data into Spark RDDs. The second is about building and using the recommender and persisting it for later use in our on-line recommender system.

This tutorial can be used independently to build a movie recommender model based on the MovieLens dataset. Most of the code in the first part, about how to use ALS with the public MovieLens dataset, comes from my solution to one of the exercises proposed in the CS100.1x Introduction to Big Data with Apache Spark by Anthony D. Joseph on edX, that is also publicly available since 2014 at Spark Summit. Starting from there, I've added with minor modifications to use a larger dataset, then code about how to store and reload the model for later use, and finally a web service using Flask.

In any case, the use of this algorithm with this dataset is not new (you can Google about it), and this is because we put the emphasis on ending up with a usable model in an on-line environment, and how to use it in different situations. But I truly got inspired by solving the exercise proposed in that course, and I highly recommend you to take it. There you will learn not just ALS but many other Spark algorithms.

It is the second part of the tutorial the one that explains how to use Python/Flask for building a web-service on top of Spark models. By doing so, you will be able to develop a complete on-line movie recommendation service.

Part I: Building the recommender

Part II: Building and running the web service

Quick start

The file server/server.py starts a CherryPy server running a Flask app.py to start a RESTful web server wrapping a Spark-based engine.py context. Through its API we can perform on-line movie recommendations.

Please, refer the the second notebook for detailed instructions on how to run and use the service.

Contributing

Contributions are welcome! For bug reports or requests please submit an issue.

Contact

Feel free to contact me to discuss any issues, questions, or comments.

License

This repository contains a variety of content; some developed by Jose A. Dianes, and some from third-parties. The third-party content is distributed under the license provided by those parties.

The content developed by Jose A. Dianes is distributed under the following license:

Copyright 2016 Jose A Dianes

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Owner
Jose A Dianes
Principal Data Scientist at Mosaic Therapeutics.
Jose A Dianes
A Python scikit for building and analyzing recommender systems

Overview Surprise is a Python scikit for building and analyzing recommender systems that deal with explicit rating data. Surprise was designed with th

Nicolas Hug 5.7k Jan 01, 2023
Plex-recommender - Get movie recommendations based on your current PleX library

plex-recommender Description: Get movie/tv recommendations based on your current

5 Jul 19, 2022
NVIDIA Merlin is an open source library designed to accelerate recommender systems on NVIDIA’s GPUs.

NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in

420 Jan 04, 2023
基于个性化推荐的音乐播放系统

MusicPlayer 基于个性化推荐的音乐播放系统 Hi, 这是我在大四的时候做的毕设,现如今将该项目开源。 本项目是基于Python的tkinter和pygame所著。 该项目总体来说,代码比较烂(因为当时水平很菜)。 运行的话安装几个基本库就能跑,只不过里面的数据还没有上传至Github。 先

Cedric Niu 6 Nov 19, 2022
Code for MB-GMN, SIGIR 2021

MB-GMN Code for MB-GMN, SIGIR 2021 For Beibei data, run python .\labcode.py For Tmall data, run python .\labcode.py --data tmall --rank 2 For IJCAI

32 Dec 04, 2022
It is a movie recommender web application which is developed using the Python.

Movie Recommendation 🍿 System Watch Tutorial for this project Source IMDB Movie 5000 Dataset Inspired from this original repository. Features Simple

Kushal Bhavsar 10 Dec 26, 2022
A framework for large scale recommendation algorithms.

A framework for large scale recommendation algorithms.

Alibaba Group - PAI 880 Jan 03, 2023
Spark-movie-lens - An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

A scalable on-line movie recommender using Spark and Flask This Apache Spark tutorial will guide you step-by-step into how to use the MovieLens datase

Jose A Dianes 794 Dec 23, 2022
reXmeX is recommender system evaluation metric library.

A general purpose recommender metrics library for fair evaluation.

AstraZeneca 258 Dec 22, 2022
Temporal Meta-path Guided Explainable Recommendation (WSDM2021)

Temporal Meta-path Guided Explainable Recommendation (WSDM2021) TMER Code of paper "Temporal Meta-path Guided Explainable Recommendation". Requirement

Yicong Li 13 Nov 30, 2022
ToR[e]cSys is a PyTorch Framework to implement recommendation system algorithms

ToR[e]cSys is a PyTorch Framework to implement recommendation system algorithms, including but not limited to click-through-rate (CTR) prediction, learning-to-ranking (LTR), and Matrix/Tensor Embeddi

LI, Wai Yin 90 Oct 08, 2022
Beyond Clicks: Modeling Multi-Relational Item Graph for Session-Based Target Behavior Prediction

MGNN-SPred This is our Tensorflow implementation for the paper: WenWang,Wei Zhang, Shukai Liu, Qi Liu, Bo Zhang, Leyu Lin, and Hongyuan Zha. 2020. Bey

Wen Wang 18 Jan 02, 2023
Global Context Enhanced Social Recommendation with Hierarchical Graph Neural Networks

SR-HGNN ICDM-2020 《Global Context Enhanced Social Recommendation with Hierarchical Graph Neural Networks》 Environments python 3.8 pytorch-1.6 DGL 0.5.

xhc 9 Nov 12, 2022
A library of metrics for evaluating recommender systems

recmetrics A python library of evalulation metrics and diagnostic tools for recommender systems. **This library is activly maintained. My goal is to c

Claire Longo 458 Jan 06, 2023
Handling Information Loss of Graph Neural Networks for Session-based Recommendation

LESSR A PyTorch implementation of LESSR (Lossless Edge-order preserving aggregation and Shortcut graph attention for Session-based Recommendation) fro

Tianwen CHEN 62 Dec 03, 2022
fastFM: A Library for Factorization Machines

Citing fastFM The library fastFM is an academic project. The time and resources spent developing fastFM are therefore justified by the number of citat

1k Dec 24, 2022
Spotify API Recommnder System

This project will access your last listened songs on Spotify using its API, then it will request the user to select 5 favorite songs in that list, on which the API will proceed to make 50 recommendat

Kevin Luke 1 Dec 14, 2021
A PyTorch implementation of "Say No to the Discrimination: Learning Fair Graph Neural Networks with Limited Sensitive Attribute Information" (WSDM 2021)

FairGNN A PyTorch implementation of "Say No to the Discrimination: Learning Fair Graph Neural Networks with Limited Sensitive Attribute Information" (

31 Jan 04, 2023
Learning Fair Representations for Recommendation: A Graph-based Perspective, WWW2021

FairGo WWW2021 Learning Fair Representations for Recommendation: A Graph-based Perspective As a key application of artificial intelligence, recommende

lei 39 Oct 26, 2022
Recommendation System to recommend top books from the dataset

recommendersystem Recommendation System to recommend top books from the dataset Introduction The recom.py is the main program code. The dataset is als

Vishal karur 1 Nov 15, 2021