reXmeX is recommender system evaluation metric library.

Last update: Dec 22, 2022

Overview

reXmeX is recommender system evaluation metric library.

Please look at the Documentation and External Resources.

reXmeX consists of utilities for recommender system evaluation. First, it provides a comprehensive collection of metrics for the evaluation of recommender systems. Second, it includes a variety of methods for reporting and plotting the performance results. Implemented metrics cover a range of well-known metrics and newly proposed metrics from data mining (ICDM, CIKM, KDD) conferences and pieces from prominent journals.

An introductory example

The following example loads a synthetic dataset which has the source_id, target_id, source_group and target group keys besides the mandatory y_true and y_scores. The dataset has binary labels and predictied probability scores. We read the dataset and define a defult ClassificationMetric instance for the evaluation of the predictions. Using this metric set we create a score card, group the predictions on with the source_group key and return a performance metric report.

from rexmex.scorecard import ScoreCard
from rexmex.dataset import DatasetReader
from rexmex.metricset import ClassificationMetricSet

reader = DatasetReader()
scores = reader.read_dataset()

metric_set = ClassificationMetricSet()

score_card = ScoreCard(metric_set)

report = score_card.generate_report(scores, groupping=["source_group"])

Scorecard

A rexmex score card allows the reporting of recommender system performance metrics, plotting the performance metrics and saving those

Metric Sets

Metric sets allow the users to calculate a range of evaluation metrics for a label - predicted label vector pair. We provide a general MetricSet class and specialized metric sets with pre-set metrics have the following general categories:

Rating
Classification
Ranking
Coverage

Rating Metric Set

These metrics assume that items are scored explicitly and ratings are predicted by a regression model.

Mean Squared Error (MSE)
Root Mean Squared Error (RMSE)
Mean Absolute Error (MAE)
Mean Absolute Percentage Error (MAPE)

Expand to see all rating metrics in the metric set.

Classification Metric Set

These metrics assume that the items are scored with raw probabilities (these can be binarized).

Precision
Recall
PR AUC
ROC AUC

Expand to see all classification metrics in the metric set.

Ranking Metric Set

DPM
NDPM
DCG
NDCG

Expand to see all ranking metrics in the metric set.

Coverage Metric Set

These merics measure how well the recommender system covers the available items in the catalog. In other words measure the diversity of predictions.

Documentation and Reporting Issues

Head over to our documentation to find out more about installation and data handling, a full list of implemented methods, and datasets. For a quick start, check out our examples.

If you notice anything unexpected, please open an issue and let us know. If you are missing a specific method, feel free to open a feature request. We are motivated to constantly make RexMex even better.

Installation via the command line

RexMex can be installed with the following command after the repo is cloned.

$ python setup.py install

Installation via pip

RexMex can be installed with the following pip command.

$ pip install rexmex

As we create new releases frequently, upgrading the package casually might be beneficial.

$ pip install rexmex --upgrade

Running tests

$ pytest ./tests/unit -cov rexmex/
$ pytest ./tests/integration -cov rexmex/

License

Apache-2.0 License

Comments

Add static type checking with MyPy
Summary

This PR adds optional static type checking using mypy, which uses the type hints to check for errors that don't show up from unit tests.

[x] Code passes all tests

[x] Unit tests provided for these changes

[x] Documentation and docstrings added for these changes

Changes

Add mypy environment to tox.ini

Add call to mypy via tox in the GHA configuration for CI

Make a few improvements to code based on suggestions from type checker

Add additional tests
opened by cthoyt 6
Annotate redundant functions
Summary

As a follow-up to #33, this PR adds a duplicate_of annotation to the rexmex.utils.Annotator class and begins annotating which functions are duplicate of each other.

~Caveat I'm not really happy with the direction of which is the "duplicate" in many cases, e.g., where miss_rate is the "canonical" one and "false_negative_rate" is the duplicate~

[x] Code passes all tests

[x] Unit tests provided for these changes

[x] Documentation and docstrings added for these changes

[x] Check all functions are annotated, and correctly

Changes

Add new annotation duplicate_of to rexmex.utils.Annotator

Annotate duplicate functions (e.g., precision_score is a duplicate of positive_predictive_value)

Pins pandas<=1.3.5 since they just put the 1.4 release candidate up on PyPI this morning and it doesn't work with the conda env in the GHA workflow
opened by cthoyt 5
Why are there duplicate functions?

I noticed that there are duplicate functions such as miss_rate()/false_negative_rate() and fall_out()/false_positive_rate(). What's the reason for the duplication?

opened by cthoyt 3
Remove deprecated sklearn dependency
Summary

Fixes #56 by removing deprecated sklearn dependency.

[x] Code passes all tests

[x] Unit tests provided for these changes (does not apply)

[x] Documentation and docstrings added for these changes (does not apply)

Changes

Remove deprecated sklearn dependency
opened by dobraczka 2
Pandas versioning issue

Hey rexmex team,

I wanted to ask you if there is a specific reason why your installation requires pandas to be of this version?

Since this package gets installed with the latest version of PyKEEN, I am facing some versioning issues due to the Pandas version restriction of your package.

The related line of code: https://github.com/AstraZeneca/rexmex/blob/44f453ff20e92569270b9e1cfcb75b44b7839128/setup.py#L3

Apologies if it's a silly question but: is it indeed a strict requirement for rexmex to have Pandas<1.3.5 or is this something we can modify in rexmex setup.py? ( if so I can open a related PR for it )

Thank you in advance! Best, Dimitris

opened by DimitrisAlivas 2
Coverage refactor, added CoverageMetricSet and CoverageScoreCard
Summary

Please provide a high-level summary of the changes for the changes and notes for the reviewers

[X] Code passes all tests

[X] Unit tests provided for these changes

[X] Documentation and docstrings added for these changes

Changes

changed signature for the Coverage metrics, it now requires supplying the relevant user and item spaces plus a list of tuples (user, item) as final predictions

added the item coverage and user coverage metrics to CoverageMetricSet

created a ScoreCard for coverage metrics (they need a different signature than classification and regression)
opened by kajocina 2
Add binarize annotation
Summary

Similarly to #35, this PR adds an additional annotation for functions that need to be binarized.

[x] Code passes all tests

[x] Unit tests provided for these changes

[x] Documentation and docstrings added for these changes

Changes

Add additional keyword argument binarize to rexmex.utils.Annotator.annotate. This has a default of False, since most functions do not need to be binarized.

Annotate binarize=True on the functions that were binarized in the rexmex.metricset.ClassificationMetricSet.__init__

Future outlook

This improvement will enable the later implementation of automated collection and processing of metric functions to improve the rexmex.metricset.ClassificationMetricSet class.
opened by cthoyt 2
Add additional classification function annotations
Summary

Closes #28

This PR adds four new annotations to classification functions:

lower bound inclusive

upper bound inclusive

a description

a URL link to more information / citation

Checks:

[x] Code passes all tests

[x] Unit tests provided for these changes

[x] Documentation and docstrings added for these changes

Changes

This PR adds new annotation requirements and applies them to all classification functions
opened by cthoyt 2
Add basic coverage stat
Summary

Added a basic coverage statistic + some tests. The naming convention might need to be refactored though, not sure if item_coverage() is the right name (or perhaps catalogue_coverage() instead?)

[X] Code passes all tests

[X] Unit tests provided for these changes

[X] Documentation and docstrings added for these changes

Changes

added new metric (coverage) which check how many of the possible objects/items get recommended at least once, expressed as a fraction (if 1 out of 5 items never got recommended, it will be 80%)
opened by kajocina 2
Demonstrate annotating structured metadata to classification functions
Summary

This PR demonstrates a potential solution to #28 in a pilot applied to classification metrics. It could be extended in a future pull request to other branches of the package.

[x] Code passes all tests

[x] Unit tests provided for these changes

[x] Documentation and docstrings added for these changes

Changes

[x] Add an annotate decorator that adds some structured information to classification functions

[x] Add tests to make sure the data is accessible

[x] Add unit test to ensure all classification functions are annotated (1afd68b)

[x] Annotate all classification functions

Future

Before finalizing this PR, I had also used the annotate function to make a registry of functions. This could be used to make the generation/maintenance of the ClassificationMetricSet much easier, but I'd save that for a different PR.
opened by cthoyt 2
API Suggestions

Right now it's a bit round-about to get a scorecard for a given dataset since it expect a pandas format. I'd suggest exposing ScoreCard._get_performance_metrics as a public user interface and also encourage people to use that directly in case they're generating their own y_true and y_score and don't want to write their own code to generate a pandas dataframe from it, just for rexmex to need to unpack it.

My example is in PyKEEN, where we do just that: https://github.com/pykeen/pykeen/blob/799e224e772176703d796a9247bfcc179d343c6c/src/pykeen/evaluation/sklearn.py#L129-L142

I'd also say it would be worth adding a second introductory example based on this in the README.

opened by cthoyt 2
Annotate rankings (help wanted)
Summary

This PR uses the rexmex.utils.Annotator to annotate information about ranking metrics (e.g., MR, MRR, Hits @ K). I'm not familiar with all of the rankings, so help would be great on this one. This would especially be good for first-time contributors, since a lot of it is busy work of looking up metrics, finding out about their properties, etc.. A potential contributor could make a branch off of mine and then either PR it directly, or PR it into my fork (or just post the curation as a comment in this PR, and I can make the code updates while crediting them as a co-author on the relevant commits)

[ ] Code passes all tests

[ ] Unit tests provided for these changes

[ ] Documentation and docstrings added for these changes

[ ] https://github.com/AstraZeneca/rexmex/pull/43, since it would be good to re-use its generalized testing framework

Changes

[x] Add annotator to rexmex.metrics.ratings and annotate its functions

[ ] Switch construction of metric set to use the annotator's registry
opened by cthoyt 0
Improve binning in `binarize()`
The current binarize function uses a cutoff of 0.5 for binarization: https://github.com/AstraZeneca/rexmex/blob/3e266529761281ae832e49736e48d3e46f3b4af4/rexmex/utils.py#L28-L34

This is an issue for PyKEEN, where the scores that come from a model could all be on the range of [-5,-2]. The current TODO text says to use https://en.wikipedia.org/wiki/Youden%27s_J_statistic, but it's not clear how that would be used.

As an alternative, the NetMF package implements the following code for constructing an indicator that might be more applicable (though I don't personally recognize what method this is, and unfortunately it's not documented):

def construct_indicator(y_score, y): # rank the labels by the scores directly num_label = np.sum(y, axis=1, dtype=np.int) y_sort = np.fliplr(np.argsort(y_score, axis=1)) y_pred = np.zeros_like(y, dtype=np.int) for i in range(y.shape[0]): for j in range(num_label[i]): y_pred[i, y_sort[i, j]] = 1 return y_pred
opened by cthoyt 0
Add function keys and annotate ratings
Summary

This PR streamlines generating metric sets and annotates more functions.

[x] Code passes all tests

[x] Unit tests provided for these changes

[x] Documentation and docstrings added for these changes

Changes

[x] Update the rexmex.utils.Annotator class to include a key. If it's not given, this defaults to the function's name. The registry now uses the key instead of the function's name

[x] Update the metric sets to load the function names directly from the keys in the annotator's dictionary

[x] Annotate functions in the ratings module

[x] Generalizes tests to make it easier to test the existence of annotations for ratings, coverage, and rankings
opened by cthoyt 0
Adjusted mean rank

@mberr's adjusted mean rank address some of the problems with the mean rank, including its size dependence. Reference: https://arxiv.org/abs/2002.06914

opened by cthoyt 3

Releases(v_00102)

v_00102(Sep 28, 2022)
What's Changed

Docstring fix by @kajocina in https://github.com/AstraZeneca/rexmex/pull/47

added CoverageScoreCard to init by @kajocina in https://github.com/AstraZeneca/rexmex/pull/48

Remove broken link to examples by @benedekrozemberczki in https://github.com/AstraZeneca/rexmex/pull/50

Unfix pandas version in RexMex requirements by @GavEdwards in https://github.com/AstraZeneca/rexmex/pull/54

Release 0.1.2 by @GavEdwards in https://github.com/AstraZeneca/rexmex/pull/55

Note: version 0.1.2 due to an issue with creating the 0.1.1 release.

Full Changelog: https://github.com/AstraZeneca/rexmex/compare/v_00100...v_00102
Source code(tar.gz)
Source code(zip)
v_00100(Jan 7, 2022)
What's Changed

Use registry pattern for ClassificationMetricSet by @cthoyt in https://github.com/AstraZeneca/rexmex/pull/40

Coverage refactor, added CoverageMetricSet and CoverageScoreCard by @kajocina in https://github.com/AstraZeneca/rexmex/pull/42

Annotate redundant functions by @cthoyt in https://github.com/AstraZeneca/rexmex/pull/41

Source code(tar.gz)
Source code(zip)
v_00015(Jan 4, 2022)
What's Changed

Update name in citation by @cthoyt in https://github.com/AstraZeneca/rexmex/pull/39

Add binarize annotation by @cthoyt in https://github.com/AstraZeneca/rexmex/pull/36

Cleanup scorecard interface by @cthoyt in https://github.com/AstraZeneca/rexmex/pull/38

Improve testing by @cthoyt in https://github.com/AstraZeneca/rexmex/pull/37

Full Changelog: https://github.com/AstraZeneca/rexmex/compare/v_00014...v_00015
Source code(tar.gz)
Source code(zip)
v_00014(Jan 4, 2022)
What's Changed 🦖🦖

Demonstrate annotating structured metadata to classification functions by @cthoyt in https://github.com/AstraZeneca/rexmex/pull/29

Add additional classification function annotations by @cthoyt in https://github.com/AstraZeneca/rexmex/pull/35

Full Changelog: https://github.com/AstraZeneca/rexmex/compare/v_00013...v_00014
Source code(tar.gz)
Source code(zip)
v_00013(Dec 13, 2021)
New method to filter training, testing, and validation instances.

Exposing the get_performance_metrics() method.

Source code(tar.gz)
Source code(zip)
v_00012(Dec 10, 2021)

Fixes the import all
Source code(tar.gz)
Source code(zip)
v_00011(Dec 7, 2021)
New ranking metrics added are:

Geometric Mean Rank

Mean Rank

Mean Reciprocal Rank

Source code(tar.gz)
Source code(zip)
v_00010(Dec 6, 2021)
Adds the following ranking metrics

Reciprocal Rank

Mean Reciprocal Rank (MRR)

Average [email protected] ([email protected])

Mean Average [email protected] ([email protected])

Average [email protected] ([email protected])

Mean Average [email protected] ([email protected])

[email protected]

Spearman's Rho

Kendall Tau

Intra List Similarity

Personalization

Novelty

NPDM

DCG

NDCG

Source code(tar.gz)
Source code(zip)
v_00009(Dec 2, 2021)
The metric sets on scorecards are exposed.

Source code(tar.gz)
Source code(zip)
v_0007(Nov 29, 2021)
Fixes Broken Name Spacing

Source code(tar.gz)
Source code(zip)
v_00008(Nov 29, 2021)
Improved aggregation behaviour.

Source code(tar.gz)
Source code(zip)
v_00006(Nov 25, 2021)
The new release separates metrics and creates namespaces based on metric categories. This helps with modularity and organization.

Results in namespaces for:

Rating

Classification

Ranking

Coverage

Source code(tar.gz)
Source code(zip)
v_00005(Nov 24, 2021)
Library now includes:

Positive and negative likelihood ratio

Informedness and markedness

Threat score and critical success index

Fowlkes - Mallows index

Prevalence threshold

Diagnostic odds ratio

Source code(tar.gz)
Source code(zip)
v_00004(Nov 23, 2021)
The new release covers:

False Negative/Positive

True Positive/Negative

FPR, TPR, FNR, TNR

Specificity, Selectivity, False Omission Rate, False Discovery Rate

Miss Rate, Fall Out

Positive Predictive Value, Negative Predictive Value

Source code(tar.gz)
Source code(zip)
v_00003(Nov 22, 2021)
New features and bug fixes:

Normalization of targets

Metric set behaviour changed

New dataset for testing

Completed test coverage

Updated setup with tags and licensing

Source code(tar.gz)
Source code(zip)
v_00001(Nov 22, 2021)

This is a test release.
Source code(tar.gz)
Source code(zip)

Owner

AstraZeneca

Data and AI: Unlocking new science insights

GitHub Repository

Temporal Meta-path Guided Explainable Recommendation (WSDM2021)

Temporal Meta-path Guided Explainable Recommendation (WSDM2021) TMER Code of paper "Temporal Meta-path Guided Explainable Recommendation". Requirement

13 Nov 30, 2022

A framework for large scale recommendation algorithms.

880 Jan 03, 2023

Hierarchical Fashion Graph Network for Personalized Outfit Recommendation, SIGIR 2020

hierarchical_fashion_graph_network This is our Tensorflow implementation for the paper: Xingchen Li, Xiang Wang, Xiangnan He, Long Chen, Jun Xiao, and

70 Dec 05, 2022

It is a movie recommender web application which is developed using the Python.

Movie Recommendation 🍿 System Watch Tutorial for this project Source IMDB Movie 5000 Dataset Inspired from this original repository. Features Simple

10 Dec 26, 2022

Elliot is a comprehensive recommendation framework that analyzes the recommendation problem from the researcher's perspective.

Comprehensive and Rigorous Framework for Reproducible Recommender Systems Evaluation

215 Nov 29, 2022

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

Annoy Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given quer

10.6k Jan 01, 2023

A Python scikit for building and analyzing recommender systems

Overview Surprise is a Python scikit for building and analyzing recommender systems that deal with explicit rating data. Surprise was designed with th

5.7k Jan 01, 2023

Jointly Learning Explainable Rules for Recommendation with Knowledge Graph

57 Nov 03, 2022

A tensorflow implementation of the RecoGCN model in a CIKM'19 paper, titled with "Relation-Aware Graph Convolutional Networks for Agent-Initiated Social E-Commerce Recommendation".

This repo contains a tensorflow implementation of RecoGCN and the experiment dataset Running the RecoGCN model python train.py Example training outp

30 Nov 25, 2022

Recommendation Systems for IBM Watson Studio platform

Recommendation-Systems-for-IBM-Watson-Studio-platform Project Overview In this project, I analyze the interactions that users have with articles on th

1 Jan 21, 2022

Plex-recommender - Get movie recommendations based on your current PleX library

plex-recommender Description: Get movie/tv recommendations based on your current

5 Jul 19, 2022

Bert4rec for news Recommendation

News-Recommendation-system-using-Bert4Rec-model Bert4rec for news Recommendation

2 Feb 04, 2022

Spark-movie-lens - An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

A scalable on-line movie recommender using Spark and Flask This Apache Spark tutorial will guide you step-by-step into how to use the MovieLens datase

794 Dec 23, 2022

Spotify API Recommnder System

This project will access your last listened songs on Spotify using its API, then it will request the user to select 5 favorite songs in that list, on which the API will proceed to make 50 recommendat

1 Dec 14, 2021

大规模推荐算法库，包含推荐系统经典及最新算法LR、Wide&Deep、DSSM、TDM、MIND、Word2Vec、DeepWalk、SSR、GRU4Rec、Youtube_dnn、NCF、GNN、FM、FFM、DeepFM、DCN、DIN、DIEN、DLRM、MMOE、PLE、ESMM、MAML、xDeepFM、DeepFEFM、NFM、AFM、RALM、Deep Crossing、PNN、BST、AutoInt、FGCNN、FLEN、ListWise等

(中文文档|简体中文|English) 什么是推荐系统? 推荐系统是在互联网信息爆炸式增长的时代背景下，帮助用户高效获得感兴趣信息的关键；推荐系统也是帮助产品最大限度吸引用户、留存用户、增加用户粘性、提高用户转化率的银弹。有无数优秀的产品依靠用户可感知的推荐系统建立了良好的口碑，也有无数的公司依

3.6k Dec 30, 2022

This is our Tensorflow implementation for "Graph-based Embedding Smoothing for Sequential Recommendation" (GES) (TKDE, 2021).

Graph-based Embedding Smoothing (GES) This is our Tensorflow implementation for the paper: Tianyu Zhu, Leilei Sun, and Guoqing Chen. "Graph-based Embe

15 Nov 29, 2022

reXmeX is recommender system evaluation metric library.

Related tags

Overview

Comments

Summary

Changes

Summary

Changes

Summary

Changes

Summary

Changes

Summary

Changes

Future outlook

Summary

Changes

Summary

Changes

Summary

Changes

Future

Summary

Changes

Summary

Changes

Releases(v_00102)

v_00102(Sep 28, 2022)

What's Changed

v_00100(Jan 7, 2022)

What's Changed

v_00015(Jan 4, 2022)

What's Changed

v_00014(Jan 4, 2022)

What's Changed 🦖🦖

v_00013(Dec 13, 2021)

v_00012(Dec 10, 2021)

v_00011(Dec 7, 2021)

v_00010(Dec 6, 2021)

v_00009(Dec 2, 2021)

v_0007(Nov 29, 2021)

v_00008(Nov 29, 2021)

v_00006(Nov 25, 2021)

v_00005(Nov 24, 2021)

v_00004(Nov 23, 2021)

v_00003(Nov 22, 2021)

v_00001(Nov 22, 2021)

Owner

AstraZeneca

Temporal Meta-path Guided Explainable Recommendation (WSDM2021)

A framework for large scale recommendation algorithms.

Hierarchical Fashion Graph Network for Personalized Outfit Recommendation, SIGIR 2020

It is a movie recommender web application which is developed using the Python.

Elliot is a comprehensive recommendation framework that analyzes the recommendation problem from the researcher's perspective.

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

A Python scikit for building and analyzing recommender systems

Jointly Learning Explainable Rules for Recommendation with Knowledge Graph

A tensorflow implementation of the RecoGCN model in a CIKM'19 paper, titled with "Relation-Aware Graph Convolutional Networks for Agent-Initiated Social E-Commerce Recommendation".

Recommendation Systems for IBM Watson Studio platform

Plex-recommender - Get movie recommendations based on your current PleX library

Bert4rec for news Recommendation

Spark-movie-lens - An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

Spotify API Recommnder System

This is our Tensorflow implementation for "Graph-based Embedding Smoothing for Sequential Recommendation" (GES) (TKDE, 2021).

A movie recommender which recommends the movies belonging to the genre that user has liked the most.

Code for ICML2019 Paper "Compositional Invariance Constraints for Graph Embeddings"

A Python implementation of LightFM, a hybrid recommendation algorithm.

Use Jupyter Notebooks to demonstrate how to build a Recommender with Apache Spark & Elasticsearch