A configurable, tunable, and reproducible library for CTR prediction

Overview

FuxiCTR

This repo is the community dev version of the official release at huawei-noah/benchmark/FuxiCTR.

Click-through rate (CTR) prediction is an critical task for many industrial applications such as online advertising, recommender systems, and sponsored search. FuxiCTR provides an open-source library for CTR prediction, with key features in configurability, tunability, and reproducibility. It also supports the building of the BARS-CTR-Prediction benchmark, which aims for open benchmarking for CTR prediction.

πŸ‘‰ If you find our code or benchmarks helpful in your research, please kindly cite the following paper.

Jieming Zhu, Jinyang Liu, Shuai Yang, Qi Zhang, Xiuqiang He. Open Benchmarking for Click-Through Rate Prediction. The 30th ACM International Conference on Information and Knowledge Management (CIKM), 2021.

Model List

Publication Model Paper Available
WWW'07 LR Predicting Clicks: Estimating the Click-Through Rate for New Ads βœ”οΈ
ICDM'10 FM Factorization Machines βœ”οΈ
CIKM'15 CCPM A Convolutional Click Prediction Model βœ”οΈ
RecSys'16 FFM Field-aware Factorization Machines for CTR Prediction βœ”οΈ
RecSys'16 YoutubeDNN Deep Neural Networks for YouTube Recommendations βœ”οΈ
DLRS'16 Wide&Deep Wide & Deep Learning for Recommender Systems βœ”οΈ
ICDM'16 IPNN Product-based Neural Networks for User Response Prediction βœ”οΈ
KDD'16 DeepCross Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features βœ”οΈ
NIPS'16 HOFM Higher-Order Factorization Machines βœ”οΈ
IJCAI'17 DeepFM DeepFM: A Factorization-Machine based Neural Network for CTR Prediction βœ”οΈ
SIGIR'17 NFM Neural Factorization Machines for Sparse Predictive Analytics βœ”οΈ
IJCAI'17 AFM Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks βœ”οΈ
ADKDD'17 DCN Deep & Cross Network for Ad Click Predictions βœ”οΈ
WWW'18 FwFM Field-weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising βœ”οΈ
KDD'18 xDeepFM xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems βœ”οΈ
KDD'18 DIN Deep Interest Network for Click-Through Rate Prediction βœ”οΈ
CIKM'19 FiGNN FiGNN: Modeling Feature Interactions via Graph Neural Networks for CTR Prediction βœ”οΈ
CIKM'19 AutoInt/AutoInt+ AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks βœ”οΈ
RecSys'19 FiBiNET FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction βœ”οΈ
WWW'19 FGCNN Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction βœ”οΈ
AAAI'19 HFM/HFM+ Holographic Factorization Machines for Recommendation βœ”οΈ
NeuralNetworks'20 ONN Operation-aware Neural Networks for User Response Prediction βœ”οΈ
AAAI'20 AFN/AFN+ Adaptive Factorization Network: Learning Adaptive-Order Feature Interactions βœ”οΈ
AAAI'20 LorentzFM Learning Feature Interactions with Lorentzian Factorization βœ”οΈ
WSDM'20 InterHAt Interpretable Click-through Rate Prediction through Hierarchical Attention βœ”οΈ
DLP-KDD'20 FLEN FLEN: Leveraging Field for Scalable CTR Prediction βœ”οΈ
CIKM'20 DeepIM Deep Interaction Machine: A Simple but Effective Model for High-order Feature Interactions βœ”οΈ
WWW'21 FmFM FM^2: Field-matrixed Factorization Machines for Recommender Systems βœ”οΈ
WWW'21 DCN-V2 DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems βœ”οΈ

Installation

Please follow the guide for installation. In particular, FuxiCTR has the following dependent requirements.

  • python 3.6
  • pytorch v1.0/v1.1
  • pyyaml >=5.1
  • scikit-learn
  • pandas
  • numpy
  • h5py
  • tqdm

Get Started

  1. Run the demo to understand the overall workflow

  2. Run a model with dataset and model config files

  3. Preprocess raw csv data to h5 data

  4. Run a model with h5 data as input

  5. How to make configurations?

  6. Tune the model hyper-parameters via grid search

  7. Run a model with sequence features

  8. Run a model with pretrained embeddings

Code Structure

Check an overview of code structure for more details on API design.

Open Benchmarking

If you are looking for the benchmarking settings and results on the state-of-the-art CTR prediction models, please refer to the BARS-CTR-Prediction benchmark. By clicking on the "SOTA Results", you will find the benchmarking results along with the corresponding reproducing steps.

Discussion

Welcome to join our WeChat group for any questions and discussions.

Join Us

We have open positions for internships and full-time jobs. If you are interested in research and practice in recommender systems, please send your CV to [email protected].

Owner
XUEPAI
XUEPAI
A Python Package for Convex Regression and Frontier Estimation

pyStoNED pyStoNED is a Python package that provides functions for estimating multivariate convex regression, convex quantile regression, convex expect

Sheng Dai 17 Jan 08, 2023
The source codes for TME-BNA: Temporal Motif-Preserving Network Embedding with Bicomponent Neighbor Aggregation.

TME The source codes for TME-BNA: Temporal Motif-Preserving Network Embedding with Bicomponent Neighbor Aggregation. Our implementation is based on TG

2 Feb 10, 2022
πŸ… The Most Comprehensive List of Kaggle Solutions and Ideas πŸ…

πŸ… Collection of Kaggle Solutions and Ideas πŸ…

Farid Rashidi 2.3k Jan 08, 2023
[CVPR 2022 Oral] MixFormer: End-to-End Tracking with Iterative Mixed Attention

MixFormer The official implementation of the CVPR 2022 paper MixFormer: End-to-End Tracking with Iterative Mixed Attention [Models and Raw results] (G

Multimedia Computing Group, Nanjing University 235 Jan 03, 2023
CUda Matrix Multiply library.

cumm CUda Matrix Multiply library. cumm is developed during learning of CUTLASS, which use too much c++ template and make code unmaintainable. So I de

49 Dec 27, 2022
A Convolutional Transformer for Keyword Spotting

☒️ Audiomer ☒️ Audiomer: A Convolutional Transformer for Keyword Spotting [ arXiv ] [ Previous SOTA ] [ Model Architecture ] Results on SpeechCommands

49 Jan 27, 2022
An experimentation and research platform to investigate the interaction of automated agents in an abstract simulated network environments.

CyberBattleSim April 8th, 2021: See the announcement on the Microsoft Security Blog. CyberBattleSim is an experimentation research platform to investi

Microsoft 1.5k Dec 25, 2022
Anomaly Detection Based on Hierarchical Clustering of Mobile Robot Data

We proposed a new approach to detect anomalies of mobile robot data. We investigate each data seperately with two clustering method hierarchical and k-means. There are two sub-method that we used for

Zekeriyya Demirci 1 Jan 09, 2022
68 keypoint annotations for COFW test data

68 keypoint annotations for COFW test data This repository contains manually annotated 68 keypoints for COFW test data (original annotation of CFOW da

31 Dec 06, 2022
Recurrent Scale Approximation (RSA) for Object Detection

Recurrent Scale Approximation (RSA) for Object Detection Codebase for Recurrent Scale Approximation for Object Detection in CNN published at ICCV 2017

Yu Liu (Louis) 239 Dec 28, 2022
An official repository for Paper "Uformer: A General U-Shaped Transformer for Image Restoration".

Uformer: A General U-Shaped Transformer for Image Restoration Zhendong Wang, Xiaodong Cun, Jianmin Bao and Jianzhuang Liu Paper: https://arxiv.org/abs

Zhendong Wang 497 Dec 22, 2022
Microscopy Image Cytometry Toolkit

Cytokit Cytokit is a collection of tools for quantifying and analyzing properties of individual cells in large fluorescent microscopy datasets with a

Hammer Lab 106 Jan 06, 2023
ICLR21 Tent: Fully Test-Time Adaptation by Entropy Minimization

⛺️ Tent: Fully Test-Time Adaptation by Entropy Minimization This is the official project repository for Tent: Fully-Test Time Adaptation by Entropy Mi

Dequan Wang 204 Dec 25, 2022
SE3 Pose Interp - Interpolate camera pose or trajectory in SE3, pose interpolation, trajectory interpolation

SE3 Pose Interpolation Pose estimated from SLAM system are always discrete, and

Ran Cheng 4 Dec 15, 2022
Final project code: Implementing MAE with downscaled encoders and datasets, for ESE546 FA21 at University of Pennsylvania

546 Final Project: Masked Autoencoder Haoran Tang, Qirui Wu 1. Training To train the network, please run mae_pretraining.py. Please modify folder path

Haoran Tang 0 Apr 22, 2022
This is a Python wrapper for TA-LIB based on Cython instead of SWIG.

TA-Lib This is a Python wrapper for TA-LIB based on Cython instead of SWIG. From the homepage: TA-Lib is widely used by trading software developers re

John Benediktsson 7.3k Jan 03, 2023
[SIGGRAPH 2020] Attribute2Font: Creating Fonts You Want From Attributes

Attr2Font Introduction This is the official PyTorch implementation of the Attribute2Font: Creating Fonts You Want From Attributes. Paper: arXiv | Rese

Yue Gao 200 Dec 15, 2022
PyTorch implementation of paper: AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer, ICCV 2021.

AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer [Paper] [PyTorch Implementation] [Paddle Implementation] Overview This reposit

148 Dec 30, 2022
Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection

Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection Main requirements torch = 1.0 torchvision = 0.2.0 Python 3 Environm

15 Apr 04, 2022
Rethinking Portrait Matting with Privacy Preserving

Rethinking Portrait Matting with Privacy Preserving This is the official repository of the paper Rethinking Portrait Matting with Privacy Preserving.

184 Jan 03, 2023