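A large-scale recommendation algorithm library covering classic and recent recommender-system algorithms, including LR, Wide&Deep, DSSM, TDM, MIND, Word2Vec, DeepWalk, SSR, GRU4Rec, Youtube_dnn, NCF, GNN, FM, FFM, DeepFM, DCN, DIN, DIEN, DLRM, MMOE, PLE, ESMM, MAML, xDeepFM, DeepFEFM, NFM, AFM, RALM, Deep Crossing, PNN, BST, AutoInt, FGCNN, FLEN, ListWise, and more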

Overview


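What is a recommender system?

  • In an era of explosive growth of information on the internet, recommender systems are the key to helping users efficiently find the information they care about.

  • Recommender systems are also a silver bullet for helping products attract users, retain users, increase user stickiness, and raise conversion rates.

  • Countless excellent products have built a strong reputation on recommender systems that users can perceive, and countless companies have secured a place in their industry with recommender systems that address users' pain points directly.

    It is fair to say that whoever masters recommender systems gains a head start in the fierce competition over information distribution. At the same time, many problems trouble recommender-system developers: huge data volumes, complex model structures, inefficient distributed training environments, unstable online/offline consistency, demanding deployment requirements, and more.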

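What is PaddleRec?

  • A one-stop, out-of-the-box toolkit for search and recommendation models, built on the PaddlePaddle ecosystem
  • An end-to-end recommender-system solution for beginners, developers, and researchers
  • A complete recommendation and search algorithm library covering content understanding, matching, recall, ranking, multi-task learning, re-ranking, and other tasks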

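Quick start

Requirements

  • Python 2.7.15 / 3.5 / 3.6 / 3.7; Python 3.7 is recommended, and python in the examples refers to Python 3.7 by default

  • PaddlePaddle >=2.0

  • Operating system: Windows/Mac/Linux

    On Windows, PaddleRec currently supports single-machine training only; Linux is recommended for distributed training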
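The requirements above can be checked before installing anything. The following is a minimal sketch using only the standard library; the helper names (parse_version, meets_minimum, python_is_listed) are hypothetical and are not part of PaddleRec or PaddlePaddle. The installed PaddlePaddle version string would have to be supplied by the caller.

```python
import sys

# Python versions listed in the requirements above.
SUPPORTED_PYTHONS = {(2, 7), (3, 5), (3, 6), (3, 7)}


def parse_version(text, width=3):
    """Parse '2.1.0' or '2.0.0rc0' into a tuple of ints, zero-padded to `width` parts."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    parts = parts[:width]
    return tuple(parts + [0] * (width - len(parts)))


def meets_minimum(installed, minimum="2.0"):
    """True if the installed version string is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def python_is_listed(version_info=sys.version_info):
    """True if the (major, minor) interpreter version is one of the listed versions."""
    return tuple(version_info[:2]) in SUPPORTED_PYTHONS


if __name__ == "__main__":
    print("Python listed in requirements:", python_is_listed())
    print("PaddlePaddle 2.1.0 satisfies >=2.0:", meets_minimum("2.1.0"))
```

Comparing parsed tuples avoids the trap of comparing version strings lexicographically, where "10.0" would sort before "2.0".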

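Installing Paddle

  • Install with pip for GPU environments
    python -m pip install paddlepaddle-gpu==2.0.0 
  • Install with pip for CPU environments
    python -m pip install paddlepaddle # gcc8 

For other versions, see the download and installation guide on the official Paddle website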

Downloading PaddleRec

git clone https://github.com/PaddlePaddle/PaddleRec/
cd PaddleRec

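Quick run

We use the dnn ranking model as an example to show how to launch PaddleRec with a single command. The training data comes from the Criteo dataset, from which we have extracted 100 samples:

python -u tools/trainer.py -m models/rank/dnn/config.yaml # dynamic graph training
python -u tools/static_trainer.py -m models/rank/dnn/config.yaml # static graph training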

Documentation

Project background

Getting-started tutorial

Advanced tutorial

FAQ

Supported models

Category Model Single-machine CPU Single-machine GPU Distributed CPU Distributed GPU Supported version Paper
Content understanding TextCnn x >=2.1.0 [EMNLP 2014]Convolutional neural networks for sentence classification
Content understanding TagSpace x >=2.1.0 [EMNLP 2014]TagSpace: Semantic Embeddings from Hashtags
Matching DSSM x >=2.1.0 [CIKM 2013]Learning Deep Structured Semantic Models for Web Search using Clickthrough Data
Matching MultiView-Simnet x >=2.1.0 [WWW 2015]A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems
Recall TDM >=1.8.0 >=1.8.0 1.8.5 [KDD 2018]Learning Tree-based Deep Model for Recommender Systems
Recall fasttext x x 1.8.5 [EACL 2017]Bag of Tricks for Efficient Text Classification
Recall MIND x x >=2.1.0 [2019]Multi-Interest Network with Dynamic Routing for Recommendation at Tmall
Recall Word2Vec x >=2.1.0 [NIPS 2013]Distributed Representations of Words and Phrases and their Compositionality
Recall DeepWalk x x >=2.1.0 [SIGKDD 2014]DeepWalk: Online Learning of Social Representations
Recall SSR 1.8.5 [SIGIR 2016]Multi-Rate Deep Learning for Temporal Recommendation
Recall Gru4Rec 1.8.5 [2015]Session-based Recommendations with Recurrent Neural Networks
Recall Youtube_dnn 1.8.5 [RecSys 2016]Deep Neural Networks for YouTube Recommendations
Recall NCF >=2.1.0 [WWW 2017]Neural Collaborative Filtering
Recall GNN 1.8.5 [AAAI 2019]Session-based Recommendation with Graph Neural Networks
Recall RALM 1.8.5 [KDD 2019]Real-time Attention Based Look-alike Model for Recommender System
Ranking Logistic Regression x >=2.1.0 /
Ranking Dnn >=2.1.0 /
Ranking FM x >=2.1.0 [IEEE Data Mining 2010]Factorization machines
Ranking FFM x >=2.1.0 [RECSYS 2016]Field-aware Factorization Machines for CTR Prediction
Ranking FNN x 1.8.5 [ECIR 2016]Deep Learning over Multi-field Categorical Data
Ranking Deep Crossing x 1.8.5 [ACM 2016]Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features
Ranking Pnn x 1.8.5 [ICDM 2016]Product-based Neural Networks for User Response Prediction
Ranking DCN x >=2.1.0 [KDD 2017]Deep & Cross Network for Ad Click Predictions
Ranking NFM x 1.8.5 [SIGIR 2017]Neural Factorization Machines for Sparse Predictive Analytics
Ranking AFM x 1.8.5 [IJCAI 2017]Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks
Ranking DMR x x >=2.1.0 [AAAI 2020]Deep Match to Rank Model for Personalized Click-Through Rate Prediction
Ranking DeepFM x >=2.1.0 [IJCAI 2017]DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
Ranking xDeepFM x >=2.1.0 [KDD 2018]xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems
Ranking DIN x >=2.1.0 [KDD 2018]Deep Interest Network for Click-Through Rate Prediction
Ranking DIEN x >=2.1.0 [AAAI 2019]Deep Interest Evolution Network for Click-Through Rate Prediction
Ranking dlrm x >=2.1.0 [CoRR 2019]Deep Learning Recommendation Model for Personalization and Recommendation Systems
Ranking DeepFEFM x >=2.1.0 [arXiv 2020]Field-Embedded Factorization Machines for Click-through rate prediction
Ranking BST x 1.8.5 [DLP_KDD 2019]Behavior Sequence Transformer for E-commerce Recommendation in Alibaba
Ranking AutoInt x 1.8.5 [CIKM 2019]AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks
Ranking Wide&Deep x >=2.1.0 [DLRS 2016]Wide & Deep Learning for Recommender Systems
Ranking FGCNN 1.8.5 [WWW 2019]Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction
Ranking Fibinet 1.8.5 [RecSys19]FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction
Ranking Flen 1.8.5 [2019]FLEN: Leveraging Field for Scalable CTR Prediction
Multi-task PLE >=2.1.0 [RecSys 2020]Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations
Multi-task ESMM >=2.1.0 [SIGIR 2018]Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate
Multi-task MMOE >=2.1.0 [KDD 2018]Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts
Multi-task ShareBottom >=2.1.0 [1998]Multitask learning
Multi-task Maml x x >=2.1.0 [PMLR 2017]Model-agnostic meta-learning for fast adaptation of deep networks
Re-ranking Listwise x 1.8.5 [2019]Sequential Evaluation and Generation Framework for Combinatorial Recommender System

Community



Release history

  • 2021.11.19 - PaddleRec v2.2.0
  • 2021.05.19 - PaddleRec v2.1.0
  • 2021.01.29 - PaddleRec v2.0.0
  • 2020.10.12 - PaddleRec v1.8.5
  • 2020.06.17 - PaddleRec v0.1.0
  • 2020.06.03 - PaddleRec v0.0.2
  • 2020.05.14 - PaddleRec v0.0.1

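License

This project is released under the Apache 2.0 license.

Contact us

For comments, suggestions, or bugs encountered during use, please open a GitHub Issue

You can also reach us through the following channels:

  • QQ group number: 861717190
  • WeChat assistant ID: wxid_0xksppzk5p7f22
  • Include the note REC to be added to the group automatically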

     


Comments
  • Error while finding module specification for 'paddlerec.run' (ModuleNotFoundError: No module named 'paddlerec')

    Hi, this module does not seem to exist. I ran the commands below directly:

    python -m paddlerec.run -m ./config.yaml
    #or
    python -m paddlerec.run -m paddlerec.models.recall.gnn
    

    Error while finding module specification for 'paddlerec.run' (ModuleNotFoundError: No module named 'paddlerec')

    How can I fix this?

    opened by ucasiggcas 13
  • TIPC test for the wide_deep model

    Hello, after forking the project I ran the tests following the instructions in the readme.md under test_tipc. I executed the following two commands:

    1. bash test_tipc/prepare.sh ./test_tipc/configs/wide_deep/train_infer_python.txt 'lite_train_lite_infer'
    2. bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/wide_deep/train_infer_python.txt 'lite_train_lite_infer'

    With enable_tensorRT=True the test fails with the following error report:

    W0504 20:43:30.999462 1664 analysis_predictor.cc:795] The one-time configuration of analysis predictor failed, which may be due to native predictor called first and its configurations taken effect.
    I0504 20:43:31.013504 1664 analysis_predictor.cc:576] TensorRT subgraph engine is enabled
    --- Running analysis [ir_graph_build_pass]
    --- Running analysis [ir_graph_clean_pass]
    --- Running analysis [ir_analysis_pass]
    Traceback (most recent call last):
      File "tools/paddle_infer.py", line 169, in <module>
        main(args)
      File "tools/paddle_infer.py", line 115, in main
        predictor, pred_config = init_predictor(args)
      File "tools/paddle_infer.py", line 93, in init_predictor
        predictor = create_predictor(config)
    ValueError: (InvalidArgument) Pass tensorrt_subgraph_pass has not been registered. Please use the paddle inference library compiled with tensorrt or disable the tensorrt engine in inference configuration!
      [Hint: Expected Has(pass_type) == true, but received Has(pass_type):0 != true:1.] (at /paddle/paddle/fluid/framework/ir/pass.h:236)

    The environment is AI Studio Classic with a V100 32GB. Is this caused by the environment?

    opened by Li-fAngyU 5
  • Cannot visualize gradients: parameter gradients are None

    Code:

    loss.backward(retain_graph=True)
    optimizer.step()
    for name, param in dy_model.named_parameters():
        if param.grad is not None:
            # [499, 117]
            tag_name = "train/" + name + '/grad1'
            log_visual.add_histogram(
                tag=tag_name,
                values=param.grad.numpy(),
                step=step_num)
    optimizer.clear_grad()
    

    Error message:

    Warning:
    tensor.grad will return the tensor value of the gradient. This is an incompatible upgrade for tensor.grad API.  It's return type changes from numpy.ndarray in version 2.0 to paddle.Tensor in version 2.1.0.  If you want to get the numpy value of the gradient, you can use :code:`x.grad.numpy()` 
      warnings.warn(warning_msg)
    ../../../tools/trainer.py:50: RuntimeWarning: divide by zero encountered in true_divide
      similiarity = np.dot(a, b.T)/(a_norm * b_norm)
    ../../../tools/trainer.py:50: RuntimeWarning: invalid value encountered in true_divide
      similiarity = np.dot(a, b.T)/(a_norm * b_norm)
    Traceback (most recent call last):
      File "../../../tools/trainer.py", line 284, in <module>
        main(args)
      File "../../../tools/trainer.py", line 204, in main
        step=step_num,)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/visualdl/writer/writer.py", line 435, in add_histogram
        hist, bin_edges = np.histogram(values, bins=buckets)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/numpy/lib/histograms.py", line 780, in histogram
        bin_edges, uniform_bins = _get_bin_edges(a, bins, range, weights)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/numpy/lib/histograms.py", line 417, in _get_bin_edges
        first_edge, last_edge = _get_outer_edges(a, range)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/numpy/lib/histograms.py", line 315, in _get_outer_edges
        "autodetected range of [{}, {}] is not finite".format(first_edge, last_edge))
    ValueError: autodetected range of [nan, nan] is not finite
    

    Any help would be much appreciated, thanks!
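The log above shows two separate symptoms: a divide-by-zero in a cosine-similarity computation, and a histogram call that rejects NaN values. The following is a minimal pure-Python sketch of guarding against both. It is not taken from PaddleRec's trainer.py; the function names and the eps threshold are hypothetical, and it does not diagnose why the gradients became NaN in the first place.

```python
import math


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity of two equal-length vectors.

    Returns 0.0 when either vector has a (near-)zero norm, instead of
    dividing by zero and producing inf or NaN.
    """
    dot = sum(x * y for x, y in zip(a, b))
    a_norm = math.sqrt(sum(x * x for x in a))
    b_norm = math.sqrt(sum(y * y for y in b))
    denom = a_norm * b_norm
    if denom < eps:
        return 0.0
    return dot / denom


def finite_only(values):
    """Drop NaN and +/-inf so a histogram can compute a finite range."""
    return [v for v in values if math.isfinite(v)]


if __name__ == "__main__":
    print(cosine_similarity([0.0, 0.0], [1.0, 2.0]))
    print(finite_only([1.0, float("nan"), float("inf"), 2.0]))
```

Filtering before logging only hides the symptom; if many values are dropped, the NaN source (for example an all-zero vector being normalized) still needs to be fixed.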

    opened by Waterkin 4
  • Possible bug in fm net.py

    https://github.com/PaddlePaddle/PaddleRec/blob/bcddcf46e5cd8d4e6b2c5ee8d0d5521e292a2a81/models/rank/fm/net.py#L40 Is the code at this point in the fm model written incorrectly? It should be self.fm.forward(sparse_inputs, dense_inputs), shouldn't it?

    opened by tz28 4
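  • Where is the prediction output written?

    Following https://github.com/PaddlePaddle/PaddleRec/blob/master/tools/readme.md I ran python -u tools/static_infer.py -m models/rank/dnn/config.yaml. This runs prediction on the data in the test directory, but which directory is the output written to? Looking at the source code, it mainly reports AUC; there are no per-row prediction results.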

    opened by veridone 3
  • A question about the DIEN network code

    In the DIEN network code, the two lines below both call add_sublayer() [self.add_sublayer('linear_%d' % i, linear)]. Since the name passed in is 'linear_%d' % i in both cases, will the later call at L144 overwrite the one at L123? If it does, what is the purpose of calling add_sublayer() twice? https://github.com/PaddlePaddle/PaddleRec/blob/master/models/rank/dien/net.py#L123 https://github.com/PaddlePaddle/PaddleRec/blob/master/models/rank/dien/net.py#L144

    opened by tz28 3
  • Inference with paddle_infer.py fails for the naml model

    After exporting the naml model as a Paddle Inference model, I ran prediction with the paddle_infer.py script, but it does not run correctly. This is the command I executed: python3 ../../../tools/paddle_infer.py --model_file=PaddleRec_model/naml/model.pdmodel --params_file=PaddleRec_model/naml/model.pdiparams --use_gpu=False --data_dir=data/sample_data/train/ --reader_file=NAMLDataReader.py --batchsize=1 --benchmark=False

    Because NAMLDataReader.py does not read the config, I also added the following code to it: config = load_yaml("/Work/PaddleRec/models/rank/naml/config.yaml")

    It still reports an input shape mismatch: InvalidArgumentError: The shape of input[0] and input[1] is expected to be equal.But received input[0]'s shape = [1, 10], input[1]'s shape = [1, 5, 10]. [Hint: Expected inputs_dims[i].size() == out_dims.size(), but received inputs_dims[i].size():3 != out_dims.size():2.] (at /paddle/paddle/fluid/operators/concat_op.h:40) [operator < concat > error]

    Is there any way to run inference on naml with Paddle Inference?

    opened by shentanyue 2
  • Convert mind model met the problem

    Hi, when I used paddle_serving_client to convert the trained model, I hit the error below. How can I fix it?

    Both of the following methods give the same result.

    python3 -m paddle_serving_client.convert --dirname model --model_filename model/rec_static.pdmodel --params_filename model/rec_static.pdparams

    import paddle_serving_client.io as serving_io
    serving_io.inference_model_to_serving('model', serving_server="serving_server", serving_client="serving_client", model_filename='model/rec_static.pdmodel', params_filename='model/rec_static.pdparams')

    InvalidArgumentError: Deserialize to tensor failed, maybe the loaded file is not a paddle model(expected file format: 0, but 2459239552 found). [Hint: Expected version == 0U, but received version:2459239552 != 0U:0.] (at /paddle/paddle/fluid/framework/lod_tensor.cc:329) [operator < load_combine > error]

    opened by linWujl 2
  • ESMM demo fails to run on the 1.8.5 branch

    Environment: an internal company machine (the paddle information was attached as an image).

    Error message:

    `==================================================================================================== Runtime Envs Value

    train.trainer.platform LINUX train.trainer.executor_mode train train.trainer.trainer GeneralTrainer train.trainer.engine single train.trainer.threads 2

    ==================================================================================================== paddlerec Global Envs Value

    hyper_parameters.optimizer.class adam dataset.dataset_infer.type QueueDataset runner.train_runner.epochs 3 runner.infer_runner.name infer_runner phase.infer.dataset_name dataset_infer phase.infer.thread_num 1 runner.infer_runner.phases ['infer'] dataset.dataset_train.name dataset_train dataset.dataset_train.batch_size 5 runner.train_runner.save_checkpoint_interval 1 hyper_parameters.vocab_size 737946 runner.infer_runner.selected_gpus 0 runner.infer_runner.device gpu dataset.dataset_train.type QueueDataset phase.train.thread_num 1 dataset.dataset_infer.name dataset_infer runner.infer_runner.init_model_path increment_esmm/1 hyper_parameters.embed_size 12 dataset.dataset_infer.data_converter models/multitask/esmm/esmm_reader.py runner.train_runner.print_interval 10 dataset.dataset_train.data_path models/multitask/esmm/data/train runner.train_runner.phases ['train'] hyper_parameters.optimizer.learning_rate 0.001 phase.train.model models/multitask/esmm/model.py phase.infer.name infer runner.infer_runner.print_interval 1 dataset.dataset_train.data_converter models/multitask/esmm/esmm_reader.py hyper_parameters.optimizer.strategy async runner.train_runner.selected_gpus 0 dataset.dataset_infer.batch_size 5 phase.infer.model models/multitask/esmm/model.py runner.train_runner.save_inference_path inference runner.train_runner.name train_runner phase.train.name train runner.infer_runner.class infer runner.train_runner.save_inference_interval 4 dataset.dataset_infer.data_path models/multitask/esmm/data/test phase.train.dataset_name dataset_train runner.train_runner.device gpu runner.train_runner.class train mode ['train_runner', 'infer_runner'] workspace models/multitask/esmm runner.train_runner.save_checkpoint_path increment_esmm

    PaddleRec: Runner train_runner Begin PaddleRec run on device GPU: 0 Executor Mode: train processor_register begin Running SingleInstance. Running SingleNetwork. Warning:please make sure there are no hidden files in the dataset folder and check these hidden files:[] File_list: ['models/multitask/esmm/data/train/small.txt'] Warning:please make sure there are no hidden files in the dataset folder and check these hidden files:[] File_list: ['models/multitask/esmm/data/test/small.txt'] Running SingleStartup. W0701 17:38:54.887231 58053 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.1, Runtime API Version: 10.0 W0701 17:38:54.893249 58053 device_context.cc:260] device: 0, cuDNN Version: 7.6. Running SingleRunner. Traceback (most recent call last): File "/home/work/niuyuhang/tools/python_paddle185_gpu/python/lib/python2.7/site-packages/paddle_rec-1.8.5-py2.7.egg/paddlerec/core/trainers/framework/../../utils/dataset_instance.py", line 21, in from paddlerec.core.utils.envs import lazy_instance_by_fliename ImportError: No module named paddlerec.core.utils.envs W0701 17:38:58.462741 58053 init.cc:226] Warning: PaddlePaddle catches a failure signal, it may not work properly W0701 17:38:58.462769 58053 init.cc:228] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle W0701 17:38:58.462775 58053 init.cc:231] The detail failure signal is:

    W0701 17:38:58.462781 58053 init.cc:234] *** Aborted at 1625132338 (unix time) try "date -d @1625132338" if you are using GNU date *** W0701 17:38:58.464403 58053 init.cc:234] PC: @ 0x0 (unknown) W0701 17:38:58.464587 58053 init.cc:234] *** SIGSEGV (@0x0) received by PID 58053 (TID 0x7f57f4528700) from PID 0; stack trace: *** W0701 17:38:58.465903 58053 init.cc:234] @ 0x7f57f3ce0160 (unknown) W0701 17:38:58.467234 58053 init.cc:234] @ 0x7f572085ee46 _ZNSt19_Sp_counted_deleterIP8_IO_FILEZN6paddle9framework11shell_popenERKSsS5_PiEUlS1_E_SaIiELN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv W0701 17:38:58.469238 58053 init.cc:234] @ 0x7f571d0744a9 std::_Sp_counted_base<>::_M_release() W0701 17:38:58.473094 58053 init.cc:234] @ 0x7f571d3c1258 paddle::framework::MultiSlotDataFeed::~MultiSlotDataFeed() W0701 17:38:58.474995 58053 init.cc:234] @ 0x7f571d0744a9 std::_Sp_counted_base<>::_M_release() W0701 17:38:58.476524 58053 init.cc:234] @ 0x7f571d390895 paddle::framework::DatasetImpl<>::DestroyReaders() W0701 17:38:58.477910 58053 init.cc:234] @ 0x7f571d1bb518 ZZN8pybind1112cpp_function10initializeIZNS0_C1IvN6paddle9framework7DatasetEJEJNS_4nameENS_9is_methodENS_7siblingENS_10call_guardIJNS_18gil_scoped_releaseEEEEEEEMT0_FT_DpT1_EDpRKT2_EUlPS5_E_vJSM_EJS6_S7_S8_SB_EEEvOSD_PFSC_SF_ESL_ENUlRNS_6detail13function_callEE1_4_FUNEST W0701 17:38:58.479228 58053 init.cc:234] @ 0x7f571d0b40b9 pybind11::cpp_function::dispatcher() W0701 17:38:58.480751 58053 init.cc:234] @ 0x7f57f3ff9bb8 PyEval_EvalFrameEx W0701 17:38:58.482113 58053 init.cc:234] @ 0x7f57f3ffa460 PyEval_EvalFrameEx W0701 17:38:58.483494 58053 init.cc:234] @ 0x7f57f3ffd0bd PyEval_EvalCodeEx W0701 17:38:58.484843 58053 init.cc:234] @ 0x7f57f3ffa345 PyEval_EvalFrameEx W0701 17:38:58.486197 58053 init.cc:234] @ 0x7f57f3ffd0bd PyEval_EvalCodeEx W0701 17:38:58.487534 58053 init.cc:234] @ 0x7f57f3ffa345 PyEval_EvalFrameEx W0701 17:38:58.488898 58053 init.cc:234] @ 0x7f57f3ffd0bd PyEval_EvalCodeEx W0701 17:38:58.490255 
58053 init.cc:234] @ 0x7f57f3ffa345 PyEval_EvalFrameEx W0701 17:38:58.491608 58053 init.cc:234] @ 0x7f57f3ffd0bd PyEval_EvalCodeEx W0701 17:38:58.492956 58053 init.cc:234] @ 0x7f57f3ffa345 PyEval_EvalFrameEx W0701 17:38:58.494313 58053 init.cc:234] @ 0x7f57f3ffd0bd PyEval_EvalCodeEx W0701 17:38:58.495658 58053 init.cc:234] @ 0x7f57f3ffa345 PyEval_EvalFrameEx W0701 17:38:58.497013 58053 init.cc:234] @ 0x7f57f3ffd0bd PyEval_EvalCodeEx W0701 17:38:58.498359 58053 init.cc:234] @ 0x7f57f3ffa345 PyEval_EvalFrameEx W0701 17:38:58.499702 58053 init.cc:234] @ 0x7f57f3ffa460 PyEval_EvalFrameEx W0701 17:38:58.501044 58053 init.cc:234] @ 0x7f57f3ffa460 PyEval_EvalFrameEx W0701 17:38:58.502399 58053 init.cc:234] @ 0x7f57f3ffd0bd PyEval_EvalCodeEx W0701 17:38:58.503741 58053 init.cc:234] @ 0x7f57f3ffd1f2 PyEval_EvalCode W0701 17:38:58.505081 58053 init.cc:234] @ 0x7f57f3ffc858 PyEval_EvalFrameEx W0701 17:38:58.506446 58053 init.cc:234] @ 0x7f57f3ffd0bd PyEval_EvalCodeEx W0701 17:38:58.507776 58053 init.cc:234] @ 0x7f57f3ffa345 PyEval_EvalFrameEx W0701 17:38:58.509127 58053 init.cc:234] @ 0x7f57f3ffd0bd PyEval_EvalCodeEx W0701 17:38:58.510396 58053 init.cc:234] @ 0x7f57f3f73eb0 function_call W0701 17:38:58.511759 58053 init.cc:234] @ 0x7f57f3f41df3 PyObject_Call train.sh: line 3: 58053 Segmentation fault (core dumped) /home/work/niuyuhang/tools/python_paddle185_gpu/python/bin/python -m paddlerec.run -m models/multitask/esmm/config.yaml`

    The command I ran was paddlepython185 -m paddlerec.run -m models/multitask/esmm/config.yaml rather than python -m paddlerec.run -m models/multitask/esmm/config.yaml. I tried adding import paddle to the failing file /home/work/niuyuhang/tools/python_paddle185_gpu/python/lib/python2.7/site-packages/paddle_rec-1.8.5-py2.7.egg/paddlerec/core/trainers/framework/../../utils/dataset_instance.py, but the same error is still reported, so I suspect it is caused by python being invoked forcibly somewhere. However, adding alias python="/home/work/niuyuhang/tools/python_paddle185_gpu/python/bin/python" before running had no effect; the error persists.

    opened by Niuyuhang03 2
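  • Wide&Deep code uses a single embedding for all sparse features

    The wide&deep code first embeds the sparse features. Looking at the code, it simply counts the number of distinct values across all sparse features and uses that as sparse_feature_number to initialize a single embedding layer. Doesn't this mean that when different sparse features share the same value, their embeddings are identical? For example, if field A contains the value 2 and field B also contains the value 2, is there no longer any way to tell them apart?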
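One common way to keep fields apart while still using a single embedding table is to shift each field's raw ids by a per-field offset, so that value 2 in field A and value 2 in field B map to different rows. The sketch below illustrates that idea in plain Python. It is an assumption about a general technique, not a description of what PaddleRec's Wide&Deep preprocessing actually does, and the function names and field sizes are hypothetical.

```python
def build_field_offsets(field_sizes):
    """Return (offsets, total_rows) for a shared embedding table.

    field_sizes[i] is the number of distinct raw ids in field i.
    offsets[i] is the first row of the table reserved for field i.
    """
    offsets = []
    total = 0
    for size in field_sizes:
        offsets.append(total)
        total += size
    return offsets, total


def to_global_ids(raw_ids, offsets):
    """Map one sample's per-field raw ids to rows of the shared table."""
    return [raw + off for raw, off in zip(raw_ids, offsets)]


if __name__ == "__main__":
    # Three hypothetical fields with 3, 5, and 4 distinct values.
    offsets, total_rows = build_field_offsets([3, 5, 4])
    # Fields A and B both hold raw value 2, yet land on different rows.
    print(to_global_ids([2, 2, 1], offsets), "of", total_rows, "rows")
```

Here total_rows would play the role of sparse_feature_number when sizing the table. Hashing the field name together with the value is an alternative that trades exactness for a fixed table size.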

    opened by lhbrichard 2
  • about dssm loss

    https://github.com/PaddlePaddle/PaddleRec/blob/5cd4ae7e8da86262bc379b253d9f6cf83c4a7786/models/match/dssm/train.py#L53

    I think this should be changed to the code below. Without the axis argument, the subsequent paddle.mean call has no effect, and the resulting avg_cost is the sum of the per-sample losses in a batch rather than their average.

    loss = -paddle.sum(paddle.log(hit_prob), axis=-1)
    avg_cost = paddle.mean(x=loss)
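The difference between the two reductions can be checked by hand without Paddle. The sketch below mimics both variants with nested lists, assuming hit_prob has shape [batch, 1]; that shape, the function names, and the sample probabilities are assumptions made for illustration.

```python
import math


def loss_without_axis(hit_prob):
    """Sum over every entry: the whole batch collapses into one scalar."""
    return -sum(math.log(p) for row in hit_prob for p in row)


def loss_with_axis(hit_prob):
    """Sum over the last axis only: one loss value per sample."""
    return [-sum(math.log(p) for p in row) for row in hit_prob]


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    hit_prob = [[0.5], [0.25], [0.125]]  # hypothetical batch of 3 samples
    scalar = loss_without_axis(hit_prob)
    # The mean of a single scalar is that scalar, i.e. the batch sum.
    print("no axis  :", mean([scalar]))
    # The mean of per-sample losses is the batch average.
    print("axis=-1 :", mean(loss_with_axis(hit_prob)))
```

With a batch of 3 the first value is exactly 3 times the second, which matches the observation in the issue that the unreduced version scales with batch size.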
    
    opened by JepsonWong 2
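  • TDM quick start fails to run: KeyError: '102489422'

    @wangzhen38 I tried to run the example following the README.md tutorial. I am using stable 2.3, and I hit an error while following the quick-start section. I completed the initial training successfully and built the tree by clustering, but after completing training again (Step 2) I could not run prediction. The error is as follows: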

    Traceback (most recent call last):
      File "infer.py", line 295, in <module>
        first_layer_set, config)
      File "infer.py", line 210, in infer
        for groudtruth, user_input in reader():
      File "infer.py", line 60, in reader
        groudtruth, output_list = self.line_process(line)
      File "infer.py", line 49, in line_process
        bidword_list = [self.id_code[s] for s in bidword_list]
      File "infer.py", line 49, in <listcomp>
        bidword_list = [self.id_code[s] for s in bidword_list]
    KeyError: '102489422'
    

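    After debugging, I found that the key field of the ids_id.txt generated by the second tree build is completely different from that of the file generated by the first tree build; the two differ greatly.

    Moreover, none of the ids in the ids_id.txt generated by the second tree build exist in the test data, which is why the KeyError occurs. I am not sure which step went wrong; it may be that the ids in the generated epoch_0_item_embedding.txt do not match the original data. @wangzhen38 Did I make a mistake in one of the steps?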
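To confirm how far the two id sets have drifted apart, the lookup that raises the KeyError can be replaced temporarily by a version that records the missing ids instead of failing. This is only a diagnostic sketch: the name id_code is borrowed from the traceback above, while split_known_ids and the sample data are hypothetical and not part of the TDM example.

```python
def split_known_ids(ids, id_code):
    """Separate ids that exist in the id-to-code mapping from those that do not.

    Returns (codes, missing): codes for the known ids in their original
    order, and the list of ids that would have raised KeyError.
    """
    codes = [id_code[s] for s in ids if s in id_code]
    missing = [s for s in ids if s not in id_code]
    return codes, missing


if __name__ == "__main__":
    id_code = {"11": 0, "22": 1}               # hypothetical mapping
    codes, missing = split_known_ids(["11", "102489422", "22"], id_code)
    print("codes:", codes, "missing:", missing)
```

If nearly every id ends up in missing, the mapping file and the test data were most likely produced from different id spaces, which matches the suspicion stated in the issue.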

    opened by Jim59-Chen 5
  • deeprec model does not print real-time accuracy during training

    The training output (attached as an image) contains no accuracy information. Looking at dygraph_model.py in the deeprec directory, the part of the train_forward function that updates the evaluation metrics does not appear to be implemented. Specifically, at this location: https://github.com/PaddlePaddle/PaddleRec/blob/24bea1bfb6110442f5ade28ec6ceba96aa8e455b/models/rank/deeprec/dygraph_model.py#L75 I hope this can be fixed.

    opened by USTCKAY 1
  • Is the epoch_num setting for the full-data DCN model configured correctly?

    The config_bigdata.yaml for DCN sets epochs to 10. During training the AUC far exceeds the stated benchmark of 0.777, but the AUC at inference does not reach the training level, and the models produced in the last few epochs do not even reach the benchmark. I suspect overfitting. Is there a problem with how this parameter is configured? I also noticed that other models using the criteo dataset set the epoch parameter to 1; would 1 be a better setting for DCN as well?

    Training accuracy:

    2022-11-24 23:10:42,973 - INFO - epoch: 0 done, auc: 0.797744, epoch time: 10259.33 s
    2022-11-25 01:56:55,257 - INFO - epoch: 1 done, auc: 0.813983, epoch time: 9972.15 s
    2022-11-25 04:42:34,678 - INFO - epoch: 2 done, auc: 0.824308, epoch time: 9939.29 s
    2022-11-25 07:27:25,202 - INFO - epoch: 3 done, auc: 0.832276, epoch time: 9890.38 s
    2022-11-25 10:09:49,307 - INFO - epoch: 4 done, auc: 0.838268, epoch time: 9743.98 s
    2022-11-25 12:52:43,571 - INFO - epoch: 5 done, auc: 0.842872, epoch time: 9774.13 s
    2022-11-25 15:36:14,327 - INFO - epoch: 6 done, auc: 0.846537, epoch time: 9810.63 s
    2022-11-25 18:24:26,853 - INFO - epoch: 7 done, auc: 0.849568, epoch time: 10092.39 s
    2022-11-25 21:05:38,557 - INFO - epoch: 8 done, auc: 0.852145, epoch time: 9671.57 s
    2022-11-25 23:49:26,100 - INFO - epoch: 9 done, auc: 0.854425, epoch time: 9827.41 s

    Inference accuracy:

    2022-11-28 10:05:40,387 - INFO - epoch: 0 done, auc: 0.800802, epoch time: 407.88 s
    2022-11-28 10:12:20,827 - INFO - epoch: 1 done, auc: 0.801374, epoch time: 400.44 s
    2022-11-28 10:19:00,820 - INFO - epoch: 2 done, auc: 0.796485, epoch time: 399.99 s
    2022-11-28 10:25:47,121 - INFO - epoch: 3 done, auc: 0.790854, epoch time: 406.30 s
    2022-11-28 10:32:18,459 - INFO - epoch: 4 done, auc: 0.786025, epoch time: 391.34 s
    2022-11-28 10:38:54,057 - INFO - epoch: 5 done, auc: 0.782249, epoch time: 395.60 s
    2022-11-28 10:45:33,497 - INFO - epoch: 6 done, auc: 0.778672, epoch time: 399.44 s
    2022-11-28 10:52:11,217 - INFO - epoch: 7 done, auc: 0.775637, epoch time: 397.72 s
    2022-11-28 10:58:49,925 - INFO - epoch: 8 done, auc: 0.773437, epoch time: 398.71 s
    2022-11-28 11:05:31,881 - INFO - epoch: 9 done, auc: 0.770960, epoch time: 401.96 s
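The inference AUC in the log peaks early and then declines, which is the usual signature that a checkpoint should be chosen by held-out AUC rather than by the last epoch. The sketch below applies two simple rules to the inference numbers quoted above: pick the best epoch, and find where a patience-based early stop would have triggered. The function names and the patience value are hypothetical, and nothing here is part of PaddleRec's trainer.

```python
def best_epoch(eval_auc):
    """Return (epoch_index, auc) of the highest held-out AUC."""
    best = max(range(len(eval_auc)), key=lambda i: eval_auc[i])
    return best, eval_auc[best]


def early_stop_epoch(eval_auc, patience=2):
    """Return the epoch at which training would stop.

    Training stops once the AUC has failed to improve on the best value
    seen so far for `patience` consecutive epochs.
    """
    best_value, best_idx = float("-inf"), -1
    for i, value in enumerate(eval_auc):
        if value > best_value:
            best_value, best_idx = value, i
        elif i - best_idx >= patience:
            return i
    return len(eval_auc) - 1


if __name__ == "__main__":
    # Inference AUC per epoch, copied from the log in this issue.
    infer_auc = [0.800802, 0.801374, 0.796485, 0.790854, 0.786025,
                 0.782249, 0.778672, 0.775637, 0.773437, 0.770960]
    print("best epoch:", best_epoch(infer_auc))
    print("would stop at epoch:", early_stop_epoch(infer_auc, patience=2))
```

On these numbers the best checkpoint is epoch 1, and a patience of 2 would stop training at epoch 3, saving most of the roughly 10,000 seconds per epoch reported in the training log.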

    opened by USTCKAY 2
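  • Business inquiry about recommendation algorithms

    Hello PaddleRec team: is there an end-to-end pipeline recommendation product, similar to the one in paddleNLP, that can be deployed directly into a business? (Some time ago I built a specification query system end to end on paddleNLP: https://mp.weixin.qq.com/s/zEHU_aDctre8e2DbJjhlfA I now need to deliver precise recommendation of specification clauses based on the actual query workload.) Thanks.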

    opened by bruce0210 1
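  • Problem with the hash function

    I found that some preprocessing scripts, such as https://github.com/PaddlePaddle/PaddleRec/blob/master/datasets/criteo_lr/get_slot_data.py, use Python's built-in hash function when processing categorical features, and this function returns different hash values for the same input in different processes. Searching earlier issues, I found a fix: https://github.com/PaddlePaddle/PaddleRec/pull/476/commits/a875c7b95d72ad7e6997312112ee246281f54660 but it only fixed the batch-reading scripts for w&d and dnn. The other scripts should be fixed as well so that nobody falls into this pitfall.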
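Since Python 3.3 the built-in hash of a str is salted per process unless PYTHONHASHSEED is fixed, so feature ids derived from it can differ between preprocessing and serving. A process-independent alternative is to derive the bucket from a cryptographic digest. The sketch below uses hashlib.md5 from the standard library; the function name stable_hash and the bucket count are hypothetical, and this is not necessarily the fix applied in the pull request linked above.

```python
import hashlib


def stable_hash(value, num_buckets):
    """Map a categorical value to a bucket in [0, num_buckets).

    The result depends only on the value's text, so it is identical
    across processes, machines, and Python versions.
    """
    digest = hashlib.md5(str(value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


if __name__ == "__main__":
    # The same input yields the same bucket on every run.
    print(stable_hash("C1=68fd1e64", 1000001))
```

MD5 is used here only for its stable, well-distributed output, not for security; any fixed digest would serve the same purpose.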

    opened by laigood 1
Releases(v2.3.0)
  • v2.3.0(Jun 20, 2022)

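    PaddleRec v2.3.0 Release Note

    Highlights

    • Added online running, so users can try PaddleRec at zero cost
    • Added a paper directory that tracks the latest advances in recommender systems
    • Added 16 classic datasets and 17 models
    • Added a list of external contributors
    • Added a latest-news module to the home page so users can quickly get the latest PaddleRec information

    New features and improvements

    • Added the PaddlePaddle training and inference pipeline certification (TIPC) workflow
    • Added one-click online running on AI Studio
    • Added an analysis module for papers from top recommendation conferences
    • Added 16 classic datasets with stable download links
    • CPUPS now supports automatic collection of sparse-parameter statistics, incremental saving of sparse parameters, and related features
    • Added support for running on new hardware: NPU and XPU
    • Adapted to multi-threaded data loading with DataLoader
    • Fixed inaccurate metric computation in multi-task, multi-instag scenarios
    • Fixed AUC not being reset during the dynamic-graph infer stage in PaddleRec
    • Fixed some broken documentation links
    • Fixed the parameter initialization of MMoE and PLE

    New and improved models

    • New multi-task models: MMoE, Dselect_K, AITM, ESCM2
    • New ranking models: BERT4Rec, FAT_DeepFFM, DeepRec, AutoFIS, DCN_V2, SIGN, DSIN, IPRec
    • New recall models: ENSFM, TiSAS, MHCN
    • New matching model: KIM
    • New meta-learning model: MetaHeac

    Tutorial updates

    • Added a tutorial on optimizing CPUPS streaming training
    • Added a tutorial on applying multi-task learning to short-video scenarios
    • Added a list of external contributors

    Paper reproduction challenge

    • Thanks to jinweiluo for contributing the BERT4Rec model
    • Thanks to LinJayan for contributing the FAT_DeepFFM, FLEN, and DCN_V2 models
    • Thanks to chenjiyan2001 for contributing the DeepRec model
    • Thanks to renmada for contributing the ENSFM, TiSAS, AutoFIS, KIM, IPRec, and AITM models
    • Thanks to Andy1314Chen for contributing the Dselect_K and MHCN models
    • Thanks to BamLubi for contributing the SIGN model
    • Thanks to simuler for contributing the MetaHeac model
    • Thanks to yoreG123 and chenjiyan2001 for contributing the FGCNN model
    • Thanks to Li-fAngyU for contributing the DSIN model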
  • v2.2.0(Nov 19, 2021)

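    PaddleRec v2.2.0 Release Note

    Highlights

    • Provides an end-to-end recommender-system solution for industrial applications
    • Added 17 classic datasets and 9 models
    • Added GpuPS training support

    New features and improvements

    • Added features covering the full recommendation workflow, including streaming training, feature importance, and online/offline consistency
    • Added Paddle Serving deployment implemented in C++
    • Added inference in four languages: C++, Java, Python, and Go
    • Added GpuPS training support
    • Added 17 classic datasets with stable download links
    • Added support for PGL graph models, with the deepwalk model as an example

    New and improved models

    • New graph model: deepwalk
    • New ranking models: bst, din, dien, dcn, dmr, deepfefm, dlrm
    • New meta-learning model: maml

    Tutorial updates

    • Added a distributed training tutorial for GpuPS
    • Added a tutorial on moving from single-machine to distributed training
    • Added a tutorial on contributing code
    • Added tutorials for the full recommendation workflow, covering streaming training, feature importance, and online/offline consistency checks

    Paper reproduction challenge

    • Thanks to thinkall for contributing the DMR and DeepFEFM models
    • Thanks to hrdws for contributing the MAML model
    • Thanks to Andy1314Chen for contributing the DLRM model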
  • v2.1.0(May 19, 2021)

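    PaddleRec v2.1.0 Release Note

    Highlights

    • Added support for the inference library and serving deployment
    • Added support for the Paddle/Perf Benchmark feature
    • Added single-machine multi-card and multi-machine multi-card training
    • Added the recall model MIND; upgraded 5 models, including PLE and FFM, to the 2.0 API with dynamic-graph support
    • Added 12 classic datasets with stable download links

    New features and improvements

    • Added support for the inference library and serving deployment
    • Added single-machine multi-card and multi-machine multi-card training
    • Added support for the open-source tool Milvus for vector storage and recall serving
    • Added support for the Paddle/Perf Benchmark feature
    • Added 12 classic datasets with stable download links
    • Added visualization support through VisualDL
    • Standardized LOG printing; unified how dynamic and static modes print information and the output format
    • Fixed AUC not being reset for GPU static graph

    New and improved models

    • Added the recall model MIND
    • Upgraded PLE, ShareBtm, FFM, xdeepfm, and NCF to the 2.0 API with dynamic-graph support
    • Fixed LR accuracy in dynamic-graph mode
    • Models that do not yet support dynamic graph remain on the release/1.8.5 branch

    Tutorial updates

    • Added a Milvus tutorial
    • Added and improved documentation on dynamic-to-static conversion, inference deployment, visualization, and more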
  • v2.0.0(Feb 9, 2021)

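    PaddleRec v2.0.0 Release Note

    Highlights

    • This release targets PaddlePaddle v2.0.0; the framework and models are fully upgraded to the 2.0 API
    • Added dynamic-graph support and refactored models into a unified dynamic/static structure, making both research and production deployment easier
    • Refactored the framework: no installation is required, and it can be launched with python from any directory in multiple environments
    • Added classic datasets with stable download links

    New features and improvements

    • No installation required to run; one-command launch from any directory on linux/windows/mac
    • Added dynamic-graph training mode, with single-machine CPU and GPU training and prediction
    • Added dynamic-to-static conversion, which converts parameters saved in dynamic-graph mode into static-graph parameters
    • Added debugging support; both dynamic and static modes can print intermediate results at runtime
    • Added a datasets directory containing preprocessing scripts and download links for classic datasets
    • Added a streaming-training trainer
    • Reorganized the directory structure into models, tools, datasets, and doc
    • Standardized LOG printing; unified how information is printed and the output format
    • Fixed multiple AUC metrics not being reset in multi-task models
    • Fixed data processing commands differing across windows, linux, and macos, which could require manual conversion

    New and improved models

    • Added the ranking models GateDnn and NAML
    • Added dynamic-graph mode for 15 models, including dnn, deepfm, textcnn, and mmoe, with 2.0 API network definitions that are consistent between dynamic and static modes and yaml configurations for reproducing full-data results
    • Improved the network definitions of wide&deep, deepfm, and fm
    • Models that do not yet support dynamic graph remain on the release/1.8.5 branch

    Tutorial updates

    • Added a movie recommendation tutorial on aistudio
    • Updated the complete set of getting-started and advanced tutorials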
  • v1.8.5(Oct 12, 2020)

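    PaddleRec v1.8.5 Release Note

    Highlights

    • This release targets PaddlePaddle v1.8.5
    • Framework upgrade: more flexible reader and model adaptation, and more flexible definitions of training modes and data reading
    • Added 9 models and optimized several existing ones
    • Removed the built-in configuration shortcuts such as paddlerec.models.rank.; models are now configured uniformly through the path to a yaml file
    • Added one-click submission of PaddlePaddle distributed training on Kubernetes and PaddleCloud
    • Added PaddlePaddle distributed training on CPU/GPU: collective mode on GPU, parameter server mode on GPU, and parameter server mode on CPU

    New features and fixes

    • Added GPU multi-card training in collective mode, GPU-PS training in parameter server mode, single-machine multi-card training, and more
    • Added distributed training job submission, with one-click launch on MPI/Kubernetes/PaddleCloud
    • Added computation and distributed computation of several metrics, including AUC, Recall_k (accuracy of the top-k recalled items), PN (positive/negative pair ordering), and Precision_Recall
    • Added BatchReader, which lets users assemble batches themselves in the Reader
    • Added a pre-training Trainer and a streaming-training Trainer to meet users' pre-training and streaming-training needs
    • Added shuffling of the local file list, i.e. file-level shuffling of the data before training
    • Added batch-level model saving
    • Optimized data reading with SlotReader: users only need to generate data as required and configure the data format to train efficiently with PaddlePaddle
    • Fixed LOG printing; standardized log levels and the log output format
    • Fixed an installation error on Windows
    • Fixed a bug where data reading picked up hidden files
    • Fixed a bug where uneven data partitioning across cards in collective mode caused training failures
    • Fixed a bug where learning rate did not support scientific notation

    New models and fixes

    • Added DIEN, BST, AutoInt, FGCNN, Fibinet, FLEN, RALM, Match-pyramid, TDM, and other models
    • Added the pre-trained model TextCNN
    • Added a Readme, data processing, and result reporting for Fibinet, FLEN, youtubednn, gnn, word2vec, and other models, and fixed model quality issues
    • Fixed the Readme of several models under the Rank directory, including DNN, LR, FM, and DeepFM
    • Fixed model configuration and path issues in several Readmes under the Recall directory
    • Added the complete TDM training workflow, including training, tree building, clustering, and online prediction

    Tutorial updates

    • Added tutorials for single-machine training, distributed training, and streaming training, plus English tutorials and a pre-trained model tutorial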
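For readers unfamiliar with the metrics named in this list, the sketch below shows textbook definitions of AUC (as the fraction of correctly ordered positive/negative pairs), the count of correctly versus inversely ordered pairs, and recall at k, in plain Python. These are generic formulations written for illustration under stated assumptions; they are not PaddleRec's implementations, the function names are hypothetical, and the exact PN and Recall_k definitions used by the library may differ (for example by grouping pairs per query).

```python
def pairwise_counts(labels, scores):
    """Count (correct, inverted, tied) positive/negative score pairs."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    correct = sum(1 for p in pos for n in neg if p > n)
    inverted = sum(1 for p in pos for n in neg if p < n)
    tied = len(pos) * len(neg) - correct - inverted
    return correct, inverted, tied


def auc(labels, scores):
    """AUC as the share of correctly ordered pairs; ties count as half."""
    correct, inverted, tied = pairwise_counts(labels, scores)
    total = correct + inverted + tied
    return (correct + 0.5 * tied) / total if total else float("nan")


def recall_at_k(relevant, ranked, k):
    """Fraction of the relevant items that appear in the top k of a ranking."""
    hits = len(set(ranked[:k]) & set(relevant))
    return hits / len(relevant)


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    scores = [0.9, 0.8, 0.3, 0.1]
    print("pairs (correct, inverted, tied):", pairwise_counts(labels, scores))
    print("auc:", auc(labels, scores))
    print("recall@2:", recall_at_k({"a", "b"}, ["a", "c", "b"], 2))
```

The pairwise form is quadratic in the number of samples and is meant for checking small cases by hand, not for large-scale or distributed evaluation.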
  • v0.1.0(Jun 22, 2020)

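    PaddleRec 0.1.0 released

    Features: one-command training launch, four pluggable components, and automatic compatibility across multiple training modes

    • Engine: covers different runtime platforms, including cpu/gpu; single-machine and distributed (single-machine PS, single-machine multi-card; multi-machine ps and multi-machine multi-card on PaddleCloud/MPI/K8S); supports different operating systems (ubuntu/centos/macos/windows)
    • Trainer: supports component-based custom training workflows, including ps-transpiler, ps-pslib, and collective
    • Model: supports building new networks quickly, with dozens of classic models built in
    • Reader: supports efficient and flexible data processing, including slot-feasign data reading and custom readers

    Model zoo:

    • 31+ recommendation algorithms covering the mainstream algorithms of each recommendation stage: content understanding models (2), recall models (7), ranking models (15), multi-objective models (3), re-ranking models (1), tree models (1), matching models (2)

    Documentation:

    • Quick start: get started with PaddleRec in ten minutes (AIStudio tutorial). It trains recall and recommendation models on the MovieLens 1M dataset and simulates the full online recommendation workflow, so that users can quickly learn PaddleRec's data processing, training, and prediction features from a simple example.
    • Getting-started tutorials: data preparation, model tuning, launching training, launching prediction, quick deployment
    • Advanced tutorials: custom data processing, custom models, custom training workflows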

    PaddleRec release 0.1.0

    Major Features and Components:

    • Start training with one-line command
    • Training framework with four extensible modules supported
      • Engine: local training and distributed training supported on CPU/GPU on multiple platforms
      • Trainer: support user-defined training logics
      • Model: easy to develop user-defined models and plugin models
      • Reader: high performance data processing with user-defined processing functions.

    Model zoos:

    • more than 30 plugin deep learning algorithms in recommendation system pipelines, such as content understanding models, recall models, ranking models, multi-task models, reranking models, tree-based models and matching models, etc.

    Documentation:

    • Quick start: a 10-minute hands-on tutorial with the MovieLens 1M dataset that walks through data processing, training, and validation to show how offline training works in a recommender system
    • Basic tutorials: data preprocessing, model hyperparameter tuning, training, prediction, and deployment
    • Advanced tutorials: user-defined data preprocessing, user-defined networks, and training pipeline customization

    Special Thanks to our Contributors

    xiexionghang (for initial commit contribution)
