An Unsupervised Graph-based Toolbox for Fraud Detection

Overview



Building GitHub Downloads Pypi version

An Unsupervised Graph-based Toolbox for Fraud Detection

Introduction: UGFraud is an unsupervised graph-based fraud detection toolbox that integrates several state-of-the-art graph-based fraud detection algorithms. It can be applied to bipartite graphs (e.g., user-product graph), and it can estimate the suspiciousness of both nodes and edges. The implemented models can be found here.

The toolbox incorporates the Markov Random Field (MRF)-based algorithm, dense-block detection-based algorithm, and SVD-based algorithm. For MRF-based algorithms, the users only need the graph structure and the prior suspicious score of the nodes as the input. For other algorithms, the graph structure is the only input.

Meanwhile, we have a deep graph-based fraud detection toolbox which implements state-of-the-art graph neural network-based fraud detectors.

We welcome contributions on adding new fraud detectors and extending the features of the toolbox. Some of the planned features are listed in TODO list.

If you use the toolbox in your project, please cite the paper below and the algorithms you used :

@inproceedings{dou2020robust,
  title={Robust Spammer Detection by Nash Reinforcement Learning},
  author={Dou, Yingtong and Ma, Guixiang and Yu, Philip S and Xie, Sihong},
  booktitle={Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  year={2020}
}

Useful Resources

Table of Contents

Installation

You can install UGFraud from pypi:

pip install UGFraud

or download and install from github:

git clone https://github.com/safe-graph/UGFraud.git
cd UGFraud
python setup.py install

Dataset

The demo data is not the intact data (rating and date information are missing). The rating information is only used in ZooBP demo. If you need the intact date to play demo, please email [email protected] to download the intact data from Yelp Spam Review Dataset. The metadata.gz file in /UGFraud/Yelp_Data/YelpChi includes:

  • user_id: 38063 number of users
  • product_id: 201 number of products
  • rating: from 1.0 (low) to 5.0 (high)
  • label: -1 is not spam, 1 is spam
  • date: data creation time

User Guide

Running the example code

You can find the implemented models in /UGFraud/Demo directory. For example, you can run fBox using:

python eval_fBox.py 

Running on your datasets

Have a look at the /UGFraud/Demo/data_to_network_graph.py to convert your data into the networkx graph.

In order to use your own data, you have to provide the following information at least:

  • a dict of dict:
'user_id':{
        'product_id':
                {
                'label': 1
                }
  • a dict of prior

You can use dict_to networkx(graph_dict) function from /Utils/helper.py file to convert your graph_dict into a networkx graph. For more details, please see data_to_network_graph.py.

The structure of code

The /UGFraud repository is organized as follows:

  • Demo/ contains the implemented models and the corresponding example code;
  • Detector/ contains the basic models;
  • Yelp_Data/ contains the necessary dataset files;
  • Utils/ contains the every help functions.

Implemented Models

Model Paper Venue Reference
SpEagle Collective Opinion Spam Detection: Bridging Review Networks and Metadata KDD 2015 BibTex
GANG GANG: Detecting Fraudulent Users in Online Social Networks via Guilt-by-Association on Directed Graph ICDM 2017 BibTex
fBox Spotting Suspicious Link Behavior with fBox: An Adversarial Perspective ICDM 2014 BibTex
Fraudar FRAUDAR: Bounding Graph Fraud in the Face of Camouflage KDD 2016 BibTex
ZooBP ZooBP: Belief Propagation for Heterogeneous Networks VLDB 2017 BibTex
SVD Singular value decomposition and least squares solutions - BibTex
Prior Evaluating suspicioueness based on prior information - -

Model Comparison

Model Application Graph Type Model Type
SpEagle Review Spam Tripartite MRF
GANG Social Sybil Bipartite MRF
fBox Social Fraudster Bipartite SVD
Fraudar Social Fraudster Bipartite Dense-block
ZooBP E-commerce Fraud Tripartite MRF
SVD Dimension Reduction Bipartite SVD

TODO List

  • Homogeneous graph implementation

How to Contribute

You are welcomed to contribute to this open-source toolbox. Currently, you can create issues or send email to [email protected] for inquiry.

You might also like...
OBBDetection: an oriented object detection toolbox modified from MMdetection
OBBDetection: an oriented object detection toolbox modified from MMdetection

OBBDetection note: If you have questions or good suggestions, feel free to propose issues and contact me. introduction OBBDetection is an oriented obj

A Python Library for Graph Outlier Detection (Anomaly Detection)
A Python Library for Graph Outlier Detection (Anomaly Detection)

PyGOD is a Python library for graph outlier detection (anomaly detection). This exciting yet challenging field has many key applications, e.g., detect

This is an open-source toolkit for Heterogeneous Graph Neural Network(OpenHGNN) based on DGL [Deep Graph Library] and PyTorch.

This is an open-source toolkit for Heterogeneous Graph Neural Network(OpenHGNN) based on DGL [Deep Graph Library] and PyTorch.

This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning].

CG3 This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning]. R

A semantic segmentation toolbox based on PyTorch

Introduction vedaseg is an open source semantic segmentation toolbox based on PyTorch. Features Modular Design We decompose the semantic segmentation

mbrl-lib is a toolbox for facilitating development of Model-Based Reinforcement Learning algorithms.
mbrl-lib is a toolbox for facilitating development of Model-Based Reinforcement Learning algorithms.

mbrl-lib is a toolbox for facilitating development of Model-Based Reinforcement Learning algorithms. It provides easily interchangeable modeling and planning components, and a set of utility functions that allow writing model-based RL algorithms with only a few lines of code.

Deep learning toolbox based on PyTorch for hyperspectral data classification.
Deep learning toolbox based on PyTorch for hyperspectral data classification.

Deep learning toolbox based on PyTorch for hyperspectral data classification.

Paddle-Adversarial-Toolbox (PAT) is a Python library for Deep Learning Security based on PaddlePaddle.

Paddle-Adversarial-Toolbox Paddle-Adversarial-Toolbox (PAT) is a Python library for Deep Learning Security based on PaddlePaddle. Model Zoo Common FGS

MMFlow is an open source optical flow toolbox based on PyTorch
MMFlow is an open source optical flow toolbox based on PyTorch

Documentation: https://mmflow.readthedocs.io/ Introduction English | 简体中文 MMFlow is an open source optical flow toolbox based on PyTorch. It is a part

Comments
  •  cannot import name 'Detector' most likely due to a circular import

    cannot import name 'Detector' most likely due to a circular import

    Performing a simple import as outlined in testing.py

    import sys
    import os
    __file__ = "~/env/lib/python3.8/site-packages/UGFraud"
    sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
    from UGFraud.Demo.eval_fBox import *
    

    However, this produces the below error:

    ---------------------------------------------------------------------------
    ImportError                               Traceback (most recent call last)
    ~/env/lib/python3.8/site-packages/UGFraud in <module>
          3 __file__ = "~/env/lib/python3.8/site-packages/UGFraud"
          4 sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
    ----> 5 from UGFraud.Demo.eval_fBox import *
    
    ~/miniconda3/lib/python3.8/site-packages/UGFraud/__init__.py in <module>
          1 # -*- coding: utf-8 -*-
          2 
    ----> 3 from . import Detector
          4 from . import Utils
          5 
    
    ImportError: cannot import name 'Detector' from partially initialized module 'UGFraud' (most likely due to a circular import) (~/miniconda3/lib/python3.8/site-packages/UGFraud/__init__.py)
    
    opened by ragyibrahim 1
Releases(v0.1.0)
Owner
SafeGraph
Towards Secure Machine Learning on Graph Data
SafeGraph
Vis2Mesh: Efficient Mesh Reconstruction from Unstructured Point Clouds of Large Scenes with Learned Virtual View Visibility ICCV2021

Vis2Mesh This is the offical repository of the paper: Vis2Mesh: Efficient Mesh Reconstruction from Unstructured Point Clouds of Large Scenes with Lear

71 Dec 25, 2022
Transport Mode detection - can detect the mode of transport with the help of features such as acceeration,jerk etc

title emoji colorFrom colorTo sdk app_file pinned Transport_Mode_Detector 🚀 purple yellow gradio app.py false Configuration title: string Display tit

Nishant Rajadhyaksha 3 Jan 16, 2022
3D-CariGAN: An End-to-End Solution to 3D Caricature Generation from Normal Face Photos

3D-CariGAN: An End-to-End Solution to 3D Caricature Generation from Normal Face Photos This repository contains the source code and dataset for the pa

54 Oct 09, 2022
This program creates a formatted excel file which highlights the undervalued stock according to Graham's number.

Over-and-Undervalued-Stocks Of Nepse Using Graham's Number Scrap the latest data using different websites and creates a formatted excel file that high

6 May 03, 2022
Customer-Transaction-Analysis - This analysis is based on a synthesised transaction dataset containing 3 months worth of transactions for 100 hypothetical customers.

Customer-Transaction-Analysis - This analysis is based on a synthesised transaction dataset containing 3 months worth of transactions for 100 hypothetical customers. It contains purchases, recurring

Ayodeji Yekeen 1 Jan 01, 2022
Paper: De-rendering Stylized Texts

Paper: De-rendering Stylized Texts Wataru Shimoda1, Daichi Haraguchi2, Seiichi Uchida2, Kota Yamaguchi1 1CyberAgent.Inc, 2 Kyushu University Accepted

CyberAgent AI Lab 55 Dec 18, 2022
Cancer metastasis detection with neural conditional random field (NCRF)

NCRF Prerequisites Data Whole slide images Annotations Patch images Model Training Testing Tissue mask Probability map Tumor localization FROC evaluat

Baidu Research 731 Jan 01, 2023
Process JSON files for neural recording sessions using Medtronic's BrainSense Percept PC neurostimulator

percept_processing This code processes JSON files for streamed neural data using Medtronic's Percept PC neurostimulator with BrainSense Technology for

Maria Olaru 3 Jun 06, 2022
A parallel framework for population-based multi-agent reinforcement learning.

MALib: A parallel framework for population-based multi-agent reinforcement learning MALib is a parallel framework of population-based learning nested

MARL @ SJTU 348 Jan 08, 2023
Generative Adversarial Networks(GANs)

Generative Adversarial Networks(GANs) Vanilla GAN ClusterGAN Vanilla GAN Model Structure Final Generator Structure A MLP with 2 hidden layers of hidde

Zhenbang Feng 2 Nov 05, 2021
Implementation of the ICCV'21 paper Temporally-Coherent Surface Reconstruction via Metric-Consistent Atlases

Temporally-Coherent Surface Reconstruction via Metric-Consistent Atlases [Papers 1, 2][Project page] [Video] The implementation of the papers Temporal

56 Nov 21, 2022
Official code repository for the EMNLP 2021 paper

Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization PyTorch code for the EMNLP 2021 paper "Integrating Visuospatia

Adyasha Maharana 23 Dec 19, 2022
A Haskell kernel for IPython.

IHaskell You can now try IHaskell directly in your browser at CoCalc or mybinder.org. Alternatively, watch a talk and demo showing off IHaskell featur

Andrew Gibiansky 2.4k Dec 29, 2022
Datasets, Transforms and Models specific to Computer Vision

torchvision The torchvision package consists of popular datasets, model architectures, and common image transformations for computer vision. Installat

13.1k Jan 02, 2023
This is a five-step framework for the development of intrusion detection systems (IDS) using machine learning (ML) considering model realization, and performance evaluation.

AB-TRAP: building invisibility shields to protect network devices The AB-TRAP framework is applicable to the development of Network Intrusion Detectio

Lab-C2DC - Laboratory of Command and Control and Cyber-security 17 Jan 04, 2023
Irrigation controller for Home Assistant

Irrigation Unlimited This integration is for irrigation systems large and small. It can offer some complex arrangements without large and messy script

Robert Cook 176 Jan 02, 2023
Probabilistic Entity Representation Model for Reasoning over Knowledge Graphs

Implementation for the paper: Probabilistic Entity Representation Model for Reasoning over Knowledge Graphs, Nurendra Choudhary, Nikhil Rao, Sumeet Ka

Nurendra Choudhary 8 Nov 15, 2022
Code for Referring Image Segmentation via Cross-Modal Progressive Comprehension, CVPR2020.

CMPC-Refseg Code of our CVPR 2020 paper Referring Image Segmentation via Cross-Modal Progressive Comprehension. Shaofei Huang*, Tianrui Hui*, Si Liu,

spyflying 55 Dec 01, 2022
Official PyTorch implementation for "Low Precision Decentralized Distributed Training with Heterogenous Data"

Low Precision Decentralized Training with Heterogenous Data Official PyTorch implementation for "Low Precision Decentralized Distributed Training with

Aparna Aketi 0 Nov 23, 2021
[IROS2021] NYU-VPR: Long-Term Visual Place Recognition Benchmark with View Direction and Data Anonymization Influences

NYU-VPR This repository provides the experiment code for the paper Long-Term Visual Place Recognition Benchmark with View Direction and Data Anonymiza

Automation and Intelligence for Civil Engineering (AI4CE) Lab @ NYU 22 Sep 28, 2022