An Unsupervised Graph-based Toolbox for Fraud Detection

Overview



Building GitHub Downloads Pypi version

An Unsupervised Graph-based Toolbox for Fraud Detection

Introduction: UGFraud is an unsupervised graph-based fraud detection toolbox that integrates several state-of-the-art graph-based fraud detection algorithms. It can be applied to bipartite graphs (e.g., user-product graph), and it can estimate the suspiciousness of both nodes and edges. The implemented models can be found here.

The toolbox incorporates the Markov Random Field (MRF)-based algorithm, dense-block detection-based algorithm, and SVD-based algorithm. For MRF-based algorithms, the users only need the graph structure and the prior suspicious score of the nodes as the input. For other algorithms, the graph structure is the only input.

Meanwhile, we have a deep graph-based fraud detection toolbox which implements state-of-the-art graph neural network-based fraud detectors.

We welcome contributions on adding new fraud detectors and extending the features of the toolbox. Some of the planned features are listed in TODO list.

If you use the toolbox in your project, please cite the paper below and the algorithms you used :

@inproceedings{dou2020robust,
  title={Robust Spammer Detection by Nash Reinforcement Learning},
  author={Dou, Yingtong and Ma, Guixiang and Yu, Philip S and Xie, Sihong},
  booktitle={Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  year={2020}
}

Useful Resources

Table of Contents

Installation

You can install UGFraud from pypi:

pip install UGFraud

or download and install from github:

git clone https://github.com/safe-graph/UGFraud.git
cd UGFraud
python setup.py install

Dataset

The demo data is not the intact data (rating and date information are missing). The rating information is only used in ZooBP demo. If you need the intact date to play demo, please email [email protected] to download the intact data from Yelp Spam Review Dataset. The metadata.gz file in /UGFraud/Yelp_Data/YelpChi includes:

  • user_id: 38063 number of users
  • product_id: 201 number of products
  • rating: from 1.0 (low) to 5.0 (high)
  • label: -1 is not spam, 1 is spam
  • date: data creation time

User Guide

Running the example code

You can find the implemented models in /UGFraud/Demo directory. For example, you can run fBox using:

python eval_fBox.py 

Running on your datasets

Have a look at the /UGFraud/Demo/data_to_network_graph.py to convert your data into the networkx graph.

In order to use your own data, you have to provide the following information at least:

  • a dict of dict:
'user_id':{
        'product_id':
                {
                'label': 1
                }
  • a dict of prior

You can use dict_to networkx(graph_dict) function from /Utils/helper.py file to convert your graph_dict into a networkx graph. For more details, please see data_to_network_graph.py.

The structure of code

The /UGFraud repository is organized as follows:

  • Demo/ contains the implemented models and the corresponding example code;
  • Detector/ contains the basic models;
  • Yelp_Data/ contains the necessary dataset files;
  • Utils/ contains the every help functions.

Implemented Models

Model Paper Venue Reference
SpEagle Collective Opinion Spam Detection: Bridging Review Networks and Metadata KDD 2015 BibTex
GANG GANG: Detecting Fraudulent Users in Online Social Networks via Guilt-by-Association on Directed Graph ICDM 2017 BibTex
fBox Spotting Suspicious Link Behavior with fBox: An Adversarial Perspective ICDM 2014 BibTex
Fraudar FRAUDAR: Bounding Graph Fraud in the Face of Camouflage KDD 2016 BibTex
ZooBP ZooBP: Belief Propagation for Heterogeneous Networks VLDB 2017 BibTex
SVD Singular value decomposition and least squares solutions - BibTex
Prior Evaluating suspicioueness based on prior information - -

Model Comparison

Model Application Graph Type Model Type
SpEagle Review Spam Tripartite MRF
GANG Social Sybil Bipartite MRF
fBox Social Fraudster Bipartite SVD
Fraudar Social Fraudster Bipartite Dense-block
ZooBP E-commerce Fraud Tripartite MRF
SVD Dimension Reduction Bipartite SVD

TODO List

  • Homogeneous graph implementation

How to Contribute

You are welcomed to contribute to this open-source toolbox. Currently, you can create issues or send email to [email protected] for inquiry.

You might also like...
OBBDetection: an oriented object detection toolbox modified from MMdetection
OBBDetection: an oriented object detection toolbox modified from MMdetection

OBBDetection note: If you have questions or good suggestions, feel free to propose issues and contact me. introduction OBBDetection is an oriented obj

A Python Library for Graph Outlier Detection (Anomaly Detection)
A Python Library for Graph Outlier Detection (Anomaly Detection)

PyGOD is a Python library for graph outlier detection (anomaly detection). This exciting yet challenging field has many key applications, e.g., detect

This is an open-source toolkit for Heterogeneous Graph Neural Network(OpenHGNN) based on DGL [Deep Graph Library] and PyTorch.

This is an open-source toolkit for Heterogeneous Graph Neural Network(OpenHGNN) based on DGL [Deep Graph Library] and PyTorch.

This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning].

CG3 This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning]. R

A semantic segmentation toolbox based on PyTorch

Introduction vedaseg is an open source semantic segmentation toolbox based on PyTorch. Features Modular Design We decompose the semantic segmentation

mbrl-lib is a toolbox for facilitating development of Model-Based Reinforcement Learning algorithms.
mbrl-lib is a toolbox for facilitating development of Model-Based Reinforcement Learning algorithms.

mbrl-lib is a toolbox for facilitating development of Model-Based Reinforcement Learning algorithms. It provides easily interchangeable modeling and planning components, and a set of utility functions that allow writing model-based RL algorithms with only a few lines of code.

Deep learning toolbox based on PyTorch for hyperspectral data classification.
Deep learning toolbox based on PyTorch for hyperspectral data classification.

Deep learning toolbox based on PyTorch for hyperspectral data classification.

Paddle-Adversarial-Toolbox (PAT) is a Python library for Deep Learning Security based on PaddlePaddle.

Paddle-Adversarial-Toolbox Paddle-Adversarial-Toolbox (PAT) is a Python library for Deep Learning Security based on PaddlePaddle. Model Zoo Common FGS

MMFlow is an open source optical flow toolbox based on PyTorch
MMFlow is an open source optical flow toolbox based on PyTorch

Documentation: https://mmflow.readthedocs.io/ Introduction English | 简体中文 MMFlow is an open source optical flow toolbox based on PyTorch. It is a part

Comments
  •  cannot import name 'Detector' most likely due to a circular import

    cannot import name 'Detector' most likely due to a circular import

    Performing a simple import as outlined in testing.py

    import sys
    import os
    __file__ = "~/env/lib/python3.8/site-packages/UGFraud"
    sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
    from UGFraud.Demo.eval_fBox import *
    

    However, this produces the below error:

    ---------------------------------------------------------------------------
    ImportError                               Traceback (most recent call last)
    ~/env/lib/python3.8/site-packages/UGFraud in <module>
          3 __file__ = "~/env/lib/python3.8/site-packages/UGFraud"
          4 sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
    ----> 5 from UGFraud.Demo.eval_fBox import *
    
    ~/miniconda3/lib/python3.8/site-packages/UGFraud/__init__.py in <module>
          1 # -*- coding: utf-8 -*-
          2 
    ----> 3 from . import Detector
          4 from . import Utils
          5 
    
    ImportError: cannot import name 'Detector' from partially initialized module 'UGFraud' (most likely due to a circular import) (~/miniconda3/lib/python3.8/site-packages/UGFraud/__init__.py)
    
    opened by ragyibrahim 1
Releases(v0.1.0)
Owner
SafeGraph
Towards Secure Machine Learning on Graph Data
SafeGraph
Codes for paper "KNAS: Green Neural Architecture Search"

KNAS Codes for paper "KNAS: Green Neural Architecture Search" KNAS is a green (energy-efficient) Neural Architecture Search (NAS) approach. It contain

90 Dec 22, 2022
Expand human face editing via Global Direction of StyleCLIP, especially to maintain similarity during editing.

Oh-My-Face This project is based on StyleCLIP, RIFE, and encoder4editing, which aims to expand human face editing via Global Direction of StyleCLIP, e

AiLin Huang 51 Nov 17, 2022
Official Repository of NeurIPS2021 paper: PTR

PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning Figure 1. Dataset Overview. Introduction A critical aspect of human vis

Yining Hong 32 Jun 02, 2022
City Surfaces: City-scale Semantic Segmentation of Sidewalk Surfaces

City Surfaces: City-scale Semantic Segmentation of Sidewalk Surfaces Paper Temporary GitHub page for City Surfaces paper. More soon! While designing s

14 Nov 10, 2022
Rax is a Learning-to-Rank library written in JAX

🦖 Rax: Composable Learning to Rank using JAX Rax is a Learning-to-Rank library written in JAX. Rax provides off-the-shelf implementations of ranking

Google 247 Dec 27, 2022
This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).

MoEBERT This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022). Installation Create an

Simiao Zuo 34 Dec 24, 2022
Offline Reinforcement Learning with Implicit Q-Learning

Offline Reinforcement Learning with Implicit Q-Learning This repository contains the official implementation of Offline Reinforcement Learning with Im

Ilya Kostrikov 126 Jan 06, 2023
JstDoS - HTTP Protocol Stack Remote Code Execution Vulnerability

jstDoS If you are going to skid that, please give credits ! ^^ ¿How works? This

apolo 4 Feb 11, 2022
Keras documentation, hosted live at keras.io

Keras.io documentation generator This repository hosts the code used to generate the keras.io website. Generating a local copy of the website pip inst

Keras 2k Jan 08, 2023
A toolkit for developing and comparing reinforcement learning algorithms.

Status: Maintenance (expect bug fixes and minor updates) OpenAI Gym OpenAI Gym is a toolkit for developing and comparing reinforcement learning algori

OpenAI 29.6k Jan 08, 2023
Read and write layered TIFF ImageSourceData and ImageResources tags

Read and write layered TIFF ImageSourceData and ImageResources tags Psdtags is a Python library to read and write the Adobe Photoshop(r) specific Imag

Christoph Gohlke 4 Feb 05, 2022
A framework to train language models to learn invariant representations.

Invariant Language Modeling Implementation of the training for invariant language models. Motivation Modern pretrained language models are critical co

6 Nov 16, 2022
Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

An Image is Worth 16x16 Words, What is a Video Worth? paper Official PyTorch Implementation Gilad Sharir, Asaf Noy, Lihi Zelnik-Manor DAMO Academy, Al

213 Nov 12, 2022
Code Repository for Liquid Time-Constant Networks (LTCs)

Liquid time-constant Networks (LTCs) [Update] A Pytorch version is added in our sister repository: https://github.com/mlech26l/keras-ncp This is the o

Ramin Hasani 553 Dec 27, 2022
Racing line optimization algorithm in python that uses Particle Swarm Optimization.

Racing Line Optimization with PSO This repository contains a racing line optimization algorithm in python that uses Particle Swarm Optimization. Requi

Parsa Dahesh 6 Dec 14, 2022
Self-Guided Contrastive Learning for BERT Sentence Representations

Self-Guided Contrastive Learning for BERT Sentence Representations This repository is dedicated for releasing the implementation of the models utilize

Taeuk Kim 16 Dec 04, 2022
Apply a perspective transformation to a raster image inside Inkscape (no need to use an external software such as GIMP or Krita).

Raster Perspective Apply a perspective transformation to bitmap image using the selected path as envelope, without the need to use an external softwar

s.ouchene 19 Dec 22, 2022
Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention

cosFormer Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention Update log 2022/2/28 Add core code License This

120 Dec 15, 2022
Human4D Dataset tools for processing and visualization

HUMAN4D: A Human-Centric Multimodal Dataset for Motions & Immersive Media HUMAN4D constitutes a large and multimodal 4D dataset that contains a variet

tofis 15 Nov 09, 2022
Augmented Traffic Control: A tool to simulate network conditions

Augmented Traffic Control Full documentation for the project is available at http://facebook.github.io/augmented-traffic-control/. Overview Augmented

Meta Archive 4.3k Jan 08, 2023