Credit Fraud Detection

Context: It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase.

Dataset location: The dataset (creditcard.csv) is provided by Kaggle and can be found at https://www.kaggle.com/mlg-ulb/creditcardfraud

The dataset contains transactions made by credit cards in September 2013 by European cardholders. It contains only numerical input variables, which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, we cannot provide the original features and more background information about the data. Features V1, V2, … V28 are the principal components obtained with PCA; the only features which have not been transformed with PCA are 'Time' and 'Amount'. Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. Feature 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. Feature 'Class' is the response variable and takes the value 1 in case of fraud and 0 otherwise.

This dataset is already preprocessed. I began by splitting the dataset into train and test sets with a 0.75:0.25 split. A brief analysis showed that 99.8% of the transactions are labeled as not fraud and only 0.2% are labeled as fraud. With so few positives relative to negatives, a model would spend most of its training time on negative examples and not learn enough from positive ones, so I bootstrapped the training data by upsampling it until the classes were balanced.

I then applied a Random Forest with 20 trees and determined which features were most important for the model. I followed with Logistic Regression and finally with Gaussian Naive Bayes. I tested all three models for accuracy, precision, recall and F1 score.
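The upsampling step described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: it takes "upsampling" to mean drawing extra minority-class rows with replacement until both classes have the same count, the helper name `upsample_minority` is hypothetical, and the toy rows are made up (they are not taken from creditcard.csv). The notebook itself may rely on a library helper instead.

```python
import random
from collections import Counter


def upsample_minority(rows, labels, seed=0):
    """Add minority-class rows, sampled with replacement, until classes balance.

    Hypothetical helper for illustration; not the notebook's own code.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    majority_size = max(counts.values())

    # Keep every original row, then append the resampled extras.
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        if n == majority_size:
            continue
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = rng.choices(pool, k=majority_size - n)  # with replacement
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy training split: 998 legitimate rows and 2 fraud rows (made-up values).
train_rows = [[float(i)] for i in range(1000)]
train_labels = [0] * 998 + [1] * 2

balanced_rows, balanced_labels = upsample_minority(train_rows, train_labels)
balanced_counts = Counter(balanced_labels)
print(balanced_counts)
```

Only the training split is resampled here, matching the description above, so the test split keeps its original class ratio.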
The Random Forest model has better accuracy and precision than the Logistic Regression and Gaussian Naive Bayes models, while Logistic Regression has the best recall. Random Forest nevertheless has the best F1 score, which is the harmonic mean of precision and recall.
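To make the relationship between these metrics concrete, here is a minimal sketch that computes them from confusion-matrix counts. The counts are invented for illustration and are not the notebook's results; `classification_metrics` is a hypothetical helper name. It shows how a model with the higher recall can still end up with the lower F1 score when its precision is poor.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up counts on 10,000 test transactions, 100 of them fraudulent.
precise_model = classification_metrics(tp=72, fp=8, fn=28, tn=9892)
high_recall_model = classification_metrics(tp=90, fp=810, fn=10, tn=9090)

for name, metrics in [("precise", precise_model), ("high recall", high_recall_model)]:
    summary = ", ".join(f"{k}={v:.3f}" for k, v in metrics.items())
    print(f"{name}: {summary}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, the second hypothetical model's low precision outweighs its recall advantage.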
Credit fraud detection in Python using a Jupyter Notebook
Overview
Deal or No Deal? End-to-End Learning for Negotiation Dialogues
Introduction This is a PyTorch implementation of the following research papers: (1) Hierarchical Text Generation and Planning for Strategic Dialogue (
Norm-based Analysis of Transformer
Norm-based Analysis of Transformer Implementations for 2 papers introducing to analyze Transformers using vector norms: Kobayashi+'20 Attention is Not
NER for Indian languages
CL-NERIL: A Cross-Lingual Model for NER in Indian Languages Code for the paper - https://arxiv.org/abs/2111.11815 Setup Setup a virtual environment Th
Keras Realtime Multi-Person Pose Estimation - Keras version of Realtime Multi-Person Pose Estimation project
This repository has become incompatible with the latest and recommended version of Tensorflow 2.0 Instead of refactoring this code painfully, I create
A Japanese Medical Information Extraction Toolkit
JaMIE: a Japanese Medical Information Extraction toolkit Joint Japanese Medical Problem, Modality and Relation Recognition The Train/Test phrases requ
Experiments with Fourier layers on simulation data.
Factorized Fourier Neural Operators This repository contains the code to reproduce the results in our NeurIPS 2021 ML4PS workshop paper, Factorized Fo
Code to generate datasets used in "How Useful is Self-Supervised Pretraining for Visual Tasks?"
Synthetic dataset rendering Framework for producing the synthetic datasets used in: How Useful is Self-Supervised Pretraining for Visual Tasks? Alejan
Code for the paper: Sketch Your Own GAN
Sketch Your Own GAN Project | Paper | Youtube | Slides Our method takes in one or a few hand-drawn sketches and customizes an off-the-shelf GAN to mat
Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN). With use of Metric Learning and FAISS for fast similarity search on GPU
Cross-modal Retrieval using Transformer Encoder Reasoning Networks This project reimplements the idea from "Transformer Reasoning Network for Image-Te
Towards the D-Optimal Online Experiment Design for Recommender Selection (KDD 2021)
ICCV2021: Code for 'Spatial Uncertainty-Aware Semi-Supervised Crowd Counting'
Neural Ensemble Search for Performant and Calibrated Predictions
Neural Ensemble Search Introduction This repo contains the code accompanying the paper: Neural Ensemble Search for Performant and Calibrated Predictio
Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].
PLBART Code pre-release of our work, Unified Pre-training for Program Understanding and Generation accepted at NAACL 2021. Note. A detailed documentat
TensorFlow implementation of "A Simple Baseline for Bayesian Uncertainty in Deep Learning"
Reverse engineer your pytorch vision models, in style
Rover Reverse engineer your CNNs, in style Rover will help you break down your CNN and visualize the features from within the model. No need to wri
ONNX-PackNet-SfM: Python scripts for performing monocular depth estimation using the PackNet-SfM model in ONNX
Python scripts for performing monocular depth estimation using the PackNet-SfM model in ONNX
Robust Consistent Video Depth Estimation
[CVPR 2021] Robust Consistent Video Depth Estimation This repository contains Python and C++ implementation of Robust Consistent Video Depth, as descr
Provides a unified DataScience folder structure and eases the burden of virtual environment work
Lucas coded by linux shell Table of contents: Mac version of CookieCutter (autoenv) 1. How to install autoenv 2. Implementing activate on entering a folder 3. Implementing deactivate on leaving a folder 4. Setting up aliases 5
Implementation of "Efficient Regional Memory Network for Video Object Segmentation" (Xie et al., CVPR 2021).
RMNet This repository contains the source code for the paper Efficient Regional Memory Network for Video Object Segmentation. Cite this work @inprocee
Simulation of the solar system using various numerical methods
solar-system Simulation of the solar system using various numerical methods Download the repo Make sure matplotlib, scipy etc. are installed execute