Garbage classification using structure data.

Overview

垃圾分类模型使用说明

1.包含以下数据文件

文件 描述
data/MaterialMapping.csv 物体以及其归类的信息
data/TestRecords 光谱原始测试数据 CSV 文件
data/TestRecordDesc.zip CSV 文件描述文件
data/Boundaries.csv 物体轮廓信息

2.包含以下模型文件

文件夹 描述
output/Category/ 包含预测大类别的分类模型
output/Material/ 包含预测大类别(4类)的分类模型
output/Backgroud/ 包含预测小类别(50类)的分类模型

3.环境配置

  进入garbage路径,在anaconda命令行运行pip install -r requirements.txt

4.数据预处理

  在anaconda命令行运行python data_preprocess.py,即可在data文件夹中生成AllEmbracingDataset.csv。若将来更新数据,按照和原来相同的格式和路径保存在data文件夹中,即可用data_preprocess.py生成更新后的数据集

  • 运行数据预处理Python脚本,将上述数据的信息集合到一个数据文件中
python code/data_preprocess.py -data_dir D:/datasets/garbage \
                        -test \
                        -groupbyObjID

运行脚本生成的数据文件 datasets/AllEmbracingDataset.csv 数据集

5.模型训练Python脚本

python code/train_gbdt_lr.py -data_dir D:/datasets/garbage/ \
                    -use_groupbyID True \
                    -output_dir output/ \
                    -skip_data_preprocess

其他 Python脚本说明:

  • feature_engineering.py 特征工程代码
  • ref.py 数据处理和模型推理所需的配置文件
  • utils.py 数据处理所需的一些函数
  • gbdt_feature.py 用gbdt模型生成特征

6.模型推理Python脚本

python code/predict_gbdt_lr.py -data_dir D:/datasets/garbage/ \
                    -use_groupbyID True \
                    -output_dir output/ \
                    -skip_data_preprocess \
                    -save_dir output/ 

  注1:只要同一个ObjID的多条数据的预测结果有一个不是背景零,最终预测结果就不是背景零。

  注2:预测出的Material只会是在训练数据中出现过的唯一标记号。这次数据中不同的唯一标记号共有148个,具体可参见output/log/log.txt中的LabelEncoder.classes

  • 预测结果文件(predictions.csv)说明:对每个物体(即每个ObjID,通常对应多条测试记录)给出多个预测结果汇总后的预测结果。
# 域名 意义
1 ObjID 被测物体唯一标记。同一物体会对应多条测试记录
2 Category 物体分类,从训练数据中获取
3 Material 物体对应的唯一标识号,从训练数据中获取
4 pred_Category 模型所预测出的物体分类
5 pred_Material 模型所预测出的物体唯一标识号
6 pred_background 模型预测的背景和物体 (背景标记为 0,物体标记为 1)
7 pred_Category_final 模型所预测出的物体分类
8 pred_Material_final 模型所预测出的物体材料分类

7. 模型精度

  对于Category、Material和Background三种场景的预测,我们均使用GBDT+LR模型。尝试过SVM、XGBoost、LightGBM和GBDT+LR模型,对比之下,GBDT+LR模型表现最好。   在测试集上的Accuracy如下:

场景 Accuracy
Category 0.7583130575831306
Material 0.6042173560421735
Background 0.996044825313118
Owner
wenqi
Learning is all you need!
wenqi
Privacy-Preserving Machine Learning (PPML) Tutorial Presented at PyConDE 2022

PPML: Machine Learning on Data you cannot see Repository for the tutorial on Privacy-Preserving Machine Learning (PPML) presented at PyConDE 2022 Abst

Valerio Maggio 10 Aug 16, 2022
A PyTorch Implementation of PGL-SUM from "Combining Global and Local Attention with Positional Encoding for Video Summarization", Proc. IEEE ISM 2021

PGL-SUM: Combining Global and Local Attention with Positional Encoding for Video Summarization PyTorch Implementation of PGL-SUM From "PGL-SUM: Combin

Evlampios Apostolidis 35 Dec 22, 2022
Weight estimation in CT by multi atlas techniques

maweight A Python package for multi-atlas based weight estimation for CT images, including segmentation by registration, feature extraction and model

György Kovács 0 Dec 24, 2021
Quick program made to generate alpha and delta tables for Hidden Markov Models

HMM_Calc Functions for generating Alpha and Delta tables from a Hidden Markov Model. Parameters: a: Matrix of transition probabilities. a[i][j] = a_{i

Adem Odza 1 Dec 04, 2021
A large-scale video dataset for the training and evaluation of 3D human pose estimation models

ASPset-510 ASPset-510 (Australian Sports Pose Dataset) is a large-scale video dataset for the training and evaluation of 3D human pose estimation mode

Aiden Nibali 36 Oct 30, 2022
[ICCV 2021 Oral] PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers

PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers Created by Xumin Yu*, Yongming Rao*, Ziyi Wang, Zuyan Liu, Jiwen Lu, Jie Zhou

Xumin Yu 317 Dec 26, 2022
PlaidML is a framework for making deep learning work everywhere.

A platform for making deep learning work everywhere. Documentation | Installation Instructions | Building PlaidML | Contributing | Troubleshooting | R

PlaidML 4.5k Jan 02, 2023
Semantic code search implementation using Tensorflow framework and the source code data from the CodeSearchNet project

Semantic Code Search Semantic code search implementation using Tensorflow framework and the source code data from the CodeSearchNet project. The model

Chen Wu 24 Nov 29, 2022
Annotated, understandable, and visually interpretable PyTorch implementations of: VAE, BIRVAE, NSGAN, MMGAN, WGAN, WGANGP, LSGAN, DRAGAN, BEGAN, RaGAN, InfoGAN, fGAN, FisherGAN

Overview PyTorch 0.4.1 | Python 3.6.5 Annotated implementations with comparative introductions for minimax, non-saturating, wasserstein, wasserstein g

Shayne O'Brien 471 Dec 16, 2022
AtlasNet: A Papier-Mâché Approach to Learning 3D Surface Generation

AtlasNet [Project Page] [Paper] [Talk] AtlasNet: A Papier-Mâché Approach to Learning 3D Surface Generation Thibault Groueix, Matthew Fisher, Vladimir

577 Dec 17, 2022
An implementation of the [Hierarchical (Sig-Wasserstein) GAN] algorithm for large dimensional Time Series Generation

Hierarchical GAN for large dimensional financial market data Implementation This repository is an implementation of the [Hierarchical (Sig-Wasserstein

11 Nov 29, 2022
Code for the paper "Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks"

ON-LSTM This repository contains the code used for word-level language model and unsupervised parsing experiments in Ordered Neurons: Integrating Tree

Yikang Shen 572 Nov 21, 2022
Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Introduction This is a PyTorch implementation of the following research papers: (1) Hierarchical Text Generation and Planning for Strategic Dialogue (

Facebook Research 1.4k Dec 29, 2022
A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval

CLIP4CMR A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval The original data and pre-calculate

24 Dec 26, 2022
DeOldify - A Deep Learning based project for colorizing and restoring old images (and video!)

DeOldify - A Deep Learning based project for colorizing and restoring old images (and video!)

Jason Antic 15.8k Jan 04, 2023
Implementation of paper "Towards a Unified View of Parameter-Efficient Transfer Learning"

A Unified Framework for Parameter-Efficient Transfer Learning This is the official implementation of the paper: Towards a Unified View of Parameter-Ef

Junxian He 216 Dec 29, 2022
Software associated to AAAI paper "Planning with Biological Neurons and Synapses"

jBrain Software associated with the AAAI 2022 paper Francesco D'Amore, Daniel Mitropolsky, Pierluigi Crescenzi, Emanuele Natale, Christos H. Papadimit

Pierluigi Crescenzi 1 Apr 10, 2022
Relative Uncertainty Learning for Facial Expression Recognition

Relative Uncertainty Learning for Facial Expression Recognition The official implementation of the following paper at NeurIPS2021: Title: Relative Unc

35 Dec 28, 2022
Self-Supervised Contrastive Learning of Music Spectrograms

Self-Supervised Music Analysis Self-Supervised Contrastive Learning of Music Spectrograms Dataset Songs on the Billboard Year End Hot 100 were collect

27 Dec 10, 2022
Exploring Versatile Prior for Human Motion via Motion Frequency Guidance (3DV2021)

Exploring Versatile Prior for Human Motion via Motion Frequency Guidance This is the codebase for video-based human motion reconstruction in human-mot

Jiachen Xu 5 Jul 14, 2022