Garbage classification using structure data.

Overview

垃圾分类模型使用说明

1.包含以下数据文件

文件 描述
data/MaterialMapping.csv 物体以及其归类的信息
data/TestRecords 光谱原始测试数据 CSV 文件
data/TestRecordDesc.zip CSV 文件描述文件
data/Boundaries.csv 物体轮廓信息

2.包含以下模型文件

文件夹 描述
output/Category/ 包含预测大类别的分类模型
output/Material/ 包含预测大类别(4类)的分类模型
output/Backgroud/ 包含预测小类别(50类)的分类模型

3.环境配置

  进入garbage路径,在anaconda命令行运行pip install -r requirements.txt

4.数据预处理

  在anaconda命令行运行python data_preprocess.py,即可在data文件夹中生成AllEmbracingDataset.csv。若将来更新数据,按照和原来相同的格式和路径保存在data文件夹中,即可用data_preprocess.py生成更新后的数据集

  • 运行数据预处理Python脚本,将上述数据的信息集合到一个数据文件中
python code/data_preprocess.py -data_dir D:/datasets/garbage \
                        -test \
                        -groupbyObjID

运行脚本生成的数据文件 datasets/AllEmbracingDataset.csv 数据集

5.模型训练Python脚本

python code/train_gbdt_lr.py -data_dir D:/datasets/garbage/ \
                    -use_groupbyID True \
                    -output_dir output/ \
                    -skip_data_preprocess

其他 Python脚本说明:

  • feature_engineering.py 特征工程代码
  • ref.py 数据处理和模型推理所需的配置文件
  • utils.py 数据处理所需的一些函数
  • gbdt_feature.py 用gbdt模型生成特征

6.模型推理Python脚本

python code/predict_gbdt_lr.py -data_dir D:/datasets/garbage/ \
                    -use_groupbyID True \
                    -output_dir output/ \
                    -skip_data_preprocess \
                    -save_dir output/ 

  注1:只要同一个ObjID的多条数据的预测结果有一个不是背景零,最终预测结果就不是背景零。

  注2:预测出的Material只会是在训练数据中出现过的唯一标记号。这次数据中不同的唯一标记号共有148个,具体可参见output/log/log.txt中的LabelEncoder.classes

  • 预测结果文件(predictions.csv)说明:对每个物体(即每个ObjID,通常对应多条测试记录)给出多个预测结果汇总后的预测结果。
# 域名 意义
1 ObjID 被测物体唯一标记。同一物体会对应多条测试记录
2 Category 物体分类,从训练数据中获取
3 Material 物体对应的唯一标识号,从训练数据中获取
4 pred_Category 模型所预测出的物体分类
5 pred_Material 模型所预测出的物体唯一标识号
6 pred_background 模型预测的背景和物体 (背景标记为 0,物体标记为 1)
7 pred_Category_final 模型所预测出的物体分类
8 pred_Material_final 模型所预测出的物体材料分类

7. 模型精度

  对于Category、Material和Background三种场景的预测,我们均使用GBDT+LR模型。尝试过SVM、XGBoost、LightGBM和GBDT+LR模型,对比之下,GBDT+LR模型表现最好。   在测试集上的Accuracy如下:

场景 Accuracy
Category 0.7583130575831306
Material 0.6042173560421735
Background 0.996044825313118
Owner
wenqi
Learning is all you need!
wenqi
CIFAR-10_train-test - training and testing codes for dataset CIFAR-10

CIFAR-10_train-test - training and testing codes for dataset CIFAR-10

Frederick Wang 3 Apr 26, 2022
Flexible time series feature extraction & processing

tsflex is a toolkit for flexible time series processing & feature extraction, that is efficient and makes few assumptions about sequence data. Useful

PreDiCT.IDLab 206 Dec 28, 2022
Fake News Detection Using Machine Learning Methods

Fake-News-Detection-Using-Machine-Learning-Methods Fake news is always a real and dangerous issue. However, with the presence and abundance of various

Achraf Safsafi 1 Jan 11, 2022
git《Tangent Space Backpropogation for 3D Transformation Groups》(CVPR 2021) GitHub:1]

LieTorch: Tangent Space Backpropagation Introduction The LieTorch library generalizes PyTorch to 3D transformation groups. Just as torch.Tensor is a m

Princeton Vision & Learning Lab 482 Jan 06, 2023
CAPITAL: Optimal Subgroup Identification via Constrained Policy Tree Search

CAPITAL: Optimal Subgroup Identification via Constrained Policy Tree Search This repository is the official implementation of CAPITAL: Optimal Subgrou

Hengrui Cai 0 Oct 19, 2021
Pytorch reimplementation of the Mixer (MLP-Mixer: An all-MLP Architecture for Vision)

MLP-Mixer Pytorch reimplementation of Google's repository for the MLP-Mixer (Not yet updated on the master branch) that was released with the paper ML

Eunkwang Jeon 18 Dec 08, 2022
A collection of metrics for evaluating timbre dissimilarity using the TorchMetrics API

Timbre Dissimilarity Metrics A collection of metrics for evaluating timbre dissimilarity using the TorchMetrics API Installation pip install -e . Usag

Ben Hayes 21 Jan 05, 2022
Image super-resolution through deep learning

srez Image super-resolution through deep learning. This project uses deep learning to upscale 16x16 images by a 4x factor. The resulting 64x64 images

David Garcia 5.3k Dec 28, 2022
Exploring Simple Siamese Representation Learning

G-SimSiam A PyTorch implementation which refers to repo for the paper Exploring Simple Siamese Representation Learning by Xinlei Chen & Kaiming He Add

zhuyun 1 Dec 19, 2021
A framework for the elicitation, specification, formalization and understanding of requirements.

A framework for the elicitation, specification, formalization and understanding of requirements.

NASA - Software V&V 161 Jan 03, 2023
The code for paper "Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation" which is accepted by AAAI 2022

Contrastive Spatio Temporal Pretext Learning for Self-supervised Video Representation (AAAI 2022) The code for paper "Contrastive Spatio-Temporal Pret

8 Jun 30, 2022
Implementation EfficientDet: Scalable and Efficient Object Detection in PyTorch

Implementation EfficientDet: Scalable and Efficient Object Detection in PyTorch

tonne 1.4k Dec 29, 2022
Extending JAX with custom C++ and CUDA code

Extending JAX with custom C++ and CUDA code This repository is meant as a tutorial demonstrating the infrastructure required to provide custom ops in

Dan Foreman-Mackey 237 Dec 23, 2022
CoReNet is a technique for joint multi-object 3D reconstruction from a single RGB image.

CoReNet CoReNet is a technique for joint multi-object 3D reconstruction from a single RGB image. It produces coherent reconstructions, where all objec

Google Research 80 Dec 25, 2022
The Ludii general game system, developed as part of the ERC-funded Digital Ludeme Project.

The Ludii General Game System Ludii is a general game system being developed as part of the ERC-funded Digital Ludeme Project (DLP). This repository h

Digital Ludeme Project 50 Jan 04, 2023
A-SDF: Learning Disentangled Signed Distance Functions for Articulated Shape Representation (ICCV 2021)

A-SDF: Learning Disentangled Signed Distance Functions for Articulated Shape Representation (ICCV 2021) This repository contains the official implemen

81 Dec 14, 2022
Dilated RNNs in pytorch

PyTorch Dilated Recurrent Neural Networks PyTorch implementation of Dilated Recurrent Neural Networks (DilatedRNN). Getting Started Installation: $ pi

Zalando Research 200 Nov 17, 2022
利用yolov5和TensorRT从0到1实现目标检测的模型训练到模型部署全过程

写在前面 利用TensorRT加速推理速度是以时间换取精度的做法,意味着在推理速度上升的同时将会有精度的下降,不过不用太担心,精度下降微乎其微。此外,要有NVIDIA显卡,经测试,CUDA10.2可以支持20系列显卡及以下,30系列显卡需要CUDA11.x的支持,并且目前有bug。 默认你已经完成了

Helium 6 Jul 28, 2022
simple_pytorch_example project is a toy example of a python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset

simple_pytorch_example project is a toy example of a python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset

Ramón Casero 1 Jan 07, 2022
RoBERTa Marathi Language model trained from scratch during huggingface 🤗 x flax community week

RoBERTa base model for Marathi Language (मराठी भाषा) Pretrained model on Marathi language using a masked language modeling (MLM) objective. RoBERTa wa

Nipun Sadvilkar 23 Oct 19, 2022