PASSL包含 SimCLR,MoCo,BYOL,CLIP等基于对比学习的图像自监督算法以及 Vision-Transformer,Swin-Transformer,BEiT,CVT,T2T,MLP_Mixer等视觉Transformer算法

Overview

PASSL

Introduction

PASSL is a Paddle based vision library for state-of-the-art Self-Supervised Learning research with PaddlePaddle. PASSL aims to accelerate research cycle in self-supervised learning: from designing a new self-supervised task to evaluating the learned representations.

  • Reproducible implementation of SOTA in Self-Supervision: Existing SOTA in Self-Supervision are implemented - SimCLR, MoCo(v1),MoCo(v2), MoCo-BYOL, CLIP. BYOL is coming soon. Also supports supervised trainings.
  • Modular: Easy to build new tasks and reuse the existing components from other tasks (Trainer, models and heads, data transforms, etc.).

Installation

Implemented Models

Benchmark Linear Image Classification on ImageNet-1K

epochs official results passl results Backbone Model
MoCo 200 60.6 60.64 ResNet-50 download
SimCLR 100 64.5 65.3 ResNet-50 download
MoCo v2 200 67.7 67.72 ResNet-50 download
MoCo-BYOL 300 71.56 72.10 ResNet-50 download
BYOL 300 72.50 71.62 ResNet-50 download

Getting Started

Please see GETTING_STARTED.md for the basic usage of PASSL.

Tutorials

Comments
  • MLP-Mixer: An all-MLP Architecture for Vision

    MLP-Mixer: An all-MLP Architecture for Vision

    readme文件里的两个模型的TOP1 是不是写反了?模型大的准确度比模型小的准确度小一些?

    Arch | Weight | Top-1 Acc | Top-5 Acc | Crop ratio | # Params -- | -- | -- | -- | -- | -- mlp_mixer_b16_224 | pretrain 1k | 76.60 | 92.23 | 0.875 | 60.0M mlp_mixer_l16_224 | pretrain 1k | 72.06 | 87.67 | 0.875 | 208.2M

    opened by gaorui999 3
  • 我很关注图像分类的自监督进展

    我很关注图像分类的自监督进展

    小弟想问问,对于图像分类的自监督,目前是什么进展呢?比如猫狗分类这种典型的二分类准确率如何?imagenet1k分类准确率如何?PASSL里面的关于图像分类的自监督算法或者模型,有哪些?能给个例子,让我知道如何使用吗?目前看到PASSLissues才1条,文档完全没看到.方便加个微信或者QQ聊几句吗?小弟对于图像分类的自监督高度重视.还有一个疑问,关于图像分类的自监督模型,是不是我给一堆图片,模型运行后,就会把图片归类呢?我需不需要给出类别的数量呢?说白了,我想知道图像分类的自监督的一个使用流程.现在都1.0了,该有点用处了吧.如果一个模型运行后,图像就分好类了,归纳为N类,我有什么办法判断分类的正确性呢?这方面有算法吗? 提了很多问题,跪求每个问题都回答一下,谢谢大佬.

    opened by yuwoyizhan 2
  • Unintended behavior in clip_logit_scale

    Unintended behavior in clip_logit_scale

    https://github.com/PaddlePaddle/PASSL/blob/83c49e6a5ba3444cee7f054122559d7759152764/passl/modeling/backbones/clip.py#L317

    check this issue for reference https://github.com/PaddlePaddle/Paddle/issues/43710

    Suggested approach (with non-public API)

    logit_scale_buffer = self.logit_scale.clip(-4.6, 4.6)
    logit_scale_buffer._share_buffer_to(self.logit_scale)
    
    opened by minogame 1
  • 建议

    建议

    1.passl很多文字都是英文的,包括快速使用等文档,希望可以提供中文文档. 2.希望知道图像分类自监督学习的技术研究目前到达什么程度了.比如猫狗这种二分类准确率如何,imagenet准确率如何,使用passl进行图像分类,需要给类别总数量吗? 3.能加个QQ或者微信聊几句吗?有些疑问,拜托了,大佬. QQ:1226194560 微信:18820785964

    opened by yuwoyizhan 1
  • fix bug of mixup for DeiT

    fix bug of mixup for DeiT

    DeiT/B-16 pretrained on ImageNet1K:

    [01/21 02:54:46] passl.engine.trainer INFO: Validate Epoch [290] acc1 (81.336), acc5 (95.544)
    [01/21 03:02:31] passl.engine.trainer INFO: Validate Epoch [291] acc1 (81.328), acc5 (95.580)
    [01/21 03:10:20] passl.engine.trainer INFO: Validate Epoch [292] acc1 (81.390), acc5 (95.608)
    [01/21 03:18:10] passl.engine.trainer INFO: Validate Epoch [293] acc1 (81.484), acc5 (95.636)
    [01/21 03:26:00] passl.engine.trainer INFO: Validate Epoch [294] acc1 (81.452), acc5 (95.600)
    [01/21 03:33:52] passl.engine.trainer INFO: Validate Epoch [295] acc1 (81.354), acc5 (95.528)
    [01/21 03:41:38] passl.engine.trainer INFO: Validate Epoch [296] acc1 (81.338), acc5 (95.562)
    [01/21 03:49:25] passl.engine.trainer INFO: Validate Epoch [297] acc1 (81.344), acc5 (95.542)
    [01/21 03:57:15] passl.engine.trainer INFO: Validate Epoch [298] acc1 (81.476), acc5 (95.550)
    [01/21 04:05:03] passl.engine.trainer INFO: Validate Epoch [299] acc1 (81.476), acc5 (95.572)
    [01/21 04:12:51] passl.engine.trainer INFO: Validate Epoch [300] acc1 (81.386), acc5 (95.536)
    
    opened by GuoxiaWang 1
  • BYOL的预训练中好像使用了gt_label?

    BYOL的预训练中好像使用了gt_label?

    • 在byol的config 中设置了 num_classes=1000: https://github.com/PaddlePaddle/PASSL/blob/9d7a9fd4af41772e29120553dddab1c162e4cb70/configs/byol/byol_r50_IM.yaml#L34
    • 在model中设置了self.classifier = nn.Linear(embedding_dim, num_classes),并且forward中将classif_out和label一起传给了head

    image

    https://github.com/PaddlePaddle/PASSL/blob/9d7a9fd4af41772e29120553dddab1c162e4cb70/passl/modeling/architectures/BYOL.py#L263

    • 在L2 Head中将对比loss和有监督的CE loss加在了一起返回

    image

    https://github.com/PaddlePaddle/PASSL/blob/9d7a9fd4af41772e29120553dddab1c162e4cb70/passl/modeling/heads/l2_head.py#L43

    opened by youqingxiaozhua 0
  • [飞桨论文复现挑战赛(第六期)] (85) Emerging Properties in Self-Supervised Vision Transformers

    [飞桨论文复现挑战赛(第六期)] (85) Emerging Properties in Self-Supervised Vision Transformers

    PR types

    New features

    PR changes

    APIs

    Describe

    • Task: https://github.com/PaddlePaddle/Paddle/issues/41482
    • 添加 passl.model.architectures.dino

    Peformance

    | Model | Official | Passl | | ---- | ---- | ---- | | DINO | 74.0 | 73.6 |

    • [x] 预训练和linear probe代码
    • [ ] 预训练和linear probe权重
    • [ ] 文档
    • [ ] TIPC
    opened by fuqianya 0
Releases(v1.0.0)
  • v1.0.0(Feb 24, 2022)

    • 新增 XCiT 视觉 Transformer 模型 xcit_nano_12_p8_224 蒸馏模型训练指标对齐,感谢 @BrilliantYuKaimin 的高质量贡献 🎉 🎉 🎉

    PASSL飞桨自监督领域核心学习库,提供大量高精度的视觉自监督模型、视觉 Transformer 模型,并支持超大视觉模型分布式训练功能,旨在提升飞桨开发者在自监督领域建模效率,并提供基于飞桨框架2.2的超大视觉模型领域最佳实践

    Source code(tar.gz)
    Source code(zip)
Convert Mission Planner (ArduCopter) Waypoint Missions to Litchi CSV Format to execute on DJI Drones

Mission Planner to Litchi Convert Mission Planner (ArduCopter) Waypoint Surveys to Litchi CSV Format to execute on DJI Drones Litchi doesn't support S

Yaros 24 Dec 09, 2022
Code for the paper "Jukebox: A Generative Model for Music"

Status: Archive (code is provided as-is, no updates expected) Jukebox Code for "Jukebox: A Generative Model for Music" Paper Blog Explorer Colab Insta

OpenAI 6k Jan 02, 2023
Source code of generalized shuffled linear regression

Generalized-Shuffled-Linear-Regression Code for the ICCV 2021 paper: Generalized Shuffled Linear Regression. Authors: Feiran Li, Kent Fujiwara, Fumio

FEI 7 Oct 26, 2022
DeepFaceEditing: Deep Face Generation and Editing with Disentangled Geometry and Appearance Control

DeepFaceEditing: Deep Face Generation and Editing with Disentangled Geometry and Appearance Control One version of our system is implemented using the

260 Nov 28, 2022
YOLOv4-v3 Training Automation API for Linux

This repository allows you to get started with training a state-of-the-art Deep Learning model with little to no configuration needed! You provide your labeled dataset or label your dataset using our

BMW TechOffice MUNICH 626 Dec 31, 2022
PyTorch implementation of Neural Dual Contouring.

NDC PyTorch implementation of Neural Dual Contouring. Citation We are still writing the paper while adding more improvements and applications. If you

Zhiqin Chen 140 Dec 26, 2022
CCPD: a diverse and well-annotated dataset for license plate detection and recognition

CCPD (Chinese City Parking Dataset, ECCV) UPdate on 10/03/2019. CCPD Dataset is now updated. We are confident that images in subsets of CCPD is much m

detectRecog 1.8k Dec 30, 2022
Real-time face detection and emotion/gender classification using fer2013/imdb datasets with a keras CNN model and openCV.

Real-time face detection and emotion/gender classification using fer2013/imdb datasets with a keras CNN model and openCV.

Octavio Arriaga 5.3k Dec 30, 2022
Recurrent Neural Network Tutorial, Part 2 - Implementing a RNN in Python and Theano

Please read the blog post that goes with this code! Jupyter Notebook Setup System Requirements: Python, pip (Optional) virtualenv To start the Jupyter

Denny Britz 863 Dec 15, 2022
Ontologysim: a Owlready2 library for applied production simulation

Ontologysim: a Owlready2 library for applied production simulation Ontologysim is an open-source deep production simulation framework, with an emphasi

10 Nov 30, 2022
Official PyTorch Code of GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection (CVPR 2021)

GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Mo

Abhinav Kumar 76 Jan 02, 2023
Implementation of the federated dual coordinate descent (FedDCD) method.

FedDCD.jl Implementation of the federated dual coordinate descent (FedDCD) method. Installation To install, just call Pkg.add("https://github.com/Zhen

Zhenan Fan 6 Sep 21, 2022
UMPNet: Universal Manipulation Policy Network for Articulated Objects

UMPNet: Universal Manipulation Policy Network for Articulated Objects Zhenjia Xu, Zhanpeng He, Shuran Song Columbia University Robotics and Automation

Columbia Artificial Intelligence and Robotics Lab 33 Dec 03, 2022
A simple API wrapper for Discord interactions.

Your ultimate Discord interactions library for discord.py. About | Installation | Examples | Discord | PyPI About What is discord-py-interactions? dis

james 641 Jan 03, 2023
MoViNets PyTorch implementation: Mobile Video Networks for Efficient Video Recognition;

MoViNet-pytorch Pytorch unofficial implementation of MoViNets: Mobile Video Networks for Efficient Video Recognition. Authors: Dan Kondratyuk, Liangzh

189 Dec 20, 2022
CRF-RNN for Semantic Image Segmentation - PyTorch version

This repository contains the official PyTorch implementation of the "CRF-RNN" semantic image segmentation method, published in the ICCV 2015

Sadeep Jayasumana 170 Dec 13, 2022
Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation

STCN Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang [a

Rex Cheng 456 Dec 12, 2022
Implementation of Restricted Boltzmann Machine (RBM) and its variants in Tensorflow

xRBM Library Implementation of Restricted Boltzmann Machine (RBM) and its variants in Tensorflow Installation Using pip: pip install xrbm Examples Tut

Omid Alemi 55 Dec 29, 2022
Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving

SalsaNext: Fast, Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving Abstract In this paper, we introduce SalsaNext f

308 Jan 04, 2023
Pytorch-Swin-Unet-V2 - a modified version of Swin Unet based on Swin Transfomer V2

Swin Unet V2 Swin Unet V2 is a modified version of Swin Unet arxiv based on Swin

Chenxu Peng 26 Dec 03, 2022