RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with (image, text) pairs.

Last update: Sep 20, 2022

Overview

RuCLIPtiny

Zero-shot image classification model for Russian language

RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with (image, text) pairs. Our model is based on ConvNeXt-tiny and DistilRuBert-tiny, and builds on extensive research in zero-shot transfer, computer vision, natural language processing, and multimodal learning.

Result evaluation

Our model achieved 46.62% top-1 and 73.18% top-5 zero-shot accuracy on CIFAR100.
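For readers unfamiliar with the metric, here is a minimal sketch of how top-k accuracy can be computed from per-class scores. The function name `topk_accuracy` and the toy scores are illustrative assumptions, not part of the rucliptiny package or its evaluation notebook.

```python
def topk_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)


# Toy scores for 3 samples over 4 classes (illustrative values, not model output).
scores = [
    [0.10, 0.60, 0.20, 0.10],  # true class 1: hit at k=1
    [0.40, 0.30, 0.20, 0.10],  # true class 1: hit only at k=2
    [0.70, 0.20, 0.06, 0.04],  # true class 3: miss at k=1 and k=2
]
labels = [1, 1, 3]

top1 = topk_accuracy(scores, labels, k=1)
top2 = topk_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.2%}, top-2: {top2:.2%}")
```

On CIFAR100 the same computation would run over 100 class scores per image with k=1 and k=5.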

Examples

Evaluate & Simple usage

Finetuning

ONNX conversion and speed testing

Model weights

Usage

Install the rucliptiny module and its requirements first. In a notebook, the following commands download the model weights and install the package:

!gdown -O ru-clip-tiny.pkl https://drive.google.com/uc?id=1-3g3J90pZmHo9jbBzsEmr7ei5zm3VXOL
!pip install git+https://github.com/cene555/ru-clip-tiny.git

Example in 3 steps

Download CLIP image from repo

!wget -c -O CLIP.png "https://github.com/openai/CLIP/blob/main/CLIP.png?raw=true"

Import libraries

from rucliptiny.predictor import Predictor
from rucliptiny import RuCLIPtiny
import torch

torch.manual_seed(1)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Load model

model = RuCLIPtiny()
model.load_state_dict(torch.load('ru-clip-tiny.pkl', map_location=device))  # map_location lets the weights load on CPU-only machines
model = model.to(device).eval()

Use predictor to get probabilities

predictor = Predictor()

classes = ['диаграмма', 'собака', 'кошка']
text_probs = predictor(model=model, images_path=["CLIP.png"],
                       classes=classes, get_probs=True,
                       max_len=77, device=device)
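The returned probabilities can then be matched back to class names. The sketch below assumes `text_probs` holds one row of class probabilities per image, in the same order as `classes`; that shape is an assumption about the predictor's output, and `rank_classes` and the placeholder numbers are hypothetical.

```python
def rank_classes(probs_row, classes):
    """Pair each class name with its probability, most likely first."""
    return sorted(zip(classes, probs_row), key=lambda pair: pair[1], reverse=True)


classes = ['диаграмма', 'собака', 'кошка']

# Placeholder probabilities for one image (not real model output). With the
# real model this row might come from something like text_probs[0].tolist(),
# assuming the predictor returns a tensor.
example_row = [0.90, 0.04, 0.06]

ranked = rank_classes(example_row, classes)
for name, prob in ranked:
    print(f"{name}: {prob:.2%}")
```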

Cosine similarity Visualization Example
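As a rough illustration of what such a visualization is built from, the sketch below computes a cosine similarity matrix between image and text embeddings and turns each row into probabilities with a softmax. It assumes the usual CLIP-style recipe; the toy embeddings, the helper names, and the `scale` temperature value are all hypothetical and are not taken from the rucliptiny code.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity_matrix(image_embs, text_embs):
    """Entry [i][j] is the cosine similarity of image i and text j."""
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(img, txt)) for txt in texts]
            for img in images]


def softmax(row, scale=1.0):
    """Numerically stable softmax over one row of scaled similarities."""
    m = max(row)
    exps = [math.exp(scale * (x - m)) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 2-D embeddings (illustrative only; real embeddings are much larger).
image_embs = [[1.0, 0.0], [0.0, 2.0]]
text_embs = [[3.0, 0.0], [1.0, 1.0]]

sims = cosine_similarity_matrix(image_embs, text_embs)
probs = [softmax(row, scale=100.0) for row in sims]  # scale is an assumed temperature
for row in sims:
    print([round(s, 3) for s in row])
```

Plotting `sims` as a heatmap, with images on one axis and class texts on the other, gives the kind of figure this section refers to.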

Speed Testing

NVIDIA Tesla K80 (Google Colab session)

TORCH	batch	encode_image	encode_text	total
RuCLIPtiny	2	0.011	0.004	0.015
RuCLIPtiny	8	0.011	0.004	0.015
RuCLIPtiny	16	0.012	0.005	0.017
RuCLIPtiny	32	0.014	0.005	0.019
RuCLIPtiny	64	0.013	0.006	0.019
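Timings like these can be collected with a simple wall-clock loop. The sketch below is an assumed measurement procedure, not the one used for the table: `mean_latency` and `dummy_encode` are hypothetical names, and the dummy workload stands in for the image and text encoding steps. When timing on a GPU, calling `torch.cuda.synchronize()` before reading the clock is generally needed, since CUDA kernels run asynchronously.

```python
import time


def mean_latency(fn, n_warmup=2, n_runs=10):
    """Average wall-clock seconds per call of fn, measured after warm-up calls."""
    for _ in range(n_warmup):
        fn()  # warm-up runs are excluded from the measurement
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    return (time.perf_counter() - start) / n_runs


def dummy_encode():
    # Placeholder workload standing in for an image or text encoding step.
    return sum(i * i for i in range(10_000))


latency = mean_latency(dummy_encode)
print(f"{latency:.6f} s per call")
```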

We would like to express our gratitude to Sber AI for the grants that supported this research, which was carried out as part of the Artificial Intelligence International Junior Contest (AIIJC).

Owner

Shahmatov Arseniy
