Generate images from text descriptions in Russian.

Overview

ruDALL-E

Apache license

pip install rudalle==1.1.0rc0

🤗 HF Models:

ruDALL-E Malevich (XL)
ruDALL-E Emojich (XL) (readme here)
ruDALL-E Surrealist (XL)

Minimal Example:

Available on Colab, Kaggle, and Hugging Face Spaces.

Example usage of ruDALL-E Malevich (XL) with 3.5 GB of VRAM is available on Colab.

A fine-tuning example is also available on Colab.

generation by ruDALLE:

import ruclip
from rudalle.pipelines import generate_images, show, super_resolution, cherry_pick_by_ruclip
from rudalle import get_rudalle_model, get_tokenizer, get_vae, get_realesrgan
from rudalle.utils import seed_everything

# prepare models:
device = 'cuda'
dalle = get_rudalle_model('Malevich', pretrained=True, fp16=True, device=device)
tokenizer = get_tokenizer()
vae = get_vae(dwt=True).to(device)

# pipeline utils:
realesrgan = get_realesrgan('x2', device=device)
clip, processor = ruclip.load('ruclip-vit-base-patch32-384', device=device)
clip_predictor = ruclip.Predictor(clip, processor, device, bs=8)
text = 'радуга на фоне ночного города'  # "a rainbow against the backdrop of a night city"

seed_everything(42)
pil_images = []
scores = []
for top_k, top_p, images_num in [
    (2048, 0.995, 24),
]:
    _pil_images, _scores = generate_images(text, tokenizer, dalle, vae, top_k=top_k, images_num=images_num, bs=8, top_p=top_p)
    pil_images += _pil_images
    scores += _scores

show(pil_images, 6)
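
The top_k and top_p arguments passed to generate_images above are the usual sampling controls for autoregressive models. As a rough illustration only, here is a minimal pure-Python sketch of the standard top-k and top-p (nucleus) filtering idea. It is not ruDALL-E's implementation: the function name `filter_top_k_top_p` and the toy probabilities are hypothetical, and real implementations work on logits and may treat the boundary token differently.

```python
def filter_top_k_top_p(probs, top_k, top_p):
    """Return {token_index: renormalized_prob} after top-k, then top-p filtering.

    Illustrative sketch only; `probs` is a plain list of probabilities.
    """
    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # Top-k: keep only the k most probable tokens.
    ranked = ranked[:top_k]
    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1 before sampling.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


toy_probs = [0.5, 0.25, 0.15, 0.07, 0.03]
filtered = filter_top_k_top_p(toy_probs, top_k=4, top_p=0.8)
print(filtered)
```

With values as permissive as top_k=2048 and top_p=0.995, only the least likely tail of the distribution is cut off, which keeps the samples diverse.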

auto cherry-pick by ruCLIP:

top_images, clip_scores = cherry_pick_by_ruclip(pil_images, text, clip_predictor, count=6)
show(top_images, 3)
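
Conceptually, cherry-picking means ranking the candidates by a text-image similarity score and keeping the best few. The sketch below shows only that selection step in plain Python. It assumes a higher score means a better match, and the helper `pick_top` and the sample names and scores are hypothetical; it does not reproduce how cherry_pick_by_ruclip computes its scores.

```python
def pick_top(items, scores, count):
    """Return the `count` highest-scoring items (best first) and their scores."""
    if len(items) != len(scores):
        raise ValueError("items and scores must have the same length")
    # Indices sorted by score, highest first, truncated to `count`.
    order = sorted(range(len(items)), key=lambda i: scores[i], reverse=True)[:count]
    return [items[i] for i in order], [scores[i] for i in order]


# Stand-ins for generated images and their similarity scores (made-up values).
candidates = ["img_a", "img_b", "img_c", "img_d"]
similarity = [0.21, 0.34, 0.18, 0.29]
best_items, best_scores = pick_top(candidates, similarity, count=2)
print(best_items, best_scores)
```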

super resolution:

sr_images = super_resolution(top_images, realesrgan)
show(sr_images, 3)

text, seed = 'красивая тян из аниме', 6955  # "beautiful anime girl"

Image Prompt

see jupyters/ruDALLE-image-prompts-A100.ipynb

text, seed = 'Храм Василия Блаженного', 42  # "Saint Basil's Cathedral"
skyes = [red_sky, sunny_sky, cloudy_sky, night_sky]

Aspect ratio images -->NEW<--

Comments
  • Smaller / Distilled model?

    Will there be a smaller or distilled model release? Inference in Google Colab is slow: 4:32 for one image on a P100, and over 2 hours for 3 images on a K80.

    opened by johnpaulbin 10
  • RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

    I use the default code and get an error after generation reaches 100%. Please help. I use Windows and conda.

    ◼️ Malevich is 1.3 billion params model from the family GPT3-like, that uses Russian language and text+image multi-modality.
    x4 --> ready
    tokenizer --> ready
    Working with z of shape (1, 256, 32, 32) = 262144 dimensions.
    vae --> ready
    ruclip --> ready
    100%|██████████████████████████████████████████████████████████████████████████████| 1024/1024 [00:46<00:00, 22.14it/s]
    Traceback (most recent call last):
      File "gen.py", line 29, in <module>
        _pil_images, _scores = generate_images(text, tokenizer, dalle, vae, top_k=top_k, images_num=images_num, top_p=top_p)
      File "C:\Users\1\anaconda3\lib\site-packages\rudalle\pipelines.py", line 60, in generate_images
        images = vae.decode(codebooks)
      File "C:\Users\1\anaconda3\lib\site-packages\rudalle\vae\model.py", line 38, in decode
        img = self.model.decode(z)
      File "C:\Users\1\anaconda3\lib\site-packages\rudalle\vae\model.py", line 98, in decode
        quant = self.post_quant_conv(quant)
      File "C:\Users\1\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "C:\Users\1\anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 399, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "C:\Users\1\anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 395, in _conv_forward
        return F.conv2d(input, weight, bias, self.stride,
    RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
    You can try to repro this exception using the following code snippet. If that doesn't trigger the error, please include your original repro script when reporting this issue.

    import torch
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.benchmark = True
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.allow_tf32 = True
    data = torch.randn([3, 256, 32, 32], dtype=torch.float, device='cuda', requires_grad=True).to(memory_format=torch.channels_last)
    net = torch.nn.Conv2d(256, 256, kernel_size=[1, 1], padding=[0, 0], stride=[1, 1], dilation=[1, 1], groups=1)
    net = net.cuda().float().to(memory_format=torch.channels_last)
    out = net(data)
    out.backward(torch.randn_like(out))
    torch.cuda.synchronize()

    ConvolutionParams
        data_type = CUDNN_DATA_FLOAT
        padding = [0, 0, 0]
        stride = [1, 1, 0]
        dilation = [1, 1, 0]
        groups = 1
        deterministic = true
        allow_tf32 = true
    input: TensorDescriptor 0000020481F094B0
        type = CUDNN_DATA_FLOAT
        nbDims = 4
        dimA = 3, 256, 32, 32,
        strideA = 262144, 1, 8192, 256,
    output: TensorDescriptor 0000020481F09590
        type = CUDNN_DATA_FLOAT
        nbDims = 4
        dimA = 3, 256, 32, 32,
        strideA = 262144, 1, 8192, 256,
    weight: FilterDescriptor 000001FFD2E76AF0
        type = CUDNN_DATA_FLOAT
        tensor_format = CUDNN_TENSOR_NHWC
        nbDims = 4
        dimA = 256, 256, 1, 1,
    Pointer addresses:
        input: 0000001538C7D000
        output: 000000153B87D000
        weight: 00000014D3BB0000

    opened by bitcoin5000 7
  • Auto cut pictures into separated images

    Are there any arguments that will automatically cut the generated picture into separate images and save them individually?

    opened by Sidiusz 4
  • Gradient checkpointing

    This patch enables gradient checkpointing for ruDALLE.

    It's possible to use up to 3x higher batch sizes in memory-limited environments during training.

    Passing gradient_checkpointing to model.forward creates a checkpoint every gradient_checkpointing layers; 6 is a good starting value.

    opened by neverix 3
  • Feature/dwt vae

    Adds support for VAE decoding with DWT (discrete wavelet transform), which makes it possible to restore 512x512 images.

    thanks a lot @bes for issue https://github.com/sberbank-ai/ru-dalle/issues/42 with this idea 👍

    vae = get_vae(dwt=True)
    
    opened by shonenkov 3
  • optimize image prompts

    This enables caching for image prompts. For some reason, the results change slightly. I tried looking for off-by-one bugs in this, but couldn't find one myself.

    opened by neverix 3
  • Error in the ruDALL-E code published on Kaggle

    Running the ruDALL-E code in the Kaggle notebook (as published) in a GPU session ends with an error:

    ModuleNotFoundError                       Traceback (most recent call last)
    /tmp/ipykernel_29/1914141142.py in <module>
    ----> 1 from rudalle.pipelines import generate_images, show, super_resolution, cherry_pick_by_clip
          2 from rudalle import get_rudalle_model, get_tokenizer, get_vae, get_realesrgan, get_ruclip
          3 from rudalle.utils import seed_everything
    
    ModuleNotFoundError: No module named 'rudalle'
    
    

    The error message refers to this code:

    !pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html > /dev/null
    !pip install rudalle==0.0.1rc1 > /dev/null
    
    opened by XieBaoshi 3
  • Constantly having to redownload models

    Hi, I've noticed that running it on a local Jupyter instance always re-downloads the model. Is there a way to avoid this? I don't want to wait for the download to finish every time. Thanks!

    opened by JohnnyRacer 2
  • Problem about the PyTorch vision?

    I have looked through the issues but can't find the same problem, so sorry to bother you. GPU: (screenshot). My Python environment: pytorch=1.8.0, torchvision=0.9.0, cudatoolkit=11.3.1, cudnn=8.2.1. I tried rudalle=0.3.0 following the readme.md, and 0.0.1rc5 from RTX3090.ipynb, but I only got the following error: (screenshot)

    So I would like to know whether there is a problem in my environment. Waiting for your reply!

    opened by Wang-Xiaodong1899 2
  • image_prompts.py – borders crop not working properly

    From the official documentation:

    borders (dict[str] | int): borders that we croped from pil_image example: {'up': 4, 'right': 0, 'left': 0, 'down': 0} (1 int eq 8 pixels)

    The 'up' crop works just fine. But if I pass anything other than 'up' as the crop argument, I get an AssertionError: (screenshot)

    Thank you for a fantastic algo ✨

    opened by DenisSergeevitch 2
  • generate_images does not run

    I am trying to run it with device = 'cpu', using the very first example from the README.

    It fails with the traceback below. What am I doing wrong?

    ◼️ Malevich is 1.3 billion params model from the family GPT3-like, that uses Russian language and text+image multi-modality.
    x4 --> ready
    tokenizer --> ready
    Working with z of shape (1, 256, 32, 32) = 262144 dimensions.
    vae --> ready
    ruclip --> ready
      0%|          | 0/1024 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "%projectfolder%\test\venv\lib\site-packages\rudalle\pipelines.py", line 46, in generate_images
        logits, has_cache = dalle(out, attention_mask,
      File "%projectfolder%\test\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "%projectfolder%\test\venv\lib\site-packages\rudalle\dalle\fp16.py", line 51, in forward
        return fp16_to_fp32(self.module(*(fp32_to_fp16(inputs)), **kwargs))
      File "%projectfolder%\test\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "%projectfolder%\test\venv\lib\site-packages\rudalle\dalle\model.py", line 150, in forward
        transformer_output, present_has_cache = self.transformer(
      File "%projectfolder%\test\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "%projectfolder%\test\venv\lib\site-packages\rudalle\dalle\transformer.py", line 76, in forward
        hidden_states, present_has_cache = layer(hidden_states, mask, has_cache=has_cache, use_cache=use_cache)
      File "%projectfolder%\test\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "%projectfolder%\test\venv\lib\site-packages\rudalle\dalle\transformer.py", line 146, in forward
        layernorm_output = self.input_layernorm(hidden_states)
      File "%projectfolder%\test\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "%projectfolder%\test\venv\lib\site-packages\torch\nn\modules\normalization.py", line 173, in forward
        return F.layer_norm(
      File "%projectfolder%\test\venv\lib\site-packages\torch\nn\functional.py", line 2346, in layer_norm
        return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
    RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
    
    opened by Xoma163 2
  • Add optional resume_download argument to help download large models

    Downloading large models over an unstable network connection is painful. For instance, I have started seeing this type of error (see screenshot). It breaks the download, and you have to start again from zero bytes.

    However, the cached_download(..) function in huggingface_hub has a resume_download argument that can be used to restart a download without losing progress. I think it would be helpful to add it as an optional argument (defaulting to False) to get_rudalle_model(..), so users can turn it on if they have an unstable connection.

    opened by Rexhaif 0
  • kandinsky model not available

    Nice to see the update! There is an auth error with the kandinsky model. I'm not sure whether this is intended, as there seems to be a token requirement. Could you clarify?

    opened by xavierleung 0
  • RuntimeError: nvrtc: error: failed to open libnvrtc-builtins.so.11.1.

    What might be causing this?

    RuntimeError: nvrtc: error: failed to open libnvrtc-builtins.so.11.1. Make sure that libnvrtc-builtins.so.11.1 is installed correctly. nvrtc compilation failed:

    #define NAN __int_as_float(0x7fffffff)
    #define POS_INFINITY __int_as_float(0x7f800000)
    #define NEG_INFINITY __int_as_float(0xff800000)
    
    
    template<typename T>
    __device__ T maximum(T a, T b) {
      return isnan(a) ? a : (a > b ? a : b);
    }
    
    template<typename T>
    __device__ T minimum(T a, T b) {
      return isnan(a) ? a : (a < b ? a : b);
    }
    
    
    #define __HALF_TO_US(var) *(reinterpret_cast<unsigned short *>(&(var)))
    #define __HALF_TO_CUS(var) *(reinterpret_cast<const unsigned short *>(&(var)))
    #if defined(__cplusplus)
      struct __align__(2) __half {
        __host__ __device__ __half() { }
    
      protected:
        unsigned short __x;
      };
    
      /* All intrinsic functions are only available to nvcc compilers */
      #if defined(__CUDACC__)
        /* Definitions of intrinsics */
        __device__ __half __float2half(const float f) {
          __half val;
          asm("{  cvt.rn.f16.f32 %0, %1;}\n" : "=h"(__HALF_TO_US(val)) : "f"(f));
          return val;
        }
    
        __device__ float __half2float(const __half h) {
          float val;
          asm("{  cvt.f32.f16 %0, %1;}\n" : "=f"(val) : "h"(__HALF_TO_CUS(h)));
          return val;
        }
    
      #endif /* defined(__CUDACC__) */
    #endif /* defined(__cplusplus) */
    #undef __HALF_TO_US
    #undef __HALF_TO_CUS
    
    typedef __half half;
    
    extern "C" __global__
    void fused_mul_mul_mul_mu_5065363705190979294(half* t0, half* aten_mul) {
    {
      float t0_1 = __half2float(t0[(8192 * (((512 * blockIdx.x + threadIdx.x) / 8192) % 128) + ((512 * blockIdx.x + threadIdx.x) / 1048576) * 1048576) + (512 * blockIdx.x + threadIdx.x) % 8192]);
      aten_mul[(8192 * (((512 * blockIdx.x + threadIdx.x) / 8192) % 128) + ((512 * blockIdx.x + threadIdx.x) / 1048576) * 1048576) + (512 * blockIdx.x + threadIdx.x) % 8192] = __float2half((t0_1 * 0.5f) * ((tanhf((t0_1 * 0.7978845834732056f) * ((t0_1 * 0.04471499845385551f) * t0_1 + 1.f))) + 1.f));
    }
    }
    
    opened by c0ffymachyne 1
  • Bad syntax in Colab

    In https://colab.research.google.com/drive/1wGE-046et27oHvNlBNPH07qrEQNE04PQ?usp=sharing#scrollTo=GdOYJvwZSB-D

    the value of the text parameter is missing its double quotes ("):

    text = Что бы ни # @param

    Should be:

    text = "Что бы ни" # @param

    Thanks!

    opened by Jakeukalane 1
Releases (v1.1.0)
Owner
AI Forever
Creating ML for the future. AI projects you already know. We are a non-profit organization with members from all over the world.