GANsformer: Generative Adversarial Transformers

Overview

Drew A. Hudson* & C. Lawrence Zitnick

*I wish to thank Christopher D. Manning for the fruitful discussions and constructive feedback in developing the Bipartite Transformer, especially when explored within the language representation area, as well as for the kind financial support that allowed this work to happen!

This is an implementation of the GANsformer model, a novel and efficient type of transformer, explored for the task of image generation. The network employs a bipartite structure that enables long-range interactions across the image while maintaining linear computational efficiency, and can readily scale to high-resolution synthesis. The model iteratively propagates information from a set of latent variables to the evolving visual features and vice versa, to support the refinement of each in light of the other and to encourage the emergence of compositional representations of objects and scenes. In contrast to the classic transformer architecture, it utilizes multiplicative integration that allows flexible region-based modulation, and can thus be seen as a generalization of the successful StyleGAN network.
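To make the modulation idea concrete, here is a minimal sketch (in PyTorch, with purely illustrative names not taken from this codebase) contrasting a single global style, as in StyleGAN, with the attention-based, region-wise multiplicative modulation described above:

    import torch
    import torch.nn.functional as F

    def global_modulation(x, w, to_style):
        # StyleGAN-style: one latent w modulates every location identically.
        # x: [B, C, H, W] image features, w: [B, D] global latent, to_style: Linear(D, C)
        style = to_style(w)                                   # [B, C] per-channel scale
        return x * (1 + style[:, :, None, None])

    def region_modulation(x, latents, to_query, to_key, to_style):
        # GANsformer-style: each spatial location attends over the k latent
        # components and receives its own style, enabling region-based control.
        # x: [B, C, H, W] image features, latents: [B, k, D]
        B, C, H, W = x.shape
        queries = to_query(x.flatten(2).transpose(1, 2))      # [B, HW, D] one query per location
        keys = to_key(latents)                                # [B, k, D]
        attn = F.softmax(queries @ keys.transpose(1, 2) / keys.shape[-1] ** 0.5, dim=-1)
        styles = attn @ to_style(latents)                     # [B, HW, C] per-location style
        styles = styles.transpose(1, 2).reshape(B, C, H, W)
        return x * (1 + styles)                               # multiplicative integration

Here to_query, to_key and to_style stand in for learned linear projections (e.g. nn.Linear(C, D), nn.Linear(D, D) and nn.Linear(D, C)); in the synthesis network this kind of modulation is applied after each convolution.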

Instructions for model training and data preparation as well as pretrained models will be available soon.
Note that the code is still going through some refactoring and clean-up, and will be ready to run by March 3. Stay tuned!
(Code clean-up by March 3, all instructions by March 7, pretrained networks by March 20.)

Bibtex

@article{hudson2021gansformer,
  title={Generative Adversarial Transformers},
  author={Hudson, Drew A and Zitnick, C. Lawrence},
  journal={arXiv preprint},
  year={2021}
}

Architecture overview

The GANsformer consists of two networks:

  • Generator: produces the images (x) given randomly sampled latents (z). The latent z has a shape [batch_size, component_num, latent_dim], where component_num = 1 by default (Vanilla GAN, StyleGAN) but is > 1 for the GANsformer model. The latent components z_1,...,z_k are obtained by splitting z along the second dimension. The generator itself consists of two parts:

    • Mapping network: converts the sampled latents from a normal distribution (z) to the intermediate space (w) through a series of feed-forward layers. The k latent components are either mapped independently from the z space to the w space or interact with each other through self-attention (optional flag).
    • Synthesis network: the intermediate latents w are used to guide the generation of new images. Image features start from a small constant/sampled 4x4 grid and then go through multiple layers of convolution and up-sampling until reaching the desired resolution (e.g. 256x256). After each convolution, the image features are modulated (meaning that their variance and bias are controlled) by the intermediate latent vectors w. While in the StyleGAN model there is one global w vector that controls all the features equally, the GANsformer uses attention so that the k latent components specialize to control different regions of the image and create it cooperatively, and therefore performs better, especially in generating images depicting multi-object scenes.
    • Attention can be used in several ways (a short sketch follows this list):
      • Simplex Attention: when attention is applied in one direction only from the latents to the image features (top-down).
      • Duplex Attention: when attention is applied in both directions: latents to image features (top-down) and then image features back to latents (bottom-up), so that each representation informs the other iteratively.
      • Self Attention between latents: can also be used to enable direct interactions between the latents.
      • Self Attention between image features (SAGAN model): prior approaches applied attention directly between the image features, but this method does not scale well due to the quadratic cost in the number of features, which becomes very high at high resolutions.
  • Discriminator: receives an image and has to predict whether it is real or fake – originating from the dataset or the generator. The model performs multiple layers of convolution and downsampling on the image, reducing the representation's resolution gradually until making the final prediction. Optionally, attention can be incorporated into the discriminator as well, in which case it maintains multiple (k) aggregator variables that use attention to adaptively collect information from the image while it is processed. We observe small improvements in model performance when attention is used in the discriminator, although, based on our observations, most of the gain from attention arises in the generator.
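The following minimal sketch (PyTorch; single attention head, shared width for latents and features, no normalization – all simplifying assumptions, with hypothetical class names) illustrates the simplex and duplex variants from the list above:

    import torch
    import torch.nn as nn

    def attend(queries, keys, values):
        # Scaled dot-product attention (single head).
        scores = queries @ keys.transpose(1, 2) / keys.shape[-1] ** 0.5
        return torch.softmax(scores, dim=-1) @ values

    class SimplexAttention(nn.Module):
        # One direction only: the target is updated with information from the source.
        def __init__(self, dim):
            super().__init__()
            self.to_q = nn.Linear(dim, dim)
            self.to_k = nn.Linear(dim, dim)
            self.to_v = nn.Linear(dim, dim)
        def forward(self, target, source):
            return target + attend(self.to_q(target), self.to_k(source), self.to_v(source))

    class DuplexAttention(nn.Module):
        # Both directions: latents first aggregate image information (bottom-up),
        # then the refined latents modulate the image features (top-down).
        def __init__(self, dim):
            super().__init__()
            self.latents_from_features = SimplexAttention(dim)
            self.features_from_latents = SimplexAttention(dim)
        def forward(self, features, latents):
            latents = self.latents_from_features(latents, features)   # Y <- X
            features = self.features_from_latents(features, latents)  # X <- Y
            return features, latents

    # Example shapes: a 32x32 feature grid flattened to N=1024 positions, k=16 latents.
    features = torch.randn(2, 1024, 64)
    latents = torch.randn(2, 16, 64)
    features, latents = DuplexAttention(64)(features, latents)

In the duplex case the latents first collect information from the image features and the refined latents then update the features, matching the bidirectional chaining described above.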

Codebase

This codebase builds on top of and extends the great StyleGAN2 repository by Karras et al.
The GANsformer model can also be seen as a generalization of StyleGAN: while StyleGAN has one global latent vector that controls the style of all image features globally, the GANsformer has k latent vectors that cooperate through attention to control regions within the image, and thereby better models images of multi-object and compositional scenes.
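As a small illustrative example of this latent layout (variable names are placeholders; the shape convention [batch_size, component_num, latent_dim] is the one described in the architecture overview):

    import torch

    batch_size, component_num, latent_dim = 4, 16, 32

    # StyleGAN-like setting: a single global latent (component_num = 1).
    z_global = torch.randn(batch_size, 1, latent_dim)

    # GANsformer setting: k latent components that cooperate through attention.
    z = torch.randn(batch_size, component_num, latent_dim)
    z_components = z.unbind(dim=1)   # z_1, ..., z_k, each of shape [batch_size, latent_dim]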

More documentation and instructions will be coming soon!

Comments
  • Do you have any plans to export a pytorch version?

    Hi, I am not too familiar with tensorflow... If there are no such plans currently, do you have quick pointers to:

    1. the GANsformer model, especially where and how you deal with the latents (based on your paper, you split the latents?)
    2. what kind of optimizer are you using, and how did you implement it? Is it similar to what we do in NLP (warmup, etc.)?
    3. did you ever try using the standard feed-forward after your duplex attention layer instead of 3x3? Did it still work?

    Thanks again for your kind attention! Best,

    opened by MultiPath 12
  • Some Errors On Training

    Thank you for your great work. I appreciate it a lot.

    I just tried to train a model with your codes, however there are lots of undefined variables used. For example:

    https://github.com/dorarad/gansformer/blob/148f72964219f8ead2621204bc5cfa89200b6879/training/network.py#L795

    It throws an undefined-variable error for 'maps_in'. When I fix that with a constant, I get another error from

    https://github.com/dorarad/gansformer/blob/148f72964219f8ead2621204bc5cfa89200b6879/training/network.py#L811

    again gen_mod and gen_cond are not defined. When I fix that with a constant again, I get another error which says:

    gansformer-main/gansformer-main/training/network.py", line 1127, in G_synthesis
        grid_poses = get_positional_embeddings(resolution_log2, pos_dim or dlatent_size, pos_type, pos_directions_num, init = pos_init, **_kwargs)
    TypeError: get_positional_embeddings() got an unexpected keyword argument 'label_size'

    Am I missing something, or is there a problem?

    opened by yilmazkorkmz 10
  • CLEVR pretrained model gives FID 22

    Hi, kudos for great work!

    I've just noticed that with the recommended preprocessing and evaluation, the metrics on gdrive:cityscapes work as expected (FID ~5.2), while for CLEVR exactly the same two lines:

    python prepare_data.py --clevr --max-images 100000
    python run_network.py --eval --gpus 0 --expname clevr-exp --dataset clevr --pretrained-pkl gdrive:clevr-snapshot.pkl
    

    give ~22 FID, not 9.2. Can you please double-check if the provided snapshot is correct? Or am I missing smth here?

    Thanks in advance!

    opened by JanRocketMan 8
  • kernel error in generate.py

    In a Python 3.7, tensorflow-gpu=1.15.0, CUDA 10.0 and cuDNN 7.5 environment I get this error in generate.py, which appeared to require cuDNN 7.6.5, which in turn brings a different error (see second part). Any advice?

    ... Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file

    ........... Total 35894608

    Generate images... 0%| | 0/8 [00:01<?, ?image (1 batches of 8 images)/s]
    Traceback (most recent call last):
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
        return fn(*args)
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
        target_list, run_metadata)
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'FusedBiasAct' used by {{node Gs/_Run/Gs/G_mapping/AttLayer_0/FusedBiasAct}} with these attrs: [gain=1, T=DT_FLOAT, axis=1, alpha=0, grad=0, act=1]
    Registered devices: [CPU, XLA_CPU, XLA_GPU]
    Registered kernels: device='GPU'; T in [DT_HALF] device='GPU'; T in [DT_FLOAT]

         [[Gs/_Run/Gs/G_mapping/AttLayer_0/FusedBiasAct]]
    

    CUDNN7.6.5 error .... Total 35894608

    Generate images... 0%| | 0/8 [00:01<?, ?image (1 batches of 8 images)/s]
    Traceback (most recent call last):
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
        return fn(*args)
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
        target_list, run_metadata)
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
    (0) Internal: cudaErrorNoKernelImageForDevice [[{{node Gs/_Run/Gs/G_mapping/global/Dense0_0/FusedBiasAct}}]] [[Gs/_Run/Gs/maps_out/_3151]]
    (1) Internal: cudaErrorNoKernelImageForDevice [[{{node Gs/_Run/Gs/G_mapping/global/Dense0_0/FusedBiasAct}}]]
    0 successful operations. 0 derived errors ignored.

    opened by yaseryacoob 8
  • About the Duplex attention

    Hi, Thanks for sharing the code!

    I have a few questions about Section 3.1.2. Duplex attention.

    1. I am confused by the notation in the section. For example, in this section, "Y=(K^{P\times d}, V^{P\times d}), where the values store the content of the Y variables (e.g. the randomly sampled latents for the case of GAN)". Does it mean that V^{P\times d} is sampled from the original variable Y? how to set the number of P in your code?

    2. "keys track the centroids of the attention-based assignments from X to Y, which can be computed as K=a_b(Y, X)", does it mean K is calculated by using the self-attention module but with (Y, X) as input? If so, how to understand “the keys track the centroid of the attention-based assignments from X to Y”? BTW, how to get the centroids?

    3. For the update rule in duplex attention, what does the a() function mean? Does it denote a self-attention module like a_b() in Section 3.1.1, where X as query, K as keys, and V as values, if so, K is calculated from another self-attention module as mentioned in question 2, so the output of a_b(Y, X) will be treated as Keys, so the update rule contains two self-attention operations? is that right? Does it mean ’Duplex‘ attention?

    4. But finally I find I may be wrong when I read the last paragraph in this section. As mentioned in this section, "to support bidirectional interaction between elements, we can chain two reciprocal simplex attentions from X to Y and from Y to X, obtaining the duplex attention" So, does it mean, first, we calculate the Y by using a simplex attention module u^a(Y, X), and then use this Y as input of u^d(X, Y) to update X? Does it mean the duplex attention module contains three self-attention operations?

    Thanks a lot! :)

    opened by AndrewChiyz 7
  • FID VQ-GAN

    Thank you for open-sourcing your code :)

    I was wondering about the generally very high FID values for the VQGAN. In the VQGAN paper, they report on, e.g., FFHQ 256x256 an FID of 11.4, whereas you report 63.1... Any idea why they are so different?

    Thanks!

    opened by xl-sr 7
  • PyTorch implementation generates same image samples

    Hi, I'm getting the same output image samples (see below) when I train the PyTorch implementation on FFHQ from scratch. The only changes I made (due to some memory issues mentioned in #33) were adding --batch-gpu 1 and removing saving attention map functionality (commenting out pytorch_version/training/visualize.py lines 167-206).

    python run_network.py --train --gpus 0 --batch-gpu 1 --ganformer-default --expname ffhq-scratch --dataset ffhq 000120 000240

    opened by kwhuang88228 6
  • Metrics PR Error

    Dear authors,

    Thank you for your wonderful contribution!!!

    When I tried to get precision and recall values during training by adding option, --metric pr, I got the following error


    \precision_recall.py", line 179, in _evaluate
        feats = self._gen_feats(Gs, inception, minibatch_size, num_gpus, Gs_kwargs)
    NameError: name 'inception' is not defined

    So, I have changed the lines in precision_recall.py. After the modification, It seems to work. I would greatly appreciate it if you kindly review my modification.


    def _evaluate(self, Gs, Gs_kwargs, num_gpus, num_imgs, paths = None, **kwargs):

           if paths is not None: 
               # Extract features for local sample image files (paths)
    ----->  eval_features = self._paths_to_feats(paths, feat_func, minibatch_size, num_gpus, num_imgs)
           else:
               # Extract features for newly generated fake imgs
    ----->  eval_features = self._gen_feats(Gs, feature_net, minibatch_size, num_imgs, num_gpus, Gs_kwargs)
    
           # Compute precision and recall
           state = knn_precision_recall_features(ref_features = ref_features, eval_features = eval_features,
               feature_net = feature_net, nhood_sizes = [self.nhood_size], row_batch_size = self.row_batch_size,
    ----->  col_batch_size = self.row_batch_size, num_gpus = num_gpus, num_imgs = num_imgs)
           self._report_result(state.knn_precision[0], suffix = "_precision")
           self._report_result(state.knn_recall[0], suffix = "_recall")
    
    -------------------------------------------------------------------------
    
    opened by bwhwang 6
  • Memory issue when training 1024 resolution

    I'm trying to train on a 1024x1024 dataset on a V100 GPU. I tried both the tensorflow version and the pytorch version. Despite setting batch-gpu to 1, the tensorflow version always runs out of system RAM (after the first tick; total system RAM 51GB), and the pytorch version always runs out of CUDA memory (before the first tick).

    Here are my training settings:

    python run_network.py --train --metrics 'none' --gpus 0 --batch-gpu 1 --resolution 1024 \
     --ganformer-default --expname art1 --dataset 1024art
    

    Also, I always encounter the warning: tcmalloc: large alloc

    opened by BlueberryGin 5
  • Issues with docker

    Hi,

    I'm trying to dockerize using this image - tensorflow/tensorflow:1.14.0-gpu-py3.

    FROM tensorflow/tensorflow:1.14.0-gpu-py3
    
    ARG USER="test"
    ARG WORK_DIR="/home/$USER"
    
    WORKDIR $WORK_DIR
    
    RUN apt-get update && apt-get install build-essential
    
    RUN apt-get install ffmpeg libsm6 libxext6  -y
    
    RUN pip install --upgrade pip setuptools wheel
    
    COPY . ./
    
    RUN pip install -r requirements.txt
    
    RUN python generate.py --gpus 0 --model gdrive:bedrooms-snapshot.pkl --output-dir images --images-num 4
    

    However, I am getting this error:

    Downloading https://drive.google.com/uc?id=1-2L3iCBpP_cf6T2onf3zEQJFAAzxsQne .... done
    
    2021-04-06 08:32:44 UTC -- Setting up TensorFlow plugin 'upfirdn_2d.cu': Preprocessing... Compiling... Loading... bin_file:  /home/test/dnnlib/tflib/_cudacache/upfirdn_2d_1.14_.so
    
    2021-04-06 08:32:44 UTC -- Failed!
    
    2021-04-06 08:32:44 UTC -- Traceback (most recent call last):
    
    2021-04-06 08:32:44 UTC --   File "generate.py", line 49, in <module>
    
    2021-04-06 08:32:44 UTC --     main()
    
    2021-04-06 08:32:44 UTC --   File "generate.py", line 46, in main
    
    2021-04-06 08:32:44 UTC --     run(**vars(args))
    
    2021-04-06 08:32:44 UTC --   File "generate.py", line 22, in run
    
    2021-04-06 08:32:44 UTC --     G, D, Gs = load_networks(model)                             # Load pre-trained network
    
    2021-04-06 08:32:44 UTC --   File "/home/test/pretrained_networks.py", line 30, in load_networks
    
    2021-04-06 08:32:44 UTC --     G, D, Gs = pickle.load(stream, encoding = "latin1")[:3]
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/network.py", line 306, in __setstate__
    
    2021-04-06 08:32:44 UTC --     self._init_graph()
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/network.py", line 159, in _init_graph
    
    2021-04-06 08:32:44 UTC --     out_expr = self._build_func(*self.input_templates, **build_kwargs)
    
    2021-04-06 08:32:44 UTC --   File "<string>", line 2371, in G_synthesis_stylegan2
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 229, in downsample_2d
    
    2021-04-06 08:32:44 UTC --     return _simple_upfirdn_2d(x, k, down=factor, pad0=(p+1)//2, pad1=p//2, data_format=data_format, impl=impl)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 358, in _simple_upfirdn_2d
    
    2021-04-06 08:32:44 UTC --     y = upfirdn_2d(y, k, upx=up, upy=up, downx=down, downy=down, padx0=pad0, padx1=pad1, pady0=pad0, pady1=pad1, impl=impl)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 61, in upfirdn_2d
    
    2021-04-06 08:32:44 UTC --     return impl_dict[impl](x=x, k=k, upx=upx, upy=upy, downx=downx, downy=downy, padx0=padx0, padx1=padx1, pady0=pady0, pady1=pady1)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 139, in _upfirdn_2d_cuda
    
    2021-04-06 08:32:44 UTC --     return func(x)
    
    2021-04-06 08:32:44 UTC --   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/custom_gradient.py", line 162, in decorated
    
    2021-04-06 08:32:44 UTC --     return _graph_mode_decorator(f, *args, **kwargs)
    
    2021-04-06 08:32:44 UTC --   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/custom_gradient.py", line 183, in _graph_mode_decorator
    
    2021-04-06 08:32:44 UTC --     result, grad_fn = f(*args)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 131, in func
    
    2021-04-06 08:32:44 UTC --     y = _get_plugin().up_fir_dn2d(x=x, k=kc, upx=upx, upy=upy, downx=downx, downy=downy, padx0=padx0, padx1=padx1, pady0=pady0, pady1=pady1)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 14, in _get_plugin
    
    2021-04-06 08:32:44 UTC --     return custom_ops.get_plugin(os.path.splitext(__file__)[0] + '.cu')
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/custom_ops.py", line 162, in get_plugin
    
    2021-04-06 08:32:44 UTC --     plugin = tf.load_op_library(bin_file)
    
    2021-04-06 08:32:44 UTC --   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
    
    2021-04-06 08:32:44 UTC --     lib_handle = py_tf.TF_LoadLibrary(library_filename)
    
    2021-04-06 08:32:44 UTC -- tensorflow.python.framework.errors_impl.NotFoundError: /home/test/dnnlib/tflib/_cudacache/upfirdn_2d_1.14_.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
    
    2021-04-06 08:32:44 UTC -- error building image: error building stage: failed to execute command: waiting for process to exit: exit status 1
    

    Please help to check and advise. Thanks!

    opened by arsyad-ah 5
  • Cannot utilize multiple CPU cores

    Hi-

    Thank you for making such a fascinating project available here!

    I'm trying to run ganformer within a conda environment, but am having problems getting ganformer to utilize multiple CPU cores.

    Using Ubuntu 20.04. Here is the setup for the conda environment used:

    conda create --name cuda10 python=3.7
    conda activate cuda10
    conda install tensorflow-gpu=1.14
    conda install pillow h5py requests tqdm termcolor seaborn
    pip install opencv-python lmdb gdown easydict
    

    To run it

    python gansformer/run_network.py --train --pretrained-pkl None --gpus 0,1 --ganformer-default --expname myDS_256 --dataset myDS --data-dir /data/myDS_256_tf --keep-samples --metrics none --result-dir training_runs/256_c1/ --num-threads 24 --minibatch-size 16
    

    Everything seems to be running correctly, there are no errors or crashes. The only problem is slow training initialization and low GPU utilization during training. System Monitor shows that only one CPU core is used at a time, so I'm guessing this is the cause of both issues. Do you have any ideas of what might be causing the restriction to a single CPU core?

    I always try to avoid raising an issue when something obvious might be wrong on my end, but this is my first time using conda so it might be that I'm simply using it incorrectly, or that I'm using your program incorrectly. I appreciate your patience if that is the case.

    Thank you for your attention to this issue!

    opened by abstractdonut 4
  • question on duplex attention (k means) code

    First, thank you for this amazing work!

    I am suspecting that an indentation is missing at the following position of the code:

    https://github.com/dorarad/gansformer/blob/3a9efa4545be25604b70560b7f491ec3633c14a3/pytorch_version/training/networks.py#L784

    The reason it raises my suspicion is that, if the code is executed as it is, it seems like the actual key values (to_tensor) are never involved in the computation of the attention scores when k-means is enabled. If I am mistaken, would you mind explaining why line 787 replaces the original attention scores with the values computed here (where the embedding "to_centroids" seems to be initialized as a mapping of the queries)?

    opened by nintendops 0
  • Training wont work, needs tensor.contrib which was removed in tf version 1.14

    When running: python3 run_network.py --train --ganformer-default --expname test --dataset plant --eval-images-num 10000 The following error appears:

    I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA. To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2022-10-11 14:56:30.661744: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
    2022-10-11 14:56:30.690985: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
    2022-10-11 14:56:31.202500: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64
    2022-10-11 14:56:31.202557: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64
    2022-10-11 14:56:31.202565: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
    Traceback (most recent call last):
      File "/home/ali/gansformer/run_network.py", line 15, in <module>
        import pretrained_networks
      File "/home/ali/gansformer/pretrained_networks.py", line 4, in <module>
        import dnnlib.tflib as tflib
      File "/home/ali/gansformer/dnnlib/tflib/__init__.py", line 1, in <module>
        from . import autosummary
      File "/home/ali/gansformer/dnnlib/tflib/autosummary.py", line 23, in <module>
        from . import tfutil
      File "/home/ali/gansformer/dnnlib/tflib/tfutil.py", line 9, in <module>
        import tensorflow.contrib  # requires TensorFlow 1.x!
    ModuleNotFoundError: No module named 'tensorflow.contrib'

    opened by AliMezher18 0
  • Hosting models on Hugging Face

    Hello! Thank you for open-sourcing this work, this is amazing 😊 I was wondering if you'd be interested in mirroring the pretrained model weights over on the Hugging Face model hub. I'm sure our community would love to see your work, and (among other things) hosting checkpoints on the Hub helps a lot with discoverability. We've got a guide here on how to upload models, but I'm also happy to help out with it if you'd like!

    opened by NimaBoscarino 0
  • Ganformer2

    Thanks for your brilliant work on GANsformer and GANsformer2! May I ask if there is a rough timeline for when the GANsformer2 model will be released? Thanks for your time!

    opened by yangkang98 0
Releases(v1.5.2)
  • v1.5.2(Feb 2, 2022)

    Official Implementation of the Generative Adversarial Transformers paper, in both PyTorch and TensorFlow, for image and compositional scene generation. The codebase supports training, evaluation, image sampling, and a variety of visualizations.

    Updates for version 1.5.2 (Feb 22, 2022): We updated the weight initialization of the PyTorch version to the intended scale, leading to a substantial improvement in the model's learning speed.

    Source code(tar.gz)
    Source code(zip)
  • v1.0(Mar 17, 2021)

    Official Implementation of the Generative Adversarial Transformers paper for image and compositional scene generation. The codebase supports training, evaluation, image sampling, and a variety of visualizations.

    Source code(tar.gz)
    Source code(zip)
Owner
Drew Arad Hudson