Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, among other methods.

Overview

An Image Captioning codebase

This is a codebase for image captioning research.

It supports self-critical (SCST) training, bottom-up features, and a number of captioning models; see the sections below and MODEL_ZOO.md for details.

A simple demo Colab notebook is available here

Requirements

  • Python 3
  • PyTorch 1.3+ (along with torchvision)
  • cider (already been added as a submodule)
  • coco-caption (already been added as a submodule) (Remember to follow initialization steps in coco-caption/README.md)
  • yacs
  • lmdbdict

Install

If you have difficulty running the training scripts in tools, you can try installing this repo as a Python package:

python -m pip install -e .

Pretrained models

Check out MODEL_ZOO.md.

If you only want to do evaluation, you can follow this section after downloading the pretrained models (and also the pretrained resnet101 or precomputed bottom-up features; see data/README.md).

Train your own network on COCO/Flickr30k

Prepare data.

We now support both flickr30k and COCO. See details in data/README.md. (Note: the later sections assume the COCO dataset; it should be trivial to switch to flickr30k.)

Start training

$ python tools/train.py --id fc --caption_model newfc --input_json data/cocotalk.json --input_fc_dir data/cocotalk_fc --input_att_dir data/cocotalk_att --input_label_h5 data/cocotalk_label.h5 --batch_size 10 --learning_rate 5e-4 --learning_rate_decay_start 0 --scheduled_sampling_start 0 --checkpoint_path log_fc --save_checkpoint_every 6000 --val_images_use 5000 --max_epochs 30

or

$ python tools/train.py --cfg configs/fc.yml --id fc

The train script will dump checkpoints into the folder specified by --checkpoint_path (default = log_$id/). By default, only the best-performing checkpoint on validation and the latest checkpoint are saved, to conserve disk space. You can also set --save_history_ckpt to 1 to save every checkpoint.

To resume training, specify the --start_from option as the path containing infos.pkl and model.pth (usually you can just set --start_from and --checkpoint_path to the same directory).
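
For example, a resume command might look like this (reusing the config-file form shown above, with --start_from added):

$ python tools/train.py --cfg configs/fc.yml --id fc --start_from log_fc --checkpoint_path log_fc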

To inspect the training or validation curves, use TensorBoard. The loss histories are automatically dumped into --checkpoint_path.

The command above uses scheduled sampling; set --scheduled_sampling_start to -1 to turn scheduled sampling off.

If you'd like to evaluate BLEU/METEOR/CIDEr scores during training in addition to the validation cross-entropy loss, use the --language_eval 1 option, but don't forget to pull the coco-caption submodule.
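
If the submodules were not cloned along with the repo, the standard git command below fetches cider and coco-caption (then follow the initialization steps in coco-caption/README.md):

$ git submodule update --init --recursive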

All arguments can also be specified in a yaml file and loaded with --cfg. Command-line arguments override the cfg file when there are conflicts.
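
As a rough sketch (assuming the yaml keys mirror the argument names in opts.py), a config equivalent to the long fc command above might contain:

    caption_model: newfc
    input_json: data/cocotalk.json
    input_fc_dir: data/cocotalk_fc
    input_att_dir: data/cocotalk_att
    input_label_h5: data/cocotalk_label.h5
    batch_size: 10
    learning_rate: 5e-4
    learning_rate_decay_start: 0
    scheduled_sampling_start: 0
    checkpoint_path: log_fc
    save_checkpoint_every: 6000
    val_images_use: 5000
    max_epochs: 30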

For more options, see opts.py.

Train using self critical

First, preprocess the dataset to build the cache used for computing CIDEr scores:

$ python scripts/prepro_ngrams.py --input_json data/dataset_coco.json --dict_json data/cocotalk.json --output_pkl data/coco-train --split train

Then copy the model pretrained with cross-entropy. (Copying is not mandatory; it just keeps a backup.)

$ bash scripts/copy_model.sh fc fc_rl

Then

$ python tools/train.py --id fc_rl --caption_model newfc --input_json data/cocotalk.json --input_fc_dir data/cocotalk_fc --input_att_dir data/cocotalk_att --input_label_h5 data/cocotalk_label.h5 --batch_size 10 --learning_rate 5e-5 --start_from log_fc_rl --checkpoint_path log_fc_rl --save_checkpoint_every 6000 --language_eval 1 --val_images_use 5000 --self_critical_after 30 --cached_tokens coco-train-idxs --max_epochs 50 --train_sample_n 5

or

$ python tools/train.py --cfg configs/fc_rl.yml --id fc_rl

You should see a large boost in CIDEr score.

A few notes on training: starting self-critical training after 30 epochs, the CIDEr score goes up to about 1.05 after 600k iterations (including the 30 epochs of pretraining).

Generate image captions

Evaluate on raw images

Note: this doesn't work for models trained with bottom-up features. Place all your images of interest into a folder, e.g. blah, and run the eval script:

$ python tools/eval.py --model model.pth --infos_path infos.pkl --image_folder blah --num_images 10

This tells the eval script to run on up to 10 images from the given folder. If you have a big GPU you can speed up the evaluation by increasing batch_size. Use --num_images -1 to process all images. The eval script will create a vis.json file inside the vis folder, which can then be visualized with the provided HTML interface:

$ cd vis
$ python -m http.server   # on Python 2: python -m SimpleHTTPServer

Now visit localhost:8000 in your browser and you should see your predicted captions.

Evaluate on Karpathy's test split

$ python tools/eval.py --dump_images 0 --num_images 5000 --model model.pth --infos_path infos.pkl --language_eval 1 

The default split to evaluate is test. The default inference method is greedy decoding (--sample_method greedy); to sample from the posterior, set --sample_method sample.

Beam Search. Beam search can improve performance over greedy decoding by roughly 5%, at the cost of slower inference. To turn on beam search, use --beam_size N, where N is greater than 1.
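
For example, the Karpathy test-split command above with beam search enabled (a beam size of 5 is just an illustrative choice):

$ python tools/eval.py --dump_images 0 --num_images 5000 --model model.pth --infos_path infos.pkl --language_eval 1 --beam_size 5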

Evaluate on COCO test set

$ python tools/eval.py --input_json cocotest.json --input_fc_dir data/cocotest_bu_fc --input_att_dir data/cocotest_bu_att --input_label_h5 none --num_images -1 --model model.pth --infos_path infos.pkl --language_eval 0

You can download the preprocessed files cocotest.json, cocotest_bu_att and cocotest_bu_fc from link.

Miscellanea

Using CPU. The code currently defaults to the GPU and there is no option to switch. If someone really needs a CPU model, please open an issue; I can potentially create a CPU checkpoint and modify eval.py to run the model on CPU. However, there is no point in using CPUs to train the model.
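
That said, for inference only, most of the change comes down to loading the checkpoint with torch.load's map_location argument. A minimal sketch (the model-construction lines are placeholders, not this repo's actual API):

    import torch

    # Load a GPU-trained checkpoint onto the CPU.
    state_dict = torch.load('model.pth', map_location='cpu')

    # Hypothetical usage: build the captioning model on CPU as usual,
    # then load the weights and switch to eval mode.
    # model = build_caption_model(opt)   # placeholder, not an actual function in this repo
    # model.load_state_dict(state_dict)
    # model.eval()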

Train on other datasets. Porting should be trivial if you can create a file like dataset_coco.json for your own dataset.
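
dataset_coco.json follows the Karpathy-split format, so an entry for a custom dataset would look roughly like the sketch below (field names assumed from that format; check scripts/prepro_labels.py for the exact fields it reads):

    {
      "dataset": "mydataset",
      "images": [
        {
          "filepath": "images",
          "filename": "0001.jpg",
          "split": "train",
          "cocoid": 1,
          "sentences": [
            {"raw": "A dog runs on the grass.",
             "tokens": ["a", "dog", "runs", "on", "the", "grass"]}
          ]
        }
      ]
    }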

Live demo. Not currently supported; pull requests are welcome.

For more advanced features:

Check out ADVANCED.md.

Reference

If you find this repo useful, please consider citing (no obligation at all):

@article{luo2018discriminability,
  title={Discriminability objective for training descriptive captions},
  author={Luo, Ruotian and Price, Brian and Cohen, Scott and Shakhnarovich, Gregory},
  journal={arXiv preprint arXiv:1803.04376},
  year={2018}
}

Of course, please also cite the original papers of the models you are using (you can find references in the model files).

Acknowledgements

Thanks to the original neuraltalk2 and the awesome PyTorch team.

Comments
  • Question about the bottom-up and top-down model

    Hi, the bottom-up and top-down paper says that 60k iterations (about 9 hours of training) are enough to reach CIDEr 120.1. I changed your model's hyperparameters to match the paper's, and changed the input of the attention LSTM to the mean of the per-bounding-box image features, but I still cannot reach the reported results. In your experience, what could be the cause?

    opened by JimLee4530 26
  • The CIDEr score gets smaller than the pretrained model

    Hi, I am trying to reproduce your code in TensorFlow. I wrote my code almost exactly as you did, but I find the CIDEr score keeps getting smaller than before, and the sampled sentences also keep getting shorter. Can you give me some advice?

    opened by wjb123 23
  • Structure loss

    Hi,

    I have a question regarding the structure loss in su+bu+structure. Is the structure-loss weight similar to the lambda in your discriminability paper, i.e. does it scale the rewards against the cross-entropy term? If so, could each of the rewards also be given a different weight?

    Thank you.

    opened by mememimis 20
  • About multi-GPU training

    Hi, when I tried to train the fc model on 2 GPUs by setting the environment variable CUDA_VISIBLE_DEVICES=0,1, I got some errors at train.py#125, but there were no errors when running your code with only one GPU.

    Does this repository support multi-GPU training?

    opened by xuyan1115 16
  • evaluation error using pre-trained model

    Hi Ruotian,

    I tried to run the following script: python eval.py --model resnet50.pth --infos_path infos.pkl --image_folder ./image/val2014_coco/ --num_images 1

    The model was resnet50 (and resnet101) downloaded from your Google Drive. But I got the error:

        Traceback (most recent call last):
          File "eval.py", line 102, in <module>
            model.load_state_dict(torch.load(opt.model))
          File "/home/wentong/miniconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 522, in load_state_dict
            .format(name))
        KeyError: 'unexpected key "conv1.weight" in state_dict'

    I searched online but there was little information about this. I guess you used multiple GPUs.

    Any advice? Thank you for your implementation.

    opened by Wentong-DST 15
  • CUDA out of memory in self-critical training but not in xe

    Hello. I am using a GTX 2080 Ti with 11GB of memory. I trained the model with cross-entropy (XE) and it works fine, but when I run self-critical training I get CUDA out of memory. How can the model fit during XE training but not during RL training? It is the same model with the same number of parameters. Any advice would help.

    opened by homelifes 12
  • Issue when training

    I am trying to train the model as per the instructions given in the repo. I am getting the error below:

    [[email protected] self-critical.pytorch]$ python tools/train.py --id fc --caption_model newfc --input_json data/cocotalk.json --input_fc_dir data/cocotalk_fc --input_att_dir data/cocotalk_att --input_label_h5 data/cocotalk_label.h5 --batch_size 10 --learning_rate 5e-4 --learning_rate_decay_start 0 --scheduled_sampling_start 0 --checkpoint_path log_fc --save_checkpoint_every 6000 --val_images_use 5000 --max_epochs 30
    Hugginface transformers not installed; please visit https://github.com/huggingface/transformers
    meshed-memory-transformer not installed; please run pip install git+https://github.com/ruotianluo/meshed-memory-transformer.git
    DataLoader loading json file: data/cocotalk.json
    vocab size is 9487
    DataLoader loading h5 file: data/cocotalk_fc data/cocotalk_att data/cocotalk_box data/cocotalk_label.h5
    max sequence length in data is 16
    read 123287 image features
    assigned 113287 images to split train
    assigned 5000 images to split val
    assigned 5000 images to split test
    /home/default/ephemeral_drive/work/image_captioning/self-critical.pytorch/captioning/data/dataloader.py:291: RuntimeWarning: Mean of empty slice.
      fc_feat = att_feat.mean(0)
    (the warning above is repeated several times)
    2020-07-27 22:49:15.536672: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
    Read data: 0.0002117156982421875
    Traceback (most recent call last):
      File "tools/train.py", line 289, in <module>
        train(opt)
      File "tools/train.py", line 182, in train
        model_out = dp_lw_model(fc_feats, att_feats, labels, masks, att_masks, data['gts'], torch.arange(0, len(data['gts'])), sc_flag, struc_flag)
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
        outputs = self.parallel_apply(replicas, inputs, kwargs)
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
        return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
        output.reraise()
      File "/usr/local/lib64/python3.6/site-packages/torch/_utils.py", line 395, in reraise
        raise self.exc_type(msg)
    StopIteration: Caught StopIteration in replica 0 on device 0.
    Original Traceback (most recent call last):
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
        output = module(*input, **kwargs)
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/default/ephemeral_drive/work/image_captioning/self-critical.pytorch/captioning/modules/loss_wrapper.py", line 45, in forward
        loss = self.crit(self.model(fc_feats, att_feats, labels[..., :-1], att_masks), labels[..., 1:], masks[..., 1:])
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/default/ephemeral_drive/work/image_captioning/self-critical.pytorch/captioning/models/CaptionModel.py", line 33, in forward
        return getattr(self, '_'+mode)(*args, **kwargs)
      File "/home/default/ephemeral_drive/work/image_captioning/self-critical.pytorch/captioning/models/AttModel.py", line 128, in _forward
        state = self.init_hidden(batch_size*seq_per_img)
      File "/home/default/ephemeral_drive/work/image_captioning/self-critical.pytorch/captioning/models/AttModel.py", line 99, in init_hidden
        weight = next(self.parameters())
    StopIteration

    PyTorch version: '1.5.0+cu101'

    I saw a pytorch - bug https://github.com/huggingface/transformers/issues/3936, which describes a similar issue. Not sure if it related.

    opened by gsrivas4 11
  • Transformer performance

    Hello, training with transformer.yml I can only reach Bleu_1: 0.749, Bleu_2: 0.584, Bleu_3: 0.443, Bleu_4: 0.336, METEOR: 0.271, ROUGE_L: 0.553, CIDEr: 1.092, which is far below the scores you report. How did you set the training parameters?

    opened by hasky123 10
  • Type error while training

    @ruotianluo Sorry again, after fixing the file problem, I got an error: Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.cuda.IntTensor instead (while checking arguments for embedding)

    run tools/train.py --cfg configs/fc_rl.yml --id fc_rl
    DataLoader loading json file: data/cocotalk.json
    vocab size is 9487
    DataLoader loading h5 file: data/cocotalk_fc data/cocotalk_att data/cocotalk_box data/cocotalk_label.h5
    max sequence length in data is 16
    read 123287 image features
    assigned 113287 images to split train
    assigned 5000 images to split val
    assigned 5000 images to split test
    Read data: 0.003994464874267578
    Save ckpt on exception ...
    model saved to ./log_fc_rl\model.pth
    Save ckpt done.
    Traceback (most recent call last):
      File "D:\Stephan\Final project\self-critical.pytorch-master\tools\train.py", line 183, in train
        model_out = dp_lw_model(fc_feats, att_feats, labels, masks, att_masks, data['gts'], torch.arange(0, len(data['gts'])), sc_flag, struc_flag)
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\parallel\data_parallel.py", line 150, in forward
        return self.module(*inputs[0], **kwargs[0])
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "D:\Stephan\Final project\self-critical.pytorch-master\captioning\modules\loss_wrapper.py", line 45, in forward
        loss = self.crit(self.model(fc_feats, att_feats, labels[..., :-1], att_masks), labels[..., 1:], masks[..., 1:])
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "D:\Stephan\Final project\self-critical.pytorch-master\captioning\models\CaptionModel.py", line 33, in forward
        return getattr(self, '_'+mode)(*args, **kwargs)
      File "D:\Stephan\Final project\self-critical.pytorch-master\captioning\models\AttModel.py", line 160, in _forward
        output, state = self.get_logprobs_state(it, p_fc_feats, p_att_feats, pp_att_feats, p_att_masks, state)
      File "D:\Stephan\Final project\self-critical.pytorch-master\captioning\models\AttModel.py", line 167, in get_logprobs_state
        xt = self.embed(it)
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\modules\sparse.py", line 114, in forward
        self.norm_type, self.scale_grad_by_freq, self.sparse)
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\functional.py", line 1484, in embedding
        return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
    RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.cuda.IntTensor instead (while checking arguments for embedding)

    opened by stephancheng 10
  • Question regarding self-critical reward

    Hi Ruotian, I'd like to ask about the nature of the reward when training with self-critical. Is it normal for it to start negative or at zero for the first few epochs? I am getting the following for the first epoch with self-critical (after training with XE for 13 epochs):

    (screenshot of the self-critical reward log omitted)

    Is this normal? And what is the maximum reward you achieved after training with self-critical?

    Also, I'm using the py3 branch. I saw there are a lot of differences between the py3 and py2 branches. So is the py3 branch reliable to use?

    Looking forward to your answer!

    opened by fawazsammani 9
  • ZeroDivisionError: division by zero

    Hi Dr. Luo, I am trying to evaluate a model, but I run into the following error.

    error info:

    python eval.py --model data/pretrain/fc/model-best.pth  --infos_path data/pretrain/fc/infos_fc-best.pkl --image_folder blah --num_images -1
    loading annotations into memory...
    0:00:00.498650
    creating index...
    index created!
    Traceback (most recent call last):
      File "eval.py", line 71, in <module>
        lang_stats = eval_utils.language_eval(opt.input_json, predictions, n_predictions, vars(opt), opt.split)
      File "/home/andrewcao95/workspace/pycharm_ws/self-critical.pytorch/eval_utils.py", line 83, in language_eval
        mean_perplexity = sum([_['perplexity'] for _ in preds_filt]) / len(preds_filt)
    ZeroDivisionError: division by zero
    
    

    source code:

    # filter results to only those in MSCOCO validation set
        preds_filt = [p for p in preds if p['image_id'] in valids]
        mean_perplexity = sum([_['perplexity'] for _ in preds_filt]) / len(preds_filt)
        mean_entropy = sum([_['entropy'] for _ in preds_filt]) / len(preds_filt)
        print('using %d/%d predictions' % (len(preds_filt), len(preds)))
        json.dump(preds_filt, open(cache_path, 'w')) # serialize to temporary json file. Sigh, COCO API...
    
        cocoRes = coco.loadRes(cache_path)
        cocoEval = COCOEvalCap(coco, cocoRes)
        cocoEval.params['image_id'] = cocoRes.getImgIds()
        cocoEval.evaluate()
    
        for metric, score in cocoEval.eval.items():
            out[metric] = score
        # Add mean perplexity
        out['perplexity'] = mean_perplexity
        out['entropy'] = mean_entropy
    
        imgToEval = cocoEval.imgToEval
        for k in list(imgToEval.values())[0]['SPICE'].keys():
            if k != 'All':
                out['SPICE_'+k] = np.array([v['SPICE'][k]['f'] for v in imgToEval.values()])
                out['SPICE_'+k] = (out['SPICE_'+k][out['SPICE_'+k]==out['SPICE_'+k]]).mean()
        for p in preds_filt:
            image_id, caption = p['image_id'], p['caption']
            imgToEval[image_id]['caption'] = caption
    
    opened by andrewcao95 8
  • How can I get my dataset image feature to scripts/prepro_feats.py

    Thanks for your work! When preparing the data, I found that I have not pre-extracted the image features for my dataset. How can I extract these image features? Could you point me to some code?

    opened by ShanZard 0
  • how to set the model ensemble?

    Hello, I saw the model ensembling method in the paper, which helps improve model performance. How can I implement model ensembling? Thanks very much.

    opened by fjqfjqfjqfjqfjqfjqfjq 0
  • transformer_nsc

    Hello, when running transformer_nsc.yml I ran into the following messages; how can I solve this?

    Hugginface transformers not installed; please visit https://github.com/huggingface/transformers
    meshed-memory-transformer not installed; please run pip install git+https://github.com/ruotianluo/meshed-memory-transformer.git
    Warning: coco-caption not available

    opened by bai-24 11
  • Running Inference using CPU only.

    I am not able to run inference with the model provided in the demo using CPU only; I get a "CUDA device not available" error. I tried removing all GPU and CUDA references in the code and replacing them with CPU, but it still does not work. It would be really helpful if you could explain how to run the demo code without a GPU.

    Thanks.

    opened by harindercnvrg 0
Releases (latest: 3.2)
  • 3.2(May 29, 2020)

    1. Faster beam search
    2. support h5 feature file
    3. allow beam search + scst (doesn't work as well though)
    4. Add a few models, BertCapModel and m2transformer (usefulness still in question)
    5. Add projects.
  • v3.1(Jan 10, 2020)

    1. Since it's 2020, py3 is officially supported. Open an issue if there is still something wrong.
    2. Finally, there is a model zoo which is relatively complete. Feel free to try the provided models.
  • 3(Dec 31, 2019)

    1. Add structure loss inspired by Classical Structured Prediction Losses for Sequence to Sequence Learning
    2. Add a function for sampling n captions, supporting the methods described in https://www.dropbox.com/s/tdqr9efrjdkeicz/iccv.pdf?dl=0.
    3. More pytorchy design of the dataloader. The dataloader no longer repeats image features according to seq_per_img; the repeating is now done in the model's forward function.
    4. Add multi-sentence sampling evaluation metrics such as mBleu and Self-CIDEr (those described in https://www.dropbox.com/s/tdqr9efrjdkeicz/iccv.pdf?dl=0).
    5. Use detectron-style config to set up experiments.
    6. A better self-critical objective (now named new_self_critical). Use config ymls ending with nsc to test it. A technical report will be out soon. Basically, it performs better than the original SCST on all metrics (by a small margin) and is also slightly faster.
  • 2.2(Jun 25, 2019)

    1. Refactor the code a little bit.
    2. Add BPE (didn't seem to make much difference).
    3. Add nucleus sampling, top-k and Gumbel-softmax sampling.
    4. Make AttEnsemble compatible with the transformer.
    5. Add "remove bad ending" from Improving Reinforcement Learning Based Image Captioning with Natural Language Prior.

  • 2.1(Jun 25, 2019)

  • 2.0.0(Apr 29, 2018)

    1. Add support for bleu4 optimization or combination of bleu4 and cider
    2. Add bottom-up feature support
    3. Add ensemble during evaluation.
    4. Add multi-gpu support.
    5. Add miscellaneous things. (box features; experimental models etc.)
  • 1.0(Apr 28, 2018)

Owner
Ruotian(RT) Luo
PhD student at TTIC