3D ResNets for Action Recognition (CVPR 2018)

Overview

3D ResNets for Action Recognition

Update (2020/4/13)

We published a paper on arXiv.

Hirokatsu Kataoka, Tenga Wakamiya, Kensho Hara, and Yutaka Satoh,
"Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs",
arXiv preprint, arXiv:2004.04968, 2020.

We uploaded the pretrained models described in this paper including ResNet-50 pretrained on the combined dataset with Kinetics-700 and Moments in Time.

Update (2020/4/10)

We significantly updated our scripts. If you want to use older versions to reproduce our CVPR2018 paper, you should use the scripts in the CVPR2018 branch.

This update includes as follows:

  • Refactoring whole project
  • Supporting the newer PyTorch versions
  • Supporting distributed training
  • Supporting training and testing on the Moments in Time dataset.
  • Adding R(2+1)D models
  • Uploading 3D ResNet models trained on the Kinetics-700, Moments in Time, and STAIR-Actions datasets

Summary

This is the PyTorch code for the following papers:

Hirokatsu Kataoka, Tenga Wakamiya, Kensho Hara, and Yutaka Satoh,
"Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs",
arXiv preprint, arXiv:2004.04968, 2020.

Kensho Hara, Hirokatsu Kataoka, and Yutaka Satoh,
"Towards Good Practice for Action Recognition with Spatiotemporal 3D Convolutions",
Proceedings of the International Conference on Pattern Recognition, pp. 2516-2521, 2018.

Kensho Hara, Hirokatsu Kataoka, and Yutaka Satoh,
"Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?",
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6546-6555, 2018.

Kensho Hara, Hirokatsu Kataoka, and Yutaka Satoh,
"Learning Spatio-Temporal Features with 3D Residual Networks for Action Recognition",
Proceedings of the ICCV Workshop on Action, Gesture, and Emotion Recognition, 2017.

This code includes training, fine-tuning and testing on Kinetics, Moments in Time, ActivityNet, UCF-101, and HMDB-51.

Citation

If you use this code or pre-trained models, please cite the following:

@inproceedings{hara3dcnns,
  author={Kensho Hara and Hirokatsu Kataoka and Yutaka Satoh},
  title={Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={6546--6555},
  year={2018},
}

Pre-trained models

Pre-trained models are available here.
All models are trained on Kinetics-700 (K), Moments in Time (M), STAIR-Actions (S), or merged datasets of them (KM, KS, MS, KMS).
If you want to finetune the models on your dataset, you should specify the following options.

r3d18_K_200ep.pth: --model resnet --model_depth 18 --n_pretrain_classes 700
r3d18_KM_200ep.pth: --model resnet --model_depth 18 --n_pretrain_classes 1039
r3d34_K_200ep.pth: --model resnet --model_depth 34 --n_pretrain_classes 700
r3d34_KM_200ep.pth: --model resnet --model_depth 34 --n_pretrain_classes 1039
r3d50_K_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 700
r3d50_KM_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 1039
r3d50_KMS_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 1139
r3d50_KS_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 800
r3d50_M_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 339
r3d50_MS_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 439
r3d50_S_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 100
r3d101_K_200ep.pth: --model resnet --model_depth 101 --n_pretrain_classes 700
r3d101_KM_200ep.pth: --model resnet --model_depth 101 --n_pretrain_classes 1039
r3d152_K_200ep.pth: --model resnet --model_depth 152 --n_pretrain_classes 700
r3d152_KM_200ep.pth: --model resnet --model_depth 152 --n_pretrain_classes 1039
r3d200_K_200ep.pth: --model resnet --model_depth 200 --n_pretrain_classes 700
r3d200_KM_200ep.pth: --model resnet --model_depth 200 --n_pretrain_classes 1039

Old pretrained models are still available here.
However, some modifications are required to use the old pretrained models in the current scripts.

Requirements

conda install pytorch torchvision cudatoolkit=10.1 -c soumith
  • FFmpeg, FFprobe

  • Python 3

Preparation

ActivityNet

  • Download videos using the official crawler.
  • Convert from avi to jpg files using util_scripts/generate_video_jpgs.py
python -m util_scripts.generate_video_jpgs mp4_video_dir_path jpg_video_dir_path activitynet
  • Add fps infomartion into the json file util_scripts/add_fps_into_activitynet_json.py
python -m util_scripts.add_fps_into_activitynet_json mp4_video_dir_path json_file_path

Kinetics

  • Download videos using the official crawler.
    • Locate test set in video_directory/test.
  • Convert from avi to jpg files using util_scripts/generate_video_jpgs.py
python -m util_scripts.generate_video_jpgs mp4_video_dir_path jpg_video_dir_path kinetics
  • Generate annotation file in json format similar to ActivityNet using util_scripts/kinetics_json.py
    • The CSV files (kinetics_{train, val, test}.csv) are included in the crawler.
python -m util_scripts.kinetics_json csv_dir_path 700 jpg_video_dir_path jpg dst_json_path

UCF-101

  • Download videos and train/test splits here.
  • Convert from avi to jpg files using util_scripts/generate_video_jpgs.py
python -m util_scripts.generate_video_jpgs avi_video_dir_path jpg_video_dir_path ucf101
  • Generate annotation file in json format similar to ActivityNet using util_scripts/ucf101_json.py
    • annotation_dir_path includes classInd.txt, trainlist0{1, 2, 3}.txt, testlist0{1, 2, 3}.txt
python -m util_scripts.ucf101_json annotation_dir_path jpg_video_dir_path dst_json_path

HMDB-51

  • Download videos and train/test splits here.
  • Convert from avi to jpg files using util_scripts/generate_video_jpgs.py
python -m util_scripts.generate_video_jpgs avi_video_dir_path jpg_video_dir_path hmdb51
  • Generate annotation file in json format similar to ActivityNet using util_scripts/hmdb51_json.py
    • annotation_dir_path includes brush_hair_test_split1.txt, ...
python -m util_scripts.hmdb51_json annotation_dir_path jpg_video_dir_path dst_json_path

Running the code

Assume the structure of data directories is the following:

~/
  data/
    kinetics_videos/
      jpg/
        .../ (directories of class names)
          .../ (directories of video names)
            ... (jpg files)
    results/
      save_100.pth
    kinetics.json

Confirm all options.

python main.py -h

Train ResNets-50 on the Kinetics-700 dataset (700 classes) with 4 CPU threads (for data loading).
Batch size is 128.
Save models at every 5 epochs. All GPUs is used for the training. If you want a part of GPUs, use CUDA_VISIBLE_DEVICES=....

python main.py --root_path ~/data --video_path kinetics_videos/jpg --annotation_path kinetics.json \
--result_path results --dataset kinetics --model resnet \
--model_depth 50 --n_classes 700 --batch_size 128 --n_threads 4 --checkpoint 5

Continue Training from epoch 101. (~/data/results/save_100.pth is loaded.)

python main.py --root_path ~/data --video_path kinetics_videos/jpg --annotation_path kinetics.json \
--result_path results --dataset kinetics --resume_path results/save_100.pth \
--model_depth 50 --n_classes 700 --batch_size 128 --n_threads 4 --checkpoint 5

Calculate top-5 class probabilities of each video using a trained model (~/data/results/save_200.pth.)
Note that inference_batch_size should be small because actual batch size is calculated by inference_batch_size * (n_video_frames / inference_stride).

python main.py --root_path ~/data --video_path kinetics_videos/jpg --annotation_path kinetics.json \
--result_path results --dataset kinetics --resume_path results/save_200.pth \
--model_depth 50 --n_classes 700 --n_threads 4 --no_train --no_val --inference --output_topk 5 --inference_batch_size 1

Evaluate top-1 video accuracy of a recognition result (~/data/results/val.json).

python -m util_scripts.eval_accuracy ~/data/kinetics.json ~/data/results/val.json --subset val -k 1 --ignore

Fine-tune fc layers of a pretrained model (~/data/models/resnet-50-kinetics.pth) on UCF-101.

python main.py --root_path ~/data --video_path ucf101_videos/jpg --annotation_path ucf101_01.json \
--result_path results --dataset ucf101 --n_classes 101 --n_pretrain_classes 700 \
--pretrain_path models/resnet-50-kinetics.pth --ft_begin_module fc \
--model resnet --model_depth 50 --batch_size 128 --n_threads 4 --checkpoint 5
Comments
  • question about the 'Temporal duration of inputs'

    question about the 'Temporal duration of inputs'

    [email protected] , in the opts.py ,whether I can change temporal duration of inputs in parser.add_argument('--sample_duration', default=16, type=int, help='Temporal duration of inputs'),like 32 frames,64 frames,etc? have you take the similar experiments? I really appreciate for your reply, Thanks.

    opened by sophiazy 25
  • Performance of pretrained weights on UCF101

    Performance of pretrained weights on UCF101

    Hi, Nice work! I have a question about your results on UCF101 split 1. I've evaluated your pretrained weight of "resnext-101-kinetics-ucf101_split1.pth" on UCF101 split 1 and got the accuracy of ~85.99. I'm wondering if it is the right accuracy or not. Would you please provide the accuracies of the pretrained models?

    opened by MohsenFayyaz89 21
  • Train from scratch on UCF101 using ResNet18 and get 10% gain without doing anything

    Train from scratch on UCF101 using ResNet18 and get 10% gain without doing anything

    I train and evaluate the code and get 10% gain without changing anything. Here is my process:

    1. Parse video data using codes from READ.ME.
    2. Train the model using python3 main.py --root_path ./datasets/ --video_path UCF101/jpg --annotation_path ucf101_01.json --result_path results --dataset ucf101 --model resnet --model_depth 18 --n_classes 101 --batch_size 16 and get the model result datasets/desults/save_200.pth.
    3. Test the dataset using python3 main.py --root_path ./datasets/ --video_path UCF101/jpg --annotation_path ucf101_01.json --result_path results --dataset ucf101 --resume_path results/save_200.pth --model resnet --model_depth 18 --n_classes 101 --batch_size 16 --no_train --test and get the result in val.json and the command window says the clip accuracy is 0.346.
    4. Get the accuracy using eval_ucf101.py to compare ucf101_01.json with val.json and the top1 accuracy for video is 52.66%, which is about 10% over 42.4% (reported in the paper).

    I only use UCF101 split_01 so there are no overlap videos in train and test data. It is a little bit strange, and is it because in the paper, the training procedure did not last for 200 epochs? My platform is Pytorch 0.4 and I only modified one place to avoid the error, which is reported in another issue.

    opened by BestJuly 16
  • RuntimeError: invalid argument 1: must be strictly positive at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/TH/generic/THTensorMath.c:2247

    RuntimeError: invalid argument 1: must be strictly positive at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/TH/generic/THTensorMath.c:2247

    Hi dear Need help....on running main.py ,everything is going well till dataset loading as shown below: model generated dataset loading [0/9537] dataset loading [1000/9537] dataset loading [2000/9537] dataset loading [3000/9537] dataset loading [4000/9537] dataset loading [5000/9537] dataset loading [6000/9537] dataset loading [7000/9537] dataset loading [8000/9537] dataset loading [9000/9537] dataset loading [0/3783] dataset loading [1000/3783] dataset loading [2000/3783] dataset loading [3000/3783] run

    error occured here:

    train at epoch 1 Traceback (most recent call last): File "/media/psrana/New Volume/chandni/HAR_3D_TU/main.py", line 139, in train_logger, train_batch_logger) File "/media/psrana/New Volume/chandni/HAR_3D_TU/train.py", line 22, in train_epoch for i, (inputs, targets) in enumerate(data_loader): File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 417, in iter return DataLoaderIter(self) File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 242, in init self._put_indices() File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 290, in _put_indices indices = next(self.sample_iter, None) File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 119, in iter for idx in self.sampler: File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 50, in iter return iter(torch.randperm(len(self.data_source)).long()) RuntimeError: invalid argument 1: must be strictly positive at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/TH/generic/THTensorMath.c:2247

    what could be the

    opened by chandnikathuria1992 12
  • Very Slow Training

    Very Slow Training

    I am training Resnet with depth 34 on the kinetics dataset, however the training procedure is not improving anything. How long does it take till the model starts improving ? I have attached a screenshot; currently I am at epoch 34 but the loss is still 5.99 and is not decreasing, and accuracy is very volatile

    selection_121

    opened by cryptedp 11
  • RuntimeError: expected a non-empty list of Tensors

    RuntimeError: expected a non-empty list of Tensors

    Traceback (most recent call last): File "main.py", line 129, in train_logger, train_batch_logger) File "/home/hareesh/Downloads/3D-ResNets-PyTorch-master/train.py", line 22, in train_epoch for i, (inputs, targets) in enumerate(data_loader): File "/home/hareesh/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 286, in next return self._process_next_batch(batch) File "/home/hareesh/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 307, in _process_next_batch raise batch.exc_type(batch.exc_msg) RuntimeError: Traceback (most recent call last): File "/home/hareesh/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 57, in _worker_loop samples = collate_fn([dataset[i] for i in batch_indices]) File "/home/hareesh/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 57, in samples = collate_fn([dataset[i] for i in batch_indices]) File "/home/hareesh/Downloads/3D-ResNets-PyTorch-master/datasets/ucf101.py", line 193, in getitem clip = torch.stack(clip, 0).permute(1, 0, 2, 3) RuntimeError: expected a non-empty list of Tensors

    Please let me know cause of this error.

    opened by hareeshdevarakonda 10
  • Input of Densenet

    Input of Densenet

    Thank you for your wonderful work. Just read the paper, it is noted that each clip contains 16 frames. I read two other papers in which the author claims that 32 frame input would be better, have you tried 32 frames input? If you trained such models, can you please release the pretrained models?

    opened by Tord-Zhang 10
  • Asking about the using of 3D ResNet on video sequence

    Asking about the using of 3D ResNet on video sequence

    Hello,

    I'm new in this kind of 3D Convolution, so, I'm trying to understand how does this works. My dataset (UNBC McMaster) is including some videos that contains sequence of frames. For each frame, we have one pain intensity level. Now, I want to use 3D ResNet to predict pain level as regression problem. So, let says we have a sequence including 32 frames, which mean I have 32 labels for this sequence. Normally, with CNN + LSTM, I will use CNN to extract features and then put it through LSTM, take the output and label of the last frame to compute loss. So, for 3D ResNet, should I take the output of the model and the label of last frame to calculate loss ?

    opened by glmanhtu 9
  • i have some problem do a test

    i have some problem do a test

    I would like to test network on UCF101 after finetune in UCF101 using pretrained kinetics you provided.

    i use resnet18. I want to get the accuracy of a video unit, not clip.

    i use instruction below

    python main.py --root_path UCF101 --video_path jpg --annotation_path ucf101_01.json --result_path test --dataset ucf101 --model resnet --model_depth 18 --n_classes 101 --batch_size 64 --n_threads 4 --pretrain_path 18result1s/save_200.pth --no_train --no_val --test --test_subset val --n_finetune_classes 101

    jpg is the same as your data directory. 18results1s/save_200.pth is pretrained on Kinetics, finetune in UCF101's network.

    it make a error.

    run dataset loading [0/3783] dataset loading [1000/3783] dataset loading [2000/3783] dataset loading [3000/3783] test test.py:42: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead. inputs = Variable(inputs, volatile=True) test.py:45: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument. outputs = F.softmax(outputs) Traceback (most recent call last): File "main.py", line 162, in test.test(test_loader, model, opt, test_data.class_names) File "test.py", line 50, in test test_results, class_names) File "test.py", line 20, in calculate_video_results 'label': class_names[locs[i]], KeyError: tensor(12)

    KeyError : tensor(12) , i change my instruction, the number in parentheses is change, maybe.

    can you help me?

    opened by lee2h 9
  • Performance of fine-tuning on UCF101

    Performance of fine-tuning on UCF101

    I downloaded the network ResNet-101 pretrained on Kinetics, and fine-tuned on UCF101 following the example script. However, I can only get 82.5 by averaging the three splits. In the paper, the authors reported 88.9. Any suggestion?

    opened by zhihuilics 9
  • Pretrain models cannot download

    Pretrain models cannot download

    Hello, I am making a demo about 3D convolution network. After reading your CVPR paper, I am happy to use your pretrain models. However, when I try to open your link, I get "The folder has been put in the recycle bin.". These pretrain models are extramely important for my program, because I don't have many GPU to train the network from scratch. Please give me a chance to use res3D convolution network...... @kenshohara

    opened by KeCh96 8
  • main.py: error: unrecognized arguments: hmdb51_3.json

    main.py: error: unrecognized arguments: hmdb51_3.json

    I'm trying to train resnet on hmdb51. So I've 3 json files. But whichever I'm passing with the --annotation_path argument, it's keep giving this error. Please help!

    opened by soumyadbanik 2
  • ABOUT ucf101_json

    ABOUT ucf101_json

    I used python script to generate three json files of UCF101, they are ucf101_01.json,ucf101_01.json,ucf101_03.json I run the main function to train resnet python main.py --root_path ./data --video_path ucf101_jpg/ --annotation_path ucf101_json/ucf101_01.json
    --result_path results --dataset ucf101 --n_classes 101 --model resnet --model_depth 50 --batch_size 64
    --n_threads 4 --checkpoint 5

    I want to know when ucf101_02.json and ucf101_03.json should be used,thank you very much!!!

    opened by theones-g 0
  • Image Resolution 112*112

    Image Resolution 112*112

    I wanted to be able to input larger image resolutions. However when I do input image size of 480*480 it takes almost 10 minutes to process a tiny 10 second clip.

    It seems when I increase image size, the model inference run-time become exponentially greater.

    There is crucial motion information being lost when I downscale my images to 112*112 and it is effecting the precision of the model on my test sets.

    Is there any alternative model or method that will allow me to proceed with larger image resolutions using the 3D-ResNet model?

    Is it practical to use 3D-CNN with input sizes of 480*480 images for video classification tasks?

    opened by darshvirbelandis 1
  • Why is opt.n_val_samples 3

    Why is opt.n_val_samples 3

    I found that when fine-tuning UCF101, split1 partition was used to verify that the number in the dataset was 11349 instead of 3783, and why was batchsize opt.batch_size // opt.n_val_samples?

    opened by YTHmamba 0
  • AssertionError when I inference

    AssertionError when I inference

    I used the r2p1d18_K_200ep.pth and finetune it on hmdb51 dataset,and when I want to use it to inference there is an AssertionError `CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --root_path /home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data --video_path hmdb51-videos/jpg --annotation_path hmdb51_1.json \

    --result_path results --dataset hmdb51 --resume_path results/save_200.pth
    --model_depth 18 --n_classes 51 --n_threads 4 --no_train --no_val --inference --output_topk 5 --inference_batch_size 1 Namespace(accimage=False, annotation_path=PosixPath('/home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data/hmdb51_1.json'), arch='resnet-18', batch_size=128, batchnorm_sync=False, begin_epoch=1, checkpoint=10, colorjitter=False, conv1_t_size=7, conv1_t_stride=1, dampening=0.0, dataset='hmdb51', dist_url='tcp://127.0.0.1:23456', distributed=False, file_type='jpg', ft_begin_module='', inference=True, inference_batch_size=1, inference_crop='center', inference_no_average=False, inference_stride=16, inference_subset='val', input_type='rgb', learning_rate=0.1, lr_scheduler='multistep', manual_seed=1, mean=[0.4345, 0.4051, 0.3775], mean_dataset='kinetics', model='resnet', model_depth=18, momentum=0.9, multistep_milestones=[50, 100, 150], n_classes=51, n_epochs=200, n_input_channels=3, n_pretrain_classes=0, n_threads=4, n_val_samples=3, nesterov=False, no_cuda=False, no_hflip=False, no_max_pool=False, no_mean_norm=False, no_std_norm=False, no_train=True, no_val=True, optimizer='sgd', output_topk=5, overwrite_milestones=False, plateau_patience=10, pretrain_path=None, resnet_shortcut='B', resnet_widen_factor=1.0, resnext_cardinality=32, result_path=PosixPath('/home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data/results'), resume_path=PosixPath('/home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data/results/save_200.pth'), root_path=PosixPath('/home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data'), sample_duration=16, sample_size=112, sample_t_stride=1, std=[0.2768, 0.2713, 0.2737], tensorboard=False, train_crop='random', train_crop_min_ratio=0.75, train_crop_min_scale=0.25, train_t_crop='random', value_scale=1, video_path=PosixPath('/home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data/hmdb51-videos/jpg'), weight_decay=0.001, wide_resnet_k=2, world_size=-1) loading checkpoint /home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data/results/save_200.pth model Traceback (most recent call last): File "main.py", line 428, in main_worker(-1, opt) File "main.py", line 345, in main_worker model = resume_model(opt.resume_path, opt.arch, model) File "main.py", line 89, in resume_model assert arch == checkpoint['arch'] AssertionError `

    opened by z369437558 0
Owner
Kensho Hara
Kensho Hara
Our CIKM21 Paper "Incorporating Query Reformulating Behavior into Web Search Evaluation"

Reformulation-Aware-Metrics Introduction This codebase contains source-code of the Python-based implementation of our CIKM 2021 paper. Chen, Jia, et a

xuanyuan14 5 Mar 05, 2022
Datasets, tools, and benchmarks for representation learning of code.

The CodeSearchNet challenge has been concluded We would like to thank all participants for their submissions and we hope that this challenge provided

GitHub 1.8k Dec 25, 2022
Privacy as Code for DSAR Orchestration: Privacy Request automation to fulfill GDPR, CCPA, and LGPD data subject requests.

Meet Fidesops: Privacy as Code for DSAR Orchestration A part of the greater Fides ecosystem. ⚡ Overview Fidesops (fee-dez-äps, combination of the Lati

Ethyca 44 Dec 06, 2022
LSTM Neural Networks for Spectroscopic Studies of Type Ia Supernovae

Package Description The difficulties in acquiring spectroscopic data have been a major challenge for supernova surveys. snlstm is developed to provide

7 Oct 11, 2022
Neural Style and MSG-Net

PyTorch-Style-Transfer This repo provides PyTorch Implementation of MSG-Net (ours) and Neural Style (Gatys et al. CVPR 2016), which has been included

Hang Zhang 904 Dec 21, 2022
Deep Video Matting via Spatio-Temporal Alignment and Aggregation [CVPR2021]

Deep Video Matting via Spatio-Temporal Alignment and Aggregation [CVPR2021] Paper: https://arxiv.org/abs/2104.11208 Introduction Despite the significa

76 Dec 07, 2022
Jittor Medical Segmentation Lib -- The assignment of Pattern Recognition course (2021 Spring) in Tsinghua University

THU模式识别2021春 -- Jittor 医学图像分割 模型列表 本仓库收录了课程作业中同学们采用jittor框架实现的如下模型: UNet SegNet DeepLab V2 DANet EANet HarDNet及其改动HarDNet_alter PSPNet OCNet OCRNet DL

48 Dec 26, 2022
DualGAN-tensorflow: tensorflow implementation of DualGAN

ICCV paper of DualGAN DualGAN: unsupervised dual learning for image-to-image translation please cite the paper, if the codes has been used for your re

Jack Yi 252 Nov 10, 2022
Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

CoProtector Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

Zhensu Sun 1 Oct 26, 2021
FaceAPI: AI-powered Face Detection & Rotation Tracking, Face Description & Recognition, Age & Gender & Emotion Prediction for Browser and NodeJS using TensorFlow/JS

FaceAPI AI-powered Face Detection & Rotation Tracking, Face Description & Recognition, Age & Gender & Emotion Prediction for Browser and NodeJS using

Vladimir Mandic 395 Dec 29, 2022
LaneAF: Robust Multi-Lane Detection with Affinity Fields

LaneAF: Robust Multi-Lane Detection with Affinity Fields This repository contains Pytorch code for training and testing LaneAF lane detection models i

155 Dec 17, 2022
Hub is a dataset format with a simple API for creating, storing, and collaborating on AI datasets of any size.

Hub is a dataset format with a simple API for creating, storing, and collaborating on AI datasets of any size. The hub data layout enables rapid transformations and streaming of data while training m

Activeloop 5.1k Jan 08, 2023
[CVPR 2020] Interpreting the Latent Space of GANs for Semantic Face Editing

InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing Figure: High-quality facial attributes editing results with InterFaceGA

GenForce: May Generative Force Be with You 1.3k Dec 29, 2022
A lightweight library to compare different PyTorch implementations of the same network architecture.

TorchBug is a lightweight library designed to compare two PyTorch implementations of the same network architecture. It allows you to count, and compar

Arjun Krishnakumar 5 Jan 02, 2023
Object Detection using YOLO from PyImageSearch

Object Detection using YOLO from PyImageSearch By applying object detection, you’ll not only be able to determine what is in an image, but also where

Mohamed NIANG 1 Feb 09, 2022
FB-tCNN for SSVEP Recognition

FB-tCNN for SSVEP Recognition Here are the codes of the tCNN and FB-tCNN in the paper "Filter Bank Convolutional Neural Network for Short Time-Window

Wenlong Ding 12 Dec 14, 2022
A computational block to solve entity alignment over textual attributes in a knowledge graph creation pipeline.

How to apply? Create your config.ini file following the example provided in config.ini Choose one of the options below to run: Run with Python3 pip in

Scientific Data Management Group 3 Jun 23, 2022
Monitor your ML jobs on mobile devices📱, especially for Google Colab / Kaggle

TF Watcher TF Watcher is a simple to use Python package and web app which allows you to monitor 👀 your Machine Learning training or testing process o

Rishit Dagli 54 Nov 01, 2022
Code for "Solving Graph-based Public Good Games with Tree Search and Imitation Learning"

Code for "Solving Graph-based Public Good Games with Tree Search and Imitation Learning" This is the code for the paper Solving Graph-based Public Goo

Victor-Alexandru Darvariu 3 Dec 05, 2022
Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021)

Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021) Tensorflow implementation of Bridging the Gap between Label- and Reference-ba

huangqiusheng 8 Jul 13, 2022