This repo contains the implementation of YOLOv2 in Keras with the TensorFlow backend.

Overview

YOLOv2 in Keras and Applications

This repo contains the implementation of YOLOv2 in Keras with the TensorFlow backend. It supports training the YOLOv2 network with various backends such as MobileNet and InceptionV3. Links to demo applications are shown below. Check out https://experiencor.github.io/yolo_demo/demo.html for a Raccoon Detector demo that runs entirely in the browser with DeepLearn.js and a MobileNet backend (it somehow breaks on Windows). The source code of this demo is located at https://git.io/vF7vG.

Todo list:

  • Warmup training
  • Raccoon detection, Self-driving car, and Kangaroo detection
  • SqueezeNet, MobileNet, InceptionV3, and ResNet50 backends
  • Support python 2.7 and 3.6
  • Multiple-GPU training
  • Multiscale training
  • mAP Evaluation

Some example applications (click for videos):

Raccoon detection

Dataset => https://github.com/experiencor/raccoon_dataset

Kangaroo detection

Dataset => https://github.com/experiencor/kangaroo

Self-driving Car

Dataset => http://cocodataset.org/#detections-challenge2017

Red blood cell detection

Dataset => https://github.com/cosmicad/dataset

Hand detection

Dataset => http://cvrr.ucsd.edu/vivachallenge/index.php/hands/hand-detection/

Usage for python code

0. Requirement

python 2.7

keras >= 2.0.8

imgaug

1. Data preparation

Download the Raccoon dataset from https://github.com/experiencor/raccoon_dataset.

Organize the dataset into 4 folders:

  • train_image_folder <= the folder that contains the train images.

  • train_annot_folder <= the folder that contains the train annotations in VOC format.

  • valid_image_folder <= the folder that contains the validation images.

  • valid_annot_folder <= the folder that contains the validation annotations in VOC format.

There is a one-to-one correspondence by file name between images and annotations. If the validation set is empty, the training set is automatically split into a training set and a validation set using a ratio of 0.8.
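
The automatic split described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes that 0.8 is the fraction kept for training, and the helper name split_train_valid, the fixed seed, and the file names are made up for the example.

```python
import random


def split_train_valid(items, train_ratio=0.8, seed=0):
    """Shuffle a copy of items and split it into (train, valid) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is reproducible
    cut = int(train_ratio * len(shuffled))  # number of items kept for training
    return shuffled[:cut], shuffled[cut:]


# Hypothetical annotation file names, only for illustration.
annotations = ["raccoon-%d.xml" % i for i in range(1, 11)]
train, valid = split_train_valid(annotations)
print(len(train), len(valid))
```

With 10 annotations and a ratio of 0.8, 8 end up in the training list and 2 in the validation list.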

2. Edit the configuration file

The configuration file is a json file, which looks like this:

{
    "model" : {
        "architecture":         "Full Yolo",    # "Tiny Yolo" or "Full Yolo" or "MobileNet" or "SqueezeNet" or "Inception3"
        "input_size":           416,
        "anchors":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
        "max_box_per_image":    10,        
        "labels":               ["raccoon"]
    },

    "train": {
        "train_image_folder":   "/home/andy/data/raccoon_dataset/images/",
        "train_annot_folder":   "/home/andy/data/raccoon_dataset/anns/",      
          
        "train_times":          10,             # the number of times to cycle through the training set, useful for small datasets
        "pretrained_weights":   "",             # specify the path of the pretrained weights, but it's fine to start from scratch
        "batch_size":           16,             # the number of images to read in each batch
        "learning_rate":        1e-4,           # the base learning rate of the default Adam rate scheduler
        "nb_epoch":             50,             # number of epochs
        "warmup_epochs":        3,              # the number of initial epochs during which the sizes of the 5 boxes in each cell are forced to match the sizes of the 5 anchors; this trick seems to improve precision empirically

        "object_scale":         5.0 ,           # determine how much to penalize wrong prediction of confidence of object predictors
        "no_object_scale":      1.0,            # determine how much to penalize wrong prediction of confidence of non-object predictors
        "coord_scale":          1.0,            # determine how much to penalize wrong position and size predictions (x, y, w, h)
        "class_scale":          1.0,            # determine how much to penalize wrong class prediction

        "debug":                true            # turn on/off the line that prints current confidence, position, size, class losses and recall
    },

    "valid": {
        "valid_image_folder":   "",
        "valid_annot_folder":   "",

        "valid_times":          1
    }
}

The model section defines the type of model to construct as well as other model parameters such as the input image size and the list of anchors. The labels setting lists the labels to be trained on. Only images that contain at least one of the listed labels are fed to the network; the remaining images are simply ignored. In this way, a Dog Detector can easily be trained on the VOC or COCO dataset by setting labels to ['dog'].
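
As a rough illustration of the label filtering described above, the sketch below keeps only the objects of a VOC-style annotation whose name is in the labels list. It is not the repository's parser: the function name objects_with_labels and the sample XML are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def objects_with_labels(voc_xml, labels):
    """Return (name, xmin, ymin, xmax, ymax) for objects whose name is in labels."""
    root = ET.fromstring(voc_xml)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        if name not in labels:
            continue  # objects with unlisted labels are dropped
        bndbox = obj.find("bndbox")
        coords = tuple(int(float(bndbox.findtext(key)))
                       for key in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((name,) + coords)
    return boxes


# Hypothetical annotation with one dog and one cat.
SAMPLE = """<annotation>
  <filename>img1.jpg</filename>
  <object><name>dog</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
  <object><name>cat</name>
    <bndbox><xmin>5</xmin><ymin>5</ymin><xmax>50</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""

dog_boxes = objects_with_labels(SAMPLE, ["dog"])
bird_boxes = objects_with_labels(SAMPLE, ["bird"])
print(dog_boxes, bird_boxes)
```

An image whose filtered list is empty (as with "bird" here) would be the kind of image that is skipped during training.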

Download pretrained weights for backend (tiny yolo, full yolo, squeezenet, mobilenet, and inceptionV3) at:

https://drive.google.com/drive/folders/10oym4eL2RxJa0gro26vzXK__TtYOP5Ng

These weights must be put in the root folder of the repository. They are the pretrained weights for the backend only and will be loaded during model creation. The code does not work without these weights.

The pretrained weights for the whole model (both frontend and backend) of the raccoon detector can be downloaded at:

https://drive.google.com/drive/folders/10oym4eL2RxJa0gro26vzXK__TtYOP5Ng

These weights can be used as the pretrained weights for any one-class object detector.

3. Generate anchors for your dataset (optional)

python gen_anchors.py -c config.json

Copy the generated anchors printed on the terminal to the anchors setting in config.json.
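
To make the anchors setting easier to read, the sketch below groups the flat list from the sample configuration into (width, height) pairs and converts them to pixels. It assumes the usual YOLOv2 convention that anchors are expressed in grid-cell units, and a 13 x 13 grid for a 416 x 416 input; the helper names are made up for the example.

```python
# Anchor values copied from the sample configuration above.
ANCHORS = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843,
           5.47434, 7.88282, 3.52778, 9.77052, 9.16828]


def anchor_pairs(flat):
    """Group a flat [w0, h0, w1, h1, ...] list into (w, h) tuples."""
    return list(zip(flat[0::2], flat[1::2]))


def to_pixels(pairs, input_size=416, grid=13):
    """Scale anchors from grid-cell units to pixels (assumed convention)."""
    cell = input_size / float(grid)  # 32 pixels per cell for 416 / 13
    return [(w * cell, h * cell) for w, h in pairs]


pairs = anchor_pairs(ANCHORS)
pixel_anchors = to_pixels(pairs)
for (w, h), (pw, ph) in zip(pairs, pixel_anchors):
    print("%.3f x %.3f cells -> %.1f x %.1f px" % (w, h, pw, ph))
```

Ten numbers therefore describe 5 anchor boxes, which matches the 5 boxes per cell mentioned in the warmup_epochs comment.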

4. Start the training process

python train.py -c config.json

By the end of this process, the code will write the weights of the best model to the file best_weights.h5 (or whatever name is specified in the "saved_weights_name" setting in config.json). The training process stops when the loss on the validation set has not improved for 3 consecutive epochs.
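
The stopping rule above can be sketched in plain Python. This is a simplified illustration with made-up validation losses, not the training script itself; Keras provides EarlyStopping and ModelCheckpoint callbacks for this kind of logic, but how the repository configures them is not shown here.

```python
def early_stopping(val_losses, patience=3):
    """Return (best_epoch, last_epoch_run) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs. Epochs are numbered from 1.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # This is the point where the best weights would be written to disk.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses)


# Hypothetical losses: the best value is at epoch 3, then three worse epochs.
result = early_stopping([1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.6])
print(result)  # (3, 6)
```

In this example the improvement at epoch 7 is never reached, because training has already stopped after epoch 6.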

5. Perform detection using trained weights on an image by running

python predict.py -c config.json -w /path/to/best_weights.h5 -i /path/to/image/or/video

It carries out detection on the image and writes the image with detected bounding boxes to the same folder.

Usage for jupyter notebook

Refer to the notebook (https://github.com/experiencor/basic-yolo-keras/blob/master/Yolo%20Step-by-Step.ipynb) for a complete walk-through implementation of YOLOv2 from scratch (training, testing, and scoring).

Evaluation of the current implementation:

Train       Test      mAP (with this implementation)  mAP (on released weights)
COCO train  COCO val  28.6                            42.1

The code to evaluate detection results can be found at https://github.com/experiencor/basic-yolo-keras/issues/27.

Copyright

See LICENSE for details.

Comments
  • Quick Questions

    Hello

    Are you using multiscale training of the data? Also, do you have pretrained weights on VOC data? Below is the image of a blood smear: 111

    I want to detect the purple and red cells. I have done the annotations. I only have 300 images, with 15-20 annotations per image. What do you recommend?

    opened by akshaylamba 32
  • Difficulty in training multiple classes....

    ValueError: Cannot feed value of shape (30,) for Tensor u'Placeholder_41:0', which has shape '(35,)'

    I get this error when I try to train two classes starting from the tiny_yolo_raccoon.h5 pretrained weights. Training for one class is not a problem, but with two classes this is the error I get. I want to use it for traffic sign detection and classification with more than 10 classes; can I do that? Also, if anyone has weights for multi-class training, please share them, it would be very helpful.

    Can anyone help. Thanks in advance.
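
The two sizes in the error message are consistent with the usual YOLOv2 output layout, in which each of the 5 boxes per cell carries 4 coordinates, 1 confidence score, and one score per class. The sketch below only does that arithmetic; the function name is made up for the example, and it is an assumption (not confirmed in this thread) that the mismatch comes from loading one-class weights into a two-class model.

```python
def values_per_cell(nb_box, nb_class):
    """Number of output values per grid cell: nb_box * (x, y, w, h, conf, classes)."""
    return nb_box * (4 + 1 + nb_class)


one_class = values_per_cell(5, 1)    # e.g. a raccoon-only detector
two_classes = values_per_cell(5, 2)  # the same network with two labels
print(one_class, two_classes)        # 30 35
```

Under that assumption, the final layer of a one-class model holds 30 values per cell and cannot be filled from, or into, a 35-value layer.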

    opened by Dhagash4 27
  • Understand how to start

    Hi :) My goal is to understand how I can apply fine tuning to such neural network, and so I want to run your code and play with it ;) My problem is that deep learning is a new topic for my studies, so I have some problems to understand each part of your software. I have read the YOLO paper and I have an idea about how it works. I have prepared dataset, as you suggest in the section "Usage for python Code", but my first question is:

    • Who generates annotations from images in VOC format? Are annotations only labels for images? What is their meaning?

    Once the annotations are generated, I have to generate anchors. Does your gen_anchors script update the anchors row in the conf.json file? Thank you so much for your response!

    opened by frasab 25
  • local variable epoch_logs is not assigned

    Hi @experiencor, I always get the same error and I can't solve it. The error is UnboundLocalError: local variable 'epoch_logs' referenced before assignment

    How can I solve it?

    opened by AhmedAAkl 22
  • YOLO version 3 equivalent in Keras

    Can anybody help me build the YOLOv3 architecture in Keras? Here is the link; I think it's a residual network: https://github.com/pjreddie/darknet/blob/master/cfg/yolov3.cfg @experiencor

    opened by hiba007 19
  • No module named expat; use SimpleXMLTreeBuilder instead error

    Hi, When I run python train.py -c config.json. The following error appears:

    Using TensorFlow backend.
    Traceback (most recent call last):
      File "train.py", line 140, in <module>
        _main_(args)
      File "train.py", line 76, in _main_
        config['model']['labels'])
      File "/home/julio_jcgc/yolo_tutorial/basic-yolo-keras/preprocessing.py", line 21, in parse_annotation
        tree = ET.parse(ann_dir + ann)
      File "/opt/bitnami/python/lib/python2.7/xml/etree/ElementTree.py", line 1182, in parse
        tree.parse(source, parser)
      File "/opt/bitnami/python/lib/python2.7/xml/etree/ElementTree.py", line 651, in parse
        parser = XMLParser(target=TreeBuilder())
      File "/opt/bitnami/python/lib/python2.7/xml/etree/ElementTree.py", line 1476, in __init__
        "No module named expat; use SimpleXMLTreeBuilder instead"
    ImportError: No module named expat; use SimpleXMLTreeBuilder instead
    

    Thank you.

    opened by jcgarciaca 18
  • Load VOC 2007+2012 weights

    I am trying to load the VOC YOLOv2 weights from the yolo website. I am working in the jupyter notebook provided in this repository. Here are the modified parameters:

    LABELS = ['Person', 'Car', 'Bicycle', 'Bus', 'Motorbike', 'Train', 'Aeroplane', 'Chair', 'Bottle', 'Dining Table', 'Potted Plant', 'TV/Monitor', 'Sofa', 'Bird', 'Cat', 'Cow', 'Dog', 'Horse', 'Sheep']
    
    IMAGE_H, IMAGE_W = 416, 416
    GRID_H,  GRID_W  = 13 , 13
    BOX              = 5
    CLASS            = len(LABELS)
    CLASS_WEIGHTS    = np.ones(CLASS, dtype='float32')
    OBJ_THRESHOLD    = 0.3#0.5
    NMS_THRESHOLD    = 0.3#0.45
    ANCHORS          = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828]
    
    NO_OBJECT_SCALE  = 1.0
    OBJECT_SCALE     = 5.0
    COORD_SCALE      = 1.0
    CLASS_SCALE      = 1.0
    
    BATCH_SIZE       = 16
    WARM_UP_BATCHES  = 0
    TRUE_BOX_BUFFER  = 50
    

    Here is the model summary:

    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    input_5 (InputLayer)            (None, 416, 416, 3)  0                                            
    __________________________________________________________________________________________________
    conv_0 (Conv2D)                 (None, 416, 416, 32) 864         input_5[0][0]                    
    __________________________________________________________________________________________________
    batch_norm_0 (BatchNormalizatio (None, 416, 416, 32) 128         conv_0[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_45 (LeakyReLU)      (None, 416, 416, 32) 0           batch_norm_0[0][0]               
    __________________________________________________________________________________________________
    max_pooling2d_11 (MaxPooling2D) (None, 208, 208, 32) 0           leaky_re_lu_45[0][0]             
    __________________________________________________________________________________________________
    conv_1 (Conv2D)                 (None, 208, 208, 64) 18432       max_pooling2d_11[0][0]           
    __________________________________________________________________________________________________
    batch_norm_1 (BatchNormalizatio (None, 208, 208, 64) 256         conv_1[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_46 (LeakyReLU)      (None, 208, 208, 64) 0           batch_norm_1[0][0]               
    __________________________________________________________________________________________________
    max_pooling2d_12 (MaxPooling2D) (None, 104, 104, 64) 0           leaky_re_lu_46[0][0]             
    __________________________________________________________________________________________________
    conv_2 (Conv2D)                 (None, 104, 104, 128 73728       max_pooling2d_12[0][0]           
    __________________________________________________________________________________________________
    batch_norm_2 (BatchNormalizatio (None, 104, 104, 128 512         conv_2[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_47 (LeakyReLU)      (None, 104, 104, 128 0           batch_norm_2[0][0]               
    __________________________________________________________________________________________________
    conv_3 (Conv2D)                 (None, 104, 104, 64) 8192        leaky_re_lu_47[0][0]             
    __________________________________________________________________________________________________
    batch_norm_3 (BatchNormalizatio (None, 104, 104, 64) 256         conv_3[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_48 (LeakyReLU)      (None, 104, 104, 64) 0           batch_norm_3[0][0]               
    __________________________________________________________________________________________________
    conv_4 (Conv2D)                 (None, 104, 104, 128 73728       leaky_re_lu_48[0][0]             
    __________________________________________________________________________________________________
    batch_norm_4 (BatchNormalizatio (None, 104, 104, 128 512         conv_4[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_49 (LeakyReLU)      (None, 104, 104, 128 0           batch_norm_4[0][0]               
    __________________________________________________________________________________________________
    max_pooling2d_13 (MaxPooling2D) (None, 52, 52, 128)  0           leaky_re_lu_49[0][0]             
    __________________________________________________________________________________________________
    conv_5 (Conv2D)                 (None, 52, 52, 256)  294912      max_pooling2d_13[0][0]           
    __________________________________________________________________________________________________
    batch_norm_5 (BatchNormalizatio (None, 52, 52, 256)  1024        conv_5[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_50 (LeakyReLU)      (None, 52, 52, 256)  0           batch_norm_5[0][0]               
    __________________________________________________________________________________________________
    conv_6 (Conv2D)                 (None, 52, 52, 128)  32768       leaky_re_lu_50[0][0]             
    __________________________________________________________________________________________________
    batch_norm_6 (BatchNormalizatio (None, 52, 52, 128)  512         conv_6[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_51 (LeakyReLU)      (None, 52, 52, 128)  0           batch_norm_6[0][0]               
    __________________________________________________________________________________________________
    conv_7 (Conv2D)                 (None, 52, 52, 256)  294912      leaky_re_lu_51[0][0]             
    __________________________________________________________________________________________________
    batch_norm_7 (BatchNormalizatio (None, 52, 52, 256)  1024        conv_7[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_52 (LeakyReLU)      (None, 52, 52, 256)  0           batch_norm_7[0][0]               
    __________________________________________________________________________________________________
    max_pooling2d_14 (MaxPooling2D) (None, 26, 26, 256)  0           leaky_re_lu_52[0][0]             
    __________________________________________________________________________________________________
    conv_8 (Conv2D)                 (None, 26, 26, 512)  1179648     max_pooling2d_14[0][0]           
    __________________________________________________________________________________________________
    batch_norm_8 (BatchNormalizatio (None, 26, 26, 512)  2048        conv_8[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_53 (LeakyReLU)      (None, 26, 26, 512)  0           batch_norm_8[0][0]               
    __________________________________________________________________________________________________
    conv_9 (Conv2D)                 (None, 26, 26, 256)  131072      leaky_re_lu_53[0][0]             
    __________________________________________________________________________________________________
    batch_norm_9 (BatchNormalizatio (None, 26, 26, 256)  1024        conv_9[0][0]                     
    __________________________________________________________________________________________________
    leaky_re_lu_54 (LeakyReLU)      (None, 26, 26, 256)  0           batch_norm_9[0][0]               
    __________________________________________________________________________________________________
    conv_10 (Conv2D)                (None, 26, 26, 512)  1179648     leaky_re_lu_54[0][0]             
    __________________________________________________________________________________________________
    batch_norm_10 (BatchNormalizati (None, 26, 26, 512)  2048        conv_10[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_55 (LeakyReLU)      (None, 26, 26, 512)  0           batch_norm_10[0][0]              
    __________________________________________________________________________________________________
    conv_11 (Conv2D)                (None, 26, 26, 256)  131072      leaky_re_lu_55[0][0]             
    __________________________________________________________________________________________________
    batch_norm_11 (BatchNormalizati (None, 26, 26, 256)  1024        conv_11[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_56 (LeakyReLU)      (None, 26, 26, 256)  0           batch_norm_11[0][0]              
    __________________________________________________________________________________________________
    conv_12 (Conv2D)                (None, 26, 26, 512)  1179648     leaky_re_lu_56[0][0]             
    __________________________________________________________________________________________________
    batch_norm_12 (BatchNormalizati (None, 26, 26, 512)  2048        conv_12[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_57 (LeakyReLU)      (None, 26, 26, 512)  0           batch_norm_12[0][0]              
    __________________________________________________________________________________________________
    max_pooling2d_15 (MaxPooling2D) (None, 13, 13, 512)  0           leaky_re_lu_57[0][0]             
    __________________________________________________________________________________________________
    conv_13 (Conv2D)                (None, 13, 13, 1024) 4718592     max_pooling2d_15[0][0]           
    __________________________________________________________________________________________________
    batch_norm_13 (BatchNormalizati (None, 13, 13, 1024) 4096        conv_13[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_58 (LeakyReLU)      (None, 13, 13, 1024) 0           batch_norm_13[0][0]              
    __________________________________________________________________________________________________
    conv_14 (Conv2D)                (None, 13, 13, 512)  524288      leaky_re_lu_58[0][0]             
    __________________________________________________________________________________________________
    batch_norm_14 (BatchNormalizati (None, 13, 13, 512)  2048        conv_14[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_59 (LeakyReLU)      (None, 13, 13, 512)  0           batch_norm_14[0][0]              
    __________________________________________________________________________________________________
    conv_15 (Conv2D)                (None, 13, 13, 1024) 4718592     leaky_re_lu_59[0][0]             
    __________________________________________________________________________________________________
    batch_norm_15 (BatchNormalizati (None, 13, 13, 1024) 4096        conv_15[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_60 (LeakyReLU)      (None, 13, 13, 1024) 0           batch_norm_15[0][0]              
    __________________________________________________________________________________________________
    conv_16 (Conv2D)                (None, 13, 13, 512)  524288      leaky_re_lu_60[0][0]             
    __________________________________________________________________________________________________
    batch_norm_16 (BatchNormalizati (None, 13, 13, 512)  2048        conv_16[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_61 (LeakyReLU)      (None, 13, 13, 512)  0           batch_norm_16[0][0]              
    __________________________________________________________________________________________________
    conv_17 (Conv2D)                (None, 13, 13, 1024) 4718592     leaky_re_lu_61[0][0]             
    __________________________________________________________________________________________________
    batch_norm_17 (BatchNormalizati (None, 13, 13, 1024) 4096        conv_17[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_62 (LeakyReLU)      (None, 13, 13, 1024) 0           batch_norm_17[0][0]              
    __________________________________________________________________________________________________
    conv_18 (Conv2D)                (None, 13, 13, 1024) 9437184     leaky_re_lu_62[0][0]             
    __________________________________________________________________________________________________
    batch_norm_18 (BatchNormalizati (None, 13, 13, 1024) 4096        conv_18[0][0]                    
    __________________________________________________________________________________________________
    conv_20 (Conv2D)                (None, 26, 26, 64)   32768       leaky_re_lu_57[0][0]             
    __________________________________________________________________________________________________
    leaky_re_lu_63 (LeakyReLU)      (None, 13, 13, 1024) 0           batch_norm_18[0][0]              
    __________________________________________________________________________________________________
    batch_norm_20 (BatchNormalizati (None, 26, 26, 64)   256         conv_20[0][0]                    
    __________________________________________________________________________________________________
    conv_19 (Conv2D)                (None, 13, 13, 1024) 9437184     leaky_re_lu_63[0][0]             
    __________________________________________________________________________________________________
    leaky_re_lu_65 (LeakyReLU)      (None, 26, 26, 64)   0           batch_norm_20[0][0]              
    __________________________________________________________________________________________________
    batch_norm_19 (BatchNormalizati (None, 13, 13, 1024) 4096        conv_19[0][0]                    
    __________________________________________________________________________________________________
    lambda_4 (Lambda)               (None, 13, 13, 256)  0           leaky_re_lu_65[0][0]             
    __________________________________________________________________________________________________
    leaky_re_lu_64 (LeakyReLU)      (None, 13, 13, 1024) 0           batch_norm_19[0][0]              
    __________________________________________________________________________________________________
    concatenate_3 (Concatenate)     (None, 13, 13, 1280) 0           lambda_4[0][0]                   
                                                                     leaky_re_lu_64[0][0]             
    __________________________________________________________________________________________________
    conv_21 (Conv2D)                (None, 13, 13, 1024) 11796480    concatenate_3[0][0]              
    __________________________________________________________________________________________________
    batch_norm_21 (BatchNormalizati (None, 13, 13, 1024) 4096        conv_21[0][0]                    
    __________________________________________________________________________________________________
    leaky_re_lu_66 (LeakyReLU)      (None, 13, 13, 1024) 0           batch_norm_21[0][0]              
    __________________________________________________________________________________________________
    conv2d_3 (Conv2D)               (None, 13, 13, 120)  123000      leaky_re_lu_66[0][0]             
    __________________________________________________________________________________________________
    reshape_3 (Reshape)             (None, 13, 13, 5, 24 0           conv2d_3[0][0]                   
    __________________________________________________________________________________________________
    input_4 (InputLayer)            (None, 1, 1, 1, 50,  0                                            
    __________________________________________________________________________________________________
    lambda_5 (Lambda)               (None, 13, 13, 5, 24 0           reshape_3[0][0]                  
                                                                     input_4[0][0]                    
    ==================================================================================================
    Total params: 50,670,936
    Trainable params: 50,650,264
    Non-trainable params: 20,672
    __________________________________________________________________________________________________
    

    I "successfully" read the weights (successfully as in no errors):

    weight_reader = WeightReader('yolov2-voc.weights')
    
    for index in range(conv_count):
        conv_layer = model.get_layer('conv_%i' % index)
        norm_layer = model.get_layer('batch_norm_%i' % index)
        
        size = np.prod(norm_layer.get_weights()[0].shape) # get product of shape (total values)
        
        # read sizes
        beta  = weight_reader.read(size)
        gamma = weight_reader.read(size)
        mean  = weight_reader.read(size)
        var   = weight_reader.read(size)
        
        norm_layer.set_weights([gamma, beta, mean, var])
        
        if len(conv_layer.get_weights()) > 1: 
            bias   = weight_reader.read(np.prod(conv_layer.get_weights()[1].shape)) 
            kernel = weight_reader.read(np.prod(conv_layer.get_weights()[0].shape))
            kernel = kernel.reshape(list(reversed(conv_layer.get_weights()[0].shape))) 
            kernel = kernel.transpose([2, 3, 1, 0])
            conv_layer.set_weights([kernel, bias])
        else:
            kernel = weight_reader.read(np.prod(conv_layer.get_weights()[0].shape))
            kernel = kernel.reshape(list(reversed(conv_layer.get_weights()[0].shape)))
            kernel = kernel.transpose([2, 3, 1, 0])
            conv_layer.set_weights([kernel])
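
The reshape and transpose in the loop above can be checked on a small array. The sketch below assumes that the Darknet file stores each convolution kernel flat in (out_channels, in_channels, height, width) order and that Keras expects (height, width, in_channels, out_channels); the function name is made up for the example, and the reversed-shape trick only lines up for square kernels.

```python
import numpy as np


def darknet_to_keras_kernel(flat, keras_shape):
    """Reorder flat Darknet-style weights into a Keras-style kernel.

    flat:        1-D array assumed to be in (out, in, h, w) order
    keras_shape: target shape (h, w, in, out), with h == w
    """
    kernel = np.asarray(flat).reshape(tuple(reversed(keras_shape)))  # (out, in, w, h)
    return kernel.transpose([2, 3, 1, 0])                            # -> (h, w, in, out)


# Toy 3x3 kernel with 2 input channels and 4 filters, filled with 0..71.
keras_shape = (3, 3, 2, 4)
flat = np.arange(np.prod(keras_shape))
kernel = darknet_to_keras_kernel(flat, keras_shape)

# Element [row, col, in, out] must come from flat index out*18 + in*9 + row*3 + col.
print(kernel.shape, kernel[1, 2, 1, 3])
```

If the printed element matches the expected flat index (68 here), the axis order is handled consistently with that assumption.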
    

    When I try to run the following code, no boxes are created:

    img = cv2.imread("dog-cycle-car.png")
    img = cv2.resize(img, (416, 416)) # resize to the input dimension
    img = img / 255
    img = img[..., ::-1] # .transpose((2, 0, 1))  # BGR -> RGB | H X W C -> C X H X W 
    img_input = np.array([img])
    
    dummy_array = np.zeros((1, 1, 1, 1, TRUE_BOX_BUFFER, 4))
    
    test_prediction = model.predict([img_input, dummy_array])
    boxes = decode_netout(test_prediction[0],
                          obj_threshold=OBJ_THRESHOLD,
                          nms_threshold=NMS_THRESHOLD,
                          anchors=ANCHORS, 
                          nb_class=CLASS)
    img = draw_boxes(img, boxes, labels=LABELS)
    img.shape
    plt.imshow(img)
    boxes
    

    I assume that there must be an issue with how I am loading the weights because the summary of the model looks the same and that is the only thing that I have modified in the notebook. I would be super grateful if someone could shine some light on to why this may be happening and what I am doing wrong.

    Edit

    If it would be helpful to see my jupyter notebook just ask for a link.

    opened by zoecarver 17
  • train on raccoon

    Hi, I want to train the full YOLO on the raccoon dataset with one GTX 1080. First, please see my config.json:

    {
        "model" : {
            "architecture":         "Full Yolo",
            "input_size":           416,
            "anchors":            [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
            "max_box_per_image":    20,        
            "labels":               ["raccoon"]
    
        },
    
        "train": {
            "train_image_folder":   "/home/mm/Detection-keras/basic-yolo-keras/raccoon_dataset/images/train/",
            "train_annot_folder":   "/home/mm/Detection-keras/basic-yolo-keras/raccoon_dataset/annotations/train/",     
              
            "train_times":          10,
            "pretrained_weights":   "",
            "batch_size":           12,
            "learning_rate":        1e-4,
            "nb_epoch":             50,
            "warmup_epochs":        3,
    
            "object_scale":         5.0 ,
            "no_object_scale":      1.0,
            "coord_scale":          1.0,
            "class_scale":          1.0,
    
            "saved_weights_name":   "full_yolo_raccoon.h5",
            "debug":                true
        },
    
        "valid": {
            "valid_image_folder":   "/home/mm/Detection-keras/basic-yolo-keras/raccoon_dataset/images/val/",
            "valid_annot_folder":   "/home/mm/Detection-keras/basic-yolo-keras/raccoon_dataset/annotations/val/",
    
            "valid_times":          1
        }
    }
    

    Now I train. Epoch 1: the current recall is above 98%. http://uupload.ir/files/jv2c_screenshot_from_2018-03-14_21-54-08.png

    Epoch 1, a couple of iterations later: recall drops to zero. http://uupload.ir/files/pxfl_screenshot_from_2018-03-14_21-54-27.png

    Epoch 3, end of training: recall is still 0. http://uupload.ir/files/i2qp_screenshot_from_2018-03-14_21-57-35.png

    When I tested on raccoon images, no objects were found. http://uupload.ir/files/sgms_screenshot_from_2018-03-14_22-21-59.png

    Why? Where is the problem?

    opened by PythonImageDeveloper 17
  • Support single channel images


    Hi, it seems that your script always assumes that the image has 3 channels:

    input_image = Input(shape=(self.input_size, self.input_size, 3))

    It would be nice if you changed the script so that it also supports single-channel images.

    Thanks for the nice work!
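
    A minimal sketch of what this request could look like, assuming the channel count became a hypothetical `channels` argument instead of the hard-coded 3. The second helper, `gray_to_rgb`, is an equally hypothetical workaround: it replicates the single channel three times so that an unchanged 3-channel model can still consume grayscale input. Plain nested lists stand in for image arrays here; neither function exists in the repo.

```python
def input_shape(input_size, channels=3):
    """Shape tuple for a square input; `channels` is a hypothetical parameter."""
    if channels not in (1, 3):
        raise ValueError("expected 1 (grayscale) or 3 (RGB) channels")
    return (input_size, input_size, channels)


def gray_to_rgb(gray):
    """Replicate each grayscale pixel into three identical channels."""
    return [[[px, px, px] for px in row] for row in gray]


print(input_shape(416, channels=1))
print(gray_to_rgb([[0, 255], [128, 64]]))
```

    The tuple returned by `input_shape` is what would be passed as the `shape` argument of the `Input` layer quoted above.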

    enhancement 
    opened by thorstenwagner 17
  • Issues in training on own dataset


    I've made some changes to train the model on my own dataset (4 classes). The model architecture is built, but while training epoch 1 the following error occurs:

    Epoch 1/100000
    Traceback (most recent call last):
      File "train.py", line 137, in <module>
        main(args)
      File "train.py", line 133, in main
        debug = config['train']['debug'])
      File "/home/bhanu/Yolo-Keras/frontend.py", line 447, in train
        max_queue_size = 8)
      File "/usr/local/lib/python3.5/dist-packages/Keras-2.1.1-py3.5.egg/keras/legacy/interfaces.py", line 87, in wrapper
      File "/usr/local/lib/python3.5/dist-packages/Keras-2.1.1-py3.5.egg/keras/engine/training.py", line 2114, in fit_generator
      File "/usr/local/lib/python3.5/dist-packages/Keras-2.1.1-py3.5.egg/keras/engine/training.py", line 1826, in train_on_batch
      File "/usr/local/lib/python3.5/dist-packages/Keras-2.1.1-py3.5.egg/keras/engine/training.py", line 1411, in _standardize_user_data
      File "/usr/local/lib/python3.5/dist-packages/Keras-2.1.1-py3.5.egg/keras/engine/training.py", line 153, in _standardize_input_data
    ValueError: Error when checking target: expected lambda_2 to have shape (None, 11, 11, 5, 9) but got array with shape (1, 11, 11, 5, 6)

    Please help me with this.
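
    One way to read the two shapes in the error, assuming the usual YOLOv2 target layout in which the last axis holds 4 box coordinates, 1 confidence value and one score per class. Under that assumption a depth of 9 corresponds to 4 classes and a depth of 6 to a single class, which would suggest that the model and the batch generator were built from different label lists. The helper names below are hypothetical and not part of the repo.

```python
def target_depth(nb_class):
    """Last-axis size of a YOLOv2 target: x, y, w, h, confidence, then classes."""
    return 4 + 1 + nb_class


def classes_from_depth(depth):
    """Invert target_depth to recover the class count a tensor was built for."""
    return depth - 5


expected_depth = 9  # from "expected lambda_2 to have shape (None, 11, 11, 5, 9)"
received_depth = 6  # from "got array with shape (1, 11, 11, 5, 6)"

print(classes_from_depth(expected_depth))  # classes the model was built for
print(classes_from_depth(received_depth))  # classes the target array encodes
```

    If the two numbers differ, a reasonable first check is whether the "labels" list in the config matches the class names that actually occur in the annotation files.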

    opened by bhanu223 16
  • Issue in training on Raccoon dataset


    Hi. I have a problem when training with the Raccoon dataset. Training always stops a few minutes after I start it. I have retrained many times but the result does not change.

    This is my config:

    {
        "model" : {
            "architecture":         "Tiny Yolo",
            "input_size":           416,
            "anchors":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
            "max_box_per_image":    10,        
            "labels":               ["raccoon"]
        },
    
        "train": {
            "train_image_folder":   "../raccoon_dataset/train/images/",
            "train_annot_folder":   "../raccoon_dataset/train/annotations/",     
              
            "train_times":          10,
            "pretrained_weights":   "",
            "batch_size":           16,
            "learning_rate":        1e-4,
            "nb_epoch":             50,
            "warmup_epochs":        0,
    
            "object_scale":         5.0 ,
            "no_object_scale":      1.0,
            "coord_scale":          1.0,
            "class_scale":          1.0,
    
            "saved_weights_name":   "tiny_yolo_raccoon_save.h5",
            "debug":                false
        },
    
        "valid": {
            "valid_image_folder":   "../raccoon_dataset/valid/images/",
            "valid_annot_folder":   "../raccoon_dataset/valid/annotations/",
            "valid_times":          1
        }
    }
    
    
    Epoch 1/50
     10/100 [==>...........................] - ETA: 1:11 - loss: 11.5398
    Epoch 00001: val_loss improved from inf to 5.79008, saving model to tiny_yolo_raccoon_save.h5
     10/100 [==>...........................] - ETA: 1:22 - loss: 11.5398 - val_loss: 0.0000e+00
    Epoch 2/50
     10/100 [==>...........................] - ETA: 44s - loss: 9.3070
    Epoch 00002: val_loss improved from 5.79008 to 5.75633, saving model to tiny_yolo_raccoon_save.h5
     10/100 [==>...........................] - ETA: 55s - loss: 9.3070 - val_loss: 0.0000e+00
    Epoch 3/50
     10/100 [==>...........................] - ETA: 44s - loss: 6.9476
    Epoch 00003: val_loss improved from 5.75633 to 4.91244, saving model to tiny_yolo_raccoon_save.h5
     10/100 [==>...........................] - ETA: 55s - loss: 6.9476 - val_loss: 0.0000e+00
    Epoch 4/50
     10/100 [==>...........................] - ETA: 44s - loss: 4.7331
    Epoch 00004: val_loss improved from 4.91244 to 4.88484, saving model to tiny_yolo_raccoon_save.h5
     10/100 [==>...........................] - ETA: 55s - loss: 4.7331 - val_loss: 0.0000e+00
    Epoch 5/50
     10/100 [==>...........................] - ETA: 44s - loss: 3.8645
    Epoch 00005: val_loss did not improve
     10/100 [==>...........................] - ETA: 53s - loss: 3.8645 - val_loss: 0.0000e+00
    Epoch 6/50
     10/100 [==>...........................] - ETA: 44s - loss: 3.8343
    Epoch 00006: val_loss did not improve
     10/100 [==>...........................] - ETA: 52s - loss: 3.8343 - val_loss: 0.0000e+00
    Epoch 7/50
     10/100 [==>...........................] - ETA: 44s - loss: 3.4981
    Epoch 00007: val_loss did not improve
     10/100 [==>...........................] - ETA: 53s - loss: 3.4981 - val_loss: 0.0000e+00
    Epoch 00007: early stopping
    

    It trains in just a few minutes? That seems too fast! Is something wrong here? The val_loss is 0.0000e+00, yet no boxes are found when I predict. I also have a question: why does the progress bar only reach 10/100 in each epoch? I have never seen it reach 100/100. Thank you so much. I hope someone can reply; I have tried many times but the results are not good :(
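
    The log ends with "early stopping" after three consecutive epochs in which val_loss did not improve. A minimal pure-Python sketch of patience-based early stopping reproduces that behavior, assuming a patience of 3 and a minimum improvement of 0.001. Both values are assumptions, and the function is a hypothetical stand-in for the Keras callback. The first four losses are taken from the log; the last three are made up, because the log does not show them.

```python
def early_stop_epoch(val_losses, patience=3, min_delta=0.001):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Improvement: remember the new best value and reset the counter.
            best = loss
            wait = 0
        else:
            # No improvement: stop once `patience` such epochs have accumulated.
            wait += 1
            if wait >= patience:
                return epoch
    return None


# First four values come from the log above; the last three are hypothetical.
losses = [5.79008, 5.75633, 4.91244, 4.88484, 4.90, 4.95, 4.89]
print(early_stop_epoch(losses))
```

    Under these assumptions the run stops at epoch 7, regardless of the configured "nb_epoch" of 50.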

    opened by khiemntu 14
  • How to train my Custom Dataset with this notebook


    Hi, I have been trying for a long time to train my own dataset with this notebook. I have faced many issues and solved some of them. First, my dataset is organized as follows:

    images folder ----> training images of size 640x512
    labels folder ----> one text file per image above, with data in the following form:

    0 311.489379882812 204.399459838867 36.0547180175781 24.1059265136719

    i.e. class label, x, y, w, h (these values are not normalized).
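
    A minimal sketch of converting one label line of this kind into the corner coordinates (xmin, ymin, xmax, ymax) used by VOC-style annotations. It assumes that x and y are the box center in pixels; if they are the top-left corner instead, the arithmetic changes. The function name and the `labels` list are hypothetical.

```python
def label_line_to_voc(line, labels, img_w=640, img_h=512):
    """Convert 'class x y w h' (pixels, x/y assumed to be the box center)
    into a dict with VOC-style integer corner coordinates."""
    cls, x, y, w, h = line.split()
    x, y, w, h = float(x), float(y), float(w), float(h)
    # Clamp to the image so rounding never produces out-of-range corners.
    xmin = max(0, int(round(x - w / 2)))
    ymin = max(0, int(round(y - h / 2)))
    xmax = min(img_w, int(round(x + w / 2)))
    ymax = min(img_h, int(round(y + h / 2)))
    return {"name": labels[int(cls)],
            "xmin": xmin, "ymin": ymin, "xmax": xmax, "ymax": ymax}


line = "0 311.489379882812 204.399459838867 36.0547180175781 24.1059265136719"
box = label_line_to_voc(line, labels=["object"])
print(box)
```

    The resulting dictionary holds the fields a VOC XML `object` entry needs; writing the XML itself is left out of this sketch.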

    I have modified preprocessing.py to output the dataset in the form required by the notebook, but I still get this error:

    c:\Users\Usman\anaconda3\envs\tf-gpu\lib\site-packages\numpy\core\fromnumeric.py:43 _wrapit
        result = getattr(asarray(obj), method)(*args, **kwds)
    
    ValueError: cannot reshape array of size 10 into shape (1,1,1,1,2)
    

    and I can't seem to find where my error is.

    I have been stuck on this for about 4 days now; please help.
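
    One possible reading of the error, offered as an assumption rather than a diagnosis: the notebook's loss reshapes the anchors with np.reshape(ANCHORS, [1,1,1,BOX,2]), so a flat list of 10 anchor values only fits when BOX is 5. The message "cannot reshape array of size 10 into shape (1,1,1,1,2)" would be consistent with BOX having been set to 1. The check below uses plain Python, and `check_anchor_count` is a hypothetical helper.

```python
def check_anchor_count(anchors, box):
    """Return True when a flat anchor list can be reshaped to (1, 1, 1, box, 2)."""
    return len(anchors) == box * 2


# Anchor values as they appear in the config files quoted in other issues.
ANCHORS = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843,
           5.47434, 7.88282, 3.52778, 9.77052, 9.16828]

print(check_anchor_count(ANCHORS, box=1))
print(check_anchor_count(ANCHORS, box=5))
```

    If the check fails, either BOX or the anchor list would need to change so that the list holds exactly one (width, height) pair per box.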

    opened by 316usman 0
  • Bump tensorflow-gpu from 1.3 to 2.9.3


    Bumps tensorflow-gpu from 1.3 to 2.9.3.

    Release notes

    Sourced from tensorflow-gpu's releases.

    TensorFlow 2.9.3

    Release 2.9.3

    This release introduces several vulnerability fixes:

    TensorFlow 2.9.2

    Release 2.9.2

    This release introduces several vulnerability fixes:

    ... (truncated)

    Changelog

    Sourced from tensorflow-gpu's changelog.

    Release 2.9.3

    This release introduces several vulnerability fixes:

    Release 2.8.4

    This release introduces several vulnerability fixes:

    ... (truncated)

    Commits
    • a5ed5f3 Merge pull request #58584 from tensorflow/vinila21-patch-2
    • 258f9a1 Update py_func.cc
    • cd27cfb Merge pull request #58580 from tensorflow-jenkins/version-numbers-2.9.3-24474
    • 3e75385 Update version numbers to 2.9.3
    • bc72c39 Merge pull request #58482 from tensorflow-jenkins/relnotes-2.9.3-25695
    • 3506c90 Update RELEASE.md
    • 8dcb48e Update RELEASE.md
    • 4f34ec8 Merge pull request #58576 from pak-laura/c2.99f03a9d3bafe902c1e6beb105b2f2417...
    • 6fc67e4 Replace CHECK with returning an InternalError on failing to create python tuple
    • 5dbe90a Merge pull request #58570 from tensorflow/r2.9-7b174a0f2e4
    • Additional commits viewable in compare view


    dependencies 
    opened by dependabot[bot] 0
  • tf.Print is deprecated


    tf.Print is deprecated so I get a Tensorflow warning about replacing it with tf.print, which unfortunately doesn't work either for some reason. After a bit of online research I think I get how tf.print works on Tensorflow 2, but I'm still confused about its function on Tensorflow 1. I run the Yolo_Step_by_Step.ipynb with Tensorflow 1.x. Has anyone been able to make these prints work?
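
    A possible pattern for TensorFlow 1.x graph mode, offered as a sketch rather than a confirmed fix for this notebook. Unlike tf.Print, tf.print does not pass its input through; in graph mode it returns an operation, so nothing is printed unless that operation is made a dependency of a tensor that actually gets evaluated. The helper name `print_through` is hypothetical; tf.print, tf.control_dependencies and tf.identity are existing TensorFlow functions, with tf.print available only in later 1.x releases.

```python
def print_through(tensor, values, message=""):
    """Return `tensor` unchanged, printing `values` whenever it is evaluated.

    Hypothetical helper that mimics the pass-through behavior of tf.Print.
    """
    # Imported lazily so the sketch can be read without TensorFlow installed.
    import tensorflow as tf

    # In graph mode tf.print returns an op that must be attached to the graph.
    print_op = tf.print(message, *values, summarize=1000)
    with tf.control_dependencies([print_op]):
        # tf.identity creates a new tensor that depends on the print op.
        return tf.identity(tensor)
```

    Inside the loss function, a line such as loss = tf.Print(loss, [loss_xy], message='Loss XY \t', summarize=1000) would then become loss = print_through(loss, [loss_xy], 'Loss XY \t'). Note that tf.print writes to standard error by default, which some notebook front ends display separately from standard output.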

    opened by eirini5th 0
  • NotFoundError: 2 root error(s) found when trying to run the code on Tensorflow 2


    I am using the notebook on colab and I want to run it with TF2. However, I come across this error on calling model.fit_generator:

    NotFoundError: 2 root error(s) found.
      (0) Not found: Resource localhost/loss/lambda_3_loss/Variable/N10tensorflow3VarE does not exist.
    	 [[{{node loss/lambda_3_loss/AssignAddVariableOp}}]]
    	 [[Func/training/Adam/gradients/gradients/norm_6_1/cond_grad/StatelessIf/then/_1694/input/_4425/_2667]]
      (1) Not found: Resource localhost/loss/lambda_3_loss/Variable/N10tensorflow3VarE does not exist.
    	 [[{{node loss/lambda_3_loss/AssignAddVariableOp}}]]
    0 successful operations.
    0 derived errors ignored.
    

    The problem seems to be caused by the custom loss function, since I tried using a simple dummy loss function with no errors.

    The changes I've made (to no avail) are these 2:

    1. to include the lines

           from tensorflow.python.framework.ops import disable_eager_execution
           disable_eager_execution()

       before creating the model. Before adding these lines I came across this error:
    TypeError: Cannot convert a symbolic Keras input/output to a numpy array.
    This error may indicate that you're trying to pass a symbolic value to a NumPy call,
    which is not supported.
    Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching,
    preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
    
    2. to use tensorflow.keras instead of keras, after suggestions from similar github issues and stackoverflow posts.

    I am also adding the custom_loss code to include a few changes I made to use TF2 instead of TF1 (basically some tf.compat.v1.* additions).

    def custom_loss(y_true, y_pred):
        mask_shape = tf.shape(y_true)[:4]
        
        cell_x = tf.compat.v1.to_float(tf.reshape(tf.tile(tf.range(GRID_W), [GRID_H]), (1, GRID_H, GRID_W, 1, 1)))
        cell_y = tf.transpose(cell_x, (0,2,1,3,4))
    
        cell_grid = tf.tile(tf.concat([cell_x,cell_y], -1), [BATCH_SIZE, 1, 1, 5, 1])
        
        coord_mask = tf.zeros(mask_shape)
        conf_mask  = tf.zeros(mask_shape)
        class_mask = tf.zeros(mask_shape)
        
        seen = tf.Variable(0.)
        total_recall = tf.Variable(0.)
        
        """
        Adjust prediction
        """
        ### adjust x and y      
        pred_box_xy = tf.sigmoid(y_pred[..., :2]) + cell_grid
        
        ### adjust w and h
        pred_box_wh = tf.exp(y_pred[..., 2:4]) * np.reshape(ANCHORS, [1,1,1,BOX,2])
        
        ### adjust confidence
        pred_box_conf = tf.sigmoid(y_pred[..., 4])
        
        ### adjust class probabilities
        pred_box_class = y_pred[..., 5:]
        
        """
        Adjust ground truth
        """
        ### adjust x and y
        true_box_xy = y_true[..., 0:2] # relative position to the containing cell
        
        ### adjust w and h
        true_box_wh = y_true[..., 2:4] # number of cells across, horizontally and vertically
        
        ### adjust confidence
        true_wh_half = true_box_wh / 2.
        true_mins    = true_box_xy - true_wh_half
        true_maxes   = true_box_xy + true_wh_half
        
        pred_wh_half = pred_box_wh / 2.
        pred_mins    = pred_box_xy - pred_wh_half
        pred_maxes   = pred_box_xy + pred_wh_half       
        
        intersect_mins  = tf.maximum(pred_mins,  true_mins)
        intersect_maxes = tf.minimum(pred_maxes, true_maxes)
        intersect_wh    = tf.maximum(intersect_maxes - intersect_mins, 0.)
        intersect_areas = intersect_wh[..., 0] * intersect_wh[..., 1]
        
        true_areas = true_box_wh[..., 0] * true_box_wh[..., 1]
        pred_areas = pred_box_wh[..., 0] * pred_box_wh[..., 1]
    
        union_areas = pred_areas + true_areas - intersect_areas
        iou_scores  = tf.truediv(intersect_areas, union_areas)
        
        true_box_conf = iou_scores * y_true[..., 4]
        
        ### adjust class probabilities
        true_box_class = tf.argmax(y_true[..., 5:], -1)
        
        """
        Determine the masks
        """
        ### coordinate mask: simply the position of the ground truth boxes (the predictors)
        coord_mask = tf.expand_dims(y_true[..., 4], axis=-1) * COORD_SCALE
        
        ### confidence mask: penalize predictors + penalize boxes with low IOU
        # penalize the confidence of the boxes, which have IOU with some ground truth box < 0.6
        true_xy = true_boxes[..., 0:2]
        true_wh = true_boxes[..., 2:4]
        
        true_wh_half = true_wh / 2.
        true_mins    = true_xy - true_wh_half
        true_maxes   = true_xy + true_wh_half
        
        pred_xy = tf.expand_dims(pred_box_xy, 4)
        pred_wh = tf.expand_dims(pred_box_wh, 4)
        
        pred_wh_half = pred_wh / 2.
        pred_mins    = pred_xy - pred_wh_half
        pred_maxes   = pred_xy + pred_wh_half    
        
        intersect_mins  = tf.maximum(pred_mins,  true_mins)
        intersect_maxes = tf.minimum(pred_maxes, true_maxes)
        intersect_wh    = tf.maximum(intersect_maxes - intersect_mins, 0.)
        intersect_areas = intersect_wh[..., 0] * intersect_wh[..., 1]
        
        true_areas = true_wh[..., 0] * true_wh[..., 1]
        pred_areas = pred_wh[..., 0] * pred_wh[..., 1]
    
        union_areas = pred_areas + true_areas - intersect_areas
        iou_scores  = tf.truediv(intersect_areas, union_areas)
    
        best_ious = tf.reduce_max(iou_scores, axis=4)
        conf_mask = conf_mask + tf.compat.v1.to_float(best_ious < 0.6) * (1 - y_true[..., 4]) * NO_OBJECT_SCALE
        
        # penalize the confidence of the boxes, which are responsible for corresponding ground truth box
        conf_mask = conf_mask + y_true[..., 4] * OBJECT_SCALE
        
        ### class mask: simply the position of the ground truth boxes (the predictors)
        class_mask = y_true[..., 4] * tf.gather(CLASS_WEIGHTS, true_box_class) * CLASS_SCALE       
        
        """
        Warm-up training
        """
        no_boxes_mask = tf.compat.v1.to_float(coord_mask < COORD_SCALE/2.)
        seen = tf.compat.v1.assign_add(seen, 1.)
        
        true_box_xy, true_box_wh, coord_mask = tf.cond(tf.less(seen, WARM_UP_BATCHES), 
                              lambda: [true_box_xy + (0.5 + cell_grid) * no_boxes_mask, 
                                       true_box_wh + tf.ones_like(true_box_wh) * np.reshape(ANCHORS, [1,1,1,BOX,2]) * no_boxes_mask, 
                                       tf.ones_like(coord_mask)],
                              lambda: [true_box_xy, 
                                       true_box_wh,
                                       coord_mask])
        
        """
        Finalize the loss
        """
        nb_coord_box = tf.reduce_sum(tf.compat.v1.to_float(coord_mask > 0.0))
        nb_conf_box  = tf.reduce_sum(tf.compat.v1.to_float(conf_mask  > 0.0))
        nb_class_box = tf.reduce_sum(tf.compat.v1.to_float(class_mask > 0.0))
        
        loss_xy    = tf.reduce_sum(tf.square(true_box_xy-pred_box_xy)     * coord_mask) / (nb_coord_box + 1e-6) / 2.
        loss_wh    = tf.reduce_sum(tf.square(true_box_wh-pred_box_wh)     * coord_mask) / (nb_coord_box + 1e-6) / 2.
        loss_conf  = tf.reduce_sum(tf.square(true_box_conf-pred_box_conf) * conf_mask)  / (nb_conf_box  + 1e-6) / 2.
        loss_class = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=true_box_class, logits=pred_box_class)
        loss_class = tf.reduce_sum(loss_class * class_mask) / (nb_class_box + 1e-6)
        
        loss = loss_xy + loss_wh + loss_conf + loss_class
        
        nb_true_box = tf.reduce_sum(y_true[..., 4])
        nb_pred_box = tf.reduce_sum(tf.compat.v1.to_float(true_box_conf > 0.5) * tf.compat.v1.to_float(pred_box_conf > 0.3))
    
        """
        Debugging code
        """    
        current_recall = nb_pred_box/(nb_true_box + 1e-6)
        total_recall = tf.compat.v1.assign_add(total_recall, current_recall) 
    
        loss = tf.compat.v1.Print(loss, [tf.zeros((1))], message='Dummy Line \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [loss_xy], message='Loss XY \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [loss_wh], message='Loss WH \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [loss_conf], message='Loss Conf \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [loss_class], message='Loss Class \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [loss], message='Total Loss \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [current_recall], message='Current Recall \t', summarize=1000)
        loss = tf.compat.v1.Print(loss, [total_recall/seen], message='Average Recall \t', summarize=1000)
        
        return loss
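
    The IoU step that custom_loss performs on tensors (corners from center and size, clipped intersection, then intersection over union) can be followed on plain numbers. The sketch below is a pure-Python illustration of that arithmetic for two boxes given as (x_center, y_center, width, height); the function name is hypothetical and it is not a replacement for the tensor code above.

```python
def iou_center(box_a, box_b):
    """IoU of two boxes given as (x_center, y_center, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Corner coordinates, as in true_mins/true_maxes and pred_mins/pred_maxes.
    a_min, a_max = (ax - aw / 2, ay - ah / 2), (ax + aw / 2, ay + ah / 2)
    b_min, b_max = (bx - bw / 2, by - bh / 2), (bx + bw / 2, by + bh / 2)
    # Overlap per axis, clipped at zero when the boxes do not touch.
    inter_w = max(min(a_max[0], b_max[0]) - max(a_min[0], b_min[0]), 0.0)
    inter_h = max(min(a_max[1], b_max[1]) - max(a_min[1], b_min[1]), 0.0)
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union


# Two 2x2 boxes shifted by one unit along x: overlap 2, union 6.
print(iou_center((1.0, 1.0, 2.0, 2.0), (2.0, 1.0, 2.0, 2.0)))
```

    In the loss, the same quantity is compared against the 0.6 threshold to decide which predictions are penalized as background.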
    
    opened by eirini5th 1
Releases(v0.1)
Owner
Huynh Ngoc Anh
available for consulting jobs