Header-only library for using Keras models in C++.

Overview

frugally-deep

Use Keras models in C++ with ease

Introduction

Would you like to build/train a model using Keras/Python? And would you like to run the prediction (forward pass) on your model in C++ without linking your application against TensorFlow? Then frugally-deep is exactly for you.

frugally-deep

  • is a small header-only library written in modern and pure C++.
  • is very easy to integrate and use.
  • depends only on FunctionalPlus, Eigen and json - also header-only libraries.
  • supports inference (model.predict) not only for sequential models but also for computational graphs with a more complex topology, created with the functional API.
  • re-implements a (small) subset of TensorFlow, i.e., the operations needed to support prediction.
  • results in a much smaller binary size than linking against TensorFlow.
  • works out-of-the-box also when compiled into a 32-bit executable. (Of course, 64 bit is fine too.)
  • utterly ignores even the most powerful GPU in your system and uses only one CPU core per prediction. ;-)
  • is nevertheless quite fast on one CPU core compared to TensorFlow, and you can run multiple predictions in parallel, thus utilizing as many CPUs as you like to improve the overall prediction throughput of your application/pipeline (see the sketch below).
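
Running several predictions in parallel needs nothing beyond standard C++ facilities. Below is a minimal sketch, assuming a converted model file named fdeep_model.json (see the usage example further down) and a stateless model, so a single const model instance can be shared between tasks; for stateful models, give each thread its own instance:

// parallel_predict.cpp -- minimal sketch of concurrent forward passes
#include <fdeep/fdeep.hpp>
#include <future>
#include <iostream>
#include <vector>
int main()
{
    const auto model = fdeep::load_model("fdeep_model.json");
    // One forward pass; predict is const, so concurrent calls only read the model.
    const auto run_one = [&model](float x)
    {
        return model.predict(
            {fdeep::tensor(fdeep::tensor_shape(static_cast<std::size_t>(4)),
            std::vector<float>{x, x + 1, x + 2, x + 3})});
    };
    // Launch several predictions concurrently, one per task.
    std::vector<std::future<fdeep::tensors>> jobs;
    for (float x = 0; x < 4; ++x)
        jobs.push_back(std::async(std::launch::async, run_one, x));
    for (auto& job : jobs)
        std::cout << fdeep::show_tensors(job.get()) << std::endl;
}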

Supported layer types

Layer types typically used in image recognition/generation are supported, making many popular model architectures possible (see Performance section).

  • Add, Concatenate, Subtract, Multiply, Average, Maximum
  • AveragePooling1D/2D, GlobalAveragePooling1D/2D
  • Bidirectional, TimeDistributed, GRU, LSTM, CuDNNGRU, CuDNNLSTM
  • Conv1D/2D, SeparableConv2D, DepthwiseConv2D
  • Cropping1D/2D, ZeroPadding1D/2D
  • BatchNormalization, Dense, Flatten, Normalization
  • Dropout, AlphaDropout, GaussianDropout, GaussianNoise
  • SpatialDropout1D, SpatialDropout2D, SpatialDropout3D
  • RandomContrast, RandomFlip, RandomHeight
  • RandomRotation, RandomTranslation, RandomWidth, RandomZoom
  • MaxPooling1D/2D, GlobalMaxPooling1D/2D
  • ELU, LeakyReLU, ReLU, SeLU, PReLU
  • Sigmoid, Softmax, Softplus, Tanh
  • Exponential, GELU, Softsign
  • UpSampling1D/2D
  • Reshape, Permute, RepeatVector
  • Embedding

Also supported

  • multiple inputs and outputs
  • nested models
  • residual connections
  • shared layers
  • variable input shapes
  • arbitrary complex model architectures / computational graphs
  • custom layers (by passing custom factory functions to load_model)

Currently not supported are the following:

ActivityRegularization, AveragePooling3D, Conv2DTranspose (why), Conv3D, ConvLSTM2D, Cropping3D, Dot, GRUCell, LocallyConnected1D, LocallyConnected2D, LSTMCell, Masking, MaxPooling3D, RNN, SimpleRNN, SimpleRNNCell, StackedRNNCells, ThresholdedReLU, UpSampling3D, temporal models

Usage

  1. Use Keras/Python to build (model.compile(...)), train (model.fit(...)) and test (model.evaluate(...)) your model as usual. Then save it to a single HDF5 file using model.save('....h5', include_optimizer=False). The image_data_format in your model must be channels_last, which is the default when using the TensorFlow backend. Models created with a different image_data_format and other backends are not supported.

  2. Now convert it to the frugally-deep file format with keras_export/convert_model.py

  3. Finally load it in C++ (fdeep::load_model(...)) and use model.predict(...) to invoke a forward pass with your data.

The following minimal example shows the full workflow:

# create_model.py
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(4,))
x = Dense(5, activation='relu')(inputs)
predictions = Dense(3, activation='softmax')(x)
model = Model(inputs=inputs, outputs=predictions)
model.compile(loss='categorical_crossentropy', optimizer='nadam')

model.fit(
    np.asarray([[1, 2, 3, 4], [2, 3, 4, 5]]),
    np.asarray([[1, 0, 0], [0, 0, 1]]), epochs=10)

model.save('keras_model.h5', include_optimizer=False)

python3 keras_export/convert_model.py keras_model.h5 fdeep_model.json

// main.cpp
#include <fdeep/fdeep.hpp>
#include <iostream>
int main()
{
    const auto model = fdeep::load_model("fdeep_model.json");
    const auto result = model.predict(
        {fdeep::tensor(fdeep::tensor_shape(static_cast<std::size_t>(4)),
        std::vector<float>{1, 2, 3, 4})});
    std::cout << fdeep::show_tensors(result) << std::endl;
}

When using convert_model.py, a test case (input and corresponding output values) is generated automatically and saved along with your model. fdeep::load_model runs this test to make sure the results of a forward pass in frugally-deep are the same as in Keras.
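
If you want to skip this verification step (e.g., to shorten startup time) or silence its log output, fdeep::load_model takes further parameters. The following is only a sketch; it assumes the bool verify flag and the logger callback found in recent versions of the library:

// Sketch: loading without running the saved test case and without log output.
// Assumes fdeep::load_model(path, verify, logger), as in recent versions.
#include <fdeep/fdeep.hpp>
#include <string>
int main()
{
    const auto model = fdeep::load_model(
        "fdeep_model.json",
        false,                        // skip the saved test case
        [](const std::string&) {});   // no-op logger suppresses loading messages
}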

For more integration examples please have a look at the FAQ.
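
One frequent question is how to feed image data into a model. Below is a minimal sketch (not taken from the FAQ): the model file name, image size, and pixel values are placeholders, and it assumes interleaved RGB pixels in row-major order, matching the channels_last layout frugally-deep expects; the value scaling depends on how your model was trained:

// Sketch: turning interleaved RGB pixel data into an input tensor.
// "image_model.json", the image size, and the constant pixel values are placeholders.
#include <fdeep/fdeep.hpp>
#include <iostream>
#include <vector>
int main()
{
    const std::size_t height = 64, width = 64, channels = 3;
    // In a real application these values would come from your image-loading
    // library, scaled to the range the model was trained with.
    const std::vector<float> pixels(height * width * channels, 0.5f);
    const fdeep::tensor input(fdeep::tensor_shape(height, width, channels), pixels);
    const auto model = fdeep::load_model("image_model.json");
    const auto result = model.predict({input});
    std::cout << fdeep::show_tensors(result) << std::endl;
}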

Performance

Below you can find the average durations of multiple consecutive forward passes for some popular models, run on a single core of an Intel Core i5-6600 CPU @ 3.30 GHz. frugally-deep and TensorFlow were compiled (GCC 7.1) with g++ -O3 -march=native. The processes were started with CUDA_VISIBLE_DEVICES='' taskset --cpu-list 1 ... to disable the GPU and to restrict execution to one CPU core. (see used Dockerfile)

Model          Keras + TF   frugally-deep
DenseNet121    0.12 s       0.25 s
DenseNet169    0.13 s       0.28 s
DenseNet201    0.16 s       0.39 s
InceptionV3    0.21 s       0.32 s
MobileNet      0.05 s       0.15 s
MobileNetV2    0.05 s       0.17 s
NASNetLarge    0.83 s       4.03 s
NASNetMobile   0.08 s       0.32 s
ResNet101      0.22 s       0.45 s
ResNet101V2    0.21 s       0.42 s
ResNet152      0.31 s       0.65 s
ResNet152V2    0.29 s       0.61 s
ResNet50       0.13 s       0.26 s
ResNet50V2     0.12 s       0.22 s
VGG16          0.40 s       0.56 s
VGG19          0.49 s       0.68 s
Xception       0.25 s       1.20 s
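
To reproduce such per-prediction averages for your own model and machine, a simple timing loop is enough. A minimal sketch (model file, input shape/values, and repeat count are placeholders):

// Sketch: measuring the average duration of consecutive forward passes.
// The model file, input shape/values, and repeat count are placeholders.
#include <fdeep/fdeep.hpp>
#include <chrono>
#include <iostream>
#include <vector>
int main()
{
    const auto model = fdeep::load_model("fdeep_model.json");
    const auto input = fdeep::tensor(
        fdeep::tensor_shape(static_cast<std::size_t>(4)),
        std::vector<float>{1, 2, 3, 4});
    const int runs = 100;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i)
        model.predict({input});
    const auto stop = std::chrono::steady_clock::now();
    const double total_ms = std::chrono::duration<double, std::milli>(stop - start).count();
    std::cout << "average forward pass: " << total_ms / runs << " ms" << std::endl;
}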

Requirements and Installation

  • A C++14-compatible compiler. GCC 4.9, Clang 3.7 (with libc++ 3.7), and Visual C++ 2015 or later are known to work.
  • Python 3.7 or higher
  • TensorFlow and Keras 2.7.0 (the tested version; somewhat older versions might work too)

Guides for different ways to install frugally-deep can be found in INSTALL.md.

FAQ

See FAQ.md

Disclaimer

The API of this library might still change in the future. If you have any suggestions, find errors, or want to give general feedback/criticism, I'd love to hear from you. Of course, contributions are also very welcome.

License

Distributed under the MIT License. (See accompanying file LICENSE or at https://opensource.org/licenses/MIT)

Comments
  • Problem with results of siamese CNN using EfficientNet

    Hi there,

    First of all let me thank you for this fantastic library!

    Recently I got stuck on converting a siamese network that uses the functional model and the EfficientNetB0 architecture. I'm strictly following this repo for my development: https://github.com/sajadamouei/Person-Re-ID-with-light-weight-network. Since EfficientNetB0 uses FixedDropout and reduce layers that shrink the dimensionality (which requires multiplying tensors by a 1x1xDEPTH Conv), I had to implement them myself in the library. When I convert EfficientNetB0 on its own and load it in my C++ app, the output is EXACTLY as expected on both the Python and the C++ side - no problems there. However, when I try to create a siamese network out of them as presented here: https://github.com/sajadamouei/Person-Re-ID-with-light-weight-network/blob/master/model.py, I get totally different results. In anticipation of your question - yes, I made super sure that the inputs to the network are EXACTLY the same on both sides - Python and C++. I've tried everything to fix this and concluded that there must be something wrong with either the way frugally-deep deals with functional models OR the converter itself. What I also noticed is that the tensors look completely different when they reach both Flatten layers in the architecture. Any ideas why this may be happening? Please look at the screenshots below to better understand the problem.

    (screenshots omitted)

    opened by pavel123 37
  • Hash value for json/net loaded?

    I think it would be handy for us to have a hash over a loaded model (so I could store, together with the results, some indication of how they were generated - particularly handy for encodings, which tend to be incompatible). I could simply calculate a hash over the file/string used to initialise the net, but since many files could potentially result in the same net it would be nicer if the net itself could provide such a hash. Is such a function implemented or, if not, do you see an easy way to get such a hash?

    Thanks

    Sven

    opened by utcke 36
  • How to convert model with "relu6" layer?

    My Keras model uses a "relu6" layer. How do I change convert_model.py to produce the json file? And are there any examples for adding a custom layer in fdeep::load_model?

    Thank you very much!

    opened by binlbl 32
  • Using Eigen Unsupported modules to improve convolutions

    I noticed that Eigen 3.3 has unsupported modules, including modules for Tensors and gemm operations.

    https://bitbucket.org/eigen/eigen/src/9b065de03d016d802a25366ff5f0055df6318121/unsupported/Eigen/CXX11/src/Tensor/README.md?at=default#markdown-header-convolutions

    I noticed you implement your own gemm operation in fdeep/convolution.hpp, in the function convolve_im2col. This could be improved by using the gemm functions from the Eigen unsupported modules.

    I ran a test by inferring the UNet model from pix2pix in frugally-deep. It took 18 s, compared to a model converted from ONNX and inferred in OpenCV, which took 3 s. I think this shows that convolutions in frugally-deep could be improved.

    Thanks

    opened by pfeatherstone 32
  • Slow-ish run time on MSVC

    Hi!

    First of all thank you for this great library! :-) I've got a fairly small model (18 layers) for real-time applications, mainly consisting of 5 blocks of Conv2D/ReLU/MaxPool2D, with input size 64x64x3. I'm unfortunately seeing some speed problems with fdeep. A forward pass takes around 11 ms in Keras, and it's taking 60 ms in fdeep. (I've measured by calling predict 100x in a for-loop and then averaging - a bit crude, but it should do the trick for this purpose.) I've compiled with the latest VS2017 15.5.5, Release mode, and default compiler flags (/O2). If I enable AVX2 and intrinsics, it goes down to 50 ms, but that's still way too slow. (I've tried without im2col, but it's even slower, around >10x.)

    I've run the VS profiler, but I'm not 100% sure I'm interpreting the results correctly. I think around 30%+5% of the total time is spent in Eigen's gebp and gemm functions, where we probably can't do much. Except maybe: I think I've seen you're using RowMajor storage for the Eigen matrices. Eigen is supposedly more optimised for its default, ColMajor storage. Would it be hard to change that in fdeep? Another 30% seems to be spent in convolve_im2col. But I'm not 100% sure where. I first thought it was the memcpy in eigen_mat_to_values but eigen_mat_to_values itself contains very few profiler samples only. There's also a lot of internal::transform and std::transform showing up in the profiler as well (internal::transform<ContainerOut>(reuse_t{}, f, std::forward<ContainerIn>(xs));) but I couldn't really figure out what the actual code is that this executes. I also saw that I think you pre-instantiate some convolution functions for common kernels. Most of my convolution kernels are 3x3, and it looks like you only instantiate n x m kernels for n and m equals 1 and 2. Could it help adding 3x3 there? So yea I'm really not sure about all of it. If indeed the majority of time is spent in Eigen's functions, then the RowMajor thing could indeed be a major problem.

    I'm happy to send you the model and an example input via email if you wanted to have a look.

    Here are some screenshots of the profiler: (screenshots omitted)

    Thank you very much!

    enhancement 
    opened by patrikhuber 32
  • Input to model

    If I have an RGB image and I want to pass it to the model, what should I do?

    What I've done is flatten the input image into a vector of floats; I appended the r, g, b values after one another to get just one vector called "input_vector".

    And this is the next step:

             typedef fplus::shared_ref<std::vector<float>> shared_float_vec;
             shared_float_vec x(fplus::make_shared_ref<vector<float>>(std::move(input_vector)));
             const auto result = decision_model.predict({fdeep::tensor3(fdeep::shape3(3,60,60),x)});
    

    The output is incorrect. What should I do, or what have I done wrong?

    opened by rmmal 32
  • lambda layer using tf.image

    I am using a Lambda layer which includes this function to extract patches from an image:

    patch_one = tf.image.extract_glimpse(inputs[0], [26, 26], inputs[1][:, j, :], centered=False, normalized=False, noise='zero')

    Is it possible to implement this custom layer in your library and load the model?

    opened by katmatus 30
  • Stop at the 'Loading json ...'

    Hi Tobias, thanks for this great library! I trained a ResNet50 network using Keras. I was able to convert the .h5 model to a .json. However, when I run the program as follows:

    #include <fdeep/fdeep.hpp>
    #include <opencv2/opencv.hpp>
    
    int main()
    {
    	const cv::Mat image = cv::imread("Image_1_2.jpg");
    	cv::cvtColor(image, image, cv::COLOR_BGR2RGB);
    	assert(image.isContinuous());
    	const auto model = fdeep::load_model("train7.json");
    	// Use the correct scaling, i.e., low and high.
    	const auto input = fdeep::tensor5_from_bytes(image.ptr(),
    		static_cast<std::size_t>(image.rows),
    		static_cast<std::size_t>(image.cols),
    		static_cast<std::size_t>(image.channels()),
    		0.0f, 1.0f);
    	const auto result = model.predict_class({ input });
    	std::cout << result << std::endl;
    	system("pause");
    }
    

    It is like the example in the FAQ -- How to use images loaded with OpenCV as input for a model? But it doesn't work with my Keras model. It just spends about 236 s loading the json and then stops there. My CPU is a Core i5-3230M, which is not a good CPU. My model is used to classify 7 kinds of algae cells and uses transfer learning based on ResNet50.
    The Python program for training the model is as follows:

    import numpy as np
    import matplotlib.pyplot as plt
    import keras
    from keras.preprocessing import image
    from keras.preprocessing.image import ImageDataGenerator
    from keras.applications import ResNet50
    from keras.applications.resnet50 import preprocess_input
    from keras import Model, layers
    from keras.models import load_model
    
    input_path = "data/LvsRod/"
    
    train_datagen = ImageDataGenerator(
        rescale=1. / 255,
        rotation_range=20,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        preprocessing_function=preprocess_input)
    
    train_generator = train_datagen.flow_from_directory(
        input_path + 'train',
        batch_size=10,
        class_mode='binary',
        target_size=(224, 224))
    
    validation_datagen = ImageDataGenerator(
        rescale=1. / 255,
        preprocessing_function=preprocess_input)
    
    validation_generator = validation_datagen.flow_from_directory(
        input_path + 'validation',
        shuffle=False,
        class_mode='binary',
        target_size=(224, 224))
    
    conv_base = ResNet50(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
    
    for layer in conv_base.layers:
        layer.trainable = False
    
    x = conv_base.output
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation='relu')(x)
    x = layers.Dropout(0.5)(x)
    predictions = layers.Dense(7, activation='softmax')(x)
    model = Model(conv_base.input, predictions)
    
    optimizer = keras.optimizers.SGD(lr=1e-4, momentum=0.9)
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=optimizer,
                  metrics=['accuracy'])
    
    history = model.fit_generator(generator=train_generator,
                                  steps_per_epoch=10,  # added in Kaggle
                                  epochs=30,
                                  validation_data=validation_generator,
                                  validation_steps=10  # added in Kaggle
                                 )
    
    # save
    model.save('train7.h5')
    

    The h5 model can be downloaded from the following link:

    URL:https://pan.baidu.com/s/1YkuBHBkjjUs2dcpc8XTLqA
    Extraction code:1od1

    Because the file is too big, I cannot upload it here. I really want to know how to solve this problem.

    opened by callmefish 28
  • Bad performance.

    Hi Tobias,

    I am getting bad performance when using frugally-deep, and I wanted to ask you for some advice. Of course, I've read the FAQ section about performance, so I have that covered.

    Here is what I've tested so far:

    Environment          Description                                                           Time
    Python               Default settings (GPU ON)                                             35 ms
    Python               os.environ['CUDA_VISIBLE_DEVICES']='-1'                               45 ms
    Python               No GPU and tf.config.threading.set_intra_op_parallelism_threads(1)    75 ms
    Visual Studio 2017   Default (Release -O2, whole program optimization)                     310 ms
    Visual Studio 2017   Compiled with AVX2                                                    280 ms

    It is quite interesting that a single switch (AVX2) gave me a 10% boost! But it is still very far from what you have advocated.

    I did run a benchmark, and here is what I've got:

    (benchmark screenshot omitted)

    Any ideas? Could I send you my model and example code? (Privately, as this is for work; I will be happy to support you if I get paid for the project. :) )

    opened by TrueWodzu 27
  • Cannot load InceptionV3 model

    So, I successfully loaded some models and predicted them.

    Yet, when I try to load an InceptionV3 model, I get an error. There were no errors when I converted the model from 'h5' to 'json', but the code below does not work.

    (code screenshot omitted)

    The error I got:

    (error screenshot omitted)

    opened by Terminou 25
  • Frugally LSTM Encoder-Decoder results different from Keras/Tensorflow LSTM Encoder-Decoder (missing support for initial_state)

    Hi @Dobiasd

    I have been working on the Encoder-Decoder model for Vehicle Path Forecasting since you added support for returned_states and show_tensor5 on LSTM-based models. The workflow of the project was described on this past issue. After some experiments, the LSTM-Based encoder and decoder models are not giving me any problem related to returned_states = True or show_tensor5, confirming frugally-deep fixes worked. However, I have been trying to replicate the results I obtained using the Keras/Tensorflow models without success.

    The frugally-deep fdeep_encoder_model_NT is returning the exact same encoder_hidden_state and encoder_cell_state states compared to its Tf + Keras counterparts using the encoder_model.hdf5. However, the fdeep_decoder_model_NT is not giving me the same decoder_hidden_state and decoder_cell_state output states (compared to the results using Tf + Keras encoder_model.hdf5) :(

    Specifically, I developed the decoder inference model using TF + Keras (please refer to past comments in this issue to see the corresponding code), and then converted it from .hdf5 to .json, ready to be ported into the C++ application (same as with the encoder model). Validating the encoder states: (screenshot omitted). However, both the frugally-deep decoder_hidden_state and decoder_cell_state differ from the corresponding Keras-based decoder_hidden_state and decoder_cell_state: (screenshot omitted). This results, as expected, in a wrong bounding box prediction: (screenshot omitted), which does not match the corresponding Keras results: (screenshot omitted). I do not really know what is happening with fdeep_decoder_model_NT, so I have various options in mind:

    • I have trained another model using LSTM instead of CuDNNLSTM layers in order to check if the problem is with the CuDNNLSTM layer implementation. However, the problem is still present when using other LSTM-based cells like CuDNNLSTM and LSTM. The fdeep_encoder_model works well, but fdeep_decoder_model is still making wrong predictions (both for the returned states and the next bbox prediction).
    • Now I am working in the main.cpp file. Maybe the problem is inside my internal manipulation of fdeep::tensor5 and fdeep::tensor5s when feeding the data into the ported models. However, both models are working well, except that the decoder's model is making (inaccurate) predictions of future bounding boxes, but it did not crash in any step of the script execution.
    • I am puzzled about the following fact: At main.cpp the decoders predictions is made with the following command: auto decoder_outputs = decoder_model.predict({target_seq, encoder_states.at(0), encoder_states.at(1)});, where encoder_states.at(0) and encoder_states.at(1) represent h_enc and c_enc respectively. However, I tried by interchanging the encoder states at the input of the decoder prediction line like this: auto decoder_outputs = decoder_model.predict({target_seq, encoder_states.at(1), encoder_states.at(0)}); and obtaining the exact same predicted_next_box (even though I interchanged the input order of decoder_states at the prediction function).
    • Finally, apart from the wrong values of h_dec and c_dec returned by fdeep_decoder_model, I noticed both h_dec hidden states (from frugally-deep AND Keras) are in the range [-1, 1], but that does not hold for the c_dec states. In Keras, c_dec has values from [-11, 11], but in frugally-deep, c_dec takes values from [-1, 1]. In addition, based on your suggestion about internal scaling causing this kind of issue, by inspecting fdeep_encoder_model.json I found some initializer parameters using Variance_Scaling that may be the cause of errors at inference time. I think maybe this is at the root of the problem, but I have no idea how to get the correct h_enc and c_enc, both within the same ranges used in Keras and with the correct values as well.

    Here is the main.cpp file I am running to test the results. Any comment or suggestion about the code would be welcomed!

    #include <fdeep/fdeep.hpp>
    #include <vector>
    #include <fstream>
    #include <iostream>
    
    int main()
    {
    	// Loading the previously trained models
    	const auto encoder_model = fdeep::load_model("fdeep_encoder_model_NT.json");
    	std::cout << "Encoder Model Loaded!" << std::endl;
    	const auto decoder_model = fdeep::load_model("fdeep_decoder_model_NT.json");
    	std::cout << "Decoder Model Loaded!" << std::endl;
    	// Batch_size = 1, num_timesteps = 10 and num_features = 4
    	fdeep::shape5 in_traj_shape(1,1,1,10,4);
    	// Loading a sample sequence trajectory into tensor5 data structure
    	const std::vector<float> src_traj  = {1728, 715, 191, 221,
    					1717, 710, 202, 215,
    					1706, 704, 206, 198,
    					1695, 700, 217, 196,
    					1687, 696, 228, 183,
    					1680, 689, 240, 181,
    					1668, 668, 240, 198,
    					1661, 668, 243, 194,
    					1650, 664, 251, 189,
    					1635, 660, 266, 181};
    	// Input trajectory from vector to tensor5 data structure
    	const fdeep::shared_float_vec shared_traj(fplus::make_shared_ref<fdeep::float_vec>(src_traj));
    	const fdeep::tensor5 encoder_inputs(in_traj_shape, shared_traj);
    	std::cout << "Trajectory #0!" << fdeep::show_tensor5(encoder_inputs) << std::endl;
    	// Using loaded encoder model to predict encoder output states
    	// Then encoder_states can be feed as input tensors into decoder_model
    	const auto encoder_states = encoder_model.predict({encoder_inputs});
    	// Printing for debbuging purposes
    	std::cout << "h_enc: "<< fdeep::show_tensor5(encoder_states.at(0)) << std::endl;
    	std::cout << "c_enc: "<< fdeep::show_tensor5(encoder_states.at(1)) << std::endl;
    	// Creating a SOS input sequence token to signal decoder model to start making predictions
    	fdeep::shape5 bbox_shape(1,1,1,1,4);
    	// Loading a sample sequence trajectory into tensor5 data structure
    	const std::vector<float> SOS_token  = {9999.0, 9999.0, 9999.0, 9999.0};
    	const fdeep::shared_float_vec shared_SOS_token(fplus::make_shared_ref<fdeep::float_vec>(SOS_token));
    	fdeep::tensor5 target_seq(bbox_shape, shared_SOS_token);
    	// In Python we have: Prediction, h, c = decoder_model.predict([target_seq] + state)
    	auto decoder_outputs = decoder_model.predict({target_seq, encoder_states.at(1), encoder_states.at(0)});
    	// Printing for debugging purposes
    	std::cout << "h_dec: "<< fdeep::show_tensor5(decoder_outputs.at(1)) << std::endl;
    	std::cout << "c_dec: "<< fdeep::show_tensor5(decoder_outputs.at(2)) << std::endl;
    	std::cout << "Predicted next bounding box!" << fdeep::show_tensor5(decoder_outputs.at(0)) << std::endl;
    }
    

    The fdeep_encoder_model_NT.json model imported into the C++ application is available for download and inspection from this past comment. The fdeep_decoder_model_NT.json can be downloaded from the following link: Decoder model: https://drive.google.com/open?id=1hwrjcnNfWaqQI0o8TmJKtfsAwj6zd9aq I would really appreciate any help with this issue. I am puzzled because the encoder model works perfectly but the decoder model does not; specifically, the results of the Keras vs. frugally-deep decoder models differ, giving me wrong output predictions that cannot be used at all.

    opened by MarlonCajamarca 25
  • `visualize_layers.py` uses `scipy.misc.imsave` which no longer exists

    The documentation suggests switching to imageio.imwrite instead: https://docs.scipy.org/doc/scipy-1.2.1/reference/generated/scipy.misc.imsave.html

    There's even a migration guide: https://imageio.readthedocs.io/en/v2.6.1/scipy.html

    Another alternative would be keras.preprocessing.image.save_img.

    opened by torokati44 0
  • Modify Unit Tests CmakeLists and INSTALL.md

    Modify the unit tests' CMakeLists.txt to let CMake detect Python and use it to execute commands, instead of hard-coding "python3 xxxx", because not all users can run Python scripts via "python3". The command to convert h5 to json may fail because of the "python3" command. I added find_package to detect Python, check pip.exe, pip3.exe, etc., and run "pip show tensorflow" to make sure the user has installed TensorFlow. The requirements for Python and TensorFlow are written down in INSTALL.md.

    opened by sirius-william 4
  • Thanks !

    Many thanks to the project author! My graduate design project is a one-dimensional convolutional neural network. After training with Python's TensorFlow 2.10, I had been looking for ways to deploy the model in my Qt project. I tried to compile TensorFlow C++ (compilation always fails), the TensorFlow C API (TensorFlow 2.10 is not supported), TensorRT (the AMD graphics driver is not supported), and OpenVINO (the network architecture I chose is not supported). By chance, I found this library via Google. It is easy to use, has few dependencies, and only requires header files. It perfectly solves my project needs. Thank you! PS: When using it, the Python script part of the tests is executed in CMakeLists.txt via "python3 xxxx". However, not all users can run Python scripts through the command 'python3'. It is recommended to find Python in CMakeLists.txt, or to let users specify the Python path. In addition, MinGW reports "Fatal error: can't write 286 bytes to section .text" when compiling the unit tests. It is recommended to add: target_compile_options(PROJECT_NAME PRIVATE $<$<CXX_COMPILER_ID:MSVC>:/bigobj> $<$<CXX_COMPILER_ID:GNU>:-Wa,-mbig-obj>). This problem also arises when the library is used in other projects.

    opened by sirius-william 2
  • Consider having different convolution implementations available and choosing the fastest one at runtime

    Different convolution implementations might perform differently depending on the convolution settings (input size/depth, kernel size/count) and depending on the hardware (mostly CPU/memory) used.

    Right now, for example, we have a special implementation used for 2D convolutions in case strides = (1, 1) (which is utilized not only by the Conv2D layer, but also by DepthwiseConv2D and SeparableConv2D).

    I wonder if it would make sense to provide a function to the user that, when called on a model, tries out different implementations and remembers which one performed best for future calls of model.predict. (Maybe in some settings, even a naive non-im2col convolution is the fastest one.)

    Pros:

    • potentially faster forward passes

    Cons:

    • increased code complexity
    • potentially wrong settings in case the background load on the user's machine varies too much during the evaluation
    opened by Dobiasd 0
  • Feature Suggestion: Support Transformer Models

    First off, I would like to say that this is a really great piece of work! I have been using it with LSTMs for time-series data and have found frugally-deep to be invaluable. I am starting to investigate Transformers in order to see how they stack up to LSTMs and it would be wonderful if support for Transformer models could be added. I am in the early stages of working with Transformers, but the specific layers that I currently do not see supported are: MultiHeadAttention and LayerNormalization.

    help wanted 
    opened by jonathan-lazzaro-nnl 11
  • Feature suggestion: Support ONNX models?

    How about supporting ONNX in frugally-deep? You could have a protobuf importer for ONNX models, or add a tool which converts ONNX to the JSON format you use? Just a thought. A header-only ONNX inference engine would be very, very useful.

    opened by pfeatherstone 24