Automatic meme generation model using Tensorflow Keras.

Overview

Memefly

You can find the project at MemeflyAI.

Contributors

Nick Buukhalter Harsh Desai Han Lee

MIT Python Tensorflow Tensorflow Serving Docker

Project Overview

Trello Board

Product Canvas

Automatic meme generation model using Tensorflow Keras. Model is Dockerized and served as a REST API with FastAPI/uvicorn ASGI endpoint. A separate serving model serving is done with a combination of FastAPI/uvicorn ASGI endpoint with models served using Tensorflow Serving on Sagemaker.

Tech Stack

Python Packages

  • Numpy
  • Pandas
  • Tensorflow
  • FastAPI
  • Selenium

DevOps

  • Tensorflow Serving
  • Docker
  • MySQL
  • MongoDB
  • AWS ECR
  • AWS Elastic Beanstalk
  • AWS S3
  • AWS Sagemaker

Architecture

memefly_architecture

Predictions

We used an encoder-decoder architecture for the meme generation task. Pre-trained Inception V3 architecture and weights are used as the encoder to extract embeddings from an input image. At the same time, we encode the texts into text embeddings and concat them together with image embeddings. For the decoder, we used GRU to to map the image and text embeddings to predict the next word in the text string.

At training time, we repeat the same image embeddings as input and send in text sequences in order, e.g., 0. this, 1. this is, 2. this is a, 3. this is a sequence. The model will try to predict the next word in the sequence given the input image embedding and text embeddings. We denote the beginning and the end of a text sequence with startseq and endseq.

At inferencing time, we send in image embeddings and the seed token startseq to the model, and then repeatly send in the image embeddings and the prediction output of the previous timestep, until either we see endseq or reach maximum sentence length. To improve the quality of the output, we used beam search to greedily select the best N sentences. But it has to be noted that beam search is neither optimal nor complete algorithm.

To increase varieties, we tried 1) adding Guassian noise to the input image and 2) choosing top N sentence scores using beam search.

The architecture is summarized here:

architecture

In-sample Meme

in-sample

Out-of-sample Meme

out-of-sample

Batch Example Outputs

memes

Explanatory Variables

  • Image
  • Text

Data Sources

Please see Data Engineering for details.

Python Notebooks

Training Notebook

Inferencing Notebook

How to connect to the web API

Please see Machine Learning Engineering - Deployment for details.

How to connect to the data API

Please see Data Engineering for details.

Contributing

When contributing to this repository, please first discuss the change you wish to make via issue, email, or any other method with the owners of this repository before making a change.

Please note we have a code of conduct. Please follow it in all your interactions with the project.

Issue/Bug Request

If you are having an issue with the existing project code, please submit a bug report under the following guidelines:

  • Check first to see if your issue has already been reported.
  • Check to see if the issue has recently been fixed by attempting to reproduce the issue using the latest master branch in the repository.
  • Create a live example of the problem.
  • Submit a detailed bug report including your environment & browser, steps to reproduce the issue, actual and expected outcomes, where you believe the issue is originating from, and any potential solutions you have considered.

Feature Requests

We would love to hear from you about new features which would improve this app and further the aims of our project. Please provide as much detail and information as possible to show us why you think your new feature should be implemented.

Pull Requests

If you have developed a patch, bug fix, or new feature that would improve this app, please submit a pull request. It is best to communicate your ideas with the developers first before investing a great deal of time into a pull request to ensure that it will mesh smoothly with the project.

Remember that this project is licensed under the MIT license, and by submitting a pull request, you agree that your work will be, too.

Pull Request Guidelines

  • Ensure any install or build dependencies are removed before the end of the layer when doing a build.
  • Update the README.md with details of changes to the interface, including new plist variables, exposed ports, useful file locations and container parameters.
  • Ensure that your code conforms to our existing code conventions and test coverage.
  • Include the relevant issue number, if applicable.
  • You may merge the Pull Request in once you have the sign-off of two other developers, or if you do not have permission to do that, you may request the second reviewer to merge it for you.

Attribution

These contribution guidelines have been adapted from this good-Contributing.md-template.

Documentation

See Data Engineering for details on the data engineering of our project.

See Machine Learning Engineering - Training for details on the training part of our project.

See Machine Learning Engineering - Deployment for details on the deployment of our project.

Owner
BloomTech Labs
We are the Bloom Institute of Technology's Labs Organization, hosting the products our learners build during their time in BloomTech Labs.
BloomTech Labs
PlaidML is a framework for making deep learning work everywhere.

A platform for making deep learning work everywhere. Documentation | Installation Instructions | Building PlaidML | Contributing | Troubleshooting | R

PlaidML 4.5k Jan 02, 2023
Train a deep learning net with OpenStreetMap features and satellite imagery.

DeepOSM Classify roads and features in satellite imagery, by training neural networks with OpenStreetMap (OSM) data. DeepOSM can: Download a chunk of

TrailBehind, Inc. 1.3k Nov 24, 2022
PyTorch/GPU re-implementation of the paper Masked Autoencoders Are Scalable Vision Learners

Masked Autoencoders: A PyTorch Implementation This is a PyTorch/GPU re-implementation of the paper Masked Autoencoders Are Scalable Vision Learners: @

Meta Research 4.8k Jan 04, 2023
IEGAN — Official PyTorch Implementation Independent Encoder for Deep Hierarchical Unsupervised Image-to-Image Translation

IEGAN — Official PyTorch Implementation Independent Encoder for Deep Hierarchical Unsupervised Image-to-Image Translation Independent Encoder for Deep

30 Nov 05, 2022
This is an official implementation of "Polarized Self-Attention: Towards High-quality Pixel-wise Regression"

Polarized Self-Attention: Towards High-quality Pixel-wise Regression This is an official implementation of: Huajun Liu, Fuqiang Liu, Xinyi Fan and Don

DeLightCMU 212 Jan 08, 2023
Smart edu-autobooking - Johnson @ DMI-UNICT study room self-booking system

smart_edu-autobooking Sistema di autoprenotazione per l'aula studio [email protected]

Davide Carnemolla 17 Jun 20, 2022
Pytorch implementation of "Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling"

RNN-for-Joint-NLU Pytorch implementation of "Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling"

Kim SungDong 194 Dec 28, 2022
An Efficient Training Approach for Very Large Scale Face Recognition or F²C for simplicity.

Fast Face Classification (F²C) This is the code of our paper An Efficient Training Approach for Very Large Scale Face Recognition or F²C for simplicit

33 Jun 27, 2021
Code for the paper "SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness" (NeurIPS 2021)

SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness (NeurIPS2021) This repository contains code for the paper "Smo

Jongheon Jeong 17 Dec 27, 2022
A repository that finds a person who looks like you by using face recognition technology.

Find Your Twin Hello everyone, I've always wondered how casting agencies do the casting for a scene where a certain actor is young or old for a movie

Cengizhan Yurdakul 3 Jan 29, 2022
Pre-Training 3D Point Cloud Transformers with Masked Point Modeling

Point-BERT: Pre-Training 3D Point Cloud Transformers with Masked Point Modeling Created by Xumin Yu*, Lulu Tang*, Yongming Rao*, Tiejun Huang, Jie Zho

Lulu Tang 306 Jan 06, 2023
This is a Machine Learning Based Hand Detector Project, It Uses Machine Learning Models and Modules Like Mediapipe, Developed By Google!

Machine Learning Hand Detector This is a Machine Learning Based Hand Detector Project, It Uses Machine Learning Models and Modules Like Mediapipe, Dev

Popstar Idhant 3 Feb 25, 2022
The first dataset on shadow generation for the foreground object in real-world scenes.

Object-Shadow-Generation-Dataset-DESOBA Object Shadow Generation is to deal with the shadow inconsistency between the foreground object and the backgr

BCMI 105 Dec 30, 2022
[CVPR 2016] Unsupervised Feature Learning by Image Inpainting using GANs

Context Encoders: Feature Learning by Inpainting CVPR 2016 [Project Website] [Imagenet Results] Sample results on held-out images: This is the trainin

Deepak Pathak 829 Dec 31, 2022
Preprocessed Datasets for our Multimodal NER paper

Unified Multimodal Transformer (UMT) for Multimodal Named Entity Recognition (MNER) Two MNER Datasets and Codes for our ACL'2020 paper: Improving Mult

76 Dec 21, 2022
An essential implementation of BYOL in PyTorch + PyTorch Lightning

Essential BYOL A simple and complete implementation of Bootstrap your own latent: A new approach to self-supervised Learning in PyTorch + PyTorch Ligh

Enrico Fini 48 Sep 27, 2022
[NeurIPS 2021] “Improving Contrastive Learning on Imbalanced Data via Open-World Sampling”,

Improving Contrastive Learning on Imbalanced Data via Open-World Sampling Introduction Contrastive learning approaches have achieved great success in

VITA 24 Dec 17, 2022
Official implementation of NeurIPS'2021 paper TransformerFusion

TransformerFusion: Monocular RGB Scene Reconstruction using Transformers Project Page | Paper | Video TransformerFusion: Monocular RGB Scene Reconstru

Aljaz Bozic 118 Dec 25, 2022
Uni-Fold: Training your own deep protein-folding models.

Uni-Fold: Training your own deep protein-folding models. This package provides and implementation of a trainable, Transformer-based deep protein foldi

DeepModeling 88 Jan 03, 2023
SOTR: Segmenting Objects with Transformers [ICCV 2021]

SOTR: Segmenting Objects with Transformers [ICCV 2021] By Ruohao Guo, Dantong Niu, Liao Qu, Zhenbo Li Introduction This is the official implementation

186 Dec 20, 2022