FairyTailor: Multimodal Generative Framework for Storytelling

Last update: Dec 30, 2022

Overview

Human-in-the-loop visual story co-creation.

Users can create a cohesive children's story by weaving generated texts and retrieved images together with their own input. In co-creation, writers contribute their creative thinking, while generative models keep their workflow moving. FairyTailor adds another modality and modifies the text generation process to help produce a coherent and creative story.

Set-up (development)

After cloning the repository:

Client (Vue 2.6)

Install and check that the client compiles:

cd client
npm i
npm run build

Backend (FastAPI)

Install and activate the environment (conda provided):

conda env create -f environment.yml
conda activate MultiModalStory

Install the project in editable mode from the current directory, then install CLIP:

pip install -e .
pip install git+https://github.com/openai/CLIP.git

After installation run:

python -m spacy download en_core_web_sm

In a Python terminal:

import nltk
nltk.download('wordnet')
nltk.download('sentiwordnet')
nltk.download('averaged_perceptron_tagger')
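
To confirm that the installation steps above took effect, a small check like the following may help. It is only a sketch: the import names `clip` and `en_core_web_sm` are the ones used by the CLIP and spaCy packages, the helper name `missing_modules` is hypothetical, and the check does not cover the NLTK corpora.

```python
import importlib.util
from typing import Iterable, List

# Import names of the packages installed in the set-up steps.
REQUIRED_MODULES = ("spacy", "en_core_web_sm", "nltk", "clip")


def missing_modules(names: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the module names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All backend modules are importable.")
```

Run it inside the activated MultiModalStory environment; an empty result suggests the pip and spaCy steps succeeded.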

Large Data Management (dvc)

Our large data files are stored on IBM's Cloud Object Storage. To pull them from that platform you will use a special, read-only .dvc/config file.

dvc pull -f

This pulls:

backend/outputs (five preset stories)
backend/story_generator/downloaded (transformers)
client/public/unsplash25k (styled images)
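
After `dvc pull -f` finishes, it can be useful to verify that the three directories listed above actually arrived. The sketch below assumes it is run from the repository root; the helper name `missing_data_dirs` is hypothetical and not part of the project.

```python
from pathlib import Path
from typing import List

# Directories this README lists as pulled by `dvc pull -f`.
EXPECTED_DIRS = (
    "backend/outputs",
    "backend/story_generator/downloaded",
    "client/public/unsplash25k",
)


def missing_data_dirs(repo_root: Path) -> List[str]:
    """Return the expected data directories that are absent or empty."""
    missing = []
    for rel in EXPECTED_DIRS:
        path = repo_root / rel
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Assumption: the current working directory is the repository root.
    for rel in missing_data_dirs(Path.cwd()):
        print(f"missing or empty: {rel}")
```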

Running the framework during development

Client:

cd client
npm run devw

Backend (with server auto reload):

uvicorn backend.server:app --reload --reload-dir backend

Open the uvicorn server at http://localhost:8000 in your web browser.
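
If the browser shows a connection error, the backend may still be starting. As a rough sketch, the standard-library helper below polls the address until it answers; the function name `wait_for_server` and the default timeout are assumptions, not part of the project.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str = "http://localhost:8000",
                    timeout: float = 30.0,
                    interval: float = 0.5) -> bool:
    """Poll `url` until it answers over HTTP or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval + 1.0):
                return True
        except urllib.error.HTTPError:
            # The server replied, even if with an error status, so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    print("server is up" if wait_for_server() else "server did not respond")
```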

Modification ideas:

New Hugging Face transformer

Place the transformer in backend/story_generator/downloaded directory.
Update the current model path by changing the constant FINETUNED_GPT2_PATH in backend/story_generator/constants.py.
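
Before pointing FINETUNED_GPT2_PATH at a new directory, a quick sanity check can catch a wrong path early. The sketch below relies on the fact that models saved with `save_pretrained` include a config.json file. The path value, the helper names, and the use of the `Auto*` classes are assumptions for illustration; the project's own loading code may differ.

```python
from pathlib import Path

# Hypothetical value; the real constant lives in
# backend/story_generator/constants.py.
FINETUNED_GPT2_PATH = "backend/story_generator/downloaded/my_model"


def looks_like_saved_model(model_dir) -> bool:
    """Heuristic: a saved Hugging Face model directory contains config.json."""
    return (Path(model_dir) / "config.json").is_file()


def load_generator(model_dir):
    """Load tokenizer and causal language model from a local directory."""
    # Imported lazily so the check above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    if not looks_like_saved_model(model_dir):
        raise FileNotFoundError(f"no config.json found in {model_dir}")
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)
    return tokenizer, model


if __name__ == "__main__":
    print(looks_like_saved_model(FINETUNED_GPT2_PATH))
```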

New images folder

Replace the folder client/public/unsplash25k/sketch_images1024 with yours.
Update the current path by changing the constant IMAGE_PATH in client/src/components/Constants.js.

API functionalities

Add functions to the backend endpoint at backend/server/main.py.
Update client/src/js/api/mainApi.js to call the backend endpoint from the client.
Update the corresponding user components in client/src/components.
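
The three steps above could start from a backend function like the one sketched here. The endpoint path `/api/word_count`, the handler, and `register_routes` are all hypothetical; the only FastAPI behaviour relied on is that `app.post(path)` returns a decorator and that a `Dict[str, str]` parameter is read from the JSON request body.

```python
from typing import Any, Dict


def word_count(payload: Dict[str, str]) -> Dict[str, int]:
    """Hypothetical handler: count whitespace-separated words in payload["text"]."""
    return {"words": len(payload.get("text", "").split())}


def register_routes(app: Any) -> None:
    """Attach the handler to a FastAPI-style app under a hypothetical path."""
    # Equivalent to decorating word_count with @app.post("/api/word_count").
    app.post("/api/word_count")(word_count)
```

Assuming the backend exposes its FastAPI instance as `app`, calling `register_routes(app)` would add the route; the client would then send a JSON body such as {"text": "once upon a time"} to it from client/src/js/api/mainApi.js.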

Owner

Eden Bens
