Seq2seq - Sequence to Sequence Learning with Keras

Related tags

Deep Learningseq2seq
Overview

Seq2seq

Sequence to Sequence Learning with Keras

Hi! You have just found Seq2Seq. Seq2Seq is a sequence to sequence learning add-on for the python deep learning library Keras. Using Seq2Seq, you can build and train sequence-to-sequence neural network models in Keras. Such models are useful for machine translation, chatbots (see [4]), parsers, or whatever that comes to your mind.

seq2seq

Getting started

Seq2Seq contains modular and reusable layers that you can use to build your own seq2seq models as well as built-in models that work out of the box. Seq2Seq models can be compiled as they are or added as layers to a bigger model. Every Seq2Seq model has 2 primary layers : the encoder and the decoder. Generally, the encoder encodes the input sequence to an internal representation called 'context vector' which is used by the decoder to generate the output sequence. The lengths of input and output sequences can be different, as there is no explicit one on one relation between the input and output sequences. In addition to the encoder and decoder layers, a Seq2Seq model may also contain layers such as the left-stack (Stacked LSTMs on the encoder side), the right-stack (Stacked LSTMs on the decoder side), resizers (for shape compatibility between the encoder and the decoder) and dropout layers to avoid overfitting. The source code is heavily documented, so lets go straight to the examples:

A simple Seq2Seq model:

import seq2seq
from seq2seq.models import SimpleSeq2Seq

model = SimpleSeq2Seq(input_dim=5, hidden_dim=10, output_length=8, output_dim=8)
model.compile(loss='mse', optimizer='rmsprop')

That's it! You have successfully compiled a minimal Seq2Seq model! Next, let's build a 6 layer deep Seq2Seq model (3 layers for encoding, 3 layers for decoding).

Deep Seq2Seq models:

import seq2seq
from seq2seq.models import SimpleSeq2Seq

model = SimpleSeq2Seq(input_dim=5, hidden_dim=10, output_length=8, output_dim=8, depth=3)
model.compile(loss='mse', optimizer='rmsprop')

Notice that we have specified the depth for both encoder and decoder as 3, and your model has a total depth of 3 + 3 = 6. You can also specify different depths for the encoder and the decoder. Example:

import seq2seq
from seq2seq.models import SimpleSeq2Seq

model = SimpleSeq2Seq(input_dim=5, hidden_dim=10, output_length=8, output_dim=20, depth=(4, 5))
model.compile(loss='mse', optimizer='rmsprop')

Notice that the depth is specified as tuple, (4, 5). Which means your encoder will be 4 layers deep whereas your decoder will be 5 layers deep. And your model will have a total depth of 4 + 5 = 9.

Advanced Seq2Seq models:

Until now, you have been using the SimpleSeq2Seq model, which is a very minimalistic model. In the actual Seq2Seq implementation described in [1], the hidden state of the encoder is transferred to decoder. Also, the output of decoder at each timestep becomes the input to the decoder at the next time step. To make things more complicated, the hidden state is propogated throughout the LSTM stack. But you have no reason to worry, as we have a built-in model that does all that out of the box. Example:

import seq2seq
from seq2seq.models import Seq2Seq

model = Seq2Seq(batch_input_shape=(16, 7, 5), hidden_dim=10, output_length=8, output_dim=20, depth=4)
model.compile(loss='mse', optimizer='rmsprop')

Note that we had to specify the complete input shape, including the samples dimensions. This is because we need a static hidden state(similar to a stateful RNN) for transferring it across layers. (Update : Full input shape is not required in the latest version, since we switched to Recurrent Shop backend). By the way, Seq2Seq models also support the stateful argument, in case you need it.

You can also experiment with the hidden state propogation turned off. Simply set the arguments broadcast_state and inner_broadcast_state to False.

Peeky Seq2seq model:

Let's not stop there. Let's build a model similar to cho et al 2014, where the decoder gets a 'peek' at the context vector at every timestep.

cho et al 2014

To achieve this, simply add the argument peek=True:

import seq2seq
from seq2seq.models import Seq2Seq

model = Seq2Seq(batch_input_shape=(16, 7, 5), hidden_dim=10, output_length=8, output_dim=20, depth=4, peek=True)
model.compile(loss='mse', optimizer='rmsprop')

Seq2seq model with attention:

Attention Seq2seq

Let's not stop there either. In all the models described above, there is no allignment between the input sequence elements and the output sequence elements. But for machine translation, learning a soft allignment between the input and output sequences imporves performance.[3]. The Seq2seq framework includes a ready made attention model which does the same. Note that in the attention model, there is no hidden state propogation, and a bidirectional LSTM encoder is used by default. Example:

import seq2seq
from seq2seq.models import AttentionSeq2Seq

model = AttentionSeq2Seq(input_dim=5, input_length=7, hidden_dim=10, output_length=8, output_dim=20, depth=4)
model.compile(loss='mse', optimizer='rmsprop')

As you can see, in the attention model you need not specify the samples dimension as there are no static hidden states involved(But you have to if you are building a stateful Seq2seq model). Note: You can set the argument bidirectional=False if you wish not to use a bidirectional encoder.

Final Words

That's all for now. Hope you love this library. For any questions you might have, create an issue and I will get in touch. You can also contribute to this project by reporting bugs, adding new examples, datasets or models.

Installation:

sudo pip install git+https://github.com/farizrahman4u/seq2seq.git

Requirements:

Working Example:

Papers:

Owner
Fariz Rahman
Fariz Rahman
SustainBench: Benchmarks for Monitoring the Sustainable Development Goals with Machine Learning

Datasets | Website | Raw Data | OpenReview SustainBench: Benchmarks for Monitoring the Sustainable Development Goals with Machine Learning Christopher

67 Dec 17, 2022
CO-PILOT: COllaborative Planning and reInforcement Learning On sub-Task curriculum

CO-PILOT CO-PILOT: COllaborative Planning and reInforcement Learning On sub-Task curriculum, NeurIPS 2021, Shuang Ao, Tianyi Zhou, Guodong Long, Qingh

Shuang Ao 1 Feb 18, 2022
Crawl & visualize ICLR papers and reviews

Crawl and Visualize ICLR 2022 OpenReview Data Descriptions This Jupyter Notebook contains the data crawled from ICLR 2022 OpenReview webpages and thei

Federico Berto 75 Dec 05, 2022
Official Repository for Machine Learning class - Physics Without Frontiers 2021

PWF 2021 Física Sin Fronteras es un proyecto del Centro Internacional de Física Teórica (ICTP) en Trieste Italia. El ICTP es un centro dedicado a fome

36 Aug 06, 2022
Official code for paper "ISNet: Costless and Implicit Image Segmentation for Deep Classifiers, with Application in COVID-19 Detection"

Official code for paper "ISNet: Costless and Implicit Image Segmentation for Deep Classifiers, with Application in COVID-19 Detection". LRPDenseNet.py

Pedro Ricardo Ariel Salvador Bassi 2 Sep 21, 2022
a project for 3D multi-object tracking

a project for 3D multi-object tracking

155 Jan 04, 2023
Python library for analysis of time series data including dimensionality reduction, clustering, and Markov model estimation

deeptime Releases: Installation via conda recommended. conda install -c conda-forge deeptime pip install deeptime Documentation: deeptime-ml.github.io

495 Dec 28, 2022
Graph WaveNet apdapted for brain connectivity analysis.

Graph WaveNet for brain network analysis This is the implementation of the Graph WaveNet model used in our manuscript: S. Wein , A. Schüller, A. M. To

4 Dec 17, 2022
DA2Lite is an automated model compression toolkit for PyTorch.

DA2Lite (Deep Architecture to Lite) is a toolkit to compress and accelerate deep network models. ⭐ Star us on GitHub — it helps!! Frameworks & Librari

Sinhan Kang 7 Mar 22, 2022
Learning from Synthetic Humans, CVPR 2017

Learning from Synthetic Humans (SURREAL) Gül Varol, Javier Romero, Xavier Martin, Naureen Mahmood, Michael J. Black, Ivan Laptev and Cordelia Schmid,

Gul Varol 538 Dec 18, 2022
Code for a real-time distributed cooperative slam(RDC-SLAM) system for ROS compatible platforms.

RDC-SLAM This repository contains code for a real-time distributed cooperative slam(RDC-SLAM) system for ROS compatible platforms. The system takes in

40 Nov 19, 2022
Back to Basics: Efficient Network Compression via IMP

Back to Basics: Efficient Network Compression via IMP Authors: Max Zimmer, Christoph Spiegel, Sebastian Pokutta This repository contains the code to r

IOL Lab @ ZIB 1 Nov 19, 2021
Repository for open research on optimizers.

Open Optimizers Repository for open research on optimizers. This is a test in sharing research/exploration as it happens. If you use anything from thi

Ariel Ekgren 6 Jun 24, 2022
This is the source code for our ICLR2021 paper: Adaptive Universal Generalized PageRank Graph Neural Network.

GPRGNN This is the source code for our ICLR2021 paper: Adaptive Universal Generalized PageRank Graph Neural Network. Hidden state feature extraction i

Jianhao 92 Jan 03, 2023
A really easy-to-use and powerful sudoku solver.

SodukuSolver This is a really useful sudoku solver with a Qt gui. USAGE Enter the numbers in and click "RUN"! If you don't want to wait, simply press

Ujhhgtg Teams 11 Jun 02, 2022
Finding all things on-prem Microsoft for password spraying and enumeration.

msprobe About Installing Usage Examples Coming Soon Acknowledgements About Finding all things on-prem Microsoft for password spraying and enumeration.

205 Jan 09, 2023
Deep Q Learning with OpenAI Gym and Pokemon Showdown

pokemon-deep-learning An openAI gym project for pokemon involving deep q learning. Made by myself, Sam Little, and Layton Webber. This code captures g

2 Dec 22, 2021
Object Depth via Motion and Detection Dataset

ODMD Dataset ODMD is the first dataset for learning Object Depth via Motion and Detection. ODMD training data are configurable and extensible, with ea

Brent Griffin 172 Dec 21, 2022
Dataset and codebase for NeurIPS 2021 paper: Exploring Forensic Dental Identification with Deep Learning

Repository under construction. Example dataset, checkpoints, and training/testing scripts will be avaible soon! 💡 Collated best practices from most p

4 Jun 26, 2022
MADE (Masked Autoencoder Density Estimation) implementation in PyTorch

pytorch-made This code is an implementation of "Masked AutoEncoder for Density Estimation" by Germain et al., 2015. The core idea is that you can turn

Andrej 498 Dec 30, 2022