Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Overview




Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks.

We provide reference implementations of various sequence modeling papers:

What's New:

Features:

We also provide pre-trained models for translation and language modeling with a convenient torch.hub interface:

import torch

# Load a pre-trained English-to-German model through torch.hub
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model')
en2de.translate('Hello world', beam=5)
# 'Hallo Welt'

See the PyTorch Hub tutorials for translation and RoBERTa for more examples.

Requirements and Installation

  • PyTorch version >= 1.5.0
  • Python version >= 3.6
  • For training new models, you'll also need an NVIDIA GPU and NCCL
  • To install fairseq and develop locally:
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./

# on macOS:
# CFLAGS="-stdlib=libc++" pip install --editable ./

# to install the latest stable release (0.10.x)
# pip install fairseq
  • For faster training, install NVIDIA's apex library:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" \
  --global-option="--deprecated_fused_adam" --global-option="--xentropy" \
  --global-option="--fast_multihead_attn" ./
  • For large datasets, install PyArrow: pip install pyarrow
  • If you use Docker, make sure to increase the shared memory size with either --ipc=host or --shm-size as command-line options to nvidia-docker run.
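
The version minimums listed above can be checked from a script before installing. The following is a minimal sketch, assuming plain dotted version strings; `version_tuple` and `meets_minimum` are hypothetical helpers for illustration and are not part of fairseq.

```python
import re
import sys


def version_tuple(version):
    """Turn '1.8.1+cu111' into (1, 8, 1); stops at the first part with no leading digits."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.5' compares equal to '1.5.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Python itself can be checked without importing anything else.
python_ok = sys.version_info >= (3, 6)
```

With PyTorch installed, `meets_minimum(torch.__version__, "1.5.0")` would cover the first requirement.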

Getting Started

The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.

Pre-trained models and examples

We provide pre-trained models and pre-processed, binarized test sets for several tasks listed below, as well as example training and evaluation commands.

We also have more detailed READMEs to reproduce results from specific papers:

Join the fairseq community

License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Citation

Please cite as:

@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}
Comments
  • wav2vec 2.0 inference pipeline

    🚀 Feature Request

    Provide a simple inference pipeline for the wav2vec 2.0 model.

    Motivation

    The current inference script, examples/speech_recognition/infer.py, handles many cases, which makes it extremely complex.

    Pitch

    A single Python script that loads a pre-trained wav2vec 2.0 model and runs inference on a single WAV file or on a programmatically loaded waveform signal.
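
The waveform-loading half of such a script needs nothing beyond the standard library. The sketch below assumes 16-bit mono PCM input; `load_wav_mono16` is a hypothetical helper, and the model call itself is left out because it depends on the fairseq checkpoint being used.

```python
import array
import sys
import wave


def load_wav_mono16(path):
    """Read a 16-bit mono PCM WAV file; return (sample_rate, floats in [-1, 1))."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    samples = array.array("h", raw)  # signed 16-bit integers
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return sample_rate, [s / 32768.0 for s in samples]
```

The returned list of floats could then be converted to a tensor and passed to whichever model wrapper the pipeline ends up using.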

    Alternatives

    Additional context

    This kind of inference pipeline would enable independent researchers to test the model on their own audio datasets and compare it against other models.

    enhancement help wanted needs triage stale 
    opened by loretoparisi 111
  • Errors running prepare_text.sh (and other preprocessing) from wav2vec-u in fresh environment

    My Question:

    How can I get prepare_text.sh running correctly in a fresh Ubuntu Jupyterlab environment? What needs to be installed, what variables set, etc.?

    I've run into various issues attempting to run the script prepare_text.sh, from https://github.com/pytorch/fairseq/blob/master/examples/wav2vec/unsupervised/scripts/prepare_text.sh.

    Right now, I'm stuck on preprocess.py: error: unrecognized arguments: --dict-only, but I've run into some other errors that I've had to work around, detailed below.

    Full current output:

    After getting through all the other issues I detail below, currently this is what I see when I attempt to run the script.

    I cloned the https://github.com/pytorch/fairseq.git repo, and navigated to the scripts folder: https://github.com/pytorch/fairseq/tree/master/examples/wav2vec/unsupervised/scripts before running this.

    (wav2vecu_pre) [email protected]:~/work/fairseq/examples/wav2vec/unsupervised/scripts$ zsh prepare_text.sh sw /home/jovyan/work/WikiDumps/wiki_sw_head.txt /home/jovyan/work/WikiDumps/wiki_sw_head_wav2vecu_prepared.out
    sw
    sw
    /home/jovyan/work/WikiDumps/wiki_sw_head.txt
    /home/jovyan/work/WikiDumps/wiki_sw_head_wav2vecu_prepared.out
    Warning : `load_model` does not return WordVectorModel or SupervisedModel any more, but a `FastText` object which is very similar.
    usage: preprocess.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu]
                         [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE]
                         [--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE]
                         [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE]
                         [--model-parallel-size MODEL_PARALLEL_SIZE] [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT]
                         [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile]
                         [--criterion {masked_lm,nat_loss,sentence_ranking,ctc,composite_loss,cross_entropy,legacy_masked_lm_loss,sentence_prediction,adaptive_loss,label_smoothed_cross_entropy,wav2vec,label_smoothed_cross_entropy_with_alignment,vocab_parallel_cross_entropy}]
                         [--tokenizer {moses,nltk,space}] [--bpe {sentencepiece,bytes,characters,byte_bpe,gpt2,hf_byte_bpe,fastbpe,subword_nmt,bert}]
                         [--optimizer {adam,adamax,adagrad,adafactor,adadelta,lamb,sgd,nag}]
                         [--lr-scheduler {triangular,fixed,reduce_lr_on_plateau,cosine,polynomial_decay,tri_stage,inverse_sqrt}] [--scoring {sacrebleu,bleu,wer,chrf}]
                         [--task TASK] [-s SRC] [-t TARGET] [--trainpref FP] [--validpref FP] [--testpref FP] [--align-suffix FP] [--destdir DIR] [--thresholdtgt N]
                         [--thresholdsrc N] [--tgtdict FP] [--srcdict FP] [--nwordstgt N] [--nwordssrc N] [--alignfile ALIGN] [--dataset-impl FORMAT] [--joined-dictionary]
                         [--only-source] [--padding-factor N] [--workers N]
    preprocess.py: error: unrecognized arguments: --dict-only
    cut: /home/jovyan/work/WikiDumps/wiki_sw_head_wav2vecu_prepared.out/dict.txt: No such file or directory
    fatal error: PHONEMIZER_ESPEAK_PATH=espeak not found is not an executable file
    fatal error: PHONEMIZER_ESPEAK_PATH=espeak not found is not an executable file
    one is 
    sed: can't read /home/jovyan/work/WikiDumps/wiki_sw_head_wav2vecu_prepared.out/phones.txt: No such file or directory
    paste: /home/jovyan/work/WikiDumps/wiki_sw_head_wav2vecu_prepared.out/phones.txt: No such file or directory
    usage: preprocess.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu]
                         [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE]
                         [--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE]
                         [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE]
                         [--model-parallel-size MODEL_PARALLEL_SIZE] [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT]
                         [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile]
                         [--criterion {masked_lm,nat_loss,sentence_ranking,ctc,composite_loss,cross_entropy,legacy_masked_lm_loss,sentence_prediction,adaptive_loss,label_smoothed_cross_entropy,wav2vec,label_smoothed_cross_entropy_with_alignment,vocab_parallel_cross_entropy}]
                         [--tokenizer {moses,nltk,space}] [--bpe {sentencepiece,bytes,characters,byte_bpe,gpt2,hf_byte_bpe,fastbpe,subword_nmt,bert}]
                         [--optimizer {adam,adamax,adagrad,adafactor,adadelta,lamb,sgd,nag}]
                         [--lr-scheduler {triangular,fixed,reduce_lr_on_plateau,cosine,polynomial_decay,tri_stage,inverse_sqrt}] [--scoring {sacrebleu,bleu,wer,chrf}]
                         [--task TASK] [-s SRC] [-t TARGET] [--trainpref FP] [--validpref FP] [--testpref FP] [--align-suffix FP] [--destdir DIR] [--thresholdtgt N]
                         [--thresholdsrc N] [--tgtdict FP] [--srcdict FP] [--nwordstgt N] [--nwordssrc N] [--alignfile ALIGN] [--dataset-impl FORMAT] [--joined-dictionary]
                         [--only-source] [--padding-factor N] [--workers N]
    preprocess.py: error: unrecognized arguments: --dict-only
    2021-06-03 16:39:42 | INFO | fairseq_cli.preprocess | Namespace(no_progress_bar=False, log_interval=100, log_format=None, tensorboard_logdir=None, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=False, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, min_loss_scale=0.0001, threshold_loss_scale=None, user_dir=None, empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, checkpoint_suffix='', checkpoint_shard_count=1, quantization_config_path=None, profile=False, criterion='cross_entropy', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='translation', source_lang=None, target_lang=None, trainpref='/home/jovyan/work/WikiDumps/wiki_sw_head_wav2vecu_prepared.out/phones/lm.phones.filtered.txt', validpref=None, testpref=None, align_suffix=None, destdir='/home/jovyan/work/WikiDumps/wiki_sw_head_wav2vecu_prepared.out/phones', thresholdtgt=0, thresholdsrc=0, tgtdict=None, srcdict='/home/jovyan/work/WikiDumps/wiki_sw_head_wav2vecu_prepared.out/phones/dict.phn.txt', nwordstgt=-1, nwordssrc=-1, alignfile=None, dataset_impl='mmap', joined_dictionary=False, only_source=True, padding_factor=8, workers=70)
    Traceback (most recent call last):
      File "/home/jovyan/work/fairseq//fairseq_cli/preprocess.py", line 401, in <module>
        cli_main()
      File "/home/jovyan/work/fairseq//fairseq_cli/preprocess.py", line 397, in cli_main
        main(args)
      File "/home/jovyan/work/fairseq//fairseq_cli/preprocess.py", line 98, in main
        src_dict = task.load_dictionary(args.srcdict)
      File "/opt/conda/envs/wav2vecu_pre/lib/python3.9/site-packages/fairseq/tasks/fairseq_task.py", line 54, in load_dictionary
        return Dictionary.load(filename)
      File "/opt/conda/envs/wav2vecu_pre/lib/python3.9/site-packages/fairseq/data/dictionary.py", line 214, in load
        d.add_from_file(f)
      File "/opt/conda/envs/wav2vecu_pre/lib/python3.9/site-packages/fairseq/data/dictionary.py", line 225, in add_from_file
        self.add_from_file(fd)
      File "/opt/conda/envs/wav2vecu_pre/lib/python3.9/site-packages/fairseq/data/dictionary.py", line 249, in add_from_file
        raise RuntimeError(
    RuntimeError: Duplicate word found when loading Dictionary: '<SIL>'. Duplicate words can overwrite earlier ones by adding the #fairseq:overwrite flag at the end of the corresponding row in the dictionary file. If using the Camembert model, please download an updated copy of the model file.
    prepare_text.sh:49: command not found: lmplz
    prepare_text.sh:50: command not found: build_binary
    python: can't open file '/home/jovyan/work/fairseq/examples/wav2vec/unsupervised/scripts/examples/speech_recognition/kaldi/kaldi_initializer.py': [Errno 2] No such file or directory
    python: can't open file '/home/jovyan/work/fairseq/examples/wav2vec/unsupervised/scripts/examples/speech_recognition/kaldi/kaldi_initializer.py': [Errno 2] No such file or directory
    prepare_text.sh:54: command not found: lmplz
    prepare_text.sh:55: command not found: build_binary
    prepare_text.sh:56: command not found: lmplz
    prepare_text.sh:57: command not found: build_binary
    Primary config directory not found.
    Check that the config directory '/home/jovyan/work/fairseq/examples/speech_recognition/kaldi/config' exists and readable
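
The RuntimeError in the output above reports a duplicate '<SIL>' row in the phone dictionary. A dictionary file can be scanned for such rows before it is handed to preprocess.py. This is a minimal sketch, assuming each row has the form "<symbol> <count>" with an optional trailing #fairseq:overwrite flag (the flag is named in the error message); `find_duplicate_symbols` is a hypothetical helper, not a fairseq function.

```python
from collections import Counter

OVERWRITE_FLAG = "#fairseq:overwrite"


def find_duplicate_symbols(lines):
    """Return symbols that appear more than once without the overwrite flag."""
    counts = Counter()
    for line in lines:
        line = line.rstrip()
        if not line:
            continue
        if line.split(" ")[-1] == OVERWRITE_FLAG:
            continue  # flagged rows are allowed to replace earlier ones
        # Assumed row layout: "<symbol> <count>"; keep everything before the last space.
        symbol = line.rsplit(" ", 1)[0]
        counts[symbol] += 1
    return sorted(s for s, n in counts.items() if n > 1)
```

Usage would be along the lines of opening dict.phn.txt with `open(path, encoding="utf-8")` and passing the file object to `find_duplicate_symbols`.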
    

    Fixed (?) Problem: Can't seem to run it from the same folder as the README (workaround: run from scripts folder)

    First, I can't run it from the folder that the README at https://github.com/pytorch/fairseq/tree/master/examples/wav2vec/unsupervised#preparation-of-speech-and-text-data says to use. If you try, you get errors such as paths to other scripts not being found.

    zsh scripts/prepare_text.sh sw /home/jovyan/work/WikiDumps/wiki_sw_head.txt /home/jovyan/work/WikiDumps/wiki_sw_head_wav2vecu_prepared.out
    sw
    sw
    /home/jovyan/work/WikiDumps/wiki_sw_head.txt
    /home/jovyan/work/WikiDumps/wiki_sw_head_wav2vecu_prepared.out
    python: can't open file '/home/jovyan/work/fairseq/examples/wav2vec/unsupervised/normalize_and_filter_text.py': [Errno 2] No such file or directory
    

    Fixed (?) Problem: "ValueError: lid.187.bin cannot be opened for loading!" (workaround: use lid.176.bin instead)

    Solution: download a different language ID model, and edit the code to use it.

    https://fasttext.cc/docs/en/language-identification.html has a different model, lid.176.bin

    wget https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin
    

    and edit this portion of normalize_and_filter_text.py:

        parser.add_argument(
            "--fasttext-model",
            help="path to fasttext model",
            default="lid.176.bin",
        )
    

    Fixed (?) Problem: dependencies needed (phonemizer, fasttext, fairseq)

    The script does not list which dependencies are needed. So far I've determined that phonemizer and fasttext are needed, and I think fairseq too. Are there any more that I'm missing?
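
Besides Python packages, the error output earlier in this issue names three executables that were not found: espeak, lmplz and build_binary (the latter two are KenLM tools). A quick way to see which of them are on PATH is sketched below; `missing_tools` and the `REQUIRED` list are illustrative and may not cover everything the script calls.

```python
import shutil


def missing_tools(names):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Executables named in the error output of this issue (not necessarily complete).
REQUIRED = ["espeak", "lmplz", "build_binary"]

if __name__ == "__main__":
    for name in missing_tools(REQUIRED):
        print(f"not found on PATH: {name}")
```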

    Fixed (?) Problem: can't find files in fairseq_cli (solution: you need to set an environment variable, FAIRSEQ_ROOT).

    I set this to point to the top level of the cloned repo. I'm not sure if that's right.

    (I cloned the repo to ~/work/fairseq/)

    export FAIRSEQ_ROOT=~/work/fairseq/
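
The traceback earlier in this issue shows the script calling fairseq_cli/preprocess.py under that root, so one sanity check is whether that file exists there. A minimal sketch, with `looks_like_fairseq_root` as a hypothetical helper:

```python
import os


def looks_like_fairseq_root(path):
    """Return True if path contains fairseq_cli/preprocess.py."""
    if not path:
        return False
    expanded = os.path.expanduser(path)
    return os.path.isfile(os.path.join(expanded, "fairseq_cli", "preprocess.py"))


if __name__ == "__main__":
    root = os.environ.get("FAIRSEQ_ROOT", "")
    print(root or "(unset)", "->", looks_like_fairseq_root(root))
```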
    

    Fixed (?) Problem: Not sure what language code to use. (guessed sw)

    I've got Swahili data. I'm not sure whether to use sw, swahili, or something else. I assume I should pick from https://github.com/espeak-ng/espeak-ng/blob/master/docs/languages.md

    Code

    Here's the command I use to invoke the script. Other than editing the default langid model, I haven't edited anything else in the repo, should be the same as https://github.com/pytorch/fairseq/tree/master/examples/wav2vec/unsupervised/scripts. git log shows c47a9b2eef0f41b0564c8daf52cb82ea97fc6548 as the commit.

    zsh prepare_text.sh language /home/jovyan/work/WikiDumps/wiki_sw_head.txt /home/jovyan/work/WikiDumps/wiki_sw_head_wav2vecu_prepared.out
    

    What have you tried?

    • Tried reading https://github.com/pytorch/fairseq/tree/master/examples/wav2vec/unsupervised#preparation-of-speech-and-text-data
    • Tried reading https://github.com/pytorch/fairseq/issues/3581 and https://github.com/pytorch/fairseq/issues/3586
    • Googling for various keywords such as "fairseq preprocess dict-only"

    What's your environment?

    I'm in a Jupyterlab in a Docker container, running Ubuntu.

    OS is Ubuntu 20.04.2:

    cat /etc/os-release
    NAME="Ubuntu"
    VERSION="20.04.2 LTS (Focal Fossa)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 20.04.2 LTS"
    VERSION_ID="20.04"
    HOME_URL="https://www.ubuntu.com/"
    SUPPORT_URL="https://help.ubuntu.com/"
    BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
    PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
    VERSION_CODENAME=focal
    UBUNTU_CODENAME=focal
    

    pip list:

    pip list
    Package                Version
    ---------------------- -------------------
    antlr4-python3-runtime 4.8
    attrs                  21.2.0
    certifi                2021.5.30
    cffi                   1.14.5
    clldutils              3.9.0
    colorlog               5.0.1
    csvw                   1.11.0
    Cython                 0.29.23
    dataclasses            0.6
    editdistance           0.5.3
    fairseq                0.10.0
    fasttext               0.9.2
    hydra-core             1.0.6
    isodate                0.6.0
    joblib                 1.0.1
    numpy                  1.20.3
    omegaconf              2.0.6
    phonemizer             2.2.2
    pip                    21.1.2
    portalocker            2.0.0
    pybind11               2.6.2
    pycparser              2.20
    python-dateutil        2.8.1
    PyYAML                 5.4.1
    regex                  2021.4.4
    rfc3986                1.5.0
    sacrebleu              1.5.1
    segments               2.2.0
    setuptools             49.6.0.post20210108
    six                    1.16.0
    tabulate               0.8.9
    torch                  1.8.1
    tqdm                   4.61.0
    typing-extensions      3.10.0.0
    uritemplate            3.0.1
    wheel                  0.36.2
    

    conda list:

    conda list
    # packages in environment at /opt/conda/envs/wav2vecu_pre:
    #
    # Name                    Version                   Build  Channel
    _libgcc_mutex             0.1                 conda_forge    conda-forge
    _openmp_mutex             4.5                       1_gnu    conda-forge
    antlr4-python3-runtime    4.8                      pypi_0    pypi
    attrs                     21.2.0                   pypi_0    pypi
    ca-certificates           2021.5.30            ha878542_0    conda-forge
    certifi                   2021.5.30        py39hf3d152e_0    conda-forge
    cffi                      1.14.5                   pypi_0    pypi
    clldutils                 3.9.0                    pypi_0    pypi
    colorlog                  5.0.1                    pypi_0    pypi
    csvw                      1.11.0                   pypi_0    pypi
    cython                    0.29.23                  pypi_0    pypi
    dataclasses               0.6                      pypi_0    pypi
    editdistance              0.5.3                    pypi_0    pypi
    fairseq                   0.10.0                   pypi_0    pypi
    fasttext                  0.9.2                    pypi_0    pypi
    hydra-core                1.0.6                    pypi_0    pypi
    isodate                   0.6.0                    pypi_0    pypi
    joblib                    1.0.1                    pypi_0    pypi
    ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
    libffi                    3.3                  h58526e2_2    conda-forge
    libgcc-ng                 9.3.0               h2828fa1_19    conda-forge
    libgomp                   9.3.0               h2828fa1_19    conda-forge
    libstdcxx-ng              9.3.0               h6de172a_19    conda-forge
    ncurses                   6.2                  h58526e2_4    conda-forge
    numpy                     1.20.3                   pypi_0    pypi
    omegaconf                 2.0.6                    pypi_0    pypi
    openssl                   1.1.1k               h7f98852_0    conda-forge
    phonemizer                2.2.2                    pypi_0    pypi
    pip                       21.1.2             pyhd8ed1ab_0    conda-forge
    portalocker               2.0.0                    pypi_0    pypi
    pybind11                  2.6.2                    pypi_0    pypi
    pycparser                 2.20                     pypi_0    pypi
    python                    3.9.4           hffdb5ce_0_cpython    conda-forge
    python-dateutil           2.8.1                    pypi_0    pypi
    python_abi                3.9                      1_cp39    conda-forge
    pyyaml                    5.4.1                    pypi_0    pypi
    readline                  8.1                  h46c0cb4_0    conda-forge
    regex                     2021.4.4                 pypi_0    pypi
    rfc3986                   1.5.0                    pypi_0    pypi
    sacrebleu                 1.5.1                    pypi_0    pypi
    segments                  2.2.0                    pypi_0    pypi
    setuptools                49.6.0           py39hf3d152e_3    conda-forge
    six                       1.16.0                   pypi_0    pypi
    sqlite                    3.35.5               h74cdb3f_0    conda-forge
    tabulate                  0.8.9                    pypi_0    pypi
    tk                        8.6.10               h21135ba_1    conda-forge
    torch                     1.8.1                    pypi_0    pypi
    tqdm                      4.61.0                   pypi_0    pypi
    typing-extensions         3.10.0.0                 pypi_0    pypi
    tzdata                    2021a                he74cb21_0    conda-forge
    uritemplate               3.0.1                    pypi_0    pypi
    wheel                     0.36.2             pyhd3deb0d_0    conda-forge
    xz                        5.2.5                h516909a_1    conda-forge
    zlib                      1.2.11            h516909a_1010    conda-forge
    

    I also apt-installed phonemizer dependencies:

    sudo apt-get install festival espeak-ng mbrola
    

    And finally, here's what I get from apt list|grep installed: apt-list.txt

    question needs triage stale 
    opened by cdleong 59
  • How to fine-tune wav2vec 2.0 with TIMIT

    ❓ Questions and Help

    What is your question?

    Hello.

    This paper says that wav2vec 2.0 works well for the phoneme recognition task (with the TIMIT dataset), but some information needed to reproduce it is missing from the README.md. I'm at a standstill. Has anyone completed this task?

    What's your environment?

    • fairseq Version: master
    • PyTorch Version: 1.6
    • OS: Linux (Debian)
    • How you installed fairseq: based on the README.md
    • Python version: 3.7
    • CUDA/cuDNN version: Tesla T4 (CUDA 11.0)
    $ nvidia-smi
    
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
    | N/A   77C    P0    49W /  70W |  11914MiB / 15109MiB |     95%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
                                                                                   
    
    question 
    opened by kosuke-kitahara 38
  • issues with mbart models

    ❓ Questions and Help

    Thanks for releasing the mbart models! However, we are unable to reproduce the EN-RO fine-tuned BLEU scores reported in the paper. We get a BLEU score of 26.9, using sacreBLEU's default tokenization, v13. This is well below the 38.5 reported in the README and even below scores reported for WMT16. Here is a complete script to reproduce this; is there anything obvious we are doing wrong?

    We have also tried scoring the main, pretrained-only model, and were surprised to find that the names of the parameters seem to change between the main model and the fine-tuned one. Perhaps documenting this is beyond the scope of your intentions in releasing the model, but it is a bit confusing when working with these models.

    Code

    Here is the code we run:

    # constants
    langs=ar_AR,cs_CZ,de_DE,en_XX,es_XX,et_EE,fi_FI,fr_XX,gu_IN,hi_IN,it_IT,ja_XX,kk_KZ,ko_KR,lt_LT,lv_LV,my_MM,ne_NP,nl_XX,ro_RO,ru_RU,si_LK,tr_TR,vi_VN,zh_CN
    MODELDIR=MBART_finetuned_enro
    DICT=$MODELDIR/dict.txt
    export FAIRSEQ=~/code/fairseq
    export PYTHONPATH=$FAIRSEQ
    # end constants
    
    SRC=en_XX
    TRG=ro_RO
    
    mkdir tmp
    sacrebleu -t wmt16 -l en-ro --echo src | spm_encode --model $MODELDIR/sentence.bpe.model > tmp/data.spm.$SRC
    sacrebleu -t wmt16 -l en-ro --echo ref | spm_encode --model $MODELDIR/sentence.bpe.model > tmp/data.spm.$TRG
    
    python3 $FAIRSEQ/preprocess.py \
      --source-lang $SRC \
      --target-lang $TRG \
      --testpref tmp/data.spm  \
      --destdir tmp \
      --thresholdtgt 0 \
      --thresholdsrc 0 \
      --srcdict ${DICT} \
      --tgtdict ${DICT} \
      --workers 70
    
    python3 $FAIRSEQ/generate.py tmp \
      --path $MODELDIR/model.pt \
      --task translation_from_pretrained_bart \
      --gen-subset test \
      --max-tokens 1000 \
      -s $SRC \
      -t $TRG \
      --max-sentences 32 \
      --langs $langs > out.wmt19.ro
    
    grep ^H out.wmt19.ro | sort -V | cut -f3 | spm_decode --model $MODELDIR/sentence.bpe.model | perl -pe 's/\[ro_RO\]//' | sacrebleu -t wmt16 -l en-ro -b
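
The first stages of that last pipeline (grep ^H, sort -V, cut -f3, and the perl substitution) can also be written in Python, which makes it easier to inspect the hypotheses before scoring. This is a rough sketch, assuming hypothesis lines have the tab-separated form H-<id>, score, text; `extract_hypotheses` is a hypothetical helper, and the spm_decode step is not reproduced here.

```python
import re


def extract_hypotheses(lines, lang_tag="[ro_RO]"):
    """Collect H-lines, order them by sentence id, and return the text column."""
    hyps = []
    for line in lines:
        match = re.match(r"H-(\d+)\t[^\t]*\t(.*)", line.rstrip("\n"))
        if match:
            # Like the perl one-liner, drop only the first occurrence of the tag.
            text = match.group(2).replace(lang_tag, "", 1)
            hyps.append((int(match.group(1)), text))
    return [text for _, text in sorted(hyps)]
```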
    

    What's your environment?

    • fairseq Version (e.g., 1.0 or master): latest github
    • PyTorch Version (e.g., 1.0): 1.4.0
    • OS (e.g., Linux): CentOS 7.5
    • How you installed fairseq (pip, source): source
    • Build command you used (if compiling from source): pip install --editable . (within a conda env)
    • Python version: 3.7.5
    • CUDA/cuDNN version: 10.1 / 7.6.3
    • GPU models and configuration: Titan RTX
    • Any other relevant information:
    question stale 
    opened by mjpost 38
  • What is the expected unsupervised Wav2vec2 data set format?

    How can I prepare a data set for my own language? Which files are necessary, and how can I create them in the expected way? Is there an example for any language?

    question needs triage stale 
    opened by Enescigdem 36
  • Error when running Fairseq generate on Trained Levenshtein Model

    | [en] dictionary: 39840 types
    | [de] dictionary: 39840 types
    | loaded 3003 examples from: data-bin/wmt17_en_de_distill/test.en-de.en
    | loaded 3003 examples from: data-bin/wmt17_en_de_distill/test.en-de.de
    | data-bin/wmt17_en_de_distill test en-de 3003 examples
    | loading model(s) from checkpoints/checkpoint_1_2000.pt
      0%| | 0/12 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "/home/ubuntu/miniconda3/bin/fairseq-generate", line 11, in <module>
        load_entry_point('fairseq', 'console_scripts', 'fairseq-generate')()
      File "/home/ubuntu/fairseq/fairseq_cli/generate.py", line 199, in cli_main
        main(args)
      File "/home/ubuntu/fairseq/fairseq_cli/generate.py", line 104, in main
        hypos = task.inference_step(generator, models, sample, prefix_tokens)
      File "/home/ubuntu/fairseq/fairseq/tasks/fairseq_task.py", line 265, in inference_step
        return generator.generate(models, sample, prefix_tokens=prefix_tokens)
      File "/home/ubuntu/miniconda3/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
        return func(*args, **kwargs)
      File "/home/ubuntu/fairseq/fairseq/sequence_generator.py", line 113, in generate
        return self._generate(model, sample, **kwargs)
      File "/home/ubuntu/miniconda3/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
        return func(*args, **kwargs)
      File "/home/ubuntu/fairseq/fairseq/sequence_generator.py", line 295, in _generate
        tokens[:, :step + 1], encoder_outs, temperature=self.temperature,
      File "/home/ubuntu/miniconda3/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
        return func(*args, **kwargs)
      File "/home/ubuntu/fairseq/fairseq/sequence_generator.py", line 553, in forward_decoder
        temperature=temperature,
      File "/home/ubuntu/fairseq/fairseq/sequence_generator.py", line 584, in _decode_one
        tokens, encoder_out=encoder_out, incremental_state=self.incremental_states[model],
      File "/home/ubuntu/fairseq/fairseq/models/nat/levenshtein_transformer.py", line 405, in forward_decoder
        output_tokens = decoder_out.output_tokens
    AttributeError: 'Tensor' object has no attribute 'output_tokens'

    opened by spprabhu 33
  • Fix Windows CI

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
    • [x] Did you read the contributor guideline?
    • [ ] Did you make sure to update the docs?
    • [ ] Did you write any new necessary tests?

    What does this PR do?

    Adds Windows CI support so we can ensure fairseq is and remains Windows-compatible. 😄

    PR review

    Anyone in the community is free to review the PR once the tests have passed.
    If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    Did you have fun?

    Make sure you had fun coding 🙃

    CLA Signed stale 
    opened by erip 32
  • Release of wav2vec 2.0

    I just read the recent paper on wav2vec 2.0; it looks very interesting. Thanks for this contribution. Could someone please let me know when the code will be released?

    question 
    opened by sandeep-badrinath 32
  • Difficulties to reproduce XSUM results with BART

    I'm trying to reproduce the results of BART on XSUM dataset.

    I followed the README, didn't apply any preprocessing to the XSUM data, and used beam=6, lenpen=1.0, max_len_b=60, min_len=10 for generation.

    I got the following results:

    1 ROUGE-1 Average_F: 0.43809 (95%-conf.int. 0.43543 - 0.44078)
    ---------------------------------------------
    1 ROUGE-2 Average_F: 0.20327 (95%-conf.int. 0.20052 - 0.20598)
    ---------------------------------------------
    1 ROUGE-L Average_F: 0.34652 (95%-conf.int. 0.34382 - 0.34941)
    

    which is a bit lower than the reported results: [image]


    For the CNN/DM dataset, a few details had to be added to the data preprocessing step; I'm wondering if I missed similar details for the XSUM dataset.

    Adding the missing preprocessing steps led to score improvements there, so I think it's the same issue for the XSUM dataset. Does anyone know where I can find a detailed explanation of how to preprocess the XSUM dataset?

    @ngoyal2707 @yinhanliu

    question stale 
    opened by astariul 26
  • No decrease of wer when fine tuning wav2vec 2.0

    I am trying to replicate the paper by fine-tuning the wav2vec 2.0 base model (the "No finetuning" checkpoint) with 1h of Libri-light. As described in the README, I am running the following command:

    python3 fairseq/train.py \
        --distributed-world-size 6  /path/to/libri-light/1h \
        --save-dir path/to/model_checkpoint \
        --fp16 \
        --wer-args '("path/4-gram.bin","path/librispeech_lexicon.lst",2,-1)' \
        --post-process letter \
        --valid-subset valid \
        --no-epoch-checkpoints \
        --best-checkpoint-metric wer \
        --num-workers 4 \
        --max-update 13000 \
        --sentence-avg \
        --task audio_pretraining \
        --arch wav2vec_ctc \
        --w2v-path path/to/wav2vec_small.pt \
        --labels ltr \
        --apply-mask \
        --mask-selection static \
        --mask-other 0 \
        --mask-length 10 \
        --mask-prob 0.75 \
        --layerdrop 0.05 \
        --mask-channel-selection static \
        --mask-channel-other 0 \
        --mask-channel-length 64 \
        --mask-channel-prob 0.256 \
        --zero-infinity \
        --feature-grad-mult 0.0 \
        --freeze-finetune-updates 10000 \
        --validate-after-updates 10000 \
        --optimizer adam \
        --adam-betas '(0.9, 0.98)' \
        --adam-eps 1e-08 \
        --lr 1e-04 \
        --lr-scheduler tri_stage \
        --warmup-steps 1300 \
        --hold-steps 5200 \
        --decay-steps 6500 \
        --final-lr-scale 0.05 \
        --final-dropout 0.0 \
        --dropout 0.0 \
        --activation-dropout 0.1 \
        --criterion ctc \
        --attention-dropout 0.0 \
        --max-tokens 1280000 \
        --seed 2337 \
        --log-format json \
        --log-interval 500 \
        --ddp-backend no_c10d
    

    However, valid_raw_wer and valid_wer never go down during training and stay at around 99-100%. valid_uer decreases to ~78% and then goes up again, even though the loss keeps decreasing.

    Both the lexicon and the language model seem fine: I used them to test the already fine-tuned model and got results similar to those reported in the paper.

    An example of log:

    2020-10-02 12:23:48 | INFO | fairseq.trainer | begin training epoch 40
    2020-10-02 12:23:55 | INFO | fairseq_cli.train | begin validation on "valid" subset
    2020-10-02 12:24:01 | INFO | valid | {"epoch": 40, "valid_loss": "1942.87", "valid_ntokens": "5368", "valid_nsentences": "30", "valid_nll_loss": "10.858", "valid_uer": "79.49", "valid_wer": "99.797", "valid_raw_wer": "111.663", "valid_wps": "0", "valid_wpb": "5368", "valid_bsz": "30", "valid_num_updates": "317", "valid_best_wer": "99.696"}
    2020-10-02 12:24:01 | INFO | fairseq_cli.train | begin save checkpoint
    
    2020-10-02 12:24:04 | INFO | fairseq_cli.train | end of epoch 40 (average epoch stats below)
    2020-10-02 12:24:04 | INFO | train | {"epoch": 40, "train_loss": "2211.27", "train_ntokens": "6130.5", "train_nsentences": "32", "train_nll_loss": "11.542", "train_wps": "3088.9", "train_ups": "0.5", "train_wpb": "6130.5", "train_bsz": "32", "train_num_updates": "317", "train_lr": "2.51408e-05", "train_gnorm": "645.876", "train_loss_scale": "16", "train_train_wall": "1", "train_wall": "684"}
    

    Do you have any idea where I should look? Thank you for your help.
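
For reference, the learning rate in the log above can be checked against the scheduler flags in the command. The following is a minimal sketch of a tri-stage schedule (linear warmup, hold, exponential decay), not fairseq's implementation. It assumes warmup starts at 1% of the peak rate (`init_lr_scale=0.01` is an assumption, as it is not set in the command); the function name `tri_stage_lr` is hypothetical.

```python
import math


def tri_stage_lr(step, peak_lr=1e-4, init_lr_scale=0.01, final_lr_scale=0.05,
                 warmup_steps=1300, hold_steps=5200, decay_steps=6500):
    """Sketch of a warmup / hold / exponential-decay schedule.

    peak_lr, final_lr_scale and the three step counts come from the command
    above; init_lr_scale=0.01 is an assumed starting scale.
    """
    init_lr = init_lr_scale * peak_lr
    final_lr = final_lr_scale * peak_lr
    if step < warmup_steps:
        # Stage 1: linear warmup from init_lr to peak_lr
        return init_lr + (peak_lr - init_lr) * step / warmup_steps
    step -= warmup_steps
    if step < hold_steps:
        # Stage 2: hold at the peak learning rate
        return peak_lr
    step -= hold_steps
    if step <= decay_steps:
        # Stage 3: exponential decay from peak_lr down to final_lr
        return peak_lr * math.exp(math.log(final_lr_scale) * step / decay_steps)
    return final_lr


lr_at_317 = tri_stage_lr(317)
print(f"{lr_at_317:.5e}")  # 2.51408e-05
```

Under these assumptions the value at update 317 matches the logged train_lr of 2.51408e-05, which suggests that the run at epoch 40 is still early in warmup, at 317 of the 13000 updates set by --max-update.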

    question needs triage 
    opened by ezerhouni 25
  • refactor namespaces in criterion interface

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
    • [x] Did you read the contributor guideline?
    • [x] Did you make sure to update the docs?
    • [x] Did you write any new necessary tests?

    What does this PR do?

    Fixes #1672 in part (part 1: context)

    PR review

    Anyone in the community is free to review the PR once the tests have passed.
    If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    Did you have fun?

    Make sure you had fun coding 🙃

    CLA Signed Merged 
    opened by erip 25
  • Update readme

    The table column name is wrong. I fixed it to match the paper.

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
    • [x] Did you read the contributor guideline?
    • [x] Did you make sure to update the docs?
    • [x] Did you write any new necessary tests?

    What does this PR do?

    Fixes # (issue).

    PR review

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    Did you have fun?

    Make sure you had fun coding 🙃

    CLA Signed 
    opened by dlwlgus53 0
  • "OOM during optimization" when fine-tuning NLLB

    ❓ Questions and Help

    What is your question?

    Hi, I am getting "OOM during optimization, irrecoverable" when trying to fine-tune the 3.3B parameter NLLB model.

    Stack trace:
    Traceback (most recent call last):
      File "/home/x/projects/nllb/fairseq/slurm_snapshot_code/2022-12-28T22_01_31.150636/fairseq/trainer.py", line 1147, in train_step
        raise e
      File "/home/x/projects/nllb/fairseq/slurm_snapshot_code/2022-12-28T22_01_31.150636/fairseq/trainer.py", line 1099, in train_step
        self.task.optimizer_step(
      File "/home/x/projects/nllb/fairseq/slurm_snapshot_code/2022-12-28T22_01_31.150636/fairseq/tasks/fairseq_task.py", line 550, in optimizer_step
        optimizer.step()
      File "/home/x/projects/nllb/fairseq/slurm_snapshot_code/2022-12-28T22_01_31.150636/fairseq/optim/fp16_optimizer.py", line 440, in step
        self.wrapped_optimizer.step(
      File "/home/x/projects/nllb/fairseq/slurm_snapshot_code/2022-12-28T22_01_31.150636/fairseq/optim/fairseq_optimizer.py", line 120, in step
        self.optimizer.step(closure, scale=scale)
      File "/home/x/.local/lib/python3.10/site-packages/torch/optim/optimizer.py", line 140, in wrapper
        out = func(*args, **kwargs)
      File "/home/x/projects/nllb/fairseq/slurm_snapshot_code/2022-12-28T22_01_31.150636/fairseq/optim/fused_adam.py", line 209, in step
        exp_avg = exp_avg.float() * state["exp_avg_scale"]
    torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 5.11 GiB (GPU 0; 23.70 GiB total capacity; 20.43 GiB already allocated; 2.13 GiB free; 20.44 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
    

    Any ideas? Any help will be greatly appreciated.

    What have you tried?

    I tried fine-tuning the smaller models; only the smallest (600M-parameter) model did not cause the error above.

    What's your environment?

    • GPU models and configuration: 24 GB GPU (RTX 3090)
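
A rough back-of-the-envelope estimate may help explain why only the smallest model fits. The sketch below assumes a common mixed-precision Adam layout of 16 bytes per parameter (fp16 weights and gradients plus fp32 master weights and two fp32 Adam moments). This layout is an assumption, not something read from the fairseq configuration in use; activations are ignored, and settings that keep optimizer state in lower precision would lower the figure.

```python
GIB = 2 ** 30

# Assumed bytes per parameter for mixed-precision Adam (illustrative only).
BYTES_PER_PARAM = {
    "fp16 weights": 2,
    "fp16 gradients": 2,
    "fp32 master weights": 4,
    "fp32 Adam exp_avg": 4,
    "fp32 Adam exp_avg_sq": 4,
}


def training_state_gib(num_params):
    """Estimate memory for weights, gradients and optimizer state in GiB."""
    return num_params * sum(BYTES_PER_PARAM.values()) / GIB


gpu_capacity_gib = 23.70  # "total capacity" reported in the error message

# Parameter counts as stated in the question (3.3B and 600M).
for name, num_params in [("3.3B", 3.3e9), ("600M", 0.6e9)]:
    need = training_state_gib(num_params)
    fits = need < gpu_capacity_gib
    print(f"{name}: ~{need:.1f} GiB before activations, fits on one GPU: {fits}")
```

Under these assumptions the 3.3B model needs roughly twice the card's capacity for parameter and optimizer state alone, while the 600M model leaves headroom, which is consistent with the behaviour described above.
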
    question needs triage 
    opened by zgerrard 4
  • Question for XStory Cloze dataset

    Hello! I read here that XStory Cloze was translated from Story Cloze: https://github.com/facebookresearch/fairseq/blob/main/examples/xglm/model_card.md#xstorycloze. I also want to use XStory Cloze for testing, and I'm trying to translate it myself. How did you translate it: with googletrans or something else?

    question needs triage 
    opened by mkw18 0
  • How to use optimizer of "--composite" in command line?

    ❓ Questions and Help

    Before asking:

    1. search the issues.
    2. search the docs.

    What is your question?

    Can anyone provide an example of using the composite parameter on the fairseq-train command line? For example, I have two parameter groups, parm1 and parm2. How do I use composite to set different parameters for these two groups?

    Code

    What have you tried?

    What's your environment?

    • fairseq Version (e.g., 1.0 or main): 0.12.2
    • PyTorch Version (e.g., 1.0) 1.13
    • OS (e.g., Linux): Ubuntu 22.04
    • How you installed fairseq (pip, source):
    • Build command you used (if compiling from source):
    • Python version:
    • CUDA/cuDNN version:
    • GPU models and configuration:
    • Any other relevant information:
    question needs triage 
    opened by b3y0nd 0
  • Unable to train nlp with base_text_only_task

    🐛 Bug

    Following the documentation I'm trying to get NLP to work on my local machine (running on Fedora 37)

    To Reproduce

    $ python3.10 fairseq_cli/hydra_train.py -m --config-dir examples/data2vec/config/v2 --config-name base_text_only_task task.data=/home/my-user/TheVault/Codes/experiments/data/nlp/nlp_base.pt
    
    
    [2022-12-26 10:35:02,704][HYDRA] Launching 1 jobs locally
    [2022-12-26 10:35:02,704][HYDRA] 	#0 : task.data=/home/my-user/TheVault/Codes/experiments/data/nlp/nlp_base.pt
    Traceback (most recent call last):
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
        return func()
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/hydra/_internal/utils.py", line 355, in <lambda>
        lambda: hydra.multirun(
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/hydra/_internal/hydra.py", line 136, in multirun
        return sweeper.sweep(arguments=task_overrides)
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/hydra/_internal/core_plugins/basic_sweeper.py", line 154, in sweep
        results = self.launcher.launch(batch, initial_job_idx=initial_job_idx)
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/hydra/_internal/core_plugins/basic_launcher.py", line 76, in launch
        ret = run_job(
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/hydra/core/utils.py", line 129, in run_job
        ret.return_value = task_function(task_cfg)
      File "/home/my-user/TheVault/Codes/experiments/ai/fairseq/fairseq_cli/hydra_train.py", line 27, in hydra_main
        _hydra_main(cfg)
      File "/home/my-user/TheVault/Codes/experiments/ai/fairseq/fairseq_cli/hydra_train.py", line 31, in _hydra_main
        add_defaults(cfg)
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/fairseq/dataclass/initialize.py", line 61, in add_defaults
        cfg[k] = merge_with_parent(dc, field_cfg)
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/fairseq/dataclass/utils.py", line 500, in merge_with_parent
        merged_cfg = OmegaConf.merge(dc, cfg)
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/omegaconf/omegaconf.py", line 321, in merge
        target.merge_with(*others[1:])
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/omegaconf/basecontainer.py", line 331, in merge_with
        self._format_and_raise(key=None, value=None, cause=e)
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/omegaconf/base.py", line 95, in _format_and_raise
        format_and_raise(
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/omegaconf/_utils.py", line 629, in format_and_raise
        _raise(ex, cause)
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/omegaconf/_utils.py", line 610, in _raise
        raise ex  # set end OC_CAUSE=1 for full backtrace
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/omegaconf/basecontainer.py", line 329, in merge_with
        self._merge_with(*others)
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/omegaconf/basecontainer.py", line 347, in _merge_with
        BaseContainer._map_merge(self, other)
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/omegaconf/basecontainer.py", line 314, in _map_merge
        dest[key] = src._get_node(key)
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/omegaconf/dictconfig.py", line 258, in __setitem__
        self._format_and_raise(
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/omegaconf/base.py", line 95, in _format_and_raise
        format_and_raise(
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/omegaconf/_utils.py", line 629, in format_and_raise
        _raise(ex, cause)
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/omegaconf/_utils.py", line 610, in _raise
        raise ex  # set end OC_CAUSE=1 for full backtrace
    omegaconf.errors.ConfigKeyError: Key 'include_index' not in 'MaskedLMConfig'
    	full_key: include_index
    	reference_type=Optional[MaskedLMConfig]
    	object_type=MaskedLMConfig
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/home/my-user/TheVault/Codes/experiments/ai/fairseq/fairseq_cli/hydra_train.py", line 91, in <module>
        cli_main()
      File "/home/my-user/TheVault/Codes/experiments/ai/fairseq/fairseq_cli/hydra_train.py", line 87, in cli_main
        hydra_main()
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/hydra/main.py", line 32, in decorated_main
        _run_hydra(
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/hydra/_internal/utils.py", line 354, in _run_hydra
        run_and_report(
      File "/home/my-user/TheVault/Codes/experiments/ai/.venv-310/lib64/python3.10/site-packages/hydra/_internal/utils.py", line 267, in run_and_report
        print_exception(etype=None, value=ex, tb=final_tb)  # type: ignore
    TypeError: print_exception() got an unexpected keyword argument 'etype'
    

    Code sample

    Expected behavior

    To start the training for NLP.

    Environment

    • fairseq Version (e.g., 1.0 or main): fairseq==0.12.2
    • PyTorch Version (e.g., 1.0) torch==1.13.1
    • OS (e.g., Linux): Fedora 37
    • How you installed fairseq (pip, source): pip3
    • Build command you used (if compiling from source): N/A
    • Python version: Python 3.10.9
    • CUDA/cuDNN version:N/A
    • GPU models and configuration: N/A
    • Any other relevant information:
    $ pip3 freeze
    antlr4-python3-runtime==4.8
    bitarray==2.6.1
    cffi==1.15.1
    colorama==0.4.6
    Cython==0.29.32
    fairseq==0.12.2
    hydra-core==1.0.7
    lxml==4.9.2
    numpy==1.24.0
    nvidia-cublas-cu11==11.10.3.66
    nvidia-cuda-nvrtc-cu11==11.7.99
    nvidia-cuda-runtime-cu11==11.7.99
    nvidia-cudnn-cu11==8.5.0.96
    omegaconf==2.0.6
    portalocker==2.6.0
    protobuf==3.20.1
    pycparser==2.21
    PyYAML==6.0
    regex==2022.10.31
    sacrebleu==2.3.1
    tabulate==0.9.0
    tensorboardX==2.5.1
    torch==1.13.1
    torchaudio==0.13.1
    tqdm==4.64.1
    typing_extensions==4.4.0
    

    Additional context

    N/A
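
A note on the second traceback: the final TypeError comes from a call that passes etype as a keyword to traceback.print_exception. On Python 3.10 and later the first parameter of that function is positional-only, so the keyword form fails and obscures the underlying ConfigKeyError. A minimal sketch of a call form that works on both older and newer interpreters follows; `format_exception_portably` is a hypothetical helper name, not part of Hydra or fairseq.

```python
import io
import traceback


def format_exception_portably(ex):
    """Render an exception to a string without using the etype keyword."""
    buf = io.StringIO()
    # Passing the first three arguments positionally is accepted both before
    # and after the Python 3.10 signature change.
    traceback.print_exception(type(ex), ex, ex.__traceback__, file=buf)
    return buf.getvalue()


try:
    # Stand-in for the original configuration error in the report above.
    raise KeyError("include_index")
except KeyError as err:
    text = format_exception_portably(err)

print(text)
```

This only addresses how the error is displayed. The first exception in the report, Key 'include_index' not in 'MaskedLMConfig', is the one that stops training.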

    bug needs triage 
    opened by pazooki 0
Releases(v0.12.2)
  • v0.12.2(Jun 27, 2022)

  • v0.12.1(Jun 13, 2022)

  • v0.12.0(Jun 10, 2022)

  • v0.10.2(Jan 5, 2021)

  • v0.10.0(Nov 12, 2020)

    It's been a long time since our last release (0.9.0) nearly a year ago! There have been numerous changes and new features added since then, which we've tried to summarize below. While this release carries the same major version as our previous release (0.x.x), if you have code that relies on 0.9.0, it is likely you'll need to adapt it before updating to 0.10.0.

    Looking forward, this will also be the last significant release with the 0.x.x numbering. The next release will be 1.0.0 and will include a major migration to the Hydra configuration system, with an eye towards modularizing fairseq to be more usable as a library.

    Changelog:

    New papers:

    Major new features:

    • TorchScript support for Transformer and SequenceGenerator (PyTorch 1.6+ only)
    • Model parallel training support (see Megatron-11b)
    • TPU support via --tpu and --bf16 options (775122950d145382146e9120308432a9faf9a9b8)
    • Added VizSeq (a visual analysis toolkit for evaluating fairseq models)
    • Migrated to Python logging (fb76dac1c4e314db75f9d7a03cb4871c532000cb)
    • Added “SlowMo” distributed training backend (0dac0ff3b1d18db4b6bb01eb0ea2822118c9dd13)
    • Added Optimizer State Sharding (ZeRO) (5d7ed6ab4f92d20ad10f8f792b8703e260a938ac)
    • Added several features to improve speech recognition support in fairseq: CTC criterion, external ASR decoder support (currently only wav2letter decoder) with KenLM and fairseq language model fusion

    Minor features:

    • Added --patience for early stopping
    • Added --shorten-method=[none|truncate|random_crop] to language modeling (and other) tasks
    • Added --eval-bleu for computing BLEU scores during training (60fbf64f302a825eee77637a0b7de54fde38fb2c)
    • Added support for training huggingface models (e.g. hf_gpt2) (2728f9b06d9a3808cc7ebc2afa1401eddef35e35)
    • Added FusedLAMB optimizer (--optimizer=lamb) (f75411af2690a54a5155871f3cf7ca1f6fa15391)
    • Added LSTM-based language model (lstm_lm) (9f4256edf60554afbcaadfa114525978c141f2bd)
    • Added dummy tasks and models for benchmarking (91f05347906e80e6705c141d4c9eb7398969a709; a541b19d853cf4a5209d3b8f77d5d1261554a1d9)
    • Added tutorial and pretrained models for paraphrasing (630701eaa750efda4f7aeb1a6d693eb5e690cab1)
    • Support quantization for Transformer (6379573c9e56620b6b4ddeb114b030a0568ce7fe)
    • Support multi-GPU validation in fairseq-validate (2f7e3f33235b787de2e34123d25f659e34a21558)
    • Support batched inference in hub interface (3b53962cd7a42d08bcc7c07f4f858b55bf9bbdad)
    • Support for language model fusion in standard beam search (5379461e613263911050a860b79accdf4d75fd37)

    Breaking changes:

    • Updated requirements to Python 3.6+ and PyTorch 1.5+
    • --max-sentences renamed to --batch-size
    • Main entry point scripts (eval_lm.py, generate.py, etc.) moved from the root directory into fairseq_cli
    • Changed format for generation output; H- now corresponds to tokenized system outputs and newly added D- lines correspond to detokenized outputs (f353913420b6ef8a31ecc55d2ec0c988178698e0)
    • We now log the stats from the log-interval (displayed as train_inner) instead of a rolling average over each epoch.
    • SequenceGenerator/Scorer does not print alignment by default, re-enable with --print-alignment
    • Print base 2 scores in generation scripts (660d69fd2bdc4c3468df7eb26b3bbd293c793f94)
    • Incremental decoding interface changed to use FairseqIncrementalState (4e48c4ae5da48a5f70c969c16793e55e12db3c81; 88185fcc3f32bd24f65875bd841166daa66ed301)
    • Refactor namespaces in Criterions to support library usage (introduce LegacyFairseqCriterion for BC) (46b773a393c423f653887c382e4d55e69627454d)
    • Deprecate FairseqCriterion::aggregate_logging_outputs interface, use FairseqCriterion::reduce_metrics instead (86793391e38bf88c119699bfb1993cb0a7a33968)
    • Moved fairseq.meters to fairseq.logging.meters and added new metrics aggregation module (fairseq.logging.metrics) (1e324a5bbe4b1f68f9dadf3592dab58a54a800a8; f8b795f427a39c19a6b7245be240680617156948)
    • Reset mid-epoch stats every log-interval steps (244835d811c2c66b1de2c5e86532bac41b154c1a)
    • Ignore duplicate entries in dictionary files (dict.txt) and support manual overwrite with #fairseq:overwrite option (dd1298e15fdbfc0c3639906eee9934968d63fc29; 937535dba036dc3759a5334ab5b8110febbe8e6e)
    • Use 1-based indexing for epochs everywhere (aa79bb9c37b27e3f84e7a4e182175d3b50a79041)
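
Scripts written for 0.9.0 that still pass --max-sentences need updating for the rename listed above. As an illustration only, the sketch below rewrites an argument list before it is handed to a launcher; `migrate_args` and `RENAMED_FLAGS` are hypothetical names and not part of fairseq, and the example command is made up.

```python
# Hypothetical helper, not part of fairseq: maps old flag names to new ones.
RENAMED_FLAGS = {
    "--max-sentences": "--batch-size",  # rename listed in the 0.10.0 notes
}


def migrate_args(argv):
    """Return a copy of argv with renamed flags replaced.

    Handles both "--flag value" and "--flag=value" spellings; all other
    arguments are passed through unchanged.
    """
    migrated = []
    for arg in argv:
        name, sep, value = arg.partition("=")
        migrated.append(RENAMED_FLAGS.get(name, name) + sep + value)
    return migrated


# Made-up command line for illustration.
old_command = ["fairseq-train", "data-bin", "--max-sentences", "32", "--max-tokens=4096"]
new_command = migrate_args(old_command)
print(" ".join(new_command))
```

The other breaking changes in this list (output format, module moves, criterion interfaces) are not simple renames and would need to be adapted by hand.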

    Minor interface changes:

    • Added FairseqTask::begin_epoch hook (122fc1db49534a5ca295fcae1b362bbd6308c32f)
    • FairseqTask::build_generator interface changed (cd2555a429b5f17bc47260ac1aa61068d9a43db8)
    • Change RobertaModel base class to FairseqEncoder (307df5604131dc2b93cc0a08f7c98adbfae9d268)
    • Expose FairseqOptimizer.param_groups property (8340b2d78f2b40bc365862b24477a0190ad2e2c2)
    • Deprecate --fast-stat-sync and replace with FairseqCriterion::logging_outputs_can_be_summed interface (fe6c2edad0c1f9130847b9a19fbbef169529b500)
    • --raw-text and --lazy-load are fully deprecated; use --dataset-impl instead
    • Mixture of expert tasks moved to examples/ (8845dcf5ff43ca4d3e733ade62ceca52f1f1d634)

    Performance improvements:

    • Use cross entropy from apex for improved memory efficiency (5065077dfc1ec4da5246a6103858641bfe3c39eb)
    • Added buffered dataloading (--data-buffer-size) (411531734df8c7294e82c68e9d42177382f362ef)
  • v0.9.0(Dec 4, 2019)

    Possibly breaking changes:

    • Set global numpy seed (4a7cd58)
    • Split in_proj_weight into separate k, v, q projections in MultiheadAttention (fdf4c3e)
    • TransformerEncoder returns namedtuples instead of dict (27568a7)

    New features:

    • Add --fast-stat-sync option (e1ba32a)
    • Add --empty-cache-freq option (315c463)
    • Support criterions with parameters (ba5f829)

    New papers:

    • Simple and Effective Noisy Channel Modeling for Neural Machine Translation (49177c9)
    • Levenshtein Transformer (86857a5, ...)
    • Cross+Self-Attention for Transformer Models (4ac2c5f)
    • Jointly Learning to Align and Translate with Transformer Models (1c66792)
    • Reducing Transformer Depth on Demand with Structured Dropout (dabbef4)
    • Unsupervised Cross-lingual Representation Learning at Scale (XLM-RoBERTa) (e23e5ea)
    • BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (a92bcda)
    • CamemBERT: a French BERT (b31849a)

    Speed improvements:

    • Add CUDA kernels for LightConv and DynamicConv (f840564)
    • Cythonization of various dataloading components (4fc3953, ...)
    • Don't project mask tokens for MLM training (718677e)
  • v0.8.0(Aug 14, 2019)

    Changelog:

    • Relicensed under MIT license
    • Add RoBERTa
    • Add wav2vec
    • Add WMT'19 models
    • Add initial ASR code
    • Changed torch.hub interface (generate renamed to translate)
    • Add --tokenizer and --bpe
    • f812e52: Renamed data.transforms -> data.encoders
    • 654affc: New Dataset API (optional)
    • 47fd985: Deprecate old Masked LM components
    • 5f78106: Set mmap as default dataset format and infer format automatically
    • Misc fixes for sampling
    • Misc fixes to support PyTorch 1.2
  • v0.7.2(Jul 19, 2019)

    No major API changes since the last release. Cutting a new release since we'll be merging significant (possibly breaking) changes to logging, data loading and the masked LM implementation soon.

  • v0.7.1(Jun 20, 2019)

  • v0.7.0(Jun 20, 2019)

    Notable (possibly breaking) changes:

    • d45db80: Move checkpoint utility functions from utils.py into checkpoint_utils.py
    • f2563c2: Move LM definitions into separate files
    • dffb167: Updates to model API:
      • FairseqModel -> FairseqEncoderDecoderModel
      • add FairseqDecoder.extract_features and FairseqDecoder.output_layer
      • encoder_out_dict -> encoder_out
      • rm unused remove_head functions
    • 34726d5: Move distributed_init into DistributedFairseqModel
    • cf17068: Simplify distributed launch by automatically launching multiprocessing on each node for all visible GPUs (allows launching just one job per node instead of one per GPU)
    • d45db80: Change default LR scheduler from reduce_lr_on_plateau to fixed
    • 96ac28d: Rename --sampling-temperature -> --temperature
    • fc1a19a: Deprecate dummy batches
    • a1c997b: Add memory mapped datasets
    • 0add50c: Allow cycling over multiple datasets, where each one becomes an "epoch"

    Plus many additional features and bugfixes

  • v0.6.2(Mar 15, 2019)

    Changelog:

    • 998ba4f: Add language models from Baevski & Auli (2018)
    • 4294c4f: Add mixture of experts code from Shen et al. (2019)
    • 0049349: Add example for multilingual training
    • 48d9afb: Speed improvements, including fused operators from apex
    • 44d27e6: Add Tensorboard support
    • d17fa85: Add Adadelta optimizer
    • 9e1c880: Add FairseqEncoderModel
    • b65c579: Add FairseqTask.inference_step to modularize generate.py
    • 2ad1178: Add back --curriculum
    • Misc bug fixes and other features
  • v0.6.1(Feb 9, 2019)

  • v0.6.0(Sep 26, 2018)

    Changelog:

    • 4908863: Switch to DistributedDataParallelC10d and bump version 0.5.0 -> 0.6.0
      • no more FP16Trainer, we just have an FP16Optimizer wrapper
      • most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time
      • Trainer now requires an extra dummy_batch argument at initialization, which we do fwd/bwd on when there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0
      • Trainer.train_step now takes a list of samples, which will allow cleaner --update-freq
    • 1c56b58: parallelize preprocessing
    • Misc bug fixes and features