Muzic: Music Understanding and Generation with Artificial Intelligence

Overview



Muzic is a research project on AI music that empowers music understanding and generation with deep learning and artificial intelligence. Muzic is pronounced [ˈmjuːzeik], or '谬贼客' in Chinese. In addition to its image logo, Muzic also has a video logo. Muzic was started by researchers from Microsoft Research Asia.


We summarize the scope of our Muzic project in the following figure:


The current work in Muzic includes:

Requirements

The operating system is Linux. We tested on Ubuntu 16.04.6 LTS with Python 3.6.12. The requirements for running Muzic are listed in requirements.txt. To install them, run:

pip install -r requirements.txt
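
Before installing, it may help to confirm that the local environment resembles the tested one. The sketch below is only an illustration: the helper name `check_environment` is hypothetical, and the tested values (Linux, Python 3.6) are taken from the paragraph above. Other environments may still work; the check only reports differences.

```python
import platform
import sys


def check_environment(system=None, version=None):
    """Return warnings describing how the environment differs from the
    tested setup (Linux, Python 3.6). Hypothetical helper, not part of Muzic."""
    system = platform.system() if system is None else system
    version = tuple(sys.version_info[:2]) if version is None else tuple(version)
    warnings = []
    if system != "Linux":
        warnings.append(f"tested on Linux, but running on {system}")
    if version != (3, 6):
        warnings.append(
            f"tested with Python 3.6, but running Python {version[0]}.{version[1]}"
        )
    return warnings


if __name__ == "__main__":
    # Print one line per difference; print nothing if the setup matches.
    for message in check_environment():
        print("warning:", message)
```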

We initially release the code of five research works: MusicBERT, PDAugment, DeepRapper, SongMASS, and TeleMelody. The README in each corresponding folder gives detailed usage instructions.

Reference

If you find the Muzic project useful in your work, please consider citing the following papers:

  • MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training, Mingliang Zeng, Xu Tan, Rui Wang, Zeqian Ju, Tao Qin, Tie-Yan Liu, ACL 2021.
  • PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription, Chen Zhang, Jiaxing Yu, Luchin Chang, Xu Tan, Jiawei Chen, Tao Qin, Kejun Zhang, arXiv 2021.
  • DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling, Lanqing Xue, Kaitao Song, Duocai Wu, Xu Tan, Nevin L. Zhang, Tao Qin, Wei-Qiang Zhang, Tie-Yan Liu, ACL 2021.
  • SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint, Zhonghao Sheng, Kaitao Song, Xu Tan, Yi Ren, Wei Ye, Shikun Zhang, Tao Qin, AAAI 2021.
  • TeleMelody: Lyric-to-Melody Generation with a Template-Based Two-Stage Method, Zeqian Ju, Peiling Lu, Xu Tan, Rui Wang, Chen Zhang, Songruoyao Wu, Kejun Zhang, Xiangyang Li, Tao Qin, Tie-Yan Liu, arXiv 2021.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

Comments
  • [MusicBERT]: Could not infer model type from Namespace (eval_genre.py)

    Hello!

    I'm trying to run the evaluation script for the genre classification task using the command python -u eval_genre.py checkpoints/checkpoint_last_musicbert_small.pt topmagd_data_bin/x, and I'm getting the error below when running RobertaModel.from_pretrained:

    Traceback (most recent call last):
      File "eval_genre.py", line 39, in <module>
        user_dir='musicbert'
      File "/home/aspil/muzic/musicbert/fairseq/fairseq/models/roberta/model.py", line 251, in from_pretrained
        **kwargs,
      File "/home/aspil/muzic/musicbert/fairseq/fairseq/hub_utils.py", line 75, in from_pretrained
        arg_overrides=kwargs,
      File "/home/aspil/muzic/musicbert/fairseq/fairseq/checkpoint_utils.py", line 353, in load_model_ensemble_and_task
        model = task.build_model(cfg.model)
      File "/home/aspil/muzic/musicbert/fairseq/fairseq/tasks/fairseq_task.py", line 567, in build_model
        model = models.build_model(args, self)
      File "/home/aspil/muzic/musicbert/fairseq/fairseq/models/__init__.py", line 93, in build_model
        + model_type
    AssertionError: Could not infer model type from Namespace(_name='roberta_small', activation_dropout=0.0, activation_fn='gelu', adam_betas='(0.9,0.98)', adam_eps=1e-06, all_gather_list_size=16384, arch='roberta_small', attention_dropout=0.1, azureml_logging=False, batch_size=8, batch_size_valid=8, best_checkpoint_metric='loss', bf16=False, bpe='gpt2', broadcast_buffers=False, bucket_cap_mb=25, checkpoint_shard_count=1, checkpoint_suffix='_bar_roberta_small', clip_norm=0.0, cpu=False, criterion='masked_lm', curriculum=0, data='topmagd_data_bin/0/input0', data_buffer_size=10, dataset_impl=None, ddp_backend='c10d', device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_port=-1, distributed_rank=0, distributed_world_size=8, distributed_wrapper='DDP', dropout=0.1, empty_cache_freq=0, encoder_attention_heads=8, encoder_embed_dim=512, encoder_ffn_embed_dim=2048, encoder_layerdrop=0, encoder_layers=4, encoder_layers_to_keep=None, end_learning_rate=0.0, eos=2, fast_stat_sync=False, find_unused_parameters=False, finetune_from_model=None, fix_batches_to_gpus=False, fixed_validation_seed=None, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, freq_weighted_replacement=False, gen_subset='test', heartbeat_timeout=-1, keep_best_checkpoints=-1, keep_interval_updates=-1, keep_last_epochs=-1, leave_unmasked_prob=0.1, load_checkpoint_heads=True, load_checkpoint_on_all_dp_ranks=False, localsgd_frequency=3, log_format='simple', log_interval=100, lr=[0.0005], lr_scheduler='polynomial_decay', mask_multiple_length=1, mask_prob=0.15, mask_stdev=0.0, mask_whole_words=False, max_epoch=0, max_positions=8192, max_tokens=None, max_tokens_valid=None, max_update=125000, maximize_best_checkpoint_metric=False, memory_efficient_bf16=False, memory_efficient_fp16=False, min_loss_scale=0.0001, model_parallel_size=1, 
no_epoch_checkpoints=False, no_last_checkpoints=False, no_progress_bar=False, no_save=False, no_save_optimizer_state=False, no_seed_provided=False, nprocs_per_node=8, num_shards=1, num_workers=1, optimizer='adam', optimizer_overrides='{}', pad=1, patience=-1, pipeline_balance=None, pipeline_checkpoint='never', pipeline_chunks=0, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_devices=None, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_model_parallel=False, pooler_activation_fn='tanh', pooler_dropout=0.0, power=1.0, profile=False, quant_noise_pq=0, quant_noise_pq_block_size=8, quant_noise_scalar=0, quantization_config_path=None, random_token_prob=0.1, required_batch_size_multiple=8, required_seq_len_multiple=1, reset_dataloader=False, reset_logging=True, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoints/checkpoint_last_bar_roberta_small.pt', sample_break_mode='complete', save_dir='checkpoints', save_interval=1, save_interval_updates=0, scoring='bleu', seed=1, sentence_avg=False, shard_id=0, shorten_data_split_list='', shorten_method='none', skip_invalid_size_inputs_valid_test=False, slowmo_algorithm='LocalSGD', slowmo_momentum=None, spectral_norm_classification_head=False, stop_min_lr=-1.0, stop_time_hours=0, task='masked_lm', tensorboard_logdir=None, threshold_loss_scale=None, tokenizer=None, tokens_per_sample=8192, total_num_update='125000', tpu=False, train_subset='train', unk=3, untie_weights_roberta=False, update_freq=[4], use_bmuf=False, use_old_adam=False, user_dir='musicbert', valid_subset='valid', validate_after_updates=0, validate_interval=1, validate_interval_updates=0, wandb_project=None, warmup_updates=25000, weight_decay=0.01, zero_sharding='none'). Available models: dict_keys(['transformer_lm', 'wav2vec', 'wav2vec2', 'wav2vec_ctc', 'wav2vec_seq2seq']) Requested model type: roberta_small
    

    Environment

    • python: Python 3.6.13 :: Anaconda, Inc.
    • fairseq: git+https://github.com/pytorch/[email protected]#egg=fairseq

    Thanks in advance!

    Edit: When running the above command with the base checkpoint, I get the following:

    Traceback (most recent call last):
      File "eval_genre.py", line 39, in <module>
        user_dir='musicbert'
      File "/home/aspil/anaconda3/envs/musicbert_01/lib/python3.6/site-packages/fairseq/models/roberta/model.py", line 251, in from_pretrained
        **kwargs,
      File "/home/aspil/anaconda3/envs/musicbert_01/lib/python3.6/site-packages/fairseq/hub_utils.py", line 75, in from_pretrained
        arg_overrides=kwargs,
      File "/home/aspil/anaconda3/envs/musicbert_01/lib/python3.6/site-packages/fairseq/checkpoint_utils.py", line 355, in load_model_ensemble_and_task
        model.load_state_dict(state["model"], strict=strict, model_cfg=cfg.model)
      File "/home/aspil/anaconda3/envs/musicbert_01/lib/python3.6/site-packages/fairseq/models/fairseq_model.py", line 115, in load_state_dict
        return super().load_state_dict(new_state_dict, strict)
      File "/home/aspil/anaconda3/envs/musicbert_01/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for RobertaModel:
            Unexpected key(s) in state_dict: "encoder.sentence_encoder.downsampling.0.weight", "encoder.sentence_encoder.downsampling.0.bias", "encoder.sentence_encoder.upsampling.0.weight", "encoder.sentence_encoder.upsampling.0.bias".
    

    I don't know if I messed up something, I'd appreciate any help!

    opened by aspil 13
  • Missing argument

    When I run bash generate.sh, the function get_sentence_pinyin_finals() is called with only raw_text, but it is defined with two parameters:

    TypeError: get_sentence_pinyin_finals() missing 1 required positional argument: 'invalids_finals'
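
The error above is what Python raises when a function defined with two required parameters is called with one. The sketch below reproduces it with a stand-in: the body of `get_sentence_pinyin_finals` here is hypothetical (it simply filters characters) and is not the project's implementation; only the parameter names come from the report.

```python
def get_sentence_pinyin_finals(raw_text, invalids_finals):
    """Hypothetical stand-in: keep characters not listed in invalids_finals."""
    return [ch for ch in raw_text if ch not in invalids_finals]


def call_with_one_argument(raw_text):
    """Call the function the way the report describes and return the error text."""
    try:
        get_sentence_pinyin_finals(raw_text)  # second argument omitted
    except TypeError as err:
        return str(err)
    return None


message = call_with_one_argument("abc")
print(message)

# Supplying the second argument avoids the error.
print(get_sentence_pinyin_finals("abc", invalids_finals={"b"}))
```

Whether the right fix is to pass the second argument at the call site or to give the parameter a default depends on what the real function expects, which this sketch does not settle.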

    opened by pikapi111 10
  • [teleMelody] How to import lyric to the generated midi sample?

    Hi @jzq2000, I have a quick question about inference. On the example page https://ai-muzic.github.io/telemelody/, I notice that each sample contains both media and lyrics. From the code, we can generate the MIDI file given the lyrics. How do we match them together? Thank you!

    opened by elricwan 9
  • musicBert SMALL base MODEL error

    When I run $ bash train_mask.sh lmd_full small with the provided small model, fairseq raises the following error:

    -- Process 3 terminated with the following error:
    Traceback (most recent call last):
      File "/usr/local/anaconda3/envs/pyg/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
        fn(i, *args)
      File "/mmu_ssd/chenjunmin/apps/fairseq-0.10.0/fairseq/distributed_utils.py", line 270, in distributed_main
        main(args, **kwargs)
      File "/mmu_ssd/chenjunmin/apps/fairseq-0.10.0/fairseq_cli/train.py", line 114, in main
        disable_iterator_cache=task.has_sharded_data("train"),
      File "/mmu_ssd/chenjunmin/apps/fairseq-0.10.0/fairseq/checkpoint_utils.py", line 193, in load_checkpoint
        reset_meters=reset_meters,
      File "/mmu_ssd/chenjunmin/apps/fairseq-0.10.0/fairseq/trainer.py", line 279, in load_checkpoint
        state = checkpoint_utils.load_checkpoint_to_cpu(filename)
      File "/mmu_ssd/chenjunmin/apps/fairseq-0.10.0/fairseq/checkpoint_utils.py", line 232, in load_checkpoint_to_cpu
        state = _upgrade_state_dict(state)
      File "/mmu_ssd/chenjunmin/apps/fairseq-0.10.0/fairseq/checkpoint_utils.py", line 436, in _upgrade_state_dict
        registry.set_defaults(state["args"], tasks.TASK_REGISTRY[state["args"].task])
    AttributeError: 'NoneType' object has no attribute 'task'

    opened by jammyWolf 7
  • [MusicBERT] Restriction to 1002 octuples when using `preprocess.encoding_to_str`

    Hi once again!

    While preprocessing a MIDI file, I noticed that the MIDI_to_encoding method performs as intended and converts the sample song to 106 bars as seen in the snip below of the resultant octuples (please correct me if I'm wrong).

    However, the result of the encoding_to_str method is restricted to just 18 bars (as is evident from the highlighted <0-18> near the end of the encoded string in the snip below):

    image

    More generally, what I have noticed with multiple MIDI files is that only up to the first 1000 note octuples are considered (i.e., start token octuple + 1000 note octuples + end token octuple = 1002 * 8 = 8016 tokens):

    image
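
The token count quoted above can be reproduced with a few lines of arithmetic. This is a sketch under the assumption stated in the post, namely that each octuple is written as 8 tokens and that one start and one end octuple are added; the function name is hypothetical. For comparison only, the checkpoint Namespace quoted in an earlier issue lists tokens_per_sample=8192, which would correspond to 1024 octuples under the same assumption; whether that limit is related to the truncation is not established here.

```python
TOKENS_PER_OCTUPLE = 8  # assumption from the post: each octuple is 8 tokens


def encoded_token_count(note_octuples, start_octuples=1, end_octuples=1):
    """Total tokens for a sequence of note octuples plus start/end octuples."""
    return (start_octuples + note_octuples + end_octuples) * TOKENS_PER_OCTUPLE


print(encoded_token_count(1000))   # 8016
print(8192 // TOKENS_PER_OCTUPLE)  # 1024
```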

    Is there any way to change encoding_to_str to get the whole song instead? I mean up to 256 bars only, since the model vocabulary is also restricted to 256 bars. I am not familiar enough with miditoolkit or mido to understand the code properly yet; otherwise I would have tried to fix this myself.

    Thanks in advance!

    Edit: I am aware that the musicbert_base model can support only up to 8192 octuples (i.e., the final input to the MusicBERT encoder), but I don't think that is the issue here.

    opened by tripathiarpan20 6
  • Problems for inference stage

    I succeeded in the data and training stages but ran into some problems during inference (image). I met a similar problem in the training stage and avoided it by using a single GPU. However, that did not work for the inference stage. My environment is Ubuntu 16.04, CUDA 10.2, and Python 3.6.12; everything else follows requirements.txt. I look forward to your reply.

    opened by DrWelles 6
  • Cannot install dependencies: No matching distribution found for fairseq==0.10.2

    I cannot install the dependencies when I run pip install -r requirements.txt. The main error message is as follows:

    ...
    ERROR: Could not find a version that satisfies the requirement fairseq==0.10.2 (from versions: 0.6.1, 0.6.2, 0.7.1, 0.7.2, 0.8.0, 0.9.0, 0.10.0, 0.10.1, 0.10.2)
    ERROR: No matching distribution found for fairseq==0.10.2
    

    My development environment:

    • python 3.9.7
    • pip 21.2.4
    • macos 11.5.2
    opened by Bin-Huang 6
  • [MusicBERT]: How to fill masked tokens in an input sequence after training?

    Hello again,

    I have fine-tuned MusicBERT on masked language modeling using a custom dataset. I have loaded the fine-tuned checkpoint using:

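    import sys

    from fairseq.models.roberta import RobertaModel

    roberta = RobertaModel.from_pretrained(  # MusicBERTModel.from_pretrained also works
        '.',
        checkpoint_file=sys.argv[1],  # path to the fine-tuned checkpoint
        data_name_or_path=sys.argv[2],  # binarized data directory
        user_dir='musicbert'
    )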
    

    What I want to do is to give it an input sequence, mask one or more tokens before passing the input to the model and somehow predict them. Something like masked language modeling, but with control over which tokens I want to mask and predict.

    What I cannot understand is what format the input sequence should be in order to be passed to the model, and how to make the model predict the masked tokens in the input. I have tried to replicate it by looking at the fairseq's training code since I want to do something similar, but it's too complicated.
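
The control flow being asked about, choosing which positions to mask and then filling them with the best-scoring candidate, can be sketched independently of the model. The example below does not call fairseq or MusicBERT and does not answer the input-format question: the function names are hypothetical, the tokens only imitate the `<0-18>` style seen in another issue, and the scores are made up to stand in for model output.

```python
MASK = "<mask>"


def mask_positions(tokens, positions):
    """Return a copy of tokens with the chosen positions replaced by MASK."""
    chosen = set(positions)
    return [MASK if i in chosen else tok for i, tok in enumerate(tokens)]


def fill_masks(masked_tokens, scores_per_position):
    """Fill each MASK with the highest-scoring candidate for that position.

    scores_per_position maps a position to {candidate_token: score}; in a real
    setup these scores would come from the model's output for that position.
    """
    filled = list(masked_tokens)
    for i, tok in enumerate(masked_tokens):
        if tok == MASK:
            candidates = scores_per_position[i]
            filled[i] = max(candidates, key=candidates.get)
    return filled


tokens = ["<0-1>", "<1-0>", "<2-3>", "<3-60>"]  # illustrative tokens only
masked = mask_positions(tokens, [3])
scores = {3: {"<3-60>": 2.5, "<3-62>": 1.0}}  # made-up scores
print(masked)
print(fill_masks(masked, scores))
```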

    Thanks in advance.

    opened by aspil 5
  • [teleMelody] How do lyrics and chords correspond to each other?

    Hi there, in the test example, how do the lyrics and chords correspond to each other? I thought every [sep] represented one chord, but the numbers do not match. If we match every word to one chord, the numbers do not match either. Here is an example. en:

    this thing called love and i just [sep] han -dle it [sep] this thing called love and i must get [sep] round to it [sep] i rea -dy [sep]
    
    Eb:maj7 Eb:maj7 C:dim C:dim C:dim C:dim C:dim C:dim
    

    ch:

    斑 驳 的 夜 色 在 说 什 么 [sep] 谁 能 告 诉 我 如 何 选 择 [sep] 每 当 我 想 起 分 离 时 刻 [sep] 悲 伤 就 逆 流 成 河 [sep] 你 给 的 温 暖 属 于 谁 呢 [sep] 谁 又 会 在 乎 我 是 谁 呢 [sep] 每 当 我 想 起 你 的 选 择 [sep] 悲 伤 就 逆 流 成 河 [sep] 失 去 了 你 也 是 种 获 得 [sep] 一 个 人 孤 单 未 尝 不 可 [sep] 每 当 我 深 夜 辗 转 反 侧 [sep] 悲 伤 就 逆 流 成 河 [sep] 离 开 你 也 是 一 种 快 乐 [sep] 没 人 说 一 定 非 爱 不 可 [sep] 想 问 你 双 手 是 否 温 热 [sep] 悲 伤 就 逆 流 成 河 [sep] 我 想 是 因 为 我 太 天 真 [sep] 难 过 是 因 为 我 太 认 真 [sep] 每 当 我 想 起 你 的 眼 神 [sep] 悲 伤 就 逆 流 成 河 [sep]
    
    C:m7 C:m7 G:m7 Bb: C:m7 F:m7 C:m7 C:m7 G:m7 Bb: C:m7 F:m7 C:m C:m G:m7 F:m7 C:m7 F:m7 C:m C:m G:m7 C:m7 C:m7 F:m7 C:m7 C:m7 G:m7 G:m7 C:m F:m7 C:m7
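
The mismatch described above can be checked by counting, shown here for the English example. This sketch only counts whitespace-separated items; the function name is hypothetical, and it makes no claim about how TeleMelody actually aligns chords with lyrics.

```python
lyrics = (
    "this thing called love and i just [sep] han -dle it [sep] "
    "this thing called love and i must get [sep] round to it [sep] "
    "i rea -dy [sep]"
)
chords = "Eb:maj7 Eb:maj7 C:dim C:dim C:dim C:dim C:dim C:dim"


def count_units(lyrics, chords, sep="[sep]"):
    """Count [sep] markers, remaining lyric tokens, and chord symbols."""
    tokens = lyrics.split()
    return {
        "segments": tokens.count(sep),
        "lyric_tokens": sum(1 for tok in tokens if tok != sep),
        "chords": len(chords.split()),
    }


print(count_units(lyrics, chords))
# {'segments': 5, 'lyric_tokens': 24, 'chords': 8}
```

Neither the 5 segments nor the 24 lyric tokens equals the 8 chords, which matches the observation in the question.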
    
    opened by elricwan 4
  • Bump pillow from 8.3.1 to 9.0.1 in /relyme

    Bumps pillow from 8.3.1 to 9.0.1.

    Release notes

    Sourced from pillow's releases.

    9.0.1

    https://pillow.readthedocs.io/en/stable/releasenotes/9.0.1.html

    Changes

    • In show_file, use os.remove to remove temporary images. CVE-2022-24303 #6010 [@radarhere, @hugovk]
    • Restrict builtins within lambdas for ImageMath.eval. CVE-2022-22817 #6009 [radarhere]

    9.0.0

    https://pillow.readthedocs.io/en/stable/releasenotes/9.0.0.html

    Changes

    ... (truncated)

    Changelog

    Sourced from pillow's changelog.

    9.0.1 (2022-02-03)

    • In show_file, use os.remove to remove temporary images. CVE-2022-24303 #6010 [radarhere, hugovk]

    • Restrict builtins within lambdas for ImageMath.eval. CVE-2022-22817 #6009 [radarhere]

    9.0.0 (2022-01-02)

    • Restrict builtins for ImageMath.eval(). CVE-2022-22817 #5923 [radarhere]

    • Ensure JpegImagePlugin stops at the end of a truncated file #5921 [radarhere]

    • Fixed ImagePath.Path array handling. CVE-2022-22815, CVE-2022-22816 #5920 [radarhere]

    • Remove consecutive duplicate tiles that only differ by their offset #5919 [radarhere]

    • Improved I;16 operations on big endian #5901 [radarhere]

    • Limit quantized palette to number of colors #5879 [radarhere]

    • Fixed palette index for zeroed color in FASTOCTREE quantize #5869 [radarhere]

    • When saving RGBA to GIF, make use of first transparent palette entry #5859 [radarhere]

    • Pass SAMPLEFORMAT to libtiff #5848 [radarhere]

    • Added rounding when converting P and PA #5824 [radarhere]

    • Improved putdata() documentation and data handling #5910 [radarhere]

    • Exclude carriage return in PDF regex to help prevent ReDoS #5912 [hugovk]

    • Fixed freeing pointer in ImageDraw.Outline.transform #5909 [radarhere]

    ... (truncated)

    Commits
    • 6deac9e 9.0.1 version bump
    • c04d812 Update CHANGES.rst [ci skip]
    • 4fabec3 Added release notes for 9.0.1
    • 02affaa Added delay after opening image with xdg-open
    • ca0b585 Updated formatting
    • 427221e In show_file, use os.remove to remove temporary images
    • c930be0 Restrict builtins within lambdas for ImageMath.eval
    • 75b69dd Dont need to pin for GHA
    • cd938a7 Autolink CWE numbers with sphinx-issues
    • 2e9c461 Add CVE IDs
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 4
  • Bump numpy from 1.21.3 to 1.22.0 in /relyme

    Bumps numpy from 1.21.3 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 4
  • [Museformer] Could not override 'task.dataset_impl'

    Hello, following the instructions in README.md, I created a virtual environment in Linux on WSL 2, downloaded the checkpoint, and ran tgen/generation__mf-lmd6remi-x.sh, which fails with the error hydra.errors.ConfigCompositionException: Could not override 'task.dataset_impl'. To append to your config use +task.dataset_impl=null

    I'm running the model on a 3080 Ti

    opened by artyomche9 1
  • 【meloform】why all token id in dictionary is 0?

    Notice that the gen_dictionary function uses the variable 'num' to represent each token's id, but 'num' stays 0 for the whole process, so every token in the dictionary has id 0. Is that a bug, or is it intended? image image
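
One possible reading, offered as an assumption rather than a confirmed answer: if the generated file follows fairseq's dictionary text format, each line is "symbol count", and a token's index comes from its line order rather than from the number on the line. The sketch below imitates that reading in plain Python without importing fairseq; the function name is hypothetical, and the special-symbol order matches the pad=1, eos=2, unk=3 values shown in the Namespace quoted in an earlier issue.

```python
SPECIAL_SYMBOLS = ["<s>", "<pad>", "</s>", "<unk>"]  # indices 0 to 3


def load_dictionary_lines(lines):
    """Assign indices by line order; treat the number on each line as a count."""
    symbol_to_index = {sym: i for i, sym in enumerate(SPECIAL_SYMBOLS)}
    counts = {}
    for line in lines:
        symbol, count = line.rsplit(" ", 1)
        symbol_to_index[symbol] = len(symbol_to_index)
        counts[symbol] = int(count)
    return symbol_to_index, counts


index, counts = load_dictionary_lines(["<0-0> 0", "<0-1> 0", "<1-0> 0"])
print(index["<0-0>"], index["<0-1>"], index["<1-0>"])  # 4 5 6
print(counts)
```

Under that assumption, a constant 0 on every line would only mean that no frequency counts were recorded, while the tokens would still receive distinct indices.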

    opened by punkcure 1
  • 【museformer】TypeError: forward() got multiple values for argument 'key_padding_mask'

    Hi, I ran bash ttrain/mf-lmd6remi-1.sh and bash tval/val__mf-lmd6remi-x.sh 1 checkpoint_best.pt 10240 in museformer, and got the following:

    2022-11-19 13:04:11 | WARNING | fairseq.tasks.fairseq_task | 844 samples have invalid sizes and will be skipped, max_positions=1024, first few sample ids=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    2022-11-19 13:04:11 | INFO | fairseq.trainer | begin training epoch 1
    Traceback (most recent call last):
      File "/home/hyc/anaconda3/envs/musefo/bin/fairseq-train", line 8, in <module>
        sys.exit(cli_main())
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq_cli/train.py", line 352, in cli_main
        distributed_utils.call_main(args, main)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq/distributed_utils.py", line 283, in call_main
        torch.multiprocessing.spawn(
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
        return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
        while not context.join():
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
        raise ProcessRaisedException(msg, error_index, failed_process.pid)
    torch.multiprocessing.spawn.ProcessRaisedException:

    -- Process 2 terminated with the following error:
    Traceback (most recent call last):
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
        fn(i, *args)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq/distributed_utils.py", line 270, in distributed_main
        main(args, **kwargs)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq_cli/train.py", line 125, in main
        valid_losses, should_stop = train(args, trainer, task, epoch_itr)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/contextlib.py", line 75, in inner
        return func(*args, **kwds)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq_cli/train.py", line 208, in train
        log_output = trainer.train_step(samples)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/contextlib.py", line 75, in inner
        return func(*args, **kwds)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq/trainer.py", line 480, in train_step
        loss, sample_size_i, logging_output = self.task.train_step(
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq/tasks/fairseq_task.py", line 416, in train_step
        loss, sample_size, logging_output = criterion(model, sample)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq/criterions/cross_entropy.py", line 35, in forward
        net_output = model(**sample["net_input"])
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
        output = self.module(*inputs[0], **kwargs[0])
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/fairseq/models/fairseq_model.py", line 481, in forward
        return self.decoder(src_tokens, **kwargs)
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/hyc/tmp/museformer/museformer/museformer_decoder.py", line 413, in forward
        x, extra = self.extract_features(
      File "/home/hyc/tmp/museformer/museformer/museformer_decoder.py", line 645, in extract_features
        (sum_x, reg_x), inner_states = self.run_layers(
      File "/home/hyc/tmp/museformer/museformer/museformer_decoder.py", line 731, in run_layers
        x, _ = layer(
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/hyc/tmp/museformer/museformer/museformer_decoder_layer.py", line 413, in forward
        x, attn = self.run_self_attn(
      File "/home/hyc/tmp/museformer/museformer/museformer_decoder_layer.py", line 486, in run_self_attn
        r, weight = self.self_attn(
      File "/home/hyc/anaconda3/envs/musefo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
    TypeError: forward() got multiple values for argument 'key_padding_mask'

    fairseq is 0.10.2, torch is 1.8.0, and Python is 3.8. Does anyone know why?
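
For readers unfamiliar with this message: Python raises it when the same parameter receives a value both positionally and by keyword. The sketch below reproduces the message with a hypothetical class; it does not identify the actual cause in Museformer. One common way to end up here is a caller and a callee whose signatures have drifted apart, for example across library versions, but that is only a guess in this case.

```python
class SelfAttention:
    """Hypothetical stand-in for an attention module."""

    def forward(self, query, key_padding_mask=None):
        return query, key_padding_mask


def call_with_duplicate_argument(attn):
    """Pass key_padding_mask twice and return the resulting error text."""
    try:
        # "mask" fills key_padding_mask positionally; the keyword supplies it again.
        attn.forward("q", "mask", key_padding_mask="mask")
    except TypeError as err:
        return str(err)
    return None


message = call_with_duplicate_argument(SelfAttention())
print(message)
```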

    opened by hhhyc333 1
  • Error when running ttrain/mf-lmd6remi-1.sh 


    I'm currently trying to get Museformer running.

    I don't know how to deal with an error in the model training stage. (I fixed the errors leading up to this point.)

    ttrain/mf-lmd6remi-1.sh

    ...................

    
    2022-11-11 07:08:28 | WARNING | fairseq.tasks.fairseq_task | 290 samples have invalid sizes and will be skipped, max_positions=1024, first few sample ids=[0, 1, 2, 4, 5, 6, 7, 8, 9, 10]
    2022-11-11 07:08:28 | INFO | fairseq.trainer | begin training epoch 1
    Traceback (most recent call last):
      File "/usr/local/bin/fairseq-train", line 8, in <module>
        sys.exit(cli_main())
      File "/usr/local/lib/python3.8/dist-packages/fairseq_cli/train.py", line 352, in cli_main
        distributed_utils.call_main(args, main)
      File "/usr/local/lib/python3.8/dist-packages/fairseq/distributed_utils.py", line 301, in call_main
        main(args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/fairseq_cli/train.py", line 125, in main
        valid_losses, should_stop = train(args, trainer, task, epoch_itr)
      File "/usr/lib/python3.8/contextlib.py", line 75, in inner
        return func(*args, **kwds)
      File "/usr/local/lib/python3.8/dist-packages/fairseq_cli/train.py", line 208, in train
        log_output = trainer.train_step(samples)
      File "/usr/lib/python3.8/contextlib.py", line 75, in inner
        return func(*args, **kwds)
      File "/usr/local/lib/python3.8/dist-packages/fairseq/trainer.py", line 480, in train_step
        loss, sample_size_i, logging_output = self.task.train_step(
      File "/usr/local/lib/python3.8/dist-packages/fairseq/tasks/fairseq_task.py", line 416, in train_step
        loss, sample_size, logging_output = criterion(model, sample)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/fairseq/criterions/cross_entropy.py", line 35, in forward
        net_output = model(**sample["net_input"])
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/fairseq/models/fairseq_model.py", line 481, in forward
        return self.decoder(src_tokens, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "muzic/museformer/museformer/museformer_decoder.py", line 413, in forward
        x, extra = self.extract_features(
      File "muzic/museformer/museformer/museformer_decoder.py", line 645, in extract_features
        (sum_x, reg_x), inner_states = self.run_layers(
      File "/content/muzic/museformer/museformer/museformer_decoder.py", line 731, in run_layers
        x, _ = layer(
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "muzic/museformer/museformer/museformer_decoder_layer.py", line 413, in forward
        x, attn = self.run_self_attn(
      File "muzic/museformer/museformer/museformer_decoder_layer.py", line 486, in run_self_attn
        r, weight = self.self_attn(
    TypeError: 'NotImplementedError' object is not callable
    
    

    I have confirmed that the environments match.

    tensorboardX 2.2, Python 3.8.15, fairseq 0.10.2, CUDA 11.3
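As a general illustration of the final message in the traceback: `'NotImplementedError' object is not callable` appears when a name that is expected to hold a callable instead holds an exception instance, for example when a factory returns the exception rather than raising it. The `build_attention` function and the implementation names below are hypothetical and are not taken from the Museformer code, so this is a sketch of the mechanism under that assumption and not a confirmed diagnosis.

```python
def build_attention(impl):
    # Hypothetical factory: returns a callable for a known implementation,
    # but *returns* (instead of raising) an exception for anything else.
    if impl == "dense":
        return lambda x: x
    return NotImplementedError(impl)


layer = build_attention("unknown_impl")

message = None
try:
    # Calling the exception instance is what triggers the TypeError.
    layer("x")
except TypeError as err:
    message = str(err)

print(message)
```

If this is the pattern at work, printing `self.self_attn` (or its `args`) just before the failing call would show which option or implementation was not recognized.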

    opened by taktak1 5
Releases(DeepRapper-v1.0)
Owner
Microsoft
Open source projects and samples from Microsoft