MWPToolkit is a PyTorch-based toolkit for Math Word Problem (MWP) solving.

Overview

MWPToolkit is a PyTorch-based toolkit for Math Word Problem (MWP) solving. It is a comprehensive research framework that integrates popular MWP benchmark datasets and typical deep learning-based MWP algorithms.

Our framework has the following architecture. You can use the toolkit to evaluate the built-in datasets, process your own raw data, or develop your own models.

Figure: The Overall Framework of MWP Toolkit

Characteristics

  • Unification and Modularization. We decouple solvers with different model architectures into highly modularized, reusable components and integrate them in a unified framework, which includes data, model, and evaluation modules. This makes it convenient to study MWPs at a conceptual level and to compare different models fairly.
  • Comprehensiveness and Standardization. MWPToolkit has deployed the popular benchmark datasets and models for MWP solving, covering Seq2Seq, Seq2Tree, Graph2Tree, and Pre-trained Language Models. Moreover, some tricks used in general NLP tasks, such as hyper-parameter tuning, are integrated. As all models can be run with the same experimental configuration, the evaluation of different models is standardized.
  • Extensibility and Usability. MWPToolkit provides user-friendly interfaces for various functions and modules. The components in the pipeline architecture are modeled as exchangeable modules. You can try different combinations of modules by simply changing the configuration file or command line. You can also easily develop your own models by replacing or extending the corresponding modules with your proposed ones.

Installation

Development environment:

python >= 3.6.0
pytorch >= 1.5.0
pyltp >= 0.2.1 (optional)

Method 1: Install from pip

pip install mwptoolkit

Method 2: Install from source

# Clone current repo
git clone https://github.com/LYH-YF/MWPToolkit.git && cd MWPToolkit

# Requirements
pip install -r requirements.txt

Quick Start

Evaluate a built-in dataset with a model

To have an initial trial of our toolkit, you can use the provided cmd script:

python run_mwptoolkit.py --model=GTS --dataset=math23k --task_type=single_equation --equation_fix=prefix --k_fold=5 --test_step=5 --gpu_id=0

The script above runs the GTS model on the Math23K dataset with 5-fold cross-validation. It will take around xx minutes to train 5 GTS models independently and output the average equation accuracy and value accuracy. The training log can be found in the log file.
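The cross-validation procedure described above can be pictured with a minimal sketch. It is not the toolkit's implementation: `k_fold_indices` and `average_scores` are hypothetical helpers, and the way samples are assigned to folds here is an assumption made for illustration.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k (train, test) index pairs.

    Hypothetical helper: fold i takes every k-th sample starting at i as test data.
    """
    folds = []
    for i in range(k):
        test = [j for j in range(n_samples) if j % k == i]
        train = [j for j in range(n_samples) if j % k != i]
        folds.append((train, test))
    return folds


def average_scores(fold_scores):
    """Average (equation accuracy, value accuracy) pairs over all folds."""
    equ_acc = mean(score[0] for score in fold_scores)
    val_acc = mean(score[1] for score in fold_scores)
    return equ_acc, val_acc


if __name__ == "__main__":
    for fold_id, (train, test) in enumerate(k_fold_indices(n_samples=10, k=5)):
        # One model would be trained on `train` and evaluated on `test` per fold.
        print(fold_id, len(train), len(test))
```

Each sample lands in exactly one test fold, so every fold trains an independent model and the reported score is the mean over folds.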

If you would like to change the parameters, such as the dataset and model, please refer to the following instructions:

  • model: The name of the model to apply. You can see all options in Section Model.
  • dataset: The name of the dataset to evaluate. You can see all options in Section Dataset.
  • task_type: The type of generated equation. It should be chosen from the options [single_equation | multi_equation]. It usually depends on the dataset; you can refer to Section Dataset. A single-equation dataset corresponds to 'single_equation' in code and a multiple-equation dataset corresponds to 'multi_equation' in code.
  • equation_fix: The type of equation generation order. It should be chosen from the options [infix | postfix | prefix]. Please note that some models require a specific equation generation order. Usually, the corresponding paper for the model mentions which order it takes. You can look up the reference paper in Section Model.
  • k_fold: The number of folds for cross-validation. It can be either an NA value or an integer. If it is an NA value, the train-valid-test split procedure is run instead.
  • test_step: The number of training epochs after which evaluation on the test set is conducted. It should be an integer.
  • gpu_id: The GPU ID for training the model. It should be an integer based on your GPU configuration. Please note that we haven't tested the framework with multiple GPUs yet.

Please note that model, dataset and task_type are required. We also provide an interface where you can configure your experiments by clicking options, and the corresponding command lines are generated automatically.
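To illustrate the three orders accepted by equation_fix, the sketch below converts one prefix expression to infix and postfix. The function names are illustrative and are not part of MWPToolkit, and only the four basic binary operators are handled.

```python
OPERATORS = {"+", "-", "*", "/"}


def prefix_to_tree(tokens):
    """Parse a prefix token list into a nested (op, left, right) tuple."""
    pos = 0

    def parse():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token in OPERATORS:
            left = parse()
            right = parse()
            return (token, left, right)
        return token  # a number or a variable

    tree = parse()
    if pos != len(tokens):
        raise ValueError("trailing tokens in prefix expression")
    return tree


def to_infix(tree):
    """Render the tree with the operator between its fully parenthesized operands."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return f"({to_infix(left)} {op} {to_infix(right)})"
    return tree


def to_postfix(tree):
    """Render the tree as a token list with the operator after its operands."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return to_postfix(left) + to_postfix(right) + [op]
    return [tree]


if __name__ == "__main__":
    tree = prefix_to_tree("- + 47.0 40.0 25.0".split())
    print(to_infix(tree))              # ((47.0 + 40.0) - 25.0)
    print(" ".join(to_postfix(tree)))  # 47.0 40.0 + 25.0 -
```

Prefix and postfix sequences carry no parentheses because the operator position already fixes the evaluation order.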

Evaluate a new dataset

Our supported datasets are all saved under the folder 'dataset'. Besides trying our code with these built-in datasets, you can also run models on your own data by following the steps below:

Step 1: Organize your dataset. Your dataset folder (same as the dataset name) should include three json files for train, validation and test, respectively:

dataset_name
    |----trainset.json
    |----validset.json
    |----testset.json

Move your dataset folder under the path 'dataset' of our framework. The file structure should then look like:

dataset
    |----dataset_name
            |----trainset.json
            |----validset.json
            |----testset.json
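Before running anything, it can help to confirm that the folder matches the layout above. The following is a small sketch; `missing_dataset_files` is a hypothetical helper, not a toolkit function, and it checks only file presence, not file contents.

```python
from pathlib import Path

# The three files named in the layout above.
REQUIRED_FILES = ("trainset.json", "validset.json", "testset.json")


def missing_dataset_files(dataset_dir):
    """Return the required JSON files that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # 'dataset/dataset_name' is the placeholder path used in this section.
    missing = missing_dataset_files("dataset/dataset_name")
    print("missing files:", missing if missing else "none")
```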

Step 2: Set up your dataset configuration. The dataset configuration files are saved under the path 'mwptoolkit/properties/dataset/'. You can write your own dataset configuration and save it as a JSON file under that path. The path to your JSON file should be mwptoolkit/properties/dataset/dataset_name.json

Step 3: Run the code!

python run_mwptoolkit.py --model=[model_name] --dataset=[dataset_name] --task_type=[single_equation|multi_equation] --equation_fix=[infix|postfix|prefix] --k_fold=[5|None] --gpu_id=0

Instead of moving your dataset folder and dataset configuration file to the folders above, you can set the following parameters directly.

  • dataset_path: The path to the dataset folder. The default value is 'dataset/dataset_name'. You can change it to your own dataset path by appending --dataset_path=[your_dataset] to the cmd script.
  • dataset_config_path: The path to the dataset configuration file. The default value is 'mwptoolkit/properties/dataset/dataset_name.json'. You can change it to your own dataset configuration path by appending --dataset_config_path=[your_dataset_configuration] to the cmd script.
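When the same run is launched repeatedly with different options, the command line can be assembled in Python. This is only a convenience sketch: `build_command` is a hypothetical helper, the dataset name and paths are placeholders, and the flags are the ones listed in this README.

```python
import shlex
import sys


def build_command(script="run_mwptoolkit.py", **options):
    """Assemble an argv list of the form [python, script, --key=value, ...]."""
    argv = [sys.executable, script]
    for key, value in options.items():
        argv.append(f"--{key}={value}")
    return argv


if __name__ == "__main__":
    argv = build_command(
        model="GTS",
        dataset="my_dataset",  # placeholder dataset name
        task_type="single_equation",
        equation_fix="prefix",
        k_fold=None,
        gpu_id=0,
        dataset_path="/data/my_dataset",  # placeholder path
        dataset_config_path="/data/my_dataset/config.json",  # placeholder path
    )
    # Print a shell-safe version; subprocess.run(argv, check=True) would launch it.
    print(" ".join(shlex.quote(arg) for arg in argv))
```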

Run hyper-parameters search

Our toolkit also provides a hyper-parameter search option, which helps users find good hyper-parameters efficiently. The search is integrated into our framework via ray.tune. Because of the search procedure, training a model takes longer.

You can run the cmd script template below:

python run_hyper_search.py --model=[model_name] --dataset=[dataset_name] --task_type=[single_equation|multi_equation] --equation_fix=[infix|postfix|prefix] --k_fold=[5|None] --cpu_per_trial=2 --gpu_per_trial=0.5 --samples=1 --search_file=search_file.json --gpu_id=0
  • cpu_per_trial: The CPU resources to allocate per trial.
  • gpu_per_trial: The GPU resources to allocate per trial.
  • samples: The number of sampling times from the search space.
  • search_file: A JSON file containing the search parameter names and spaces. For example: ["embedding_size=[64,128,256]","hidden_size=[256,512]","learning_rate=(1e-4, 1e-2)"]
  • search_parameter: If you don't have a search file, you can set this parameter on the command line to specify the search space. For example: --search_parameter=hidden_size=[256,512] --search_parameter=embedding_size=[64,128,256] --search_parameter=learning_rate='(1e-4, 1e-2)'
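A search file in the format shown for search_file can be written with a few lines of Python. This is a sketch under the assumption that the file is a plain JSON list of "name=space" strings, exactly as in the example above; `write_search_file` is a hypothetical helper.

```python
import json

# Entries copied from the search_file example: one "name=space" string each.
SEARCH_SPACE = [
    "embedding_size=[64,128,256]",
    "hidden_size=[256,512]",
    "learning_rate=(1e-4, 1e-2)",
]


def write_search_file(path, entries):
    """Write the search space entries to path as a JSON list of strings."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(entries, f, indent=2)


if __name__ == "__main__":
    # The file name matches the --search_file value in the template command.
    write_search_file("search_file.json", SEARCH_SPACE)
```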

Architecture

The overall architecture of our toolkit is shown in the Figure above. The configuration is specified via the command line, external config files and internal config dictionaries. Multiple processors and dataloaders are integrated to process different forms of data. Models and evaluators take charge of training and evaluation. Input datasets are therefore prepared and used for training according to the specified configuration. You can refer to the documentation for more information.

Dataset

We have deployed 8 popular MWP datasets in our toolkit. These datasets are divided into two categories, single-equation datasets and multiple-equation datasets, as shown in the table below. We will keep adding more datasets, such as ape200k (Zhao et al., 2020), dolphin1878 (Shi et al., 2015) and dolphin18k (Huang et al., 2016).

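| task | dataset | reference |
| --- | --- | --- |
| Single-equation dataset | math23k | (Wang et al., 2017) |
| Single-equation dataset | asdiv-a | (Miao et al., 2020) |
| Single-equation dataset | mawps-single | (Koncel-Kedziorski et al., 2016) |
| Single-equation dataset | mawps_asdiv-a_svamp | (Patel et al., 2021) |
| Multiple-equation dataset | alg514 | (Kushman et al., 2014) |
| Multiple-equation dataset | draw | (Upadhyay et al., 2017) |
| Multiple-equation dataset | mawps | (Koncel-Kedziorski et al., 2016) |
| Multiple-equation dataset | hmwp | (Qin et al., 2020) |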

Model

We have deployed 18 deep learning MWP models in our toolkit. Based on their generation procedure, we categorize them into Sequence-to-sequence, Sequence-to-tree, Graph-to-tree, VAE and Pre-trained models. Please note that the Pre-trained models are simple implementations of pre-trained language models on the MWP solving task. The models are listed in the table below:

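| type | model | reference |
| --- | --- | --- |
| Seq2Seq | DNS | (Wang et al., 2017) |
| Seq2Seq | MathEN | (Wang et al., 2018) |
| Seq2Seq | Saligned | (Chiang et al., 2019) |
| Seq2Seq | GroupATT | (Li et al., 2019) |
| Seq2Seq | RNN | (Sutskever et al., 2014) |
| Seq2Seq | RNNVAE | (Zhang et al., 2016) |
| Seq2Seq | Transformer | (Vaswani et al., 2017) |
| Seq2Tree | TRNN | (Wang et al., 2019) |
| Seq2Tree | AST-Dec | (Liu et al., 2019) |
| Seq2Tree | GTS | (Xie et al., 2019) |
| Seq2Tree | SAUSolver | (Qin et al., 2020) |
| Seq2Tree | TSN | (Zhang et al., 2020) |
| Graph2Tree | Graph2Tree | (Zhang et al., 2020) |
| Graph2Tree | MultiE&D | (Shen et al., 2020) |
| Pre-trained | BertGen | (Devlin et al., 2018) |
| Pre-trained | RobertaGen | (Liu et al., 2019) |
| Pre-trained | GPT-2 | (Radford et al., 2019) |
| Updating | KA-S2T | (Wu et al., 2020) |
| Updating | HMS | (Lin et al., 2021) |
| Updating | NUM-S2T | (Wu et al., 2021) |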

Evaluation metric

We have implemented 2 evaluation metrics to measure the performance of MWP models.

| measurement | note |
| --- | --- |
| Equ acc | The predicted equation exactly matches the correct equation |
| Val acc | The predicted answer matches the correct answer |
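The difference between the two metrics can be sketched per sample as follows. These functions are illustrative only and are not the toolkit's evaluators; in particular, comparing answers within a small tolerance `tol` is an assumption made here.

```python
def equation_accuracy(predicted_tokens, target_tokens):
    """Equ acc for one sample: the token sequences must be identical."""
    return list(predicted_tokens) == list(target_tokens)


def value_accuracy(predicted_value, target_value, tol=1e-4):
    """Val acc for one sample: the numeric answers agree within tol (assumed)."""
    return abs(predicted_value - target_value) <= tol


if __name__ == "__main__":
    # Two prefix equations that differ as sequences but give the same answer.
    predicted = ["+", "40.0", "47.0"]
    target = ["+", "47.0", "40.0"]
    print(equation_accuracy(predicted, target))  # False
    print(value_accuracy(40.0 + 47.0, 47.0 + 40.0))  # True
```

A prediction can therefore count toward value accuracy without counting toward equation accuracy, for example when operands of a commutative operator are swapped.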

Experiment Results

We have run the models on the datasets that are integrated within our toolkit. All runs follow the built-in configurations, and all experiments are conducted with 5-fold cross-validation. The experiment results (Equ acc | Val acc) are displayed in the following tables.

Single-equation Task Results

| model | math23k Equ. Acc | math23k Ans. Acc | mawps-single Equ. Acc | mawps-single Ans. Acc | asdiv-a Equ. Acc | asdiv-a Ans. Acc | mawps_asdiv-a_svamp Equ. Acc | mawps_asdiv-a_svamp Ans. Acc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DNS | 57.1 | 67.5 | 78.9 | 86.3 | 63.0 | 66.2 | 22.1 | 24.2 |
| MathEN | 66.7 | 69.5 | 85.9 | 86.4 | 64.3 | 64.7 | 21.8 | 25.0 |
| Saligned | 59.1 | 69.0 | 86.0 | 86.3 | 66.0 | 67.9 | 23.9 | 26.1 |
| GroupATT | 56.7 | 66.6 | 84.7 | 85.3 | 59.5 | 61.0 | 19.2 | 21.5 |
| AttSeq | 57.1 | 68.7 | 79.4 | 87.0 | 64.2 | 68.3 | 23.0 | 25.4 |
| LSTMVAE | 59.0 | 70.0 | 79.8 | 88.2 | 64.0 | 68.7 | 23.2 | 25.9 |
| Transformer | 52.3 | 61.5 | 77.9 | 85.6 | 57.2 | 59.3 | 18.4 | 20.7 |
| TRNN | 65.0 | 68.1 | 86.0 | 86.5 | 68.9 | 69.3 | 22.6 | 26.1 |
| AST-Dec | 57.5 | 67.7 | 84.1 | 84.8 | 54.5 | 56.0 | 21.9 | 24.7 |
| GTS | 63.4 | 74.2 | 83.5 | 84.1 | 67.7 | 69.9 | 25.6 | 29.1 |
| SAU-Solver | 64.6 | 75.1 | 83.4 | 84.0 | 68.5 | 71.2 | 27.1 | 29.7 |
| TSN | 63.8 | 74.4 | 84.0 | 84.7 | 68.5 | 71.0 | 25.7 | 29.0 |
| Graph2Tree | 64.9 | 75.3 | 84.9 | 85.6 | 72.4 | 75.3 | 31.6 | 35.0 |
| MultiE&D | 65.5 | 76.5 | 83.2 | 84.1 | 70.5 | 72.6 | 29.3 | 32.4 |
| BERTGen | 64.8 | 76.6 | 79.0 | 86.9 | 68.7 | 71.5 | 22.2 | 24.8 |
| RoBERTaGen | 65.2 | 76.9 | 80.8 | 88.4 | 68.7 | 72.1 | 27.9 | 30.3 |
| GPT-2 | 63.8 | 74.3 | 75.4 | 75.9 | 59.9 | 61.4 | 22.5 | 25.7 |

Multiple-equation Task Result

| model | draw Equ. Acc | draw Ans. Acc | hmwp Equ. Acc | hmwp Ans. Acc |
| --- | --- | --- | --- | --- |
| DNS | 35.8 | 36.8 | 24.0 | 32.7 |
| MathEN | 38.2 | 39.5 | 32.4 | 43.7 |
| Saligned | 36.7 | 37.8 | 31.0 | 41.8 |
| GroupATT | 30.4 | 31.4 | 25.2 | 33.2 |
| AttSeq | 39.7 | 41.2 | 32.9 | 44.7 |
| LSTMVAE | 40.9 | 42.3 | 33.6 | 45.9 |
| Transformer | 27.1 | 28.3 | 24.4 | 32.4 |
| TRNN | 27.4 | 28.9 | 27.2 | 36.8 |
| AST-Dec | 26.0 | 26.7 | 24.9 | 32.0 |
| GTS | 38.6 | 39.9 | 33.7 | 44.6 |
| SAU-Solver | 38.4 | 39.2 | 33.1 | 43.7 |
| TSN | 39.3 | 40.4 | 34.3 | 44.9 |
| Graph2Tree | 39.8 | 41.0 | 34.4 | 45.1 |
| MultiE&D | 38.1 | 39.2 | 34.6 | 45.3 |
| BERTGen | 33.9 | 35.0 | 29.2 | 39.5 |
| RoBERTaGen | 34.2 | 34.9 | 30.6 | 41.0 |
| GPT-2 | 30.7 | 31.5 | 36.3 | 49.0 |

Contributing

We will keep updating and maintaining this repository. You are welcome to contribute by giving us suggestions and developing extensions! If you have any questions or encounter a bug, please file an issue.

Cite

If you find MWPToolkit useful for your research, please cite:

@article{lan2021mwptoolkit,
    title={MWPToolkit: An Open-Source Framework for Deep Learning-Based Math Word Problem Solvers},
    author={Yihuai Lan and Lei Wang and Qiyuan Zhang and Yunshi Lan and Bing Tian Dai and Yan Wang and Dongxiang Zhang and Ee-Peng Lim},
    journal={arXiv preprint arXiv:2109.00799},
    year={2021}
}

License

MWPToolkit uses the MIT License.

Comments
  • How to perform inference for only one math problem?

    I also have another question: when I run the evaluation on the mawps dataset, the parentheses are gone and the operators are out of order in the predicted answer.

    Example (the prediction should be X=((47.0+40.0)-25.0)):

    {
      "id": 460,
      "prediction": "= x - + 47.0 40.0 25.0",
      "target": "= x - + 47.0 40.0 25.0",
      "number list": [ "47.0", "25.0", "40.0" ],
      "value acc": true,
      "equ acc": true
    },

    opened by Gabriel11101 5
  • Are you able to reproduce the paper results from MWP-BERT (Liang et al., 2021)

    I found that the paper (https://arxiv.org/pdf/2107.13435.pdf) also uses a BERT-based model, but they seem to achieve 83.8 accuracy (Table 1 in the paper, MWP-BERT w/o MLM) with a simple sequence-to-sequence model. Their paper says it is sequence-to-tree, but from the figure it seems to still generate only a sequence.

    I notice that you also mentioned this work in the related work. I just wonder whether you have replicated those results as well.

    opened by allanj 5
  • MAWPS dataset issue

    In the graph-to-tree paper, the reported size of MAWPS is 2373, but in this toolkit/paper the size is 1987. Which one is correct? Also, the results in this table might not be comparable.

    opened by allanj 3
  • torch.nn.modules.module.ModuleAttributeError: 'Graph2Tree' object has no attribute 'in_pad_token'

    Code version: 8779d0d Command line (generated by https://mwptoolkit.readthedocs.io/en/latest/_static/cmd.html):

    python run_mwptoolkit.py --model=Graph2Tree --dataset=asdiv-a --task_type=single_equation --gpu_id=0 --equation_fix=prefix
    

    Error Message:

    Traceback (most recent call last):
      File "run_mwptoolkit.py", line 26, in <module>
        run_toolkit(args.model, args.dataset, args.task_type, config_dict)
      File "/data/MWPToolkit/mwptoolkit/quick_start.py", line 215, in run_toolkit
        train_with_train_valid_test_split(config)
      File "/data/MWPToolkit/mwptoolkit/quick_start.py", line 103, in train_with_train_valid_test_split
        trainer.fit()
      File "/data/MWPToolkit/mwptoolkit/trainer/supervised_trainer.py", line 532, in fit
        loss_total, train_time_cost = self._train_epoch()
      File "/data/MWPToolkit/mwptoolkit/trainer/supervised_trainer.py", line 515, in _train_epoch
        batch_loss = self._train_batch(batch)
      File "/data/MWPToolkit/mwptoolkit/trainer/supervised_trainer.py", line 476, in _train_batch
        batch_loss = self.model.calculate_loss(batch)
      File "/data/MWPToolkit/mwptoolkit/model/Graph2Tree/graph2tree.py", line 161, in calculate_loss
        group_nums, target, output_all_layers=True)
      File "/data/MWPToolkit/mwptoolkit/model/Graph2Tree/graph2tree.py", line 105, in forward
        seq_mask = torch.eq(seq, self.in_pad_token).to(self.device)
      File "/data/anaconda3/envs/number/lib/python3.7/site-packages/torch/nn/modules/module.py", line 779, in __getattr__
        type(self).__name__, name))
    torch.nn.modules.module.ModuleAttributeError: 'Graph2Tree' object has no attribute 'in_pad_token'
    
    bug 
    opened by liamjxu 2
  • Pretrained Model of GPT2

    Thanks for your great work. I noticed that the default pretrained model path in GPT2 is pretrained/gpt2_cn, which is ignored in .gitignore. Could you provide a download link for the Chinese model?

    opened by ToheartZhang 2
  • Cannot reproduce results for RobertaGen on SVAMP

    Hi,

    Thanks for your great work.

    I am trying to reproduce the results for RobertaGen on SVAMP.

    This is my command:

    python run_mwptoolkit.py --model=RobertaGen --dataset=mawps_asdiv-a_svamp --task_type=single_equation --equation_fix=prefix --k_fold=None --test_step=5 --gpu_id=1

    In your paper, the performance of RoBERTaGen on SVAMP is 30.3%, but my reproduced result is only 23.4%. Could you tell me why this happens?

    Thanks again for your work!

    opened by qfkkwgd 0
  • Transformer can not be trained on a new dataset!

    Traceback (most recent call last):
      File "/MWPToolkit/run_mwptoolkit.py", line 26, in <module>
        run_toolkit(args.model, args.dataset, args.task_type, config_dict)
      File "/MWPToolkit/mwptoolkit/quick_start.py", line 215, in run_toolkit
        train_with_train_valid_test_split(config)
      File "/MWPToolkit/mwptoolkit/quick_start.py", line 103, in train_with_train_valid_test_split
        trainer.fit()
      File "/MWPToolkit/mwptoolkit/trainer/supervised_trainer.py", line 217, in fit
        valid_equ_ac, valid_val_ac, valid_total, valid_time_cost = self.evaluate(DatasetType.Valid)
      File "/MWPToolkit/mwptoolkit/trainer/supervised_trainer.py", line 271, in evaluate
        batch_val_ac, batch_equ_ac = self._eval_batch(batch)
      File "/MWPToolkit/mwptoolkit/trainer/supervised_trainer.py", line 148, in _eval_batch
        test_out, target = self.model.model_test(batch)
      File "/MWPToolkit/mwptoolkit/model/Seq2Seq/transformer.py", line 154, in model_test
        _, symbol_outputs, _ = self.forward(src)
      File "/MWPToolkit/mwptoolkit/model/Seq2Seq/transformer.py", line 100, in forward
        source_embeddings = self.in_embedder(src).to(self.device) + self.pos_embedder(src).to(self.device)
      File "/root/anaconda3/envs/toolkit/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/MWPToolkit/mwptoolkit/module/Embedder/basic_embedder.py", line 32, in forward
        embedding_output = self.embedder(input_seq)
      File "/root/anaconda3/envs/toolkit/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
        return forward_call(*input, **kwargs)
      File "/root/anaconda3/envs/toolkit/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 158, in forward
        return F.embedding(
      File "/root/anaconda3/envs/toolkit/lib/python3.10/site-packages/torch/nn/functional.py", line 2199, in embedding
        return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)

    opened by CeMarzie 2
  • Can Not Use Cuda When Running K-fold Cross Validation (bug in v0.0.6)

    When using a GPU to train any model with k-fold cross-validation, the first fold runs normally, but training becomes slow from the second fold onward because the GPU is no longer used. The cause is the checkpoint-saving code. All parameters in the config object are saved to a JSON file when a checkpoint is saved. However, config['device']=torch.device('cuda') can't be serialized to JSON, so this parameter is deleted directly from the config object. When the next fold runs, config['device'] can't be found, so the model is not placed on the GPU.

    bug 
    opened by LYH-YF 0
  • RNNVAE object has no attribute max_gen_len

    When trying to train the LSTM VAE model for multi-equation tasks on both hmwp and draw, the following error is thrown: RNNVAE object has no attribute max_gen_len.

    Changing max_gen_len to max_length in rnnvae.py fixed this issue.

    bug 
    opened by Chakita 0
  • run_hyper_search.py is not working

    Failure # 1 (occurred at 2022-05-23_16-22-42)
    Traceback (most recent call last):
      File "/home/marzieh/anaconda3/envs/mwptoolkit/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 901, in get_next_executor_event
        future_result = ray.get(ready_future)
      File "/home/marzieh/anaconda3/envs/mwptoolkit/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
        return func(*args, **kwargs)
      File "/home/marzieh/anaconda3/envs/mwptoolkit/lib/python3.7/site-packages/ray/worker.py", line 1809, in get
        raise value.as_instanceof_cause()
    ray.exceptions.RayTaskError(TuneError): ray::ImplicitFunc.train() (pid=70586, ip=192.168.21.70, repr=func)
      File "/home/marzieh/anaconda3/envs/mwptoolkit/lib/python3.7/site-packages/ray/tune/trainable.py", line 349, in train
        result = self.step()
      File "/home/marzieh/anaconda3/envs/mwptoolkit/lib/python3.7/site-packages/ray/tune/function_runner.py", line 403, in step
        self._report_thread_runner_error(block=True)
      File "/home/marzieh/anaconda3/envs/mwptoolkit/lib/python3.7/site-packages/ray/tune/function_runner.py", line 568, in _report_thread_runner_error
        ("Trial raised an exception. Traceback:\n{}".format(err_tb_str))
    ray.tune.error.TuneError: Trial raised an exception. Traceback:
    ray::ImplicitFunc.train() (pid=70586, ip=192.168.21.70, repr=func)
      File "/home/marzieh/anaconda3/envs/mwptoolkit/lib/python3.7/site-packages/ray/tune/function_runner.py", line 272, in run
        self._entrypoint()
      File "/home/marzieh/anaconda3/envs/mwptoolkit/lib/python3.7/site-packages/ray/tune/function_runner.py", line 351, in entrypoint
        self._status_reporter.get_checkpoint(),
      File "/home/marzieh/anaconda3/envs/mwptoolkit/lib/python3.7/site-packages/ray/tune/function_runner.py", line 640, in _trainable_func
        output = fn()
      File "/home/marzieh/PycharmProjects/mwptoolkit/MWPToolkit/mwptoolkit/hyper_search.py", line 34, in train_process
        dataset = create_dataset(configs)
      File "/home/marzieh/PycharmProjects/mwptoolkit/MWPToolkit/mwptoolkit/data/utils.py", line 47, in create_dataset
        return SingleEquationDataset(config)
      File "/home/marzieh/PycharmProjects/mwptoolkit/MWPToolkit/mwptoolkit/data/dataset/single_equation_dataset.py", line 89, in __init__
        super().__init__(config)
      File "/home/marzieh/PycharmProjects/mwptoolkit/MWPToolkit/mwptoolkit/data/dataset/abstract_dataset.py", line 116, in __init__
        self._load_k_fold_dataset()
      File "/home/marzieh/PycharmProjects/mwptoolkit/MWPToolkit/mwptoolkit/data/dataset/abstract_dataset.py", line 194, in _load_k_fold_dataset
        datas = self._load_all_data()
      File "/home/marzieh/PycharmProjects/mwptoolkit/MWPToolkit/mwptoolkit/data/dataset/abstract_dataset.py", line 128, in _load_all_data
        trainset = read_json_data(os.path.join(os.getcwd(), trainset_file))
      File "/home/marzieh/PycharmProjects/mwptoolkit/MWPToolkit/mwptoolkit/utils/utils.py", line 33, in read_json_data
        f = open(filename, 'r', encoding="utf-8")
    FileNotFoundError: [Errno 2] No such file or directory: '/home/marzieh/ray_results/train_process_2022-05-23_16-22-25/train_process_d056d_00000_0_embedding_size=128,epoch_nums=80,hidden_size=2,test_batch_size=16,train_batch_size=16_2022-05-23_16-22-25/dataset/pmwp/trainset.json'

    Why is it reading trainset.json from this path: '/home/marzieh/ray_results/train_process_2022-05-23_16-22-25/train_process_d056d_00000_0_embedding_size=128,epoch_nums=80,hidden_size=2,test_batch_size=16,train_batch_size=16_2022-05-23_16-22-25/dataset/pmwp/trainset.json'?

    opened by Marzie00Abd 3
Releases(v0.0.6)
  • v0.0.6(Apr 23, 2022)

    News in version 0.0.6

    1. Fix some bugs:

    (1) from_prefix_to_infix and from_postfix_to_infix in mwptoolkit/utils/preprocess_tool/equation_operator.py

    (2) the sequence length will be longer than pos_embedder's max length in RobertaGen and BertGen.

    (3) data preprocessing for a new dataset won't automatically remove 'x=' or '=x' in a single equation.

    2. Update new models:

    (1) Seq2Tree model [BertTD]

    (2) Seq2Tree model [MWPBert]

    3. Rewrite Dataloader and Config

    4. Implement functions save_dataset() and load_from_pretrained() of Dataset

    5. Implement functions save_config() and load_from_pretrained() of Config

  • v0.0.5(Oct 24, 2021)

  • v0.0.4(Sep 15, 2021)
