fklearn: Functional Machine Learning

Overview

fklearn uses functional programming principles to make it easier to solve real problems with Machine Learning.

The name is a reference to the widely known scikit-learn library.

fklearn Principles

  1. Validation should reflect real-life situations.
  2. Production models should match validated models.
  3. Models should be production-ready with few extra steps.
  4. Reproducibility and in-depth analysis of model results should be easy to achieve.

Documentation | Getting Started | API Docs | Contributing

Installation

To install via pip:

pip install fklearn

You can also install from the source:

git clone git@github.com:nubank/fklearn.git
cd fklearn
git checkout master
pip install -e .

License

Apache License 2.0

Comments
  • Columns duplicator learner and decorator

    Status

    READY

    Todo list

    • [x] Documentation
    • [x] Tests added and passed
    • [x] Issue: closes #116

    Description of the changes proposed in the pull request

    • Add a new transformer feature_duplicator
    • Add new meta decorator column_duplicatable to transformers and learners to duplicate columns

    Background

    Currently, the transformations (such as target_categorizer) replace the value in the column. A better approach would be to allow the user to preserve the original value, outputting the encoded feature to a new column. This would also enable users to apply more than one transformation to the same column (example: frequency and target encoding of the same feature) without needing to duplicate the column first.

    Usage examples in comments.
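
    For illustration, a minimal sketch of the intended usage, assuming the signature described in this PR (duplicate first, then encode; the released API may differ):

    import pandas as pd

    from fklearn.training.transformation import feature_duplicator, target_categorizer

    df = pd.DataFrame({"feature": ["a", "b", "a", "b"], "target": [1, 0, 1, 1]})

    # Keep a raw copy of the column, then let the encoder overwrite the original.
    dup_fn, df_dup, dup_log = feature_duplicator(
        df, columns_to_duplicate=["feature"], suffix="_raw")
    enc_fn, df_enc, enc_log = target_categorizer(
        df_dup, columns_to_categorize=["feature"], target_column="target")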

    opened by vitorsrg 11
  • Fix shap

    Status

    On Hold

    Todo list

    • [x] Issue: closes #109

    Background context

    There was a change in SHAP's output (to a list of ndarrays) in its newer versions, and the classification learners throw a ValueError when apply_shap=True for binary classification. Multiclass classification and regression are not affected.

    Description of the changes proposed in the pull request

    Uses shap_values[1] and explainer.expected_value[1] for binary classification.
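
    A minimal sketch of the change, assuming a fitted tree model model and a feature frame X (both hypothetical here):

    import shap

    # `model` and `X` are assumed to exist. Newer SHAP versions return one
    # array per class for binary classifiers; index 1 selects the positive class.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)
    if isinstance(shap_values, list):
        shap_values = shap_values[1]
        expected_value = explainer.expected_value[1]
    else:
        expected_value = explainer.expected_value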

    opened by tatasz 11
  • Causal val Funcs

    Status

    READY

    Todo list

    • [x] Documentation
    • [x] Tests added and passed

    Background context

    Add common causal evaluation techniques, such as the cumulative effect curve and the cumulative gain curve.

    Description of the changes proposed in the pull request

    First, this PR adds a list of possible metrics that one can use as effects. Then, it uses those effects in causal model validation curves. Finally, it adds functions to compute the area under those causal inference curves.

    Where should the reviewer start?

    1. Effects
    2. Curves
    3. AUC

    Remaining problems or questions

    We still need to integrate the causal metrics with the standard fklearn validators. This should be easy to do, as the causal AUC functions all return a float.
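
    To make that concrete, a minimal sketch of a cumulative effect curve, using a simple difference in means as the effect (the real functions take a configurable effect_fn, so treat this as illustrative):

    import numpy as np
    import pandas as pd

    def cumulative_effect_curve(df: pd.DataFrame, prediction: str, outcome: str,
                                treatment: str, steps: int = 10) -> np.ndarray:
        # Order units by predicted effect, then compute the effect on growing top-k subsets.
        ordered = df.sort_values(prediction, ascending=False).reset_index(drop=True)
        sizes = np.linspace(len(df) // steps, len(df), steps).astype(int)
        effects = []
        for n in sizes:
            head = ordered.head(n)
            treated = head.loc[head[treatment] == 1, outcome]
            control = head.loc[head[treatment] == 0, outcome]
            effects.append(treated.mean() - control.mean())
        return np.array(effects)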

    review-request 
    opened by matheusfacure 10
  • Create multiple requirements files to avoid installing unused packages

    Status

    IN DEVELOPMENT

    Todo list

    • [ ] Documentation

    Background context

    As we add new models, this package gets bigger, which can lead to a bad user experience. We want to find a way to let users install only the dependencies they need.

    Description of the changes proposed in the pull request

    We introduce different requirements files based on scope: if you want to use only xgboost, you can install just it with pip install fklearn[xgboost], and likewise for other models.

    Remaining problems or questions

    • Are we ok with this change, or should we keep all_deps (except for test) as default?
    enhancement ready-to-merge 
    opened by caique-lima 8
  • Make things like xgboost, lgbm ... extra requirements

    Describe the feature and the current state.

    Today we install all the packages fklearn requires. For development this is fine, but in production it would be better to install only what you need. My idea is to use extra requirements, so we can have something like:

    pip install fklearn[xgboost]
    

    And we would only install the common packages plus xgboost, instead of installing everything. We should also include fklearn[all] to install everything.
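
    A minimal sketch of how the extras could be wired in setup.py (package lists here are illustrative, not the project's actual dependency sets):

    from setuptools import find_packages, setup

    core = ["numpy", "pandas", "scikit-learn", "toolz"]  # illustrative core deps
    extras = {
        "xgboost": ["xgboost"],
        "lgbm": ["lightgbm"],
        "catboost": ["catboost"],
    }
    extras["all"] = sorted({dep for deps in extras.values() for dep in deps})

    setup(
        name="fklearn",
        packages=find_packages("src"),
        package_dir={"": "src"},
        install_requires=core,
        extras_require=extras,  # enables pip install fklearn[xgboost]
    )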

    Will this change a current behavior? How?

    Yes, standalone install will not work with most models.

    Additional Information

    This needs to be well documented in order to avoid a bad user experience

    enhancement good first issue 
    opened by caique-lima 5
  • Tutorial

    Status

    READY

    Background context

    At the fklearn open-source announcement meetup, we gave a live presentation on how to use fklearn.

    Description of the changes proposed in the pull request

    Adding two notebooks:

    • Notebook used to generate the dataset used in the presentation
    • Notebook of the presentation, with all the analyses/models built on top of the dataset

    Where should the reviewer start?

    Review both notebooks

    review-request 
    opened by hf-lopes 5
  • First iteration of a data corruption feature

    Status

    READY

    Todo list

    • [x] Documentation
    • [x] Tests added and passed
    • [ ] Bump version
    • [ ] Edit the changelog: ?

    Background context

    First attempt to implement an artificial data corruption feature. Sort of a Chaos Monkey for datasets.

    Description of the changes proposed in the pull request

    Implements a new perturbators.py file to store data perturbation functions and also a general function similar to validator in validator.py.

    Related PRs

    None.

    Where should the reviewer start?

    Macacaos.ipynb: simple demo of what was implemented.

    perturbators.py: I tried to follow the same structure in validator.py: a general broad function perturbator() that receives specific column-wise perturbation functions like nullify() or shift_mu().

    validator.py: added two methods, chaos_validator and chaos_validator_iteration, analogous to their non-chaos counterparts but with data corruption at either train time or test time.
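
    As a reference for the pattern, a minimal sketch of a column-wise perturbation in the spirit of nullify() (the PR's exact signature may differ):

    import pandas as pd
    from toolz import curry

    @curry
    def nullify(perc: float, series: pd.Series) -> pd.Series:
        # Randomly set a fraction `perc` of the values to missing.
        corrupted = series.copy()
        corrupted.loc[corrupted.sample(frac=perc).index] = None
        return corrupted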

    Remaining problems or questions

    • Is creating the chaos_validator function the best approach? Maybe just an optional argument inside the original validator?
    • No logging in the new functions. How should we do it?
    • Maybe more corruption functions.
    • Improve the demo. The current plot is not very convincing on how the dataset is degraded by nulls.
    • Tests.

    enhancement ready-to-merge 
    opened by sadikneipp 5
  • Include min and max values in the extreme intervals on the quantile binner

    Status

    READY

    • [ ] Issue: closes #112

    Background context

    quantile_biner would result in n+1 quantiles when the min or max values were equal to either the first or last quantile.

    Description of the changes proposed in the pull request

    Change values categorized as the '0' quantile to 1 and values categorized as 'n+1' (n being the desired number of quantiles) to n.
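
    In other words, out-of-range bin indices are clipped back into [1, n]; a minimal sketch of the idea (values here are illustrative):

    import numpy as np

    n = 4                             # requested number of quantiles
    q = np.array([0, 1, 2, 3, 4, 5])  # indices from the binner, with out-of-range 0 and n+1
    q_fixed = np.clip(q, 1, n)        # -> [1, 1, 2, 3, 4, 4]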

    opened by gabrieldi95 4
  • Add Normalized Discounted Cumulative Gain (NDCG) evaluator

    Status

    READY

    Todo list

    • [x] Documentation
    • [x] Tests added and passed

    Background context

    Currently, this project has only one rank evaluator, the Spearman correlation, which examines the rank order between two variables. However, to measure the quality of two rankings given real and predicted scores, we should consider the Normalized Discounted Cumulative Gain (NDCG). There are two main motivations [1]:

    1. Highly relevant items are more useful when appearing earlier in a search – have higher ranks;
    2. Highly relevant items are more useful than marginally relevant items, which are in turn more useful than non-relevant items.

    This metric is widely used by recommender systems and information retrieval [2].

    [1] https://en.wikipedia.org/wiki/Discounted_cumulative_gain.

    [2] BOBADILLA, Jesús et al. Recommender systems survey. Knowledge-based systems, v. 46, p. 109-132, 2013.
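
    For reference, a minimal sketch of the metric itself (not the PR's evaluator; fklearn evaluators wrap this kind of logic in a curried function):

    import numpy as np

    def dcg(relevances: np.ndarray, k: int) -> float:
        # Discount each relevance by the log2 of its (1-based) rank plus one.
        rel = relevances[:k]
        discounts = np.log2(np.arange(2, rel.size + 2))
        return float(np.sum(rel / discounts))

    def ndcg(y_true: np.ndarray, y_score: np.ndarray, k: int) -> float:
        # Rank true relevances by predicted score and normalize by the ideal DCG.
        order = np.argsort(y_score)[::-1]
        ideal = dcg(np.sort(y_true)[::-1], k)
        return dcg(y_true[order], k) / ideal if ideal > 0 else 0.0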

    enhancement 
    opened by ajmemilio 4
  • Fix shap

    Status

    READY

    Todo list

    • [x] Tests added and passed
    • [x] Issue: closes #109

    Background context

    Update shap to be compatible with latest shap versions

    Description of the changes proposed in the pull request

    Updated LightGBM SHAP handling and changed the way multiclass SHAP values are calculated for CatBoost. Update: added descriptions for the magic numbers.

    Where should the reviewer start?

    src/fklearn/training/classification.py

    opened by tatasz 4
  • Adding PR AUC evaluator

    Status

    READY

    Todo list

    • [ ] Documentation
    • [ ] Tests added and passed

    Background context

    Even though we have a ROC AUC evaluator implemented in fklearn, we are still missing a PR AUC evaluator.

    Description of the changes proposed in the pull request

    This PR:

    1. Adds a new evaluator pr_auc_evaluator
    2. Changes the generic auc evaluator name from auc_evaluator to roc_auc_evaluator
    enhancement review-request 
    opened by dieggoluis 4
  • Add ascending parameter to causal validation

    Status

    READY

    Todo list

    • [x] Documentation
    • [x] Tests added and passed
    • [x] Issue: closes #204

    Background context

    In the causal validation module and the curves file, it would be useful to add an ascending parameter for the cumulative effect and cumulative gain curves.

    The current state is to order predictions descending:

    ordered_df = df.sort_values(prediction, ascending=False).reset_index(drop=True)

    If we add an ascending: bool = False argument to cumulative_effect_curve, cumulative_gain_curve, relative_cumulative_gain_curve, and effect_curves, a user could control whether these effects are computed ascending or descending by the prediction column.
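
    With the proposed argument, the sort becomes parameterized, e.g.:

    ordered_df = df.sort_values(prediction, ascending=ascending).reset_index(drop=True)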

    Description of the changes proposed in the pull request

    A model could output a prediction that is not necessarily positively related to the effect to be computed, so adding an option to order this relationship differently allows for effects and gains with negatively related predictions and outcomes to be computed adequately.

    The changes are applied to curves.py and also on auc.py on the causal-effect module.

    Where should the reviewer start?

    Start with causal-effect/curves, as it contains the definitions of the functions from which all ordering behavior is propagated.

    opened by MarianaBlaz 0
  • Add K-means algorithm and unsupervised evaluators

    Status

    READY

    Todo list

    • [x] Documentation
    • [x] Tests added and passed

    Background context

    We are extending fklearn's capabilities with unsupervised algorithm support. As a first step, we propose using the k-means clustering implementation from scikit-learn, since it is a simple and relevant unsupervised algorithm.

    Description of the changes proposed in the pull request

    • Add kmeans learner using scikit-learn K-means implementation

    • Add Silhouette Coefficient to evaluate unsupervised results

    • Add Davies-Bouldin score to evaluate unsupervised results

    • Add generic_unsupervised_sklearn_evaluator method in order to provide support to scikit-learn unsupervised metrics

    Where should the reviewer start?

    Start with the kmeans_learner method at src/fklearn/training/unsupervised.py, then the unsupervised evaluation metrics at src/fklearn/validation/evaluators.py.
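
    For context, a minimal sketch of what a learner following fklearn's pattern (returning a prediction function, the scored DataFrame, and a log) might look like; the PR's actual signature may differ:

    from typing import List

    import pandas as pd
    from sklearn.cluster import KMeans
    from toolz import curry

    @curry
    def kmeans_learner(df: pd.DataFrame, features: List[str], n_clusters: int = 8,
                       prediction_column: str = "cluster"):
        # Fit k-means on the selected features.
        model = KMeans(n_clusters=n_clusters, n_init=10).fit(df[features])

        def p(new_df: pd.DataFrame) -> pd.DataFrame:
            # Assign each row to its nearest cluster.
            return new_df.assign(**{prediction_column: model.predict(new_df[features])})

        log = {"kmeans_learner": {"features": features, "n_clusters": n_clusters}}
        return p, p(df), log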

    review-request 
    opened by fabiano-santos-nubank 0
  • Bumping the range of scikit-learn supported versions

    Status

    READY

    Todo list

    • [x] Documentation
    • [x] Tests added and passed

    (Nothing had to change here.)

    Background context

    Requiring a version of scikit-learn strictly lower than 0.24 prevents the installation of fklearn on Python 3.9. More context in this Slack thread.

    Description of the changes proposed in the pull request

    Versions of scikit-learn that can be installed on Python 3.9 (and don't include any major changes) are now supported. The only breaking change is in IsolationForest, but a version check is added for backwards compatibility.

    Where should the reviewer start?

    Check the bump in the range of supported scikit-learn versions and, if interested, read about the behaviour param in IsolationForest's docs.

    opened by peguerosdc 2
  • Causal Effect Bin Partitioners

    Status

    READY

    Todo list

    • [x] Documentation
    • [x] Tests added and passed

    Background context

    To calculate causal effects by segment, a quantile-based approach is used to create the segments. However, we've seen that this is usually not ideal, as there are methods that create segments that are more distinguishable from one another.

    One such example is the Fisher-Jenks algorithm. A user could create their own partitioner with this algorithm like this:

    import pandas as pd
    
    from jenkspy import jenks_breaks
    from toolz import curry
    from typing import List
    
    
    @curry
    def fisher_jenks_partitioner(series: pd.Series, segments: int) -> List:
        # Compute the Fisher-Jenks natural breaks for the series.
        bins = jenks_breaks(series, n_classes=segments)
        # Open the outer edges so every value falls inside a bin.
        bins[0] = -float("inf")
        bins[-1] = float("inf")

        return bins
    

    And use it in effect_by_segment:

    from fklearn.causal.effects import linear_effect
    from fklearn.causal.validation.curves import effect_by_segment
    
    df = pd.DataFrame(dict(
        t=[1, 1, 1, 2, 2, 2, 3, 3, 3],
        x=[1, 2, 3, 1, 2, 3, 1, 2, 3],
        y=[1, 1, 1, 2, 3, 4, 3, 5, 7],
    ))
    
    result = effect_by_segment(
        df,
        prediction="x",
        outcome="y",
        treatment="t",
        segments=3,
        effect_fn=linear_effect,
        partition_fn=fisher_jenks_partitioner)
    

    Or use another custom partitioner such as:

    @curry
    def bin_partitioner(series: pd.Series, segments: int = 1) -> List:
        # Fixed, hand-picked bin edges; ignores the series and segment count.
        return [1, 4, 5]
    

    Description of the changes proposed in the pull request

    We're adding:

    • an argument to the effect_by_segment function so a user can define the way the segments are created.
    • the quantile_partitioner so the default behavior of effect_by_segment is maintained.
    • a new PartitionFnType type.
    • tests for quantile_partitioner
    • documentation for the new fklearn.causal.partitioners module

    Related PRs

    NA

    Where should the reviewer start?

    At the modifications we did in effect_by_segment and then to the quantile_partitioner definition.

    Remaining problems or questions

    We are not adding partitioners beyond the default, because that would require more complex definitions or imports of new libraries (such as for the Fisher-Jenks algorithm).

    review-request 
    opened by HectorLira 1
  • Add MinMax Scaler

    Status

    READY

    Todo list

    • [x] Documentation
    • [x] Tests added and passed
    • [x] ~Issue: closes NA~

    Background context

    We're adding a MinMax scaler to the fklearn.training module; min-max scaling is often used with neural networks.
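
    For reference, min-max scaling maps each column to [0, 1] via x' = (x - min) / (max - min); a minimal sketch of the idea (illustrative only; the PR's learner follows fklearn's curried-learner pattern):

    import pandas as pd

    def min_max_scale(df: pd.DataFrame, columns: list) -> pd.DataFrame:
        # Rescale each column by its own min and max.
        mins, maxs = df[columns].min(), df[columns].max()
        return df.assign(**{c: (df[c] - mins[c]) / (maxs[c] - mins[c]) for c in columns})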

    Description of the changes proposed in the pull request

    We've made two changes:

    1. Adding the MinMax scaler to fklearn.training module.
    2. Adding tests of the MinMax scaler.

    Related PRs

    NA

    Where should the reviewer start?

    With the definition of the MinMax scaler itself.

    Remaining problems or questions

    NA

    review-request 
    opened by HectorLira 0
  • Add feature_clustering_selection method

    Status

    READY

    Todo list

    • [x] Documentation
    • [x] Tests added and passed

    Background context

    This is a correlation-based feature selection method. Unlike the existing correlation_feature_selection, which has no criterion for choosing among correlated features, feature_clustering_selection first clusters the features, using absolute correlation as the distance metric, and then selects from each cluster the feature with the lowest 1-R² metric. The 1-R² metric finds the feature that best preserves the information of its own cluster (own-cluster R²) while penalizing for the information shared with the nearest cluster (nearest-cluster R²).
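
    For intuition, a minimal sketch of the clustering step, using 1 - |corr| as the distance with hierarchical clustering (illustrative, not the PR's exact code):

    import numpy as np
    import pandas as pd
    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.spatial.distance import squareform

    def cluster_features(df: pd.DataFrame, threshold: float = 0.5) -> dict:
        # Distance is small for highly correlated features.
        dist = 1.0 - df.corr().abs()
        np.fill_diagonal(dist.values, 0.0)  # remove float noise on the diagonal
        links = linkage(squareform(dist.values, checks=False), method="average")
        labels = fcluster(links, t=threshold, criterion="distance")
        return {col: int(lab) for col, lab in zip(df.columns, labels)}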

    Description of the changes proposed in the pull request

    This commit adds the feature selection method feature_clustering_selection in fklearn/tuning/model_agnostic_fc.py.

    Where should the reviewer start?

    The reviewer should start with the feature_clustering_selection method at src/fklearn/tuning/model_agnostic_fc.py. The test_feature_clustering_selection method at fklearn/tests/tuning/test_model_agnostic_fc.py illustrates how the method is used.

    opened by brunoleme 0
Releases (2.2.1)
  • 2.2.1(Sep 6, 2022)

    • Bug Fix
      • Including a necessary init file to allow the import of the causal cate learners.
      • Fix a docstring issue where the descriptions of the causal learners were not showing all parameters.
  • 2.2.0(Aug 25, 2022)

    • Enhancement
      • Including Classification S-Learner model to the causal cate learning library.
    • Bug Fix
      • Fix validator behavior when receiving data containing gaps and a time-based split function that could generate empty training and testing folds and then break. The argument drop_empty_folds can be set to True to drop invalid folds from validation and store them in the log.
    • Documentation
      • Including Classification S-learner documentation and changing validator documentation to reflect changes.
  • 2.1.0(Jul 27, 2022)

    • Enhancement
      • Add optional parameter return_eval_logs_on_train to the validator function, enabling it to return the evaluation logs for all training folds instead of just the first one
    • Bug Fix
      • Fix import in pd_extractors.py for Python 3.10 compatibility
      • Set a minimum version of Python (3.6.2) for fklearn
    • Documentation
      • Fixed some typos and broken links, and made general improvements to the documentation
  • 2.0.0(Dec 30, 2021)

    • Possible breaking changes
      • Allow greater versions of:
        • catboost, lightgbm, xgboost
        • joblib, numpy
        • shap, swifter
        • matplotlib, tqdm, scipy
      • Most of the breaking changes in the libs above were due to deprecation of support to Python 3.5 and older versions.
      • Libraries depending on fklearn can still restrict the versions of the aforementioned libraries, keeping the previous behavior (e.g., xgboost<0.90).
  • 1.24.0(Dec 6, 2021)

  • 1.23.0(Oct 29, 2021)

  • 1.22.2(Sep 1, 2021)

  • 1.22.0(Feb 9, 2021)

    • Enhancement
      • Add verbose method to validator and parallel_validator
      • Add column_duplicator decorator to value_mapper
    • Bug Fix
      • Fix Spatial LC check
      • Fix circleci
  • 1.21.0(Oct 2, 2020)

    • Enhancement
      • Now transformers can create a new column instead of replacing the input
    • Bug Fix
      • Make requirements more flexible to cover the latest releases
      • split_evaluator_extractor now supports eval_name parameter
      • Fixed drop_first_column behaviour in onehot categorizer
    • New
      • Add learner to calibrate predictions based on a fairness metric
    • Documentation
      • Fixed docstrings for reverse_time_learning_curve_splitter and feature_importance_backward_selection
  • 1.20.0(Jul 13, 2020)

  • 1.19.1(Jul 13, 2020)

  • 1.19.0(Jun 17, 2020)

  • 1.18.0(May 8, 2020)

    • Enhancement
      • Allow users to provide a placeholder value in the imputer learner
    • New
      • Add Normalized Discounted Cumulative Gain evaluator
    • Bug Fix
      • Fix some sklearn related warnings
      • Fix get_recovery logic in make_confounded_data method
    • Documentation
      • Add target_categorizer documentation
  • 1.17.0(Feb 28, 2020)

    • Enhancement
      • Allow users to set a gap between training and holdout in time splitters
      • Raise errors instead of using asserts
    • New
      • Support pipelines with duplicated learners
      • Add stratified split method
    • Bug Fix
      • Fix space_time_split holdout
      • Fix compatibility with newer shap version
  • 1.16.0(Oct 7, 2019)

    • Enhancement
      • Improve split evaluator to avoid unexpected errors
    • New
      • Now users can install only the set of requirements they need
      • Add Target encoding learner
      • Add PR AUC and rename AUC evaluator to ROC AUC
    • Bug Fix
      • Fix bug with space_time_split_dataset fn
    • Documentation
      • Update space time split DOCSTRING to match the actual behaviour
      • Add more tutorials (PyData)
  • 1.15.1(Aug 16, 2019)

  • 1.15.0(Aug 12, 2019)

    • Enhancement
      • Make custom_transformer a pure function
      • Remove unused requirements
    • New
      • Features created by one-hot encoding can now be used in the next steps of the pipeline
      • Shap multiclass support
      • Custom model pipeline
    • Bug Fix
      • Fix the way one-hot encoding handles NaNs
    • Documentation
      • Minor fix to the flake8 documentation to make it work in other shells
      • Fix fbeta_score_evaluator docstring
      • Fix typo on onehot_categorizer
      • New tutorial from meetup presentation
  • 1.14.2(Aug 2, 2019)

  • 1.13.5(Aug 2, 2019)

  • 1.14.1(May 29, 2019)

  • 1.13.4(May 29, 2019)

  • 1.14.0(Apr 30, 2019)

    • Enhancement
      • Validator accepts predict_oof as argument
    • New
      • Add CatBoosting regressor
      • Data corruption (Macacaos)
    • Documentation
      • Multiple fixes in the documentation
      • Add Contribution guide
  • 1.13.3(Apr 24, 2019)

  • 1.13.2(Apr 22, 2019)

Owner
nubank