Overview
AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations. Each modality’s augmentations are contained within its own sub-library. These sub-libraries include both function-based and class-based transforms, composition operators, and have the option to provide metadata about the transform applied, including its intensity.
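
For instance, here is a minimal sketch of both styles using the image sub-library (the file path and parameter values below are placeholders, not canonical usage):

import augly.image as imaugs

# Function-based: apply a single augmentation to an image (path or PIL.Image)
meme = imaugs.meme_format("input.jpg", text="LOL")

# Class-based: compose reusable transforms; `metadata` records each applied op
meta = []
transform = imaugs.Compose([
    imaugs.Scale(factor=0.5),
    imaugs.OverlayEmoji(opacity=0.8),
])
out = transform(meme, metadata=meta)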

AugLy is a great library to utilize for augmenting your data in model training, or for evaluating the robustness gaps of your model! We designed AugLy to include many of the specific data augmentations that users perform in real life on internet platforms like Facebook: for example, making an image into a meme, overlaying text/emojis on images/videos, or reposting a screenshot from social media. While AugLy contains more generic data augmentations as well, it will be particularly useful if you're working on a problem like copy detection, hate speech detection, or copyright infringement, where these "internet user" types of data augmentations are prevalent.

Visual

To see more examples of augmentations, open the Colab notebooks in the README for each modality! (e.g. image README & Colab)

The library is Python-based and requires at least Python 3.6, as we use dataclasses.

Authors

Joanna Bitton — Software Engineer at Facebook AI

Zoe Papakipos — Research Engineer at FAIR

Installation

AugLy is a Python 3.6+ library. It can be installed with:

pip install augly

Or clone AugLy if you want to be able to run our unit tests, contribute a pull request, etc.:

git clone [email protected]:facebookresearch/AugLy.git
[Optional, but recommended] conda create -n augly && conda activate augly && conda install pip
pip install -e AugLy/

NOTE: In some environments, pip doesn't install python-magic as expected. In that case, you will need to additionally run:

conda install -c conda-forge python-magic

Or if you aren't using conda:

sudo apt-get install python3-magic

Documentation

To find documentation about each sub-library, please see the READMEs in the respective directories.

Assets

We provide various media assets to use with some of our augmentations. These assets include:

  1. Emojis (Twemoji) - Copyright 2020 Twitter, Inc and other contributors. Code licensed under the MIT License. Graphics licensed under CC-BY 4.0.
  2. Fonts (Noto fonts) - Noto is a trademark of Google Inc. Noto fonts are open source. All Noto fonts are published under the SIL Open Font License, Version 1.1.
  3. Screenshot Templates - Images created by a designer at Facebook specifically to use with AugLy. You can use these with the overlay_onto_screenshot augmentation in both the image and video libraries to make it look like your source image/video was screenshotted in a social media feed similar to Facebook or Instagram.
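
For example, a minimal sketch (the input path is a placeholder; by default the augmentation picks a bundled template):

import augly.image as imaugs

# Wrap an image in a social-media screenshot frame using a bundled template
out = imaugs.overlay_onto_screenshot("input.jpg")
out.save("screenshotted.jpg")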

Citation

If you use AugLy in your work, please cite:

@misc{bitton2021augly,
  author =       {Bitton, Joanna and Papakipos, Zoe},
  title =        {AugLy: A data augmentations library for audio, image, text, and video.},
  howpublished = {\url{https://github.com/facebookresearch/AugLy}},
  year =         {2021}
}

License

AugLy is MIT licensed, as found in the LICENSE file. Please note that some of the dependencies AugLy uses may be licensed under different terms.

Comments
  • Final Sphinx documentation w/ ReadTheDocs


    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)


    What This Is

    This is comprehensive documentation for AugLy, using Sphinx docstring formatting under the hood and a Read the Docs theme, to make augmentation parameters, return types, etc. more readable and user-friendly.

    How It Works

    Sphinx is a documentation generator commonly used by the Python community. It also has its own docstring format.

    Internal AugLy docstrings use tags such as @param or @returns for labeling, due to internal Facebook convention. However, Sphinx does not recognize these tags, opting for : instead of @, among other changes.

    Luckily for us, the Python docstring format epytext is very similar to AugLy's (credit @Cloud9c), meaning that we can declare that we use epytext formatting and then convert it to Sphinx format when necessary.

    Another problem: Sphinx requires explicit types labeled in the form of :type and :rtype to display types once documentation is rendered. However, Python type hints (which AugLy uses liberally) are not natively supported, so we use a Sphinx extension that auto-detects type hints and adds them on the fly.
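
    As a rough illustration (an assumption-laden sketch, not the repo's actual config), a docs/source/conf.py could wire up both pieces along these lines:

    import re

    extensions = [
        "sphinx.ext.autodoc",
        "sphinx_autodoc_typehints",  # auto-detects Python type hints
    ]

    TAG_RE = re.compile(r"@(param|returns|raises)")

    def process_docstring(app, what, name, obj, options, lines):
        # Rewrite epytext-style @tags into Sphinx-style :tags on the fly
        for i, line in enumerate(lines):
            lines[i] = TAG_RE.sub(lambda m: ":" + m.group(1), line)

    def setup(app):
        app.connect("autodoc-process-docstring", process_docstring)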

    In the end, Sphinx uses the module structure specified in docs/source, written in reStructuredText (a file type similar to Markdown), to generate a table of contents for our final documentation.

    How to Build Documentation Locally

    1. Clone the repository: git clone https://github.com/facebookresearch/AugLy.git
    2. Install both the library and the documentation-specific dependencies: cd docs && pip install -r requirements.txt
    3. From the docs subdirectory, run make html to generate the documentation. To delete the generated files later, run make clean.
    4. Navigate to docs/build and open index.html

    Generating new documentation later

    1. Sphinx can detect new files added to the augly subdirectory and add their .rst files accordingly, but detection needs to be triggered manually. To do so, run sphinx-apidoc -o ./source ../augly from the docs directory (so that ./source and ../augly resolve correctly), and update the toctree in index.rst if necessary.
    2. Edited a docstring and want the change reflected in the published documentation? No worries, this happens automatically; an overview is provided below.

    Integration with ReadTheDocs

    1. This documentation uses Sphinx's ReadTheDocs theme to make it easy to publish documentation on RTD's site.
    2. Create a Github webhook to detect pushes made to the repository so documentation can rebuild.
    3. The .readthedocs.yml file specifies the configuration for these builds. ffmpeg and libsndfile1 are C-based system packages that must be installed as prerequisites before the requirements in docs/requirements.txt (RTD uses Ubuntu behind the scenes).
    4. Docstrings aren't stored in the docs subdirectory at all, so all are read from the source folder. Updating the docstrings in augly/<modality> will be sufficient.
    CLA Signed 
    opened by iAdityaEmpire 59
  • Enable to set random ranges to RandomEmojiOverlay parameters


    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    Added randomness to RandomEmojiOverlay parameters such as emoji_size. The motivation is that I wanted to vary the parameters randomly on the fly, rather than keeping them fixed as in the current implementation. This modified augmentation was actually used in the ISC21 Descriptor Track 1st-place solution.
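
    As a rough sketch of the idea (a hypothetical simplification, not the actual diff), parameters given as (min, max) tuples could be re-sampled on every call:

    import random

    class RandomEmojiOverlay:
        def __init__(self, emoji_size=(0.1, 0.3), opacity=(0.5, 1.0)):
            # Each parameter may be a fixed float or a (min, max) range
            self.emoji_size = emoji_size
            self.opacity = opacity

        @staticmethod
        def _sample(value):
            return random.uniform(*value) if isinstance(value, tuple) else value

        def __call__(self, image):
            size = self._sample(self.emoji_size)  # fresh draw per call
            opacity = self._sample(self.opacity)
            # The real transform would now overlay the emoji, e.g. via
            # augly.image.overlay_emoji(image, emoji_size=size, opacity=opacity)
            return image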

    Test Results

    The tests pass except for the following error.

    Traceback (most recent call last):
      File "/Users/shuhei.yokoo/Documents/AugLy/augly/tests/image_tests/transforms_unit_test.py", line 177, in test_RandomEmojiOverlay
        self.evaluate_class(
      File "/Users/shuhei.yokoo/Documents/AugLy/augly/tests/image_tests/base_unit_test.py", line 129, in evaluate_class
        are_equal_images(dst, ref), "Expected and outputted images do not match"
      File "/Users/shuhei.yokoo/Documents/AugLy/augly/tests/image_tests/base_unit_test.py", line 20, in are_equal_images
        return a.size == b.size and np.allclose(np.array(a), np.array(b))
      File "<__array_function__ internals>", line 5, in allclose
      File "/Users/shuhei.yokoo/.pyenv/versions/augly/lib/python3.8/site-packages/numpy/core/numeric.py", line 2249, in allclose
        res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
      File "<__array_function__ internals>", line 5, in isclose
      File "/Users/shuhei.yokoo/.pyenv/versions/augly/lib/python3.8/site-packages/numpy/core/numeric.py", line 2358, in isclose
        return within_tol(x, y, atol, rtol)
      File "/Users/shuhei.yokoo/.pyenv/versions/augly/lib/python3.8/site-packages/numpy/core/numeric.py", line 2339, in within_tol
        return less_equal(abs(x-y), atol + rtol * abs(y))
    ValueError: operands could not be broadcast together with shapes (1080,1920,3) (1080,1920,4) 
    
    ----------------------------------------------------------------------
    Ran 71 tests in 35.609s
    
    FAILED (errors=1, skipped=4)
    sys:1: ResourceWarning: unclosed file <_io.BufferedReader name='/Users/shuhei.yokoo/Documents/AugLy/augly/assets/tests/image/inputs/dfdc_1.jpg'>
    

    It seems that the expected image has an alpha channel whereas the output image doesn't. I'm not sure why this happened. Is replacing the expected image needed?

    CLA Signed 
    opened by lyakaap 27
  • Optimize hflip


    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    I am trying to speed up horizontal flipping; what I did is use VidGear to execute an ffmpeg command with the ultrafast preset.

    With my changes, the HFlip tests run in 4.724 seconds (the tests themselves complete in around 2.5 seconds). Without my changes, they run in 12.534 seconds (around 10.5 seconds).

    I've run each test with and without my changes five times in fish shell with this command: time for i in (seq 5); python -m unittest augly.tests.video_tests.transforms.ffmpeg_test.TransformsVideoUnitTest.test_HFlip; end
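
    For reference, the core of the change amounts to something like this (a sketch using a plain subprocess instead of VidGear's actual API):

    import subprocess

    def hflip(in_path: str, out_path: str) -> None:
        # Flip a video horizontally via ffmpeg; "ultrafast" trades
        # compression efficiency for encoding speed.
        subprocess.run(
            ["ffmpeg", "-y", "-i", in_path, "-vf", "hflip", "-preset", "ultrafast", out_path],
            check=True,
        )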

    Unit Tests

    If your changes touch the audio module, please run all of the audio tests and paste the output here. Likewise for image, text, & video. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

    Audio

    python -m unittest discover -s augly/tests/audio_tests/ -p "*"
    

    Image

    python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
    # Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)
    

    Text

    python -m unittest discover -s augly/tests/text_tests/ -p "*"
    

    Video

    python -m unittest discover -s augly/tests/video_tests/ -p "*"
    

    All

    python -m unittest discover -s augly/tests/ -p "*"
    

    Other testing

    If applicable, test your changes and paste the output here. For example, if your changes affect the requirements/installation, test installing augly in a fresh conda env, then make sure you are able to import augly and run the unit tests.

    CLA Signed 
    opened by Adib234 27
  • Problem using audio augmentations with tensorflow


    I tried to use the audio augmentations in a tensorflow project but got a "Segmentation fault" error at runtime while importing the modules. In my case it can be reproduced just by running these two lines:

    import tensorflow
    import augly.audio as audaugs
    

    Versions:

    tensorflow-gpu==2.4.1
    augly==0.1.1
    Python 3.8.5
    

    Thank you

    bug 
    opened by mcanan 20
  • Return images same mode (initial commit)


    Related Issue

    Fixes #128

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)
    • [x] Add src_mode arg to ret_and_save_image() in image/utils/utils.py
    • [x] Pass src_mode arg into ret_and_save_image() from every augmentation except convert_color in image/functional.py (e.g. here for apply_lambda)
    • [x] In image test evaluate_class(), assert that the mode of self.img & dst are equal.
    • [x] Run image tests, make sure they all pass: python -m unittest discover -s augly/tests/image_tests/ -p "*"
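
    A rough sketch of the src_mode idea from the checklist above (a hypothetical simplification, not the actual diff):

    from PIL import Image

    def ret_and_save_image(image, output_path=None, src_mode=None):
        # Convert the output back to the source image's mode before returning
        if src_mode is not None and image.mode != src_mode:
            image = image.convert(src_mode)
        if output_path is not None:
            image.save(output_path)
        return image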

    Unit Tests

    If your changes touch the audio module, please run all of the audio tests and paste the output here. Likewise for image, text, & video. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

    Image

    python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
    # Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)
    

    Test Output: n/a

    Other testing

    N/A

    CLA Signed Merged 
    opened by membriux 17
  • Added spatial bbox helper


    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    Computes the bounding box that encloses a white box on a black background after an arbitrary augmentation is applied.
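
    Conceptually, the helper works along these lines (a hedged sketch; the real helper's name and signature may differ):

    import numpy as np
    from PIL import Image

    def spatial_bbox_helper(aug, bbox, size=(256, 256)):
        left, top, right, bottom = bbox
        # Draw the source bbox as a white box on a black canvas
        mask = Image.new("RGB", size, "black")
        mask.paste(Image.new("RGB", (right - left, bottom - top), "white"), (left, top))
        # Apply the augmentation, then read off where the white pixels landed
        out = np.array(aug(mask).convert("L"))
        ys, xs = np.nonzero(out > 0)
        return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1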

    Image

    python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
    # Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)
    
    Ran 82 tests in 53.014s
    
    OK (skipped=5)
    

    Other testing

    Colab notebook testing the bbox helper → https://colab.research.google.com/drive/1g_0I6f_bv4Wsna6l9jjZrOJ62a4dpu8U#scrollTo=yUczCe6FU9Bs

    CLA Signed 
    opened by membriux 15
  • `black` formatting


    Summary: Now our files are all correctly formatted for black, and thus we can run black on our files during code review & not have a million irrelevant changes!

    Followed the steps here: https://fb.prod.workplace.com/groups/pythonfoundation/posts/2990917737888352/

    • Removed aml/augly/ from list of dirs excluded from black formatting in fbsource/tools/arcanist/lint/fbsource-lint-engine.toml
    • Ran arc lint --take BLACK --apply-patches --paths-cmd 'hg files aml/augly/'
    • Fixed type issues in aml/augly/text/fb/augmenters/back_translate/ which were causing linter errors (there were pyre-ignore comments but they were no longer on the correct line post-black, so just added an assert & initialized a var so we no longer need them)
    • Made changes to TARGETS files suggested by linter (moving imports/changing from where some libraries were imported)

    Differential Revision: D31526814

    fb-exported 
    opened by zpapakipos 12
  • Fix for rotating bounding box the wrong way


    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    The current method for rotating bounding boxes was buggy: it rotated the bounding box in the wrong direction. This simple sign change fixes it.
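
    To see why a single sign matters (a generic illustration, not the repo's actual code): in image coordinates, where y grows downward, flipping the sign on the sin terms reverses the apparent rotation direction:

    import math

    def rotate_point(x, y, cx, cy, theta_deg):
        t = math.radians(theta_deg)
        dx, dy = x - cx, y - cy
        # Flipping the sign on the sin terms rotates the box the other way;
        # exactly the kind of bug a one-sign change fixes.
        return (cx + dx * math.cos(t) - dy * math.sin(t),
                cy + dx * math.sin(t) + dy * math.cos(t))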

    Test case: https://github.com/juliusfrost/AugLy/blob/rotate-bounding-box-jupyter/examples/test_rotate.ipynb

    Unit Tests

    Passes unit tests.

    CLA Signed 
    opened by juliusfrost 11
  • Update resize() to be on par with torchvision speed


    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    Summary: Refactored resize() from image/functional.py to be on par with torchvision's speed. However, I have one minor test failure. Please advise on where I should look :)
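
    The gist of the change (a sketch; the real resize() also handles paths, output saving, and metadata) is to pass an explicit, cheaper resampling filter:

    from PIL import Image

    def resize(image: Image.Image, width: int, height: int) -> Image.Image:
        # Bilinear resampling is much faster than higher-quality filters
        # such as Lanczos, at a small cost in sharpness
        return image.resize((width, height), resample=Image.BILINEAR)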

    Test Results

    The following results were acquired on my machine:

    • Augly original (without interpolation)= 0.04475s
    • Augly revised (with interpolation) = 0.02873s
    • torchvision (uses interpolation) = 0.02696s

    Test code → https://colab.research.google.com/drive/14-KZdSGaOaz73OgIS0DZZY4RsS3cJ0rg#scrollTo=xVI_h-1v49lC

    Unit Tests

    If your changes touch the audio module, please run all of the audio tests and paste the output here. Likewise for image, text, & video. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

    
    Image

    python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
    # Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)


    Test Output:

    ======================================================================
    FAIL: test_Resize (transforms_unit_test.TransformsImageUnitTest)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/Users/macbookpro/Desktop/Github/AugLy/augly/tests/image_tests/transforms_unit_test.py", line 187, in test_Resize
        self.evaluate_class(imaugs.Resize(), fname="resize")
      File "/Users/macbookpro/Desktop/Github/AugLy/augly/tests/image_tests/base_unit_test.py", line 111, in evaluate_class
        self.assertTrue(
    AssertionError: False is not true
    
    ----------------------------------------------------------------------
    Ran 82 tests in 52.735s
    
    FAILED (failures=1, skipped=5)
    
    CLA Signed 
    opened by membriux 10
  • `OSError: unknown freetype error` when using `OverlayText`


    🐛 Bug

    To Reproduce

    Steps to reproduce the behavior:

    1. OverlayText with a combination of specific text [166, 287] and a specific font (NotoSansBengaliUI-Regular.ttf) results in an error.
    2. It appears that even if these two characters (with indices 166 and 287) are separated by any number of other characters, the same error occurs.
    # imports
    import os
    import augly.image as imaugs
    import augly.utils as utils
    from IPython.display import display
    
    # import paths
    from augly.utils.base_paths import (
        EMOJI_DIR,
        FONTS_DIR,
        SCREENSHOT_TEMPLATES_DIR,
    )
    
    # read sample input image
    input_img_path = os.path.join(
        utils.TEST_URI, "image", "inputs", "dfdc_1.jpg"
    )
    input_image = imaugs.scale(input_img_path, factor=0.2)
    
    # This results in error
    overlay_text = imaugs.OverlayText(
        text = [166, 287],
        font_file = os.path.join(FONTS_DIR, "NotoSansBengaliUI-Regular.ttf"),
    )
    overlay_text(input_image)
    
    # This does not result in error
    overlay_text = imaugs.OverlayText(
        text = [166],
        font_file = os.path.join(FONTS_DIR, "NotoSansBengaliUI-Regular.ttf"),
    )
    overlay_text(input_image)
    
    # This does not result in error
    overlay_text = imaugs.OverlayText(
        text = [287],
        font_file = os.path.join(FONTS_DIR, "NotoSansBengaliUI-Regular.ttf"),
    )
    overlay_text(input_image)
    
    # This does not result in error
    overlay_text = imaugs.OverlayText(
        text = [166, 287]
       # font_file not specified
    )
    overlay_text(input_image)
    

    Stack trace:

    ---------------------------------------------------------------------------
    OSError                                   Traceback (most recent call last)
    <ipython-input-74-a3f6c232cffd> in <module>
          3     font_file = os.path.join(FONTS_DIR, "NotoSansBengaliUI-Regular.ttf"),
          4 )
    ----> 5 overlay_text(input_image)
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/augly/image/transforms.py in __call__(self, image, force, metadata)
         48             return image
         49 
    ---> 50         return self.apply_transform(image, metadata)
         51 
         52     def apply_transform(
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/augly/image/transforms.py in apply_transform(self, image, metadata)
        920             x_pos=self.x_pos,
        921             y_pos=self.y_pos,
    --> 922             metadata=metadata,
        923         )
        924 
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/augly/image/functional.py in overlay_text(image, output_path, text, font_file, font_size, opacity, color, x_pos, y_pos, metadata)
       1121         text=text_str,
       1122         fill=(color[0], color[1], color[2], round(opacity * 255)),
    -> 1123         font=font,
       1124     )
       1125 
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/PIL/ImageDraw.py in text(self, xy, text, fill, font, anchor, spacing, align, direction, features, language, stroke_width, stroke_fill, embedded_color, *args, **kwargs)
        461             else:
        462                 # Only draw normal text
    --> 463                 draw_text(ink)
        464 
        465     def multiline_text(
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/PIL/ImageDraw.py in draw_text(ink, stroke_width, stroke_offset)
        416                     ink=ink,
        417                     *args,
    --> 418                     **kwargs,
        419                 )
        420                 coord = coord[0] + offset[0], coord[1] + offset[1]
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/PIL/ImageFont.py in getmask2(self, text, mode, fill, direction, features, language, stroke_width, anchor, ink, *args, **kwargs)
        668         """
        669         size, offset = self.font.getsize(
    --> 670             text, mode, direction, features, language, anchor
        671         )
        672         size = size[0] + stroke_width * 2, size[1] + stroke_width * 2
    
    OSError: unknown freetype error
    

    Expected behavior

    I don't know why this combination of text and font does not work. I would expect there to be no error, since overlaying the same text using another font file does not result in error.

    Environment

    • AugLy Version (e.g., 0.1.2): 0.1.5
    • OS (e.g., Linux): Ubuntu 20.04
    • How you installed AugLy (pip install augly, clone & pip install -e AugLy): pip install augly
    • Python version: 3.7.9
    • Other relevant packages (Tensorflow, etc):

    Additional context

    Same error message can be reproduced using the following combinations as well:

    • text [449, 262] with font file NotoSansThaana-Regular.ttf
    • text [295, 481] with font file NotoSansBengali-Regular.ttf

    dependency bug 
    opened by jkim222383 10
  • increasing highpass efficiency


    Related Issue

    Fixes N/A

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    Speeding up high_pass_filter by taking advantage of existing dependencies.

    Previous runtime: 1.686479s
    New runtime: 0.045012s
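
    The gist (a sketch; the actual diff may differ) is to delegate the filtering to an existing dependency such as torchaudio instead of recomputing it in slower Python code:

    import torch
    import torchaudio.functional as F

    def high_pass_filter(audio: torch.Tensor, sample_rate: int, cutoff_hz: float = 3000.0) -> torch.Tensor:
        # Biquad highpass filter provided by torchaudio
        return F.highpass_biquad(audio, sample_rate, cutoff_freq=cutoff_hz)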

    Unit Tests

    If your changes touch the audio module, please run all of the audio tests and paste the output here. Likewise for image, text, & video. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

    Audio

    python -m unittest discover -s augly/tests/audio_tests/ -p "*"
    ......./home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this w
    arning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.dtype(np.float): np.complex,
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.complex` is a deprecated alias for the builtin `complex`. To silence this warn
    ing, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.dtype(np.float): np.complex,
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:869: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning,
    use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      mag = np.abs(S).astype(np.float)
    ........../home/adityaprasad/.local/lib/python3.8/site-packages/librosa/core/spectrum.py:1223: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence
    this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      time_steps = np.arange(0, D.shape[1], rate, dtype=np.float)
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning,
     use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.dtype(np.float): np.complex,
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.complex` is a deprecated alias for the builtin `complex`. To silence this warn
    ing, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.dtype(np.float): np.complex,
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/core/spectrum.py:1223: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warni
    ng, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      time_steps = np.arange(0, D.shape[1], rate, dtype=np.float)
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:869: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning,
    use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      mag = np.abs(S).astype(np.float)
    ..................................................
    ----------------------------------------------------------------------
    Ran 67 tests in 5.665s
    
    OK
    
    

    If applicable, test your changes and paste the output here. For example, if your changes affect the requirements/installation, test installing augly in a fresh conda env, then make sure you are able to import augly and run the unit tests.

    CLA Signed 
    opened by iAdityaEmpire 9
  • Paraphrasing using AugLy


    🚀 Feature

    As I was going through AugLy, I didn't find anything that can paraphrase a sentence and create 2-3 sentences from one, as XLNet does in the nlpaug library. If it is already available in AugLy, can you please point to it?

    Motivation

    I have little data to train on, so augmentation with paraphrasing will help me create more data and let me train the model.

    Pitch

    I want one sentence to be paraphrased into however many sentences I want. For example, if I give n=3, the function should produce 3 sentences from 1 sentence which have similar meaning (paraphrased, basically).

    opened by ChiragM-Hexaware 0
  • Missing comparison in paper


    Hi AugLy team, Thanks for the great package!

    I have seen your comparison with other libraries in the paper and have to highlight a missing point: in the table below, pixelization is not compared with an alternative from albumentations.

    My guess is you didn't find one, so may I suggest looking at https://albumentations.ai/docs/api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Downscale? It does pixelization using the default params.

    opened by arsenyinfo 0
  • Disabled transitions if the transition duration is too short.


    Summary: Fixing a pattern of failures where ffmpeg fails when the transition duration is too short.

    Refactored the concatenation (no transition effect) into a separate function, and shifted to concatenation when the transition duration is below 0.5 seconds (chosen empirically).

    Differential Revision: D37737241

    CLA Signed fb-exported 
    opened by gpostelnicu 1
  • Issue about not specifying the path to the ffmpeg package


    1. Issue

    When I use AugLy to perform data augmentation operations on videos, I encounter this problem:

    Compression Mode is disabled, Kindly enable it to access this function.
    

    But I have installed ffmpeg and configured the environment. Then I tried to debug the code and found a bug.

    When the function add_augmenter() calls WriteGear(), the value of the parameter custom_ffmpeg is not specified. But later in the code, its real value is needed:

    Line 215 in writegear.py:
        
                self.__ffmpeg = get_valid_ffmpeg_path(
                    custom_ffmpeg,
                    self.__os_windows,
                    ffmpeg_download_path=__ffmpeg_download_path,
                    logging=self.__logging,
                )
    

    As a result, in get_valid_ffmpeg_path() the returned value is always False, which causes the program to fail to find the locally downloaded ffmpeg package, so it must be downloaded again.

    Line 885 in helper.py:
    
    def get_valid_ffmpeg_path(
        custom_ffmpeg="", is_windows=False, ffmpeg_download_path="", logging=False
    ):
        """
        ## get_valid_ffmpeg_path
    
        Validate the given FFmpeg path/binaries, and returns a valid FFmpeg executable path.
    
        Parameters:
            custom_ffmpeg (string): path to custom FFmpeg executables
            is_windows (boolean): is running on Windows OS?
            ffmpeg_download_path (string): FFmpeg static binaries download location _(Windows only)_
            logging (bool): enables logging for its operations
    
        **Returns:** A valid FFmpeg executable path string.
        """
        final_path = ""
        if is_windows:
            # checks if current os is windows
            if custom_ffmpeg:
                # if custom FFmpeg path is given assign to local variable
                final_path += custom_ffmpeg
            else:
                # otherwise auto-download them
                try:
                    if not (ffmpeg_download_path):
                        # otherwise save to Temp Directory
                        import tempfile
    
                        ffmpeg_download_path = tempfile.gettempdir()
    
                    logging and logger.debug(
                        "FFmpeg Windows Download Path: {}".format(ffmpeg_download_path)
                    )
    
                    # download Binaries
                    os_bit = (
                        ("win64" if platform.machine().endswith("64") else "win32")
                        if is_windows
                        else ""
                    )
                    _path = download_ffmpeg_binaries(
                        path=ffmpeg_download_path, os_windows=is_windows, os_bit=os_bit
                    )
                    # assign to local variable
                    final_path += _path
    

    2. Solution

    Give the path to the local ffmpeg package when calling get_valid_ffmpeg_path(), like this:

     self.__ffmpeg = get_valid_ffmpeg_path(
                    "D:/workSoftware/anaconda/envs/augly/Library/bin/",
                    self.__os_windows,
                    ffmpeg_download_path=__ffmpeg_download_path,
                    logging=self.__logging,
                )
    

    Its path can be obtained using the following method:

    import distutils.spawn
    distutils.spawn.find_executable('ffmpeg')
    

    This way you won't need to reinstall the ffmpeg package every time you use it.

    opened by ZOMIN28 0
  • mosaic


    Hi, is there any data augmentation that produces the effect in the following image, which looks like a mosaic? I used pixelation, but it doesn't produce this kind of corruption.


    Thank you !

    opened by WEIZHIHONG720 0
  • Support For Keypoints?


    🚀 Feature

    Can the augmentations track keypoints through transformations? I see you have support for bounding boxes, so is it possible to set the bounding box as a single coordinate and track it?

    Motivation

    Landmark Localization.

    Pitch

    Track x,y coordinates through transformations.

    opened by Schobs 0
Releases
  • v1.0.0 (Mar 29, 2022)

    Changes

    Text:

    • Fixed return types in the doc strings so all text augmentations are consistent.
    • Conserved whitespace through tokenization/detokenization for all text augmentations, so they are now consistent.

    Image:

    • Fixed bug with bounding boxes in rotate augmentation.

    Overall:

    • Split dependencies by modality so installation will be lighter-weight for most users. See issue https://github.com/facebookresearch/AugLy/issues/208 as well as the README of each modality for more details.
    • Moved the test input/output data out of the main augly folder so it isn't packaged with the PyPI package, to make installation lighter-weight.
  • v0.2.1 (Dec 17, 2021)

    Changes

    Audio:

    • New augmentations: loop
    • Efficiency improvements: made high_pass_filter & low_pass_filter ~97% faster by using torchaudio

    Image:

    • New augmentations: skew
    • Added bbox computation helper spatial_bbox_helper to make it easier to add new image augmentations & automatically compute the bounding box transformations (e.g. see how we used this for skew here)
    • Efficiency improvements: made resize ~35% faster by defaulting to bilinear interpolation

    Text:

    • Allow multi-word typo replacement
    • Efficiency improvements: made contractions, replace_similar_chars, replace_similar_unicode_chars, replace_upside_down ~40-60% faster using algorithmic improvements

    Video:

    • Efficiency improvements: made 30 of the video augmentations faster using vidgear (a new dependency we added in this release) to execute ffmpeg commands using higher compression rates (e.g. hflip 75% faster, loop 85% faster, remove_audio 96% faster, pixelization 71% faster)

    Overall:

    • Modified internal imports to be Python 3.6-compatible
    • Added error messages to unit tests for easier debugging
    • If you want to see a full report benchmarking the runtimes of all AugLy augmentations versus other libraries, keep an eye out for the AugLy paper, which will be up on arXiv in January!
  • v0.1.10 (Oct 18, 2021)

    Changes

    Image

    • Added bounding box support to all augmentations
    • Images are now returned in the same format they were passed into all augmentations (except convert_color)

    Text

    • New augmentations: swap_gendered_words, merge_words, change_case, contractions
    • Allow for kwarg overriding in __call__() for all augmentations
    • Exposed typo_type param in simulate_typos aug
    • Added ignore_words param to replace_words & swap_gendered_words

    Video

    • New augmentation: augment_audio

    Other

    • Enforce black formatting
  • v0.1.7 (Sep 13, 2021)

    Changes

    Image

    • New augmentations: apply_pil_filter, clip_image_size, overlay_onto_background_image, overlay_onto_background_image_with_blurred_mask
    • New unit tests: Compose, overlay_image
    • Fixed color_jitter_intensity
    • Don't modify input image in overlay_stripes
    • Added metadata arg to Compose operator
    • Added support to overlay_text for multi-line text
    • Added resize_src_to_match_template option to overlay_onto_screenshot
    • Improved meme_format error message

    Text

    • New augmentation: insert_whitespace_chars
    • Add metadata arg to Compose operator
    • Added more font options to replace_fun_fonts

    Video

    • Added metadata arg to Compose operator, added unit test
  • v0.1.5 (Jul 9, 2021)

  • v0.1.3 (Jun 28, 2021)

  • v0.1.2 (Jun 22, 2021)
