Aloception is a set of packages for computer vision: aloscene, alodataset, alonet.

Overview

Documentation

Aloception

Aloception is a set of packages for computer vision built on top of popular deep learning libraries: pytorch and pytorch lightning.

Aloscene

Aloscene extends the use of tensors with Augmented Tensors designed to facilitate the use of computer vision data (such as frames, 2d boxes, 3d boxes, optical flow, disparity, camera parameters...).

frame = aloscene.Frame("/path/to/image.jpg")
frame = frame.to("cpu")
frame.get_view().render()

Alodataset

Alodataset implements ready-to-use datasets for computer vision with the help of aloscene and augmented tensors to make it easier to transform and display your vision data.

coco_dataset = alodataset.CocoDetectionDataset(sample=True)
for frame in coco_dataset.stream_loader():
    frame.get_view().render()

Alonet

Alonet integrates several promising computer vision architectures. You can use it for research purposes or to finetune and deploy your model using TensorRT. Alonet is mainly built on top of lightning with the help of aloscene and alodataset.

Training

# Init the training pipeline
detr = alonet.detr.LitDetr()
# Init the data module
coco_loader = alonet.detr.CocoDetection2Detr()
# Run the training using the two components
detr.run_train(data_loader=coco_loader, project="detr", expe_name="test_experiment")

Inference

# Load model
model = alonet.detr.DetrR50(num_classes=91, weights="detr-r50").eval()

# Open and normalize the frame
frame = aloscene.Frame("/path/to/image.jpg").norm_resnet()

# Run inference
pred_boxes = model.inference(model([frame]))

# Add and display the predicted boxes
frame.append_boxes2d(pred_boxes[0], "pred_boxes")
frame.get_view().render()

Note

You can use aloscene independently of the other two packages to handle computer vision data, or to improve your training pipelines with augmented tensors.

Install

Aloception's packages are built on top of multiple libraries. Most of them are listed in requirements.txt:

pip install -r requirements.txt

Once the other packages are installed, you still need to install pytorch based on your hardware and environment configuration. Please refer to the pytorch website for this install.

Getting started

Tutorials

Alonet

Models

Model name | Link | alonet location | Learn more
detr-r50 | https://arxiv.org/abs/2005.12872 | alonet.detr.DetrR50 | Detr
deformable-detr | https://arxiv.org/abs/2010.04159 | alonet.deformable_detr.DeformableDETR | Deformable detr
RAFT | https://arxiv.org/abs/2003.12039 | alonet.raft.RAFT | RAFT

Detr

Here is a simple example to get started with Detr and aloception. To learn more about Detr, you can check out the Tutorials or the detr README.

# Load model
model = alonet.detr.DetrR50(num_classes=91, weights="detr-r50").eval()

# Open and normalize the frame
frame = aloscene.Frame("/path/to/image.jpg").norm_resnet()

# Run inference
pred_boxes = model.inference(model([frame]))

# Add and display the predicted boxes
frame.append_boxes2d(pred_boxes[0], "pred_boxes")
frame.get_view().render()

Deformable Detr

Here is a simple example to get started with Deformable Detr and aloception. To learn more about Deformable Detr, you can check out the Tutorials or the deformable detr README.

# Loading Deformable model
model = alonet.deformable_detr.DeformableDetrR50(num_classes=91, weights="deformable-detr-r50").eval()

# Open and normalize the frame, then send it to the device
frame = aloscene.Frame("/path/to/image.jpg").norm_resnet().to(torch.device("cuda"))

# Run inference
pred_boxes = model.inference(model([frame]))

# Add and display the predicted boxes
frame.append_boxes2d(pred_boxes[0], "pred_boxes")
frame.get_view().render()

RAFT

Here is a simple example to get started with RAFT and aloception. To learn more about RAFT, you can check out the raft README.

# Use the left frame from the Sintel Flow dataset and normalize the frame for the RAFT model
frame = alodataset.SintelFlowDataset(sample=True).getitem(0)["left"].norm_minmax_sym()

# Load the model using the sintel weights
raft = alonet.raft.RAFT(weights="raft-sintel")

# Compute optical flow
padder = alonet.raft.utils.Padder()
flow = raft.inference(raft(padder.pad(frame[0:1]), padder.pad(frame[1:2])))

# Render the flow along with the first frame
flow[0].get_view().render()

Alodataset

Here is a list of all the datasets you can use with Aloception. If your dataset is not in the list but is important for computer vision, please let us know through the issues, or feel free to contribute.

Datasets

Dataset name | alodataset location | To try
CocoDetection | alodataset.CocoDetectionDataset | python alodataset/coco_detection_dataset.py
CrowdHuman | alodataset.CrowdHumanDataset | python alodataset/crowd_human_dataset.py
Waymo | alodataset.WaymoDataset | python alodataset/waymo_dataset.py
ChairsSDHom | alodataset.ChairsSDHomDataset | python alodataset/chairssdhom_dataset.py
FlyingThings3DSubset | alodataset.FlyingThings3DSubsetDataset | python alodataset/flyingthings3D_subset_dataset.py
FlyingChairs2 | alodataset.FlyingChairs2Dataset | python alodataset/flying_chairs2_dataset.py
SintelDisparityDataset | alodataset.SintelDisparityDataset | python alodataset/sintel_disparity_dataset.py
SintelFlowDataset | alodataset.SintelFlowDataset | python alodataset/sintel_flow_dataset.py
MOT17 | alodataset.Mot17 | python alodataset/mot17.py

Unit tests

python -m pytest

Licence

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Comments
  • How can a resizing ruin your normalization

    
    frame = torch.ones((3, 50, 50))
    frame[0, 0, 0 : 10] = 0
    
    frame = aloscene.Frame(frame, normalization="01", names=tuple("CHW"))
    frame = frame.norm_minmax_sym()
    
    print("(min, max) :", frame.min().item(), frame.max().item())
    frame = frame.resize((20, 20)) 
    
    frame = frame.norm_minmax_sym()  # nothing changes, just a clone
    print("(min, max) :", frame.min().item(), frame.max().item())
    
    (min, max) : -1.0 1.0
    (min, max) : 0.5 1.0
    
    
    bug 
    opened by Data-Iab 5
  • Alobugsdays - Aloception Logo

    Update the Aloception Logo.

    • Fix #257 : The Aloception logo wasn't up to date.

    This pull request includes

    • [x] Bug fix (non-breaking change which fixes an issue)
    • [ ] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [ ] This change requires a documentation update
    documentation enhancement good first issue invalid quick-fix alobugdays 
    opened by Ardorax 4
  • add depth absolute/relative encoding

    Two methods added to Depth:

    • encode_inverse : invert depth with given scale and shift.
    • encode_absolute : undo encode_inverse changes.

    One method added to AugTensor:

    • to_squeezed_numpy: as its name indicates, converts to squeezed numpy.
    opened by Data-Iab 4
  • BoundingBoxes2d _rotate and _spatial_shift implementation doesn't support the case where the box is no longer visible

    Minimal reproducible example : here

    Output :

    Len points before transforms 500
    Len boxes before transforms 500
    Len points after transforms 439
    Len boxes after transforms 500
    
    

    What happens : I create a random set of aloscene.Points2d, and also a set of aloscene.BoundingBoxes2d whose centers are the random points, with very small height and width. Therefore, the Points2d and the Boxes2d represent the same data. When I apply alotransforms such as Rotate and SpatialShift to my frame, some of the points are no longer visible. This is taken into account for the Points2d class (the number of points decreases after the transformation) but not in the Boxes2d class where the number of points stays the same no matter if some points/boxes are not visible anymore.

    Also, the lack of implementation of the _rotate function in the BoundingBoxes2d class means that the Points2d are correctly rendered on the rotated frame, but the Boxes2d are rendered as if the frame were not rotated.
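
The mismatch described above can be illustrated without aloscene. The sketch below is a plain-Python stand-in, not the BoundingBoxes2d implementation: boxes are hypothetical (xc, yc, w, h) tuples in pixels, and shift_and_filter_boxes is an illustrative helper that drops boxes that no longer overlap the frame after a shift, which is the behaviour the issue reports for Points2d.

```python
def shift_and_filter_boxes(boxes, shift_x, shift_y, frame_w, frame_h):
    """Shift (xc, yc, w, h) boxes and keep only those still overlapping the frame."""
    kept = []
    for xc, yc, w, h in boxes:
        xc, yc = xc + shift_x, yc + shift_y
        x1, x2 = xc - w / 2, xc + w / 2
        y1, y2 = yc - h / 2, yc + h / 2
        # A box stays visible if its intersection with the frame has a positive area.
        if x2 > 0 and x1 < frame_w and y2 > 0 and y1 < frame_h:
            kept.append((xc, yc, w, h))
    return kept


# Three small boxes; the second one leaves a 100x100 frame after a 10 px shift.
boxes = [(10, 10, 2, 2), (95, 50, 2, 2), (50, 50, 2, 2)]
visible = shift_and_filter_boxes(boxes, shift_x=10, shift_y=0, frame_w=100, frame_h=100)
print(len(boxes), "boxes before,", len(visible), "boxes after")
```

With this kind of filtering, the number of boxes would decrease together with the number of points.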

    Why it matters & why can't I just use the Points2d class ?

    Our DETR implementation uses the Boxes2d class for criterion & hungarian matching. All our implementations that inherit from DETR also use this class, even Deformable2dPoints that predicts points & not bounding boxes. Therefore, every time someone adapts DETR to a new project, there is a risk of error.

    bug help wanted invalid aloscene hard 
    opened by Dee61298 3
  • Linter: Black & Flake8

    GitHub Actions to check the coding style. To install the correct versions:

    pip install black==22.10.0
    pip install flake8==5.0.4

    • linter job in Github Actions : Use Black and Flake8.

    This pull request includes

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [ ] This change requires a documentation update
    enhancement invalid discussion 
    opened by Ardorax 3
  • custom dataset from directory

    • New feature : Create image data iterator from path. Useful for:
      • Calibration: when the dataset iterator is not defined.
      • Testing: when you need to test with some images stored in a folder.

    Caution: the iterator does not take frame synchronization into account when the class is instantiated with a dict.

    from alodataset import FromDirectoryDataset
    
    
    # You can either pass a dict or a list of paths
    # from list of paths.
    path1 = "/PATH/TO/DATA/DIR1"
    path2 = "/PATH/TO/DATA/DIR2"
    
    data = FromDirectoryDataset(dirs=[path1, path2])
    img = data[0]
    
    # from dict of list of paths.
    path0 = "/PATH/TO/DATA/DIR0"
    path1 = "/PATH/TO/DATA/DIR1"
    path2 = "/PATH/TO/DATA/DIR2"
    path3 = "/PATH/TO/DATA/DIR3"
    
    data = FromDirectoryDataset(dirs={"key1": [path0, path1], "key2": [path2, path3]])
    img_key1 = data[0]["key1"]
    img_key2 = data[0]["key2"]
    

    This pull request includes

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [ ] This change requires a documentation update
    opened by Data-Iab 3
  • Kitti datasets

    Kitti Dataset (Stereo, Flow, Scene Flow, Depth, Odometry, Object, Tracking, Road, Semantics)

    • Kitti Depth : How to use Kitti Depth
    date = "2011_09_26"
    idsOfDrives = [
        "0001",  # sample from training subset
        "0002",  # sample from validation subset
    ]
    custom_drives = {date: idsOfDrives}
    kitti_ds = KittiDepth(
        subset="all",
        return_depth=True,
        custom_drives=custom_drives,
    )
    
    for f, frames in enumerate(kitti_ds.train_loader(batch_size=2)):
        frames = Frame.batch_list(frames)
    
    • Kitti Semantic : The semantic class
    dataset = KittiSemanticDataset()
    obj = dataset.getitem(0)
    obj.get_view().render()
    
    • How to use the remaining tasks of the dataset : the available dataset classes are KittiStereoFlow2012, KittiStereoFlowSFlow2015, KittiOdometryDataset, KittiObjectDataset, KittiTrackingDataset, KittiRoadDataset
    dataset = DATASET_CLASS(right_frame=False)
    obj = dataset.getitem(0)
    obj["left"].get_view().render()
    
    • Scene Flow: dimensions : Error with shape of occlusion mask.

    This pull request includes

    • [x] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [ ] This change requires a documentation update
    enhancement alodataset hard 
    opened by Ardorax 3
  • allow user to completely specify the grid view

    Add from_list_of_list argument to Renderer.get_view().

    This allows the user to give a list of list of views, defining the 2D grid of views. All views keep their original shape, and are top-left aligned in their rows.

    Example

    # Create three gray frames and display them on two rows (2 on the first row, 1 on the second row)
    import numpy as np
    import aloscene
    arrays = [np.full((3, 600, 650), 100), np.full((3, 500, 500), 50), np.full((3, 500, 800), 200)]
    frames = [aloscene.Frame(arr) for arr in arrays]
    views = [[frames[0].get_view(), frames[1].get_view()], [frames[2].get_view()]]
    aloscene.render(views, renderer="matplotlib")
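
To make the "top-left aligned in their rows" rule concrete, here is a minimal layout sketch in plain Python. It is not the Renderer code: grid_layout is a hypothetical helper, and it assumes that each row is as tall as its tallest view and that the canvas is as wide as the widest row.

```python
def grid_layout(view_shapes):
    """Return (canvas_h, canvas_w, offsets) for rows of (H, W) view shapes."""
    offsets = []
    top = 0
    canvas_w = 0
    for row in view_shapes:
        left = 0
        row_offsets = []
        for _, w in row:
            # Each view keeps its shape and is placed at the top of its row.
            row_offsets.append((top, left))
            left += w
        offsets.append(row_offsets)
        canvas_w = max(canvas_w, left)
        # Assumption: the next row starts below the tallest view of this row.
        top += max(h for h, _ in row)
    return top, canvas_w, offsets


# Same shapes as the three frames in the example above.
canvas_h, canvas_w, offsets = grid_layout([[(600, 650), (500, 500)], [(500, 800)]])
print(canvas_h, canvas_w, offsets)
```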
    
    aloscene 
    opened by jsalotti 3
  • add dataset merging order

    New :star:

    • MergeDataset comes with weights argument to order samples from merged datasets. weights=[2, 1] will force the iterator to return 2 samples from the first dataset then 1 sample from the second one...
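
The ordering produced by weights can be sketched in plain Python. This is an illustration of the pattern described above, not the MergeDataset code: merged_order is a hypothetical helper, and skipping a dataset once it is exhausted is an assumption.

```python
def merged_order(lengths, weights):
    """Yield (dataset_index, sample_index) pairs following the weights pattern."""
    cursors = [0] * len(lengths)
    # Assumes at least one positive weight for every non-empty dataset.
    while any(c < n for c, n in zip(cursors, lengths)):
        for ds_idx, weight in enumerate(weights):
            for _ in range(weight):
                # Assumption: an exhausted dataset is simply skipped.
                if cursors[ds_idx] < lengths[ds_idx]:
                    yield ds_idx, cursors[ds_idx]
                    cursors[ds_idx] += 1


# Two datasets with 4 and 2 samples, merged with weights=[2, 1].
order = list(merged_order(lengths=[4, 2], weights=[2, 1]))
print(order)
```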
    alodataset 
    opened by Data-Iab 3
  • Serving ready

    Export models based on Detr/Deformable detr in torch.jit.trace / onnx / tensorRT. Also, add production files to launch aloception services.

    This feature contains:

    • New parameter in models called tracing, in order to trace a model.
    • Fix outputs to Tuple(pred_boxes and pred_logits) for any traced model -> Update base_exporter.py
    • New model_handler.py class:
      • Base handler in Detr/Deformable used to create .mar file, needed to launch a service with torchserve command.
      • Customize inference procedure with setup_config.json file, (threshold / activation_fn / background_class), and enable_cuda_ops flag (for deformable models)
      • Include the COCO_index file, normally used with the torchserve command.

    Important requirements (requirements.txt and tensorrt_requirements.txt updated):

    • Serving:
      • torchserve
      • torch-model-archiver
      • torch-workflow-archiver
    • tensorRT:
      • torch==1.10.0 # Not included in requirements.txt
      • nvidia-tensorrt==8.2.0.6

    Problems fixed:

    • Variable types in cuda.ops fixed from LONG to INT32 for a correct export in tensorRT.
    • Implement special ops in transformers (both models) to be able to export them in tensorRT
    • deformable_detr/trt_export.py demo correction.
    • detr/trt_export.py demo now include export on CPU or GPU (as default).
    • Remove build files in cuda.ops procedure.
    • Fix handle_op_Clip based on #125

    Examples to test:

    # Export Detr/Deformable trace model (for serving)
    python alonet/detr/production/export_to_pt.py
    python alonet/deformable_detr/production/export_to_pt.py
    
    # Test serving proof
    python alonet/detr/production/model_handler.py # after generating the `.pt` file
    
    # Launch serving
    # Read README.md files in productions folders
    
    # Export Detr/Deformable in onnx/tensorRT
    python alonet/detr/trt_exporter.py
    python alonet/deformable_detr/trt_exporter.py
    

    Important notes:

    • All models that require export in onnx/tensorRT MUST INCLUDE tracing=True or model.tracing = True.
    • If an onnx/tensorRT export is desired, the input size is fixed to the HW values given in the trt_exporter.py arguments. Inference will produce bad results if the dimensions of the input do not match the dimensions of the generated model.
    enhancement alonet 
    opened by Johansmm 3
  • Sampler management in alodataset train_loader not generic

    Inside train_loader, the sampler arg is called with the dataset as its only argument, but other samplers need other arguments (like SubsetRandomSampler).

    The sampler should be constructed before the call, but we do not have the dataset yet at that point (for the random sampler, for instance).
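
One possible way around this, sketched below with plain-Python stand-ins, is to pass a factory (any callable taking the dataset) instead of a sampler class. Everything here is hypothetical: SubsetSampler, SequentialSampler and build_sampler are illustrative names, not alodataset or torch APIs.

```python
import random


class SequentialSampler:
    """Stand-in for a sampler that only needs the dataset."""

    def __init__(self, dataset):
        self.dataset = dataset

    def __iter__(self):
        return iter(range(len(self.dataset)))


class SubsetSampler:
    """Stand-in for a sampler that needs extra arguments (explicit indices)."""

    def __init__(self, indices, seed=0):
        self.indices = list(indices)
        self.seed = seed

    def __iter__(self):
        shuffled = self.indices[:]
        random.Random(self.seed).shuffle(shuffled)
        return iter(shuffled)


def build_sampler(dataset, sampler_factory):
    """Call the user-provided factory once the dataset exists."""
    return sampler_factory(dataset)


dataset = ["a", "b", "c", "d", "e"]
# A class taking only the dataset still works as a factory.
seq = build_sampler(dataset, SequentialSampler)
# A lambda adapts samplers that need other arguments.
sub = build_sampler(dataset, lambda ds: SubsetSampler(indices=range(0, len(ds), 2)))
print(list(seq), sorted(sub))
```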

    opened by ragier 3
  • new: motion blur augmentation

    • New feature : 3 different implementations of focus blur augmentation
    from alodataset.transforms import RandomFocusBlur, RandomFocusBlurV2, RandomFocusBlurV3
    
    import aloscene
    import torch
    
    frame = aloscene.Frame(torch.rand((3, 300, 300)))
    blured_frame1 = RandomFocusBlur()(frame)
    blured_frame2 = RandomFocusBlurV2()(frame)
    blured_frame3 = RandomFocusBlurV3()(frame)
    
    
    • New feature : Motion blur augmentation from optical flow
    ## Motion blur from RAFT-flow
    from alonet.raft.raft import RAFT
    
    flow_model = RAFT(weights="raft-things")
    flow_model = flow_model.eval()
    
    frame_t0_t1 = aloscene.Frame(torch.ones((2, 3, 300, 300)), names=tuple("TCHW"))
    frame_t0 = frame_t0_t1[0]
    frame_t1 = frame_t0_t1[1]
    
    blured_t1 = RandomFlowMotionBlur(flow_model=flow_model)(frame_t1, p_frame=frame_t0)
    blured_t1.get_view().render()
    
    ## Motion blur from ground truth optical flow
    flow = aloscene.Flow(torch.ones((2, 300, 300)))
    blured_t1 = RandomFlowMotionBlur()(frame_t1, flow=flow)
    blured_t1.get_view().render()
    
    
    • Fix bug X : CameraIntrinsic initialization with a shape of 4x4 was not possible using __init__

    This pull request includes

    • [x] Bug fix (non-breaking change which fixes an issue)
    • [x] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [ ] This change requires a documentation update
    opened by Data-Iab 0
  • thomas/depth-smart-resize

    Implements a pooling resize option to Depth. When downsampled, the depth is first [max/min]pooled before resizing to desired size using the NEAREST interpolation.
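
A minimal sketch of the idea, using nested lists instead of Depth tensors. The helpers pool2d, resize_nearest and pooled_resize are hypothetical names, and the choice of kernel size (integer ratio between input and output size) is an assumption, not necessarily what the Depth implementation does.

```python
def pool2d(depth, kernel, reduce_fn=min):
    """Non-overlapping pooling of a 2D list with a square kernel."""
    h, w = len(depth), len(depth[0])
    return [
        [
            reduce_fn(
                depth[y][x]
                for y in range(top, min(top + kernel, h))
                for x in range(left, min(left + kernel, w))
            )
            for left in range(0, w, kernel)
        ]
        for top in range(0, h, kernel)
    ]


def resize_nearest(depth, out_h, out_w):
    """Nearest-neighbour resize of a 2D list."""
    h, w = len(depth), len(depth[0])
    return [[depth[y * h // out_h][x * w // out_w] for x in range(out_w)] for y in range(out_h)]


def pooled_resize(depth, out_h, out_w, reduce_fn=min):
    """Pool first (assumed kernel: integer downscale ratio), then resize with NEAREST."""
    kernel = max(1, min(len(depth) // out_h, len(depth[0]) // out_w))
    pooled = pool2d(depth, kernel, reduce_fn) if kernel > 1 else depth
    return resize_nearest(pooled, out_h, out_w)


depth = [
    [9.0, 9.0, 9.0, 9.0],
    [9.0, 1.0, 9.0, 9.0],
    [9.0, 9.0, 9.0, 9.0],
    [9.0, 9.0, 9.0, 2.0],
]
plain = resize_nearest(depth, 2, 2)                 # the close points (1.0, 2.0) are lost
pooled = pooled_resize(depth, 2, 2, reduce_fn=min)  # min-pooling keeps them
print(plain, pooled)
```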


    This pull request includes

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [X] New feature (non-breaking change which adds functionality)
    • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [ ] This change requires a documentation update
    opened by tflahaul 0
  • 1st commit

    Added a "title" feature to the Matplotlib viewer (which previously was only available for the openCV viewer).

    Added a "title" argument to the get_view() method to be able to input a title directly. Use example : frames.get_view(title="test").render()

    enhancement aloscene 
    opened by Dee61298 0
  • Fix camera_calibe to solve boxes 3d wrong display

    camera_calib.py : There were just two misplaced variables in the cam_intrinsic update.

    spatial_augmented_tensor.py : I had to add this value=0 argument to work around a bug that may be fixed now.

    opened by FlorianCoissac 0
  • kumler-bauer now supports distortion coefficients in pixels instead of meters

    • New feature 1 : Kumler-bauer distortion coefficients are now expressed in pixels instead of meters. This fits aloception's computations better, because most of the work happens on the image plane, where a coefficient in meters is hard to handle. From now on, the distortion property for kumler_bauer needs only 2 elements instead of 3.
    import numpy as np
    from aloscene import Frame
    
    data = np.random.uniform(size=(3, 1024, 1024))
    frame = Frame(data, names=("C", "H", "W"), projection="kumler_bauer", distortion=[0.1, 0.1])
    
    

    This pull request includes

    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [ ] New feature (non-breaking change which adds functionality)
    • [x] Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [ ] This change requires a documentation update
    opened by anhtu293 0
  • Concatenating AugmentedTensor does not work when giving a tuple to torch.cat

    In aloception-oss, we have overloaded some operations of torch.tensor. For example, a mechanism allows torch.cat to concatenate multiple AugmentedTensor and their children, in a recursive manner.

    But in the current state of the code: torch.cat works as expected with a List of AugmentedTensor as input, but not with a tuple of AugmentedTensor.

    from aloscene import Frame
    from aloscene.tensors import AugmentedTensor
    
    x = Frame(torch.rand(3, 10, 10), names=('C', 'H', 'W'))
    x.add_child('mychild',AugmentedTensor(torch.rand(2), names=("N",)) , mergeable=True, align_dim=["B", "T"])
    y = Frame(torch.rand(3, 10, 10), names=('C', 'H', 'W'))
    y.add_child('mychild',AugmentedTensor(torch.rand(2), names=("N",)) , mergeable=True, align_dim=["B", "T"])
    result = torch.cat((x.batch(), y.batch()), dim=0)
    print(result.mychild.names, " - ", result.mychild.shape)
    

    Expected output:

    ('B', 'N')  -  torch.Size([2, 2])
    

    Current output:

    ('B', 'N')  -  torch.Size([1, 2])
    
    bug 
    opened by jsalotti 0
Releases(v0.3.0)
  • v0.3.0(Aug 16, 2022)

    What's Changed

    Features


    • Conversion between distance and depth by @anhtu293 in https://github.com/Visual-Behavior/aloception/pull/170

    The shortest distance between two points is a straight line. - Archimedes

    As Archimedes said, knowing the distance (straight line) between the camera and a point is as important as knowing the planar depth. It is therefore convenient to have methods that convert between the two.

    What's new ?

    • Handle negative points in encode_absolute: For wide-angle cameras (FoV > 180), some points can have a planar depth smaller than 0 (points behind the camera). To keep these points instead of clipping them to 0, pass keep_negative=True as an argument.
    • Depth to distance as_distance(): Convert depth to distance. Only pinhole camera and linear equidistant camera are supported at this time.
    • Distance to depth as_depth(): Convert distance to depth. Only pinhole camera and linear equidistant camera are supported at this time.
    • Possible to create a tensor of Distance by passing is_distance=True at initialization.
    • Support functions in depth_utils.
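
For the pinhole case, the relation between planar depth and distance can be written in a few lines. This is a geometric sketch, not the aloscene code: planar_to_euclidean and euclidean_to_planar are hypothetical helpers, and the intrinsics (fx, fy, cx, cy) are assumed to be expressed in pixels.

```python
import math


def planar_to_euclidean(depth, u, v, fx, fy, cx, cy):
    """Distance from the camera center to the 3D point seen at pixel (u, v)."""
    # Ray through the pixel, normalized so that its z component is 1.
    x = (u - cx) / fx
    y = (v - cy) / fy
    return depth * math.sqrt(x * x + y * y + 1.0)


def euclidean_to_planar(distance, u, v, fx, fy, cx, cy):
    """Inverse conversion: z coordinate (planar depth) from the distance."""
    x = (u - cx) / fx
    y = (v - cy) / fy
    return distance / math.sqrt(x * x + y * y + 1.0)


intr = dict(fx=100.0, fy=100.0, cx=50.0, cy=50.0)
center = planar_to_euclidean(2.0, 50, 50, **intr)    # on the optical axis: equal to the depth
corner = planar_to_euclidean(2.0, 250, 250, **intr)  # off-axis: larger than the depth
back = euclidean_to_planar(corner, 250, 250, **intr)
print(center, corner, back)
```

At the principal point both quantities coincide; away from it the distance grows with the angle to the optical axis.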

    Update

    • Change the term to avoid the confusion: "euclidean depth" for distance and "planar depth" for usual depth.
    • as_distance() becomes as_euclidean()
    • as_depth() becomes as_planar()

    Archimedes's quote now becomes: The shortest "euclidean depth" between two points is a straight line.


    • Support projection model and distortion by @anhtu293 in https://github.com/Visual-Behavior/aloception/pull/174

    New feature

    • Add projection and distortion as new properties of SpatialAugmentedTensor so that we can inherit for other types of tensor. Two projection models are supported: pinhole and equidistant. Default values are pinhole and 1.0 for distortion so it won't change anything for initialization if we are working on "pinhole" image. Only aloscene.Depth is supported for distortion and equidistant projection at this time.
    • Depth.as_points3d now supports the equidistant model with distortion. If no projection model and distortion are specified in arguments, as_points3d uses the projection and distortion property.
    • Depth.as_planar and Depth.as_euclidean now use projection and distortion property if there is no projection model and distortion specified in arguments.
    • Depth.__get_view__ now has a color legend if legend is set to True.

    • add tensorrt quantization by @Data-Iab in https://github.com/Visual-Behavior/aloception/pull/172
    • New :collision: :

      • TensorRt engines can now be built with int8 precision using Post Training Quantization.
      • 4 calibrators are available for quantization : MinMaxCalibrator, LegacyCalibrator, EntropyCalibrator and EntropyCalibrator2.
      • Added a QuantizedModel interface to convert model to quantized model for Training Aware Quantization.
    • Fixed :wrench: :

      • Adapt graph option is removed, we just adapt graph once it's exported from torch to ONNX.
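
As a rough illustration of what a min-max calibration does in post-training quantization, the sketch below computes a symmetric int8 scale from calibration values. It is a generic textbook sketch, not the TensorRT calibrator or the aloception exporter; minmax_scale and quantize are hypothetical helpers.

```python
def minmax_scale(calibration_values, num_bits=8):
    """Symmetric scale so that the largest absolute value maps to the int8 limit."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    amax = max(abs(v) for v in calibration_values)
    return amax / qmax if amax > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Round to the nearest integer step and clamp to the representable range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


# Values seen during calibration define the dynamic range.
calib = [-0.5, 0.25, 1.27, -1.0]
scale = minmax_scale(calib)
# Values outside the calibrated range (5.0 here) saturate.
q = quantize([1.27, -1.27, 0.0, 5.0], scale)
print(scale, q)
```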

    • add profiling verbosity by @Data-Iab in https://github.com/Visual-Behavior/aloception/pull/176

    New :star: :

    • profiling_verbosity option is added to the TRTEngineBuilder to better inspect the details of each node when calling the tensorrt.EngineInspector
    • Some quantization related arguments are added to the BaseTRTExporter.

    • random downscale and crop transform by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/184
    • RandomDownScale : transform to randomly downscale between original and a minimum frame size
    • RandomDownScaleCrop : a compose transform to randomly downscale then crop
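
The size sampling behind a random downscale can be sketched as follows. This is an assumption about the behaviour, not the alodataset transform: random_downscale_size is a hypothetical helper that draws a uniform scale factor and keeps the aspect ratio.

```python
import random


def random_downscale_size(orig_hw, min_hw, rng=random):
    """Pick a target (H, W) between min_hw and orig_hw, keeping the aspect ratio."""
    orig_h, orig_w = orig_hw
    min_h, min_w = min_hw
    # Smallest scale factor that keeps both sides at or above the minimum size.
    min_scale = min(1.0, max(min_h / orig_h, min_w / orig_w))
    scale = rng.uniform(min_scale, 1.0)
    new_h = min(orig_h, max(min_h, round(orig_h * scale)))
    new_w = min(orig_w, max(min_w, round(orig_w * scale)))
    return new_h, new_w


rng = random.Random(0)
sizes = [random_downscale_size((480, 640), (240, 320), rng) for _ in range(100)]
print(sizes[:3])
```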

    • add cuda shared memory for recurrent engines by @Data-Iab in https://github.com/Visual-Behavior/aloception/pull/186

    New :seedling:

    • Engine's inputs/outputs can share the same space in GPU for faster execution. Hosts with shared memory can be retrieved with outputs_to_cpu argument and can be updated using inputs_to_gpu argument.

    • Dynamic cropping by @anhtu293 in https://github.com/Visual-Behavior/aloception/pull/188

    Dynamic Cropping

    • Possibility to crop an image to a smaller, fixed-size image at a chosen position. The crop position is passed through the center argument, which can be float or int.
    • If crop is out of image border, an error is triggered.
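
The two rules above can be sketched with a small helper. crop_box is hypothetical and only computes the crop window; in particular, reading float centers as fractions of the image size and int centers as pixel coordinates is an assumption, not something stated in the release notes.

```python
def crop_box(image_hw, crop_hw, center):
    """Return (top, left, bottom, right) of a fixed-size crop around center."""
    img_h, img_w = image_hw
    crop_h, crop_w = crop_hw
    cy, cx = center
    # Assumption: floats are fractions of the image size, ints are pixel coordinates.
    if isinstance(cy, float):
        cy = round(cy * img_h)
    if isinstance(cx, float):
        cx = round(cx * img_w)
    top, left = cy - crop_h // 2, cx - crop_w // 2
    bottom, right = top + crop_h, left + crop_w
    if top < 0 or left < 0 or bottom > img_h or right > img_w:
        raise ValueError("crop window falls outside the image border")
    return top, left, bottom, right


print(crop_box((100, 200), (50, 50), (0.5, 0.5)))
```

Calling crop_box((100, 200), (50, 50), (10, 100)) raises a ValueError in this sketch, since the window would start above the image.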

    • new: depth metrics by @Data-Iab in https://github.com/Visual-Behavior/aloception/pull/195

    New :star: :

    • Depth evaluation metrics are added to alonet metrics.

    • Lvis Dataset + Coco Update + minor fix by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/196
    • CocoDetectionDataset can now use a given ann_file when loaded
    • CocoPanopticDataset can now use ignore_classes to ignore some classes when loading the panoptic anns
    • In DetrCriterion interpolation is an option that can be changed with upscale_interpolate
    • Lvis Dataset based on CocoDetectionDataset with a different ann file

    • allow user to completely specify the grid view by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/201
    # Create three gray frames and display them on two rows (2 on the first row, 1 on the second row)
    import numpy as np
    import aloscene
    arrays = [np.full((3, 600, 650), 100), np.full((3, 500, 500), 50), np.full((3, 500, 800), 200)]
    frames = [aloscene.Frame(arr) for arr in arrays]
    views = [[frames[0].get_view(), frames[1].get_view()], [frames[2].get_view()]]
    aloscene.render(views, renderer="matplotlib")
    

    • Scene flow by @Ardorax in https://github.com/Visual-Behavior/aloception/pull/204

    Create scene flow by calling the class with a file path, a tensor or a ndarray.

    If you have the optical flow, the depth at times T and T + 1, and the camera intrinsics, you can create the scene flow with the class method from_optical_flow. It handles the creation of the occlusion mask if some of the inputs have one.
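
The geometry behind such a conversion can be sketched for a single pixel. This is not the from_optical_flow implementation: backproject and scene_flow_at_pixel are hypothetical helpers, and the sketch assumes a pinhole camera, planar depth, and that depth_t1 has already been sampled at the displaced pixel.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """3D point (X, Y, Z) seen at pixel (u, v) with planar depth Z."""
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


def scene_flow_at_pixel(u, v, flow_uv, depth_t, depth_t1, intrinsics):
    """3D displacement of the point seen at (u, v) between T and T + 1."""
    fx, fy, cx, cy = intrinsics
    p0 = backproject(u, v, depth_t, fx, fy, cx, cy)
    # The optical flow tells where the same point is observed at T + 1.
    p1 = backproject(u + flow_uv[0], v + flow_uv[1], depth_t1, fx, fy, cx, cy)
    return tuple(b - a for a, b in zip(p0, p1))


# A point on the optical axis, 2 units away, moving 10 px to the right at constant depth.
sf = scene_flow_at_pixel(50, 50, (10.0, 0.0), 2.0, 2.0, (100.0, 100.0, 50.0, 50.0))
print(sf)
```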


    • GitHub actions by @Ardorax in https://github.com/Visual-Behavior/aloception/pull/205
    • GitHub Action that automatically launches the unit tests when there is a commit or pull request on the master branch

    • Scene flow in frame by @Ardorax in https://github.com/Visual-Behavior/aloception/pull/208

    New method 'append_scene_flow' in frame class.

    Fix

    • fix depth absolute/inverse assertion by @Data-Iab in https://github.com/Visual-Behavior/aloception/pull/167
    • Fixed some issues by @Dee61298 in https://github.com/Visual-Behavior/aloception/pull/171
    • better colorbar position by @anhtu293 in https://github.com/Visual-Behavior/aloception/pull/178
    • Check if depth is planar before projecting to 3d points by @anhtu293 in https://github.com/Visual-Behavior/aloception/pull/177
    • Merge dataset weights by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/175
    • update arg name by @anhtu293 in https://github.com/Visual-Behavior/aloception/pull/179
    • Fix package prod dependencies by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/181
    • remove tracing assertion by @Data-Iab in https://github.com/Visual-Behavior/aloception/pull/182
    • clip low values of depth before conversion to disp by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/180
    • Pass arguments to RandomScale and RandomCrop in ResizeCropTransform by @anhtu293 in https://github.com/Visual-Behavior/aloception/pull/189
    • add execution context failed creation exception by @Data-Iab in https://github.com/Visual-Behavior/aloception/pull/190
    • fix: AugmentedTensor clone method by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/191
    • bugfix: close plt figure by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/192
    • fix masking dimension mismatch by @Data-Iab in https://github.com/Visual-Behavior/aloception/pull/194
    • ignore same_on_sequence when no time dimension by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/200
    • RealisticNoise default values by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/199
    • allow for non integer principal point coordinates by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/202
    • check disp_format and clamp if necessary by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/203
    • GLOBAL_COLOR_SET_CLASS will automatically adjust its size to give a random color to a given object class

    New Contributors

    • @Dee61298 made their first contribution in https://github.com/Visual-Behavior/aloception/pull/171
    • @Ardorax made their first contribution in https://github.com/Visual-Behavior/aloception/pull/204

    Full Changelog: https://github.com/Visual-Behavior/aloception/compare/v0.2.1...v0.3.0

  • v0.2.1(Apr 15, 2022)

    What's Changed


    • fix tracing assertion by @Data-Iab in https://github.com/Visual-Behavior/aloception/pull/166

    Check if tracing attribute exists before checking if it's set to True.

    • camera calib by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/168

    Add new method for getting distance from one pose to another
    pose.distance_with(other_pos)
    

    Set default names to extrinsic to (None, None)


    • depth encode inverse by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/169
    • Make inverse False by default when creating Depth tensor.
    • scale and shift are not required. They're optional.

    Full Changelog: https://github.com/Visual-Behavior/aloception/compare/v0.2.0...v0.2.1

  • v0.2.0(Apr 12, 2022)

    What's Changed


    • Update README.md by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/156

    • Fix model none on BaseTRTExporter by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/158

    BaseTRTExporter can now be created from a None model. This is useful if you only want to export from an onnx file.


    • add depth absolute/relative encoding by @Data-Iab in https://github.com/Visual-Behavior/aloception/pull/159

    Two methods added to Depth:

    encode_inverse : invert depth with given scale and shift. encode_absolute : undo encode_inverse changes. One method added to AugTensor:

    to_squeezed_numpy: as its name indicates, converts the tensor to a squeezed NumPy array.
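The release notes do not give the formulas behind the two Depth methods. One plausible convention, shown purely as an assumption, is to take the reciprocal of the depth and then apply an affine normalization with `scale` and `shift`; the absolute encoding would then invert both steps. The functions below are standalone illustrations on plain floats, not the `Depth` methods themselves.

```python
def encode_inverse(depth, scale=1.0, shift=0.0):
    """Assumed convention: inverse depth, then affine normalization."""
    return (1.0 / depth - shift) / scale


def encode_absolute(inverse, scale=1.0, shift=0.0):
    """Undo the assumed encoding above and recover the original depth."""
    return 1.0 / (inverse * scale + shift)


depth = 4.0
encoded = encode_inverse(depth, scale=0.5, shift=0.05)
decoded = encode_absolute(encoded, scale=0.5, shift=0.05)
```

With the default `scale=1.0` and `shift=0.0`, the pair reduces to a plain reciprocal and its inverse.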


    • fixe project run_id default by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/161

    Fix: when load_training was used without loading the common argparse, no_run_id was still read in load_training. A default value is now used if the value is not set in the args.
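The fallback described here is the usual `getattr`-with-default pattern. In the sketch below, `resolve_no_run_id` is a hypothetical helper and the `False` default is an assumption; the notes do not say which default value the fix actually uses.

```python
import argparse


def resolve_no_run_id(args):
    # Fall back to a default when the common parser never registered the option.
    return getattr(args, "no_run_id", False)


bare_args = argparse.Namespace()                 # common argparse not loaded
full_args = argparse.Namespace(no_run_id=True)   # option present in the args
```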


    • fix: resize intrinsic matrix by @anhtu293 in https://github.com/Visual-Behavior/aloception/pull/164

    The width and height ratios were not exact in the resize method of the intrinsic matrix.
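For a pinhole camera, resizing an image from (H, W) to (H', W') scales the first row of the intrinsic matrix by W'/W and the second row by H'/H. A minimal sketch on a plain 3x3 nested list, ignoring half-pixel offset conventions (the `resize_intrinsic` helper is hypothetical and independent of aloscene):

```python
def resize_intrinsic(K, old_size, new_size):
    """Scale a 3x3 pinhole intrinsic matrix for an image resize.

    Sizes are (height, width) tuples.
    """
    ratio_h = new_size[0] / old_size[0]
    ratio_w = new_size[1] / old_size[1]
    return [
        [K[0][0] * ratio_w, K[0][1] * ratio_w, K[0][2] * ratio_w],  # fx, skew, cx
        [K[1][0] * ratio_h, K[1][1] * ratio_h, K[1][2] * ratio_h],  # 0, fy, cy
        [K[2][0], K[2][1], K[2][2]],                                # unchanged
    ]


K = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
K_small = resize_intrinsic(K, old_size=(720, 1280), new_size=(360, 320))
```

Using separate ratios per axis matters whenever the resize does not preserve the aspect ratio, as in this example.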


    • add batch(dim) and temporal(dim) by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/162

    In this merge request:

    It is now possible to do

    tensor.temporal(dim=1) # where dim can be 0 or 1
    

    and

    tensor.batch(dim=1) # where dim can be 0 or 1
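To illustrate what choosing `dim` means, the sketch below tracks only dimension names, assuming a frame starts with the names ("C", "H", "W") and that the batch and temporal axes are labeled "B" and "T". The `add_named_dim` helper is hypothetical and only mimics the bookkeeping on names, not on tensor data.

```python
def add_named_dim(names, name, dim=0):
    """Return a new tuple of dimension names with `name` inserted at `dim`."""
    if dim not in (0, 1):
        raise ValueError("dim can only be 0 or 1")
    if dim > len(names):
        raise ValueError("dim is out of range for these names")
    return names[:dim] + (name,) + names[dim:]


frame_names = ("C", "H", "W")
batched = add_named_dim(frame_names, "B", dim=0)
sequence = add_named_dim(batched, "T", dim=1)
```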
    

    TODO: Revisit the unit tests to check that everything is correct.


    • Noisy aug, pose update, depth abs, render, batch_list by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/165

    Add a mergeable pose label to the Frame object.

    It can be used as such

    P = aloscene.Pose(cam_pos)
    

    Pose directly inherits from CameraExtrinsic but usually refers to the global world coordinates.

    Fix noisy pos to propagate the normalization and to use the device properly.

    Add aloscene.render()

    You can now directly render a list of views using aloscene.render()

    aloscene.render(views)
    

    Here is an example that adds views and records a video

    import aloscene

    # `model` and `data_loader` are assumed to be defined beforehand
    views = []
    # Run DFM on side cameras

    for frames in data_loader:

        # Build a list of views, one per side camera
        for frames_side in frames:
            output = model.inference(model(frames_side))
            views.append(output.get_view())

        # Render the list and record it to a video file
        aloscene.render(views, record_file="model_outputs.mp4")

    # Save the final video
    aloscene.save_renderer()
    

    batch list from aloscene

    Instead of doing

    SpatialAugmentedTensor.batch_list(tensors)
    

    or

    tensors[0].batch_list(tensors)
    

    You can now do:

    aloscene.batch_list(tensors)
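Tensors of different spatial sizes cannot be stacked directly, so a batching helper typically pads each one to the largest size first. The sketch below shows that idea on 2D nested lists; it is an assumption about the general approach, not a description of how `aloscene.batch_list` is implemented, and `pad_and_batch` is a hypothetical name.

```python
def pad_and_batch(images, pad_value=0):
    """Pad 2D nested lists to a common (height, width) and return them as a batch."""
    max_h = max(len(img) for img in images)
    max_w = max(len(row) for img in images for row in img)
    batch = []
    for img in images:
        # Pad each row on the right, then add full rows at the bottom.
        padded = [list(row) + [pad_value] * (max_w - len(row)) for row in img]
        padded += [[pad_value] * max_w for _ in range(max_h - len(img))]
        batch.append(padded)
    return batch


small = [[1, 2], [3, 4]]
wide = [[5, 6, 7]]
batch = pad_and_batch([small, wide])
```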
    

    Compute the translation between two poses/extrinsics

    ref.pose.translation_with(src.pose)
    

    New Contributors

    • @Data-Iab made their first contribution in https://github.com/Visual-Behavior/aloception/pull/159

    Full Changelog: https://github.com/Visual-Behavior/aloception/compare/v0.1.0...v0.2.0

  • v0.1.0(Mar 18, 2022)

    What's Changed

    • add version by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/154

    • Trt export from onnx by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/155
    • Add a skip_adapt_graph option to skip adapting the graph before exporting to TensorRT.
    • Fix an issue when calling TRTExecutor() without an engine; it's now TRTExecutor(stream=cuda.Stream()).
    • Automatically adapt the graph by default: handle clip operations and simplify the ONNX graph. It is no longer mandatory to override this method. This change will not affect current TRT exporters, since the adapt_graph method is supposed to be overridden.
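The default-with-opt-out behavior in this release follows a common pattern: the base class ships a default `adapt_graph`, subclasses may override it, and a flag skips it entirely. The sketch below uses hypothetical class names and a list of strings in place of a real ONNX graph; it does not call alonet, ONNX, or TensorRT.

```python
class BaseExporterSketch:
    """Hypothetical exporter showing a default adapt step that can be skipped."""

    def __init__(self, skip_adapt_graph=False):
        self.skip_adapt_graph = skip_adapt_graph

    def adapt_graph(self, graph):
        # Default adaptation: a stand-in for handling clip ops and simplifying.
        return [node for node in graph if node != "Identity"]

    def export(self, graph):
        if not self.skip_adapt_graph:
            graph = self.adapt_graph(graph)
        return graph


class CustomExporterSketch(BaseExporterSketch):
    def adapt_graph(self, graph):
        # An override still takes precedence over the default behavior.
        return graph + ["CustomPlugin"]


graph = ["Conv", "Identity", "Clip"]
default_out = BaseExporterSketch().export(graph)
skipped_out = BaseExporterSketch(skip_adapt_graph=True).export(graph)
custom_out = CustomExporterSketch().export(graph)
```

Existing subclasses that already override `adapt_graph` keep their behavior, which matches the note that current exporters are unaffected.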

    Full Changelog: https://github.com/Visual-Behavior/aloception/compare/v0.0.1...v0.1.0

  • v0.0.1(Mar 10, 2022)

    What's Changed

    • Requirements vbfolder update by @LucBourrat1 in https://github.com/Visual-Behavior/aloception/pull/5
    • update README for docs by @LucBourrat1 in https://github.com/Visual-Behavior/aloception/pull/6
    • Prevent a known issue with camera parameters by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/40
    • use randomsampler in train_dataloader by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/37
    • #45 fix tensorboard logger by @ragier in https://github.com/Visual-Behavior/aloception/pull/46
    • 3 cocodetection mask implement by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/36
    • 1 update samples by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/41
    • 50 train with samples by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/58
    • Prevent crop outside of the spatial size by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/34
    • tuto getting started: modif typo + better frame slicing example by @LucBourrat1 in https://github.com/Visual-Behavior/aloception/pull/33
    • 35 doc getting started datasets by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/59
    • alotransform: probabilistic same_on_* by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/61
    • Readmes by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/63
    • Batch list improvment by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/62
    • Frame with labels by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/64
    • 47 training your model alonet by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/65
    • Run from run_ID by @LucBourrat1 in https://github.com/Visual-Behavior/aloception/pull/68
    • About augmented tensor by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/70
    • Detr def arch by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/69
    • sample download progress bar and skip user prompt by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/81
    • Frame api by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/82
    • update load weight and load train functions by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/80
    • remove the need to instantiate boxes by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/84
    • Coco panoptic dataset by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/77
    • Fixe weights is None for detr & Deformable by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/87
    • Points2D by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/72
    • sample download progress bar and skip user prompt by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/90
    • fix bug in load_training by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/91
    • 88 load weights finetune models by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/92
    • new colormaps and clipping option for disp view by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/95
    • 42 development panoptic module by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/73
    • fix bug load png image masks by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/99
    • Fix compute pq metric mask2id by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/101
    • 43 panoptic quality metrics by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/89
    • fix stric mode by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/102
    • 44 panoptic docs by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/94
    • Quick fix log mask and panoptic by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/105
    • Points2D & Boxes2D : Full pad support + new augmentations by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/93
    • temporal base metrics by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/110
    • 86 coco panoptic and masks doc by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/96
    • Try to import and raise error only if use with instruction by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/114
    • Augmented tensor: labels renamed by Child + default args when adding one node by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/111
    • Fix augmented tensor on resize with Tensor label + done 57 issue by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/83
    • Depth disp pts3d by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/118
    • Fixe crop with fit & absolute boses by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/117
    • 113 unknown weights by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/115
    • 119 depth on frame by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/120
    • 98 deformable panoptic head by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/123
    • Fix apply on child by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/127
    • Fix boxes display by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/124
    • Serving ready test by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/129
    • 79 coco detection splitmixin by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/122
    • Raft refacto by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/128
    • Serving ready by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/130
    • fix: resize camera calib by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/132
    • update threshold by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/133
    • Trt export profiling by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/134
    • include verbose in scope names by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/138
    • Compatibility with embed systems by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/143
    • fix append occlusion and title view by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/141
    • fix labels get_view error when frame is None by @anhtu293 in https://github.com/Visual-Behavior/aloception/pull/139
    • fix log_image for tensorboard logger by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/140
    • Fix crop on boxes and pts2d when using absolute position by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/136
    • Panoptic2trt by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/145
    • add_rotation by @LucBourrat1 in https://github.com/Visual-Behavior/aloception/pull/144
    • Panoptic2trt fix by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/147
    • Fix CameraIntrinsic last diag element by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/146
    • bugfix: make depth label mergeable by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/149
    • Depth.get_view(): fix normalization and add reverse cmap feature by @jsalotti in https://github.com/Visual-Behavior/aloception/pull/150
    • Deformable panoptic by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/148
    • 142: Load best model instead of last one by @Johansmm in https://github.com/Visual-Behavior/aloception/pull/151
    • Optional tensort by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/152
    • save method on renderer by @thibo73800 in https://github.com/Visual-Behavior/aloception/pull/153

    New Contributors

    • @LucBourrat1 made their first contribution in https://github.com/Visual-Behavior/aloception/pull/5
    • @thibo73800 made their first contribution in https://github.com/Visual-Behavior/aloception/pull/40
    • @jsalotti made their first contribution in https://github.com/Visual-Behavior/aloception/pull/37
    • @ragier made their first contribution in https://github.com/Visual-Behavior/aloception/pull/46
    • @Johansmm made their first contribution in https://github.com/Visual-Behavior/aloception/pull/36
    • @anhtu293 made their first contribution in https://github.com/Visual-Behavior/aloception/pull/139

    Full Changelog: https://github.com/Visual-Behavior/aloception/commits/v0.0.1

Owner
Visual Behavior
We are working on the future of robotics