3D Vision functions with end-to-end support for deep learning developers, written in Ivy.

Overview

What is Ivy Vision?

Ivy vision focuses predominantly on 3D vision, with functions for camera geometry, image projections, co-ordinate frame transformations, forward warping, inverse warping, optical flow, depth triangulation, voxel grids, point clouds, signed distance functions, and others. Check out the docs for more info!

The library is built on top of the Ivy deep learning framework. This means all functions simultaneously support JAX, TensorFlow, PyTorch, MXNet, and NumPy.
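
To illustrate, the same ivy_vision function can be dispatched to a different backend simply by passing a different framework handle f, as every snippet in the run through below does. A minimal sketch, assuming the PyTorch and NumPy backends are installed (depending on the Ivy version, the framework submodules may need importing explicitly, e.g. import ivy.torch):

import ivy
import ivy_vision

# identical call, two different backend frameworks
px_torch = ivy_vision.create_uniform_pixel_coords_image([4, 4], f=ivy.torch)
px_np = ivy_vision.create_uniform_pixel_coords_image([4, 4], f=ivy.numpy)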

A Family of Libraries

Ivy vision is one library in a family of Ivy libraries. There are also Ivy libraries for mechanics, robotics, differentiable memory, and differentiable gym environments. Click on the icons below for their respective GitHub pages.


Quick Start

Ivy vision can be installed like so: pip install ivy-vision

To quickly see the different aspects of the library, we suggest you check out the demos! Start by running the script run_through.py, and read the "Run Through" section below, which explains this script.

For more interactive demos, we suggest you run either coords_to_voxel_grid.py or render_image.py in the interactive demos folder.

Run Through

We run through some of the different parts of the library via a simple ongoing example script. The full script is available in the demos folder, as the file run_through.py. First, we select a random backend framework f to use for the examples, from the options ivy.jax, ivy.tensorflow, ivy.torch, ivy.mxnd, or ivy.numpy.

import ivy
from ivy_demo_utils.framework_utils import choose_random_framework

f = choose_random_framework()

Camera Geometry

To get to grips with some of the basics, we next show how to construct ivy containers which represent camera geometry. The camera intrinsic matrix, extrinsic matrix, full matrix, and all of their inverses are central to most of the functions in this library.

All of these matrices are contained within the Ivy camera geometry class.

import numpy as np
import ivy_vision

# intrinsics

# shared intrinsic params
img_dims = [512, 512]
pp_offsets = f.array([dim / 2 - 0.5 for dim in img_dims], 'float32')
cam_persp_angles = f.array([60 * np.pi / 180] * 2, 'float32')

# ivy cam intrinsics container
intrinsics = ivy_vision.persp_angles_and_pp_offsets_to_intrinsics_object(
    cam_persp_angles, pp_offsets, img_dims)

# extrinsics

# 3 x 4
cam1_inv_ext_mat = f.array(np.load(data_dir + '/cam1_inv_ext_mat.npy'), 'float32')
cam2_inv_ext_mat = f.array(np.load(data_dir + '/cam2_inv_ext_mat.npy'), 'float32')

# full geometry

# ivy cam geometry container
cam1_geom = ivy_vision.inv_ext_mat_and_intrinsics_to_cam_geometry_object(
    cam1_inv_ext_mat, intrinsics)
cam2_geom = ivy_vision.inv_ext_mat_and_intrinsics_to_cam_geometry_object(
    cam2_inv_ext_mat, intrinsics)
cam_geoms = [cam1_geom, cam2_geom]
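
For reference, the full projection matrix stored in these containers composes the calibration matrix with the inverse extrinsic matrix, following the standard pinhole camera model. Below is a minimal NumPy sketch of the relation (illustrative only; the constructors above compute this internally):

import numpy as np

# K: 3 x 3 calibration matrix, inv_ext_mat: 3 x 4 world-to-camera matrix
K = np.eye(3, dtype='float32')
inv_ext_mat = np.eye(4, dtype='float32')[0:3, :]

# full matrix (3 x 4): maps world co-ordinates to depth-scaled pixel co-ordinates
full_mat = np.matmul(K, inv_ext_mat)

# the homogeneous 4 x 4 variants simply append the row [0, 0, 0, 1]
full_mat_homo = np.concatenate(
    (full_mat, np.array([[0., 0., 0., 1.]], 'float32')), 0)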

The geometries used in this quick start demo are based upon the scene presented below.

The code sample below demonstrates all of the attributes contained within the Ivy camera geometry class.

for cam_geom in cam_geoms:

    assert cam_geom.intrinsics.focal_lengths.shape == (2,)
    assert cam_geom.intrinsics.persp_angles.shape == (2,)
    assert cam_geom.intrinsics.pp_offsets.shape == (2,)
    assert cam_geom.intrinsics.calib_mats.shape == (3, 3)
    assert cam_geom.intrinsics.inv_calib_mats.shape == (3, 3)

    assert cam_geom.extrinsics.cam_centers.shape == (3, 1)
    assert cam_geom.extrinsics.Rs.shape == (3, 3)
    assert cam_geom.extrinsics.inv_Rs.shape == (3, 3)
    assert cam_geom.extrinsics.ext_mats_homo.shape == (4, 4)
    assert cam_geom.extrinsics.inv_ext_mats_homo.shape == (4, 4)

    assert cam_geom.full_mats_homo.shape == (4, 4)
    assert cam_geom.inv_full_mats_homo.shape == (4, 4)

Load Images

We next load the color and depth images corresponding to the two camera frames. We also construct the depth-scaled homogeneous pixel co-ordinates for each image, which is a central representation for the ivy_vision functions. This representation simplifies projections between frames.

import cv2

# loading

# h x w x 3
color1 = f.array(cv2.imread(data_dir + '/rgb1.png').astype(np.float32) / 255)
color2 = f.array(cv2.imread(data_dir + '/rgb2.png').astype(np.float32) / 255)

# h x w x 1
depth1 = f.array(np.reshape(np.frombuffer(cv2.imread(
    data_dir + '/depth1.png', -1).tobytes(), np.float32), img_dims + [1]))
depth2 = f.array(np.reshape(np.frombuffer(cv2.imread(
    data_dir + '/depth2.png', -1).tobytes(), np.float32), img_dims + [1]))

# pixel coords

# h x w x 3
u_pix_coords = ivy_vision.create_uniform_pixel_coords_image(img_dims, f=f)
pixel_coords1 = u_pix_coords * depth1
pixel_coords2 = u_pix_coords * depth2
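
For a pixel at integer location (u, v) with depth d, this representation is simply [u*d, v*d, d]. Below is a minimal NumPy sketch of what the two steps above compute, assuming create_uniform_pixel_coords_image returns the homogeneous co-ordinates [u, v, 1] at every pixel location:

import numpy as np

h, w = 512, 512

# homogeneous pixel co-ordinates [u, v, 1], shape h x w x 3
us = np.tile(np.arange(w, dtype='float32')[None, :, None], (h, 1, 1))
vs = np.tile(np.arange(h, dtype='float32')[:, None, None], (1, w, 1))
u_pix_coords_np = np.concatenate((us, vs, np.ones((h, w, 1), 'float32')), -1)

# multiplying by depth gives the depth-scaled representation [u*d, v*d, d]
depth_np = np.ones((h, w, 1), 'float32')  # stand-in for a real depth image
pixel_coords_np = u_pix_coords_np * depth_np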

The RGB and depth images are presented below.

Optical Flow and Depth Triangulation

Now that we have two cameras, their geometries, and their images fully defined, we can start to apply some of the more interesting vision functions. We start with some optical flow and depth triangulation functions.

# required mat formats
cam1to2_full_mat_homo = f.matmul(cam2_geom.full_mats_homo, cam1_geom.inv_full_mats_homo)
cam1to2_full_mat = cam1to2_full_mat_homo[..., 0:3, :]
full_mats_homo = f.concatenate((f.expand_dims(cam1_geom.full_mats_homo, 0),
                                f.expand_dims(cam2_geom.full_mats_homo, 0)), 0)
full_mats = full_mats_homo[..., 0:3, :]

# flow
flow1to2 = ivy_vision.flow_from_depth_and_cam_mats(pixel_coords1, cam1to2_full_mat)

# depth again
depth1_from_flow = ivy_vision.depth_from_flow_and_cam_mats(flow1to2, full_mats)

Visualizations of these images are given below.

Inverse and Forward Warping

Most of the vision functions, including the flow and depth functions above, make use of image projections, whereby an image of depth-scaled homogeneous pixel co-ordinates is transformed into cartesian co-ordinates relative to the acquiring camera, the world, or another camera, or transformed directly to pixel co-ordinates in another camera frame. These projections also allow warping of the color values from one camera to another.
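
In this representation, a frame-to-frame projection reduces to a single matrix multiplication in homogeneous co-ordinates. Here is a rough sketch of what a pixel-to-pixel projection such as ivy_vision.pixel_to_pixel_coords computes (the names and internals are illustrative, not the library's actual implementation):

import numpy as np

def pixel_to_pixel_coords_sketch(pixel_coords, cam1to2_full_mat):
    # pixel_coords: h x w x 3 depth-scaled homogeneous co-ords in frame 1
    # cam1to2_full_mat: 3 x 4 matrix mapping frame-1 pixels to frame-2 pixels
    h, w = pixel_coords.shape[0:2]
    homo = np.concatenate(
        (pixel_coords, np.ones((h, w, 1), pixel_coords.dtype)), -1)
    # batched matrix-vector product, giving h x w x 3 co-ords in frame 2
    return np.einsum('ij,hwj->hwi', cam1to2_full_mat, homo)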

For inverse warping, we assume depth to be known for the target frame. We can then determine the pixel projections into the source frame, and bilinearly interpolate these color values at the pixel projections, to infer the color image in the target frame.
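
The bilinear interpolation itself can be sketched in a few lines of NumPy (the demo below uses ivy.bilinear_resample; this standalone version, with hypothetical names, just illustrates the principle of blending the four neighbouring pixels):

import numpy as np

def bilinear_sample_sketch(image, warp):
    # image: h x w x c source image, warp: h x w x 2 sub-pixel (x, y) locations
    h, w = image.shape[0:2]
    x, y = warp[..., 0], warp[..., 1]
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    dx, dy = (x - x0)[..., None], (y - y0)[..., None]
    # weighted blend of the four neighbours of each sample location
    return (image[y0, x0] * (1 - dx) * (1 - dy) +
            image[y0, x0 + 1] * dx * (1 - dy) +
            image[y0 + 1, x0] * (1 - dx) * dy +
            image[y0 + 1, x0 + 1] * dx * dy)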

Treating frame 1 as our target frame, we can use the previously calculated optical flow from frame 1 to 2, in order to inverse warp the color data from frame 2 to frame 1, as shown below.

# inverse warp rendering
warp = u_pix_coords[..., 0:2] + flow1to2
color2_warp_to_f1 = ivy.bilinear_resample(color2, warp)

# projected pixel coords 2
pixel_coords1_wrt_f2 = ivy_vision.pixel_to_pixel_coords(pixel_coords1, cam1to2_full_mat)

# projected depth 2
depth1_wrt_f2 = pixel_coords1_wrt_f2[..., -1:]

# inverse warp depth
depth2_warp_to_f1 = ivy.bilinear_resample(depth2, warp)

# depth validity
depth_validity = f.abs(depth1_wrt_f2 - depth2_warp_to_f1) < 0.01

# inverse warp rendering with mask
color2_warp_to_f1_masked = f.where(depth_validity, color2_warp_to_f1, f.zeros_like(color2_warp_to_f1))

Again, visualizations of these images are given below. The images represent intermediate steps for the inverse warping of color from frame 2 to frame 1, which is shown in the bottom right corner.

For forward warping, we instead assume depth to be known in the source frame. A common approach is to construct a mesh, and then perform rasterization of the mesh.

The Ivy method ivy_vision.render_pixel_coords instead takes a simpler approach: it determines the pixel projections into the target frame, quantizes these to integer pixel co-ordinates, and scatters the corresponding color values directly into those integer locations.

This process in general leads to holes and duplicates in the resultant image, but when compared to inverse warping, it has the benefit that the target frame does not need to correspond to a real camera with known depth. Only the target camera geometry is required, which can be for any hypothetical camera.
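
A minimal sketch of this quantize-and-scatter idea, using a simple far-to-near scatter in place of a true depth buffer (the names are hypothetical; ivy_vision.render_pixel_coords handles the general batched case):

import numpy as np

def scatter_render_sketch(pixel_coords_proj, colors, img_dims):
    # pixel_coords_proj: n x 3 depth-scaled homogeneous co-ords in the target frame
    # colors: n x 3 color values to scatter alongside them
    depth = pixel_coords_proj[:, 2:3]
    uv = np.round(pixel_coords_proj[:, 0:2] / depth).astype(int)
    valid = ((uv[:, 0] >= 0) & (uv[:, 0] < img_dims[1]) &
             (uv[:, 1] >= 0) & (uv[:, 1] < img_dims[0]) & (depth[:, 0] > 0))
    uv, colors, depth = uv[valid], colors[valid], depth[valid]
    # scatter from furthest to nearest, so the nearest point wins each pixel
    order = np.argsort(-depth[:, 0])
    image = np.zeros(list(img_dims) + [3], 'float32')
    image[uv[order, 1], uv[order, 0]] = colors[order]
    return image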

We now consider the case of forward warping the color data from camera frame 2 to camera frame 1, and again render the new color image in target frame 1.

# forward warp rendering
pixel_coords1_proj = ivy_vision.pixel_to_pixel_coords(pixel_coords2,
                                                      f.inv(cam1to2_full_mat_homo)[..., 0:3, :])
pix_coords_w_color_in_f1 = f.concatenate((pixel_coords1_proj, color2), -1)

# without depth buffer
f1_forward_warp_no_db, _, _ = ivy_vision.render_pixel_coords(
    f.reshape(pix_coords_w_color_in_f1, (-1, 6)), f.zeros_like(pix_coords_w_color_in_f1[..., 2:]),
    img_dims, with_db=False)

# with depth buffer (disabled for the MXNet backend via the conditional below)
f1_forward_warp_w_db, _, _ = ivy_vision.render_pixel_coords(
    f.reshape(pix_coords_w_color_in_f1, (-1, 6)), f.zeros_like(pix_coords_w_color_in_f1[..., 2:]),
    img_dims, with_db=False if f is ivy.mxnd else True)

Again, visualizations of these images are given below. The images show the forward warping of both depth and color from frame 2 to frame 1, which are shown with and without depth buffers in the right-hand and central columns respectively.

Interactive Demos

In addition to the examples above, we provide two further demo scripts, which are more visual and interactive, and are each built around a particular function.

Rather than presenting the code here, we show visualizations of the demos. The scripts for these demos can be found in the interactive demos folder.

Co-ordinates to Voxel Grid

The first demo captures depth and color images from a set of cameras, converts the depth to world-centric co-ordinates, and uses the method ivy_vision.coords_to_voxel_grid to voxelize the depth and color values into a grid, as shown below.
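
The core of this conversion can be sketched as quantizing world co-ordinates into integer voxel indices and scattering features into the grid (an illustrative sketch with hypothetical names; the library method is considerably more general):

import numpy as np

def coords_to_voxel_grid_sketch(world_coords, features, grid_res, grid_min, grid_max):
    # world_coords: n x 3 points, features: n x c values (e.g. colors)
    g_min = np.asarray(grid_min, 'float32')
    g_max = np.asarray(grid_max, 'float32')
    res = np.asarray(grid_res)
    voxel_size = (g_max - g_min) / res
    # quantize each point into an integer voxel index, clipped to the grid
    idx = np.clip(np.floor((world_coords - g_min) / voxel_size).astype(int), 0, res - 1)
    grid = np.zeros(list(grid_res) + [features.shape[-1]], 'float32')
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = features  # last write per voxel wins
    return grid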

Image Rendering

The second demo again captures depth and color images from a set of cameras, but this time uses the method ivy_vision.render_pixel_coords to dynamically forward warp and render the images into a new target frame, as shown below. The acquiring cameras all remain static, while the target frame for rendering moves freely.

Get Involved

We hope the functions in this library are useful to a wide range of deep learning developers. However, there are many more areas of 3D vision which could be covered by this library.

If there are any particular vision functions you feel are missing, and your needs are not met by the functions currently on offer, then we are very happy to accept pull requests!

We look forward to working with the community on expanding and improving the Ivy vision library.

Citation

@article{lenton2021ivy,
  title={Ivy: Templated Deep Learning for Inter-Framework Portability},
  author={Lenton, Daniel and Pardo, Fabio and Falck, Fabian and James, Stephen and Clark, Ronald},
  journal={arXiv preprint arXiv:2102.02886},
  year={2021}
}