MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble

Last update: Jan 07, 2023

Overview

datasketch: Big Data Looks Small

datasketch gives you probabilistic data structures that can process and search very large amount of data super fast, with little loss of accuracy.

This package contains the following data sketches:

Data Sketch	Usage
MinHash	estimate Jaccard similarity and cardinality
Weighted MinHash	estimate weighted Jaccard similarity
HyperLogLog	estimate cardinality
HyperLogLog++	estimate cardinality

The following indexes for data sketches are provided to support sub-linear query time:

Index	For Data Sketch	Supported Query Type
MinHash LSH	MinHash, Weighted MinHash	Jaccard Threshold
MinHash LSH Forest	MinHash, Weighted MinHash	Jaccard Top-K
MinHash LSH Ensemble	MinHash	Containment Threshold

datasketch must be used with Python 2.7 or above and NumPy 1.11 or above. Scipy is optional, but with it the LSH initialization can be much faster.

Note that MinHash LSH and MinHash LSH Ensemble also support Redis and Cassandra storage layer (see MinHash LSH at Scale).

Install

To install datasketch using pip:

pip install datasketch

This will also install NumPy as dependency.

To install with Redis dependency:

pip install datasketch[redis]

To install with Cassandra dependency:

pip install datasketch[cassandra]

To install with Scipy for faster MinHashLSH initialization:

pip install datasketch[scipy]

Comments

Redis backend for datasketch LSH
Summary

We have been using datasketch extensively in data-driven applications (primarily the MinHashLSH class). The existing implementation requires all data to be stored in memory, which is a severe limitation for very large datasets. We add support for a Redis backend. Default behavior remains in-memory storage.

Changes

Redis backend for MinHashLSH

Insertion session for bulk insertion (reduces network calls when inserting with a Redis backend)

Functionality to view LSH bucket sizes (useful for diagnosis/debugging)

Accompanying docs and tests

Examples

For examples, see the additions to the documentation.
opened by ae-foster 25
Add Cassandra storage layer

This is a storage layer that uses Cassandra as a backend. It is optimized to take into account Cassandra internals and its (intrinsic) limitations (see for example how all the keys are retrieved and how the list abstraction is implemented). I have also added a new batch of tests to both TestMinHashLSH and TestWeightedMinHashLSH. Since Cassandra is required (Cassandra is not mocked) they can be enabled by flipping the DO_TEST_CASSANDRA variable.

opened by ostefano 18
Implement #123 arbitrary Mongo URL & collection name

I have also updated the documentation a bit.

The purpose of the function AsyncIOMotorClient.get_default_database is to allow for the Mongo URL to define a database. If the URL contains a database, that database in the URL will be used, otherwise, the calculated database name is used. As for get_collection, I changed it for consistency only.

I wanted to update the unit tests, but they were skipped… and when I tried to un-skip them everything failed spetacularly. I'm not sure what to do about that.

Cheers!

opened by ekevoo 14
Easy computation of all duplicates?

Thanks for this library!

One question: Is there an easy way to get all pairs from a LSH? In other words, instead of a query for a single minhash, I need all pairs of records inside a same bucket.

I tried to dive into the code, but I didn't understand very well how hashranges/hashtables work. What I need is similar to candidate_duplicates of this other implementation: https://mattilyra.github.io/2017/05/23/document-deduplication-with-lsh.html

opened by fjsj 14
Performance improvements
I've made some performance improvements in the MinHash class and added a helper method for bulk computation.

Precasting _mersenne_prime and _max_hash, they don't need to be cast multiple times.

I've modified the update method to accept a list of bytes so hash values can be computed in batches. This avoids using for loops in python space and packs all those operations into the numpy calls once. Only caveat here is that the matrix operations use more memory and you should do chunking if you're dealing with huge sets.

The bulk helper removes unnecessary initialization overhead of the permutations and initial hashvalues. They are computed once then passed on to new MinHash objects as they are produced.

These changes give some pretty nice performance improvements, up to 3X for updates, and anywhere from 5-25X for bulk computing. See my benchmark gist here
opened by Sinusoidal36 10
AsyncMinhashLSH doesn't use index key with mongodb backend, and hence is very slow.

We'd struggled with AsyncMinhashLSH's query performance, eventually finding out it wasn't using the default collection index key ('_id'), and instead querying by it's 'key' field. (see: here )

This can either be fixed by inserting into the collections with primary key '_id', or else creating indices on the 'key' field for all collections in the database.

For anyone looking for the quickfix solution, simply run:

db.lsh_...._bucket_#.createIndex( { key: -1 } )

on each of the buckets and key collections in the database. We found performance went from 11 seconds/ query to >1ms/ query pretty much instantly.

I believe this should be added to the documentation somewhere, or fixed in the code (although it'd break existing databases). I'd volunteer to do either!

opened by oisincar 8
add ability to set storage basename

If you are connecting to a redis database and you lose the reference to the connection, you will lose reference to all data in your hash and you also lose the ability to remove it. By having the ability to set the base name, you can alway reference the same keys no matter if you create a new connection or use the old one.

opened by KevinMann13 8
how to restore minhashes from MinHashLSH with redis backend?

I what to find duplicates,so I need pop a hash then compare left hashes.

or should I make a copy of original data to redis? or just not use the redis storge wait the dataset feed up then use MinHashLSH

opened by bung87 8
Persistent storage on MongoDB

Hi guys,

Question, I'd like to store a MinHashLSH on a Mongo-db instance. I've seen you have Redis & Cassandra as storage-types in the code; is there an easy way for me to store in on mongo as well?

Greets Tb

opened by blah-crusader 7
Combining/Storing LSH with different thresholds

Hi,

First of all thanks for this amazing package. I have two LSH, one optimized with a threshold of 0.65 and the other with a threshold of 0.55. I found the results from both these LSH are relevant to me, and I would finally take a union of the results of both of these, and get my duplicate contents.

In order to store both LSH, I'm using Redis caching mechanism, but later I will have to call both the LSH, run individually and take the union of the results.

Is there any way I could optimize so that I could do the same with only one LSH if possible or some method to create a special type of indexes in Redis such that I don't have to call the LSH's Individually. This would help me save Memory and Computational Time.

opened by thakur-nandan 7
Question: Setting Expiry Time for Entries in Redis LSH?

Hi,

First of all thanks for this great library :)

I wanted to know if there is any way to set an expiry time for entries being added into an LSH connected to Redis? Or any way to keep a default expiry time for all keys?

Should I instead set a maxmemory policy for my Redis with an allkeys-lru or allkeys-random eviction policy? Is this recommended or will it cause issues with the LSH?

Regards, Spandan

opened by SpandanThakur 7
Support numpy>=1.20.0

numpy.int was only ever an alias for the built-in, and is now deprecated from version 1.20.0

See: https://numpy.org/doc/stable/release/1.20.0-notes.html#using-the-aliases-of-builtin-types-like-np-int-is-deprecated

opened by joehalliwell 0
Given a minHash instance or minHash LSH , How to find most unsimilar instance with an existed minHash LSH?

In API document , we can learn how to find approximate neighbours with Jaccard similarity more than threshold . On the other hand, how to find most unsimilar instance. Further, given a minHash LSH which initialized by thousands of plain text - set A, and given other thousands of plain text - set B, how to discover most unsimilar text of set B with set A , or how to discover text of set B which jaccard similarity less than threshold . Is there as possible as fast solution to deal with this problem ? Thanks !

opened by loulianzhang 2

Getting `pymongo.errors.ExecutionTimeout` for using more than 1 instance of AsyncMinHashLSH

I have a use case where i have to store min hash of (n) different categories of file and the query them. For example if i have documents of category A and I want to store all of them in one db and then query at later point.

I am initializing this in a microservice as (fastapi) as follows

@app.on_event("startup")
async def startup_event():
    LSHService.lsh_js, LSHService.lsh_css, LSHService.lsh_html, = await asyncio.gather(
        *[
            await AsyncMinHashLSH(
                storage_config={
                    **_storage,
                    "basename": "category-a".encode(),
                },
                threshold=0.9,
                num_perm=256,
            ),
            await AsyncMinHashLSH(
                storage_config={
                    **_storage,
                    "basename": "category-b".encode(),
                },
                threshold=0.9,
                num_perm=256,
            ),
            await AsyncMinHashLSH(
                storage_config={
                    **_storage,
                    "basename": "category-b".encode(),
                },
                threshold=0.9,
                num_perm=256,
            ),
        ]
    )

If i initialize like this i get
Getting pymongo.errors.ExecutionTimeout Error

However If i just maintain 1 object then I dont get any error.

What can I do mitigate this? Thanks in advance

opened by RonaldRegan69 3

Question: Why getting a min-hash for a single sample is possible?

Say if we have three documents A, B and C. Each document might contains different words.

According to the document of data sketch.MinHash, we can get a min-hash for A with

minxish = Minhash(num_perm=128) minhash.update(A.encode('utf-8')) vector = minhash.digest()

But isn't that we need to create a vocabulary consisting of all words from A, B and C before getting the vector?

opened by bdeng3 2

Releases(v1.5.8)

v1.5.8(Aug 21, 2022)
What's Changed

Add GitHub URL for PyPi by @andriyor in https://github.com/ekzhu/datasketch/pull/179

Support asyncio redis by @long2ice in https://github.com/ekzhu/datasketch/pull/185

Fix name construction for all values of b by @SenadI in https://github.com/ekzhu/datasketch/pull/190

New Contributors

@andriyor made their first contribution in https://github.com/ekzhu/datasketch/pull/179

@long2ice made their first contribution in https://github.com/ekzhu/datasketch/pull/185

@SenadI made their first contribution in https://github.com/ekzhu/datasketch/pull/190

Full Changelog: https://github.com/ekzhu/datasketch/compare/v1.5.7...v1.5.8
Source code(tar.gz)
Source code(zip)
v1.5.7(Feb 4, 2022)
What's Changed

Unable to create multiple lsh indices each one in its own keyspace - issue #171 by @ronassa in https://github.com/ekzhu/datasketch/pull/172

New Contributors

@ronassa made their first contribution in https://github.com/ekzhu/datasketch/pull/172

Full Changelog: https://github.com/ekzhu/datasketch/compare/v1.5.6...v1.5.7
Source code(tar.gz)
Source code(zip)
v1.5.6(Dec 27, 2021)

Fixed broken packaging setup.py that missed experimental/aio.
Source code(tar.gz)
Source code(zip)
v1.5.5(Dec 16, 2021)
What's Changed

Adding minhash_many to WeightedMinHashGenerator. by @jroose-jv in https://github.com/ekzhu/datasketch/pull/165

Add query buffer by @hguhlich in https://github.com/ekzhu/datasketch/pull/167

New Contributors

@jroose-jv made their first contribution in https://github.com/ekzhu/datasketch/pull/165

@hguhlich made their first contribution in https://github.com/ekzhu/datasketch/pull/167

Full Changelog: https://github.com/ekzhu/datasketch/compare/v1.5.4...v1.5.5
Source code(tar.gz)
Source code(zip)
v1.5.4(Dec 4, 2021)
What's Changed

Fixes #146; MinhashLSH creates mongo index key. by @oisincar in https://github.com/ekzhu/datasketch/pull/148

Add redis_buffer configuration. by @QthCN in https://github.com/ekzhu/datasketch/pull/152

minhash: Get rid of deprecation warning by @xkubov in https://github.com/ekzhu/datasketch/pull/156

New Contributors

@oisincar made their first contribution in https://github.com/ekzhu/datasketch/pull/148

@QthCN made their first contribution in https://github.com/ekzhu/datasketch/pull/152

@xkubov made their first contribution in https://github.com/ekzhu/datasketch/pull/156

Full Changelog: https://github.com/ekzhu/datasketch/compare/1.5.2...v1.5.4
Source code(tar.gz)
Source code(zip)
1.5.2(Dec 15, 2020)
Performance improvement for MinHash's update method.

Make MinHash updates 4.5X faster by using update_batch method for bulk update on MinHash. [See API doc].(http://ekzhu.com/datasketch/documentation.html#datasketch.MinHash.update_batch)

Further performance gain by using bulk generation of MinHash using MinHash.bulk or MinHash.generator. See API doc and pull request.

Optional compression for MinHash LSH index by hashing the bucket key produced by MinHashLSH._H. See pull request. This leads to saving of memory/storage space used by the index.

Thank you @Sinusoidal36!
Source code(tar.gz)
Source code(zip)

v1.5.0(Nov 26, 2019)

Minor bug fixes
Cassandra storage layer, thank @ostefano! Now you can specify the Cassandra config just like the Redis one.

from datasketch import MinHashLSH

lsh = MinHashLSH(
    threashold=0.5, num_perm=128, storage_config={
        'type': 'cassandra',
        'cassandra': {
            'seeds': ['127.0.0.1'],
            'keyspace': 'lsh_test',
            'replication': {
                'class': 'SimpleStrategy',
                'replication_factor': '1',
            },
            'drop_keyspace': False,
            'drop_tables': False,
        }
    }
)

Source code(tar.gz)
Source code(zip)

v1.4.0(Jan 6, 2019)

Now support hashfunc parameter for MinHash and HyperLogLog. The old parameter hashobj is removed.

# Let's use MurmurHash3.
import mmh3

# We need to define a new hash function that outputs an integer that
# can be encoded in 32 bits.
def _hash_func(d):
    return mmh3.hash32(d)

# Use this function in MinHash constructor.
m = MinHash(hashfunc=_hash_func)

Source code(tar.gz)
Source code(zip)

v.1.3.0(Dec 27, 2018)

Use dynamic programming to create optimal partition, allow LSH Ensemble index to adapt to any set size distribution.
Source code(tar.gz)
Source code(zip)
v1.2.10(Nov 2, 2018)
Adding batch removal functionality for Async MinHashLSH

Because Redis does not support async operation, removed Redis support from Async MinHashLSH

For details see Pull #70 Thanks @aastafiev for the contribution.
Source code(tar.gz)
Source code(zip)
v1.2.9(Oct 22, 2018)

Add support for MongoDB replica set
Source code(tar.gz)
Source code(zip)
v1.2.8(Oct 17, 2018)

Source code(tar.gz)
Source code(zip)
v1.2.7(Jul 27, 2018)
Added Asynchronous MinHash LSH module. Thanks @aastafiev!

Added ability to set the base name in storage config. Base name is used as the prefix for generating keys in the underlying storage (e.g., Redis). This change allows client to "reconnect" to an existing LSH index in the storage through its base name.

Source code(tar.gz)
Source code(zip)
v1.2.6(Jul 21, 2018)

Fix a bug with UnorderedStorage.get_many (#56)
Source code(tar.gz)
Source code(zip)
v1.2.5(Nov 21, 2017)
Fix issue #35

Test cases for checking consistency of hash value length in LSH.

Source code(tar.gz)
Source code(zip)
v1.2.4(Nov 8, 2017)

Thanks @vmarkovtsev
Source code(tar.gz)
Source code(zip)
v1.2.3(Sep 8, 2017)
Introduced a Redis storage layer for MinHash LSH. Thanks to @ae-foster

Added __hash__ method for Lean MinHash.

Source code(tar.gz)
Source code(zip)
v1.2.0(Mar 31, 2017)
Added a slightly simplified version of LSH Ensemble that supports containment search with MinHash data sketches.

An introduction on containment link.

Update documentations

Source code(tar.gz)
Source code(zip)
v1.1.3(Mar 26, 2017)

MinHash now uses Numpy's random number generator instead of Python's built-in random. This makes MinHash generate consistent hash values across different Python versions.

The side-effect is that now MinHash created before version 1.1.3 won’t work (i.e., jaccard, merge and union) correctly with those created after.
Source code(tar.gz)
Source code(zip)
v1.1.2(Mar 15, 2017)
LeanMinHash is a subclass of MinHash. It uses less memory and allows faster (de)serialization. See documentation for details.

Removed serialize, deserialize, and bytesize methods from MinHash. These are supported in LeanMinHash instead.

Serialized MinHash objects before this version will not be deserialized properly. To migrate see here.

Documentation now have its own website!

Source code(tar.gz)
Source code(zip)
v1.0.0(Feb 12, 2017)

After nearly 2 years working on this project on-and-off, the API is now stable, and the features of MinHash-related sketches are completed.

I will continue to add more data sketches and indexes.
Source code(tar.gz)
Source code(zip)
v0.2.6(Jan 7, 2017)
MinHash LSH Forest implementation and benchmark using synthetic data

Improve existing MinHash LSH benchmark using synthetic data for more tunable data distributions

Improve MinHash and LSH performance

Source code(tar.gz)
Source code(zip)
v0.2.4(Jul 10, 2016)
Fixed Issue #4 - int overflow error on Windows platform

Use Python build-in random number generator for better MinHash accuracy

Source code(tar.gz)
Source code(zip)
v0.2.3(Apr 12, 2016)
Add remove method for LSH index - lsh.remove(key)

Add membership check for LSH - key in lsh

Source code(tar.gz)
Source code(zip)
datasketch-0.2.3-py2.py3-none-any.whl(43.13 KB)
v0.2.1(Apr 12, 2016)
Add Weighted MinHash data sketch

Add Weighted MinHash LSH index

Performance and accuracy benchmark for Weighted MinHash

Rename digest to update for MinHash and HyperLogLog, and use bytes as input argument.

Make hashobj customizable through data sketch constructors

Add new methods for data sketches

Bug fixes

Source code(tar.gz)
Source code(zip)
datasketch-0.2.1-py2.py3-none-any.whl(42.96 KB)

Owner

Eric Zhu

GitHub Repository https://ekzhu.github.io/datasketch

🌎 The Modern Declarative Data Flow Framework for the AI Empowered Generation.

🌎 JSONClasses JSONClasses is a declarative data flow pipeline and data graph framework. Official Website: https://www.jsonclasses.com Official Docume

53 Dec 09, 2022

This is an implementation of Googles Yogi-Optimizer in Keras (tf.keras)

Yogi-Optimizer_Keras This is an implementation of Googles Yogi-Optimizer in Keras (tf.keras) The NeurIPS-Paper can be found here: http://papers.nips.c

14 Sep 13, 2022

Info and sample codes for "NTU RGB+D Action Recognition Dataset"

"NTU RGB+D" Action Recognition Dataset "NTU RGB+D 120" Action Recognition Dataset "NTU RGB+D" is a large-scale dataset for human action recognition. I

578 Dec 30, 2022

you can add any codes in any language by creating its respective folder (if already not available).

HACKTOBERFEST-2021-WEB-DEV Beginner-Hacktoberfest Need Your first pr for hacktoberfest 2k21 ? come on in About This is repository of Responsive Portfo

8 Oct 17, 2022

Metrics to evaluate quality and efficacy of synthetic datasets.

An Open Source Project from the Data to AI Lab, at MIT Metrics for Synthetic Data Generation Projects Website: https://sdv.dev Documentation: https://

129 Jan 03, 2023

Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion"

MKGFormer Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion" Model Architecture Illu

68 Dec 28, 2022

MASA-SR: Matching Acceleration and Spatial Adaptation for Reference-Based Image Super-Resolution (CVPR2021)

MASA-SR Official PyTorch implementation of our CVPR2021 paper MASA-SR: Matching Acceleration and Spatial Adaptation for Reference-Based Image Super-Re

126 Dec 20, 2022

Code for the paper "A Study of Face Obfuscation in ImageNet"

A Study of Face Obfuscation in ImageNet Code for the paper: A Study of Face Obfuscation in ImageNet Kaiyu Yang, Jacqueline Yau, Li Fei-Fei, Jia Deng,

35 Oct 04, 2022

Txt2Xml tool will help you convert from txt COCO format to VOC xml format in Object Detection Problem.

TXT 2 XML All codes assume running from root directory. Please update the sys path at the beginning of the codes before running. Over View Txt2Xml too

4 Nov 24, 2022

This is an official repository of CLGo: Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints

CLGo This is an official repository of CLGo: Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints An earlier

32 Dec 20, 2022

[NeurIPS 2021] Code for Unsupervised Learning of Compositional Energy Concepts

Unsupervised Learning of Compositional Energy Concepts This is the pytorch code for the paper Unsupervised Learning of Compositional Energy Concepts.

45 Nov 30, 2022

In this project we predict the forest cover type using the cartographic variables in the training/test datasets.

Kaggle Competition: Forest Cover Type Prediction In this project we predict the forest cover type (the predominant kind of tree cover) using the carto

1 Mar 15, 2022

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion Yinghao Aaron Li, Ali Zare, Nima Mesgarani We pres

282 Jan 01, 2023

Official PyTorch implementation of Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations

Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations Zhenyu Jiang, Yifeng Zhu, Maxwell Svetlik, Kuan Fang, Yu

63 Jan 03, 2023

⚡ Fast • 🪶 Lightweight • 0️⃣ Dependency • 🔌 Pluggable • 😈 TLS interception • 🔒 DNS-over-HTTPS • 🔥 Poor Man's VPN • ⏪ Reverse & ⏩ Forward • 👮🏿 "Proxy Server" framework • 🌐 "Web Server" framework • ➵ ➶ ➷ ➠ "PubSub" framework • 👷 "Work" acceptor & executor framework

Table of Contents Features Install Using PIP Stable version Development version Using Docker Stable version Development version Using HomeBrew Stable

2.2k Jan 08, 2023

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble

Related tags

Overview

datasketch: Big Data Looks Small

Install

Comments

Summary

Changes

Examples

Releases(v1.5.8)

v1.5.8(Aug 21, 2022)

What's Changed

New Contributors

v1.5.7(Feb 4, 2022)

What's Changed

New Contributors

v1.5.6(Dec 27, 2021)

v1.5.5(Dec 16, 2021)

What's Changed

New Contributors

v1.5.4(Dec 4, 2021)

What's Changed

New Contributors

1.5.2(Dec 15, 2020)

v1.5.0(Nov 26, 2019)

v1.4.0(Jan 6, 2019)

v.1.3.0(Dec 27, 2018)

v1.2.10(Nov 2, 2018)

v1.2.9(Oct 22, 2018)

v1.2.8(Oct 17, 2018)

v1.2.7(Jul 27, 2018)

v1.2.6(Jul 21, 2018)

v1.2.5(Nov 21, 2017)

v1.2.4(Nov 8, 2017)

v1.2.3(Sep 8, 2017)

v1.2.0(Mar 31, 2017)

v1.1.3(Mar 26, 2017)

v1.1.2(Mar 15, 2017)

v1.0.0(Feb 12, 2017)

v0.2.6(Jan 7, 2017)

v0.2.4(Jul 10, 2016)

v0.2.3(Apr 12, 2016)

v0.2.1(Apr 12, 2016)

Owner

Eric Zhu

🌎 The Modern Declarative Data Flow Framework for the AI Empowered Generation.

This is an implementation of Googles Yogi-Optimizer in Keras (tf.keras)

Info and sample codes for "NTU RGB+D Action Recognition Dataset"

you can add any codes in any language by creating its respective folder (if already not available).

Metrics to evaluate quality and efficacy of synthetic datasets.

Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion"

MASA-SR: Matching Acceleration and Spatial Adaptation for Reference-Based Image Super-Resolution (CVPR2021)

Code for the paper "A Study of Face Obfuscation in ImageNet"

Txt2Xml tool will help you convert from txt COCO format to VOC xml format in Object Detection Problem.

This is an official repository of CLGo: Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints

[NeurIPS 2021] Code for Unsupervised Learning of Compositional Energy Concepts

In this project we predict the forest cover type using the cartographic variables in the training/test datasets.

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

Official PyTorch implementation of Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations

Pytorch implementation of the paper "Class-Balanced Loss Based on Effective Number of Samples"

PyTorch implementation of MulMON

Semi-supervised Video Deraining with Dynamical Rain Generator (CVPR, 2021, Pytorch)

code for `Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation`

Official code for "Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer. ICCV2021".