A library for efficient similarity search and clustering of dense vectors.


Faiss

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research.

News

See CHANGELOG.md for detailed information about latest features.

Introduction

Faiss contains several methods for similarity search. It assumes that the instances are represented as vectors and are identified by an integer, and that the vectors can be compared with L2 (Euclidean) distances or dot products. Vectors that are similar to a query vector are those that have the lowest L2 distance or the highest dot product with the query vector. It also supports cosine similarity, since this is a dot product on normalized vectors.
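The three comparison modes above can be illustrated with a minimal pure-Python sketch. This is not Faiss code: the helper names (`l2_sq`, `dot`, `normalize`, `search`) and the toy data are assumptions made for illustration, and the search is a plain exhaustive scan. It shows that the lowest L2 distance, the highest dot product, and cosine similarity (a dot product on normalized vectors) can each rank the same database differently.

```python
import math

def l2_sq(a, b):
    """Squared Euclidean (L2) distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dot(a, b):
    """Dot (inner) product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def search(database, query, k, metric="l2"):
    """Exhaustive search: return (ids, scores) of the k best matches."""
    if metric == "l2":
        # Lower distance is better.
        scored = sorted((l2_sq(v, query), i) for i, v in enumerate(database))
    else:
        # Inner product: higher is better; ties are broken by the smaller id.
        scored = sorted(
            ((dot(v, query), i) for i, v in enumerate(database)),
            key=lambda t: (-t[0], t[1]),
        )
    top = scored[:k]
    return [i for _, i in top], [s for s, _ in top]

# Vectors are identified by their integer position in the list.
database = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [-1.0, 0.5]]
query = [1.0, 1.0]

l2_ids, l2_scores = search(database, query, k=2, metric="l2")
ip_ids, ip_scores = search(database, query, k=2, metric="ip")

# Cosine similarity: normalize both sides, then use the inner product.
unit_db = [normalize(v) for v in database]
cos_ids, cos_scores = search(unit_db, normalize(query), k=2, metric="ip")

print("L2:", l2_ids, "dot product:", ip_ids, "cosine:", cos_ids)
```

In Faiss itself the same idea is usually expressed with a flat inner-product index on vectors that were normalized beforehand; the sketch only mirrors the arithmetic.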

Most of the methods, like those based on binary vectors and compact quantization codes, use only a compressed representation of the vectors and do not require keeping the original vectors. This generally comes at the cost of a less precise search, but these methods can scale to billions of vectors in main memory on a single server.
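To make the compression trade-off concrete, here is a small sketch of one very simple compressed representation, an 8-bit uniform scalar quantizer written in pure Python. It is an illustrative stand-in, not the quantizers Faiss implements: the function names (`train_range`, `encode`, `decode`) and the random data are assumptions. Each dimension is stored in one byte, the original floats are discarded, and the reconstruction is only approximate.

```python
import random

def train_range(vectors):
    """Per-dimension minimum and maximum observed in the training vectors."""
    dims = range(len(vectors[0]))
    lo = [min(v[d] for v in vectors) for d in dims]
    hi = [max(v[d] for v in vectors) for d in dims]
    return lo, hi

def encode(v, lo, hi):
    """Map each float to one byte (0..255) within the trained range."""
    codes = []
    for x, a, b in zip(v, lo, hi):
        span = (b - a) or 1.0  # guard against a constant dimension
        q = round((x - a) / span * 255)
        codes.append(min(255, max(0, q)))
    return bytes(codes)

def decode(code, lo, hi):
    """Approximate reconstruction of a vector from its byte code."""
    return [a + c / 255 * (b - a) for c, a, b in zip(code, lo, hi)]

random.seed(0)
d = 16
vectors = [[random.random() for _ in range(d)] for _ in range(100)]

lo, hi = train_range(vectors)
codes = [encode(v, lo, hi) for v in vectors]  # only the codes are kept

approx = decode(codes[0], lo, hi)
max_err = max(abs(x - y) for x, y in zip(vectors[0], approx))

print(len(codes[0]), "bytes per vector instead of", 4 * d, "for float32")
print("largest reconstruction error on the first vector:", max_err)
```

The error per dimension is bounded by half a quantization step, which is the "less precise search" side of the trade-off; the memory per vector drops by a factor of four relative to float32.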

The GPU implementation can accept input from either CPU or GPU memory. On a server with GPUs, the GPU indexes can be used as a drop-in replacement for the CPU indexes (e.g., replace IndexFlatL2 with GpuIndexFlatL2), and copies to/from GPU memory are handled automatically. Results will be faster, however, if both input and output remain resident on the GPU. Both single- and multi-GPU usage is supported.

Building

The library is mostly implemented in C++, with optional GPU support provided via CUDA, and an optional Python interface. The CPU version requires a BLAS library. It compiles with a Makefile and can be packaged in a docker image. See INSTALL.md for details.

How Faiss works

Faiss is built around an index type that stores a set of vectors, and provides a function to search in them with L2 and/or dot product vector comparison. Some index types are simple baselines, such as exact search. Most of the available indexing structures correspond to various trade-offs with respect to

  • search time
  • search quality
  • memory used per index vector
  • training time
  • need for external data for unsupervised training
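The trade-off between search time and search quality can be sketched with a toy inverted-file search in pure Python. This is a simplified illustration under stated assumptions, not the Faiss implementation: the names (`exact_search`, `build_lists`, `ivf_search`, `nprobe`, `nlist`) are chosen for this example, and the centroids are a random sample of the data rather than a trained k-means codebook. Scanning fewer lists touches fewer vectors but may miss true neighbors; scanning all lists reproduces the exact baseline.

```python
import random

def l2_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def exact_search(vectors, query, k):
    """Baseline: compare the query against every stored vector."""
    order = sorted(range(len(vectors)), key=lambda i: (l2_sq(vectors[i], query), i))
    return order[:k]

def build_lists(vectors, centroids):
    """Put each vector id into the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2_sq(centroids[c], v))
        lists[nearest].append(i)
    return lists

def ivf_search(vectors, centroids, lists, query, k, nprobe):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2_sq(centroids[c], query))
    candidates = [i for c in order[:nprobe] for i in lists[c]]
    ranked = sorted(candidates, key=lambda i: (l2_sq(vectors[i], query), i))
    return ranked[:k], len(candidates)

random.seed(0)
d, n, nlist, k = 8, 2000, 16, 5
vectors = [[random.random() for _ in range(d)] for _ in range(n)]
centroids = random.sample(vectors, nlist)  # assumption: stand-in for trained centroids
lists = build_lists(vectors, centroids)
query = [random.random() for _ in range(d)]

truth = exact_search(vectors, query, k)
for nprobe in (1, 4, nlist):
    found, scanned = ivf_search(vectors, centroids, lists, query, k, nprobe)
    recall = len(set(found) & set(truth)) / k
    print(f"nprobe={nprobe:2d} scanned={scanned:4d} recall@{k}={recall:.2f}")
```

The number of scanned vectors is a rough proxy for search time, and recall against the exact result is a proxy for search quality; the other trade-offs in the list (memory per vector, training time, need for training data) are not modeled here.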

The optional GPU implementation provides what is likely (as of March 2017) the fastest exact and approximate (compressed-domain) nearest neighbor search implementation for high-dimensional vectors, fastest Lloyd's k-means, and fastest small k-selection algorithm known. The implementation is detailed here.

Full documentation of Faiss

The following are entry points for documentation:

Authors

The main authors of Faiss are:

Reference

Reference to cite when you use Faiss in a research paper:

@article{JDH17,
  title={Billion-scale similarity search with GPUs},
  author={Johnson, Jeff and Douze, Matthijs and J{\'e}gou, Herv{\'e}},
  journal={arXiv preprint arXiv:1702.08734},
  year={2017}
}

Join the Faiss community

For public discussion of Faiss or for questions, there is a Facebook group at https://www.facebook.com/groups/faissusers/

We monitor the issues page of the repository. You can report bugs, ask questions, etc.

License

Faiss is MIT-licensed.

Comments
  • faiss::gpu::runMatrixMult failure

    The full log: Faiss assertion err == CUBLAS_STATUS_SUCCESS failed in void faiss::gpu::runMatrixMult(faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, float, float, cublasHandle_t, cudaStream_t) [with T = float; cublasHandle_t = cublasContext*; cudaStream_t = CUstream_st*] at utils/MatrixMult.cu:141
    Aborted (core dumped)

    I have successfully run demo_ivfpq_indexing_gpu, so I think Faiss was installed successfully.

    bug cant-repro 
    opened by hellolovetiger 36
  • No module named '_swigfaiss' for conda install

    Summary

    Platform

    OS: macOS 10.13.4

    Faiss version:

    Faiss compilation options:

    Running on :

    • [ ] CPU

    Reproduction instructions

    I installed with

    conda install faiss-cpu -c pytorch
    

    and got a No module named '_swigfaiss' error. I went into the faiss directory and tried to import again, but got the same error message. The troubleshooting section mentions that this error is caused by faiss not being compiled. Since I used conda install, I suppose that is not the case?

    bug install 
    opened by hsiaoma 29
  • make py: fatal error: Python.h: No such file or directory

    I am also facing the same issue. I did the following steps:

    1. Cloned FAISS
    2. Updated makefile.inc with the anaconda python path and installed necessary dependencies like libopenblas-dev python-numpy python-dev
    3. make (after this step I cannot find any _swigfaiss.so file anywhere)
    4. make py (gave the following error)

    $ make py
    g++ -I. -fPIC -m64 -Wall -g -O3 -msse4 -mpopcnt -fopenmp -Wno-sign-compare -std=c++11 -fopenmp -g -fPIC -fopenmp -I~/anaconda2/envs/faissenv/include/python2.7/ -I~/anaconda2/envs/faissenv/lib/python2.7/site-packages/numpy/core/include -shared -o python/_swigfaiss.so python/swigfaiss_wrap.cxx libfaiss.a /usr/lib/libopenblas.so.0
    python/swigfaiss_wrap.cxx:154:21: fatal error: Python.h: No such file or directory
    compilation terminated.
    Makefile:84: recipe for target 'python/_swigfaiss.so' failed
    make: *** [python/_swigfaiss.so] Error 1

    I am able to run the C++ implementation; only the Python wrapper fails. Let me know what I am setting wrong. Since _swigfaiss.so is not generated, what went wrong during make?

    Originally posted by @Mahanteshambi in https://github.com/facebookresearch/faiss/issues/336#issuecomment-365565492

    question cant-repro install 
    opened by daisy-belle 24
  • Faiss import error when run in virtualenv by using own built Faiss-python

    Summary

    I built faiss-core and faiss-python myself. I installed the Python package into my local virtual env, and importing faiss fails with an error. I checked the egg file, and it does contain _swigfaiss.so. I also checked the conda swigfaiss.py, which still uses the old swig_import_helper. I am not sure whether the error is caused by its removal when swig generates python/swigfaiss.py, as in this commit:

    https://github.com/facebookresearch/faiss/commit/7f5b22b0fff0882ce4afd93ce54cc2833a224909#diff-8cf6167d58ce775a08acafcfe6f40966

    $ ls faiss-1.5.2-py3.6/faiss
    __init__.py	__pycache__	_swigfaiss.so	swigfaiss.py
    

    Platform

    OS: centos 7

    Faiss version: 1.5.2

    Faiss compilation options:

     ./configure  --prefix=/usr --without-cuda --with-blas=/usr/lib64/libblas.so.3 --with-lapack=/usr/lib64/liblapack.so.3
    make
    sudo make install
    make py
    cd ~ && rm -rf env && python3 -m venv env
    source env/bin/activate
    cd ~/faiss && sudo make -C python install
    

    Running on:

    • [X] CPU
    • [ ] GPU

    Interface:

    • [ ] C++
    • [X] Python

    Reproduction instructions

    $ python
    Python 3.6.7 | packaged by conda-forge | (default, Feb 28 2019, 09:07:38)  [GCC 7.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import faiss
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/midas/env/lib/python3.6/site-packages/faiss-1.5.2-py3.6.egg/faiss/__init__.py", line 18, in <module>
      File "/home/midas/env/lib/python3.6/site-packages/faiss-1.5.2-py3.6.egg/faiss/swigfaiss.py", line 13, in <module>
    ImportError: cannot import name '_swigfaiss'
    
    install 
    opened by billyean 23
  • PyTorch tensor / Faiss index interoperability

    Summary: This diff allows for native usage of PyTorch tensors for Faiss indexes on both CPU and GPU. It is currently only implemented in this diff for things that inherit from faiss.Index, which covers the non-binary indices, and it patches the same functions on faiss.Index that were also covered by __init__.py for numpy interoperability.

    There must be uniformity among the inputs: if any array input is a Torch tensor, then all array inputs must be Torch tensors. Similarly, if any array input is a numpy ndarray, then all array inputs must be numpy ndarrays.

    If faiss.contrib.torch_utils is imported, it ensures that import faiss has already been performed to patch all of the functions using the base __init__.py numpy wrappers, and then patches the following functions again:

    add
    add_with_ids
    assign
    train
    search
    remove_ids
    reconstruct
    reconstruct_n
    range_search
    update_vectors
    search_and_reconstruct
    sa_encode
    sa_decode
    

    to allow usage of PyTorch CPU tensors, and additionally PyTorch GPU tensors if the index being used is on the GPU.

    numpy functionality is still available when faiss.contrib.torch_utils is imported; we pass through to the original patched numpy function when we detect numpy inputs.

    In addition, to allow for better (asynchronous) GPU usage without requiring the CPU to be involved, all of these functions which construct tensors/arrays for output now take optional arguments for storage (numpy or torch.Tensor) to be provided that will contain the output data. range_search is the only exception to this, as the size of the output data is indeterminate. The eventual GPU implementation will likely require the user to provide a maximum cap on the output size, and allow that to be passed instead. If the optional pre-allocated output values are presented by the user, they are used; otherwise, new return ndarray / Tensors are constructed as before and used for the return. If this feature were not provided on the GPU, then every execution would be completely serial as we would depend upon the CPU to allocate GPU memory before every operation. Instead, now this can function much like NN graph execution on the GPU, assuming that all of the data requirements are pre-allocated, so the execution will run at the full speed of the GPU and not be stalled sequentially launching kernels.

    This diff also exposes the GpuResources shared_ptr object owned by a GPU index. This is required for pytorch GPU so that we can perform proper stream ordering in Faiss with respect to the current pytorch stream. So, Faiss indices now perform more or less as any NN operation in Torch does.

    Note, however, that a Faiss index has its own setting on current device, and if the pytorch GPU tensor inputs are resident on a different device than what the Faiss index expects, a cross-device copy will be initiated. I may choose to make this an error in the future and require matching device to device.

    This diff also found a bug when passing GPU data directly to train() for GpuIndexIVFFlat and GpuIndexIVFScalarQuantizer, as I guess we never tested passing GPU data directly to these functions before. GpuIndexIVFPQ was doing the right thing however.

    The assign function is now also implemented on the GPU as well, and is now marked const to be in line with the search function.

    Also added better checking of non-contiguous inputs for both Torch tensors and numpy ndarrays.

    Updated the knn_gpu function with a base implementation always present that allows for usage of numpy arrays, which is overridden when torch_utils is imported to allow torch usage. This supports row/column major layout, float32/float16 data and int64/int32 indices for both numpy and torch.

    Reviewed By: mdouze

    Differential Revision: D24299400

    CLA Signed fb-exported 
    opened by wickedfoo 21
  • GPU issue when installing from conda

    Summary

    I installed Faiss from conda (GPU version) and got ImportError: No module named 'swigfaiss'. Could you help me out? Did I forget anything?

    Platform

    OS: Ubuntu

    Faiss version:

    Faiss compilation options:

    Running on :

    • [ ] CPU
    • [x] GPU

    Reproduction instructions

    image

    GPU install 
    opened by hminle 20
  • Speedup exhaustive_L2sqr_blas for AVX2, ARM NEON and AVX512

    Summary: Add a fused kernel for exhaustive_L2sqr_blas() call that combines a computation of dot product and the search for the nearest centroid. As a result, no temporary dot product values are written and read in RAM.

    Significantly speeds up the training of PQx[1] indices for low-dimensional PQ vectors ( 1, 2, 4, 8 ), and the effect is higher for higher values of [1]. AVX512 provides additional overloads for dimensionality of 12 and 16.

    The speedup is also beneficial for higher values of pq.cp.max_points_per_centroid (which is 256 by default).

    Speeds up IVFPQ training as well.

    AVX512 kernel is not enabled, but I've seen it speeding up the training TWICE versus AVX2 version. So, please feel free to use it by enabling AVX512 manually.

    Differential Revision: D41166766

    CLA Signed fb-exported 
    opened by alexanderguzhva 18
  • Does Faiss support searching from Disk?

    I checked this issue [#552] and also this demo file. However, the demo file is not about searching from disk; it shows how to save a trained index and load it into memory for searching. Does Faiss really support searching from disk? If it does, could you let me know where I can find out how to do it?

    question 
    opened by sam3oh5 18
  • _swigfaiss_avx2.so may not be loaded properly in conda

    Summary

    When I install faiss via conda, IndexPQFastScan is slower than IndexPQ. It seems that AVX2 is not activated properly because _swigfaiss_avx2.so is not loaded correctly.

    Platform

    OS: Ubuntu 20.04 on AWS EC2. (ami-0e039c7d64008bd84, c5.large)

    Faiss version: faiss-cpu 1.7.0 (pytorch/linux-64::faiss-cpu-1.7.0-py3.8_h2a577fa_0_cpu)

    Installed from: conda install -c pytorch faiss-cpu

    Faiss compilation options:

    Running on:

    • [x] CPU
    • [ ] GPU

    Interface:

    • [ ] C++
    • [x] Python

    Reproduction instructions

    I found that IndexPQFastScan is slower than IndexPQ for faiss 1.7.0 installed from conda. Here is the benchmark code.

    import faiss
    import numpy as np
    import time
    
    np.random.seed(123)
    D = 128
    N = 1000
    X = np.random.random((N, D)).astype(np.float32)
    M = 64
    nbits = 4
    
    pq = faiss.IndexPQ(D, M, nbits)
    pq.train(X)
    pq.add(X)
    
    pq_fast = faiss.IndexPQFastScan(D, M, nbits)
    pq_fast.train(X)
    pq_fast.add(X)
    
    t0 = time.time()
    d1, ids1 = pq.search(x=X[:3], k=5)
    t1 = time.time()
    print(f"pq: {(t1 - t0) * 1000} msec")
    
    t0 = time.time()
    d2, ids2 = pq_fast.search(x=X[:3], k=5)
    t1 = time.time()
    print(f"pq_fast: {(t1 - t0) * 1000} msec")
    
    assert np.allclose(ids1, ids2)
    

    The result is:

    pq: 0.4680156707763672 msec
    pq_fast: 1.6791820526123047 msec
    

    After investigating, the cause seems to be that _swigfaiss_avx2.so is not loaded correctly. If I rename _swigfaiss_avx2.so to _swigfaiss.so, the above code works as expected:

    cd ~/anaconda/lib/python3.8/site-packages/faiss/
    mv _swigfaiss.so _swigfaiss.so.bk
    mv _swigfaiss_avx2.so _swigfaiss.so
    

    Then the benchmark results in:

    pq: 0.8258819580078125 msec
    pq_fast: 0.07104873657226562 msec
    

    Here, IndexPQFastScan becomes much faster.

    The root cause seems to be that swigfaiss.py is somehow exactly the same as swigfaiss_avx2.py.

    diff swigfaiss.py swigfaiss_avx2.py     # same
    

    If I understand correctly, swigfaiss_avx2.py must load _swigfaiss_avx2.so. But currently swigfaiss_avx2.py is the same as swigfaiss.py and loads _swigfaiss.so.

    install 
    opened by matsui528 16
  • Indexing 1B vectors by creating smaller indexes on batches and merging them

    Need guidance...

    We'll have an application where we will stream a set of vectors (on the order of a billion). We cannot wait until we collect all the vectors to train an index (you recommend IMI at this scale). We are thinking of building indexes for smaller batches of vectors... once we have a batch ready, we could train the index from a sample, create an index for the batch and in the end merge all the indexes. I understand only IVF supports merging of indexes, wanted your thoughts on this approach.

    Thanks

    question GPU 
    opened by mvss80 16
  • CUDA 9 issue: results of GPU Index are not right?

    1. The result of the GPU index is not the same as the CPU result, even though both use the same dataset and the same index type

    import numpy as np
    d = 64                           # dimension
    nb = 100000                      # database size
    nq = 10000                       # nb of queries
    np.random.seed(1234)             # make reproducible
    xb = np.random.random((nb, d)).astype('float32')
    xb[:, 0] += np.arange(nb) / 1000.
    xq = np.random.random((nq, d)).astype('float32')
    xq[:, 0] += np.arange(nq) / 1000.
    #=================================================================
    import faiss                   # make faiss available
    index = faiss.IndexFlatL2(d)   # build the index
    index.add(xb)                  # add vectors to the index
    k = 4                          # we want to see 4 nearest neighbors
    D, I = index.search(xq, k)     # actual search
    print(I[-5:])               # neighbors of the 5 last queries
    print(D[-5:])               # corresponding distances
    
    del index, D, I
    #=================================================================
    print("=================")
    index = faiss.IndexFlatL2(d)   # build the index
    res = faiss.StandardGpuResources()
    index = faiss.index_cpu_to_gpu(res, 0, index)
    index.add(xb)                  # add vectors to the index
    k = 4                          # we want to see 4 nearest neighbors
    D, I = index.search(xq, k)     # actual search
    print(I[-5:])               # neighbors of the 5 last queries
    print(D[-5:])               # corresponding distances
    
    del index, D, I
    
    exit(1)
    

    The result is

    [[ 9900 10500  9309  9831]
     [11055 10895 10812 11321]
     [11353 11103 10164  9787]
     [10571 10664 10632  9638]
     [ 9628  9554 10036  9582]]
    [[ 6.53157043  6.97875977  7.00392151  7.01379395]
     [ 4.33526611  5.23693848  5.31942749  5.70327759]
     [ 6.07269287  6.57675171  6.61395264  6.7322998 ]
     [ 6.63751221  6.64874268  6.85787964  7.00964355]
     [ 6.21836853  6.45251465  6.54876709  6.58129883]]
    =================
    number of GPUs: 1
    [[10500 10500  9831  9831]
     [10895 10895 10812 11321]
     [11103 11103  9787  9787]
     [10632 10632  9638  9638]
     [ 9628  9554  9582  9582]]
    [[ 6.53156281  6.97874451  7.00393677  7.01376343]
     [ 4.33531189  5.23696899  5.31942749  5.70326233]
     [ 6.07269287  6.57672119  6.61393738  6.73226929]
     [ 6.63748169  6.64871216  6.85783386  7.00959778]
     [ 6.21837616  6.45251465  6.54875183  6.58128357]]
    

    The results of the GPU index and the CPU index are not the same.

    2. Duplicate items in the GPU result

    As shown in the result above, there are duplicate ids in the GPU result but with different distances, like [10500 10500 9831 9831].

    Could someone tell me what the problem is and how to fix it? Thanks!

    bug GPU 
    opened by DrLai12club 16
  • Tests fail to link: undefined symbol: testing::AssertionSuccess()

    Summary

    ld: error: undefined symbol: testing::AssertionSuccess()
    >>> referenced by test_binary_flat.cpp
    >>>               tests/CMakeFiles/faiss_test.dir/test_binary_flat.cpp.o:(BinaryFlat_accuracy_Test::TestBody())
    >>> referenced by test_dealloc_invlists.cpp
    >>>               tests/CMakeFiles/faiss_test.dir/test_dealloc_invlists.cpp.o:((anonymous namespace)::test_dealloc_invlists(char const*))
    >>> referenced by test_ivfpq_codec.cpp
    >>>               tests/CMakeFiles/faiss_test.dir/test_ivfpq_codec.cpp.o:(IVFPQ_codec_Test::TestBody())
    >>> referenced 533 more times
    
    ld: error: undefined symbol: testing::Message::Message()
    >>> referenced by test_binary_flat.cpp
    >>>               tests/CMakeFiles/faiss_test.dir/test_binary_flat.cpp.o:(BinaryFlat_accuracy_Test::TestBody())
    >>> referenced by test_dealloc_invlists.cpp
    >>>               tests/CMakeFiles/faiss_test.dir/test_dealloc_invlists.cpp.o:((anonymous namespace)::test_dealloc_invlists(char const*))
    >>> referenced by test_ivfpq_codec.cpp
    >>>               tests/CMakeFiles/faiss_test.dir/test_ivfpq_codec.cpp.o:(IVFPQ_codec_Test::TestBody())
    >>> referenced 746 more times
    

    Platform

    OS: FreeBSD 13.1

    Faiss version: 1.7.3

    Installed from: FreeBSD port

    opened by yurivict 0
  • Have max_codes consider only subset entries in IndexIVF search

    Summary

    Hey! Nice work with V1.7.3! I have a feature request.

    Is there a way to have the max_codes stop criterion in IndexIVF searches only consider those entries that actually belong to a subset if one is specified? With the current implementation, to my understanding, when the number of scanned entries reaches max_codes, the search is stopped. However, for subset searches, this might happen before we have actually scanned max_codes entries in the subset, because entries not in the subset also count towards this limit.

    Specifically, just as a proof of concept, all that would be necessary is to have scan_one_list not return list_size but instead return the number returned by scanner->scan_codes(list_size, codes, ids, simi, idxi, k); a few lines above. See here.

    Obviously, that's just a quick hack only for the IVF index. 🙃 I assume in order to not break the current behavior, this would need to be controlled via an additional search parameter for all indices that have the same behavior currently.

    Faiss version: 1.7.3 (19f7696deedc93615c3ee0ff4de22284b53e0243)

    Running on:

    • [x] CPU
    • [ ] GPU

    Interface:

    • [x] C++
    • [ ] Python
    opened by wro-ableton 0
  • Scan exactly max_codes elements

    Summary: The max_codes search parameter for IVF indexes limits the number of distance computations that are performed. Previously, the number of distance computations could exceed max_codes because inverted lists were scanned completely. This diff changed this to scan the beginning of the last inverted list to reach max_codes exactly.

    Differential Revision: D42367593

    CLA Signed fb-exported 
    opened by mdouze 2
  • search slow after time.sleep

    Platform

    OS: Ubuntu 18.04.6 LTS

    Faiss version: 1.5.3

    Installed from: pip install faiss

    Running on:

    • unknown

    Interface:

    • [x] Python

    build index: index = faiss.index_factory(self.d, "IDMap,Flat", faiss.METRIC_INNER_PRODUCT)
    save index: faiss.write_index(index, index_path)
    read index: faiss_index = faiss.read_index(index_path)

    In a loop of 100 searches, with time.sleep(0.2) some steps take more than 20 ms; without time.sleep(0.2), the time per step is steady.

    #1
    for i in range(0, 100):
        time.sleep(0.2)
        s_time = time.time()
        D, I = faiss_index.search(feature, 10)
        print(time.time() - s_time)

    time(s): 0.033809662 0.001636744 0.001227379 0.000584841 0.000673294 0.001588345 0.000566244 0.025577307 0.000347614 0.000542164 0.00073719 0.000379801 0.000360966 0.000362158 0.000305891 0.000477791 0.000341892 0.000299692 0.027928352 0.000314474 0.000792265 0.000283957 0.000373125 0.000294924 0.000402451 0.000293255 0.000303745 0.000368595 0.000586987 0.0218997 0.000355959 0.000353813 0.000363588 0.000471115 0.000345945 0.00036335 0.000501871 0.000407934 0.000304461 0.025905132 0.000546932 0.000391483 0.000262737 0.000678778 0.000277281 0.000338316 0.000325441 0.000415325 0.000396729 0.000430822 0.025371552 0.000266314 0.000350237 0.000250816 0.000309944 0.000453234 0.000368357 0.000521183 0.000347614 0.000543833 0.000417709 0.051602125 0.000535011 0.00065589 0.00056839 0.000513554 0.000328541 0.000306129 0.00067091 0.00054121 0.00051856 0.00036788 0.02731204 0.000954151 0.00055337 0.000694036 0.000400543 0.000449419 0.00043416 0.000398636 0.000354052 0.000365257 0.033364534 0.000450373 0.000359058 0.004323483 0.000331402 0.000561714 0.000916481 0.000369787 0.000481844 0.000393391 0.000357866 0.025733948 0.000584841 0.000360727 0.000318527 0.000590801 0.000495434 0.000266552

    #2
    for i in range(0, 100):
        #time.sleep(0.2)
        s_time = time.time()
        D, I = faiss_index.search(feature, 10)
        print(time.time() - s_time)

    time(s): 0.046122789 0.000362396 0.00031805 0.000313759 0.000325203 0.00032568 0.000318527 0.000315666 0.000306606 0.000331163 0.000328302 0.000318289 0.000317335 0.000319004 0.00031662 0.00031805 0.000314713 0.000321388 0.000338554 0.000316143 0.000310659 0.000306129 0.000330448 0.000365973 0.000255823 0.000335455 0.00032115 0.000276089 0.000339508 0.000310898 0.000317812 0.00032568 0.000333309 0.00030756 0.000320435 0.000317812 0.00032258 0.000314236 0.000326872 0.000309706 0.000336885 0.000307322 0.000322104 0.00032711 0.00032711 0.000305414 0.000321388 0.000312805 0.000305891 0.00031805 0.000324965 0.00030899 0.000313282 0.000323772 0.000318527 0.000325918 0.000321627 0.000317097 0.000327587 0.000323296 0.000310898 0.000326872 0.000333548 0.000359297 0.000272274 0.000305414 0.000329018 0.000317335 0.000315666 0.000325441 0.00031662 0.000314474 0.00033021 0.000314951 0.000320911 0.00033021 0.000313282 0.000319958 0.000318289 0.000332832 0.000331879 0.000303507 0.000319242 0.000331879 0.000316381 0.000310659 0.000353813 0.000301838 0.000322819 0.00031662 0.000310183 0.000318766 0.000341415 0.000312328 0.00033021 0.000317335 0.000331402 0.000324726 0.000315905 0.000311375

    opened by safehumeng 0
  • GpuIndexFlatL2 doesn't produce distances for the last 8 queries

    Platform

    OS: Windows 10

    Faiss version: 1.7.3

    Installed from: Compiled using Visual Studio 17 2022

    Faiss compilation options: Using MKL 2202.2.1

    Cuda version: 12.0.0

    GPU: GTX 1060

    Running on:

    • [X] CPU
    • [X] GPU

    Interface:

    • [X] C++
    • [ ] Python

    Reproduction instructions

    Using the test file linked below, faiss makes a CPU index and a GPU index. Then performs a query search on the first 1000 vectors from a 100000 vector database. Code copied directly from 1-Flat for the CPU portion, and 4-GPU for the GPU portion.

    Consistently, the last 8 vectors from the distance matrix are all 0's. Whether querying 1000 elements, or 10000 elements, it's only the last 8 elements.

    6-GPU-CPU.zip

    Output of the program is as follows:

    Building data
    Make index
    is_trained = true
    ntotal = 100000
    I (5 first results)=
        0   723   254   152   403    92   368  1129   673   571
        1   995   136   183   223   555   880   671     5    68
        2   312   253    29   124   148   112   718   713   260
        3   983   467    88   786   327   326   684   367  1053
        4   403   112   643   430   679   142   733   119   382
    I (10 last results)=
      990   962  2284   863  1133  1683  1463  2339  1730  2228
      991  1026   995   540  1396   365  1348  1271  1861   975
      992   257   163   135  1489  1315   878  1017   219   777
      993  1331   210  1362   286   444  1329   608  1191   986
      994   155   134   631   469  1044   388  1042   766  1561
      995   511     1   664   991  1800   689    37   634   631
      996   770  1043   827  1264  1310  1828  1504  1535   876
      997  1288   920   742  1432   840  1174  1337  1041  1113
      998   689  1044   810  1229  2199  1448  2112  1888  1442
      999  1722   901  1161  1044  1251   505  1310   791   308
    D (10 last results)=
          0 6.46885 6.56971 6.80382 7.19488 7.25274 7.44602 7.56737 7.75592  7.8215
          0 5.75124 5.96521 6.00626 6.17735  6.6787 6.74106 6.87712 6.89094 6.89425
          0 5.82659 6.08222 6.16805 6.19852 6.25793 6.56962 6.60474 6.71429 6.72893
          0 6.79663 6.83468  6.9018 6.90929 7.06563 7.07221 7.15147 7.18442 7.20781
          0 6.02754 6.53414 6.62136 6.73151 6.83076 6.85785 6.86768 6.87643 6.89012
          0 5.52238 5.78548 5.80803 5.96521 5.97704 6.12522  6.1321 6.18419 6.51028
          0 5.73736 6.25742 6.38132 6.43517 6.63315 6.70425 6.81538 6.84794  6.8531
          0 6.59953 6.84864 7.11777 7.33908 7.38752 7.39641 7.48399 7.52819 7.60603
          0 5.54166 5.68894 5.72082 5.98355 6.49582 6.52649  6.5502 6.66038 6.66049
          0 6.26311 6.37093 6.39842 6.62256 6.73258 6.82148 6.83769 6.84539 6.91491
    is_trained = true
    ntotal = 100000
    I (5 first results)=
        0   723   254   152   403    92   368  1129   673   571
        1   995   136   183   223   555   880   671     5    68
        2   312   253    29   124   148   112   718   713   260
        3   983   467    88   786   327   326   684   367  1053
        4   403   112   643   430   679   142   733   119   382
    I (10 last results)=
      990   962  2284   863  1133  1683  1463  2339  1730  2228
      991  1026   995   540  1396   365  1348  1271  1861   975
      992   257   163   135  1489  1315   878  1017   219   777
      993  1331   210  1362   286   444  1329   608  1191   986
      994   155   134   631   469  1044   388  1042   766  1561
      995   511     1   664   991  1800   689    37   634   631
      996   770  1043   827  1264  1310  1828  1504  1535   876
      997  1288   920   742  1432   840  1174  1337  1041  1113
      998   689  1044   810  1229  2199  1448  2112  1888  1442
      999  1722   901  1161  1044  1251   505  1310   791   308
    D (10 last results)=
    7.62939e-06 6.46885 6.56971 6.80381 7.19488 7.25273 7.44602 7.56738 7.75592  7.8215
          0 5.75124 5.96521 6.00626 6.17735 6.67871 6.74106 6.87711 6.89094 6.89426
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
          0       0       0       0       0       0       0       0       0       0
    
    cant-repro GPU 
    opened by JulianThijssen 1
Releases(v1.7.3)
  • v1.7.3(Nov 30, 2022)

    Added

    • Sparse k-means routines and moved the generic kmeans to contrib
    • FlatDistanceComputer for all FlatCodes indexes
    • Support for fast accumulation of 4-bit LSQ and RQ
    • Product additive quantization support
    • Support per-query search parameters for many indexes + filtering by ids
    • write_VectorTransform and read_VectorTransform were added to the public API (by @AbdelrahmanElmeniawy)
    • Support for IDMap2 in index_factory by adding "IDMap2" to prefix or suffix of the input String (by @AbdelrahmanElmeniawy)
    • Support for merging all IndexFlatCodes descendants (by @AbdelrahmanElmeniawy)
    • Remove and merge features for IndexFastScan (by @AbdelrahmanElmeniawy)
    • Performance improvements: 1) specialized the AVX2 pieces of code speeding up certain hotspots, 2) specialized kernels for vector codecs (this can be found in faiss/cppcontrib)

    Fixed

    • Fixed memory leak in OnDiskInvertedLists::do_mmap when the file is not closed (by @AbdelrahmanElmeniawy)
    • LSH correctly throws error for metric types other than METRIC_L2 (by @AbdelrahmanElmeniawy)
  • v1.7.2(Jan 10, 2022)

    ADDED

    • Support LSQ on GPU (by @KinglittleQ)
    • Support for exact 1D kmeans (by @KinglittleQ)
    • LUT-based search for additive quantizers
    • Autogenerated Python docstrings from Doxygen comments

    CHANGED

    • Cleanup of index_factory parsing
  • v1.6.4(Oct 22, 2020)

    Features

    • Arbitrary dimensions per sub-quantizer now allowed for GpuIndexIVFPQ.
    • Brute-force kNN on GPU (bfKnn) now accepts int32 indices.
    • Faiss CPU now supports Windows. Conda packages are available from the nightly channel.
  • v1.5.3(Jun 24, 2019)

    Bugfixes:

    • slow scanning of inverted lists (#836).

    Features:

    • add basic support for 6 new metrics in CPU IndexFlat and IndexHNSW (#848);
    • add support for IndexIDMap/IndexIDMap2 with binary indexes (#780).

    Misc:

    • throw python exception for OOM (#758);
    • make DistanceComputer available for all random access indexes;
    • gradually moving from long to int64_t for portability.
  • v1.5.2(May 30, 2019)

    The license was changed from BSD+Patents to MIT.

    Changelog:

    • propagates exceptions raised in sub-indexes of IndexShards and IndexReplicas;
    • support for searching several inverted lists in parallel (parallel_mode != 0);
    • better support for PQ codes where nbit != 8 or 16;
    • IVFSpectralHash implementation: spectral hash codes inside an IVF;
    • 6-bit per component scalar quantizer (4 and 8 bit were already supported);
    • combinations of inverted lists: HStackInvertedLists and VStackInvertedLists;
    • configurable number of threads for OnDiskInvertedLists prefetching (including 0=no prefetch);
    • more test and demo code compatible with Python 3 (print with parentheses);
    • refactored benchmark code: data loading is now in a single file.
  • v1.5.1(May 30, 2019)

    Changelog:

    • a MatrixStats object, which reports useful statistics about a dataset;
    • an option to round coordinates during k-means optimization;
    • an alternative option for search in HNSW;
    • moved stats() and imbalance_factor() from IndexIVF to InvertedLists object;
    • range search is now available for IVFScalarQuantizer;
    • support for a direct uint8_t codec in ScalarQuantizer;
    • renamed IndexProxy to IndexReplicas;
    • better support for PQ code assignment with external index;
    • support for IMI2x16 (4B virtual centroids!);
    • support for k = 2048 search on GPU (instead of 1024);
    • most CUDA mem alloc failures now throw exceptions instead of terminating on an assertion;
    • support for renaming an ondisk invertedlists;
    • interrupt computations with interrupt signal (ctrl-C) in python;
    • simplified build system (with --with-cuda/--with-cuda-arch options);
    • updated example Dockerfile;
    • conda packages now depend on the cudatoolkit packages, which fixes some interference with pytorch. Consequently, faiss-gpu should now be installed with: conda install -c pytorch faiss-gpu cudatoolkit=10.0
  • v1.5.0(May 30, 2019)

  • v1.4.0(Aug 31, 2018)

    Faiss 1.4.0

    Features:

    • automatic tracking of C++ references in Python
    • non-Intel platforms supported, with some functions optimized for ARM
    • override nprobe for concurrent searches
    • support for floating-point quantizers in binary indexes

    Bug fixes:

    • no more segfaults in python (I know it's the same as the first feature but it's important!)
    • fix GpuIndexIVFFlat issues for float32 with 64 / 128 dims
    • fix sharding of flat indexes on GPU with index_cpu_to_gpu_multiple

    The Python interface of Faiss closely mimics the C++ interface. This means that all C++ functions, objects, fields and methods are visible and accessible in Python. This is done thanks to SWIG, which automatically generates Python classes from the C++ headers. The downside of this low-level access was that there was no automatic tracking of C++ references in Python. For example:

    import faiss
    index = faiss.IndexIVFFlat(faiss.IndexFlatL2(10), 10, 100)
    

    would crash. Python does not know that the IndexFlatL2 is referenced by the IndexIVFFlat, so the garbage collector deallocates the IndexFlatL2 while the IndexIVFFlat still references it. In Faiss 1.4.0, we added code to all such constructors that adds a Python-level reference to the object and prevents deallocation. With this upgrade, there should be no more crashes in pure Python; if you do encounter one, please report it right away as an issue.
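
The keep-alive idea can be illustrated without Faiss at all. The sketch below is a pure-Python analogy, not the actual wrapper code: Quantizer, IndexWithoutRef, IndexWithRef and the referenced_objects attribute are hypothetical names chosen for illustration, and the "raw pointer" is simulated with a plain integer. It assumes CPython, where an object is freed as soon as its last reference disappears.

```python
import gc
import weakref


class Quantizer:
    """Stand-in for a C++-backed object such as a coarse quantizer (hypothetical)."""


class IndexWithoutRef:
    """Mimics a wrapper that only passes a raw pointer down to C++.

    Python sees no reference to the quantizer, so it is free to collect it.
    """

    def __init__(self, quantizer):
        # A plain integer, not a reference: it does not keep the object alive.
        self.quantizer_address = id(quantizer)


class IndexWithRef(IndexWithoutRef):
    """Same wrapper, but the constructor also stores a Python-level reference."""

    def __init__(self, quantizer):
        super().__init__(quantizer)
        self.referenced_objects = [quantizer]  # keeps the quantizer alive


def quantizer_survives(index_cls):
    """Build an index from a temporary quantizer and report whether it is still alive."""
    quantizer = Quantizer()
    probe = weakref.ref(quantizer)  # observes the object without owning it
    index = index_cls(quantizer)
    del quantizer  # drop the only local reference, as a temporary would
    gc.collect()
    alive = probe() is not None
    del index
    return alive


print("without reference:", quantizer_survives(IndexWithoutRef))
print("with reference:   ", quantizer_survives(IndexWithRef))
```

In the first case the wrapper is left holding the address of an object that no longer exists, which is the situation that led to crashes; in the second the stored reference ties the quantizer's lifetime to the index.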

    Faiss was developed on 64-bit x86 platforms, Linux and Mac OS. There were quite a few locations in the code that shamelessly assumed that they were compiled with SSE support. Faiss 1.4.0 is portable to other hardware: it has pure C++ code for all operations, and SSE/AVX is only enabled if the appropriate macros are set. This was tested on an ARM platform, and a few operations were also optimized with ARM SIMD instructions (in utils_simd.cpp).

    To compile on a non-x86 platform, you will need to provide a BLAS library (OpenBLAS works for aarch64) and remove x86-specific flags from makefile.inc (manually for now). Faiss is not portable to compilers other than g++/clang, though.

    The search-time parameters like nprobe for IndexIVF are set in the index object. What if you want to perform concurrent searches from several threads with different search parameters? Until now, this was not possible. Now there is an IVFSearchParameters object that can override the parameters set at the object level. See tests/test_params_override.cpp.
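
To show why a per-call override matters for concurrent searches, here is a minimal pure-Python sketch. It does not use the Faiss API: ToyIVF and ToySearchParams are hypothetical stand-ins, and the "index" is a toy one-dimensional inverted file. The point is only that the override is read per call, so the index-level default is never mutated and two threads can search with different settings at the same time.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToySearchParams:
    """Hypothetical per-call parameters (not the Faiss class)."""
    nprobe: int


class ToyIVF:
    """Toy 1-D inverted file: each value is stored in the list of its nearest centroid."""

    def __init__(self, centroids, nprobe=1):
        self.centroids = list(centroids)
        self.nprobe = nprobe  # index-level default
        self.lists = [[] for _ in self.centroids]

    def _ranked_centroids(self, x):
        # Centroid ids sorted from nearest to farthest from x.
        return sorted(range(len(self.centroids)), key=lambda c: abs(self.centroids[c] - x))

    def add(self, values):
        for v in values:
            self.lists[self._ranked_centroids(v)[0]].append(v)

    def search(self, query, params: Optional[ToySearchParams] = None):
        # The override applies to this call only; self.nprobe is left untouched.
        nprobe = params.nprobe if params is not None else self.nprobe
        visited = self._ranked_centroids(query)[:nprobe]
        candidates = [v for c in visited for v in self.lists[c]]
        return min(candidates, key=lambda v: abs(v - query), default=None)


index = ToyIVF([0.0, 10.0, 20.0], nprobe=1)
index.add([4.0, 9.0, 18.0])

with ThreadPoolExecutor(max_workers=2) as pool:
    narrow = pool.submit(index.search, 5.1)                            # index default
    wide = pool.submit(index.search, 5.1, ToySearchParams(nprobe=2))   # per-call override
    narrow_result, wide_result = narrow.result(), wide.result()

print(narrow_result, wide_result, index.nprobe)
```

With the default of one probed list the query only sees the list of its nearest centroid; with the override it also visits the second list and finds a closer value, while the shared default stays at 1.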

    Faiss' support for binary indexes is recent, and only a few index types are supported so far. To work around this, we added IndexBinaryFromFloat, a binary index that wraps around any floating-point index. This makes it possible, for example, to use an IndexHNSW as a quantizer for an IndexBinaryIVF. See tests/test_index_binary_from_float.py.
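
A floating-point index can stand in for a binary one because Hamming distance can be recovered from an ordinary L2 distance. The sketch below checks that identity in pure Python; the ±1 encoding is an assumption made for illustration, not a description of the IndexBinaryFromFloat internals, and the helper names are hypothetical. If each bit is mapped to +1.0 or -1.0, two vectors differ by 2 in every position where the bits differ, so the squared L2 distance is exactly 4 times the Hamming distance.

```python
def bits_to_signs(code: bytes):
    """Expand a packed binary code into +1.0 / -1.0 floats, one per bit."""
    return [1.0 if (byte >> bit) & 1 else -1.0 for byte in code for bit in range(8)]


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length codes."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def squared_l2(u, v) -> float:
    """Squared Euclidean distance between two float vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


a = bytes([0b10110010, 0b00001111])
b = bytes([0b10010011, 0b11110000])

h = hamming(a, b)
d = squared_l2(bits_to_signs(a), bits_to_signs(b))
print(h, d, d / 4 == h)
```

Because the ranking by squared L2 on the expanded vectors matches the ranking by Hamming distance, any float index that supports L2 search can serve as the coarse quantizer under this encoding.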

    We also fixed a few bugs that correspond to GitHub issues.

  • v1.3.0(Jul 12, 2018)

    Features:

    • Support for binary indexes (IndexBinaryFlat, IndexBinaryIVF)
    • Support fp16 encoding in scalar quantizer
    • Support for deduplication in IndexIVFFlat
    • Support for index serialization

    Bugs:

    • Fix MMAP bug for normal indexes
    • Fix propagation of io_flags in read func
    • Fix k-selection for CUDA 9
    • Fix race condition in OnDiskInvertedLists
  • v1.2.1(Mar 1, 2018)

Owner
Meta Research