Awesome Semantic-Search
Logo made by @createdbytango.
This repository aims to serve as a meta-repository for Semantic Search and Semantic Similarity related tasks.
Semantic Search isn't limited to text! It can be done with images, speech, etc., so it has numerous use cases and applications.
Contributions / Milestones
Have a look at the project board for the task list.
Table Of Contents
Papers
2014
2015
- Skip-Thought Vectors
2016
- Bag of Tricks for Efficient Text Classification
- Enriching Word Vectors with Subword Information
- Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs
- On Approximately Searching for Similar Word Embeddings
2017
2018
- Universal Sentence Encoder
- Learning Semantic Textual Similarity from Conversations
- Google AI Blog: Advances in Semantic Textual Similarity
- Optimization of Indexing Based on k-Nearest Neighbor Graph for Proximity Search in High-dimensional Data
2019
- LASER: Language Agnostic Sentence Representations
- Document Expansion by Query Prediction
- Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
- Multi-Stage Document Ranking with BERT
2020
- Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned
- Passage Re-ranking with BERT
- CO-Search: COVID-19 Information Retrieval with Semantic Search, Question Answering, and Abstractive Summarization
- LaBSE: Language-agnostic BERT Sentence Embedding
- Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset
- DeText: A deep NLP framework for intelligent text understanding
- Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation
- Pretrained Transformers for Text Ranking: BERT and Beyond
2021
- Augmented SBERT
- BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
- Compatibility-aware Heterogeneous Visual Search
?2021/2022?
Libraries and Tools
- fastText
- Universal Sentence Encoder
- SBERT
- LaBSE
- LASER
- Haystack
- Jina.AI
- SentEval Toolkit
- BEIR: Benchmarking IR
- Which Frame?
- Pyserini
- milvus
- weaviate
- natural-language-youtube-search
- same.energy
- ScaNN
- annoy
- faiss
- DPR
- rank_bm25
- nearPy
- vearch
- PyNNDescent
- pgANN