Pythonic search engine based on PyLucene.

Overview

image image image image image image image image image

Lupyne is a search engine based on PyLucene, the Python extension for accessing Java Lucene. Lucene is a relatively low-level toolkit, and PyLucene wraps it through automatic code generation. So although Java idioms are translated to Python idioms where possible, the resulting interface is far from Pythonic. See ./docs/examples.ipynb for comparisons with the Lucene API.

Lupyne also provides GraphQL and RESTful search services, based on Starlette. Note Solr and Elasticsearch are popular options for Lucene-based search, if no further (Python) customization is needed. So while the services are suitable for production usage, their primary motivation is to be an extensible example.

Not having to initially choose between an embedded library and a server not only provides greater flexibility, it can provide better performance, e.g., batch indexing offline and remote searching live. Additionally only lightweight wrappers with extended behavior are used wherever possible, so falling back to using PyLucene directly is always an option, but should never be necessary for performance.

Usage

PyLucene requires initializing the VM.

import lucene

lucene.initVM()

Indexes are accessed through an IndexSearcher (read-only), IndexWriter, or the combined Indexer.

from lupyne import engine

searcher = engine.IndexSearcher('index/path')
hits = searcher.search('text:query')

See ./lupyne/services/README.md for services usage.

Installation

% pip install lupyne[graphql,rest]

PyLucene is not pip installable.

Dependencies

  • PyLucene >=8
  • strawberry-graphql >=0.84.4 (if graphql option)
  • fastapi (if rest option)

Tests

100% branch coverage.

% pytest [--cov]

Changes

dev

  • PyLucene >=8.6 required
  • PyLucene 8.11 supported
  • CherryPy server removed

2.5

  • Python >=3.7 required
  • PyLucene 8.6 supported
  • CherryPy server deprecated

2.4

  • PyLucene >=8 required
  • Hit.keys renamed to Hit.sortkeys

2.3

  • PyLucene >=7.7 required
  • PyLucene 8 supported

2.2

  • PyLucene 7.6 supported

2.1

  • PyLucene >=7 required

2.0

  • PyLucene >=6 required
  • Python 3 support
  • client moved to external package

1.9

  • Python 2.6 dropped
  • PyLucene 4.8 and 4.9 dropped
  • IndexWriter implements context manager
  • Server DocValues updated via patch method
  • Spatial tile search optimized

1.8

  • PyLucene 4.10 supported
  • PyLucene 4.6 and 4.7 dropped
  • Comparator iteration optimized
  • Support for string based FieldCacheRangeFilters
Google Project: Search and auto-complete sentences within given input text files, manipulating data with complex data-structures.

Auto-Complete Google Project In this project there is an implementation for one feature of Google's search engines - AutoComplete. Autocomplete, or wo

Hadassah Engel 10 Jun 20, 2022
Google Drive file searcher

Google Drive file searcher

Hafitz Setya 25 Dec 09, 2022
A simple tool for searching images inside a local folder with text/image input using CLIP

clip-search (WIP) A simple tool for searching images inside a local folder with text/image input using CLIP 10 results for "a blonde woman" in a folde

5 Dec 25, 2022
A sentence search engine that fetches examples from trusted news/media organisations. Great for writing better English.

A sentence search engine that fetches examples from trusted news/media websites. Great for improving writing & speaking better English.

Stephen Appiah 1 Apr 04, 2022
User-friendly, tiny source code searcher written by pure Python.

User-friendly, tiny source code searcher written in pure Python. Example Usages Cat is equivalent in the regular expression as '^Cat$' bor class Cat

Furkan Onder 106 Nov 02, 2022
Search emails from a domain through search engines

EmailFinder - search emails through Search Engines

Josué Encinar 155 Dec 30, 2022
Simple algorithm search engine like google in python using function

Mini-Search-Engine-Like-Google I have created the simple algorithm search engine like google in python using function. I am matching every word with w

Sachin Vinayak Dabhade 5 Sep 24, 2021
GitScanner is a script to make it easy to search for Exposed Git through an advanced Google search.

GitScanner Legal disclaimer Usage of GitScanner for attacking targets without prior mutual consent is illegal. It is the end user's responsibility to

Kaio Gomes 3 Oct 28, 2022
Jina allows you to build deep learning-powered search-as-a-service in just minutes

Cloud-native neural search framework for any kind of data

Jina AI 17k Dec 31, 2022
ElasticSearch ODM (Object Document Mapper) for Python - pip install esengine

esengine - The Elasticsearch Object Document Mapper esengine is an ODM (Object Document Mapper) it maps Python classes in to Elasticsearch index/doc_t

SEEK International AI 109 Nov 22, 2022
Eland is a Python Elasticsearch client for exploring and analyzing data in Elasticsearch with a familiar Pandas-compatible API.

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

elastic 463 Dec 30, 2022
A library for fast import of Windows NT Registry(REGF) into Elasticsearch.

A library for fast import of Windows NT Registry(REGF) into Elasticsearch.

S.Nakano 3 Apr 01, 2022
Python Elasticsearch handler for the standard python logging framework

Python Elasticsearch Log handler This library provides an Elasticsearch logging appender compatible with the python standard logging library. This lib

Mohammed Mousa 0 Dec 08, 2021
A play store search application programming interface ( API )

Play-Store-API A play store search application programming interface ( API ) Made with Python3

Fayas Noushad 8 Oct 21, 2022
A fast, efficiency python package for searching and getting search results with many different search engines

search A fast, efficiency python package for searching and getting search results with many different search engines. Installation To install the pack

Neurs 0 Oct 06, 2022
A sphinx extension for designing beautiful, screen-size responsive web components.

sphinx-design A sphinx extension for designing beautiful, view size responsive web components. Created with inspiration from Bootstrap (v5), Material

Executable Books 109 Jan 01, 2023
Pysolr — Python Solr client

pysolr pysolr is a lightweight Python client for Apache Solr. It provides an interface that queries the server and returns results based on the query.

Haystack Search 626 Dec 01, 2022
document organizer with tags and full-text-search, in a simple and clean sqlite3 schema

document organizer with tags and full-text-search, in a simple and clean sqlite3 schema

Manos Pitsidianakis 152 Oct 29, 2022
Super Simple Similarities Service

Super Simple Similarities Service

vincent d warmerdam 95 Dec 25, 2022
Searches for MAC addresses in a text file of a Cisco "show IP arp" in any address format

show-ip-arp-mac-lookup Searches for MAC addresses in a text file of a Cisco "show IP arp" in any address format What it does: Takes a text file with t

Stew Alexander 0 Dec 24, 2022