Pythonic search engine based on PyLucene.

Overview

image image image image image image image image image

Lupyne is a search engine based on PyLucene, the Python extension for accessing Java Lucene. Lucene is a relatively low-level toolkit, and PyLucene wraps it through automatic code generation. So although Java idioms are translated to Python idioms where possible, the resulting interface is far from Pythonic. See ./docs/examples.ipynb for comparisons with the Lucene API.

Lupyne also provides GraphQL and RESTful search services, based on Starlette. Note Solr and Elasticsearch are popular options for Lucene-based search, if no further (Python) customization is needed. So while the services are suitable for production usage, their primary motivation is to be an extensible example.

Not having to initially choose between an embedded library and a server not only provides greater flexibility, it can provide better performance, e.g., batch indexing offline and remote searching live. Additionally only lightweight wrappers with extended behavior are used wherever possible, so falling back to using PyLucene directly is always an option, but should never be necessary for performance.

Usage

PyLucene requires initializing the VM.

import lucene

lucene.initVM()

Indexes are accessed through an IndexSearcher (read-only), IndexWriter, or the combined Indexer.

from lupyne import engine

searcher = engine.IndexSearcher('index/path')
hits = searcher.search('text:query')

See ./lupyne/services/README.md for services usage.

Installation

% pip install lupyne[graphql,rest]

PyLucene is not pip installable.

Dependencies

  • PyLucene >=8
  • strawberry-graphql >=0.84.4 (if graphql option)
  • fastapi (if rest option)

Tests

100% branch coverage.

% pytest [--cov]

Changes

dev

  • PyLucene >=8.6 required
  • PyLucene 8.11 supported
  • CherryPy server removed

2.5

  • Python >=3.7 required
  • PyLucene 8.6 supported
  • CherryPy server deprecated

2.4

  • PyLucene >=8 required
  • Hit.keys renamed to Hit.sortkeys

2.3

  • PyLucene >=7.7 required
  • PyLucene 8 supported

2.2

  • PyLucene 7.6 supported

2.1

  • PyLucene >=7 required

2.0

  • PyLucene >=6 required
  • Python 3 support
  • client moved to external package

1.9

  • Python 2.6 dropped
  • PyLucene 4.8 and 4.9 dropped
  • IndexWriter implements context manager
  • Server DocValues updated via patch method
  • Spatial tile search optimized

1.8

  • PyLucene 4.10 supported
  • PyLucene 4.6 and 4.7 dropped
  • Comparator iteration optimized
  • Support for string based FieldCacheRangeFilters
A web search server for ParlAI, including Blenderbot2.

Description A web search server for ParlAI, including Blenderbot2. Querying the server: The server reacting correctly: Uses html2text to strip the mar

Jules Gagnon-Marchand 119 Jan 06, 2023
Searches for MAC addresses in a text file of a Cisco "show IP arp" in any address format

show-ip-arp-mac-lookup Searches for MAC addresses in a text file of a Cisco "show IP arp" in any address format What it does: Takes a text file with t

Stew Alexander 0 Dec 24, 2022
rclip - AI-Powered Command-Line Photo Search Tool

rclip is a command-line photo search tool based on the awesome OpenAI's CLIP neural network.

Yurij Mikhalevich 394 Dec 12, 2022
A fast, efficiency python package for searching and getting search results with many different search engines

search A fast, efficiency python package for searching and getting search results with many different search engines. Installation To install the pack

Neurs 0 Oct 06, 2022
GitScanner is a script to make it easy to search for Exposed Git through an advanced Google search.

GitScanner Legal disclaimer Usage of GitScanner for attacking targets without prior mutual consent is illegal. It is the end user's responsibility to

Kaio Gomes 3 Oct 28, 2022
An image inline search telegram bot.

Image-Search-Bot An image inline search telegram bot. Note: Use Telegram picture bot. That is better. Not recommending to deploy this bot. Made with P

Fayas Noushad 24 Oct 21, 2022
Deep Image Search - AI-Based Image Search Engine

Deep Image Search is an AI-based image search engine that includes deep transfer learning features Extraction and tree-based vectorized search technique.

144 Jan 05, 2023
A library for fast parse & import of Windows Prefetch into Elasticsearch.

prefetch2es Fast import of Windows Prefetch(.pf) into Elasticsearch. prefetch2es uses C library libscca. Usage When using from the commandline interfa

S.Nakano 5 Nov 24, 2022
Python Elasticsearch handler for the standard python logging framework

Python Elasticsearch Log handler This library provides an Elasticsearch logging appender compatible with the python standard logging library. This lib

Mohammed Mousa 0 Dec 08, 2021
An open source, non-profit search engine implemented in python

Mwmbl: No ads, no tracking, no cruft, no profit Mwmbl is a non-profit, ad-free, free-libre and free-lunch search engine with a focus on useability and

639 Jan 04, 2023
document organizer with tags and full-text-search, in a simple and clean sqlite3 schema

document organizer with tags and full-text-search, in a simple and clean sqlite3 schema

Manos Pitsidianakis 152 Oct 29, 2022
A play store search application programming interface ( API )

Play-Store-API A play store search application programming interface ( API ) Made with Python3

Fayas Noushad 8 Oct 21, 2022
TG-searcherBot - Search any channel/chat from keyword

TG-searcherBot Search any channel/chat from keyword. Commands /start - Starts th

TechiError 12 Nov 04, 2022
A library for fast import of Windows NT Registry(REGF) into Elasticsearch.

A library for fast import of Windows NT Registry(REGF) into Elasticsearch.

S.Nakano 3 Apr 01, 2022
Inverted index creation and query search mechanism on Wikipedia pages.

WikiPedia Search Engine Step 1 : Installing Requirements Install "stemming" module for python using pip. Step 2 : Parsing the Data To parse the data,

Piyush Atri 1 Nov 27, 2021
A simple search engine that allow searching for chess games

A simple search engine that allow searching for chess games based on queries about opening names & opening moves. Built with Python 3.10 and python-chess.

Tyler Hoang 1 Jun 17, 2022
a Telegram bot writen in Python for searching files in Drive. Based on SearchX-bot

Drive Search Bot This is a Telegram bot writen in Python for searching files in Drive. Based on SearchX-bot How to deploy? Clone this repo: git clone

Hafitz Setya 25 Dec 09, 2022
A simple tool for searching images inside a local folder with text/image input using CLIP

clip-search (WIP) A simple tool for searching images inside a local folder with text/image input using CLIP 10 results for "a blonde woman" in a folde

5 Dec 25, 2022
基于RSSHUB阅读器实现的获取P站排行和P站搜图,使用时需使用代理

基于RSSHUB阅读器实现的获取P站排行和P站搜图

34 Dec 05, 2022
Home for Elasticsearch examples available to everyone. It's a great way to get started.

Introduction This is a collection of examples to help you get familiar with the Elastic Stack. Each example folder includes a README with detailed ins

elastic 2.5k Jan 03, 2023