A web search server for ParlAI, including Blenderbot2.

Last update: Jan 06, 2023

Querying the server: [screenshot not included]

The server reacting correctly: [screenshot not included]

Uses html2text to strip the markup out of the page.
Uses beautifulsoup4 to parse the title.
Currently uses only the googlesearch module to query Google for URLs, but the code is modular and search-engine agnostic, so support for new search engines is easy to add.
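
The parsing step described above (extract the title, strip the markup) can be sketched as follows. This is not the project's code: to stay dependency-free it uses the standard library's html.parser as a stand-in for html2text and beautifulsoup4, and the names PageTextExtractor and parse_page, as well as the returned dict shape, are hypothetical.

```python
from html.parser import HTMLParser


class PageTextExtractor(HTMLParser):
    """Collects the <title> and the visible text of an HTML page."""

    SKIPPED_TAGS = {"script", "style"}  # their content is never shown to a reader

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text_chunks = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._skip_depth == 0 and data.strip():
            self.text_chunks.append(data.strip())


def parse_page(html):
    """Return the page title and its markup-free text (hypothetical shape)."""
    extractor = PageTextExtractor()
    extractor.feed(html)
    extractor.close()
    return {
        "title": extractor.title.strip(),
        "content": "\n".join(extractor.text_chunks),
    }


SAMPLE_HTML = """<html><head><title>Baseball</title><style>p {color: red}</style></head>
<body><h1>Baseball</h1><p>A bat-and-ball <b>sport</b>.</p></body></html>"""

page = parse_page(SAMPLE_HTML)
print(page["title"])
print(page["content"])
```

html2text additionally renders the page as Markdown-like text, which this stand-in does not attempt.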

Using the googlesearch module is very slow because it parses web pages instead of querying web services. This is fine for playing with the model, but makes that searcher unusable for training or large-scale inference.

To train, one would need a faster search backend: for example, pay for Google Cloud's or Microsoft Azure's search services and derive the Search class to query them.
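
A minimal sketch of what such a pluggable design can look like. The class and method names here (SearchBackend, StaticIndexSearch, search, top_urls) are hypothetical and are not taken from search_server.py; the toy in-memory backend only stands in for a client of a paid search web service.

```python
import itertools
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List


class SearchBackend(ABC):
    """Hypothetical base class: one subclass per search engine."""

    @abstractmethod
    def search(self, query: str, n: int) -> Iterator[str]:
        """Yield up to n result URLs for the query."""


class StaticIndexSearch(SearchBackend):
    """Toy backend over an in-memory index (assumption: stands in for a web service client)."""

    def __init__(self, index: Dict[str, str]):
        self.index = index  # maps URL -> page text

    def search(self, query: str, n: int) -> Iterator[str]:
        matches = (
            url for url, text in self.index.items()
            if query.lower() in text.lower()
        )
        return itertools.islice(matches, n)


def top_urls(backend: SearchBackend, query: str, n: int) -> List[str]:
    """The caller depends only on the base class, so backends are swappable."""
    return list(backend.search(query, n))


index = {
    "http://example.com/baseball": "Baseball is a bat-and-ball sport.",
    "http://example.com/chess": "Chess is a board game.",
    "http://example.com/mlb": "Major League Baseball teams.",
}
urls = top_urls(StaticIndexSearch(index), "baseball", 1)
print(urls)
```

Because the caller only sees the base class, replacing the slow scraper with a web service client would mean writing one new subclass.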

Quick Start:

First install the requirements:

pip install -r requirements.txt

Run this command in one terminal tab:

python search_server.py serve --host 0.0.0.0:8080

[Optional] You can then test the server with:

curl -X POST "http://0.0.0.0:8080" -d "q=baseball&n=1"
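
The same request can be built from Python with the standard library. This is a sketch: build_search_request is a hypothetical helper, and the q and n form fields are taken from the curl command above. The response format is not documented here, so the sketch only prints the raw body.

```python
import urllib.parse
import urllib.request


def build_search_request(host, query, n):
    """Build a form-encoded POST like `curl -d "q=...&n=..."` (hypothetical helper)."""
    body = urllib.parse.urlencode({"q": query, "n": n}).encode("utf-8")
    return urllib.request.Request(f"http://{host}", data=body, method="POST")


request = build_search_request("0.0.0.0:8080", "baseball", 1)
print(request.get_method(), request.full_url, request.data)

# To actually send it (requires the server from the previous step to be running):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```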

Then, for example, start Blenderbot2 in a different terminal tab:

python -m parlai interactive --model-file zoo:blenderbot2/blenderbot2_3B/model --search_server 0.0.0.0:8080

Colab

There is a Jupyter notebook for Colab. Just run it. Some Colab instances run out of memory, while others do not.

Testing the server:

A server must already be running, started with serve on the same host and port. This command creates a parlai.agents.rag.retrieve_api.SearchEngineRetriever, then tries to connect to the server, send a query, and parse the answer.

python search_server.py test_server --host 0.0.0.0:8080
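
For illustration, the connect, query, and parse round trip can be sketched without ParlAI by pointing a small client at a stub server. Everything here is an assumption made for the sketch: StubSearchHandler and smoke_test are hypothetical names, and the JSON reply shape (a "response" list of objects with a "url" key) is invented for the stub, not taken from the real server or from SearchEngineRetriever.

```python
import json
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubSearchHandler(BaseHTTPRequestHandler):
    """Stub that answers form-encoded POSTs; the JSON reply shape is an assumption."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        form = urllib.parse.parse_qs(self.rfile.read(length).decode("utf-8"))
        query = form["q"][0]
        n = int(form.get("n", ["1"])[0])
        results = [{"url": f"http://example.com/{query}/{i}"} for i in range(n)]
        payload = json.dumps({"response": results}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the smoke test quiet


def smoke_test(query, n):
    """Start the stub, send one query, and return the parsed answer."""
    server = HTTPServer(("127.0.0.1", 0), StubSearchHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        body = urllib.parse.urlencode({"q": query, "n": n}).encode("utf-8")
        request = urllib.request.Request(
            f"http://{host}:{port}", data=body, method="POST"
        )
        # An empty ProxyHandler makes sure the local request bypasses any proxy.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request, timeout=5) as response:
            return json.loads(response.read().decode("utf-8"))
    finally:
        server.shutdown()
        server.server_close()


answer = smoke_test("baseball", 2)
print(answer)
```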

Testing the parser:

python search_server.py test_parser www.some_url_of_your_choice.com/

Owner

Jules Gagnon-Marchand
