Question answering on russian with XLMRobertaLarge as a service

Overview

QA Roberta Ru SaaS

Question answering on russian with XLMRobertaLarge as a service. Thanks for the model to Alexander Kaigorodov.

Stack

  • Flask
  • Gunicorn

Build image

sudo docker build . --tag qa-roberta-ru-saas

Run on CPU and predict

sudo docker run --rm -p 8080:8080 --name qa-roberta-ru-saas qa-roberta-ru-saas

curl -H "Content-Type: application/json" --data @tests/app/data/test_input.json 0.0.0.0:8080/predict

Run on GPU

Change device to cuda:0 in config before docker build:

device: cuda:0

After build:

sudo docker run --rm --gpus 0 -p 8080:8080 --name qa-roberta-ru-saas qa-roberta-ru-saas

To run with restart:

sudo docker run --gpus 0 -p 8080:8080 --restart always --name qa-roberta-ru-saas qa-roberta-ru-saas

To stop it later:

docker update --restart unless-stopped qa-roberta-ru-saas

To run tests:

pytest tests/

To run app without docker container

PYTHONPATH=. python app/app_main.py

TODO:

  • GPU/CPU support
  • Support of context longer than 512 bpe
  • Predict on long context with sliding window
Owner
Gladkikh Prohor
Data Science Team Lead at Sber. Master's degree at Moscow Institute of Physics and Technology in 2015.
Gladkikh Prohor
Returns unicode slugs

Python Slugify A Python slugify application that handles unicode. Overview Best attempt to create slugs from unicode strings while keeping it DRY. Not

Val Neekman 1.3k Jan 04, 2023
Map Reduce Wordcount in Python using gRPC

This project is implemented in Python using gRPC. The input files are given in .txt format and the word count operation is performed.

Divija 4 Dec 05, 2022
Amazing GitHub Template - Sane defaults for your next project!

🚀 Useful README.md, LICENSE, CONTRIBUTING.md, CODE_OF_CONDUCT.md, SECURITY.md, GitHub Issues and Pull Requests and Actions templates to jumpstart your projects.

276 Jan 01, 2023
一款高性能敏感词(非法词/脏字)检测过滤组件,附带繁体简体互换,支持全角半角互换,汉字转拼音,模糊搜索等功能。

一款高性能非法词(敏感词)检测组件,附带繁体简体互换,支持全角半角互换,获取拼音首字母,获取拼音字母,拼音模糊搜索等功能。

ToolGood 3.6k Jan 07, 2023
An implementation of figlet written in Python

All of the documentation and the majority of the work done was by Christopher Jones ([emai

Peter Waller 1.1k Jan 02, 2023
An experimental Fang Song style Chinese font generated with skeleton-tracing and pix2pix

An experimental Fang Song style Chinese font generated with skeleton-tracing and pix2pix, with glyphs based on cwTeXFangSong. The font is optimised fo

Lingdong Huang 98 Jan 07, 2023
Hspell, the free Hebrew spellchecker and morphology engine.

Hspell, the free Hebrew spellchecker and morphology engine.

16 Sep 15, 2022
A program that looks through entered text and replaces certain commands with mathematical symbols

TextToSymbolConverter A program that looks through entered text and replaces certain commands with mathematical symbols Example: Syntax: Enter text in

1 Jan 02, 2022
A Python3 script that simulates the user typing a text on their keyboard.

A Python3 script that simulates the user typing a text on their keyboard. (control the speed, randomness, rate of typos and more!)

Jose Gracia Berenguer 3 Feb 22, 2022
A production-ready pipeline for text mining and subject indexing

A production-ready pipeline for text mining and subject indexing

UF Open Source Club 12 Nov 06, 2022
Text to ASCII and ASCII to text

Text2ASCII Description This python script (converter.py) contains two functions: encode() is used to return a list of Integer, one item per character

4 Jan 22, 2022
Widevine KEY Extractor in Python

Widevine Client 3 This was originally written by T3rry7f. This repo is slightly modified version of his repo. This only works on standard Windows! Usa

Vank0n (SJJeon) 68 Dec 29, 2022
Python library for creating PEG parsers

PyParsing -- A Python Parsing Module Introduction The pyparsing module is an alternative approach to creating and executing simple grammars, vs. the t

Pyparsing 1.7k Dec 27, 2022
Goblin-sim - Procedural fantasy world generator

goblin-sim This project is an attempt to create a procedural goblin fantasy worl

3 May 18, 2022
This is a text summarizing tool written in Python

Summarize Written by: Ling Li Ya This is a text summarizing tool written in Python. User Guide Some things to note: The application is accessible here

Marcus Lee 2 Feb 18, 2022
Adventura is an open source Python Text Adventure Engine

Adventura Adventura is an open source Python Text Adventure Engine, Not yet uplo

5 Oct 02, 2022
AnnIE - Annotation Platform, tool for open information extraction annotations using text files.

AnnIE - Annotation Platform, tool for open information extraction annotations using text files.

Niklas 29 Dec 20, 2022
Python Q&A for Network Engineers

Q & A I am often asked questions about how to solve this or that problem, and I decided to post these questions and solutions here, in case it is also

Natasha Samoylenko 30 Nov 15, 2022
An anthology of a variety of tools for the Persian language in Python

An anthology of a variety of tools for the Persian language in Python

Persian Tools 106 Nov 08, 2022
Umamusume story patcher with python

umamusume-story-patcher How to use Go to your umamusume folder, usually C:\Users\user\AppData\LocalLow\Cygames\umamusume Make a mods folder and clon

8 May 07, 2022