pyMorfologik MorfologikpyMorfologik - Python binding for Morfologik.

Overview

Python binding for Morfologik

Morfologik is Polish morphological analyzer. For more information see http://github.com/morfologik/morfologik-stemming/ and http://http://www.morfologik.blogspot.com/

Requirements

This binding works with Python 2 and Python 3.

Installation

Install it from pip

pip install pyMorfologik

or directly from github

git clone https://github.com/dmirecki/pyMorfologik.git

Usage

Now, only simple stems are supported:

>>> from pymorfologik import Morfologik
>>> from pymorfologik.parsing import ListParser
>>>
>>> parser = ListParser()
>>> stemmer = Morfologik()
>>> stemmer.stem(['Ala ma kota'], parser)
[(u'Ala',
  {u'Al': [u'subst:sg:acc:m1+subst:sg:gen:m1'],
   u'Ala': [u'subst:sg:nom:f'],
   u'Alo': [u'subst:sg:acc:m1+subst:sg:gen:m1']}),
 (u'ma',
  {u'mieć': [u'verb:fin:sg:ter:imperf:refl.nonrefl'],
   u'mój': [u'adj:sg:nom.voc:f:pos']}),
 (u'kota', {u'kot': [u'subst:sg:acc:m1'], u'kota': [u'subst:sg:nom:f']})]

Acknowledgements

This repo is based on Morfologik, a great contribution of Marcin Miłowski (http://marcinmilkowski.pl) and Dawid Weiss (http://www.dawidweiss.com).

Contributions

Damian Mirecki

Adrian Bohdanowicz

Owner
Damian Mirecki
Data Scientist @ Domodi
Damian Mirecki
Nystromformer: A Nystrom-based Algorithm for Approximating Self-Attention

Nystromformer: A Nystrom-based Algorithm for Approximating Self-Attention April 6, 2021 We extended segment-means to compute landmarks without requiri

Zhanpeng Zeng 322 Jan 01, 2023
Repositório do trabalho de introdução a NLP

Trabalho da disciplina de BI NLP Repositório do trabalho da disciplina Introdução a Processamento de Linguagem Natural da pós BI-Master da PUC-RIO. Eq

Leonardo Lins 1 Jan 18, 2022
Implementaion of our ACL 2022 paper Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation

Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation This is the implementaion of our paper: Bridging the

hezw.tkcw 20 Dec 12, 2022
Universal End2End Training Platform, including pre-training, classification tasks, machine translation, and etc.

背景 安装教程 快速上手 (一)预训练模型 (二)机器翻译 (三)文本分类 TenTrans 进阶 1. 多语言机器翻译 2. 跨语言预训练 背景 TrenTrans是一个统一的端到端的多语言多任务预训练平台,支持多种预训练方式,以及序列生成和自然语言理解任务。 安装教程 git clone git

Tencent Minority-Mandarin Translation Team 42 Dec 20, 2022
A versatile token stream for handwritten parsers.

Writing recursive-descent parsers by hand can be quite elegant but it's often a bit more verbose than expected, especially when it comes to handling indentation and reporting proper syntax errors. Th

Valentin Berlier 8 Nov 30, 2022
pysentimiento: A Python toolkit for Sentiment Analysis and Social NLP tasks

A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks

297 Dec 29, 2022
Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recognition

Wav2Vec2 STT Python Beta Software Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 mode

David Zurow 22 Dec 29, 2022
a test times augmentation toolkit based on paddle2.0.

Patta Image Test Time Augmentation with Paddle2.0! Input | # input batch of images / / /|\ \ \ # apply

AgentMaker 110 Dec 03, 2022
Collection of scripts to pinpoint obfuscated code

Obfuscation Detection (v1.0) Author: Tim Blazytko Automatically detect control-flow flattening and other state machines Description: Scripts and binar

Tim Blazytko 230 Nov 26, 2022
Making text a first-class citizen in TensorFlow.

TensorFlow Text - Text processing in Tensorflow IMPORTANT: When installing TF Text with pip install, please note the version of TensorFlow you are run

1k Dec 26, 2022
A telegram bot to translate 100+ Languages

🔥 GOOGLE TRANSLATER 🔥 The owner would not be responsible for any kind of bans due to the bot. • ⚡ INSTALLING ⚡ • • 🔰 Deploy To Railway 🔰 • • ✅ OFF

Aɴᴋɪᴛ Kᴜᴍᴀʀ 5 Dec 20, 2021
DiY Oxygen Concentrator based on the OxiKit

M19O2 DiY Oxygen Concentrator based on / inspired by the OxiKit, OpenOx, Marut, RepRap and Project Apollo platforms. About Read about the project on H

Maker's Asylum 62 Dec 22, 2022
This repo is to provide a list of literature regarding Deep Learning on Graphs for NLP

This repo is to provide a list of literature regarding Deep Learning on Graphs for NLP

Graph4AI 230 Nov 22, 2022
Part of Speech Tagging using Hidden Markov Model (HMM) POS Tagger and Brill Tagger

Part of Speech Tagging using Hidden Markov Model (HMM) POS Tagger and Brill Tagger In this project, our aim is to tune, compare, and contrast the perf

Chirag Daryani 0 Dec 25, 2021
This github repo is for Neurips 2021 paper, NORESQA A Framework for Speech Quality Assessment using Non-Matching References.

NORESQA: Speech Quality Assessment using Non-Matching References This is a Pytorch implementation for using NORESQA. It contains minimal code to predi

Meta Research 36 Dec 08, 2022
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge This is an implementation of the paper,

Mutian He 19 Oct 14, 2022
An A-SOUL Text Generator Based on CPM-Distill.

ASOUL-Generator-Backend 本项目为 https://asoul.infedg.xyz/ 的后端。 模型为基于 CPM-Distill 的 transformers 转化版本 CPM-Generate-distill 训练而成。

infinityedge 46 Dec 11, 2022
AIDynamicTextReader - A simple dynamic text reader based on Artificial intelligence

AI Dynamic Text Reader: This is a simple dynamic text reader based on Artificial

Md. Rakibul Islam 1 Jan 18, 2022
TPlinker for NER 中文/英文命名实体识别

本项目是参考 TPLinker 中HandshakingTagging思想,将TPLinker由原来的关系抽取(RE)模型修改为命名实体识别(NER)模型。

GodK 113 Dec 28, 2022