trstop

Turkish Stop Words (Türkçe Dolgu Sözcükleri)

This repository contains the Turkish stop words that appear among the 10,000 most frequent Turkish words. To make it possible to test new candidate words in the future, it also includes a small Python script and a list of the 10,000 most frequent words.

Some Turkish stop words are also listed at https://github.com/sgsinclair/trombone/blob/master/src/main/resources/org/voyanttools/trombone/keywords/stop.tr.turkish-lucene.txt. However, some of the words in that list are not among the 10,000 most frequent words.
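The frequency criterion described above can be illustrated with a minimal sketch. It assumes the word list is available as a Python list ordered from most to least frequent. The function name `is_frequent_enough` and the sample data are hypothetical and are not taken from the repository's script.

```python
def is_frequent_enough(candidate, ranked_words, limit=10_000):
    """Return True if candidate is among the first `limit` entries.

    ranked_words is assumed to be ordered from most to least frequent.
    """
    return candidate in set(ranked_words[:limit])


# Tiny illustrative list, not the repository's real data.
sample_ranked = ["ve", "bir", "bu", "da", "için"]

print(is_frequent_enough("bu", sample_ranked, limit=3))    # True
print(is_frequent_enough("için", sample_ranked, limit=3))  # False
```

A candidate that fails this check would, under the criterion above, not be added to the stop word list.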

To use the module:

import trstop

word = "ve"  # example input: the word to check, as a string
print(trstop.is_stop_word(word))
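A typical use is filtering tokens before further processing. The sketch below assumes that `trstop.is_stop_word` accepts a single word as a string and returns a truthy value for stop words, which is inferred from the usage line above rather than confirmed. So that the example runs without the package installed, it uses a stand-in predicate built from a small hypothetical word set. `remove_stop_words` is likewise a hypothetical helper.

```python
def remove_stop_words(tokens, is_stop_word):
    """Keep only the tokens for which is_stop_word returns a falsy value."""
    return [token for token in tokens if not is_stop_word(token)]


# Stand-in for trstop.is_stop_word; the word set is illustrative only.
SAMPLE_STOP_WORDS = {"ve", "bir", "bu"}


def sample_is_stop_word(word):
    return word in SAMPLE_STOP_WORDS


tokens = ["bu", "kitap", "ve", "kalem"]
filtered = remove_stop_words(tokens, sample_is_stop_word)
print(filtered)  # ['kitap', 'kalem']
```

With the package installed, `trstop.is_stop_word` could be passed in place of `sample_is_stop_word`, assuming it behaves as described above.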

Contributors:

Ahmet Aksoy
Toprak Öztürk

Stop words (Turkish: "dolgu sözcükleri") are frequently used words whose removal does not significantly change the meaning of the sentence they are taken from.

I use "dolgu sözcükleri" as the Turkish equivalent of the term "stop words". If there is a better option, I am open to changing it. The script "turkce-stop-words-dict.py" included in the repository can be used to check the usage frequency of new words we may want to add to the list in the future.


Last updated: 29.06.2018
