Develop open-source Python Arabic NLP libraries that the Arab world will easily use in all Natural Language Processing applications

Overview

Yarub_library

#The Problem

اللغة العربية تعد من اكثر اللغات انتشارا و استخداما و تتميز لغة الضاد بثراء رصيدها من الكلمات والصيغ ، وهي لغة متميزة من الناحية الصوتية ، فقد اشتملت على جميع الأصوات التي اشتملت عليها اللغات السامية الأخرى . كما تتميز بالمرونة حيث تستوعب جميع الألفاظ المشتقة والمترادفة وتضع لكل مقام مقال لها

ادركنا اهمية اللغة العربية و مكانتها بين شعوب الشرق الاوسط و العالم, و نسعى فى ادراج اللغة العربية ضمن اللغات التى يتيسر استخدامها فى تطبيقات الذكاء الاصطناعى و معالجة اللغات الطبيعية للبشر

In this Omdena project, our goal was to develop open-source Python Arabic NLP libraries that the Arab world will easily use in all Natural Language Processing applications like Morphological analysis, Named Entity Recognition, Sentiment Analysis, Word Embedding, Dialect Identification, Part of speech, and so on the training dataset. This article contains interesting code and could be beneficial for whatever your level of experience, but for beginners, it is a great start-up in data collection using web scraping with referral links to official documentation pages for every mentioned library.

Owner
BADER ALABDAN
Artificial Intelligence Scientist Researcher
BADER ALABDAN
Pytorch-Named-Entity-Recognition-with-BERT

BERT NER Use google BERT to do CoNLL-2003 NER ! Train model using Python and Inference using C++ ALBERT-TF2.0 BERT-NER-TENSORFLOW-2.0 BERT-SQuAD Requi

Kamal Raj 1.1k Dec 25, 2022
Applying "Load What You Need: Smaller Versions of Multilingual BERT" to LaBSE

smaller-LaBSE LaBSE(Language-agnostic BERT Sentence Embedding) is a very good method to get sentence embeddings across languages. But it is hard to fi

Jeong Ukjae 13 Sep 02, 2022
Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

TextBlob: Simplified Text Processing Homepage: https://textblob.readthedocs.io/ TextBlob is a Python (2 and 3) library for processing textual data. It

Steven Loria 8.4k Dec 26, 2022
Unsupervised intent recognition

INTENT author: steeve LAQUITAINE description: deployment pattern: currently batch only Setup & run git clone https://github.com/slq0/intent.git bash

sl 1 Apr 08, 2022
Persian-lexicon - A lexicon of 70K unique Persian (Farsi) words

Persian Lexicon This repo uses Uppsala Persian Corpus (UPC) to construct a lexic

Saman Vaisipour 7 Apr 01, 2022
iBOT: Image BERT Pre-Training with Online Tokenizer

Image BERT Pre-Training with iBOT Official PyTorch implementation and pretrained models for paper iBOT: Image BERT Pre-Training with Online Tokenizer.

Bytedance Inc. 435 Jan 06, 2023
Chinese NER(Named Entity Recognition) using BERT(Softmax, CRF, Span)

Chinese NER(Named Entity Recognition) using BERT(Softmax, CRF, Span)

Weitang Liu 1.6k Jan 03, 2023
NVDA, the free and open source Screen Reader for Microsoft Windows

NVDA NVDA (NonVisual Desktop Access) is a free, open source screen reader for Microsoft Windows. It is developed by NV Access in collaboration with a

NV Access 1.6k Jan 07, 2023
基于pytorch_rnn的古诗词生成

pytorch_peot_rnn 基于pytorch_rnn的古诗词生成 说明 config.py里面含有训练、测试、预测的参数,更改后运行: python main.py 预测结果 if config.do_predict: result = trainer.generate('丽日照残春')

西西嘛呦 3 May 26, 2022
使用Mask LM预训练任务来预训练Bert模型。训练垂直领域语料的模型表征,提升下游任务的表现。

Pretrain_Bert_with_MaskLM Info 使用Mask LM预训练任务来预训练Bert模型。 基于pytorch框架,训练关于垂直领域语料的预训练语言模型,目的是提升下游任务的表现。 Pretraining Task Mask Language Model,简称Mask LM,即

Desmond Ng 24 Dec 10, 2022
Develop open-source Python Arabic NLP libraries that the Arab world will easily use in all Natural Language Processing applications

Develop open-source Python Arabic NLP libraries that the Arab world will easily use in all Natural Language Processing applications

BADER ALABDAN 2 Oct 22, 2022
Creating a Feed of MISP Events from ThreatFox (by abuse.ch)

ThreatFox2Misp Creating a Feed of MISP Events from ThreatFox (by abuse.ch) What will it do? This will fetch IOCs from ThreatFox by Abuse.ch, convert t

17 Nov 22, 2022
Telegram AI chat bot written in Python using Pyrogram

Aurora_Al Just another Telegram AI chat bot written in Python using Pyrogram. A public running instance can be found on telegram as @AuroraAl. Require

♗CσNϙUҽRσR_MҽSƙEƚҽҽR 1 Oct 31, 2021
A BERT-based reverse dictionary of Korean proverbs

Wisdomify A BERT-based reverse-dictionary of Korean proverbs. 김유빈 : 모델링 / 데이터 수집 / 프로젝트 설계 / back-end 김종윤 : 데이터 수집 / 프로젝트 설계 / front-end / back-end 임용

94 Dec 08, 2022
This is the writeup of all the challenges from Advent-of-cyber-2019 of TryHackMe

Advent-of-cyber-2019-writeup This is the writeup of all the challenges from Advent-of-cyber-2019 of TryHackMe https://tryhackme.com/shivam007/badges/c

shivam danawale 5 Jul 17, 2022
Model parallel transformers in JAX and Haiku

Table of contents Mesh Transformer JAX Updates Pretrained Models GPT-J-6B Links Acknowledgments License Model Details Zero-Shot Evaluations Architectu

Ben Wang 4.9k Jan 04, 2023
A method for cleaning and classifying text using transformers.

NLP Translation and Classification The repository contains a method for classifying and cleaning text using NLP transformers. Overview The input data

Ray Chamidullin 0 Nov 15, 2022
The source code of HeCo

HeCo This repo is for source code of KDD 2021 paper "Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning". Paper Link: htt

Nian Liu 106 Dec 27, 2022
precise iris segmentation

PI-DECODER Introduction PI-DECODER, a decoder structure designed for Precise Iris Segmentation and Location. The decoder structure is shown below: Ple

8 Aug 08, 2022
A telegram bot to translate 100+ Languages

🔥 GOOGLE TRANSLATER 🔥 The owner would not be responsible for any kind of bans due to the bot. • ⚡ INSTALLING ⚡ • • 🔰 Deploy To Railway 🔰 • • ✅ OFF

Aɴᴋɪᴛ Kᴜᴍᴀʀ 5 Dec 20, 2021