A Practitioner's Guide to Natural Language Processing

Last update: Jan 03, 2023

Overview

Text Analytics with Python - 2nd Edition

A Practitioner's Guide to Natural Language Processing

Text analytics can be overwhelming and frustrating at times, given the unstructured, noisy nature of textual data and the sheer volume of information available. "Text Analytics with Python" is a book packed with 674 pages of useful information based on techniques, algorithms, experiences and various lessons learnt over time in analyzing text data. This repository contains the datasets and code used in the book. I will also be adding various notebooks and bonus content here from time to time. Keep watching this space!

Get the book

About the book

Leverage Natural Language Processing (NLP) in Python and learn how to set up your own robust environment for performing text analytics. This second edition has gone through a major revamp and introduces several significant changes and new topics based on the recent trends in NLP.

You’ll see how to use the latest state-of-the-art frameworks in NLP, coupled with machine learning and deep learning models for supervised sentiment analysis powered by Python to solve actual case studies. Start by reviewing Python for NLP fundamentals on strings and text data and move on to engineering representation methods for text data, including both traditional statistical models and newer deep learning-based embedding models. Improved techniques and new methods around parsing and processing text are discussed as well.
Text summarization and topic models have been overhauled so the book showcases how to build, tune, and interpret topic models in the context of an interesting dataset on NIPS conference papers. Additionally, the book covers text similarity techniques with a real-world example of movie recommenders, along with sentiment analysis using supervised and unsupervised techniques. There is also a chapter dedicated to semantic analysis where you’ll see how to build your own named entity recognition (NER) system from scratch. While the overall structure of the book remains the same, the entire code base, modules, and chapters have been updated to the latest Python 3.x release.
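To give a flavor of the text similarity idea behind a movie recommender, here is a minimal sketch using only the Python standard library. It is not the book's implementation: the function names `cosine_similarity` and `recommend`, the whitespace tokenization, and the three placeholder movie descriptions are all assumptions made up for illustration.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using raw term counts."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Dot product over the terms the two texts share.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, descriptions, top_n=2):
    """Rank the other titles by description similarity to `title`."""
    query = descriptions[title]
    scores = [
        (other, cosine_similarity(query, text))
        for other, text in descriptions.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Placeholder data, invented purely for this example.
descriptions = {
    "Movie A": "space crew explores a distant planet",
    "Movie B": "a crew explores deep space",
    "Movie C": "romantic comedy set in paris",
}

for other, score in recommend("Movie A", descriptions):
    print(f"{other}: {score:.2f}")
```

A practical recommender would normally weight terms (for example with TF-IDF) and clean the text first, rather than compare raw counts as this sketch does.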

Edition: 2nd
Pages: 674
Language: English
Book Title: Text Analytics with Python
Book Subtitle: A Practitioner's Guide to Natural Language Processing
Publisher: Apress (a part of Springer)
Print ISBN: 978-1-4842-4353-4
Online ISBN: 978-1-4842-4354-1
DOI: 10.1007/978-1-4842-4354-1
Copyright: Dipanjan Sarkar

With this book you will:

Understand NLP and text syntax, semantics and structure
Discover text cleaning and feature engineering strategies
Learn and implement text classification and text clustering
Understand and build text summarization and topic models
Learn about the promise of deep learning and transfer learning for NLP
Implement hands-on examples based on Python and several popular open source libraries in NLP and text analytics, such as the natural language toolkit (nltk), gensim, scikit-learn, spaCy, keras and tensorflow
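
As a rough illustration of the text cleaning and feature engineering steps listed above, the sketch below normalizes a few sentences and turns them into bag-of-words count vectors. It uses only the standard library rather than nltk or scikit-learn, and the helper names (`clean_text`, `bag_of_words`), the tiny stop-word set, and the sample sentences are assumptions for this example, not code from the book.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


def clean_text(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [tok for tok in tokens if tok not in STOP_WORDS]


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of raw documents."""
    tokenized = [clean_text(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


docs = ["The sky is blue.", "The blue sky and the bright sun!"]
vocab, vectors = bag_of_words(docs)
print(vocab)
for vector in vectors:
    print(vector)
```

Each vector has one entry per vocabulary term, so the resulting rows can be fed to a classifier or clustering algorithm as features.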


Owner

Dipanjan (DJ) Sarkar

hashily is a Python module that provides a variety of text decoding and encoding operations.

Python library for Serbian Natural language processing (NLP)

Switch spaces for knowledge graph embeddings

SEJE is a prototype for the paper Learning Text-Image Joint Embedding for Efficient Cross-Modal Retrieval with Deep Feature Engineering.

Automated question generation and question answering from Turkish texts using text-to-text transformers

A single model that parses Universal Dependencies across 75 languages.

PyTorch implementation of "data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language" from Meta AI

End-to-end text to speech system using gruut and onnx. There are 40 voices available across 8 languages.

PyTorch Implementation of "Bridging Pre-trained Language Models and Hand-crafted Features for Unsupervised POS Tagging" (Findings of ACL 2022)

Soongsil University School of Computer Science major capstone design project

Data loaders and abstractions for text and NLP

PyABSA - Open & Efficient Framework for Aspect-based Sentiment Analysis

Revisiting Pre-trained Models for Chinese Natural Language Processing (Findings of EMNLP 2020)

Implementing SimCSE (paper, official repository) using TensorFlow 2 and KR-BERT.

jiant is an NLP toolkit

Recognition of 38 speech commands in Russian. Based on Yandex Cup 2021 ML Challenge: ASR

A multi-voice TTS system trained with an emphasis on quality

2021 Haihua AI Challenge, Chinese reading comprehension, technical track, third place

A paper list for aspect based sentiment analysis.

A program based on a GRU network for judging sentences