Conversational Text Analysis using various NLP techniques

Overview

PyConverse



Installation

pip install pyconverse

Usage

Please try this notebook, which demonstrates the core functionality: basic usage notebook

Introduction

Conversation analytics plays an increasingly important role in shaping great customer experiences across industries such as finance and contact centres, primarily by helping to gain a deeper understanding of customers and to serve their needs better. This library, PyConverse, is an attempt to provide tools and methods that can be used to understand conversations from multiple perspectives using various NLP techniques.

Why PyConverse?

I have been doing what can be called conversational text NLP, primarily with contact-centre data from domains such as financial services, banking, and insurance, for the past year or so, and I have not come across any interesting open-source tools that help in understanding conversational text. So I decided to create this library to provide tools and methods for analysing calls, answering the important questions, and computing the metrics that people usually want from conversations in contact-centre data analysis settings.

Where can I use PyConverse?

The primary use case is contact-centre call analytics, but most of the tools that Converse provides can be used elsewhere as well.

There are many insights hidden in every single call; Converse enables you to extract them and compute various KPIs around operational efficiency, agent effectiveness, and customer experience.

If you are looking to answer questions like these:

  1. What was the overall sentiment exhibited by the speakers during the conversation?
  2. Were there periods of dead air (silence) between the agent and the customer? If so, how much?
  3. Was the agent empathetic towards the customer?
  4. What was the average agent response time / average hold time?
  5. What was being said on the calls?

and more... PyConverse might be of some help (a rough sketch of one such metric, dead air, follows below).
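
For illustration, here is a minimal sketch of how a "dead air" metric could be computed from a transcript that carries per-utterance start/end timestamps. This is not a PyConverse API; the column names and the 1-second threshold are assumptions made for the example:

    import pandas as pd

    # Toy transcript: one row per utterance, with start/end times in seconds.
    transcript = pd.DataFrame({
        "speaker": ["agent", "customer", "agent"],
        "start": [0.0, 6.5, 15.0],
        "end": [5.0, 12.0, 20.0],
    })

    # Gap between the end of one utterance and the start of the next.
    gaps = transcript["start"].shift(-1) - transcript["end"]
    dead_air = gaps[gaps > 1.0]  # assumed threshold: gaps longer than 1s count as silence
    print(f"total dead air: {dead_air.sum():.1f}s across {len(dead_air)} gaps")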

What can PyConverse do?

At the moment, PyConverse can do a few things that broadly fall into these categories (a brief usage sketch follows the list):

  1. Emotion identification
  2. Empathetic statement identification
  3. Call Segmentation
  4. Topic identification from call segments
  5. Compute various types of speaker attributes:
    1. Linguistic attributes such as word counts, number of words per utterance, negations, etc.
    2. Identify periods of silence & interruptions.
    3. Question identification.
    4. Backchannel identification.
  6. Assess the overall nature of the speaker via linguistic attributes and tell whether the speaker is:
    1. Talkative, verbally fluent
    2. Informal/personal/social
    3. Goal-oriented, forward/future-looking, or focused on the past
    4. Showing inhibitions
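
As a quick orientation, here is a minimal usage sketch. The class names Callyzer and SpeakerStats come from this project, but the import path and the constructor arguments shown here (a transcript DataFrame plus column names) are assumptions made for illustration; please refer to the basic usage notebook for the actual API:

    import pandas as pd
    from pyconverse import Callyzer, SpeakerStats  # class names from this repo; exact usage below is assumed

    # Toy transcript: one row per utterance, with speaker labels.
    transcript = pd.DataFrame({
        "speaker": ["agent", "customer", "agent"],
        "utterance": [
            "Thank you for calling, how can I help you today?",
            "Hi, I was charged twice on my last bill.",
            "I'm sorry to hear that, let me take a look right away.",
        ],
    })

    # Hypothetical calls: wrap the transcript, then compute per-speaker attributes
    # (word counts, questions, backchannels, silence/interruptions, etc.).
    call = Callyzer(data=transcript, speaker="speaker", utterance="utterance")
    stats = SpeakerStats(data=transcript, speaker="speaker", utterance="utterance")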

What Next?

  1. Improve documentation.
  2. Add more use case notebooks/examples.
  3. Improve some of the functionalities and make it more streamlined.

Built with:

Transformers, spaCy, PyTorch

Credits:

Note: The backchannel utterance classification method is inspired by Facebook's Unsupervised Topic Segmentation of Meetings with BERT Embeddings paper (arXiv:2106.12978 [cs.LG]).
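
The referenced paper's core idea is to segment a conversation by comparing BERT embeddings of neighbouring utterances. As an illustration of that general approach (not PyConverse's internal implementation), here is a small sketch; the sentence-transformers model name and the similarity threshold are assumptions:

    from sentence_transformers import SentenceTransformer
    from sklearn.metrics.pairwise import cosine_similarity

    utterances = [
        "I'd like to check the status of my claim.",
        "Sure, can I have your claim number please?",
        "It's 48213.",
        "By the way, do you also offer home insurance?",
        "Yes, we do. I can transfer you to that team.",
    ]

    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice
    embeddings = model.encode(utterances)

    # Start a new segment wherever similarity between neighbouring utterances drops.
    segments, current = [], [utterances[0]]
    for i in range(1, len(utterances)):
        sim = cosine_similarity(embeddings[i - 1:i], embeddings[i:i + 1])[0, 0]
        if sim < 0.3:  # assumed threshold: low similarity suggests a topic change
            segments.append(current)
            current = []
        current.append(utterances[i])
    segments.append(current)

    print(segments)  # one list of utterances per detected topic segment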

Comments
  • SemanticTextSegmentation NaN With All Stop Words

    When running semantic text segmentation, I found that if an input utterance consists entirely of stop words (e.g. "Bye. Uh huh. Yeah."), SemanticTextSegmentation._get_similarity fails with ValueError: Input contains NaN.

    I found that adding a check for NaN in both embeddings solves the problem:

    def _get_similarity(self, text1, text2):
        # Split each text into sentences, keeping only multi-word sentences.
        sentence_1 = [i.text.strip()
                      for i in nlp(text1).sents if len(i.text.split(' ')) > 1]
        sentence_2 = [i.text.strip()
                      for i in nlp(text2).sents if len(i.text.split(' ')) > 2]
        embeding_1 = model.encode(sentence_1)
        embeding_2 = model.encode(sentence_2)
        embeding_1 = np.mean(embeding_1, axis=0).reshape(1, -1)
        embeding_2 = np.mean(embeding_2, axis=0).reshape(1, -1)

        # If either side produced no usable sentences (e.g. all stop words),
        # its mean embedding is NaN; treat the pair as maximally similar so
        # the utterance stays in the current segment.
        if np.any(np.isnan(embeding_1)) or np.any(np.isnan(embeding_2)):
            return 1

        sim = cosine_similarity(embeding_1, embeding_2)
        return sim

    I would like someone else to look at this, because I don't want to assume that all-stop-word utterances should belong to the same segment.

    opened by Haowjy 1
  • Updated lru_cache decorator.

    After installing pyconverse on Python 3.7 or below, importing the library fails. In the utils file, the @lru_cache decorator is written in the newer Python 3.8+ style, i.e. @lru_cache without parentheses; on Python 3.7 and below this raises a NoneType error, since those versions require @lru_cache() with parentheses. I made that change, and it does not cause any errors on the newer versions.

    opened by AkashKhamkar 0
  • Error in importing Callyzer, SpeakerStats

    When I try to load the model, it shows the error below. Is the library currently in development mode?

    KeyError: "[E002] Can't find factory for 'tok2vec'. This usually happens when spaCy calls nlp.create_pipe with a component name that's not built in - for example, when constructing the pipeline from a model's meta.json. If you're using a custom component, you can write to Language.factories['tok2vec'] or remove it from the model meta and add it via nlp.add_pipe instead."

    opened by kalpa277 0
Releases(v0.2.0)
  • v0.2.0(Nov 21, 2021)

    First Release of PyConverse library.

    Conversational Transcript Analysis using various NLP techniques.

    1. Emotion identification
    2. Empathetic statement identification
    3. Call Segmentation
    4. Topic identification from call segments
    5. Compute various types of Speaker attributes:
      • linguistic attributes such as word counts, number of words per utterance, negations, etc.
      • Identify periods of silence & interruptions.
      • Question identification
      • Backchannel identification
    6. Assess the overall nature of the speaker via linguistic attributes and tell if the Speaker is:
      • Talkative, verbally fluent
      • Informal/Personal/social
      • Goal-oriented, forward/future-looking, or focused on the past
      • Identify inhibitions
Owner
Rita Anjana
ML engineer