Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health.

Overview

Welcome to Healthsea โœจ

Create better access to health with spaCy.

Healthsea is a pipeline for analyzing user reviews to supplement products by extracting their effects on health.

Learn more about Healthsea in our blog post!

๐Ÿ’‰ Creating better access to health

Healthsea aims to analyze user-written reviews of supplements in relation to their effects on health. Based on this analysis, we try to provide product recommendations. For many people, supplements are an addition to maintaining health and achieving personal goals. Due to their rising popularity, consumers have increasing access to a variety of products.

However, it's likely that most of the products on the market are redundant or produced in a "quantity over quality" fashion to maximize profit. The resulting white noise of products makes it hard to find the right supplements.

Healthsea automizes the analysis and provides information in a more digestible way. โœจ


๐ŸŸข Requirements

To run this project you need:

spacy>=3.2.0
benepar>=0.2.0
torch>=1.6.0
spacy-transformers>=1.1.2

You can install them in the project folder via spacy project run install

๐Ÿ“– Documentation

Documentation
๐Ÿงญ Usage How to use the pipeline
โš™๏ธ Pipeline Learn more about the architecture of the pipeline
๐Ÿช spaCy project Introduction to the spaCy project
โœจ Demos Introduction to the Healthsea demos

๐Ÿงญ Usage

The pipeline processes reviews to supplements and returns health effects for every found health aspect.

You can either train the pipeline yourself with the provided datasets in the spaCy project or directly download the trained Healthsea pipeline from Huggingface via pip install https://huggingface.co/explosion/en_healthsea/resolve/main/en_healthsea-any-py3-none-any.whl

import spacy

nlp = spacy.load("en_healthsea")
doc = nlp("This is great for joint pain.")

# Clause Segmentation & Blinding
print(doc._.clauses)

>    {"split_indices": [0, 7],
>    "has_ent": true,
>    "ent_indices": [4, 6],
>    "blinder": "_CONDITION_",
>    "ent_name": "joint pain",
>    "cats": {
>        "POSITIVE": 0.9824668169021606,
>        "NEUTRAL": 0.017364952713251114,
>        "NEGATIVE": 0.00002889777533710003,
>        "ANAMNESIS": 0.0001394189748680219
>    },
>    "prediction_text": ["This", "is", "great", "for", "_CONDITION_", "!"]}

# Aggregated results
print(doc._.health_effects)

>    {"joint_pain": {
>        "effects": ["POSITIVE"],
>        "effect": "POSITIVE",
>        "label": "CONDITION",
>        "text": "joint pain"
>    }}


โš™๏ธ Pipeline

The pipeline consists of the following components:

pipeline = [sentencizer, tok2vec, ner, benepar, segmentation, clausecat, aggregation]

It uses Named Entity Recognition to detect two types of entities Condition and Benefit.

Condition entities are defined as health aspects that are improved by decreasing them. They include diseases, symptoms and general health problems (e.g. pain in back). Benefit entities on the other hand, are desired states of health (muscle recovery, glowing skin) that improve by increasing them.

The pipeline uses a modified model that performs Clause Segmentation based on the benepar parser, Entity Blinding and Text Classification. It predicts four exclusive effects: Positive, Negative, Neutral, and Anamnesis.


๐Ÿช spaCy project

The project folder contains a spaCy project with all the training data and workflows.

Use spacy project run inside the project folder to get an overview of all commands and assets. For more detailed documentation, visit the project folders readme.

Use spacy project run install to install dependencies needed for the pipeline.

โœจ Demo

Healthsea Demo

A demo for exploring the results of Healthsea on real data can be found at Hugging Face Spaces.

Healthsea Pipeline

A demo for exploring the Healthsea pipeline with its individual processing steps can be found at Hugging Face Spaces.

Owner
Explosion
A software company specializing in developer tools for Artificial Intelligence and Natural Language Processing
Explosion
Ukrainian TTS (text-to-speech) using Coqui TTS

title emoji colorFrom colorTo sdk app_file pinned Ukrainian TTS ๐Ÿธ green green gradio app.py false Ukrainian TTS ๐Ÿ“ข ๐Ÿค– Ukrainian TTS (text-to-speech)

Yurii Paniv 85 Dec 26, 2022
This is my reading list for my PhD in AI, NLP, Deep Learning and more.

This is my reading list for my PhD in AI, NLP, Deep Learning and more.

Zhong Peixiang 156 Dec 21, 2022
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

Texar is a toolkit aiming to support a broad set of machine learning, especially natural language processing and text generation tasks. Texar provides

ASYML 2.3k Jan 07, 2023
ไธญๆ–‡ๅŒป็–—ไฟกๆฏๅค„็†ๅŸบๅ‡†CBLUE: A Chinese Biomedical LanguageUnderstanding Evaluation Benchmark

English | ไธญๆ–‡่ฏดๆ˜Ž CBLUE AI (Artificial Intelligence) is playing an indispensabe role in the biomedical field, helping improve medical technology. For fur

452 Dec 30, 2022
This project uses word frequency and Term Frequency-Inverse Document Frequency to summarize a text.

Text Summarizer This project uses word frequency and Term Frequency-Inverse Document Frequency to summarize a text. Team Members This mini-project was

1 Nov 16, 2021
Utilize Korean BERT model in sentence-transformers library

ko-sentence-transformers ์ด ํ”„๋กœ์ ํŠธ๋Š” KoBERT ๋ชจ๋ธ์„ sentence-transformers ์—์„œ ๋ณด๋‹ค ์‰ฝ๊ฒŒ ์‚ฌ์šฉํ•˜๊ธฐ ์œ„ํ•ด ๋งŒ๋“ค์–ด์กŒ์Šต๋‹ˆ๋‹ค. Ko-Sentence-BERT-SKTBERT ํ”„๋กœ์ ํŠธ์—์„œ๋Š” KoBERT ๋ชจ๋ธ์„ sentence-trans

Junghyun 40 Dec 20, 2022
Shared, streaming Python dict

UltraDict Sychronized, streaming Python dictionary that uses shared memory as a backend Warning: This is an early hack. There are only few unit tests

Ronny Rentner 192 Dec 23, 2022
Korean extractive summarization. 2021 AI ํ…์ŠคํŠธ ์š”์•ฝ ์˜จ๋ผ์ธ ํ•ด์ปคํ†ค ํ™”์„ฑ๊ฐˆ๋„๋‹ˆ๊นŒํŒ€ ์ฝ”๋“œ

korean extractive summarization 2021 AI ํ…์ŠคํŠธ ์š”์•ฝ ์˜จ๋ผ์ธ ํ•ด์ปคํ†ค ํ™”์„ฑ๊ฐˆ๋„๋‹ˆ๊นŒํŒ€ ์ฝ”๋“œ Leaderboard Notice Text Summarization with Pretrained Encoders์— ๋‚˜์˜ค๋Š” bertsumext๋ชจ๋ธ(ext

3 Aug 10, 2022
Multilingual Emotion classification using BERT (fine-tuning). Published at the WASSA workshop (ACL2022).

XLM-EMO: Multilingual Emotion Prediction in Social Media Text Abstract Detecting emotion in text allows social and computational scientists to study h

MilaNLP 35 Sep 17, 2022
List of GSoC organisations with number of times they have been selected.

Welcome to GSoC Organisation Frequency And Details ๐Ÿ‘‹ List of GSoC organisations with number of times they have been selected, techonologies, topics,

Shivam Kumar Jha 41 Oct 01, 2022
Black for Python docstrings and reStructuredText (rst).

Style-Doc Style-Doc is Black for Python docstrings and reStructuredText (rst). It can be used to format docstrings (Google docstring format) in Python

Telekom Open Source Software 13 Oct 24, 2022
TaCL: Improve BERT Pre-training with Token-aware Contrastive Learning

TaCL: Improve BERT Pre-training with Token-aware Contrastive Learning

Yixuan Su 26 Oct 17, 2022
BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

Table of contents Introduction Using BARTpho with fairseq Using BARTpho with transformers Notes BARTpho: Pre-trained Sequence-to-Sequence Models for V

VinAI Research 58 Dec 23, 2022
LSTM based Sentiment Classification using Tensorflow - Amazon Reviews Rating

LSTM based Sentiment Classification using Tensorflow - Amazon Reviews Rating (Dataset) The dataset is from Amazon Review Data (2018)

Immanuvel Prathap S 1 Jan 16, 2022
null

CP-Cluster Confidence Propagation Cluster aims to replace NMS-based methods as a better box fusion framework in 2D/3D Object detection, Instance Segme

Yichun Shen 41 Dec 08, 2022
Machine Learning Course Project, IMDB movie review sentiment analysis by lstm, cnn, and transformer

IMDB Sentiment Analysis This is the final project of Machine Learning Courses in Huazhong University of Science and Technology, School of Artificial I

Daniel 0 Dec 27, 2021
PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop

molten A minimal, extensible, fast and productive API framework for Python 3. Changelog: https://moltenframework.com/changelog.html Community: https:/

3.2k Dec 28, 2022
Snips Python library to extract meaning from text

Snips NLU Snips NLU (Natural Language Understanding) is a Python library that allows to extract structured information from sentences written in natur

Snips 3.7k Dec 30, 2022
auto_code_complete is a auto word-completetion program which allows you to customize it on your need

auto_code_complete v1.3 purpose and usage auto_code_complete is a auto word-completetion program which allows you to customize it on your needs. the m

RUO 2 Feb 22, 2022