FAMIE is a comprehensive and efficient active learning (AL) toolkit for multilingual information extraction (IE)

Last update: Sep 01, 2022

Overview

FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction

FAMIE is a comprehensive and efficient active learning (AL) toolkit for multilingual information extraction (IE). It is designed to address a fundamental problem in existing AL frameworks: annotators must wait a long time between annotation batches because model training and data selection at each AL iteration are slow. With a novel proxy AL mechanism and the integration of our state-of-the-art multilingual toolkit Trankit, FAMIE can quickly provide users with a labeled dataset and a ready-to-use model for different IE tasks in over 100 languages.

FAMIE's documentation page: https://famie.readthedocs.io

FAMIE's demo website: http://nlp.uoregon.edu:9000/

Installation

FAMIE can be easily installed via one of the following methods:

Using pip

pip install famie

This command installs FAMIE and all of its dependencies automatically.

From source

git clone https://github.com/nlp-uoregon/famie.git
cd famie
pip install -e .

This clones our GitHub repository and then installs FAMIE from the local copy in editable mode.

Usage

FAMIE currently supports Named Entity Recognition and Event Detection for over 100 languages. Using FAMIE involves the following three steps:

Start an annotation session.
Annotate data for a target task.
Access the labeled data and a ready-to-use model returned by FAMIE.

Starting an annotation session

To start an annotation session, please use the following command:

famie start

This starts a server on the user's local machine; no data or models leave that machine. The web interface is then available at http://127.0.0.1:9000/ . As an AL framework, FAMIE provides several data selection algorithms that recommend the most beneficial examples to label at each annotation iteration. To choose one, pass the optional argument --selection [mnlp|badge|bertkm|random].
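For scripted setups, the start command and its optional flag can be assembled from Python. The following is a minimal sketch, assuming the famie executable is on the PATH. The helper names build_start_command and start_session are hypothetical and are not part of FAMIE; the flag and its four values are the ones listed above.

```python
import subprocess

# Strategy names exactly as listed for the --selection argument.
SELECTION_STRATEGIES = ("mnlp", "badge", "bertkm", "random")


def build_start_command(selection=None):
    """Return the argument list for `famie start` (hypothetical helper)."""
    command = ["famie", "start"]
    if selection is not None:
        if selection not in SELECTION_STRATEGIES:
            raise ValueError(f"unknown selection strategy: {selection!r}")
        command += ["--selection", selection]
    return command


def start_session(selection=None):
    """Launch the server (hypothetical helper); blocks until it exits."""
    return subprocess.run(build_start_command(selection), check=True)


if __name__ == "__main__":
    # Only print the command here; call start_session() to actually run it.
    print(build_start_command("mnlp"))
    # ['famie', 'start', '--selection', 'mnlp']
```

Validating the strategy name before launching catches typos early, instead of relying on however the command line itself reports an unrecognized value.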

Annotating data

Accessing the labeled data and the trained model

import famie

# access a project via its name
p = famie.get_project('named-entity-recognition') 

# access the project's labeled data
data = p.get_labeled_data() # a Python dictionary

# export the project's labeled data to a file
p.export_labeled_data('data.json')

# export the project's trained model to a file
p.export_trained_model('model.ckpt')

# access the project's trained model
model = p.get_trained_model()

# access a trained model from file
model = famie.load_model_from_file('model.ckpt')

# use the trained model to make predictions
model.predict('Oregon is a beautiful state!')
# ['B-Location', 'O', 'O', 'O', 'O']
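Before building on the exported file, it can help to look at its top-level structure. The sketch below assumes only that export_labeled_data writes valid JSON, as the file name data.json suggests; it makes no assumption about the schema FAMIE uses, and the helper name summarize_export is hypothetical.

```python
import json


def summarize_export(path):
    """Map each top-level key of a JSON file to the type name of its value."""
    with open(path, encoding="utf-8") as handle:
        payload = json.load(handle)
    if isinstance(payload, dict):
        return {key: type(value).__name__ for key, value in payload.items()}
    # Non-dictionary roots (for example a list) are reported under one key.
    return {"<root>": type(payload).__name__}


# Example call, once the export step above has produced the file:
# summarize_export('data.json')
```

Reporting only key names and value types keeps the check cheap on large exports and avoids hard-coding field names that may differ between tasks.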
