Interpretable-contrastive-word-mover-s-embedding

Paper Datasets

Here is a Dropbox link to the datasets used in the paper: https://www.dropbox.com/sh/nf532hddgdt68ix/AABGLUiPRyXv6UL2YAcHmAFqa?dl=0

The datasets at the above link are provided as .mat files. You may need to convert them to .npy files to run our code. Each .mat file contains the following components (the name in parentheses is the corresponding .npy file for the BBCsports data):
X is a cell array of all documents, each represented by a d x m matrix, where d is the dimensionality of the word embedding and m is the number of unique words in the document ('BBCsports.npy')
Y is an array of labels ('BBCsports_grade.npy')
BOW_X is a cell array of word counts for each document ('weight.npy')
indices is a cell array of global unique IDs for the words in a document
TR is a matrix whose ith row is the ith training split of document indices ('index_tr.npy')
TE is a matrix whose ith row is the ith testing split of document indices ('index_te.npy')
'BBCsports_length.npy' stores the number of unique words in each sample.
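
The conversion from .mat to .npy described above could be sketched roughly as follows. This is a minimal sketch, not the preprocessing actually used for the demo: the function names (`load_mat`, `convert`) are hypothetical, the .mat file is assumed to be readable by `scipy.io.loadmat`, and storing the per-document matrices as a pickled object array is an assumption, since the exact array layout expected by the code is not stated here. Index values in TR and TE are saved unchanged.

```python
import os
import numpy as np


def load_mat(path):
    # Imported lazily so the rest of the sketch only needs NumPy.
    from scipy.io import loadmat
    return loadmat(path)


def cell_to_list(cell):
    """Turn a MATLAB cell array (loaded as an object ndarray) into a list of arrays."""
    return [np.asarray(item) for item in cell.ravel()]


def to_object_array(items):
    """Pack variable-sized arrays into a 1-D object array that np.save can pickle."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item
    return out


def convert(mat, out_dir, prefix="BBCsports"):
    """Save the components of a loaded .mat dict under the .npy names listed above."""
    docs = cell_to_list(mat["X"])                      # each document is d x m
    weights = [w.ravel() for w in cell_to_list(mat["BOW_X"])]
    lengths = np.array([doc.shape[1] for doc in docs])  # m = unique words per document

    os.makedirs(out_dir, exist_ok=True)
    # Assumption: documents and weights are stored as pickled object arrays.
    np.save(os.path.join(out_dir, prefix + ".npy"), to_object_array(docs))
    np.save(os.path.join(out_dir, "weight.npy"), to_object_array(weights))
    np.save(os.path.join(out_dir, prefix + "_grade.npy"), np.asarray(mat["Y"]).ravel())
    np.save(os.path.join(out_dir, prefix + "_length.npy"), lengths)
    # TR and TE are saved as-is; MATLAB indices are usually 1-based.
    np.save(os.path.join(out_dir, "index_tr.npy"), np.asarray(mat["TR"]))
    np.save(os.path.join(out_dir, "index_te.npy"), np.asarray(mat["TE"]))
```

With these definitions, something like `convert(load_mat("bbcsport.mat"), ".")` would write the files into the current directory; the .mat file name here is only a placeholder. Object arrays saved this way need `np.load(..., allow_pickle=True)` when read back.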

Demo

The demo code uses the BBCsports dataset. The preprocessed data, saved as .npy files, can be found at the following link: https://drive.google.com/drive/folders/1GuQsHS1J8J24GnCmTCTDPH5hWWYtmw4s?usp=sharing
Please put the data in the same directory as the two Python files.
Use

python run_pos.py

to run the demo.
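
Before running the demo, it may help to check that the data files sit next to the scripts. The snippet below is a small sketch under assumptions: the file names are the ones listed in the Paper Datasets section, the helper `missing_files` is hypothetical, and whether run_pos.py needs every one of these files is not confirmed here.

```python
from pathlib import Path

# File names taken from the dataset description; assumed to be what the demo reads.
EXPECTED_FILES = [
    "BBCsports.npy",
    "BBCsports_grade.npy",
    "BBCsports_length.npy",
    "weight.npy",
    "index_tr.npy",
    "index_te.npy",
]


def missing_files(data_dir=".", names=EXPECTED_FILES):
    """Return the expected file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing data files:", ", ".join(missing))
    else:
        print("All expected data files found.")
```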

Citation

If you find this repo useful for your research, please consider citing the paper:

@misc{jiang2021interpretable,
    title={Interpretable contrastive word mover's embedding},
    author={Ruijie Jiang and Julia Gouvea and Eric Miller and David Hammer and Shuchin Aeron},
    year={2021},
    eprint={2111.01023},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

If you have any questions, please feel free to contact Ruijie Jiang ([email protected]).
