This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

Overview

Introduction

This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

If you find this code useful, please cite the following paper:

@article{tan2022coherence,
  title = {Coherence-Based Distributed Document Representation Learning for Scientific Documents},
  author = {Tan, Shicheng and Zhao, Shu and Zhang, Yanping},
  journal = {arXiv},
  year = {2022},
  type = {Journal Article}
}

Run

  1. Installation environment (ref. requirements.txt)
  2. Download data: Link: https://pan.baidu.com/s/1EEJk0_P55Ov5ReXsmyVZPA Password: rkh0
  3. python _av_CTE.py

信息检索数据运行指南

  1. 数据处理(4个文件):使用“...data helper-IR.py”获取3份数据,原始数据处理暂存文件、原始数据处理暂存文件的语料、构建的数据集,然后使用“_aj_get dataset corpus.py”获得 构建的数据集的语料
  2. 词向量训练(4个文件):使用“_ak_get word embedding.py”训练第一步的2个语料得到2个词表和2个词向量文件,glove需要去除后缀名“.txt”
  3. 运行5次“_al_em-avg.py”得到5个结果,avg-word2vec、avg-word2vec(globe)、avg-glove、avg-glove(globe)、random embedding
  4. 运行“_ac_tf-idf.py”得到一个距离矩阵和1个结果,矩阵用于CTE方法
  5. LDA、doc2vec、BM25、LSI、GPT2、XLNet、GPT、Transformer-XL、XLM 对应文件各运行一次得到9个结果
  6. 运行“_ah_WMD.py”4次得到4个结果,WMD-word2vec、WMD-word2vec(globe)、WMD-glove、WMD-glove(globe)
  7. 运行“_at_BERT.py”2次得到2个结果,BERT-Large uncased、BERT-Large uncased(wwm)
  8. 运行“_at_ELMo.py”2次得到2个结果,ELMo-Original(5.5B)、ELMo-Original(5.5B,级联)
  9. 运行“_av_CET.py”13次得到13个结果,基于 random embedding 等13种基础词向量
Owner
tsc
Artificial general intelligence
tsc
Pytorch implementation of AREL

Status: Archive (code is provided as-is, no updates expected) Agent-Temporal Attention for Reward Redistribution in Episodic Multi-Agent Reinforcement

8 Nov 25, 2022
SpecAugmentPyTorch - A Pytorch (support batch and channel) implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

SpecAugment An implementation of SpecAugment for Pytorch How to use Install pytorch, version=1.9.0 (new feature (torch.Tensor.take_along_dim) is used

IMLHF 3 Oct 11, 2022
Melanoma Skin Cancer Detection using Convolutional Neural Networks and Transfer Learning🕵🏻‍♂️

This is a Kaggle competition in which we have to identify if the given lesion image is malignant or not for Melanoma which is a type of skin cancer.

Vipul Shinde 1 Jan 27, 2022
Convolutional Neural Network for 3D meshes in PyTorch

MeshCNN in PyTorch SIGGRAPH 2019 [Paper] [Project Page] MeshCNN is a general-purpose deep neural network for 3D triangular meshes, which can be used f

Rana Hanocka 1.4k Jan 04, 2023
Code accompanying "Adaptive Methods for Aggregated Domain Generalization"

Adaptive Methods for Aggregated Domain Generalization (AdaClust) Official Pytorch Implementation of Adaptive Methods for Aggregated Domain Generalizat

Xavier Thomas 15 Sep 20, 2022
Shuwa Gesture Toolkit is a framework that detects and classifies arbitrary gestures in short videos

Shuwa Gesture Toolkit is a framework that detects and classifies arbitrary gestures in short videos

Google 89 Dec 22, 2022
Official code release for: EditGAN: High-Precision Semantic Image Editing

Official code release for: EditGAN: High-Precision Semantic Image Editing

565 Jan 05, 2023
WeakVRD-Captioning - Implementation of paper Improving Image Captioning with Better Use of Caption

WeakVRD-Captioning - Implementation of paper Improving Image Captioning with Better Use of Caption

30 Oct 28, 2022
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet, ICCV 2021 Update: 2021/03/11: update our new results. Now our T2T-ViT-14 w

YITUTech 1k Dec 31, 2022
Multilingual Image Captioning

Multilingual Image Captioning Authors: Bhavitvya Malik, Gunjan Chhablani Demo Link: https://huggingface.co/spaces/flax-community/multilingual-image-ca

Gunjan Chhablani 32 Nov 25, 2022
Unsupervised MRI Reconstruction via Zero-Shot Learned Adversarial Transformers

Official TensorFlow implementation of the unsupervised reconstruction model using zero-Shot Learned Adversarial TransformERs (SLATER). (https://arxiv.

ICON Lab 22 Dec 22, 2022
Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [CVPR 2021]

Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [BCNet, CVPR 2021] This is the official pytorch implementation of BCNet built on

Lei Ke 434 Dec 01, 2022
NEO: Non Equilibrium Sampling on the orbit of a deterministic transform

NEO: Non Equilibrium Sampling on the orbit of a deterministic transform Description of the code This repo describes the NEO estimator described in the

0 Dec 01, 2021
A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection

Confluence: A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection 1. 介绍 用以替代 NMS,在所有 bbox 中挑选出最优的集合。 NMS 仅考虑了 bbox 的得分,然后根据 IOU 来

44 Sep 15, 2022
A library for efficient similarity search and clustering of dense vectors.

Faiss Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any

Meta Research 18.8k Jan 08, 2023
Online Pseudo Label Generation by Hierarchical Cluster Dynamics for Adaptive Person Re-identification

Online Pseudo Label Generation by Hierarchical Cluster Dynamics for Adaptive Person Re-identification

TANG, shixiang 6 Nov 25, 2022
Source code of article "Towards Toxic and Narcotic Medication Detection with Rotated Object Detector"

Towards Toxic and Narcotic Medication Detection with Rotated Object Detector Introduction This is the source code of article: Towards Toxic and Narcot

Woody. Wang 3 Oct 29, 2022
Collection of NLP model explanations and accompanying analysis tools

Thermostat is a large collection of NLP model explanations and accompanying analysis tools. Combines explainability methods from the captum library wi

126 Nov 22, 2022
We utilize deep reinforcement learning to obtain favorable trajectories for visual-inertial system calibration.

Unified Data Collection for Visual-Inertial Calibration via Deep Reinforcement Learning Update: The lastest code will be updated in this branch. Pleas

ETHZ ASL 27 Dec 29, 2022
Using pretrained language models for biomedical knowledge graph completion.

LMs for biomedical KG completion This repository contains code to run the experiments described in: Scientific Language Models for Biomedical Knowledg

Rahul Nadkarni 41 Nov 30, 2022