TEDSummary is a speech summarization corpus. It includes TED talk subtitles (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance ID.

Last update: Dec 26, 2022

Overview

TEDSummary

TEDSummary is a speech summarization corpus. It includes TED talk subtitles (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance ID. The script in this repository crawls the TED Talks website to collect this information. It does not supply audio data. To obtain audio, either use the utterance ID to align each entry with TED-LIUM 3 (https://www.openslr.org/51/) or extract the audio track from the MP4 file.
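For the second option, extracting audio from the MP4 file, the following is a minimal sketch and not part of this repository. It assumes the ffmpeg command-line tool is installed and on PATH. The function names, the mono downmix, and the 16 kHz sample rate are illustrative assumptions; adjust them to match your speech pipeline.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(mp4_source, wav_path, sample_rate=16000):
    """Build an ffmpeg command that extracts mono WAV audio from a video.

    The function name and defaults are hypothetical, not from TEDSummary.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(mp4_source),    # input: local path or URL of the MP4
        "-vn",                    # drop the video stream
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample (16 kHz is an assumed default)
        str(wav_path),
    ]


def extract_audio(mp4_source, wav_path, sample_rate=16000):
    """Run ffmpeg; raises CalledProcessError if the extraction fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # Make sure the output directory exists before ffmpeg writes to it.
    Path(wav_path).parent.mkdir(parents=True, exist_ok=True)
    command = build_ffmpeg_command(mp4_source, wav_path, sample_rate)
    subprocess.run(command, check=True)
```

Calling `extract_audio(mp4_url, "audio/talk.wav")` with an MP4 URL taken from the crawler output would write a mono WAV file, provided the URL is still reachable.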

References

[1] Takatomo Kano, Atsunori Ogawa, Marc Delcroix, and Shinji Watanabe, "Attention-based Multi-hypothesis Fusion for Speech Summarization," Proc. ASRU, pp. –, 2021.

Citation
@inproceedings{attention-fusion,
author = {Takatomo Kano and Atsunori Ogawa and Marc Delcroix and Shinji Watanabe},
title = {Attention-based Multi-hypothesis Fusion for Speech Summarization},
booktitle = {{ASRU 2021 - 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)}},
pages={-},
year = {2021}
}

Install tools

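Python 3 with the third-party packages requests, unidecode, and tqdm. The json and unicodedata modules are also used; both are part of the Python standard library and need no separate installation.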

How to run

cd TEDSummary/
python TEDListCrawler.py

Outputs

telklist.json: List of TED talk URLs.
ted_summary.json: The summarization dataset. It includes the summary ID, TED talk URL, MP4 URL, document, abstract, title, speaker name, and utterance ID for TED-LIUM alignment.
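The exact layout and key names of ted_summary.json are not documented here, so the sketch below makes no assumption about them. It loads the file and reports the top-level type, the number of entries, and the field names of the first entry. The helper name `describe_json` is hypothetical and not part of the repository.

```python
import json


def describe_json(path):
    """Summarize a JSON file without assuming its schema."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

    # The top level may be a list of entries or a dict keyed by ID;
    # handle both, since the real layout is not confirmed here.
    if isinstance(data, dict):
        entries = list(data.values())
    elif isinstance(data, list):
        entries = data
    else:
        entries = []

    first = entries[0] if entries else None
    fields = sorted(first.keys()) if isinstance(first, dict) else []
    return {
        "top_level": type(data).__name__,
        "num_entries": len(entries),
        "first_entry_fields": fields,
    }
```

Running `describe_json("ted_summary.json")` after the crawler finishes shows which field names the file actually uses before you write any code that depends on them.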
