Convert Lecture Videos to PDF

Overview

Convert Lecture Videos to PDF

Description

Want to go through lecture videos faster without missing any information? Wish you can read the lecture video instead of watching it? Now you can! With this python application, you can convert lecture videos to a PDF file! The PDF file will contain a screenshot of lecture slides presented in the video, along with a transcription of your instructor explaining those lecture slide. It can also handle instructors making annotations on their lecture slides and mild amounts of PowerPoint animations.

Table of Contents

  • Walkthrough
  • Getting Started
  • Tweeking the Application
  • Next steps
  • Usage
  • Credits
  • License

Walkthrough of this project

Users will need to download a video file of their lecture. For instance, the video file might look like this:

Users will also need a copy of the video's subtitles.

After running the command line tool, they will get a PDF that looks like this:

where each page contains an image of the lecture video, and a transcription of the instructor explaining about that slide.

Getting Started

  1. Ensure Python3 and Pip is installed on your machine

  2. Next, install package dependencies by running:

    pip3 install -r requirements.txt

  3. Now, run:

    python3 src/main.py tests/videos/input_1.mp4 -s tests/subtitles/subtitles_1.vtt -o output.pdf

    to generate a PDF of this lecture video with these subtitles

  4. The generated PDF will be saved as output.pdf

Tweeking the Application

This application uses computer vision with OpenCV to detect when the instructor has moved on to the next PowerPoint slide, detect animations, etc.

You can adjust the sensitivity to video frame changes in the src/video_segment_finder.py file. You can also visualize how well the application detect transitions and animations via the src/plot.py tool.

Next Steps

  • Automatically generate subtitles
  • Wrap project into a web app?

Usage

Please note that this project is used for educational purposes and is not intended to be used commercially. We are not liable for any damages/changes done by this project.

Credits

Emilio Kartono, who made the entire project.

License

This project is protected under the GNU licence. Please refer to the LICENSE.txt for more information.

Owner
Emilio Kartono
Computer Science Graduate from the University of Toronto
Emilio Kartono
Camelot is a Python library that can help you extract tables from PDFs!

A Python library to extract tabular data from PDFs

1.8k Jan 03, 2023
Merge multiple PDF files into one.

PDF Merger Merge multiple PDF files into one. Usage % python pdf_merger.py -h usage: pdf_merger.py [-h] [-o OUTPUT] [-f [FILES ...]] optional argumen

Duo Apps 6 Oct 03, 2022
This is PDF Merger Application Developed using Just Python

This is PDF Merger Application Developed using Just Python

Sandeep Kumar Reddy 2 Nov 18, 2021
Performing the following operations using python on PDF.

Python PDF Handling Tutorial Python is a highly versatile language with a huge set of libraries. It is a high level language with simple syntax. Pytho

Prajwol Lamichhane 131 Dec 16, 2022
Simple HTML and PDF document generator for Python - with built-in support for popular data analysis and plotting libraries.

Esparto is a simple HTML and PDF document generator for Python. Its primary use is for generating shareable single page reports with content from popular analytics and data science libraries.

Dom 76 Dec 12, 2022
A bulk pdf generator. This application can generate PDFs in bulk by using just one click.

A bulk html pdf generator. This application can generate PDFs in bulk by using just one click. Screenshots Requirements 🧱 Your system must have the f

Aman Nirala 3 Apr 23, 2022
An application which enables the users to perform simple yet intriguing PDF operations

AstutePDF A repository containing the GUI for an application which enables the users to perform simple yet intriguing PDF operations. These include, M

Raghav S 5 Jan 22, 2022
A Python tool to generate a static HTML file that represents the internal structure of a PDF file

PDFSyntax A Python tool to generate a static HTML file that represents the internal structure of a PDF file At some point the low-level functions deve

Martin D. 394 Dec 30, 2022
borb is a library for reading, creating and manipulating PDF files in python.

borb is a library for reading, creating and manipulating PDF files in python.

Joris Schellekens 2.9k Jan 01, 2023
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted. ocrmypdf

8k Jan 08, 2023
Python bindings for MuPDF's rendering library.

PyMuPDF 1.19.3 Release date: December 15, 2021 On PyPI since August 2016: Author Jorj X. McKie, based on original code by Ruikai Liu. Introduction PyM

Jorj X. McKie 0 Nov 03, 2022
A simple Python script to convert multiple images (well technically also a single image) into a pdf.

PythonImage2PDF A simple Python script to convert multiple images into a single PDF-document. Created basically for only my own needs for converting m

Joona Gynther 1 Jun 28, 2022
x-ray is a Python library for finding bad redactions in PDF documents.

A tool to detect whether a PDF has a bad redaction

Free Law Project 73 Dec 19, 2022
A bot for PDF for doing Many Things....

Telegram PDF Bot A Telegram bot that can: Compress, crop, decrypt, encrypt, merge, preview, rename, rotate, scale and split PDF files Compare text dif

Mr. Developer 60 Dec 27, 2022
A simple pdf size compressing telegram robot witten in python.

Pdf Compressor Telegram Bot ##About : A simple pdf size compressing telegram robot witten in python. Mostly useful for digital documentation. Deploy t

Renjith Mangal 22 Oct 28, 2022
Program that locks/unlocks pdf files🐍

🐍 📄 PDFtools 📄 🐍 Programa que bloqueia/desbloqueia arquivos pdf Requisitos • Como usar • Capturas de Tela 🚨 Aviso 🚨 Altere os caminhos referente

João Victor Vilela dos Santos 1 Nov 04, 2021
pdf_sprinkles: sprinkles text in your PDFs

pdf_sprinkles: sprinkles text in your PDFs pdf_sprinkles remotely OCRs a PDF with Google Cloud Document AI, and returns the result as a PDF with searc

Will Angley 2 Dec 17, 2021
Trata PDF para torná-lo compatível com PDF/X e com impressoras em escala de cinza.

tratapdf Trata PDF para torná-lo compatível com PDF/X e com impressoras em escala de cinza. dependências icc-profiles ghostscript visualizador de PDF

1 Nov 30, 2021
Pdfencrypt is a tool to encrypt/lock PDFs

Pdfencrypt Pdfencrypt is a tool to encrypt/lock PDFs Installation $ apt update $ apt upgrade $ apt install git $ apt install python $ git clone https:

Anontemitayo 5 Nov 28, 2021
Split given PDF document into 4 page groups and convert them to booklet format

PUTO: PDF to Booklet converter Split given PDF document into 4 page groups and convert them to booklet format. It creates a PDF like shown below: Fir

3 Mar 12, 2022