Automatically erase objects in videos, such as logos and text.

Overview

Video-Auto-Wipe

Read English Introduction: Here

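  From time to time I build fun and interesting algorithm models based on generative techniques. This time the work is an application model for "video erasing": it automatically perceives the parts of a video we do not want to see (such as ads, watermarks, subtitles, and logos) and erases them. Because the logo-removal model could be misused for copyright infringement, I am sharing only the subtitle-removal model for now. I hope it is helpful.
  I will keep exploring and producing new work in generative techniques. There is much more that can be done with generative models, and this project shows just one example of a practical application. The models in this project are copyrighted by www.seeprettyface.com; please do not use them directly for commercial purposes without authorization. For details of the algorithm, see my research notes.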



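Results preview

1. Logo removal

  The logo-removal model automatically perceives the position of logos in a video and erases them. Logos are identified as small blocks of pixels that stay static over time.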
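
The "static over time" cue described above can be illustrated with a simple heuristic. This is only a minimal sketch and not the released model, which learns this perception: it assumes the video is already loaded as a NumPy array of grayscale frames, and the function name `static_pixel_mask` and the threshold `max_std` are hypothetical choices made for this example.

```python
import numpy as np


def static_pixel_mask(frames: np.ndarray, max_std: float = 2.0) -> np.ndarray:
    """Return a boolean (H, W) mask of pixels that barely change over time.

    frames: array of shape (T, H, W) holding grayscale frames.
    max_std: hypothetical threshold on the per-pixel temporal standard deviation.
    """
    if frames.ndim != 3:
        raise ValueError("expected frames with shape (T, H, W)")
    # Standard deviation along the time axis: low values mean "static".
    temporal_std = frames.astype(np.float64).std(axis=0)
    return temporal_std <= max_std


# Toy clip: a random, constantly changing background with a fixed 4x4 "logo".
rng = np.random.default_rng(0)
clip = rng.integers(0, 256, size=(30, 32, 32)).astype(np.float64)
clip[:, 2:6, 2:6] = 255.0  # the static block

mask = static_pixel_mask(clip)
print(int(mask.sum()))  # number of pixels flagged as static
```

In real footage a still background would pass this test too, which is presumably why the description restricts the cue to small blocks; a usable detector would need more than this threshold.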

1.1 Test 1 - Removing the station logo, show title, and corner badge from a TV drama

Image text

Watch video



1.2 Test 2 - Removing the station logo and status bar from a football match

Image text

Watch video



1.3 Test 3 - Removing the station logo and status bar from a variety show

Image text

Watch video



1.4 Test 4 - Removing an occluding logo from a short music video

Image text

Watch video



1.5 Test 5 - Removing an occluding watermark from a short music video

Image text

Watch video



1.6 Test 6 - Removing the station logo from a news broadcast

Image text

Watch video





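2. Dynamic logo removal

  The dynamic-logo-removal model automatically perceives the position of dynamic logos in a video and erases them. Dynamic logos are identified as fixed pixel blocks that flicker in and out or move over time. This model is fairly difficult to build, so it has not been released yet.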

2.1 Test 1 - Removing special-effect text that flickers in and out

Image text

Image text

Watch video





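3. Subtitle removal

  The subtitle-removal model automatically perceives the position of subtitles in a video and erases them. Subtitles are identified as text regions that share a uniform style.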

3.1 Test 1 - Movie subtitle removal

Image text

Watch video



3.2 Test 2 - TV drama subtitle removal

Image text

Watch video



3.3 Test 3 - Variety show subtitle removal

Image text

Watch video



3.4 Test 4 - Variety show special subtitle removal

Image text

Watch video



3.5 Test 5 - Online video subtitle removal

Image text

Watch video



3.6 Test 6 - Subtitle removal for less common languages

Image text

Watch video





Usage

1. Environment setup

  torch>1.0
  Install any other missing dependency with pip install xxx; not much is required.
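
The `torch>1.0` requirement can be checked before running anything. This is a minimal sketch rather than part of the project: the helper names `version_tuple` and `torch_is_new_enough` are hypothetical, and the check only compares the numeric part of the version string.

```python
import importlib.util
import re


def version_tuple(version: str) -> tuple:
    """Turn a version string such as '1.13.1+cu117' into (1, 13, 1)."""
    numbers = [int(n) for n in re.findall(r"\d+", version.split("+")[0])]
    # Pad with zeros so that '1.0' and '1.0.0' compare as equal.
    return tuple((numbers + [0, 0, 0])[:3])


def torch_is_new_enough(version: str, minimum: str = "1.0") -> bool:
    """Check the 'torch>1.0' requirement for a given version string."""
    return version_tuple(version) > version_tuple(minimum)


if importlib.util.find_spec("torch") is None:
    print("torch is not installed; try: pip install torch")
else:
    import torch

    print("torch", torch.__version__, "ok:", torch_is_new_enough(torch.__version__))
```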

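2. Running

  1. Download the pretrained models and put them in the pretrained-weight folder;
    Pretrained model download: https://pan.baidu.com/s/1ubZHkgkcskS7Bpg8ZbtoRQ (extraction code: ricn)

  2. Put the video file and the mask file in the input folder, then edit demo.py (or use command-line arguments) to point to them;
    Sample input download: https://pan.baidu.com/s/1rfdAwxomCVjTJ1zwl7hu3g (extraction code: qk64)

  3. Run the logo-removal task: python demo.py delogo
   Run the subtitle-removal task: python demo.py detext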
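
The steps above can be wrapped in a small pre-flight script. This is a sketch under stated assumptions, not part of the repository: the helper names `build_command` and `missing_folders` are hypothetical, and it relies only on the folder names and the two task names given in the steps, without passing any extra command-line arguments to `demo.py`.

```python
import sys
from pathlib import Path

TASKS = ("delogo", "detext")  # the two task names given in the steps above


def build_command(task: str) -> list:
    """Build the command line for one erasing task."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {TASKS}")
    return [sys.executable, "demo.py", task]


def missing_folders(repo_root: Path) -> list:
    """List the folders from steps 1 and 2 that do not exist yet."""
    required = ("pretrained-weight", "input")
    return [name for name in required if not (repo_root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_folders(Path("."))
    if missing:
        print("missing folders:", ", ".join(missing))
    else:
        print("run:", " ".join(build_command("detext")))
```

The resulting list can be passed to `subprocess.run` from the repository root once both folders are in place.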



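Training

Training data

  1. The YoutubeVOS2018 dataset;

  2. A dataset of 2,709 movie clips built from more than 300 HD movies I collected;
    Download: https://pan.baidu.com/s/1CIgJmFmx5iR2JfgAyjVaeg (extraction code: xb7o)

  3. A dataset of 864 variety-show clips built from more than 40 variety shows I collected;
    Download: https://pan.baidu.com/s/1lJk6IIWlwxknAie0LlGYOg (extraction code: 9rd4)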

Training procedure

  Step 1. Task-specific temporal perception training;
  Step 2. Fine-tuning of the fused erasing model.

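Training configuration

I recently found a very simple way to build and train the models:
  The 'logo removal' model trains for 3 days on a single 3090 GPU;
  The 'subtitle removal' model trains for 2 days on a single 3090 GPU.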





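More ideas

  This project is so far only a very short-term experiment. There is much more to explore in video erasing, such as erasing sensitive content (pornographic or violent material, for example), ads, specified people or objects, and people in the background. Any scenario that both allows pixel prediction and has a need for it is one where "video erasing" can get creative.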

Sample





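Learn more

  My research focuses on applied techniques for generative models. Generative techniques solve the problem of pixel prediction: on an image canvas that is partially or completely missing, they fill in or predict pixels so that the completed image follows the regularities of real images. Many applications can be built on this pattern. Beyond my earlier work on digital human generation and video content generation, many more parallel directions can be developed.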
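
The idea of filling in the missing part of an image canvas can be made concrete with a toy example. This is only an illustrative sketch: the mask-based blend below is a common convention in inpainting and is not confirmed here as the approach used in this project, the name `composite` is hypothetical, and the "prediction" is a trivial mean fill standing in for a generative model.

```python
import numpy as np


def composite(original: np.ndarray, predicted: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep known pixels from `original` and take masked pixels from `predicted`.

    mask is 1 where pixels are missing (to be filled) and 0 elsewhere.
    """
    mask = mask.astype(original.dtype)
    return mask * predicted + (1 - mask) * original


# Toy 4x4 grayscale image with a 2x2 hole in the middle.
image = np.arange(16, dtype=np.float64).reshape(4, 4)
hole = np.zeros((4, 4))
hole[1:3, 1:3] = 1

# Stand-in "prediction": the mean of the known pixels (a real model would predict here).
known_mean = image[hole == 0].mean()
prediction = np.full_like(image, known_mean)

filled = composite(image, prediction, hole)
print(filled)
```

Pixels outside the hole are returned unchanged, so only the quality of the prediction inside the mask determines how realistic the result looks.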
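  Although most deployed CV projects today concentrate on perception and recognition tasks, with comparatively little R&D on reconstruction and generation tasks, this should not affect our judgment of the value of generative techniques. After all, generation is a relatively new research direction with fewer participants but broad application prospects. I will continue working on deployable algorithms in the generative direction; you are welcome to visit my website for the latest research progress: www.seeprettyface.com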

Sample

Owner
seeprettyface.com