Automatically erase unwanted objects from a video, such as logos and text.

Overview

Video-Auto-Wipe

Read the English introduction: Here

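  From time to time I build fun and interesting algorithm models based on generative techniques. This release is an application model for "video erasing": it automatically detects the parts of a video that we do not want to see (such as ads, watermarks, subtitles, and logos) and erases them. Because a logo removal model could potentially be misused for copyright infringement, I am sharing only the subtitle removal model for now. I hope it helps.
  I will keep exploring and producing new work on generative techniques. There is still a lot to play with in generative models, and this project shows just one example of a practical application. The models in this project are copyrighted by www.seeprettyface.com; please do not use them directly for commercial purposes without authorization. For details of the algorithm, see my research notes.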



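Results Preview

1. Logo Removal

  The logo removal model automatically detects where logos appear in a video and erases them. It treats small blocks of pixels that stay still over time as logos.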
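
The "static over time" criterion described above can be illustrated with a small NumPy sketch. This is not the project's model, which learns the detection; it is only a hand-written heuristic under stated assumptions. The function name `static_pixel_mask`, the grayscale `(T, H, W)` input layout, and the `std_threshold` value are all hypothetical choices made for illustration.

```python
import numpy as np


def static_pixel_mask(frames: np.ndarray, std_threshold: float = 2.0) -> np.ndarray:
    """Return a boolean (H, W) mask of pixels that barely change over time.

    frames: grayscale clip with shape (T, H, W).
    std_threshold: hypothetical cutoff on the per-pixel temporal standard deviation.
    """
    if frames.ndim != 3:
        raise ValueError("expected frames with shape (T, H, W)")
    # Standard deviation along the time axis: near zero for a fixed overlay.
    temporal_std = frames.astype(np.float64).std(axis=0)
    return temporal_std < std_threshold


# Synthetic clip: noisy moving content with a constant 4x4 "logo" in the corner.
rng = np.random.default_rng(0)
clip = rng.integers(0, 256, size=(30, 16, 16)).astype(np.uint8)
clip[:, :4, :4] = 200
mask = static_pixel_mask(clip)
```

On real footage a fixed threshold like this would also flag static backgrounds, which is one reason a learned model is a better fit for the task.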

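1.1 Test 1: Removing the channel logo, series title, and corner badge from a TV drama

Watch video



1.2 Test 2: Removing the channel logo and status bar from a football match

Watch video



1.3 Test 3: Removing the channel logo and status bar from a variety show

Watch video



1.4 Test 4: Removing an overlaid icon from a short music video

Watch video



1.5 Test 5: Removing an overlaid watermark from a short music video

Watch video



1.6 Test 6: Removing the channel logo from a news broadcast

Watch video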





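2. Dynamic Logo Removal

  The dynamic logo removal model automatically detects where dynamic logos appear in a video and erases them. It treats fixed pixel blocks that flash in and out or move over time as dynamic logos. This model is fairly difficult to build, so it has not been released yet.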

2.1 Test 1: Removing flashing special-effect text

Watch video





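3. Subtitle Removal

  The subtitle removal model automatically detects where subtitles appear in a video and erases them. It treats text regions that share a uniform style as subtitles.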

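3.1 Test 1: Removing movie subtitles

Watch video



3.2 Test 2: Removing TV drama subtitles

Watch video



3.3 Test 3: Removing variety show subtitles

Watch video



3.4 Test 4: Removing stylized subtitles from a variety show

Watch video



3.5 Test 5: Removing subtitles from online videos

Watch video



3.6 Test 6: Removing subtitles in less common languages

Watch video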





Usage

1. Environment setup

  torch>1.0
  For any other missing dependency, run pip install xxx; not many packages are needed.
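
As a convenience, the `torch>1.0` requirement above can be checked before running anything. The sketch below is an assumption of how such a check might look and is not part of the project; the helper names `version_tuple` and `is_newer_than` are hypothetical. It compares plain version strings so it runs even where torch is not installed.

```python
import re


def version_tuple(text: str) -> tuple:
    """Turn a version string such as '1.7.1+cu110' into (1, 7, 1)."""
    core = text.split("+", 1)[0]  # drop a local suffix like '+cu110'
    numbers = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    # Pad to three components so '1.0' and '1.0.0' compare as equal.
    return tuple((numbers + [0, 0, 0])[:3])


def is_newer_than(installed: str, minimum: str = "1.0") -> bool:
    """Return True when `installed` is strictly newer than `minimum`."""
    return version_tuple(installed) > version_tuple(minimum)
```

With torch installed, the check would be called as `is_newer_than(torch.__version__)`.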

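2. Running

  1. Download the pretrained models and place them in the pretrained-weight folder;
    Pretrained model download: https://pan.baidu.com/s/1ubZHkgkcskS7Bpg8ZbtoRQ (extraction code: ricn)

  2. Put the video file and the mask file in the input folder, then edit demo.py (or use command-line arguments) to point to the corresponding files;
    Sample input download: https://pan.baidu.com/s/1rfdAwxomCVjTJ1zwl7hu3g (extraction code: qk64)

  3. To run the logo removal task: python demo.py delogo
     To run the subtitle removal task: python demo.py detext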
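
The steps above can be wrapped in a small launcher that checks the expected folders and builds the command line. This is a minimal sketch, not code from the repository: the helper names `build_demo_command` and `missing_folders` are hypothetical, while the folder names, `demo.py`, and the two task names come from the steps above.

```python
import sys
from pathlib import Path

# Task names taken from the run instructions.
TASKS = ("delogo", "detext")


def build_demo_command(task: str, script: str = "demo.py") -> list:
    """Return the argument list for running demo.py with the given task."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {TASKS}")
    return [sys.executable, script, task]


def missing_folders(root: Path) -> list:
    """Return the expected folders that do not exist under `root`."""
    expected = ("pretrained-weight", "input")
    return [name for name in expected if not (root / name).is_dir()]
```

From the repository root, one could first confirm that `missing_folders(Path("."))` is empty and then pass `build_demo_command("detext")` to `subprocess.run(..., check=True)`.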



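Training

Training data

  1. The YouTube-VOS 2018 dataset;

  2. A dataset of 2,709 movie clips, built from more than 300 collected HD movies;
    Download: https://pan.baidu.com/s/1CIgJmFmx5iR2JfgAyjVaeg (extraction code: xb7o)

  3. A dataset of 864 variety show clips, built from more than 40 collected variety shows;
    Download: https://pan.baidu.com/s/1lJk6IIWlwxknAie0LlGYOg (extraction code: 9rd4)

Training process

  Step 1. Task-specific temporal perception training;
  Step 2. Fine-tuning of the combined removal model.

Training configuration

I recently found a very simple way to build and train the models:
  The 'logo removal' model trains for 3 days on a single 3090 GPU;
  The 'subtitle removal' model trains for 2 days on a single 3090 GPU.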





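More Ideas

  So far this project is only a short-term experiment. There is much more that video erasing can do, such as removing sensitive content (pornographic or violent material, etc.), removing ads, removing a specified person or object, and removing people in the background. Any scenario where pixels can be predicted and there is a need to predict them is one where "video erasing" can be put to creative use.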






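Learn More

  My research focuses on applied generative models. The problem that generative techniques solve is pixel prediction: filling in or predicting pixels on an image grid that is partly or entirely missing, so that the completed image follows the statistics of real images. Many applications can be built on this pattern. Besides my earlier work on digital human generation and video content generation, it can be extended in many parallel directions.
  Most deployed CV projects today concentrate on perception and recognition tasks, and comparatively little development goes into reconstruction and generation tasks. That should not change how we judge the value of generative techniques: the field is relatively new and has fewer participants, but it has broad application prospects. I will keep working on practical algorithms in the generative direction. You are welcome to visit my website for the latest progress: www.seeprettyface.com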
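
The pixel-filling idea described above can be sketched as a simple compositing step: pixels inside a mask are taken from a prediction, and all other pixels are kept from the original frame. This is a generic illustration of inpainting-style compositing, not the project's actual pipeline; the function name `composite` and the array layouts are assumptions.

```python
import numpy as np


def composite(original: np.ndarray, predicted: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Use `predicted` where mask is 1 and `original` where mask is 0.

    original, predicted: float arrays of shape (H, W, C).
    mask: array of shape (H, W) with 1 marking missing pixels to fill.
    """
    if original.shape != predicted.shape:
        raise ValueError("original and predicted must have the same shape")
    weight = mask.astype(np.float64)[..., None]  # broadcast over channels
    return weight * predicted + (1.0 - weight) * original


# Tiny example: only the top-left pixel is treated as missing.
frame = np.zeros((2, 2, 3))
guess = np.full((2, 2, 3), 9.0)
hole = np.array([[1, 0], [0, 0]])
filled = composite(frame, guess, hole)
```

In a real system the `predicted` array would come from a generative model conditioned on the visible pixels; here it is a constant stand-in.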


Owner
seeprettyface.com