Zoom , GoogleMeets에서 Vtuber 데뷔하기

Overview

EasyVtuber

  • Facial landmark와 GAN을 이용한 Character Face Generation
  • Google Meets, Zoom 등에서 자신만의 웹툰, 만화 캐릭터로 대화해보세요!
  • 악세사리는 어느정도 추가해도 잘 작동해요!
  • 안타깝게도 RTX 2070 미만에서는 실시간으로 잘 작동하지 않을 수도 있어요 ㅠㅠ



Demo



Requirements

  • Python >= 3.8
  • Pytorch >= 1.7
  • pyvirtualcam
  • mediapipe
  • opencv-python



Quick Start

  • ※ 이 프로젝트는 사용 전 OBS 설치가 필수입니다
  • 아래 설치 순서를 지켜주세요!
  1. OBS studio 설치

    • OBS virtualcam을 사용하기 위해서 먼저 OBS Studio를 설치해야합니다
  2. pip install -r requirements.txt

    • OBS virtualcam을 설치되어있어야 requirements에 포함된 pyvirtualcam이 정상적으로 설치되어 사용할 수 있습니다
  3. pretrianed model download

    • 아래 파일들을 pretrained folder에 넣어주세요
      • combiner.pt
      • eyebrow_decomposer.pt
      • eyebrow_morphing_combiner.pt
      • face_morpher.pt
      • two_algo_face_rotator.pt
  4. character image를 character folder에 넣어주세요

    • character image 파일은 다음의 조건을 충족해야합니다
      • alpha 채널을 포함할 것(png 확장자)
      • 1명의 인간형 캐릭터일 것
      • 캐릭터가 정면을 볼 것
      • 캐릭터의 머리가 128 x 128 pixel 내에 들어올 것 (기본적으로 256 x 256으로 resize되기 때문에 256 x 256 기준 128x128 안에 들어와야함)

    Example image is refenced by TalkingHeadAnime2

5.python main.py --webcam_output

  • 실제 facial feature가 어떻게 잡히는지 보고 싶다면 --debug 옵션을 추가하여 실행해주세요



How to make Custom Character

  1. 네이버, 구글 등에서 본인이 원하는 캐릭터를 찾으세요!
    • 되도록이면 위의 4가지 조건을 맞춰주세요! google search

  2. 찾은 이미지에서 캐릭터 얼굴이 중앙으로 가도록 가로세로 1:1 비율로 이미지를 잘라주세요!
  3. 이미지 배경을 제거해서 alpha 채널을 만들어 주세요!
  4. 완성!
    • character folder에 이미지를 넣고 python main.py --output_webcam --character (.png_제외한_캐릭터파일_이름) 실행!



Folder Structure

      │
      ├── character/ - character images 
      ├── pretrained/ - save pretrained models 
      ├── tha2/ - Talking Head Anime2 Library source files 
      ├── facial_points.py - facial feature point constants
      ├── main.py - main script to excute
      ├── models.py - GAN models defined
      ├── pose.py - process facial landmark to pose vector
      └── utils.py - util fuctions for pre/postprocessing image



Usage

webcam으로 송출 시

  • python main.py --output_webcam

캐릭터 지정

  • python main.py --character (character folder에 있는 .png를 제외한 캐릭터 파일 이름)

facial feature 확인 시

  • python main.py --debug

동영상 파일 inference

  • python main.py --input video파일_경로 --output_dir frame_저장할_디렉토리



TODOs

  • Add eyebrow feature
  • Parameter Controller GUI
  • Automation of Making Drivable Character



Thanks to



Acknowledgements

  • EasyVtuber는 TalkingHeadAnime2를 기반으로 제작되었습니다.
  • tha2 folder의 source와 pretrained model file은 원저작자 repo의 Liscense를 확인하고 사용하시기 바랍니다.
Owner
Gunwoo Han
EE student / Deep Learning / Computer Vision / Manager of 파이썬 처음처럼
Gunwoo Han
Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text Recognition"

SEE: Towards Semi-Supervised End-to-End Scene Text Recognition Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text

Christian Bartz 572 Jan 05, 2023
Tesseract Open Source OCR Engine (main repository)

Tesseract OCR About This package contains an OCR engine - libtesseract and a command line program - tesseract. Tesseract 4 adds a new neural net (LSTM

48.4k Jan 09, 2023
Learn computer graphics by writing GPU shaders!

This repo contains a selection of projects designed to help you learn the basics of computer graphics. We'll be writing shaders to render interactive two-dimensional and three-dimensional scenes.

Eric Zhang 1.9k Jan 02, 2023
OpenCVを用いたカメラキャリブレーションのサンプルです。2021/06/21時点でPython実装のある3種類(通常カメラ向け、魚眼レンズ向け(fisheyeモジュール)、全方位カメラ向け(omnidirモジュール))について用意しています。

OpenCV-CameraCalibration-Example FishEyeCameraCalibration.mp4 OpenCVを用いたカメラキャリブレーションのサンプルです 2021/06/21時点でPython実装のある以下3種類について用意しています。 通常カメラ向け 魚眼レンズ向け(

KazuhitoTakahashi 34 Nov 17, 2022
pulse2percept: A Python-based simulation framework for bionic vision

pulse2percept: A Python-based simulation framework for bionic vision Retinal degenerative diseases such as retinitis pigmentosa and macular degenerati

67 Dec 29, 2022
OCR engine for all the languages

Description kraken is a turn-key OCR system optimized for historical and non-Latin script material. kraken's main features are: Fully trainable layout

431 Jan 04, 2023
Comparison-of-OCR (KerasOCR, PyTesseract,EasyOCR)

Optical Character Recognition OCR (Optical Character Recognition) is a technology that enables the conversion of document types such as scanned paper

21 Dec 25, 2022
Morphological edge detection or object's boundary detection using erosion and dialation in OpenCV python

Morphologycal-edge-detection-using-erosion-and-dialation the task is to detect object boundary using erosion or dialation . Here, use the kernel or st

Tamzid hasan 3 Nov 25, 2022
🔎 Like Chardet. 🚀 Package for encoding & language detection. Charset detection.

Charset Detection, for Everyone 👋 The Real First Universal Charset Detector A library that helps you read text from an unknown charset encoding. Moti

TAHRI Ahmed R. 332 Dec 31, 2022
SceneCollisionNet This repo contains the code for "Object Rearrangement Using Learned Implicit Collision Functions", an ICRA 2021 paper. For more info

SceneCollisionNet This repo contains the code for "Object Rearrangement Using Learned Implicit Collision Functions", an ICRA 2021 paper. For more info

NVIDIA Research Projects 31 Nov 22, 2022
An Implementation of the seglink alogrithm in paper Detecting Oriented Text in Natural Images by Linking Segments

Tips: A more recent scene text detection algorithm: PixelLink, has been implemented here: https://github.com/ZJULearning/pixel_link Contents: Introduc

dengdan 484 Dec 07, 2022
Generate a list of papers with publicly available source code in the daily arxiv

2021-06-08 paper code optimal network slicing for service-oriented networks with flexible routing and guaranteed e2e latency networkslicing multi-moda

79 Jan 03, 2023
Papers, Datasets, Algorithms, SOTA for STR. Long-time Maintaining

Scene Text Recognition Recommendations Everythin about Scene Text Recognition SOTA • Papers • Datasets • Code Contents 1. Papers 2. Datasets 2.1 Synth

Deep Learning and Vision Computing Lab, SCUT 197 Jan 05, 2023
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition:

Multi-Type-TD-TSR Check it out on Source Code of our Paper: Multi-Type-TD-TSR Extracting Tables from Document Images using a Multi-stage Pipeline for

Pascal Fischer 178 Dec 27, 2022
A tool combining EasyOCR and LaMa to automatically detect text and replace it with an inpainted background.

EasyLaMa (WIP) This is a tool combining EasyOCR and LaMa to automatically detect text and replace it with an inpainted background. Installation For GP

3 Sep 17, 2022
text detection mainly based on ctpn model in tensorflow, id card detect, connectionist text proposal network

text-detection-ctpn Scene text detection based on ctpn (connectionist text proposal network). It is implemented in tensorflow. The origin paper can be

Shaohui Ruan 3.3k Dec 30, 2022
Image Recognition Model Generator

Takes a user-inputted query and generates a machine learning image recognition model that determines if an inputted image is or isn't their query

Christopher Oka 1 Jan 13, 2022
Image processing using OpenCv

Image processing using OpenCv Write a program that opens the webcam, and the user selects one of the following on the video: ✅ If the user presses the

M.Najafi 4 Feb 18, 2022
OpenGait is a flexible and extensible gait recognition project

A flexible and extensible framework for gait recognition. You can focus on designing your own models and comparing with state-of-the-arts easily with the help of OpenGait.

Shiqi Yu 335 Dec 22, 2022
A version of nrsc5-gui that merges the interface developed by cmnybo with the architecture developed by zefie in order to start a new baseline that is not heavily dependent upon Python processing.

NRSC5-DUI is a graphical interface for nrsc5. It makes it easy to play your favorite FM HD radio stations using an RTL-SDR dongle. It will also displa

61 Dec 22, 2022