PaddleBoBo is a project for quickly generating virtual anchors, built on PaddlePaddle and toolkits such as PaddleSpeech and PaddleGAN

Last update: Jan 08, 2023

Overview

PaddleBoBo - In the metaverse era, you too can build a virtual anchor yourself.

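PaddleBoBo is a project for quickly generating virtual anchors, built on the PaddlePaddle deep learning framework and toolkits such as PaddleSpeech and PaddleGAN. It aims to be simple, efficient, and highly reusable: given one image containing a portrait and a piece of text, it quickly generates a video of a virtual anchor. With a small amount of secondary development to change the text input, it can also support real-time video generation and live streaming.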

Application examples

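Runtime environment

Run online on PaddlePaddle AI Studio (strongly recommended: go for the Tesla V100!)
Self-hosted local environment
- Windows 10
- Python 3.7+
- PaddlePaddle >= 2.2.1
- Nvidia GPU with 16 GB+ of video memory (untested; feedback from anyone who owns such a GPU is welcome)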
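The version requirements above can be checked before installing anything else. The following is a minimal sketch, not part of PaddleBoBo; the helper names `parse_version`, `meets_minimum`, and `python_ok` are made up for illustration, and only the leading numeric parts of a version string are compared.

```python
import re
import sys


def parse_version(text):
    """Turn '2.2.1' or '2.2.1.post101' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'post101'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the required minimum."""
    return parse_version(installed) >= parse_version(minimum)


def python_ok():
    """True if the running interpreter satisfies the Python 3.7+ requirement."""
    return sys.version_info >= (3, 7)
```

With PaddlePaddle installed, `meets_minimum(paddle.__version__, "2.2.1")` would check the framework requirement listed above.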

Quick start

1. Install dependencies

pip install ppgan paddlespeech

2. Configuration file (default.yaml)

GANDRIVING:
  FOM_INPUT_IMAGE: './file/input/test.png' # static image containing a face
  FOM_DRIVING_VIDEO: './file/input/zimeng.mp4' # reference video used for expression transfer
  FOM_OUTPUT_VIDEO: './file/input/test.mp4' # output path of the video after expression transfer

SAVEPATH:
  VIDEO_SAVE_PATH: './file/output/video/' # directory where generated virtual anchor videos are saved
  AUDIO_SAVE_PATH: './file/output/audio/' # directory where generated audio is saved
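Before running the later steps, it can help to confirm that the configuration is complete and that the output directories exist. The sketch below is an illustration only and is not taken from the PaddleBoBo source: it assumes default.yaml has already been parsed into a plain dict (for example with PyYAML's `yaml.safe_load`), and the names `REQUIRED_KEYS`, `missing_keys`, and `prepare_output_dirs` are hypothetical.

```python
import os

# Sections and keys shown in the default.yaml example above.
REQUIRED_KEYS = {
    "GANDRIVING": ("FOM_INPUT_IMAGE", "FOM_DRIVING_VIDEO", "FOM_OUTPUT_VIDEO"),
    "SAVEPATH": ("VIDEO_SAVE_PATH", "AUDIO_SAVE_PATH"),
}


def missing_keys(config):
    """Return 'SECTION.KEY' names that are absent or empty in the parsed config."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        values = config.get(section) or {}
        for key in keys:
            if not values.get(key):
                missing.append(f"{section}.{key}")
    return missing


def prepare_output_dirs(config):
    """Create the SAVEPATH directories if they do not exist; return their paths."""
    created = []
    for key in REQUIRED_KEYS["SAVEPATH"]:
        path = config["SAVEPATH"][key]
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created
```

Calling `missing_keys(config)` first and stopping when the returned list is non-empty keeps a typo in the YAML from surfacing only after a long generation run.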

3. Animate the static face

python create_virtual_human.py --config default.yaml

4. Generate a video with the general-purpose version

python general_demo.py \
    --human ./file/input/test.mp4 \
    --output output.mp4 \
    --text 各位开发者大家好，欢迎使用飞桨。

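Parameter	Description
human	Path to the face video generated in step 3
output	Output path for the generated virtual anchor video
text	Text the virtual anchor will speak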
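To generate several clips from different texts, the command above can be driven from Python. This is a minimal sketch under the assumption that general_demo.py accepts exactly the three flags documented in the table; the wrapper functions `build_demo_command` and `run_demo` are hypothetical helpers, not part of the project.

```python
import subprocess
import sys


def build_demo_command(human, output, text, script="general_demo.py"):
    """Build the argument list for one general_demo.py run.

    Passing a list (rather than a shell string) means the text needs no quoting.
    """
    return [
        sys.executable, script,
        "--human", human,
        "--output", output,
        "--text", text,
    ]


def run_demo(human, output, text):
    """Run general_demo.py once and raise CalledProcessError if it fails."""
    subprocess.run(build_demo_command(human, output, text), check=True)
```

For example, `run_demo("./file/input/test.mp4", "output.mp4", "各位开发者大家好，欢迎使用飞桨。")` would reproduce the command shown in step 4 when executed from the repository root.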

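Case library

AI financial news anchor

* Run news_app.py to continuously collect news data from Tonghuashun (同花顺) and generate videos
* Run play.py to play the generated videos in real time and on a loop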
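The loop-playback idea can be illustrated with a small sketch. This is not how play.py is implemented (its source is not shown here); it only assumes that generated clips are .mp4 files in one directory, such as the VIDEO_SAVE_PATH from default.yaml, and the names `playlist` and `next_index` are hypothetical.

```python
import os


def playlist(video_dir, extension=".mp4"):
    """Return the video paths in video_dir, oldest first by modification time."""
    paths = [
        os.path.join(video_dir, name)
        for name in os.listdir(video_dir)
        if name.lower().endswith(extension)
    ]
    return sorted(paths, key=os.path.getmtime)


def next_index(current, total):
    """Advance to the next clip, wrapping back to the first for loop playback."""
    return (current + 1) % total
```

Rebuilding the playlist each time the index wraps around would let newly generated clips join the rotation.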

TODO LIST

I have been a bit tired lately. If you have any ideas, feel free to open an Issue; PRs are welcome too.

https://github.com/JiehangXie/PaddleBoBo/issues

More application examples are under development, and contributions from developers are welcome.