An easy-to-use OCR project with multilingual support.

Last update: Nov 10, 2022

Overview

AgentOCR

Introduction

AgentOCR is an easy-to-use, easy-to-call OCR project built on PaddleOCR and ONNXRuntime.
The project currently includes the Python package [AgentOCR] and the OCR labeling tool [AgentOCRLabeling].

Usage guide

Python Package:

Quick install:

# Install AgentOCR
$ pip install agentocr

# Install the ONNXRuntime build that matches your device and platform
$ pip install onnxruntime

Basic usage:

# Import OCRSystem
from agentocr import OCRSystem

# Initialize the OCR model
ocr = OCRSystem(config='ch')

# Run OCR on an image
results = ocr.ocr('test.jpg')
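
The single-image call above extends naturally to a folder of images. The following is a minimal sketch, not part of AgentOCR: `is_image`, `run_batch`, and the suffix list are hypothetical helpers, and the recognizer is passed in as a plain callable so the helper makes no assumption about what `ocr.ocr` returns.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust to the formats you actually use.
IMAGE_SUFFIXES = {'.jpg', '.jpeg', '.png', '.bmp'}


def is_image(name):
    """Return True if the file name has one of the assumed image suffixes."""
    return Path(name).suffix.lower() in IMAGE_SUFFIXES


def run_batch(recognize, folder):
    """Call recognize(path) for every image file in folder.

    recognize is any callable that takes a file path string, for example
    the ocr.ocr method shown above. Results are keyed by file name.
    """
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and is_image(path.name):
            results[path.name] = recognize(str(path))
    return results
```

With AgentOCR installed, `run_batch(OCRSystem(config='ch').ocr, 'images')` would process every image in a hypothetical `images` folder; each value is whatever `ocr.ocr` returns for that file.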

Server deployment:

Start the AgentOCR Server service
```
$ agentocr server
```

Calling from Python

import cv2
import json
import base64
import requests

# Encode an image as Base64
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    image_base64 = base64.b64encode(data.tobytes()).decode('UTF-8')
    return image_base64


# Read the image
image = cv2.imread('test.jpg')
image_base64 = cv2_to_base64(image)

# Build the request data
data = {
    'image': image_base64
}

# Send the request
url = "http://127.0.0.1:5000/ocr"
r = requests.post(url=url, data=json.dumps(data))

# Print the prediction result
print(r.json())
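
The client above can be wrapped in a small reusable function. This is a sketch under stated assumptions: `build_payload` and `post_image` are hypothetical names, the file's bytes are sent as-is instead of being re-encoded with OpenCV (which assumes the server accepts the file's own encoding, as it would for a JPEG), and the timeout value is arbitrary. The URL and the `image` field come from the example above.

```python
import base64
import json


def build_payload(image_bytes):
    """Return the JSON request body for already-encoded image bytes."""
    image_base64 = base64.b64encode(image_bytes).decode('utf-8')
    return json.dumps({'image': image_base64})


def post_image(path, url='http://127.0.0.1:5000/ocr', timeout=30):
    """Send one image file to a running AgentOCR server and return its JSON reply."""
    import requests  # third-party; imported here so build_payload works without it

    with open(path, 'rb') as f:
        body = build_payload(f.read())
    response = requests.post(url=url, data=body, timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    return response.json()
```

Calling `post_image('test.jpg')` while `agentocr server` is running should return the same kind of result as the script above, with a timeout and an explicit HTTP status check added.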

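Jupyter Notebook: [Quick Start]
For more installation and usage details, see: [Package Usage Guide]

Multilingual support

Configuration files for the following languages are preset and can be selected directly by language abbreviation: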

Language	Abbreviation	Language	Abbreviation
Chinese and English	ch	Bulgarian	bg
English	en	Ukrainian	uk
French	fr	Belarusian	be
German	german	Telugu	te
Japanese	japan	Abaza	abq
Korean	korean	Tamil	ta
Traditional Chinese	cht	Afrikaans	af
Italian	it	Azerbaijani	az
Spanish	es	Bosnian	bs
Portuguese	pt	Czech	cs
Russian	ru	Welsh	cy
Arabic	ar	Danish	da
Hindi	hi	Estonian	et
Uyghur	ug	Irish	ga
Persian	fa	Croatian	hr
Urdu	ur	Hungarian	hu
Serbian (Latin)	rs_latin	Indonesian	id
Occitan	oc	Icelandic	is
Marathi	mr	Kurdish	ku
Nepali	ne	Lithuanian	lt
Serbian (Cyrillic)	rs_cyrillic	Latvian	lv
Maori	mi	Dargwa	dar
Malay	ms	Ingush	inh
Maltese	mt	Lak	lbe
Dutch	nl	Lezghian	lez
Norwegian	no	Tabassaran	tab
Polish	pl	Bihari	bh
Romanian	ro	Maithili	mai
Slovak	sk	Angika	ang
Slovenian	sl	Bhojpuri	bho
Albanian	sq	Magahi	mah
Swedish	sv	Nagpur	sck
Swahili	sw	Newari	new
Tagalog	tl	Goan Konkani	gom
Turkish	tr	Saudi Arabia	sa
Uzbek	uz	Avar	ava
Vietnamese	vi	Adyghe	ady
Mongolian	mn
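
Selecting a preset by abbreviation can be sketched as follows. The helper names `resolve_config` and `build_ocr` and the dictionary are hypothetical, the dictionary holds only a subset of the table above, and the sketch assumes the installed AgentOCR version still accepts these abbreviations as the `config` argument, as in the basic usage example.

```python
# Subset of the abbreviation table above; extend as needed.
LANGUAGE_CONFIGS = {
    'chinese': 'ch',
    'english': 'en',
    'french': 'fr',
    'german': 'german',
    'japanese': 'japan',
    'korean': 'korean',
    'traditional chinese': 'cht',
    'russian': 'ru',
    'arabic': 'ar',
}


def resolve_config(language):
    """Map a language name or abbreviation to a preset config abbreviation."""
    key = language.strip().lower()
    if key in LANGUAGE_CONFIGS:
        return LANGUAGE_CONFIGS[key]
    if key in LANGUAGE_CONFIGS.values():
        return key  # already an abbreviation from the table
    raise ValueError(f'no preset config known for {language!r}')


def build_ocr(language):
    """Create an OCRSystem for the given language (requires agentocr)."""
    from agentocr import OCRSystem  # imported lazily so resolve_config works without it
    return OCRSystem(config=resolve_config(language))
```

For example, `build_ocr('Japanese')` would be equivalent to `OCRSystem(config='japan')`.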

Pretrained models

Detection models:

Model Name	Model Type	Pretrained Model
ch_ppocr_mobile_v2.0_det	det	Download
ch_ppocr_server_v2.0_det	det	Download
en_ppocr_mobile_v2.0_det	det	Download
en_ppocr_mobile_v2.0_table_det	det	Download

Classification model:

Model Name	Model Type	Pretrained Model
ch_ppocr_mobile_v2.0_cls	cls	Download

Recognition models:

Model Name	Model Type	Pretrained Model
ch_ppocr_mobile_v2.0_rec	rec	Download
ch_ppocr_server_v2.0_rec	rec	Download
ka_ppocr_mobile_v2.0_rec	rec	Download
te_ppocr_mobile_v2.0_rec	rec	Download
ta_ppocr_mobile_v2.0_rec	rec	Download
cht_ppocr_mobile_v2.0_rec	rec	Download
japan_ppocr_mobile_v2.0_rec	rec	Download
latin_ppocr_mobile_v2.0_rec	rec	Download
arabic_ppocr_mobile_v2.0_rec	rec	Download
korean_ppocr_mobile_v2.0_rec	rec	Download
french_ppocr_mobile_v2.0_rec	rec	Download
german_ppocr_mobile_v2.0_rec	rec	Download
cyrillic_ppocr_mobile_v2.0_rec	rec	Download
en_ppocr_mobile_v2.0_table_rec	rec	Download
en_ppocr_mobile_v2.0_number_rec	rec	Download
devanagari_ppocr_mobile_v2.0_rec	rec	Download


Comments

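Does prediction speed on Linux differ from Windows?
Has anyone tested the performance difference between running agentocr on Linux and on Windows? The speed gap feels rather large. Practical constraints mean I cannot compare on identical hardware, so I am only guessing: could it be related to the operating system?

All of the environments below use Python 3.7.10 and agentocr 1.2.0, and run prediction on the same image.

My local machine is a laptop running Windows 10 with an AMD Ryzen 7 5800H (8 cores, 16 threads); prediction takes under 2.5 seconds.

A Linux server running CentOS 7, a 4-core virtual machine carved out of an Intel(R) Xeon(R) CPU E5-2680 v4; prediction takes under 7.2 seconds.

A CentOS 7 virtual machine in VirtualBox on the laptop above, assigned 4 CPUs (VirtualBox shows 16 CPUs, so I assume 4 hardware threads were assigned); prediction time is close to that of the Linux server in the second item.

A Windows server running Windows Server 2012, a 2-core virtual machine carved out of an Intel i7-8700; prediction takes under 4.7 seconds.

Judging by how the task runs: on Windows, Task Manager shows all cores working during prediction, whereas on Linux the top command shows CPU usage peaking at only 200%, although 4 cores should in theory reach 400%. Is prediction slow because not all cores are taking part? Conditions are limited, and I have no direct performance comparison between the laptop CPU and the server CPUs to refer to, but even a fairly old server CPU should not be this far behind a 7nm AMD laptop CPU. If anyone has tested this or knows the cause, please let me know!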
opened by w-Bro 10
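
A comparison like the one in this issue is easier to interpret with the usable CPU count and a repeatable timing for each machine. The sketch below uses only the standard library; `available_cpus` and `time_call` are hypothetical helper names, and nothing here is AgentOCR-specific. ONNXRuntime itself exposes a thread setting (`SessionOptions.intra_op_num_threads`), but whether AgentOCR lets callers change it is not stated here.

```python
import os
import time


def available_cpus():
    """Return the number of CPUs this process is allowed to run on."""
    if hasattr(os, 'sched_getaffinity'):  # available on Linux
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def time_call(fn, repeats=5):
    """Run fn several times and return the best wall-clock time in seconds."""
    best = float('inf')
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


print('logical CPUs reported by the OS:', os.cpu_count())
print('CPUs usable by this process:', available_cpus())
```

Timing the same call on each machine, for example `time_call(lambda: ocr.ocr('test.jpg'))` with an `ocr` object created as in the usage guide, gives numbers that can be set against the usable CPU count.
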
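[PaddlePaddle Hackathon] 100 Build deep learning plugins for Rubick
(This issue is a task issue for the PaddlePaddle Hackathon; see PaddlePaddle Hackathon for details.)

[Task description]

Task title: Build deep learning plugins for Rubick

Difficulty: medium (5000 RMB awarded on passing acceptance)

Tech tags: JavaScript, PaddlePaddle

Detailed description: With the emergence of high-quality desktop productivity toolboxes such as Rubick and Utools, adding deep learning to them opens up more interesting possibilities. In this task, you can use AgentOCR or other PaddlePaddle-related deep learning tools, together with Paddle.JS or ONNX.JS, to deploy a deep learning model as a Rubick plugin, for example using AgentOCR's OCR capability to give Rubick's screenshot feature text recognition. You may also pick any model you like to enhance Rubick; any submission developed as a Rubick plugin counts as valid.

Paddle.JS homepage: https://github.com/PaddlePaddle/Paddle.js

AgentOCR homepage: https://github.com/AgentMaker/AgentOCR

[Deliverables]

A project PR to AgentOCR

Technical documentation

[Technical requirements]

JavaScript development skills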

PaddlePaddle Hackathon
opened by GT-ZhangAcer 0
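[PaddlePaddle Hackathon] 99 Adapt the AgentOCR tool to JavaScript environments
(This issue is a task issue for the PaddlePaddle Hackathon; see PaddlePaddle Hackathon for details.)

[Task description]

Task title: Adapt the AgentOCR tool to JavaScript environments

Tech tags: JavaScript

Difficulty: easy

Detailed description: In web front-end, mobile app, and even desktop application development, JavaScript's strong compatibility makes cross-platform applications more convenient. AgentOCR currently supports three backends: PaddlePaddle, ONNX, and DML. To help the PaddleOCR-based AgentOCR fit the environments more developers need, JavaScript OCR inference can be supported through Paddle.JS, ONNX.JS, or other means. In this project, you need to build a Paddle.JS or ONNX.JS version of the AgentOCR package with low loss in accuracy and speed.

Paddle.JS homepage: https://github.com/PaddlePaddle/Paddle.js

AgentOCR homepage: https://github.com/AgentMaker/AgentOCR

[Deliverables]

A project PR to AgentOCR

Technical documentation

[Technical requirements]

JavaScript development skills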

PaddlePaddle Hackathon
opened by GT-ZhangAcer 0

Releases(2.0.0)

2.0.0(Sep 29, 2021)
Note:

Model files for 2.x and 1.x versions are not compatible with each other
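
A guard against mixing model files across major versions could look like the sketch below. All three function names are hypothetical, and the idea of tagging model files with the AgentOCR version they were made for is an assumption for illustration, not something the project is known to provide.

```python
from importlib import metadata


def major_version(version_string):
    """Return the leading integer of a version string such as '1.3.0'."""
    return int(version_string.split('.')[0])


def models_compatible(package_version, model_version):
    """Treat model files as usable only within the same major series."""
    return major_version(package_version) == major_version(model_version)


def installed_agentocr_version():
    """Return the installed agentocr version, or None if it is not installed."""
    try:
        return metadata.version('agentocr')
    except metadata.PackageNotFoundError:
        return None
```

For example, `models_compatible('2.0.0', '1.3.0')` is false, which matches the note above.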

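Updates:

Added PaddleOCR v2 models

Optimized the recognition model dictionaries

Removed the built-in fonts and JSON configuration files

Multilingual support now switches by language type rather than by specific language

Wheel package size reduced to about 100 KB

Added the Chinese license plate detection and recognition subproject [AgentCLPR]

The OCR labeling software supports text in more languages

Multi-platform executable labeling software [Coming soon]

OCR graphical interface [Coming soon]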

Source code(tar.gz)
Source code(zip)
agentocr-2.0.0-py3-none-any.whl(105.09 KB)
1.3.0(Sep 2, 2021)
Optimized the recognition code and aligned recognition model accuracy

Source code(tar.gz)
Source code(zip)
agentocr-1.3.0-py3-none-any.whl(12.27 MB)
1.2.0(Aug 23, 2021)
Added server deployment

Fixed a bug in the API when detection is disabled

Added API comments

Source code(tar.gz)
Source code(zip)
agentocr-1.2.0-py3-none-any.whl(12.27 MB)
1.1.3(Aug 21, 2021)
Improved the command-line interface

Renamed code directories

Source code(tar.gz)
Source code(zip)
agentocr-1.1.3-py3-none-any.whl(12.27 MB)
1.1.2(Aug 20, 2021)
Improved log messages

Removed unused configuration options

Updated documentation

Source code(tar.gz)
Source code(zip)
agentocr-1.1.2-py3-none-any.whl(12.28 MB)
1.1.1(Aug 20, 2021)
Configuration options can now be overridden directly through the API

Classification is now disabled by default

Source code(tar.gz)
Source code(zip)
agentocr-1.1.1-py3-none-any.whl(12.28 MB)
1.0.0(Aug 18, 2021)
Initial release

Source code(tar.gz)
Source code(zip)
agentocr-1.0.0-py3-none-any.whl(12.27 MB)

Owner

AgentMaker

Focus on deep learning tools

