一个多语言支持、易使用的 OCR 项目。An easy-to-use OCR project with multilingual support.

Overview

AgentOCR

GitHub forks GitHub Repo stars Pypi Downloads GitHub release (latest by date including pre-releases) GitHub

Test Build

简介

  • AgentOCR 是一个基于 PaddleOCRONNXRuntime 项目开发的一个使用简单、调用方便的 OCR 项目

  • 本项目目前包含 Python Package 【AgentOCR】 和 OCR 标注软件 【AgentOCRLabeling】

使用指南

  • Python Package:

    • 快速安装:

      # 安装 AgentOCR
      $ pip install agentocr 
      
      # 根据设备平台安装合适版本的 ONNXRuntime
      $ pip install onnxruntime
    • 简单调用:

      # 导入 OCRSystem 模块
      from agentocr import OCRSystem
      
      # 初始化 OCR 模型
      ocr = OCRSystem(config='ch')
      
      # 使用模型对图像进行 OCR 识别
      results = ocr.ocr('test.jpg')
    • 服务器部署:

      • 启动 AgentOCR Server 服务

        $ agentocr server
      • Python 调用

        import cv2
        import json
        import base64
        import requests
        
        # 图片 Base64 编码
        def cv2_to_base64(image):
            data = cv2.imencode('.jpg', image)[1]
            image_base64 = base64.b64encode(data.tobytes()).decode('UTF-8')
            return image_base64
        
        
        # 读取图片
        image = cv2.imread('test.jpg')
        image_base64 = cv2_to_base64(image)
        
        # 构建请求数据
        data = {
            'image': image_base64
        }
        
        # 发送请求
        url = "http://127.0.0.1:5000/ocr"
        r = requests.post(url=url, data=json.dumps(data))
        
        # 打印预测结果
        print(r.json())
    • Jupyter Notebook:【快速使用】

    • 更多安装使用细节请参考:【Package 使用指南】

多语言支持

  • 目前预置了如下语言的配置文件,可通过语言缩写直接调用该配置文件:

    语种 描述 缩写 语种 描述 缩写
    中文 chinese and english ch 保加利亚文 Bulgarian bg
    英文 english en 乌克兰文 Ukranian uk
    法文 french fr 白俄罗斯文 Belarusian be
    德文 german german 泰卢固文 Telugu te
    日文 japan japan 阿巴扎文 Abaza abq
    韩文 korean korean 泰米尔文 Tamil ta
    中文繁体 chinese traditional cht 南非荷兰文 Afrikaans af
    意大利文 Italian it 阿塞拜疆文 Azerbaijani az
    西班牙文 Spanish es 波斯尼亚文 Bosnian bs
    葡萄牙文 Portuguese pt 捷克文 Czech cs
    俄罗斯文 Russia ru 威尔士文 Welsh cy
    阿拉伯文 Arabic ar 丹麦文 Danish da
    印地文 Hindi hi 爱沙尼亚文 Estonian et
    维吾尔 Uyghur ug 爱尔兰文 Irish ga
    波斯文 Persian fa 克罗地亚文 Croatian hr
    乌尔都文 Urdu ur 匈牙利文 Hungarian hu
    塞尔维亚文(latin) Serbian(latin) rs_latin 印尼文 Indonesian id
    欧西坦文 Occitan oc 冰岛文 Icelandic is
    马拉地文 Marathi mr 库尔德文 Kurdish ku
    尼泊尔文 Nepali ne 立陶宛文 Lithuanian lt
    塞尔维亚文(cyrillic) Serbian(cyrillic) rs_cyrillic 拉脱维亚文 Latvian lv
    毛利文 Maori mi 达尔瓦文 Dargwa dar
    马来文 Malay ms 因古什文 Ingush inh
    马耳他文 Maltese mt 拉克文 Lak lbe
    荷兰文 Dutch nl 莱兹甘文 Lezghian lez
    挪威文 Norwegian no 塔巴萨兰文 Tabassaran tab
    波兰文 Polish pl 比尔哈文 Bihari bh
    罗马尼亚文 Romanian ro 迈蒂利文 Maithili mai
    斯洛伐克文 Slovak sk 昂加文 Angika ang
    斯洛文尼亚文 Slovenian sl 孟加拉文 Bhojpuri bho
    阿尔巴尼亚文 Albanian sq 摩揭陀文 Magahi mah
    瑞典文 Swedish sv 那格浦尔文 Nagpur sck
    西瓦希里文 Swahili sw 尼瓦尔文 Newari new
    塔加洛文 Tagalog tl 保加利亚文 Goan Konkani gom
    土耳其文 Turkish tr 沙特阿拉伯文 Saudi Arabia sa
    乌兹别克文 Uzbek uz 阿瓦尔文 Avar ava
    越南文 Vietnamese vi 阿瓦尔文 Avar ava
    蒙古文 Mongolian mn 阿迪赫文 Adyghe ady

预训练模型

  • 检测模型:

    Model Name Model Type Pretrained Model
    ch_ppocr_mobile_v2.0_det det Download
    ch_ppocr_server_v2.0_det det Download
    en_ppocr_mobile_v2.0_det det Download
    en_ppocr_mobile_v2.0_table_det det Download
  • 分类模型:

    Model Name Model Type Pretrained Model
    ch_ppocr_mobile_v2.0_cls cls Download
  • 识别模型:

    Model Name Model Type Pretrained Model
    ch_ppocr_mobile_v2.0_rec rec Download
    ch_ppocr_server_v2.0_rec rec Download
    ka_ppocr_mobile_v2.0_rec rec Download
    te_ppocr_mobile_v2.0_rec rec Download
    ta_ppocr_mobile_v2.0_rec rec Download
    cht_ppocr_mobile_v2.0_rec rec Download
    japan_ppocr_mobile_v2.0_rec rec Download
    latin_ppocr_mobile_v2.0_rec rec Download
    arabic_ppocr_mobile_v2.0_rec rec Download
    korean_ppocr_mobile_v2.0_rec rec Download
    french_ppocr_mobile_v2.0_rec rec Download
    german_ppocr_mobile_v2.0_rec rec Download
    cyrillic_ppocr_mobile_v2.0_rec rec Download
    en_ppocr_mobile_v2.0_table_rec rec Download
    en_ppocr_mobile_v2.0_number_rec rec Download
    devanagari_ppocr_mobile_v2.0_rec rec Download
You might also like...
XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale

XtremeDistilTransformers for Distilling Massive Multilingual Neural Networks ACL 2020 Microsoft Research [Paper] [Video] Releasing [XtremeDistilTransf

Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0.

Smaller Multilingual Transformers This repository shares smaller versions of multilingual transformers that keep the same representations offered by t

Code for the paper "Balancing Training for Multilingual Neural Machine Translation, ACL 2020"

Balancing Training for Multilingual Neural Machine Translation Implementation of the paper Balancing Training for Multilingual Neural Machine Translat

Code Repo for the ACL21 paper
Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning"

Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning This is the Github repository of our paper, "Common S

Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.

One model to speak them all 🌎 Audio Language Text ▷ Chinese 人人生而自由,在尊严和权利上一律平等。 ▷ English All human beings are born free and equal in dignity and rig

 Contrastive Learning for Many-to-many Multilingual Neural Machine Translation(mCOLT/mRASP2), ACL2021
Contrastive Learning for Many-to-many Multilingual Neural Machine Translation(mCOLT/mRASP2), ACL2021

Contrastive Learning for Many-to-many Multilingual Neural Machine Translation(mCOLT/mRASP2), ACL2021 The code for training mCOLT/mRASP2, a multilingua

A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks

pysentimiento: A Python toolkit for Sentiment Analysis and Social NLP tasks A Transformer-based library for SocialNLP classification tasks. Currently

A multilingual version of MS MARCO passage ranking dataset

mMARCO A multilingual version of MS MARCO passage ranking dataset This repository presents a neural machine translation-based method for translating t

Comments
  • linux下的预测速度是否与windows存在差异

    linux下的预测速度是否与windows存在差异

    请问大佬有没有测试过在linux运行agentocr与windows下的性能差异,感觉速度差距有点大,条件限制没办法比对完全一样的硬件环境,只是猜测是不是和系统有关?

    1. 以下环境都是基于python3.7.10,agentocr 1.2.0,预测同一张图片
    2. 本地环境是笔记本电脑,win10,CPU是AMD Ryzen 7 5800H,8核16线程,预测得到结果耗时是2.5秒以内
    3. 一台Linux服务器,Centos7,是由Intel(R) Xeon(R) CPU E5-2680 v4划出来的4核虚拟机,预测得到结果耗时是在7.2秒以内
    4. 由上面笔记本电脑运行的VirtualBOX划分了4个CPU(VB上面显示有16个CPU,猜测应该是划分了4个核心线程出来)的虚拟机,Centos7,预测得到结果耗时也和第二点的linux服务器接近
    5. 一台windows服务器,winserver2012,是由Intel i7-8700划出来的2核虚拟机,预测得到结果耗时是在4.7秒以内

    image image

    从任务运行情况来看,windows环境下在任务管理器可以看出,预测过程中所有核心都是参与工作的 而linux环境通过top命令能看出CPU占用最高只能到200%,理论上4核心应该能到400%,是不是所有核心没有参与工作导致预测速度比较慢?条件有限,笔记本的CPU和台式服务器的CPU也没有直接的性能比较可以参考,但即便是比较旧的服务器CPU也不会跟7nm的AMD笔记本CPU有这么大差距吧,如果有大佬们测试过或者知道原因希望能告知一下!!

    opened by w-Bro 10
  • 【PaddlePaddle Hackathon】100 制作 Rubick 深度学习相关小插件

    【PaddlePaddle Hackathon】100 制作 Rubick 深度学习相关小插件

    (此 ISSUE 为 PaddlePaddle Hackathon 活动的任务 ISSUE,更多详见PaddlePaddle Hackathon

    【任务说明】

    • 任务标题:制作 Rubick 深度学习相关小插件

    • 难度:中等(通过验收即可获得5000RMB)

    • 技术标签:JavaScript、PaddlePaddle

    • 详细描述:随着 Rubick、Utools 等高质量桌面效能工具箱的出现,使用深度学习进行赋能将会为其带来更多有趣的玩法。在本任务中,您可以借助 AgentOCR 或其他飞桨相关深度学习工具,结合 Paddle.JS 或 ONNX.JS 将深度学习模型以 Rubick 插件形式进行部署,例如使用 AgentOCR 的 OCR 能力让 Rubick 的截图拥有文字识别能力,当然你也可以选择自己喜欢的模型为 Rubick 进行赋能,只要以 Rubick 的插件形式进行开发即可视为有效提交。

    Paddle.JS 主页:https://github.com/PaddlePaddle/Paddle.js

    AgentOCR 主页:https://github.com/AgentMaker/AgentOCR

    【提交内容】

    • 项目 PR 到 AgentOCR

    • 技术说明文档

    【技术要求】

    • 具备的 JavaScript 开发能力
    PaddlePaddle Hackathon 
    opened by GT-ZhangAcer 0
  • 【PaddlePaddle Hackathon】99 为 AgentOCR 工具适配 JavaScript 环境

    【PaddlePaddle Hackathon】99 为 AgentOCR 工具适配 JavaScript 环境

    (此 ISSUE 为 PaddlePaddle Hackathon 活动的任务 ISSUE,更多详见PaddlePaddle Hackathon

    【任务说明】

    • 任务标题:为 AgentOCR 工具适配 JavaScript 环境

    • 技术标签:JavaScript

    • 任务难度:简单

    • 详细描述:在 Web 前端以及、移动端 APP 开发甚至是桌面应用开发中, JavaScript 所体现的强大兼容性使得跨平台应用更加便捷。目前 AgentOCR 提供了飞桨 PaddlePaddle、ONNX、DML 三种后端支持,为更方便让基于 PaddleOCR 的 AgentOCR 更好适配更多开发者所需环境,我们可以通过不限于 Paddle.JS、ONNX.JS 中任一方式使得其支持JavaScript的OCR推理功能。本这个项目中,你需要在精度损失和速度损失较低的情况下制作 Paddle.JS 或 ONNX.JS 版本的 AgentOCR 开发程序包。

    Paddle.JS 主页:https://github.com/PaddlePaddle/Paddle.js

    AgentOCR 主页:https://github.com/AgentMaker/AgentOCR

    【提交内容】

    • 项目 PR 到 AgentOCR

    • 技术说明文档

    【技术要求】

    • 具备的 JavaScript 开发能力
    PaddlePaddle Hackathon 
    opened by GT-ZhangAcer 0
Releases(2.0.0)
Owner
AgentMaker
Focus on deep learning tools
AgentMaker
A trusty face recognition research platform developed by Tencent Youtu Lab

Introduction TFace: A trusty face recognition research platform developed by Tencent Youtu Lab. It provides a high-performance distributed training fr

Tencent 956 Jan 01, 2023
Code for the paper "There is no Double-Descent in Random Forests"

Code for the paper "There is no Double-Descent in Random Forests" This repository contains the code to run the experiments for our paper called "There

2 Jan 14, 2022
Hand Gesture Volume Control is AIML based project which uses image processing to control the volume of your Computer.

Hand Gesture Volume Control Modules There are basically three modules Handtracking Program Handtracking Module Volume Control Program Handtracking Pro

VITTAL 1 Jan 12, 2022
CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation [arxiv] This is the official repository for CDTrans: Cross-domain Transformer for

238 Dec 22, 2022
Using fully convolutional networks for semantic segmentation with caffe for the cityscapes dataset

Using fully convolutional networks for semantic segmentation (Shelhamer et al.) with caffe for the cityscapes dataset How to get started Download the

Simon Guist 27 Jun 06, 2022
DIRL: Domain-Invariant Representation Learning

DIRL: Domain-Invariant Representation Learning Domain-Invariant Representation Learning (DIRL) is a novel algorithm that semantically aligns both the

Ajay Tanwani 30 Nov 07, 2022
TorchGRL is the source code for our paper Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Mixed Traffic Environments for IV 2022.

TorchGRL TorchGRL is the source code for our paper Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Mixed Traffi

XXQQ 42 Dec 09, 2022
Explainability for Vision Transformers (in PyTorch)

Explainability for Vision Transformers (in PyTorch) This repository implements methods for explainability in Vision Transformers

Jacob Gildenblat 442 Jan 04, 2023
Few-Shot Object Detection via Association and DIscrimination

Few-Shot Object Detection via Association and DIscrimination Code release of our NeurIPS 2021 paper: Few-Shot Object Detection via Association and DIs

Cao Yuhang 49 Dec 18, 2022
"Domain Adaptive Semantic Segmentation without Source Data" (ACM MM 2021)

LDBE Pytorch implementation for two papers (the paper will be released soon): "Domain Adaptive Semantic Segmentation without Source Data", ACM MM2021.

benfour 16 Sep 28, 2022
SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data

SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data Au

14 Nov 28, 2022
NDE: Climate Modeling with Neural Diffusion Equation, ICDM'21

Climate Modeling with Neural Diffusion Equation Introduction This is the repository of our accepted ICDM 2021 paper "Climate Modeling with Neural Diff

Jeehyun Hwang 5 Dec 18, 2022
[NeurIPS 2021] Low-Rank Subspaces in GANs

Low-Rank Subspaces in GANs Figure: Image editing results using LowRankGAN on StyleGAN2 (first three columns) and BigGAN (last column). Low-Rank Subspa

112 Dec 28, 2022
PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition, CVPR 2018

PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place

Mikaela Uy 294 Dec 12, 2022
Tracking Progress in Question Answering over Knowledge Graphs

Tracking Progress in Question Answering over Knowledge Graphs Table of contents Question Answering Systems with Descriptions The QA Systems Table cont

Knowledge Graph Question Answering 47 Jan 02, 2023
Learning to Self-Train for Semi-Supervised Few-Shot

Learning to Self-Train for Semi-Supervised Few-Shot Classification This repository contains the TensorFlow implementation for NeurIPS 2019 Paper "Lear

86 Dec 29, 2022
Interactive web apps created using geemap and streamlit

geemap-apps Introduction This repo demostrates how to build a multi-page Earth Engine App using streamlit and geemap. You can deploy the app on variou

Qiusheng Wu 27 Dec 23, 2022
Unofficial implementation of the ImageNet, CIFAR 10 and SVHN Augmentation Policies learned by AutoAugment using pillow

AutoAugment - Learning Augmentation Policies from Data Unofficial implementation of the ImageNet, CIFAR10 and SVHN Augmentation Policies learned by Au

Philip Popien 1.3k Jan 02, 2023
Repo for the paper Extrapolating from a Single Image to a Thousand Classes using Distillation

Extrapolating from a Single Image to a Thousand Classes using Distillation by Yuki M. Asano* and Aaqib Saeed* (*Equal Contribution) Extrapolating from

Yuki M. Asano 16 Nov 04, 2022
Code for DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents

DeepXML Code for DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents Architectures and algorithms DeepXML supports

Extreme Classification 49 Nov 06, 2022