Rotational region detection based on Faster-RCNN.

Overview

R2CNN_Faster_RCNN_Tensorflow

Abstract

This is a tensorflow re-implementation of R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection.
It should be noted that we did not re-implementate exactly as the paper and just adopted its idea.

This project is based on Faster-RCNN, and completed by YangXue and YangJirui.

DOTA test results

1

Comparison

Part of the results are from DOTA paper.

Task1 - Oriented Leaderboard

Approaches mAP PL BD BR GTF SV LV SH TC BC ST SBF RA HA SP HC
SSD 10.59 39.83 9.09 0.64 13.18 0.26 0.39 1.11 16.24 27.57 9.23 27.16 9.09 3.03 1.05 1.01
YOLOv2 21.39 39.57 20.29 36.58 23.42 8.85 2.09 4.82 44.34 38.35 34.65 16.02 37.62 47.23 25.5 7.45
R-FCN 26.79 37.8 38.21 3.64 37.26 6.74 2.6 5.59 22.85 46.93 66.04 33.37 47.15 10.6 25.19 17.96
FR-H 36.29 47.16 61 9.8 51.74 14.87 12.8 6.88 56.26 59.97 57.32 47.83 48.7 8.23 37.25 23.05
FR-O 52.93 79.09 69.12 17.17 63.49 34.2 37.16 36.2 89.19 69.6 58.96 49.4 52.52 46.69 44.8 46.3
R2CNN 60.67 80.94 65.75 35.34 67.44 59.92 50.91 55.81 90.67 66.92 72.39 55.06 52.23 55.14 53.35 48.22
RRPN 61.01 88.52 71.20 31.66 59.30 51.85 56.19 57.25 90.81 72.84 67.38 56.69 52.84 53.08 51.94 53.58
ICN 68.20 81.40 74.30 47.70 70.30 64.90 67.80 70.00 90.80 79.10 78.20 53.60 62.90 67.00 64.20 50.20
R2CNN++ 71.16 89.66 81.22 45.50 75.10 68.27 60.17 66.83 90.90 80.69 86.15 64.05 63.48 65.34 68.01 62.05

Task2 - Horizontal Leaderboard

Approaches mAP PL BD BR GTF SV LV SH TC BC ST SBF RA HA SP HC
SSD 10.94 44.74 11.21 6.22 6.91 2 10.24 11.34 15.59 12.56 17.94 14.73 4.55 4.55 0.53 1.01
YOLOv2 39.2 76.9 33.87 22.73 34.88 38.73 32.02 52.37 61.65 48.54 33.91 29.27 36.83 36.44 38.26 11.61
R-FCN 47.24 79.33 44.26 36.58 53.53 39.38 34.15 47.29 45.66 47.74 65.84 37.92 44.23 47.23 50.64 34.9
FR-H 60.46 80.32 77.55 32.86 68.13 53.66 52.49 50.04 90.41 75.05 59.59 57 49.81 61.69 56.46 41.85
R2CNN - - - - - - - - - - - - - - - -
FPN 72.00 88.70 75.10 52.60 59.20 69.40 78.80 84.50 90.60 81.30 82.60 52.50 62.10 76.60 66.30 60.10
ICN 72.50 90.00 77.70 53.40 73.30 73.50 65.00 78.20 90.80 79.10 84.80 57.20 62.10 73.50 70.20 58.10
R2CNN++ 75.35 90.18 81.88 55.30 73.29 72.09 77.65 78.06 90.91 82.44 86.39 64.53 63.45 75.77 78.21 60.11

Face Detection

Environment: NVIDIA GeForce GTX 1060
2

ICDAR2015

3

Requirements

1、tensorflow >= 1.2
2、cuda8.0
3、python2.7 (anaconda2 recommend)
4、opencv(cv2)

Download Model

1、please download resnet50_v1resnet101_v1 pre-trained models on Imagenet, put it to data/pretrained_weights.
2、please download mobilenet_v2 pre-trained model on Imagenet, put it to data/pretrained_weights/mobilenet.
3、please download trained model by this project, put it to output/trained_weights.

Data Prepare

1、please download DOTA
2、crop data, reference:

cd $PATH_ROOT/data/io/DOTA
python train_crop.py 
python val_crop.py

3、data format

├── VOCdevkit
│   ├── VOCdevkit_train
│       ├── Annotation
│       ├── JPEGImages
│    ├── VOCdevkit_test
│       ├── Annotation
│       ├── JPEGImages

Compile

cd $PATH_ROOT/libs/box_utils/
python setup.py build_ext --inplace
cd $PATH_ROOT/libs/box_utils/cython_utils
python setup.py build_ext --inplace

Demo

Select a configuration file in the folder (libs/configs/) and copy its contents into cfgs.py, then download the corresponding weights.

DOTA

python demo_rh.py --src_folder='/PATH/TO/DOTA/IMAGES_ORIGINAL/' 
                  --image_ext='.png' 
                  --des_folder='/PATH/TO/SAVE/RESULTS/' 
                  --save_res=False
                  --gpu='0'

FDDB

python camera_demo.py --gpu='0'

Eval

python eval.py --img_dir='/PATH/TO/DOTA/IMAGES/' 
               --image_ext='.png' 
               --test_annotation_path='/PATH/TO/TEST/ANNOTATION/'
               --gpu='0'

Inference

python inference.py --data_dir='/PATH/TO/DOTA/IMAGES_CROP/'      
                    --gpu='0'

Train

1、If you want to train your own data, please note:

(1) Modify parameters (such as CLASS_NUM, DATASET_NAME, VERSION, etc.) in $PATH_ROOT/libs/configs/cfgs.py
(2) Add category information in $PATH_ROOT/libs/label_name_dict/lable_dict.py     
(3) Add data_name to line 75 of $PATH_ROOT/data/io/read_tfrecord.py 

2、make tfrecord

cd $PATH_ROOT/data/io/  
python convert_data_to_tfrecord.py --VOC_dir='/PATH/TO/VOCdevkit/VOCdevkit_train/' 
                                   --xml_dir='Annotation'
                                   --image_dir='JPEGImages'
                                   --save_name='train' 
                                   --img_format='.png' 
                                   --dataset='DOTA'

3、train

cd $PATH_ROOT/tools
python train.py

Tensorboard

cd $PATH_ROOT/output/summary
tensorboard --logdir=.

Citation

Some relevant achievements based on this code.

@article{[yang2018position](https://ieeexplore.ieee.org/document/8464244),
	title={Position Detection and Direction Prediction for Arbitrary-Oriented Ships via Multitask Rotation Region Convolutional Neural Network},
	author={Yang, Xue and Sun, Hao and Sun, Xian and  Yan, Menglong and Guo, Zhi and Fu, Kun},
	journal={IEEE Access},
	volume={6},
	pages={50839-50849},
	year={2018},
	publisher={IEEE}
}

@article{[yang2018r-dfpn](http://www.mdpi.com/2072-4292/10/1/132),
	title={Automatic ship detection in remote sensing images from google earth of complex scenes based on multiscale rotation dense feature pyramid networks},
	author={Yang, Xue and Sun, Hao and Fu, Kun and Yang, Jirui and Sun, Xian and Yan, Menglong and Guo, Zhi},
	journal={Remote Sensing},
	volume={10},
	number={1},
	pages={132},
	year={2018},
	publisher={Multidisciplinary Digital Publishing Institute}
} 
Owner
UCAS-Det
UCAS-Det
利用Paddle框架复现CRAFT

CRAFT-Paddle 利用Paddle框架复现CRAFT CRAFT 本项目基于paddlepaddle框架复现CRAFT,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待 参考项目: CRAFT: Character-Region Awarenes

QuanHao Guo 2 Mar 07, 2022
Automatically resolve RidderMaster based on TensorFlow & OpenCV

AutoRiddleMaster Automatically resolve RidderMaster based on TensorFlow & OpenCV 基于 TensorFlow 和 OpenCV 实现的全自动化解御迷士小马谜题 Demo How to use Deploy the ser

神龙章轩 5 Nov 19, 2021
This tool will help you convert your text to handwriting xD

So your teacher asked you to upload written assignments? Hate writing assigments? This tool will help you convert your text to handwriting xD

Saurabh Daware 4.2k Jan 07, 2023
This project is basically to draw lines with your hand, using python, opencv, mediapipe.

Paint Opencv 📷 This project is basically to draw lines with your hand, using python, opencv, mediapipe. Screenshoots 📱 Tools ⚙️ Python Opencv Mediap

Williams Ismael Bobadilla Torres 3 Nov 17, 2021
POT : Python Optimal Transport

This open source Python library provide several solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning.

Python Optimal Transport 1.7k Jan 04, 2023
TextBoxes: A Fast Text Detector with a Single Deep Neural Network https://github.com/MhLiao/TextBoxes 基于SSD改进的文本检测算法,textBoxes_note记录了之前整理的笔记。

TextBoxes: A Fast Text Detector with a Single Deep Neural Network Introduction This paper presents an end-to-end trainable fast scene text detector, n

zhangjing1 24 Apr 28, 2022
Tesseract Open Source OCR Engine (main repository)

Tesseract OCR About This package contains an OCR engine - libtesseract and a command line program - tesseract. Tesseract 4 adds a new neural net (LSTM

48.4k Jan 09, 2023
Code for the "Sensing leg movement enhances wearable monitoring of energy expenditure" paper.

EnergyExpenditure Code for the "Sensing leg movement enhances wearable monitoring of energy expenditure" paper. Additional data for replicating this s

Patrick S 42 Oct 26, 2022
Indonesian ID Card OCR using tesseract OCR

KTP OCR Indonesian ID Card OCR using tesseract OCR KTP OCR is python-flask with tesseract web application to convert Indonesian ID Card to text / JSON

Revan Muhammad Dafa 5 Dec 06, 2021
A python script based on opencv and paddleocr, which can automatically pick up tasks, make cookies, and receive rewards in the Destiny 2 Dawning Oven

A python script based on opencv and paddleocr, which can automatically pick up tasks, make cookies, and receive rewards in the Destiny 2 Dawning Oven

1 Dec 22, 2021
kaldi-asr/kaldi is the official location of the Kaldi project.

Kaldi Speech Recognition Toolkit To build the toolkit: see ./INSTALL. These instructions are valid for UNIX systems including various flavors of Linux

Kaldi 12.3k Jan 05, 2023
A webcam-based 3x3x3 rubik's cube solver written in Python 3 and OpenCV.

Qbr Qbr, pronounced as Cuber, is a webcam-based 3x3x3 rubik's cube solver written in Python 3 and OpenCV. 🌈 Accurate color detection 🔍 Accurate 3x3x

Kim 金可明 502 Dec 29, 2022
This is a GUI program which consist of 4 OpenCV projects

Tkinter-OpenCV Project Using Tkinter, Opencv, Mediapipe This is a python GUI program using Tkinter which consist of 4 OpenCV projects 1. Finger Counte

Arya Bagde 3 Feb 22, 2022
Repository for Scene Text Detection with Supervised Pyramid Context Network with tensorflow.

Scene-Text-Detection-with-SPCNET Unofficial repository for [Scene Text Detection with Supervised Pyramid Context Network][https://arxiv.org/abs/1811.0

121 Oct 15, 2021
WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching

Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching Code based on our WACV 2022 Accepted Paper: https://arxiv.org/pdf/

Andres 13 Dec 17, 2022
✌️Using this you can control your PC/Laptop volume by Hand Gestures created with Python.

Hand Gesture Volume Controller ✋ Hand recognition 👆 Finger recognition 🔊 you can decrease and increase volume Demo Code Firstly I have created a Mod

Abbas Ataei 19 Nov 17, 2022
aardio的opencv库

opencv_aardio dll库下载地址:https://github.com/xuncv/opencv-plugin/releases import cv2 img = cv2.imread("./images/Lena.jpg",1) img = cv2.medianBlur(img,5)

71 Dec 31, 2022
GDB python tool to pretty print and debug c++ xtensor containers

gdb_xt2np GDB python tool to pretty print, examine, and debug c++ Xtensor containers. Xtensor is a c++ library for scientific computing using multidim

Christopher Burke 4 Oct 29, 2021
keras复现场景文本检测网络CPTN: 《Detecting Text in Natural Image with Connectionist Text Proposal Network》;欢迎试用,关注,并反馈问题...

keras-ctpn [TOC] 说明 预测 训练 例子 4.1 ICDAR2015 4.1.1 带侧边细化 4.1.2 不带带侧边细化 4.1.3 做数据增广-水平翻转 4.2 ICDAR2017 4.3 其它数据集 toDoList 总结 说明 本工程是keras实现的CPTN: Detecti

mick.yi 107 Jan 09, 2023
This is a GUI for scrapping PDFs with the help of optical character recognition making easier than ever to scrape PDFs.

pdf-scraper-with-ocr With this tool I am aiming to facilitate the work of those who need to scrape PDFs either by hand or using tools that doesn't imp

Jacobo José Guijarro Villalba 75 Oct 21, 2022