level2-data-annotation_cv-level2-cv-15 created by GitHub Classroom

Overview

[AI Tech 3기 Level2 P Stage] 글자 검출 대회

image

팀원 소개

김규리_T3016 박정현_T3094 석진혁_T3109 손정균_T3111 이현진_T3174 임종현_T3182

Overview

OCR (Optimal Character Recognition) 기술은 사람이 직접 쓰거나 이미지 속에 있는 문자를 얻은 다음 이를 컴퓨터가 인식할 수 있도록 하는 기술로, 컴퓨터 비전 분야에서 현재 널리 쓰이는 대표적인 기술 중 하나입니다.

OCR task는 글자 검출 (text detection), 글자 인식 (text recognition), 정렬기 (Serializer) 등의 모듈로 이루어져 있는데 본 대회는 글자 검출 (text detection)만을 해결하게 됩니다.

데이터를 구성하고 활용하는 방법에 집중하는 것을 장려하는 취지에서, 제공되는 베이스 코드 중 모델과 관련한 부분을 변경하는 것이 금지되어 있습니다. 데이터 수집과 preprocessing, data augmentation 그리고 optimizer, learning scheduler 등 최적화 방식을 변경할 수 있습니다.

  • Input : 글자가 포함된 전체 이미지
  • Output : bbox 좌표가 포함된 UFO Format

평가방법

  • DetEval

    이미지 레벨에서 정답 박스가 여러개 존재하고, 예측한 박스가 여러개가 있을 경우, 박스끼리의 다중 매칭을 허용하여 점수를 주는 평가방법 중 하나 입니다

    1. 모든 정답/예측박스들에 대해서 Area Recall, Area Precision을 미리 계산해냅니다.

    2. 모든 정답 박스와 예측 박스를 순회하면서, 매칭이 되었는지 판단하여 박스 레벨로 정답 여부를 측정합니다.

    3. 모든 이미지에 대하여 Recall, Precision을 구한 이후, 최종 F1-Score은 모든 이미지 레벨에서 측정 값의 평균으로 측정됩니다.

      image

Final Score  🏅

  • Public : f1 0.6897 → Private f1 : 0.6751
  • Public : 11위/19팀 → Private : 9위/19팀

image

Archive contents

template
├──code
│  ├──augmentation.py
│  ├──convert_mlt.py
│  ├──dataset.py
│  ├──deteval.py
│  ├──east_dataset.py
│  ├──inference.py
│  ├──loss.py
│  ├──model.py
│  └──train.py
└──input
   └──ICDAR2017_Korean
		  └──data
			  	├──images
		      └──ufo
			        ├──train.json
							└──val.json

Dataset

  • ICDAR MLT17 Korean : 536 images ⊆ ICDAR MLT17 : 7,200 images

  • ICDAR MLT19 : 10,000 images

  • ICAR ArT : 5,603 images

Experiment

Results

dataset 데이터 수 LB score (public→private) Recall Precision
01 ICDAR17_Korean 536 0.4469 → 0.4732 0.3580 → 0.3803 0.5944 → 0.6264
02 Camper (폴리곤 수정 전) 1288 0.4543 → 0.5282 0.3627 → 0.4349 0.6077 → 0.6727
03 Camper (폴리곤 수정 후) 1288 0.4644 → 0.5298 0.3491 → 0.4294 0.6936 → 0.6913
04 ICDAR17_Korean + Camper 1824 0.4447 → 0.5155 0.3471 → 0.4129 0.6183 → 0.6858
05 ICDAR17(859) 859 0.5435 → 0.5704 0.4510 → 0.4713 0.6837 → 0.7222
06 ICDAR17_MLT 7200 0.6749 → 0.6751 0.5877 → 0.5887 0.7927 → 0.7912
07 ICDAR19+ArT 약 15000 0.6344 → 0.6404 0.5489 → 0.5607 0.7514 → 0.7465

Requirements

pip install -r requirements.txt

UFO Format으로 변환

python convert_mlt.py

SRC_DATASET_DIR = {변환 전 data 경로}

DST_DATASET_DIR = {변환 된 data 경로}

UFO Format ****

File Name
    ├── img_h
    ├── img_w
    └── words
        ├── points
        ├── transcription
        ├── language
        ├── illegibillity
        ├── orientation
        └── word_tags

Train.py

python train.py --data_dir {train data path} --val_data_dir {val data path} --name {wandb run name} --exp_name {model name
EasyMultiClipboard - Python script written to handle more than 1 string in clipboard

EasyMultiClipboard - Python script written to handle more than 1 string in clipboard

WVlab 1 Jun 18, 2022
My Sublime Text theme

rsms sublime text theme Install: cd path/to/your/sublime/packages git clone https://github.com/rsms/sublime-theme.git rsms-theme You'll also need the

Rasmus 166 Jan 04, 2023
Collections of Beautiful Latex Snippets

HandyLatex Collections of Beautiful Latex Snippets Table 👉 Succinct table with bold separation line and gray text %################## Dependencies ##

Xintao 15 Apr 11, 2022
graphical orbitational simulation of solar system planets with real values and physics implemented so you get a nice elliptical orbits. you can change timestamp value or scale from source code idc.

solarSystemOrbitalSimulation graphical orbitational simulation of solar system planets with real values and physics implemented so you get a nice elli

Mega 3 Mar 03, 2022
NetBox plugin for BGP related objects documentation

Netbox BGP Plugin Netbox plugin for BGP related objects documentation. Compatibility This plugin in compatible with NetBox 2.10 and later. Installatio

Nikolay Yuzefovich 133 Dec 27, 2022
Anomaly Detection via Reverse Distillation from One-Class Embedding

Anomaly Detection via Reverse Distillation from One-Class Embedding Implementation (Official Code ⭐️ ⭐️ ⭐️ ) Environment pytorch == 1.91 torchvision =

73 Dec 19, 2022
Data-science-on-gcp - Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

data-science-on-gcp Source code accompanying book: Data Science on the Google Cloud Platform, 2nd Edition Valliappa Lakshmanan O'Reilly, Jan 2022 Bran

Google Cloud Platform 1.2k Dec 28, 2022
Plugins for MkDocs.

Plugins for MkDocs and Python Markdown pip install neoteroi-mkdocs This package includes the following plugins and extensions: Name Description Type m

35 Dec 23, 2022
FxBuzzly - Buzzly.art links do not embed in Discord, this fixes them (rudimentarily)

fxBuzzly Buzzly.art links do not embed in Discord, this fixes them (rudimentaril

Dania Rifki 2 Oct 27, 2022
A fast time mocking alternative to freezegun that wraps libfaketime.

python-libfaketime: fast date/time mocking python-libfaketime is a wrapper of libfaketime for python. Some brief details: Linux and OS X, Pythons 3.5

Simon Weber 68 Jun 10, 2022
A comprehensive and FREE Online Python Development tutorial going step-by-step into the world of Python.

FREE Reverse Engineering Self-Study Course HERE Fundamental Python The book and code repo for the FREE Fundamental Python book by Kevin Thomas. FREE B

Kevin Thomas 7 Mar 19, 2022
Plover jyutping - Plover plugin for Jyutping input

Plover plugin for Jyutping Installation Navigate to the repo directory: cd plove

Samuel Lo 1 Mar 17, 2022
An introduction to hikari, complete with different examples for different command handlers.

An intro to hikari This repo provides some simple examples to get you started with hikari. Contained in this repo are bots designed with both the hika

Ethan Henderson 18 Nov 29, 2022
layout-parser 3.4k Dec 30, 2022
🧙 A simple, typed and monad-based Result type for Python.

meiga 🧙 A simple, typed and monad-based Result type for Python. Table of Contents Installation 💻 Getting Started 📈 Example Features Result Function

Alice Biometrics 31 Jan 08, 2023
Main repository for the Sphinx documentation builder

Sphinx Sphinx is a tool that makes it easy to create intelligent and beautiful documentation for Python projects (or other documents consisting of mul

5.1k Jan 02, 2023
A complete kickstart devcontainer repository for python3

A complete kickstart devcontainer repository for python3

Viktor Freiman 3 Dec 23, 2022
Compare two CSV files for differences. Colorize the differences and align the columns.

pretty-csv-diff Compare two CSV files for differences. Colorize the differences and align the columns. Command-Line Example Command-Line Usage usage:

Devon 6 Dec 29, 2022
An MkDocs plugin to export content pages as PDF files

MkDocs PDF Export Plugin An MkDocs plugin to export content pages as PDF files The pdf-export plugin will export all markdown pages in your MkDocs rep

Terry Zhao 266 Dec 13, 2022
An open-source script written in python just for fun

Owersite Owersite is an open-source script written in python just for fun. It do

大きなペニスを持つ少年 7 Sep 21, 2022