level2-data-annotation_cv-level2-cv-15 created by GitHub Classroom

Last update: Jun 10, 2022

Overview

[AI Tech 3rd Cohort, Level2 P Stage] Text Detection Competition

Team Members

김규리_T3016	박정현_T3094	석진혁_T3109	손정균_T3111	이현진_T3174	임종현_T3182

Overview

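OCR (Optical Character Recognition) is a technology that captures text that is handwritten or contained in an image and converts it into a form a computer can process. It is one of the most widely used technologies in computer vision today.

An OCR task consists of modules such as text detection, text recognition, and a serializer. This competition addresses only text detection.

To encourage a focus on how data is constructed and used, changing the model-related parts of the provided baseline code is prohibited. Participants may change data collection, preprocessing, and data augmentation, as well as optimization choices such as the optimizer and learning rate scheduler.

Input : a full image containing text
Output : UFO Format containing bbox coordinates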

Evaluation Method

DetEval

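DetEval is an evaluation method for images that contain multiple ground-truth boxes and multiple predicted boxes. It allows many-to-many matching between boxes when assigning scores.
1. Area Recall and Area Precision are computed in advance for every ground-truth/predicted box pair.
2. All ground-truth and predicted boxes are traversed to decide whether each one is matched, which determines correctness at the box level.
3. After Recall and Precision are computed for every image, the final F1-Score is the average of the image-level values.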
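The first two steps can be sketched in a few lines. This is a minimal illustration, not the contents of `deteval.py`: it assumes axis-aligned boxes given as `(x1, y1, x2, y2)`, handles only one-to-one matches, and uses the thresholds 0.8 (area recall) and 0.4 (area precision) that are commonly quoted for DetEval. The function names are hypothetical.

```python
def box_area(box):
    """Area of an axis-aligned box (x1, y1, x2, y2); zero if degenerate."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def intersection_area(a, b):
    """Area of the overlap between two axis-aligned boxes."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return box_area(overlap)


def area_recall_precision(gt, det):
    """Step 1: area recall = overlap / gt area, area precision = overlap / det area."""
    inter = intersection_area(gt, det)
    gt_area, det_area = box_area(gt), box_area(det)
    recall = inter / gt_area if gt_area > 0 else 0.0
    precision = inter / det_area if det_area > 0 else 0.0
    return recall, precision


def one_to_one_scores(gts, dets, t_recall=0.8, t_precision=0.4):
    """Step 2, simplified: greedy one-to-one matching for a single image.

    The thresholds are assumptions (common DetEval defaults), and the
    one-to-many / many-to-one cases of the full method are omitted.
    """
    matched_gt, matched_det = set(), set()
    for i, gt in enumerate(gts):
        for j, det in enumerate(dets):
            if j in matched_det:
                continue
            r, p = area_recall_precision(gt, det)
            if r >= t_recall and p >= t_precision:
                matched_gt.add(i)
                matched_det.add(j)
                break
    recall = len(matched_gt) / len(gts) if gts else 0.0
    precision = len(matched_det) / len(dets) if dets else 0.0
    denom = recall + precision
    f1 = 2 * recall * precision / denom if denom > 0 else 0.0
    return recall, precision, f1


gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0, 0, 10, 12), (50, 50, 60, 60)]
print(one_to_one_scores(gts, dets))
```

In this example only the first ground-truth box is matched, so the image-level recall and precision are both one half.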

Final Score 🏅

Public : F1 0.6897 → Private : F1 0.6751
Public : 11th of 19 teams → Private : 9th of 19 teams

Archive contents

template
├──code
│  ├──augmentation.py
│  ├──convert_mlt.py
│  ├──dataset.py
│  ├──deteval.py
│  ├──east_dataset.py
│  ├──inference.py
│  ├──loss.py
│  ├──model.py
│  └──train.py
└──input
   └──ICDAR2017_Korean
      └──data
         ├──images
         └──ufo
            ├──train.json
            └──val.json

Dataset

ICDAR MLT17 Korean : 536 images ⊆ ICDAR MLT17 : 7,200 images
ICDAR MLT19 : 10,000 images
ICDAR ArT : 5,603 images

Experiment

wrapup report

Results

No.	dataset	Number of images	LB score (public→private)	Recall (public→private)	Precision (public→private)
01	ICDAR17_Korean	536	0.4469 → 0.4732	0.3580 → 0.3803	0.5944 → 0.6264
02	Camper (before polygon correction)	1288	0.4543 → 0.5282	0.3627 → 0.4349	0.6077 → 0.6727
03	Camper (after polygon correction)	1288	0.4644 → 0.5298	0.3491 → 0.4294	0.6936 → 0.6913
04	ICDAR17_Korean + Camper	1824	0.4447 → 0.5155	0.3471 → 0.4129	0.6183 → 0.6858
05	ICDAR17(859)	859	0.5435 → 0.5704	0.4510 → 0.4713	0.6837 → 0.7222
06	ICDAR17_MLT	7200	0.6749 → 0.6751	0.5877 → 0.5887	0.7927 → 0.7912
07	ICDAR19+ArT	approx. 15000	0.6344 → 0.6404	0.5489 → 0.5607	0.7514 → 0.7465
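As a quick sanity check on the table, the LB score appears to be consistent with the F1 formula 2PR / (P + R) applied to the reported Recall and Precision. The sketch below recomputes it for two rows. The helper name `f1_score` is hypothetical, and the figures are copied from the table above.

```python
def f1_score(recall, precision):
    """Harmonic mean of recall and precision; 0.0 when both are zero."""
    total = recall + precision
    return 2 * recall * precision / total if total > 0 else 0.0


# (recall, precision, reported LB score), public leaderboard values
rows = {
    "01 ICDAR17_Korean": (0.3580, 0.5944, 0.4469),
    "06 ICDAR17_MLT": (0.5877, 0.7927, 0.6749),
}

for name, (recall, precision, reported) in rows.items():
    computed = f1_score(recall, precision)
    print(f"{name}: computed {computed:.4f}, reported {reported:.4f}")
```

Both rows agree with the reported score to within rounding.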

Requirements

pip install -r requirements.txt

Converting to UFO Format

python convert_mlt.py

SRC_DATASET_DIR = {path to the data before conversion}

DST_DATASET_DIR = {path to the converted data}

UFO Format

File Name
    ├── img_h
    ├── img_w
    └── words
        ├── points
        ├── transcription
        ├── language
        ├── illegibility
        ├── orientation
        └── word_tags
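To make the layout above concrete, here is a rough sketch that builds one UFO-style entry from MLT-style annotation lines. It is not the logic of `convert_mlt.py`. Several things are assumptions: the input line layout `x1,y1,x2,y2,x3,y3,x4,y4,language,transcription`, the use of `###` to mark unreadable text, the value types chosen for each field, the `illegibility` key spelling, and the function names themselves.

```python
import json


def mlt_line_to_word(line):
    """Convert one annotation line into a word entry with the fields listed above.

    Assumed input layout: x1,y1,x2,y2,x3,y3,x4,y4,language,transcription
    """
    # Split at most 9 times so commas inside the transcription are preserved.
    parts = line.strip().split(",", 9)
    coords = [float(v) for v in parts[:8]]
    points = [[coords[i], coords[i + 1]] for i in range(0, 8, 2)]
    language, transcription = parts[8], parts[9]
    return {
        "points": points,
        "transcription": transcription,
        "language": [language],
        "illegibility": transcription == "###",  # assumed "unreadable" marker
        "orientation": "Horizontal",  # placeholder value, an assumption
        "word_tags": None,
    }


def build_entry(file_name, img_h, img_w, gt_lines):
    """Build {file name: {img_h, img_w, words}} following the tree above."""
    words = {str(i): mlt_line_to_word(line) for i, line in enumerate(gt_lines)}
    return {file_name: {"img_h": img_h, "img_w": img_w, "words": words}}


lines = [
    "10,20,110,20,110,60,10,60,Latin,hello, world",
    "0,0,5,0,5,5,0,5,Latin,###",
]
entry = build_entry("img_1.jpg", 100, 200, lines)
print(json.dumps(entry, indent=2))
```

Keying `words` by a string index keeps the structure directly serializable to JSON.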

Train.py

python train.py --data_dir {train data path} --val_data_dir {val data path} --name {wandb run name} --exp_name {model name
