level2-data-annotation_cv-level2-cv-15 created by GitHub Classroom

Overview

[AI Tech 3기 Level2 P Stage] 글자 검출 대회

image

팀원 소개

김규리_T3016 박정현_T3094 석진혁_T3109 손정균_T3111 이현진_T3174 임종현_T3182

Overview

OCR (Optimal Character Recognition) 기술은 사람이 직접 쓰거나 이미지 속에 있는 문자를 얻은 다음 이를 컴퓨터가 인식할 수 있도록 하는 기술로, 컴퓨터 비전 분야에서 현재 널리 쓰이는 대표적인 기술 중 하나입니다.

OCR task는 글자 검출 (text detection), 글자 인식 (text recognition), 정렬기 (Serializer) 등의 모듈로 이루어져 있는데 본 대회는 글자 검출 (text detection)만을 해결하게 됩니다.

데이터를 구성하고 활용하는 방법에 집중하는 것을 장려하는 취지에서, 제공되는 베이스 코드 중 모델과 관련한 부분을 변경하는 것이 금지되어 있습니다. 데이터 수집과 preprocessing, data augmentation 그리고 optimizer, learning scheduler 등 최적화 방식을 변경할 수 있습니다.

  • Input : 글자가 포함된 전체 이미지
  • Output : bbox 좌표가 포함된 UFO Format

평가방법

  • DetEval

    이미지 레벨에서 정답 박스가 여러개 존재하고, 예측한 박스가 여러개가 있을 경우, 박스끼리의 다중 매칭을 허용하여 점수를 주는 평가방법 중 하나 입니다

    1. 모든 정답/예측박스들에 대해서 Area Recall, Area Precision을 미리 계산해냅니다.

    2. 모든 정답 박스와 예측 박스를 순회하면서, 매칭이 되었는지 판단하여 박스 레벨로 정답 여부를 측정합니다.

    3. 모든 이미지에 대하여 Recall, Precision을 구한 이후, 최종 F1-Score은 모든 이미지 레벨에서 측정 값의 평균으로 측정됩니다.

      image

Final Score  🏅

  • Public : f1 0.6897 → Private f1 : 0.6751
  • Public : 11위/19팀 → Private : 9위/19팀

image

Archive contents

template
├──code
│  ├──augmentation.py
│  ├──convert_mlt.py
│  ├──dataset.py
│  ├──deteval.py
│  ├──east_dataset.py
│  ├──inference.py
│  ├──loss.py
│  ├──model.py
│  └──train.py
└──input
   └──ICDAR2017_Korean
		  └──data
			  	├──images
		      └──ufo
			        ├──train.json
							└──val.json

Dataset

  • ICDAR MLT17 Korean : 536 images ⊆ ICDAR MLT17 : 7,200 images

  • ICDAR MLT19 : 10,000 images

  • ICAR ArT : 5,603 images

Experiment

Results

dataset 데이터 수 LB score (public→private) Recall Precision
01 ICDAR17_Korean 536 0.4469 → 0.4732 0.3580 → 0.3803 0.5944 → 0.6264
02 Camper (폴리곤 수정 전) 1288 0.4543 → 0.5282 0.3627 → 0.4349 0.6077 → 0.6727
03 Camper (폴리곤 수정 후) 1288 0.4644 → 0.5298 0.3491 → 0.4294 0.6936 → 0.6913
04 ICDAR17_Korean + Camper 1824 0.4447 → 0.5155 0.3471 → 0.4129 0.6183 → 0.6858
05 ICDAR17(859) 859 0.5435 → 0.5704 0.4510 → 0.4713 0.6837 → 0.7222
06 ICDAR17_MLT 7200 0.6749 → 0.6751 0.5877 → 0.5887 0.7927 → 0.7912
07 ICDAR19+ArT 약 15000 0.6344 → 0.6404 0.5489 → 0.5607 0.7514 → 0.7465

Requirements

pip install -r requirements.txt

UFO Format으로 변환

python convert_mlt.py

SRC_DATASET_DIR = {변환 전 data 경로}

DST_DATASET_DIR = {변환 된 data 경로}

UFO Format ****

File Name
    ├── img_h
    ├── img_w
    └── words
        ├── points
        ├── transcription
        ├── language
        ├── illegibillity
        ├── orientation
        └── word_tags

Train.py

python train.py --data_dir {train data path} --val_data_dir {val data path} --name {wandb run name} --exp_name {model name
This repository outlines deploying a local Kubeflow v1.3 instance on microk8s and deploying a simple MNIST classifier using KFServing.

Zero to Inference with Kubeflow Getting Started This repository houses all of the tools, utilities, and example pipeline implementations for exploring

Ed Henry 3 May 18, 2022
Use Brainf*ck with python!

Brainfudge Run Brainf*ck code with python! Classes Interpreter(array_len): encapsulate all functions into class __init__(self, array_len: int=30000) -

1 Dec 14, 2021
MkDocs plugin for setting revision date from git per markdown file

mkdocs-git-revision-date-plugin MkDocs plugin that displays the last revision date of the current page of the documentation based on Git. The revision

Terry Zhao 48 Jan 06, 2023
Ultimaker Cura 2 Mooraker Upload Plugin

Klipper & Cura - Cura2MoonrakerPlugin Allows you to upload Gcode directly from Cura to your Klipper-based 3D printer (Fluidd, Mainsailos etc.) using t

214 Jan 03, 2023
Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP'2021

ReasonBERT Code and pre-trained models for ReasonBert: Pre-trained to Reason with Distant Supervision, EMNLP'2021 Pretrained Models The pretrained mod

SunLab-OSU 29 Dec 19, 2022
A markdown wiki and dashboarding system for Datasette

datasette-notebook A markdown wiki and dashboarding system for Datasette This is an experimental alpha and everything about it is likely to change. In

Simon Willison 19 Apr 20, 2022
DocumentPy is a Python application that runs in a command-line interface environment, made for creating HTML documents.

DocumentPy DocumentPy is a Python application that runs in a command-line interface environment, made for creating HTML documents. Usage DocumentPy, a

Lotus 0 Jul 15, 2021
A tool that allows for versioning sites built with mkdocs

mkdocs-versioning mkdocs-versioning is a plugin for mkdocs, a tool designed to create static websites usually for generating project documentation. mk

Zayd Patel 38 Feb 26, 2022
A comprehensive and FREE Online Python Development tutorial going step-by-step into the world of Python.

FREE Reverse Engineering Self-Study Course HERE Fundamental Python The book and code repo for the FREE Fundamental Python book by Kevin Thomas. FREE B

Kevin Thomas 7 Mar 19, 2022
SamrSearch - SamrSearch can get user info and group info with MS-SAMR

SamrSearch SamrSearch can get user info and group info with MS-SAMR.like net use

knight 10 Oct 06, 2022
Xanadu Quantum Codebook is an experimental, exercise-based introduction to quantum computing using PennyLane.

Xanadu Quantum Codebook The Xanadu Quantum Codebook is an experimental, exercise-based introduction to quantum computing using PennyLane. This reposit

Xanadu 43 Dec 09, 2022
Quilt is a self-organizing data hub for S3

Quilt is a self-organizing data hub Python Quick start, tutorials If you have Python and an S3 bucket, you're ready to create versioned datasets with

Quilt Data 1.2k Dec 30, 2022
xeuledoc - Fetch information about a public Google document.

xeuledoc - Fetch information about a public Google document.

Malfrats Industries 651 Dec 27, 2022
Fun interactive program to sort a list :)

LHD-Build-Sort-a-list Fun interactive program to sort a list :) Inspiration LHD Build Write a script to sort a list. What it does It is a menu driven

Ananya Gupta 1 Jan 15, 2022
Make posters from Markdown files.

MkPosters Create posters using Markdown. Supports icons, admonitions, and LaTeX mathematics. At the moment it is restricted to the specific layout of

Patrick Kidger 243 Dec 20, 2022
Course materials for: Geospatial Data Science

Course materials for: Geospatial Data Science These course materials cover the lectures for the course held for the first time in spring 2022 at IT Un

Michael Szell 266 Jan 02, 2023
Some code that takes a pipe-separated input and converts that into a table!

tablemaker A program that takes an input: a | b | c # With comments as well. e | f | g h | i |jk And converts it to a table: ┌───┬───┬────┐ │ a │ b │

CodingSoda 2 Aug 30, 2022
Python solutions to solve practical business problems.

Python Business Analytics Also instead of "watching" you can join the link-letter, it's already being sent out to about 90 people and you are free to

Derek Snow 357 Dec 26, 2022
Documentation and issues for Pylance - Fast, feature-rich language support for Python

Documentation and issues for Pylance - Fast, feature-rich language support for Python

Microsoft 1.5k Dec 29, 2022