Transformers
Arabic licence plate recognition - Solution to the Kaggle competition Machathon 3.0.
- Ranked in the top 6 at the final evaluation phase.
- Check our solution now on Colab!
- Check the solution presentation
Preprocessing Pipeline
Approach
Step1: Preprocessing enhancements on the image.
- Most images had bad illumination and noise.
- Morphological operations to maximize contrast.
- Gaussian blur to remove noise.
- Thresholding on both Value and Saturation channels.
Step2: Extracting the white plate using contours.
- Get contours and sort them by area.
- Polygon approximation for noisy contours.
- Convex hull for concave polygons.
- 4-point transformation for difficult camera angles.
The numbers are now in one contour and the letters in another.
Step3: Separating characters from white plate using sliding windows.
Contours can't be used to extract the symbols from the white plate, since a single Arabic letter may consist of multiple disconnected parts, e.g. ت may be split into 2 or 3 contours.
Solution
- Tuned 2 sliding windows, one for the letters' white plate and the other for the numbers'.
- Variable window width.
- Window height is the white plate height, since Arabic characters may consist of multiple parts.
- Window selection criteria:
- Must have no black pixels on the sides
- Must have a specific range of black pixels inside
- For each group of windows the one with max black pixels is selected
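The selection rules above can be illustrated with a small NumPy sketch. Everything specific here is an assumption: the function name, the candidate `widths`, the `min_ink` and `max_ink` bounds, and the grouping rule (a group is a run of consecutive start columns that each yield at least one accepted window). The tuned values and the exact grouping used in the notebook are not stated in this README.

```python
import numpy as np


def pick_character_windows(ink, widths=(6, 8, 10, 12), min_ink=10, max_ink=200):
    """Pick one full-height window per character.

    ink: 2D boolean array, True where a pixel is black (part of a character).
    Returns a list of (x_start, x_end) column ranges, x_end exclusive.
    """
    col_ink = ink.sum(axis=0)                          # black pixels per column
    cum = np.concatenate(([0], np.cumsum(col_ink)))    # prefix sums for fast totals
    n_cols = ink.shape[1]

    chosen, group = [], []
    for x in range(n_cols):
        hits = []
        for w in widths:                               # variable window width
            if x + w > n_cols:
                continue
            if col_ink[x] or col_ink[x + w - 1]:       # no black pixels on the sides
                continue
            total = int(cum[x + w] - cum[x])
            if min_ink <= total <= max_ink:            # black pixels within range
                hits.append((total, -w, x, x + w))
        if hits:
            group.extend(hits)
        elif group:
            # Group ended: keep the window with the most black pixels
            # (ties go to the narrowest window).
            chosen.append(max(group))
            group = []
    if group:
        chosen.append(max(group))
    return [(x0, x1) for _, _, x0, x1 in chosen]
```

Because every window spans the full plate height, dots above or below a letter body fall inside the same window as the body, which is the reason for preferring windows over contours here.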
Step4: Character Recognition.
- Training 2 models, since some Arabic letters and numbers look similar, e.g. (أ, 1) and (5, ه):
- one for classifying only Arabic letters.
- one for classifying only Arabic numbers.
Project Organization
Scripts applied on images
./Macathon/code/
├── extract_bbx_xml.ipynb : Takes a directory of images whose bounding-box (bbx) data is stored in xml files, and crops the bbxs from the images.
|   Each xml file contains the licence label (name), xmin, ymin, xmax, ymax of the bbxs in an image.
├── extract_bbx_txt.ipynb : Takes a directory of images whose bbx data is stored in txt files, and crops the bbxs from the images.
|   The txt file for one image may contain multiple bbxs, one per row, each given as xmin,ymin,xmax,ymax.
└── crop_right_noise.ipynb : Crops an image by some percentage and replaces it with the cropped image.
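As an illustration of the two annotation formats described above, here is a minimal parsing sketch using only the standard library. The function names are hypothetical, and the XML layout (`object`, `name`, `bndbox` elements, as in Pascal VOC) is an assumption, since this README only lists the fields the xml files contain.

```python
import xml.etree.ElementTree as ET


def boxes_from_txt(text):
    """Parse rows of 'xmin,ymin,xmax,ymax' into integer tuples."""
    boxes = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank rows
            xmin, ymin, xmax, ymax = (int(float(v)) for v in line.split(","))
            boxes.append((xmin, ymin, xmax, ymax))
    return boxes


def boxes_from_xml(text):
    """Parse Pascal VOC-style XML (assumed layout) into
    (label, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bnd = obj.find("bndbox")
        coords = tuple(int(float(bnd.findtext(key)))
                       for key in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((label, *coords))
    return boxes
```

With an image loaded as a NumPy array, each box can then be cropped with `image[ymin:ymax, xmin:xmax]`.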
Model versions
./Macathon/code/
└── model.ipynb : The preprocessing and modeling stage. Contains:
    - Preprocessing functions
    - Training both classifiers
    - Prediction and generating the output csv file
Data Folder
./Macathon/data/
├── challenging_images.rar : Contains the most challenging images collected from the train data.
├── cropped_letters.zip : 28 subfolders corresponding to the 28 letters of the Arabic alphabet.
|   Each subfolder holds images of the letter it is named after, cropped from the train data distribution.
├── cropped_numbers.zip : 10 subfolders for the 10 numbers.
|   Each subfolder holds images of the number it is named after, cropped from the train data distribution.
├── machathon-3.zip : The data provided with the Kaggle competition.
└── testLetters.zip : 200 labeled images from the test data distribution.
    Each image has a corresponding xml file holding the locations of the bbxs in it.
Contributors
This masterpiece was designed and implemented by
- Hossam Saeed
- Mostafa Wael
- Nada Elmasry
- Noran Hany