08 Semi-automatic annotation of object detection datasets

2022-07-18 07:06:00 【Smooth the bumps for Chengda Road】

Deep learning object detection usually needs a large dataset, but annotating one by hand is time-consuming, laborious, and repetitive. A semi-automatic workflow reduces the effort: annotate three to five hundred images first and train a preliminary model on them, then run inference with this model on the unlabeled images and export the results in VOC format. Open the images together with the exported annotation files in a local annotation tool, labelImg, and adjust the positions of the bounding boxes by hand. The corrected annotation files can then serve as the dataset for subsequent model training.

Inference

!python /home/aistudio/PaddleDetection/tools/infer.py \
	-c /home/aistudio/PaddleDetection/configs/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.yml \
	--draw_threshold=0.5 \
	--infer_dir=img \
	--output_dir=toolinfer \
	--use_vdl=True \
	--save_txt=True \
	-o weights=/home/aistudio/output/cascade_rcnn_r50_fpn_1x_coco/model_final

With --save_txt=True, the inference results are saved to txt files.
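For orientation, here is a minimal sketch of how one line of such a txt file can be read. It assumes each line has the layout `class_name score xmin ymin width height`, which is the layout the conversion code further down relies on. The function name `parse_result_line` and the sample line are made up for illustration.

```python
def parse_result_line(line):
    """Split one result line into (class_name, score, xmin, ymin, xmax, ymax).

    Assumed layout: class_name score xmin ymin width height
    """
    parts = line.split()
    name = parts[0]
    score, xmin, ymin, w, h = (float(v) for v in parts[1:6])
    # VOC uses corner coordinates, so derive xmax and ymax from width and height
    return name, score, xmin, ymin, xmin + w, ymin + h


# hypothetical sample line, not real model output
print(parse_result_line("cat 0.93 10.0 20.0 30.0 40.0"))
```

A class name that contains spaces would break this simple split, so the class names in the dataset should be single tokens.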

Converting the inference results to VOC-format annotations

Move the generated inference result files to the designated folder, convert the coordinate information, and generate the VOC-format xml annotation files.

Conversion code

import os
import cv2

headstr = """\
<annotation>
    <folder>VOC</folder>
    <filename>%s</filename>
    <source>
        <database>My Database</database>
        <annotation>VOC</annotation>
        <image>flickr</image>
        <flickrid>NULL</flickrid>
    </source>
    <owner>
        <flickrid>NULL</flickrid>
        <name>company</name>
    </owner>
    <size>
        <width>%d</width>
        <height>%d</height>
        <depth>%d</depth>
    </size>
    <segmented>0</segmented>
"""
objstr = """\
    <object>
        <name>%s</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>%d</difficult>
        <bndbox>
            <xmin>%d</xmin>
            <ymin>%d</ymin>
            <xmax>%d</xmax>
            <ymax>%d</ymax>
        </bndbox>
    </object>
"""
tailstr = '''\
</annotation>
'''


def save_annotations(boxes, img, filename):
    # img is the array returned by cv2.imread, with shape (height, width, channels)
    H, W, C = img.shape
    stem = os.path.splitext(filename)[0]
    img_name = stem + '.bmp'
    head = headstr % (img_name, W, H, C)  # XML header with image name and size
    tail = tailstr  # closing tag
    # anno_path is the module-level output folder set in the main block below
    save_path = anno_path + stem + '.xml'
    with open(save_path, 'w') as f:
        f.write(head)
        # Write one <object> entry per box
        for box in boxes:
            # box[0] is the class name, box[2:6] are xmin, ymin, width, height;
            # box[1] is not written to the xml
            xmin, ymin, w, h = (float(v) for v in box[2:6])
            # VOC stores corners, so xmax = xmin + width and ymax = ymin + height
            f.write(objstr % (str(box[0]), 0, xmin, ymin, xmin + w, ymin + h))
        f.write(tail)


if __name__ == '__main__':
    # Set the paths
    root_path = './'
    total_label_path = root_path + 'txt/'  # folder holding the inference txt files
    total_img_path = root_path + 'img/'  # folder holding the images
    anno_path = root_path + 'Annotations/'  # folder for the generated xml annotation files
    # Create the Annotations folder if it does not exist yet
    if not os.path.exists(anno_path):
        os.mkdir(anno_path)
    # Read the txt result files one by one
    for filename in os.listdir(total_label_path):
        cur_label_path = total_label_path + filename
        # The image shares the base name of the txt file; only the suffix changes
        cur_img_path = total_img_path + os.path.splitext(filename)[0] + '.bmp'
        cur_boxes = []
        # Read the contents of the current txt file, skipping blank lines
        with open(cur_label_path, 'r') as file:
            for line in file:
                line = line.strip()  # .strip() removes '\r' and '\n'
                if not line:
                    continue
                cur_boxes.append(line.split(' '))
        # Read the current image
        cur_img = cv2.imread(cur_img_path)
        if cur_img is None:
            # cv2.imread returns None when the file is missing or unreadable
            print('Skipping %s: cannot read %s' % (filename, cur_img_path))
            continue
        # Write the xml annotation file
        save_annotations(cur_boxes, cur_img, filename)
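Before opening the results in labelImg, it can help to confirm that a generated file parses as XML and that every box has positive width and height. The following is a minimal sketch using the standard library; `read_voc_boxes` and the sample annotation are illustrative and not part of the original script.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_text):
    """Return [(name, xmin, ymin, xmax, ymax), ...] from a VOC annotation string."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bnd = obj.find("bndbox")
        coords = [int(bnd.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.find("name").text, *coords))
    return boxes


# hypothetical annotation, shortened to the fields the check needs
sample = """<annotation>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>40</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""

for name, xmin, ymin, xmax, ymax in read_voc_boxes(sample):
    # a usable box has positive width and height
    print(name, xmin, ymin, xmax, ymax, xmin < xmax and ymin < ymax)
```

To check a file on disk instead of a string, read its contents first and pass them to `read_voc_boxes`.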

What needs to be modified:

Path where the txt files are stored: total_label_path
Path where the images are stored: total_img_path
Folder created to hold the VOC-format files: anno_path
Suffix of the image files: bmp
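The image suffix is hard-coded as bmp in the conversion code. If a folder mixes several formats, one option is to look the image up by its base name instead. This is a minimal sketch under that assumption; `find_image` and `IMAGE_EXTS` are hypothetical names, not part of the original script.

```python
import os
import tempfile

IMAGE_EXTS = (".bmp", ".jpg", ".jpeg", ".png")  # assumed candidate suffixes


def find_image(img_dir, stem, exts=IMAGE_EXTS):
    """Return the path of the first existing image named stem + ext, or None."""
    for ext in exts:
        path = os.path.join(img_dir, stem + ext)
        if os.path.isfile(path):
            return path
    return None


# small demonstration in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "0001.jpg"), "w").close()
    found = find_image(tmp, "0001")
    missing = find_image(tmp, "0002")
    print(os.path.basename(found), missing)
```

The suffix of the file that was found would then also be the one to write into the filename field of the xml.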

Notes:

Three to five hundred images with initial annotations are generally needed.
Prefer a high-accuracy object detection algorithm for training the preliminary model.
Choose the inference threshold according to the prediction results and the actual targets. If many targets are missed, lower the confidence threshold; if there are many false detections, raise it.
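The effect of the threshold can be previewed on existing txt results before rerunning inference. The sketch below assumes the line layout `class_name score xmin ymin width height` used by the conversion code; `filter_by_score` and the sample lines are made up for illustration.

```python
def filter_by_score(lines, threshold):
    """Keep result lines whose score (second field) is at least threshold."""
    kept = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 6 and float(parts[1]) >= threshold:
            kept.append(line)
    return kept


# hypothetical result lines: class_name score xmin ymin width height
results = [
    "cat 0.91 10 20 30 40",
    "cat 0.55 50 60 20 20",
    "dog 0.32 5 5 15 25",
]

# count how many boxes survive at each candidate threshold
for t in (0.3, 0.5, 0.8):
    print(t, len(filter_by_score(results, t)))
```

Filtering afterwards can only raise the threshold. Boxes that were never written to the txt file cannot be recovered this way, so lowering the threshold means rerunning inference with a smaller value.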

Copyright notice
This article was written by [Smooth the bumps for Chengda Road]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/199/202207151634466982.html
