Cleaning the annotation folder
2022-07-26 09:01:00 【The wind blows the fallen leaves and the flowers flutter】
1. Preface
After annotating data with an AI tool, many of the resulting categories often differ from what we need.
For example, some boxes have no category selected, so the annotation records them as category -1. These entries need to be cleaned out before training.
Empty annotation files also need to be deleted.
For this reason, I wrote a simple program that cleans all the txt files in a folder.
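The per-line rule described above can be sketched as follows. This is only an illustration: the helper name `clean_lines` and the sample values are hypothetical, and it assumes each annotation line is whitespace-separated with the category as its first field.

```python
def clean_lines(lines, k):
    """Drop entries labelled -1 and relabel every remaining entry as k.

    Assumption: the category is the first whitespace-separated field.
    """
    kept = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "-1":
            continue  # skip blank lines and boxes without a category
        fields[0] = str(k)  # replace the whole category field
        kept.append(" ".join(fields))
    return kept


# Hypothetical sample annotations: category followed by box coordinates
sample = [
    "-1 0.50 0.50 0.20 0.20",
    "0 0.30 0.40 0.10 0.10",
    "12 0.70 0.20 0.05 0.05",
]
print(clean_lines(sample, 3))
# → ['3 0.30 0.40 0.10 0.10', '3 0.70 0.20 0.05 0.05']
```

A file whose lines are all dropped yields an empty list, which corresponds to the case where the annotation file itself should be deleted.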
2. Cleaning program
import os


def getFilePath(fileDo):
    """Take a folder path and return the relative paths of all files in it."""
    allFilePath = []
    for file in os.listdir(fileDo):
        allFilePath.append(os.path.join(fileDo, file))
    return allFilePath


def PreData(allFilePath, k):
    """Take a list of file paths and clean every .txt file in it.

    1. Delete entries whose category is -1.
    2. Change every other category to the specified value k.
    3. Delete files that end up empty.
    """
    for FilePath in allFilePath:
        if not FilePath.endswith('.txt'):
            continue
        tem = []  # entries whose category is not -1
        with open(FilePath, 'r', encoding='utf-8') as f:
            for line in f:
                fields = line.split()
                if not fields:
                    continue  # skip blank lines
                if fields[0] == '-1':
                    print(line, end='')  # report the dropped entry
                    continue
                # Replace the whole category field so that multi-digit
                # categories are handled correctly as well.
                fields[0] = str(k)
                tem.append(' '.join(fields) + '\n')
        if len(tem) == 0:
            # Nothing is left to keep, so delete the file
            os.remove(FilePath)
            print('Delete file ' + FilePath)
        else:
            # Otherwise rewrite the file with the cached entries
            with open(FilePath, 'w', encoding='utf-8') as w:
                w.writelines(tem)


def main():
    #fileDo = './ human beings / human beings - Baidu - Tag data '
    fileDo = input('Please enter the relative path of the folder to be cleaned: ')
    # Collect the relative paths of all files in this folder
    allFilePath = getFilePath(fileDo)
    print('All files in the folder are as follows:')
    print(allFilePath)
    # Clean the folder, setting every remaining category to 3
    PreData(allFilePath, 3)


if __name__ == '__main__':
    main()
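Since the script rewrites and deletes files in place, it may be worth previewing the effect first. The sketch below is an addition, not part of the original program: `preview_cleaning` is a hypothetical helper that only reads the txt files and reports what would happen, under the same assumption that the category is the first whitespace-separated field. The demo runs on a temporary folder with made-up annotations.

```python
import tempfile
from pathlib import Path


def preview_cleaning(folder):
    """Report what cleaning would do to each .txt file, without modifying anything."""
    report = {}
    for path in sorted(Path(folder).glob("*.txt")):
        kept = dropped = 0
        for line in path.read_text(encoding="utf-8").splitlines():
            fields = line.split()
            if not fields:
                continue  # blank lines are ignored
            if fields[0] == "-1":
                dropped += 1
            else:
                kept += 1
        report[path.name] = {
            "kept": kept,
            "dropped": dropped,
            "would_delete": kept == 0,  # no usable entries left
        }
    return report


# Demo on a throwaway folder with hypothetical annotation files
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").write_text(
        "-1 0.5 0.5 0.2 0.2\n0 0.3 0.4 0.1 0.1\n", encoding="utf-8"
    )
    Path(tmp, "b.txt").write_text("-1 0.1 0.1 0.1 0.1\n", encoding="utf-8")
    demo_report = preview_cleaning(tmp)
    for name, info in demo_report.items():
        print(name, info)
```

In this demo, `a.txt` keeps one entry and loses one, while `b.txt` would be deleted because every entry in it has category -1.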