Project 6b - Object detection II, downloading dataset and preparation for use with YOLOv4
- James Canova
- Sep 4, 2021
Project start: 14 January 2022
Project finish: 19 February 2022
Last updated: 10 October 2022
Objective: download files from Google Open Images V5 to train a YOLOv4 network to recognise cars and vehicle licence plates
install OIDv4 which is required for downloading data from Open Images:
note 1: a more recent version of OID is available, but it requires Python 3.7 and only Python 3.6.9 is installed here
note 2: OIDv4 downloads images and their associated labels (.txt files) from Open Images V5, although a more recent version of Open Images is available
cd myPython36VEnv (make sure this VEnv is deactivated first)
git clone https://github.com/EscVM/OIDv4_ToolKit
In VEnv:
pip3 install -U tqdm
pip3 install -U pandas
pip3 install -U awscli
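As a quick sanity check after the installs above, the following is a minimal sketch that reports which of the three packages cannot be found from inside the VEnv. The function name missing_packages is only an illustration and is not part of OIDv4_ToolKit.

```python
import importlib.util

# Packages installed with pip3 in the steps above.
REQUIRED = ["tqdm", "pandas", "awscli"]


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("All required packages were found.")
```

Run it with python3 while the VEnv is active; an empty result means the toolkit's main.py should at least be able to import these packages.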
A directory called OIDv4_ToolKit will be created, containing a file called main.py, which is executed to download the data from Google Open Images.
change permissions so that main.py can be executed:
(myPython36VEnv) james@myB01:~/Public/Projects/myPython36VEnv$ sudo chmod -R 777 OIDv4_ToolKit
download data:
note 1: if you go to the Open Images website you can search for the classes of data that are available; here we are using the Car and Vehicle_registration_plate classes
note 2: I downloaded only training data; validation and test data can be downloaded as well
note 3: a total of 500 images (with labels) for each class was downloaded; this was a guess on my part, but in the end it was sufficient data to correctly train the YOLOv4 neural network
note 4: downloading took about 30 minutes
note 5: during downloading I had a couple of warnings about low memory, and my keyboard and mouse froze a few times
note 6: after executing main.py there will be a couple of 'y/n' prompts; answer 'y' to each
note 7: activate the VEnv myPython36VEnv before running the download command
to download (takes about 30 minutes @ ~290 Mbit/s):
(myPython36VEnv) james@myB01:~/Public/Projects/Project_6b$
python3 /home/james/Public/Projects/myPython36VEnv/OIDv4_ToolKit/main.py downloader --classes Car Vehicle_registration_plate --type_csv train --multiclasses 1 --limit 500
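If you want to launch the download from a script or notebook instead of typing the command, a sketch like the one below assembles the same arguments as a list. The helper name build_download_command is hypothetical; the flags are exactly the ones used in the command above.

```python
import sys


def build_download_command(main_py, classes, limit, csv_type="train"):
    """Build the argument list for the OIDv4_ToolKit downloader command."""
    return [
        sys.executable,             # the Python interpreter of the active VEnv
        main_py,                    # path to OIDv4_ToolKit/main.py
        "downloader",
        "--classes", *classes,      # class names, with underscores instead of spaces
        "--type_csv", csv_type,     # "train" here, as only training data is downloaded
        "--multiclasses", "1",      # same value as in the command above
        "--limit", str(limit),      # same limit as in the command above
    ]


cmd = build_download_command(
    "/home/james/Public/Projects/myPython36VEnv/OIDv4_ToolKit/main.py",
    ["Car", "Vehicle_registration_plate"],
    500,
)
print(" ".join(cmd))
```

The list could then be passed to subprocess.run(cmd) from the project directory; the 'y/n' prompts would still have to be answered in the terminal.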
A directory called OID is created and contains the downloaded data:
note 1: rename the directory in Dataset/train from Car_Vehicle registration plate
to Car_Vehicle_registration_plate (i.e. replace the two spaces with underscore characters)
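The rename in note 1 can also be done from Python. This is a minimal sketch; the function name rename_class_dir is made up for illustration, and the path you pass in is assumed to be your own OID/Dataset/train directory.

```python
from pathlib import Path


def rename_class_dir(train_dir):
    """Rename 'Car_Vehicle registration plate' to use underscores only.

    Does nothing if the directory has already been renamed.
    """
    train_dir = Path(train_dir)
    old = train_dir / "Car_Vehicle registration plate"
    new = train_dir / "Car_Vehicle_registration_plate"
    if old.is_dir() and not new.exists():
        old.rename(new)
    return new
```

For example, rename_class_dir("OID/Dataset/train") when run from the Project_6b directory.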
note 2: 'tree', used below to create directory/file trees, is installed with $ sudo apt-get install tree (run from $HOME)
(myPython36VEnv) james@myB01:~/Public/Projects/Project_6b$ tree -L 5
.
└── OID
    ├── csv_folder
    │   ├── class-descriptions-boxable.csv
    │   └── train-annotations-bbox.csv
    └── Dataset
        └── train
            └── Car_Vehicle_registration_plate
                ├── 0006c87d3735fcef.jpg
                ├── 000b393437134262.jpg
                ├── 00166578c691cd43.jpg
                ├── 001679a19bb6fd3f.jpg
                ├── ffad3daadd475dee.jpg
                ├── fff1c14a00ae5a55.jpg
                ├──…..
                └── Label
                    ├── 0006c87d3735fcef.txt
                    ├── 000b393437134262.txt
                    ├── 00166578c691cd43.txt
                    ├── 001679a19bb6fd3f.txt
                    ├── 002521102ecfac4c.txt
                    ├── 002901d9d194c4fb.txt
                    ├── 002aeab534f0007a.txt
                    ├──…..
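Given the layout above (images in the class directory, labels in its Label subdirectory), a short check that every image has a label file can catch an incomplete download. The sketch below assumes that layout; the function name unmatched_images is illustrative only.

```python
from pathlib import Path


def unmatched_images(class_dir):
    """Return the names of .jpg files that have no matching .txt in Label/."""
    class_dir = Path(class_dir)
    label_stems = {p.stem for p in (class_dir / "Label").glob("*.txt")}
    return sorted(
        p.name for p in class_dir.glob("*.jpg") if p.stem not in label_stems
    )
```

An empty list from unmatched_images("OID/Dataset/train/Car_Vehicle_registration_plate") means every downloaded image has a label.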
creation of additional files:
manually create myClasses.txt and myClasses.names in /Car_Vehicle_registration_plate with the text:
Car
Vehicle_registration_plate
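If you prefer not to create the two files by hand, a sketch like this writes the same two lines to both of them. The function name write_class_files is hypothetical; the file names and class names are the ones given above.

```python
from pathlib import Path

CLASSES = ["Car", "Vehicle_registration_plate"]


def write_class_files(class_dir, classes=CLASSES):
    """Write myClasses.txt and myClasses.names, one class name per line."""
    class_dir = Path(class_dir)
    text = "\n".join(classes) + "\n"
    written = []
    for file_name in ("myClasses.txt", "myClasses.names"):
        path = class_dir / file_name
        path.write_text(text)
        written.append(path)
    return written
```

The order of the names matters later on, since the position of each name in the file is normally what a YOLO class number refers to.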
The following .zip file contains a Jupyter Notebook program to create additional files for YOLOv4 (to run Jupyter Notebook, type jupyter-notebook):
note: you will have to update PATH_TO_OID and PATH_TO_DOWNLOADED_DATA to match your directory names
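The notebook itself is not reproduced here, but the main conversion a YOLOv4 label file needs can be sketched as follows. This is not the notebook's code. It assumes that each line of an OIDv4 label file has the form "class name, left, top, right, bottom" in pixels, and that the class name may contain spaces; the image width and height must be supplied by the caller. The function name oid_line_to_yolo is made up for illustration.

```python
def oid_line_to_yolo(line, class_names, img_w, img_h):
    """Convert one OIDv4 label line to a YOLO label line.

    Assumed input:  "<class name> <x_min> <y_min> <x_max> <y_max>" in pixels.
    Output:         "<class_id> <x_center> <y_center> <width> <height>",
                    with all four values normalised to the range 0..1.
    """
    parts = line.split()
    # The last four tokens are the box; everything before them is the class name.
    x_min, y_min, x_max, y_max = (float(v) for v in parts[-4:])
    name = "_".join(parts[:-4])          # e.g. Vehicle_registration_plate
    class_id = class_names.index(name)   # position in myClasses.names
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return "{} {:.6f} {:.6f} {:.6f} {:.6f}".format(
        class_id, x_center, y_center, width, height
    )


names = ["Car", "Vehicle_registration_plate"]
print(oid_line_to_yolo("Car 100 200 300 400", names, 1000, 800))
```

Applying this to every line of every file in Label, and writing the result to a .txt file next to the matching .jpg, gives one label file per image.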
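Running the notebook creates additional files, as can be seen by comparing with the tree above (for example the .txt file beside each .jpg, myData.data and myTrain.txt):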
james@myB01:~/Public/Projects/Project_6b$ tree -L 4
.
├── OID
│   ├── csv_folder
│   │   ├── class-descriptions-boxable.csv
│   │   └── train-annotations-bbox.csv
│   └── Dataset
│       └── train
│           └── Car_Vehicle_registration_plate
│               ├── fe0edf1f6b9b4c3e.txt
│               ├── fe128d605ee33973.jpg
│               ├── fe128d605ee33973.txt
│               ├── febafd4c9ba92d4c.jpg
│               ├── febafd4c9ba92d4c.txt
│               ├── ff398f1526a2d447.jpg
│               ├── ff398f1526a2d447.txt
│               ├──…..
│               ├──…..
│               ├── ff45ac2ddf509f89.jpg
│               ├── ff45ac2ddf509f89.txt
│               ├── ffab1ac4b3fc5168.jpg
│               ├── ffab1ac4b3fc5168.txt
│               ├── myClasses.names
│               ├── myClasses.txt (was manually created)
│               ├── myData.data
│               ├── myTrain.txt
│               └── Label
│                   ├── 0006c87d3735fcef.txt
│                   ├── 000b393437134262.txt
│                   ├── 00166578c691cd43.txt
│                   ├── 001679a19bb6fd3f.txt
│                   ├── 002521102ecfac4c.txt
│                   ├── 002901d9d194c4fb.txt
│                   ├── 002aeab534f0007a.txt
└── project_6b_V0.ipynb
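For readers who want to see what myTrain.txt and myData.data might contain, here is a minimal sketch, not the notebook's own code. It assumes the usual darknet conventions: the train list holds one image path per line, and the .data file holds "key = value" lines with the keys classes, train, names and backup. The backup directory name and both function names are assumptions; no validation entry is written because only training data was downloaded.

```python
from pathlib import Path


def write_train_list(class_dir):
    """Write myTrain.txt with the absolute path of every .jpg, one per line."""
    class_dir = Path(class_dir)
    images = sorted(str(p.resolve()) for p in class_dir.glob("*.jpg"))
    (class_dir / "myTrain.txt").write_text("\n".join(images) + "\n")
    return images


def write_data_file(class_dir, num_classes, backup_dir="backup"):
    """Write myData.data pointing at the train list and the class names file."""
    class_dir = Path(class_dir)
    lines = [
        "classes = {}".format(num_classes),
        "train = {}".format(class_dir / "myTrain.txt"),
        "names = {}".format(class_dir / "myClasses.names"),
        "backup = {}".format(backup_dir),  # assumed location for saved weights
    ]
    text = "\n".join(lines) + "\n"
    (class_dir / "myData.data").write_text(text)
    return text
```

With two classes this would be called as write_train_list(class_dir) followed by write_data_file(class_dir, 2), where class_dir is your Car_Vehicle_registration_plate directory.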
If you have any problems or need clarification please contact me: jscanova@gmail.com