Training a Custom Object Detection Model with Tensorflow - Forecr.io

Training a Custom Object Detection Model with Tensorflow

07 April 2021
WHAT WILL YOU LEARN?

1- Configuring the dataset in VOC format

2- Setting up the Tensorflow container

3- Training and testing the model





ENVIRONMENT

Operating System: Ubuntu 18.04.5 LTS

CPU: AMD Ryzen 7 3700X 8-Core Processor

RAM: 32 GB DDR4 - 2133MHz

Graphics Card: RTX 2060 (6GB)

Graphics Card Driver Version: 460.39

In this blog post, we will train a model on our custom masked-face dataset with TensorFlow, running inside a container through the NVIDIA Container Toolkit. First, we will convert our YOLO-format dataset to VOC format. Then, we will set up the TensorFlow environment in Docker. Finally, we will fine-tune the SSD MobileNet v1 COCO model on our dataset and test the trained model.


Special thanks to Evan for sharing his TensorFlow Object Detection Classifier Training repository.


Configuring the dataset in VOC format


Dataset in VOC Format

In our previous training blog post, we used a custom YOLO dataset to detect masked faces. To begin with, let's convert that YOLO-format dataset to VOC format (we used the YBat YOLO Annotation Tool for this).



Remove the "labels" folders, place each VOC label file in the same folder as its image, and arrange the dataset like this:
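The core of the YOLO-to-VOC conversion is a change of box representation: a YOLO label line stores a class index plus a box center and size normalized to the 0-1 range, while a VOC XML file stores the box corners in pixels. The sketch below illustrates only that arithmetic. It is not the code of the annotation tool we used; the function names are hypothetical, and rounding conventions can differ between tools.

```python
def parse_yolo_line(line):
    """Split one YOLO label line: '<class_id> <x_center> <y_center> <width> <height>'."""
    class_id, x_c, y_c, w, h = line.split()
    return int(class_id), float(x_c), float(y_c), float(w), float(h)


def yolo_to_voc(x_center, y_center, width, height, img_w, img_h):
    """Convert a normalized YOLO box to VOC-style pixel corners (xmin, ymin, xmax, ymax)."""
    xmin = (x_center - width / 2) * img_w
    ymin = (y_center - height / 2) * img_h
    xmax = (x_center + width / 2) * img_w
    ymax = (y_center + height / 2) * img_h

    def clamp(value, upper):
        # Round to whole pixels and keep the coordinate inside the image.
        return max(0, min(int(round(value)), upper))

    return clamp(xmin, img_w), clamp(ymin, img_h), clamp(xmax, img_w), clamp(ymax, img_h)
```

For example, a box centered in a 100x200 image that covers half of each dimension maps to the corners (25, 50) and (75, 150).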


Setting up the Tensorflow container


Tensorflow Container Setup

Next, pull the TensorFlow image from the NVIDIA NGC Container Catalog (make sure the container version is compatible with your Docker version). Then download the packages, organize the files, test the container environment, and create TFRecords from the dataset.

$ cd ~

$ mkdir tensorflow_train && cd tensorflow_train/

$ mkdir training_result

$ git clone https://github.com/tensorflow/models.git

$ cp models/research/object_detection/samples/configs/ssd_mobilenet_v1_coco.config .

$ wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2018_01_28.tar.gz

Let's configure the ssd_mobilenet_v1_coco.config file for our dataset:

Line 9: Change num_classes to:

num_classes: 2


Line 156: Change fine_tune_checkpoint to:

fine_tune_checkpoint: "/tensorflow_train/models/research/object_detection/ssd_mobilenet_v1_coco_2018_01_28/model.ckpt"



Lines 175 and 177: In the train_input_reader section, change input_path and label_map_path to:

input_path: "/tensorflow_train/models/research/object_detection/train.record"

label_map_path: "/tensorflow_train/models/research/object_detection/training/labelmap.pbtxt"



Line 181: Change num_examples to the number of images in the /tensorflow_train/images/valid directory.



Lines 189 and 191: In the eval_input_reader section, change input_path and label_map_path to:

input_path: "/tensorflow_train/models/research/object_detection/valid.record"

label_map_path: "/tensorflow_train/models/research/object_detection/training/labelmap.pbtxt"
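The num_examples value has to match the number of validation images, and it is easy to miscount because each image folder also holds the XML label files. A small helper like the one below can do the counting. This is only a sketch: the function name and the set of image extensions are assumptions, so adjust them to your dataset.

```python
from pathlib import Path

# Assumed image extensions; extend this set if your dataset uses other formats.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images(folder):
    """Count image files directly inside `folder`, ignoring XML labels and other files."""
    return sum(
        1
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


valid_dir = Path("/tensorflow_train/images/valid")
if valid_dir.is_dir():
    # Use the printed number as num_examples in the config file.
    print(count_images(valid_dir))
```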



Let's move our dataset (the images folder and obj_names.txt) and the configuration files (label_map_generator.sh, tfrecord_generator.py and xml2csv.py) into the project folder (~/tensorflow_train).



You can download the TensorFlow configuration files from the downloads above.

Let's dive into the Docker environment. Start the TensorFlow container with GPU support and test it with Python. Then install the additional Python packages and run the model_builder_tf1_test.py test. The commands after docker run are executed inside the container.



$ docker --version

$ docker pull nvcr.io/nvidia/tensorflow:20.10-tf1-py3

$ docker run --runtime nvidia --rm -it -v ~/tensorflow_train/:/tensorflow_train --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/tensorflow:20.10-tf1-py3

$ apt update

$ python

>>> import tensorflow as tf

>>> print(tf.test.gpu_device_name())

/device:GPU:0

>>> exit()

$ cd /tensorflow_train/

$ cd models/research/

$ protoc object_detection/protos/*.proto --python_out=.

$ export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

$ pip3 install tensorflow-object-detection-api

$ pip3 install tf_slim

$ python object_detection/builders/model_builder_tf1_test.py



Convert the dataset's XML files to CSV. Generate TFRecords from the CSV files and save them into the object_detection folder. Then create the label map file from the text file that lists the label names.
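To give an idea of what the XML-to-CSV step does, the sketch below flattens one VOC annotation into one row per labeled object. It is not the xml2csv.py file from the downloads; the function name and the column order (filename, width, height, class, xmin, ymin, xmax, ymax) are assumptions that follow a common convention.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_rows(xml_text):
    """Return one (filename, width, height, class, xmin, ymin, xmax, ymax) tuple per object."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))

    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # Some tools write coordinates as floats, so convert through float first.
        corners = [int(float(box.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")]
        rows.append((filename, width, height, obj.findtext("name"), *corners))
    return rows
```

Each returned tuple can then be written as one line of a CSV file, for example with Python's csv module.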

$ cd /tensorflow_train/

$ python3 xml2csv.py

$ python3 tfrecord_generator.py --label_file=obj_names.txt --csv_input=images/train_labels.csv --output_path=${PWD}/models/research/object_detection/train.record --image_dir=${PWD}/images/train

$ python3 tfrecord_generator.py --label_file=obj_names.txt --csv_input=images/valid_labels.csv --output_path=${PWD}/models/research/object_detection/valid.record --image_dir=${PWD}/images/valid

$ mkdir ${PWD}/models/research/object_detection/training

$ ./label_map_generator.sh obj_names.txt
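For reference, a label map is a small text file with one item block per class, and the ids start at 1 because id 0 is reserved for the background. The sketch below shows how such a file could be built from a list of class names. It is not the label_map_generator.sh script itself; the function names and the example class names are hypothetical.

```python
def read_class_names(path):
    """Read one class name per line, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def build_label_map(class_names):
    """Build the text of a label map (.pbtxt) with ids starting at 1."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (index, name))
    return "\n".join(items)


# Hypothetical class names; use the names listed in your own obj_names.txt.
print(build_label_map(["mask", "no_mask"]))
```

The number of item blocks should equal the num_classes value set in the config file.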


Training and testing the model


Train and Test The Model

Finally, copy the training configuration file into the object_detection/training folder. Set up the training environment with the setup.py file. Extract the SSD MobileNet archive into the object_detection folder and start training. (Our training started with a loss of about 15 and ended at 1.44.)

$ cp ssd_mobilenet_v1_coco.config ssd_mobilenet_v1_coco_2018_01_28.tar.gz /tensorflow_train/models/research/object_detection/training/

$ cd /tensorflow_train/models/research/

$ cp object_detection/packages/tf1/setup.py .

$ python setup.py build

$ python setup.py install

$ cd /tensorflow_train/models/research/object_detection/

$ tar -zxvf training/ssd_mobilenet_v1_coco_2018_01_28.tar.gz

$ apt-get install -y ffmpeg libsm6 libxext6

$ python model_main.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config

A few moments later...



After the training ended...



As you can see, our files were saved into the /tmp/tmpl_4xnvo5/ folder (this temporary folder name comes from our run and will be different on your machine). Copy the model.ckpt-XXXX files (XXXX should be the highest checkpoint number) into the training folder. Then generate the frozen inference graph (.pb file).

If you plan to delete the models repository (the /tensorflow_train/models folder), copy these output files into the /tensorflow_train/training_result folder first.

$ cd /tensorflow_train/models/research/object_detection/

$ cp /tmp/tmpl_4xnvo5/model.ckpt-200000.* training/

$ python export_inference_graph.py --input_type image_tensor --pipeline_config_path training/ssd_mobilenet_v1_coco.config --trained_checkpoint_prefix training/model.ckpt-200000 --output_directory training

$ cp -r training/* /tensorflow_train/training_result/
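If the output folder holds several checkpoints, picking the highest-numbered one by eye is error-prone. The sketch below finds it by reading the step number from the model.ckpt-XXXX.index file names. The function name is hypothetical, and it assumes the checkpoints follow the model.ckpt-XXXX naming seen above.

```python
import re
from pathlib import Path


def latest_checkpoint_prefix(folder):
    """Return the path prefix 'model.ckpt-<highest step>' found in `folder`, or None."""
    pattern = re.compile(r"model\.ckpt-(\d+)\.index$")
    steps = []
    for path in Path(folder).glob("model.ckpt-*.index"):
        match = pattern.match(path.name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    # Compare the step numbers as integers, not as strings.
    return str(Path(folder) / ("model.ckpt-%d" % max(steps)))
```

The returned prefix is the kind of value that --trained_checkpoint_prefix expects once the checkpoint files have been copied into the training folder. TensorFlow's tf.train.latest_checkpoint offers similar functionality based on the checkpoint state file in the folder.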



These files are read-only for your host user because they are owned by root. To take ownership of them, run this command on the host PC (here the host PC's username is "user"):

$ sudo chown -hR user ~/tensorflow_train/training_result/



Copy an image from the dataset into the object_detection folder and download Evan's image test script. Then edit the script: change the frozen model's folder name and the number of classes, and replace imshow with imwrite so that the result is saved as detection_result.jpg.

$ cd /tensorflow_train/models/research/object_detection/

$ cp $(ls /tensorflow_train/images/train/*.jpg | head -1) /tensorflow_train/models/research/object_detection/test1.jpg

$ wget https://raw.githubusercontent.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10/master/Object_detection_image.py

$ sed -i "s/MODEL_NAME = 'inference_graph'/MODEL_NAME = 'training'/g" Object_detection_image.py

$ sed -i "s/NUM_CLASSES = 6/NUM_CLASSES = 2/g" Object_detection_image.py

$ sed -i "s/imshow('Object detector'/imwrite('detection_result.jpg'/g" Object_detection_image.py

$ python Object_detection_image.py

/tensorflow_train/models/research/object_detection/test1.jpg file



/tensorflow_train/models/research/object_detection/detection_result.jpg file


Thank you for reading our blog post.