How to Run Yolov5 Real Time Object Detection on NVIDIA® Jetson™ Nano™?

Jetson Nano

12 August 2021
WHAT YOU WILL LEARN

1- How to use Yolov5 model files in docker.

2- How to use Yolov5 Object Detection with both USB webcam and CSI camera.

3- How to run Yolov5 Object Detection in docker.

ENVIRONMENT

Hardware: DSBOX-N2

OS: Ubuntu 18.04 LTS, JetPack 4.5


How to use Yolov5 model files in docker


In this blog post, you will learn how to run Yolov5 object detection in real time with both a USB camera and a CSI camera. The JetsonYolo GitHub repository is used as a reference for the whole process.


If you are going to use a CSI camera for object detection, connect it to the Jetson™ Nano™ before powering the device up.

First, let us go to the Documents folder and clone the required repository there:


cd ./Documents 
git clone https://github.com/amirhosseinh77/JetsonYolo.git


Now, let us go into the weights folder and download assets for Yolov5:


cd JetsonYolo/weights 
wget https://github.com/ultralytics/yolov5/releases/download/v5.0/yolov5s.pt
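If the download is interrupted, the script will later fail with a confusing deserialization error, so it can be worth checking the file before moving on. A minimal sketch (the `weights_ready` helper and the size threshold are our own, not part of the repo; yolov5s.pt is roughly 14 MB):

```python
from pathlib import Path

def weights_ready(path, min_bytes=10_000_000):
    """Check that a downloaded weights file exists and is plausibly complete."""
    p = Path(path)
    return p.is_file() and p.stat().st_size >= min_bytes

# After wget finishes, from inside the weights folder:
# print(weights_ready("yolov5s.pt"))  # expect True for a complete download
```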


The final requirement for our setup is GStreamer, so let us install it along with its plugins:

sudo apt-get install libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev libgstreamer-plugins-bad1.0-dev gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly gstreamer1.0-libav gstreamer1.0-doc gstreamer1.0-tools gstreamer1.0-x gstreamer1.0-alsa gstreamer1.0-gl gstreamer1.0-gtk3 gstreamer1.0-qt5 gstreamer1.0-pulseaudio 
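The CSI pipeline later in this post only works if OpenCV was built with GStreamer support, which can be checked by scanning `cv2.getBuildInformation()`. The parsing helper below is a small sketch of ours, not an OpenCV API:

```python
def has_gstreamer(build_info: str) -> bool:
    """Return True if an OpenCV build-information dump reports GStreamer support."""
    for line in build_info.splitlines():
        if "GStreamer" in line:
            # A supported build reports a line like "GStreamer:  YES (1.14.5)"
            return "YES" in line
    return False

# Usage on the Jetson (or later inside the container):
# import cv2
# print(has_gstreamer(cv2.getBuildInformation()))
```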

How to use Yolov5 Object Detection with both USB webcam and CSI camera


Since our preparations outside of docker are done, let us pull the container image and run it:

cd 
docker pull nvcr.io/nvidia/l4t-ml:r32.5.0-py3
sudo docker run -it --rm --runtime nvidia -e DISPLAY=:0 -v /tmp/.X11-unix:/tmp/.X11-unix -v /home/nvidia/Documents:/home/nvidia/Documents -v /tmp/argus_socket:/tmp/argus_socket --device /dev/video0:/dev/video0 --network host nvcr.io/nvidia/l4t-ml:r32.5.0-py3


Now we should be inside our container. We need to create, edit, and run a Python file, and to do that, we first need to install nano, Tkinter, and tqdm:

apt-get update 
apt-get install nano
apt-get install python3-tk
pip3 install tqdm
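Before editing the script, it can help to confirm the container really has everything the code will import. A generic sketch (the `missing_modules` helper is our own; the module names in the comment are what JetsonYolo.py needs):

```python
import importlib

def missing_modules(names):
    """Return the subset of the given module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

# Inside the l4t-ml container this should come back empty:
# print(missing_modules(["cv2", "numpy", "torch", "tqdm", "tkinter"]))
```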


With these installed, we need to create a Python file inside the JetsonYolo folder and open it for editing:

cd /home/nvidia/Documents/JetsonYolo/ 
touch JetsonYolo.py
nano JetsonYolo.py


nano now opens with an empty file.
If you are going to use a USB webcam, copy the code below and paste it into nano with Ctrl + Shift + V:

import cv2 
import numpy as np
from elements.yolo import OBJ_DETECTION

Object_classes = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
                  'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
                  'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
                  'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
                  'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
                  'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
                  'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
                  'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
                  'hair drier', 'toothbrush']

Object_colors = list(np.random.rand(80,3)*255)
Object_detector = OBJ_DETECTION('weights/yolov5s.pt', Object_classes)

cap = cv2.VideoCapture(0)
if cap.isOpened():
    window_handle = cv2.namedWindow("USB Camera", cv2.WINDOW_AUTOSIZE)
    # Window
    while cv2.getWindowProperty("USB Camera", 0) >= 0:
        ret, frame = cap.read()
        if ret:
            # detection process
            objs = Object_detector.detect(frame)

            # plotting
            for obj in objs:
                # print(obj)
                label = obj['label']
                score = obj['score']
                [(xmin, ymin), (xmax, ymax)] = obj['bbox']
                color = Object_colors[Object_classes.index(label)]
                frame = cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), color, 2)
                frame = cv2.putText(frame, f'{label} ({str(score)})', (xmin, ymin),
                                    cv2.FONT_HERSHEY_SIMPLEX, 0.75, color, 1, cv2.LINE_AA)

            # show only valid frames, so a failed read cannot crash the display call
            cv2.imshow("USB Camera", frame)
        keyCode = cv2.waitKey(30)
        if keyCode == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()
else:
    print("Unable to open camera")
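One small readability tweak: the f-string above prints the raw detection score, which can be a long float. A hedged helper for rounding it in the overlay (the `make_caption` name is our own, not part of the JetsonYolo repo):

```python
def make_caption(label: str, score: float, digits: int = 2) -> str:
    """Format a detection caption like 'person (0.87)' for cv2.putText."""
    return f"{label} ({round(float(score), digits)})"

# In the plotting loop, the f-string could be replaced with:
# frame = cv2.putText(frame, make_caption(label, score), (xmin, ymin), ...)
```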

If you are going to use a CSI camera, copy the code below and paste it into nano with Ctrl + Shift + V:

import cv2
import numpy as np
from elements.yolo import OBJ_DETECTION

Object_classes = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
                  'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
                  'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
                  'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
                  'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
                  'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
                  'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
                  'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
                  'hair drier', 'toothbrush']

Object_colors = list(np.random.rand(80,3)*255)
Object_detector = OBJ_DETECTION('weights/yolov5s.pt', Object_classes)

def gstreamer_pipeline(
    capture_width=1280,
    capture_height=720,
    display_width=800,
    display_height=600,
    framerate=30,
    flip_method=2,
):
    return (
        "nvarguscamerasrc ! "
        "video/x-raw(memory:NVMM), "
        "width=(int)%d, height=(int)%d, "
        "format=(string)NV12, framerate=(fraction)%d/1 ! "
        "nvvidconv flip-method=%d ! "
        "video/x-raw, width=(int)%d, height=(int)%d, format=(string)BGRx ! "
        "videoconvert ! "
        "video/x-raw, format=(string)BGR ! appsink"
        % (
            capture_width,
            capture_height,
            framerate,
            flip_method,
            display_width,
            display_height,
        )
    )
# To flip the image, modify the flip_method parameter (0 and 2 are the most common)
print(gstreamer_pipeline(flip_method=2))
cap = cv2.VideoCapture(gstreamer_pipeline(flip_method=2), cv2.CAP_GSTREAMER)
if cap.isOpened():
    window_handle = cv2.namedWindow("CSI Camera", cv2.WINDOW_AUTOSIZE)
    # Window
    while cv2.getWindowProperty("CSI Camera", 0) >= 0:
        ret, frame = cap.read()
        if ret:
            # detection process
            objs = Object_detector.detect(frame)

            # plotting
            for obj in objs:
                # print(obj)
                label = obj['label']
                score = obj['score']
                [(xmin, ymin), (xmax, ymax)] = obj['bbox']
                color = Object_colors[Object_classes.index(label)]
                frame = cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), color, 2)
                frame = cv2.putText(frame, f'{label} ({str(score)})', (xmin, ymin),
                                    cv2.FONT_HERSHEY_SIMPLEX, 0.75, color, 1, cv2.LINE_AA)

            # show only valid frames, so a failed read cannot crash the display call
            cv2.imshow("CSI Camera", frame)
        keyCode = cv2.waitKey(30)
        if keyCode == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()
else:
    print("Unable to open camera")
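To judge whether the Nano is keeping up with real time, a rolling frames-per-second estimate can be printed from inside either capture loop. This counter is a hypothetical helper of ours, not part of the repo; the injectable clock exists only to make it easy to test:

```python
import time
from collections import deque

class FPSCounter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30, clock=time.monotonic):
        self._times = deque(maxlen=window)
        self._clock = clock  # injectable clock, useful for testing

    def tick(self):
        """Record one frame; return the current FPS estimate (0.0 until 2 frames)."""
        self._times.append(self._clock())
        if len(self._times) < 2:
            return 0.0
        elapsed = self._times[-1] - self._times[0]
        return (len(self._times) - 1) / elapsed if elapsed > 0 else 0.0

# In the capture loop, after cv2.imshow(...):
# fps = counter.tick()
# print(f"{fps:.1f} FPS")
```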

How to run Yolov5 Object Detection in docker


Now we need to give the container access to the X display so the camera window can be shown. Open a new terminal with Ctrl + Alt + T and run:

xhost + 


The command should report that access control is disabled. Then let us switch back to our first terminal and run our code, since we are ready:

python3 JetsonYolo.py 


Starting the file may take a couple of minutes while the model loads; after that, detections are drawn on the live camera feed.
Press ‘q’ to stop the program.

Thank you for reading our blog post. 

