How to Integrate Custom TLT Semantic Segmentation Model to NVIDIA® Jetson™ Modules?

Jetson AGX Xavier | Jetson Nano | Jetson TX2 NX | Jetson Xavier NX

23 August 2021
WHAT YOU WILL LEARN

1- How to Generate Engine Using tlt-converter?

2- How to Deploy the Model in Deepstream?

ENVIRONMENT

Hardware: DSBOX-NX2

OS: Jetpack 4.5


In this blog post, we will show how to integrate a custom segmentation model, previously trained with the NVIDIA® Transfer Learning Toolkit, into Jetson™ modules using DeepStream. To learn how to train a custom segmentation model with the Transfer Learning Toolkit, click here.

Before we get started, you should download tlt-converter to generate an engine and ds-tlt-segmentation to test the model.


To install tlt-converter, go to the following page, then download and unzip the archive for JetPack 4.5:


https://docs.nvidia.com/metropolis/TLT/tlt-user-guide/text/tensorrt.html#tlt-converter-matrix


You also need to install the OpenSSL development package:


sudo apt-get install libssl-dev 


Export the following environment variables.


export TRT_LIB_PATH="/usr/lib/aarch64-linux-gnu"
export TRT_INC_PATH="/usr/include/aarch64-linux-gnu"
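As an optional sanity check (our addition, not part of the original steps), you can confirm that both variables point at directories that exist on a JetPack 4.5 system; the paths are the ones exported above:

```shell
# Export the TensorRT library/include paths used by tlt-converter.
export TRT_LIB_PATH="/usr/lib/aarch64-linux-gnu"
export TRT_INC_PATH="/usr/include/aarch64-linux-gnu"

# On a Jetson module these directories should exist; warn if they do not.
for d in "$TRT_LIB_PATH" "$TRT_INC_PATH"; do
    if [ -d "$d" ]; then
        echo "OK: $d"
    else
        echo "WARNING: $d not found"
    fi
done
```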


To install ds-tlt, go to the following GitHub repository and build deepstream_tlt_apps according to its instructions. You do not need to download the pre-trained models, since we will use our own model.

https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps#2-build-sample-application
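A minimal sketch of fetching the sample apps (the repository URL is the one linked above; the build itself follows the repo's README):

```shell
# Clone the DeepStream TLT sample apps repository linked above.
REPO_URL="https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps.git"
git clone "$REPO_URL" || echo "clone failed -- check your network connection"
```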


How to Generate Engine Using tlt-converter?


Make sure you have the exported models that we previously trained.

Now, we will generate an engine using tlt-converter. You can find the usage of tlt-converter by running the following command.


./tlt-converter -h


We will show the steps for each exported model (FP16, FP32, and INT8).


FP16 model:


cd exported_models/fp16/
/home/nvidia/tlt-converter -k forecr -p input_1,1x3x320x320,2x3x320x320,3x3x320x320 segmentation_model_fp16.etlt -e fp16.engine -t fp16


FP32 model:


cd exported_models/fp32/
/home/nvidia/tlt-converter -k forecr -p input_1,1x3x320x320,2x3x320x320,3x3x320x320 segmentation_model_fp32.etlt -e fp32.engine -t fp32


INT8 model (do not forget to add the calibration cache file):


cd exported_models/int8/
/home/nvidia/tlt-converter -k forecr -p input_1,1x3x320x320,2x3x320x320,3x3x320x320 segmentation_model_int8.etlt -e int8.engine -t int8 -c cal.bin
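Optionally, before moving on to DeepStream, you can sanity-check that a generated engine deserializes correctly with TensorRT's bundled trtexec tool. This step is our addition, not part of the original workflow; the path below is where JetPack installs the TensorRT samples, so adjust it if your setup differs:

```shell
# Optional: load a generated engine with trtexec to confirm it is valid.
TRTEXEC="/usr/src/tensorrt/bin/trtexec"
ENGINE="fp16.engine"   # or fp32.engine / int8.engine

if [ -x "$TRTEXEC" ]; then
    # Deserializes the engine and runs a short benchmark pass.
    "$TRTEXEC" --loadEngine="$ENGINE"
else
    echo "trtexec not found at $TRTEXEC (is TensorRT installed?)"
fi
```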

How to Deploy the Model in Deepstream?


Before running ds-tlt, we need to make changes to the Makefiles.

Set NVDS_VERSION:=5.1 in apps/Makefile and post_processor/Makefile. 

Now, you can set the CUDA version and run make.


export CUDA_VER=xy.z                   # xy.z is the CUDA version, e.g. 10.2
make
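The two Makefile edits described above can also be scripted. The sed pattern below assumes each Makefile already contains an NVDS_VERSION line (run it from the deepstream_tlt_apps directory):

```shell
# Set NVDS_VERSION:=5.1 in both Makefiles (skip silently if a file is missing).
for f in apps/Makefile post_processor/Makefile; do
    if [ -f "$f" ]; then
        sed -i 's/^NVDS_VERSION.*/NVDS_VERSION:=5.1/' "$f"
    fi
done
```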


Also, go to the DeepStream source code and change the input/output dimensions according to your model. For example, for the ResNet18 3-channel model, set the following:


#define MODEL_OUTPUT_WIDTH 320
#define MODEL_OUTPUT_HEIGHT 320



Finally, you can run ds-tlt-segmentation to test the model. Supported formats for the test input are JPEG and H.264. You can use the sample videos and pictures that come with DeepStream, under /opt/nvidia/deepstream/deepstream-5.0/samples/streams.


./apps/tlt_segmentation/ds-tlt-segmentation -c configs/unet_tlt/pgie_unet_tlt_config.txt -i streams/sample_720p.jpeg 

Thank you for reading our blog post. 
