TensorRT based Real Time Object Classification using EfficientNet on Nvidia Jetson Nano.

In this blog, I would like to share an edge-based Deep Learning project implemented on the Nvidia Jetson Nano Developer Kit (4GB).

The project is meant to help beginners like me quickly understand the flow of converting a DL model from PyTorch to ONNX and then to a TensorRT engine. It also includes a Python script that performs real-time TensorRT-based inference on camera input and displays the predicted class as an overlay on the screen.

A bit of background: I am an Embedded Systems Engineer and relatively new to the field of Deep Learning and Computer Vision. Last year, as part of my job, I was introduced to the Jetson family of embedded boards (Xavier NX, AGX Xavier, TX2 and Nano).

The courses from Nvidia’s Deep Learning Institute gave me a base from which to learn more about this field.

Coursera’s Deep Learning Specialization and Udacity’s Computer Vision NanoDegree both gave me the required foundational knowledge and practical skills on Deep Learning and Computer Vision.

If you are interested in exploring these courses, please refer to the links below.

Around October 2020, Nvidia announced AI Certification programs based on Jetson devices.

Therefore, to gain more experience with Nvidia’s TensorRT framework and edge inference, I decided to explore it on the Jetson Nano.


Please find the project at

arvcode/TensorRT_classifier_efficientNet: Real Time TensorRT based object classification using EfficientNet CNN (github.com)

The project is based on a Convolutional Neural Network (CNN) model named EfficientNet. For more details on EfficientNet, please refer to Google AI Blog: EfficientNet: Improving Accuracy and Efficiency through AutoML and Model Scaling (googleblog.com)

The EfficientNet used is pre-trained on ImageNet (1000 classes).

EfficientNet was chosen because it has fewer parameters and is faster than the ResNet(x) models. Since the Jetson Nano is a small embedded device, EfficientNet is the better fit.

The project contains five Jupyter notebooks. The first four are step-by-step instructions for converting a pretrained PyTorch model to ONNX and then to a TensorRT engine, and the fifth contains the real-time inference code.

The notebooks are as follows:

  1. 1_Preparation.ipynb — This notebook downloads the EfficientNetB2 model and does a test inference without TensorRT. We can swap in any other model (for example, ResNet50) and follow the same template for the conversion.
  2. 2_Convert_to_ONNX.ipynb — This notebook converts the PyTorch model to ONNX.
  3. 3_Convert_ONNX_to_TensorRT.ipynb — This notebook converts the ONNX model to TensorRT engine.
  4. 4_TensorRT_Inference_Test.ipynb — This notebook performs a test inference using the new TensorRT engine.
  5. TensorRT_efficientnet_infer.ipynb — This notebook contains the real-time object classification code. It takes the feed from the camera and performs inference in real time. Once everything works, we can convert it to a Python script [TensorRT_efficientnet_infer.py].
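
As a rough illustration of the PyTorch to ONNX step (notebook 2), the export might look like the sketch below. This is not the notebook's actual code: the 260x260 input size, the tensor names "input" and "output", the opset version and the helper names are all assumptions made for the example.

```python
from pathlib import Path


def dummy_input_shape(batch_size=1, image_size=260):
    """Shape of the [Batch, Channels, Height, Width] tensor used for tracing.

    image_size=260 is an assumption; use the size the model was prepared with.
    """
    return (batch_size, 3, image_size, image_size)


def export_to_onnx(model, onnx_path, image_size=260, opset_version=11):
    """Export a PyTorch model to an ONNX file with a fixed batch size of 1."""
    import torch  # imported here so dummy_input_shape() works without PyTorch

    model.eval()  # switch dropout / batch-norm layers to inference behaviour
    dummy = torch.randn(*dummy_input_shape(1, image_size))
    torch.onnx.export(
        model,
        dummy,
        str(onnx_path),
        input_names=["input"],    # hypothetical tensor names
        output_names=["output"],
        opset_version=opset_version,
    )
    return Path(onnx_path)
```

With a loaded model, a call such as export_to_onnx(model, "efficientnet_b2.onnx") would write the ONNX file; the file name here is only a placeholder.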

To run inference directly, we can use the TensorRT_efficientnet_infer.py script.
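
To give an idea of what such a real-time loop involves, here is a minimal sketch using OpenCV. It is not the script from the repository: `infer` is a hypothetical callable standing in for the TensorRT inference code, and the function names, window title and camera index are assumptions.

```python
def top1(scores, labels):
    """Return (label, score) for the highest-scoring class."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best], scores[best]


def run_camera_loop(infer, labels, camera_index=0):
    """Classify camera frames and overlay the top-1 class on each frame.

    `infer` is a hypothetical callable: it takes one BGR frame and returns
    one score per class.
    """
    import cv2  # imported lazily so top1() can be used without OpenCV

    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError(f"Could not open camera {camera_index}")
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # camera disconnected or stream ended
                break
            label, score = top1(infer(frame), labels)
            cv2.putText(frame, f"{label}: {score:.2f}", (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
            cv2.imshow("TensorRT classification", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```

Keeping the class lookup in a small pure function like top1() makes it easy to check without a camera attached.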

Sample Output

The repository also contains jetson_nano_setup_instructions.md, which has instructions for setting up the environment on the Jetson Nano to run the Jupyter notebooks.


Some of the challenges faced during the project were:

  1. Lack of RAM while converting the ONNX model to a TensorRT engine. This can be worked around by adding swap space.
  2. Without the explicit batch flag, the ONNX to TensorRT conversion and the inference run into issues. We can avoid this by creating the network with

explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

  3. In this project, the TensorRT input was expected as [Batch,Height,Width,Channels], whereas a PyTorch tensor is [Batch,Channels,Height,Width] (here height==width). A proper transpose operation resolves the mismatch.
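
Challenges 2 and 3 can be sketched together as below. This is only an illustration, not the code from the notebooks: it assumes the TensorRT 7.x/8.x-era Python API (newer releases rename some of these builder calls), a 256 MiB workspace chosen with the Nano's limited RAM in mind, and hypothetical function names. Whether the transpose is needed at all depends on the layout the engine's input actually expects.

```python
NCHW_TO_NHWC = (0, 2, 3, 1)  # axis order that moves the channel axis last


def build_engine(onnx_path, engine_path, workspace_bytes=1 << 28):
    """Parse an ONNX file and save a serialized TensorRT engine."""
    import tensorrt as trt  # imported lazily; only available on the device

    logger = trt.Logger(trt.Logger.WARNING)
    # The explicit batch flag is required for networks parsed from ONNX.
    explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

    builder = trt.Builder(logger)
    network = builder.create_network(explicit_batch)
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed: " + "; ".join(errors))

    config = builder.create_builder_config()
    config.max_workspace_size = workspace_bytes  # assumed value, tune as needed
    engine = builder.build_engine(network, config)
    if engine is None:
        raise RuntimeError("TensorRT engine build failed")

    with open(engine_path, "wb") as f:
        f.write(engine.serialize())
    return engine_path


def to_nhwc(batch_nchw):
    """Reorder a NumPy array from [B, C, H, W] to a contiguous [B, H, W, C]."""
    import numpy as np  # imported lazily to keep the module import-light

    return np.ascontiguousarray(batch_nchw.transpose(NCHW_TO_NHWC))
```

The contiguous copy in to_nhwc() matters because a plain transpose only changes the strides, while the buffer handed to the engine has to be laid out in memory in the new order.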

The usual issues, such as missing packages, can be easily fixed by installing them via apt or pip.

The repository also lists the references and resources used, in References.md. Thanks to the Deep Learning community for the resources.

Hardware Setup:

As for the hardware setup, my Jetson Nano has two cameras.

  1. One connected to the MIPI-CSI-2 interface.
  2. The other connected to the USB interface.

Since it is easier to move around and point at objects, I used the USB camera.

I also networked the device via a router so that it can be accessed from a PC. This made it possible to open the Jupyter notebooks from the PC, so file copies and debugging were much faster.

For accessing Jupyter notebook via PC, please refer to the Jupyter notebook setup section in jetson_nano_setup_instructions.md.

For more information on the hardware accessories, please check

SparkFun DLI Kit (without Jetson Nano) — KIT-16389 — SparkFun Electronics

From this project, I gained hands-on experience with model conversion and inference using TensorRT on an embedded edge device such as the Jetson Nano.

Thanks for reading!