- Based on TensorRT 8.2.4; see the environment section below for details.
- Target: compare inference speed across PyTorch, ONNX Runtime, TensorRT C++ (OnnxParser and layer-definition API), and TensorRT Python (OnnxParser and layer-definition API).
- Process:
  - Use PyTorch to implement or collect some classical CNN networks, and train them to obtain `.pth` model files;
  - Python API with TensorRT and CUDA:
    - OnnxParser builds the network: model.pth -> model.onnx -> model.plan (see the sketch after this list);
    - Definition API builds the network layer by layer: model.pth -> model.npz -> model.plan;
  - C++ API with TensorRT and CUDA:
    - OnnxParser builds the network: model.pth -> model.onnx -> model.plan;
    - Definition API builds the network layer by layer: model.pth -> model.wts -> model.plan;
  - Compare the time cost of the different inference APIs;
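As a reference for the OnnxParser path above, building a `.plan` file from an ONNX model takes only a few TensorRT Python API calls. A minimal FP32 sketch (file names are placeholders; the repo's `trt_infer.py` scripts add shape handling and the INT8 options on top of something like this):

```python
import tensorrt as trt

# Build a TensorRT engine (.plan) from an ONNX file; FP32, static shapes assumed.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:          # placeholder path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

config = builder.create_builder_config()
config.max_workspace_size = 1 << 30          # 1 GiB scratch space for tactic search

engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine_bytes)
```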
- TensorRT FP32

| | PyTorch | ONNX Runtime | Python trt onnxparser | Python trt api | C++ trt onnxparser | C++ trt api |
|---|---|---|---|---|---|---|
| VGG16 | 93 ms | 74 ms | 9 ms | 9 ms | 5 ms | 5 ms |
| ResNet50 | 96 ms | 75 ms | 9 ms | 9 ms | 5 ms | 5 ms |
| UNet | 181 ms | 152 ms | 26 ms | 26 ms | 27 ms | 26 ms |
| Deeplabv3+ | 208 ms | 158 ms | 28 ms | 30 ms | 26 ms | 25 ms |
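For context on how such per-batch latencies are usually obtained: average many runs after a warm-up, synchronizing the GPU around the timed region. A sketch for the PyTorch column (the repo's `pytorch_inference.py` scripts may differ in details):

```python
import time
import torch

@torch.no_grad()
def measure_latency_ms(model, x, iters=100, warmup=10):
    """Average GPU inference latency in milliseconds for one input batch."""
    model.eval().cuda()
    x = x.cuda()
    for _ in range(warmup):       # warm-up: CUDA context creation, cuDNN autotuning
        model(x)
    torch.cuda.synchronize()      # make sure warm-up work has finished
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()      # wait for all timed kernels to complete
    return (time.perf_counter() - start) / iters * 1000
```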
- TensorRT INT8

| | Python trt onnxparser | Python trt api | C++ trt onnxparser | C++ trt api |
|---|---|---|---|---|
| ResNet50 | 6 ms | 6 ms | 3 ms | 2 ms |
| Deeplabv3+ | 15 ms | 15 ms | 11 ms | 10 ms |
- Additionally, accuracy comparison before and after INT8 quantization:

| | before quantization | after quantization |
|---|---|---|
| ResNet50 (accuracy) | 95.08% | 95.08% |
| Deeplabv3+ (mIoU) | 61.99% | 60.96% |
Note: the backbone of Deeplabv3+ is ResNet50.
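For the INT8 engines, TensorRT needs a calibrator that streams representative batches during the build (the `calibrator.py` / `calibrator.cpp` files in the tree below play this role). A minimal Python sketch of the entropy-calibrator pattern; the class name, batch handling, and cache file are illustrative, not the project's actual code:

```python
import numpy as np
import tensorrt as trt
from cuda import cudart  # cuda-python, installed in the environment section below

class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Streams calibration batches to TensorRT while it builds an INT8 engine."""

    def __init__(self, batches, cache_file="calib.cache"):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.batches = batches           # list of equally shaped float32 NCHW arrays
        self.index = 0
        self.cache_file = cache_file
        _, self.d_input = cudart.cudaMalloc(batches[0].nbytes)

    def get_batch_size(self):
        return self.batches[0].shape[0]

    def get_batch(self, names):
        if self.index >= len(self.batches):
            return None                  # no more data: calibration finishes
        batch = np.ascontiguousarray(self.batches[self.index])
        self.index += 1
        cudart.cudaMemcpy(self.d_input, batch.ctypes.data, batch.nbytes,
                          cudart.cudaMemcpyKind.cudaMemcpyHostToDevice)
        return [int(self.d_input)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()          # reuse a previous calibration run
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)
```

The builder config then enables INT8 and attaches the calibrator: `config.set_flag(trt.BuilderFlag.INT8)` and `config.int8_calibrator = EntropyCalibrator(batches)`.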
https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/ # TensorRT official documentation (C++ API)
https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/ # TensorRT official documentation (Python API)
https://github.com/NVIDIA/trt-samples-for-hackathon-cn/tree/master/cookbook
https://github.com/wang-xinyu/tensorrtx
- Project directory:

```
project dir
├── flower_classify_dataset # Dataset used for classification, 5 kinds of flowers, link below
│ ├── train
│ ├── val
│ └── test
├── Camvid_segment_dataset # Dataset used for segmentation, link below
│ ├── images # original images
│ │ ├── train
│ │ ├── val
│ │ └── test
│ ├── labels # mask labels, consisting of category indices
│ ├── train.lst # format: train_image_path'\t'train_label_path
│ ├── val.lst # format: valid_image_path'\t'valid_label_path
│ └── labels.txt # category index and its color
├── VGG16
│ ├── PyTorch
│ │ ├── dataset.py
│ │ ├── model # auto generated after running train.py, '.pth' type model file will be here
│ │ ├── pytorch_inference.py # use pytorch api inference
│ │ ├── train.py
│ │ └── vgg.py # pytorch network file
│ └── TensorRT
│ ├── C++
│ │ ├── api_model
│ │ │ ├── calibrator.cpp # for int8 quantization
│ │ │ ├── calibrator.h
│ │ │ ├── Makefile
│ │ │ ├── pth2wts.py # '.pth' type model file to '.wts' type file
│ │ │ ├── public.h
│ │ │ ├── trt_infer.cpp # generate tensorrt plan file, use tensorrt c++ api inference
│ │ │ └── vgg.py # the same network file as the other vgg.py copies above and below
│ │ └── onnx_parser
│ │ ├── calibrator.cpp
│ │ ├── calibrator.h
│ │ ├── Makefile
│ │ ├── onnx_infer.py # '.pth' type file to '.onnx' type file, and onnx runtime inference
│ │ ├── public.h
│ │ ├── trt_infer.cpp # onnx to tensorrt plan file, and tensorrt c++ api inference
│ │ └── vgg.py
│ └── python
│ ├── api_model
│ │ ├── calibrator.py # for int8 quantization
│ │ ├── pth2npz.py # '.pth' type model file to '.npz' type file
│ │ ├── trt_inference.py # generate tensorrt plan file, use tensorrt python api inference
│ │ └── vgg.py
│ └── onnx_parser
│ ├── calibrator.py
│ ├── onnx_infer.py # '.pth' type file to '.onnx' type file, and onnx runtime inference
│ ├── trt_infer.py # onnx to tensorrt plan file, and tensorrt python api inference
│ └── vgg.py
├── ResNet50 # The file structure is basically the same as VGG16
│ ├── PyTorch
│ └── TensorRT
│ ├── C++
│ │ ├── api_model
│ │ └── onnx_parser
│ └── python
│ ├── api_model
│ └── onnx_parser
├── UNet # The file structure is basically the same as VGG16
│ ├── PyTorch
│ └── TensorRT
│ ├── C++
│ │ ├── api_model
│ │ └── onnx_parser
│ └── python
│ ├── api_model
│ └── onnx_parser
└── Deeplabv3+ # The file structure is basically the same as VGG16
├── PyTorch
└── TensorRT
├── C++
│ ├── api_model
│ └── onnx_parser
└── python
├── api_model
    └── onnx_parser
```
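The `pth2npz.py` / `pth2wts.py` steps listed above export the trained weights into a format the definition API can read back layer by layer. Conceptually, the `.npz` export reduces to this (paths are placeholders; the project's scripts may rename or reshape entries):

```python
import numpy as np
import torch

# Export a trained PyTorch checkpoint to .npz for the TensorRT definition API.
state_dict = torch.load("model/model.pth", map_location="cpu")
np.savez("model/model.npz",
         **{name: tensor.numpy() for name, tensor in state_dict.items()})
```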
- For details on how to run each subproject, please read the README under that subproject's directory.
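All four TensorRT variants end the same way: deserialize the `.plan` file and execute it. A hedged Python sketch of that final step (input/output shapes are placeholders; the `trt_infer.py` scripts additionally handle preprocessing and timing):

```python
import numpy as np
import tensorrt as trt
from cuda import cudart

logger = trt.Logger(trt.Logger.WARNING)
with open("model.plan", "rb") as f:                        # placeholder path
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

inp = np.random.rand(1, 3, 224, 224).astype(np.float32)   # placeholder input
out = np.empty((1, 1000), dtype=np.float32)                # placeholder output buffer

_, d_inp = cudart.cudaMalloc(inp.nbytes)
_, d_out = cudart.cudaMalloc(out.nbytes)
cudart.cudaMemcpy(d_inp, inp.ctypes.data, inp.nbytes,
                  cudart.cudaMemcpyKind.cudaMemcpyHostToDevice)
context.execute_v2([int(d_inp), int(d_out)])               # synchronous inference
cudart.cudaMemcpy(out.ctypes.data, d_out, out.nbytes,
                  cudart.cudaMemcpyKind.cudaMemcpyDeviceToHost)
cudart.cudaFree(d_inp)
cudart.cudaFree(d_out)
print("predicted class:", out.argmax())
```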
- Datasets link: dataset (extraction code: z3qp)
- Ubuntu 16.04
- GPU: GeForce RTX 2080 Ti
- CUDA 11.2
- Docker, nvidia-docker
```bash
docker pull nvcr.io/nvidia/tensorrt:22.04-py3
```
- The library versions in the image are as follows:

| CUDA | cuDNN | TensorRT | python |
|---|---|---|---|
| 11.6.2 | 8.4.0.27 | 8.2.4.2 | 3.8.10 |
- Build the docker container:

```bash
docker run -it --gpus device=0 --shm-size 32G -v /home:/workspace nvcr.io/nvidia/tensorrt:22.04-py3 bash
```

`-v /home:/workspace` mounts the host's /home directory into the container at /workspace to facilitate file interaction; you can also choose another directory.
- Install OpenCV 4.5.0:
  - The OpenCV 4.5.0 source link is below; download the zip package, extract it, and place it in the host's /home directory, i.e. the container's /workspace directory:
    https://github.com/opencv/opencv
  - The following operations are performed inside the container:

```bash
# install dependencies
apt install build-essential
apt install libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
apt install libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libdc1394-22-dev
# build and install OpenCV
cd /workspace/opencv-4.5.0
mkdir build
cd build
cmake -D CMAKE_INSTALL_PREFIX=/usr/local -D CMAKE_BUILD_TYPE=Release -D OPENCV_GENERATE_PKGCONFIG=ON -D OPENCV_ENABLE_NONFREE=True ..
make -j6
make install
```
- Install PyTorch:
  - Download and install torch-1.12.0:

```bash
# open https://download.pytorch.org/whl/torch/ (PyTorch official wheel index)
# find torch-1.12.0+cu116-cp38-cp38-linux_x86_64.whl
# download it, place it in the /workspace directory, then run:
pip install torch-1.12.0+cu116-cp38-cp38-linux_x86_64.whl
```

  - Download and install torchvision-0.13.0:

```bash
# open https://download.pytorch.org/whl/torchvision/
# find torchvision-0.13.0+cu116-cp38-cp38-linux_x86_64.whl
# download it, place it in the /workspace directory, then run:
pip install torchvision-0.13.0+cu116-cp38-cp38-linux_x86_64.whl
```
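A quick sanity check that the wheels match the container's CUDA (not part of the project's scripts):

```python
import torch
import torchvision

print(torch.__version__, torchvision.__version__)  # expect 1.12.0+cu116 and 0.13.0+cu116
print(torch.cuda.is_available())                   # should print True inside the container
```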
- Install the other Python libraries:

```bash
pip install opencv-python==3.4.16.59
pip install opencv-python-headless==3.4.16.59
pip install tensorboard
pip install cuda-python
pip install onnxruntime-gpu==1.10
pip install scipy
pip install matplotlib
pip install tqdm
```
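Optionally, verify that TensorRT and ONNX Runtime both see the GPU before running the benchmarks (a sanity check, not part of the project's scripts):

```python
import tensorrt as trt
import onnxruntime as ort

print(trt.__version__)                 # expect 8.2.4.2 in the 22.04-py3 image
print(ort.get_available_providers())   # 'CUDAExecutionProvider' should be listed
```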
At this point, all programs in the project can be run.