Tutorial 15 — TensorRT Export: Optimise Any PyTorch Model for Jetson

A vanilla PyTorch model runs at 8–12 FPS on Jetson. The same model exported to TensorRT FP16 runs at 50+ FPS. This tutorial shows you exactly how to convert any PyTorch model to a TensorRT engine — not just YOLO models, but any custom network you have trained.

What you will learn

  • How TensorRT optimisation works (layer fusion, precision calibration)
  • How to export YOLO models to TensorRT in one line
  • How to export any custom PyTorch model via ONNX → TensorRT
  • FP32 vs FP16 vs INT8 — when to use each precision mode
  • How to validate accuracy after export
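The precision trade-off in the list above can be made concrete without any TensorRT at all. This is a minimal NumPy sketch (illustrative values, not from a real network): FP16 is a straight cast, while INT8 needs a per-tensor scale — estimating that scale from representative data is exactly what TensorRT's INT8 calibration step does.

```python
import numpy as np

# Hypothetical FP32 activation values (illustration only)
acts = np.array([0.1, 0.5, 1.7, 3.14159, -2.71828], dtype=np.float32)

# FP16: a straight cast keeps roughly 3 significant decimal digits
fp16_err = np.abs(acts - acts.astype(np.float16).astype(np.float32)).max()

# INT8: symmetric linear quantisation — the per-tensor scale below is
# what TensorRT's calibration estimates from real calibration images
scale = np.abs(acts).max() / 127.0
q = np.clip(np.round(acts / scale), -127, 127).astype(np.int8)
int8_err = np.abs(acts - q.astype(np.float32) * scale).max()

print(f"FP16 max error: {fp16_err:.6f}")  # tiny
print(f"INT8 max error: {int8_err:.6f}")  # bounded by scale/2
```

The pattern holds in general: FP16 error is relative to each value, INT8 error is bounded by half the quantisation step — which is why INT8 needs calibration data while FP16 usually works out of the box.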

Step 1 — Export a YOLO model (easiest)

from ultralytics import YOLO

model = YOLO("yolov8s.pt")   # or yolo11s.pt, or your custom best.pt
model.export(
    format = "engine",
    device = 0,
    half   = True,    # FP16 — best speed/accuracy balance
    imgsz  = 640
)
# Output: yolov8s.engine — ready to use on Jetson

Step 2 — Export any custom PyTorch model via ONNX

import torch

# Step A: Export PyTorch → ONNX
model = MyCustomModel()          # your own nn.Module subclass
model.load_state_dict(torch.load("my_model.pth", map_location="cpu"))
model.eval()                     # freeze dropout/batch-norm before export

dummy  = torch.randn(1, 3, 640, 640)
torch.onnx.export(model, dummy, "my_model.onnx",
                  opset_version=11, input_names=["input"],
                  output_names=["output"])
# Step B: Convert ONNX → TensorRT engine (run in a terminal on the Jetson)
trtexec --onnx=my_model.onnx \
        --saveEngine=my_model.engine \
        --fp16 \
        --workspace=2048
# Note: on recent TensorRT releases --workspace is deprecated in
# favour of --memPoolSize=workspace:2048
Step 3 — Run inference with the TensorRT engine

import tensorrt as trt        # TensorRT runtime (ships with JetPack)
import pycuda.driver as cuda  # host↔device memory transfers
import numpy as np

# Helper script included on your kit — wraps engine deserialisation,
# buffer allocation and execution
from trt_infer import TRTInferencer

engine = TRTInferencer("my_model.engine")
output = engine.infer(input_image)  # input_image: preprocessed NumPy array
print(f"Inference result shape: {output.shape}")
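Before trusting the engine in production, validate its accuracy as promised in the list at the top. A minimal sketch: run the same input through the original FP32 PyTorch model and through the engine, then compare. The FP16 round-trip below stands in for a real engine output, and the `outputs_match` helper and its tolerances are assumptions, not part of the kit:

```python
import numpy as np

def outputs_match(ref, test, rtol=1e-2, atol=1e-3):
    """Compare an FP32 reference output against a TensorRT output.
    FP16 engines typically agree to ~2-3 significant digits."""
    ref = np.asarray(ref, dtype=np.float32)
    test = np.asarray(test, dtype=np.float32)
    max_abs = np.abs(ref - test).max()
    cos = float(ref.ravel() @ test.ravel() /
                (np.linalg.norm(ref) * np.linalg.norm(test) + 1e-12))
    print(f"max abs diff: {max_abs:.5f}, cosine similarity: {cos:.5f}")
    return np.allclose(ref, test, rtol=rtol, atol=atol)

# Simulated check: an FP16 round-trip stands in for the engine output
ref = np.random.default_rng(0).standard_normal((1, 84, 8400)).astype(np.float32)
trt_out = ref.astype(np.float16).astype(np.float32)
print(outputs_match(ref, trt_out))
```

If the check fails after an FP16 export, the usual suspects are layers with large dynamic range; re-exporting those layers in FP32, or loosening the tolerance after inspecting the cosine similarity, tells you whether the drift actually matters for your task.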

Next: Tutorial 16 — DeepStream Multi-Camera | Back to Jetson Kit
