This Python application takes frames from a live video stream and
performs object detection on GPUs. It uses a pre-trained Single Shot
Detector (SSD) model with an Inception V2 backbone, applies TensorRT's
optimizations, generates a runtime engine for the GPU, and then performs
inference on the video feed to obtain class labels and bounding boxes.
The application annotates the original frames with these bounding boxes
and labels, so the resulting video feed shows the object detection
network's predictions overlaid on it. The same approach can be extended
to other tasks such as classification and segmentation.
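The annotation step described above can be sketched as follows. The names here (`Detection`, `annotate`) are illustrative placeholders, not the application's actual API, and the frame is modeled as a character grid so the sketch stays dependency-free; the real application draws boxes and labels onto image arrays.

```python
from typing import List, NamedTuple, Tuple

class Detection(NamedTuple):
    """One prediction from the object detection network (hypothetical shape)."""
    label: str
    confidence: float
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels

def annotate(frame: List[List[str]], detections: List[Detection]) -> List[List[str]]:
    """Overlay each detection's bounding box onto the frame in place."""
    for det in detections:
        x0, y0, x1, y1 = det.box
        for x in range(x0, x1 + 1):   # top and bottom edges
            frame[y0][x] = "#"
            frame[y1][x] = "#"
        for y in range(y0, y1 + 1):   # left and right edges
            frame[y][x0] = "#"
            frame[y][x1] = "#"
    return frame

if __name__ == "__main__":
    frame = [["." for _ in range(10)] for _ in range(6)]
    dets = [Detection("person", 0.92, (2, 1, 7, 4))]
    for row in annotate(frame, dets):
        print("".join(row))
```

In the real pipeline this runs once per frame: the detector returns a list of predictions, and the annotated frame is written back to the output video feed.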
Note: The argument to the `-p` flag selects the precision the model
runs in. The options are FP32 (`-p 32`), FP16 (`-p 16`), and INT8
(`-p 8`).
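A minimal sketch of how a precision flag like this can be parsed and mapped to a precision mode. The argparse setup here is an assumption for illustration; the actual script's argument handling may differ.

```python
import argparse

# Map the flag's argument to a precision mode name.
PRECISION_MODES = {"32": "FP32", "16": "FP16", "8": "INT8"}

def parse_precision(argv):
    """Parse the -p flag from argv and return the chosen precision mode."""
    parser = argparse.ArgumentParser(description="Run SSD inference")
    parser.add_argument("-p", choices=PRECISION_MODES, default="32",
                        help="inference precision: 32, 16, or 8")
    args = parser.parse_args(argv)
    return PRECISION_MODES[args.p]
```

For example, `parse_precision(["-p", "8"])` returns `"INT8"`, and omitting the flag falls back to the FP32 default.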
Note: When running in INT8 precision, an extra calibration step is
performed. Like building the engine, calibration runs only the first
time you run the model; subsequent runs reuse the cached result.
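This run-once behavior is typically implemented with a calibration cache file. A dependency-free sketch, where the cache file name and the `run_calibration` callable are hypothetical stand-ins for TensorRT's INT8 calibrator:

```python
import os

CACHE_FILE = "calibration.cache"  # hypothetical cache file name

def get_calibration(run_calibration, cache_path=CACHE_FILE):
    """Run calibration once and reuse the cached table on later runs.

    run_calibration: callable performing the (slow) calibration pass,
    returning the calibration table as bytes.
    """
    if os.path.exists(cache_path):
        # Subsequent runs: skip calibration, load the cached table.
        with open(cache_path, "rb") as f:
            return f.read()
    # First run: calibrate, then persist the result for next time.
    table = run_calibration()
    with open(cache_path, "wb") as f:
        f.write(table)
    return table
```

The same pattern explains why only the first INT8 run is slow: every later run finds the cache on disk and skips the calibration pass entirely.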