Loitering Detection
| Property | Value |
|---|---|
| Category | Object Detection + Tracking + Zone Analytics (GstAnalytics) |
| Source Framework | PyTorch (Ultralytics) |
| Supported Precisions | FP32, FP16, INT8 (mixed-precision) |
| Inference Engine | OpenVINO |
| Hardware | CPU, GPU, NPU |
| Detected Class | person (COCO class 0) |
Overview
Loitering Detection is a Metro Analytics use case that flags people who remain inside a configurable region of interest for longer than a dwell-time threshold.
It is built on YOLO26 for person detection, paired with a multi-object tracker that assigns persistent IDs across frames.
DLStreamer's gvaanalytics element defines the monitoring zone and automatically attaches GstAnalyticsZoneMtd metadata to every tracked person whose center falls inside the polygon.
A Python probe reads this GstAnalytics metadata to accumulate per-person dwell time and raises a loitering event when the threshold is exceeded.
Typical Metro deployments include:
- Restricted-Area Monitoring -- raise alerts when a person lingers near tracks, equipment rooms, or after-hours zones.
- Platform Edge Safety -- detect prolonged presence inside a yellow-line buffer.
- ATM and Ticketing Security -- identify suspicious dwell at unattended kiosks.
- Crowd-Free Zone Enforcement -- monitor emergency exits and corridors that must remain clear.
Available variants: yolo26n, yolo26s, yolo26m, yolo26l, yolo26x.
Smaller variants (yolo26n, yolo26s) are recommended for high-FPS edge deployment.
Prerequisites
- Python 3.11+
- Install Intel DLStreamer
Create and activate a Python virtual environment before running the scripts:
python3 -m venv .venv --system-site-packages
source .venv/bin/activate
Note: The --system-site-packages flag is required so the virtual environment can access the system-installed OpenVINO and DLStreamer Python packages.
Getting Started
Download and Quantize Model
Run the provided script to download, export to OpenVINO IR, and optionally quantize:
chmod +x export_and_quantize.sh
./export_and_quantize.sh
This exports the default yolo26n model in FP16 precision.
Optional: Select a Different Variant or Precision
./export_and_quantize.sh yolo26n FP32 # full-precision
./export_and_quantize.sh yolo26n INT8 # quantized
./export_and_quantize.sh yolo26s # larger variant, default FP16
Replace yolo26n with any variant (yolo26s, yolo26m, yolo26l, yolo26x).
The second argument selects the precision (FP32, FP16, INT8); the default is FP16.
The script performs the following steps:
- Installs dependencies (openvino, ultralytics; adds nncf for INT8).
- Downloads the sample surveillance video (VIRAT_S_000101.mp4) from the Intel Metro AI Suite project into the current directory.
- Downloads the PyTorch weights and exports to OpenVINO IR.
- (INT8 only) Quantizes the model using NNCF post-training quantization.
Output files:
- yolo26n_openvino_model/ -- FP32 or FP16 OpenVINO IR model directory.
- yolo26n_loitering_int8.xml / yolo26n_loitering_int8.bin -- INT8 quantized model (only when INT8 is selected).
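The naming pattern above can be captured in a small helper so that a pipeline script picks the right file for a given variant and precision. This is only a sketch: `model_xml_path` is a hypothetical name, the FP32/FP16 path follows the MODEL value used in the DLStreamer sample, and extending the yolo26n file names to the other variants is an assumption that should be checked against what the script actually writes.

```python
from pathlib import Path

VARIANTS = ("yolo26n", "yolo26s", "yolo26m", "yolo26l", "yolo26x")
PRECISIONS = ("FP32", "FP16", "INT8")


def model_xml_path(variant: str = "yolo26n", precision: str = "FP16") -> Path:
    """Return the expected OpenVINO IR .xml path for a variant/precision pair.

    Assumption: other variants follow the same naming as yolo26n.
    """
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant}")
    precision = precision.upper()
    if precision not in PRECISIONS:
        raise ValueError(f"unknown precision: {precision}")
    if precision == "INT8":
        # INT8 models are written next to the script, not in the IR directory.
        return Path(f"{variant}_loitering_int8.xml")
    return Path(f"{variant}_openvino_model") / f"{variant}.xml"
```

Calling `model_xml_path()` with no arguments yields the path the sample uses for MODEL.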
Precision / Device Compatibility
| Precision | CPU | GPU | NPU |
|---|---|---|---|
| FP32 | Yes | Yes | No |
| FP16 | Yes | Yes | Yes |
| INT8 | Yes | Yes | Yes |
Note: The INT8 calibration uses frames from the bundled sample video. For production accuracy, replace it with a representative set of frames from the target deployment site.
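The compatibility table can be encoded as a lookup so that a launcher script rejects an unsupported combination before the pipeline starts. The sketch below simply transcribes the table; `SUPPORTED` and `is_supported` are illustrative names, not part of DLStreamer or OpenVINO.

```python
# Precision -> devices marked "Yes" in the compatibility table.
SUPPORTED = {
    "FP32": {"CPU", "GPU"},
    "FP16": {"CPU", "GPU", "NPU"},
    "INT8": {"CPU", "GPU", "NPU"},
}


def is_supported(precision: str, device: str) -> bool:
    """Return True if the table lists the precision/device pair as supported."""
    return device.upper() in SUPPORTED.get(precision.upper(), set())


if __name__ == "__main__":
    for precision, device in [("FP32", "NPU"), ("FP16", "NPU")]:
        status = "ok" if is_supported(precision, device) else "not supported"
        print(f"{precision} on {device}: {status}")
```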
Defining the Monitoring Zone
The zone is a polygon defined in JSON and passed to DLStreamer's
gvaanalytics element, which automatically detects when tracked objects
are inside the zone using GstAnalytics metadata -- no Python polygon math
required.
A typical surveillance-zone configuration on a 1280x720 source might be:
[
  {
    "id": "loiter_zone",
    "type": "polygon",
    "points": [
      {"x": 0, "y": 200},
      {"x": 300, "y": 200},
      {"x": 300, "y": 400},
      {"x": 0, "y": 400}
    ]
  }
]
LOITERING_SECONDS = 5.0 # dwell threshold, in seconds (demo value)
Note: The sample uses a 5-second threshold so that loitering events are triggered quickly on the short demo video. For production deployments, increase this to 10--30 seconds depending on the site's operational requirements.
The gvaanalytics element attaches GstAnalyticsZoneMtd to each detection
whose center falls inside the polygon. The Python probe checks for this
metadata to accumulate per-person dwell time.
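To sanity-check a zone definition offline, the "center inside the polygon" rule can be approximated with a standard ray-casting test. This is an illustrative stand-in and not the gvaanalytics implementation; how the element treats points exactly on an edge is not specified here, and `bbox_center` and `point_in_polygon` are hypothetical helper names.

```python
def bbox_center(x, y, w, h):
    """Center of a bounding box given its top-left corner and size."""
    return x + w / 2, y + h / 2


def point_in_polygon(px, py, points):
    """Ray-casting test: count polygon edges crossed by a ray going right from (px, py)."""
    inside = False
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]["x"], points[i]["y"]
        x2, y2 = points[(i + 1) % n]["x"], points[(i + 1) % n]["y"]
        # Only edges that straddle the horizontal line y = py can be crossed.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


ZONE_POINTS = [{"x": 0, "y": 200}, {"x": 300, "y": 200},
               {"x": 300, "y": 400}, {"x": 0, "y": 400}]

if __name__ == "__main__":
    cx, cy = bbox_center(100, 250, 100, 100)  # a box centered at (150, 300)
    print(point_in_polygon(cx, cy, ZONE_POINTS))
```

Running candidate detections from a recorded clip through such a check is a quick way to confirm that the polygon covers the intended area before deploying it.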
Note: The zone polygon supports arbitrary shapes (not just rectangles). Use draw-zones=true (the default) so that gvawatermark renders the zone boundary on the output video.
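The example zone is expressed for a 1280x720 source. If the same zone is reused on a stream with a different resolution, the points would need to be rescaled, assuming the coordinates are pixel positions in the source frame (the wording above suggests this, but it is worth confirming for your DLStreamer version). A minimal sketch, with `scale_zones` as a hypothetical helper:

```python
import json


def scale_zones(zones, src_size, dst_size):
    """Rescale zone points from src_size (w, h) to dst_size (w, h), rounding to pixels."""
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    return [
        {**zone,
         "points": [{"x": round(p["x"] * sx), "y": round(p["y"] * sy)}
                    for p in zone["points"]]}
        for zone in zones
    ]


ZONES_720P = [{
    "id": "loiter_zone",
    "type": "polygon",
    "points": [{"x": 0, "y": 200}, {"x": 300, "y": 200},
               {"x": 300, "y": 400}, {"x": 0, "y": 400}],
}]

if __name__ == "__main__":
    zones_1080p = scale_zones(ZONES_720P, (1280, 720), (1920, 1080))
    print(json.dumps(zones_1080p))
```

The resulting JSON string can be passed to the zones property in the same way as the original definition.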
DLStreamer Sample
Set up the environment:
source /opt/intel/openvino_2026/setupvars.sh
source /opt/intel/dlstreamer/scripts/setup_dls_env.sh
export PYTHONPATH=/opt/intel/dlstreamer/python:/opt/intel/dlstreamer/gstreamer/lib/python3/dist-packages:${PYTHONPATH:-}
Run loitering detection:
from collections import defaultdict
import json
import sys

import gi

gi.require_version("Gst", "1.0")
gi.require_version("GstAnalytics", "1.0")
gi.require_version("DLStreamerMeta", "1.0")
gi.require_version("DLStreamerWatermarkMeta", "1.0")
from gi.repository import Gst, GLib, GstAnalytics, DLStreamerMeta, DLStreamerWatermarkMeta

Gst.init([])

# Register DLStreamerMeta types so GstAnalytics iteration can handle them
_ov = sys.modules["gi.overrides.GstAnalytics"]
_ov.__mtd_types__[DLStreamerMeta.ZoneMtd.get_mtd_type()] = DLStreamerMeta.relation_meta_get_zone_mtd
_ov.__mtd_types__[DLStreamerMeta.TripwireMtd.get_mtd_type()] = DLStreamerMeta.relation_meta_get_tripwire_mtd

MODEL = "yolo26n_openvino_model/yolo26n.xml"
VIDEO = "VIRAT_S_000101.mp4"
ZONE_JSON = json.dumps([{
    "id": "loiter_zone",
    "type": "polygon",
    "points": [{"x": 0, "y": 200}, {"x": 300, "y": 200},
               {"x": 300, "y": 400}, {"x": 0, "y": 400}]
}])
LOITERING_SECONDS = 5.0

pipeline = Gst.parse_launch(
    f"filesrc location={VIDEO} ! decodebin3 ! videoconvert ! "
    f"gvadetect model={MODEL} device=GPU threshold=0.5 ! queue ! "
    f"gvatrack tracking-type=short-term-imageless ! queue ! "
    f"gvaanalytics name=analytics draw-zones=true ! "
    f"gvafpscounter ! identity name=probe ! gvawatermark name=watermark ! "
    f"videoconvert ! video/x-raw,format=I420 ! "
    f"openh264enc ! h264parse ! mp4mux ! filesink location=output_dlstreamer.mp4"
)
pipeline.get_by_name("analytics").set_property("zones", ZONE_JSON)
pipeline.get_by_name("watermark").set_property("displ-cfg", "hide-roi=person")

dwell = defaultdict(float)
last_seen = {}
flagged = set()


def on_buffer(pad, info):
    buf = info.get_buffer()
    now = buf.pts / Gst.SECOND if buf.pts != Gst.CLOCK_TIME_NONE else 0.0
    rmeta = GstAnalytics.buffer_get_analytics_relation_meta(buf)
    if not rmeta:
        return Gst.PadProbeReturn.OK

    # Iterate only over object-detection entries
    for od in rmeta.iter_on_type(GstAnalytics.ODMtd):
        label = GLib.quark_to_string(od.get_obj_type())
        if label != "person":
            continue

        # Find tracking ID via direct relation
        track_id = None
        for trk in od.iter_direct_related(GstAnalytics.RelTypes.RELATE_TO, GstAnalytics.TrackingMtd):
            success, tracking_id, *_ = trk.get_info()
            if success:
                track_id = tracking_id
                break
        if track_id is None:
            continue

        # Check if gvaanalytics placed this detection inside the zone
        in_zone = False
        for zone in od.iter_direct_related(GstAnalytics.RelTypes.RELATE_TO, DLStreamerMeta.ZoneMtd):
            in_zone = True
            break
        if not in_zone:
            continue

        # Accumulate dwell time for persons inside the zone
        dwell[track_id] += now - last_seen.get(track_id, now)
        last_seen[track_id] = now
        if dwell[track_id] >= LOITERING_SECONDS and track_id not in flagged:
            flagged.add(track_id)
            _, x, y, w, h, _ = od.get_location()
            print(f"LOITERING id={track_id} dwell={dwell[track_id]:.1f}s pos=({int(x + w/2)},{int(y + h)})")
    return Gst.PadProbeReturn.OK


pipeline.get_by_name("probe").get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER, on_buffer)
pipeline.set_state(Gst.State.PLAYING)
pipeline.get_bus().timed_pop_filtered(Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)
Expected output:
LOITERING id=26 dwell=5.0s pos=(147,341)
LOITERING id=27 dwell=5.0s pos=(122,337)
...
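If the events need to be consumed by another tool, the printed lines can be parsed back into structured records. The sketch below assumes the exact format printed by the sample's probe; `parse_event` and `EVENT_RE` are illustrative names.

```python
import re

# Matches lines such as: LOITERING id=26 dwell=5.0s pos=(147,341)
EVENT_RE = re.compile(
    r"LOITERING id=(\d+) dwell=([\d.]+)s pos=\((-?\d+),(-?\d+)\)"
)


def parse_event(line):
    """Return a dict for a LOITERING line, or None if the line does not match."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    return {
        "id": int(match.group(1)),
        "dwell": float(match.group(2)),
        "pos": (int(match.group(3)), int(match.group(4))),
    }


if __name__ == "__main__":
    print(parse_event("LOITERING id=26 dwell=5.0s pos=(147,341)"))
```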
The annotated video is saved to output_dlstreamer.mp4.
The gvaanalytics element also draws the zone polygon on each frame via gvawatermark.
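One detail of the sample's dwell logic is worth noting: it measures the interval since a track was last seen inside the zone, so if a person leaves the zone and later re-enters, the time spent outside is added to the dwell total. Where that is not desired, the bookkeeping can be separated from GStreamer and made to forget the last in-zone timestamp on exit. The following is a minimal sketch of that variant, not the sample's behavior; `DwellTracker` is a hypothetical class, and it keeps dwell cumulative across visits.

```python
from collections import defaultdict


class DwellTracker:
    """Accumulate per-track time inside a zone and flag each track once."""

    def __init__(self, threshold_s=5.0):
        self.threshold_s = threshold_s
        self.dwell = defaultdict(float)   # track_id -> seconds inside the zone
        self.last_seen = {}               # track_id -> timestamp of last in-zone frame
        self.flagged = set()              # track_ids already reported

    def update(self, track_id, now, in_zone):
        """Feed one observation; return True only when the threshold is first reached."""
        if not in_zone:
            # Forget the last in-zone timestamp so time outside is not counted.
            self.last_seen.pop(track_id, None)
            return False
        self.dwell[track_id] += now - self.last_seen.get(track_id, now)
        self.last_seen[track_id] = now
        if self.dwell[track_id] >= self.threshold_s and track_id not in self.flagged:
            self.flagged.add(track_id)
            return True
        return False
```

Inside the probe, `update(track_id, now, in_zone)` would be called for every tracked person, including those outside the zone, and a True return value would trigger the loitering event.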
Expected Output
Device targets:
- device=GPU -- default in the sample code.
- device=CPU -- change device=GPU to device=CPU.
- device=NPU -- change device=GPU to device=NPU; use batch-size=1 and nireq=4 for best NPU utilization.
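The device switch can be kept in one place by generating the gvadetect part of the pipeline string. The sketch below applies the settings listed above; `gvadetect_element` is a hypothetical helper, and the NPU values are the ones suggested in this document rather than tuned figures.

```python
def gvadetect_element(model, device="GPU", threshold=0.5):
    """Build the gvadetect element description for the chosen device."""
    device = device.upper()
    if device not in ("CPU", "GPU", "NPU"):
        raise ValueError(f"unsupported device: {device}")
    parts = ["gvadetect", f"model={model}", f"device={device}", f"threshold={threshold}"]
    if device == "NPU":
        # Settings recommended above for NPU utilization.
        parts += ["batch-size=1", "nireq=4"]
    return " ".join(parts)


if __name__ == "__main__":
    print(gvadetect_element("yolo26n_openvino_model/yolo26n.xml", "NPU"))
```

The returned string can replace the gvadetect segment passed to Gst.parse_launch in the sample.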
License
Licensed under the MIT License. See LICENSE for details.
