Real-Time Crowd Analytics at the Edge
AI-Powered Visual Intelligence
Axelera AI explores edge-native crowd and face analytics in real time, powered by the Metis® 4-chip PCIe AI Accelerator Card and Voyager® SDK for on-device inference.
What this document covers
This solution brief describes how Axelera AI used its Metis 4-chip PCIe AI Accelerator Card and Voyager SDK to demonstrate real-time crowd analytics at the edge, processing a full 8K video stream with multiple simultaneous AI models, without sending any data to the cloud.
The problem
Large venues, stadiums, festivals, and public events generate enormous volumes of visual data. Security teams and operators need to monitor crowd density, track movement, and respond in real time. The tools to do this at scale have historically required costly GPU hardware, cloud connectivity, and high bandwidth, all of which introduce latency, privacy risk, and operational complexity that most organizations cannot absorb.
Key challenges addressed
- Processing 8K video in real time at the edge, without cloud dependency
- Maintaining detection accuracy for individuals at distance across wide fields of view
- Running multiple AI models simultaneously on a single accelerator card
- Keeping biometric data on-device to satisfy privacy and data sovereignty requirements
- Associating a person's body position in a live feed with their face in a stable operator view
The solution: Metis + Voyager SDK
Hardware: Metis PCIe card with 4 AIPUs, delivering up to 856 TOPS in total (214 TOPS per chip), with a typical power draw of 30-58 W.
Software: Voyager SDK orchestrates multiple models in parallel and enables configurable tiling, where smaller tiles cover distant areas for fine-detail inference and larger tiles handle close-up regions where detail is already sufficient.
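The tiling idea can be illustrated with a minimal sketch. This is not the Voyager SDK API: the `Tile` class, `make_tiles` function, band boundaries, tile sizes, and 10% overlap are all hypothetical values chosen for illustration. It also assumes an 8K UHD frame of 7680×4320 pixels and a camera angle where distant subjects appear toward the top of the frame.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    x: int
    y: int
    w: int
    h: int


def axis_positions(start, end, size, step):
    """Tile origins covering [start, end); the last tile is clamped to the edge."""
    if end - start <= size:
        return [start]
    positions = list(range(start, end - size, step))
    positions.append(end - size)
    return positions


def make_tiles(frame_w, bands, overlap=0.1):
    """Build tiles per horizontal band; each band is (y_start, y_end, tile_size)."""
    tiles = []
    for y0, y1, size in bands:
        # Overlapping neighbours so a subject on a tile edge is seen whole once.
        step = max(1, int(size * (1 - overlap)))
        w = min(size, frame_w)
        h = min(size, y1 - y0)
        for y in axis_positions(y0, y1, h, step):
            for x in axis_positions(0, frame_w, w, step):
                tiles.append(Tile(x, y, w, h))
    return tiles


# Hypothetical layout: small tiles for the far (top) band, large for the near band.
BANDS = [(0, 1440, 640), (1440, 2880, 1280), (2880, 4320, 1440)]
tiles = make_tiles(7680, BANDS)
```

Each tile would then be cropped and passed to a detector at its native input size, so distant people occupy more pixels per inference than they would in a downscaled full frame. In practice the band layout would be tuned to the camera placement.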
Models running simultaneously in this demonstration:
- Ultralytics YOLO11 for object detection, segmentation, and people detection
- Ultralytics YOLOv8 for keypoint/pose detection
- RetinaFace for high-precision face detection
- FaceNet for face recognition
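As a rough illustration of how four models might share one frame, the sketch below fans a frame out to independent stages and then runs recognition on the detected faces. It does not use the Voyager SDK: every function here is a hypothetical stand-in that returns canned data, and the two-stage arrangement (face recognition consuming face detection output) is an assumption about the pipeline, although it is the usual way these two model types are combined.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for the four models; each returns canned results.
def detect_objects(frame):  # role played by YOLO11
    return [{"label": "person", "box": (100, 50, 220, 400)}]


def estimate_poses(frame):  # role played by YOLOv8 pose
    return [{"keypoints": [(160, 80), (150, 140)]}]


def detect_faces(frame):  # role played by RetinaFace
    return [{"box": (140, 60, 180, 110)}]


def embed_face(frame, face_box):  # role played by FaceNet
    return [0.0, 0.0, 0.0, 0.0]  # placeholder embedding


def analyze_frame(frame, pool):
    # Stage 1: the three detectors do not depend on each other,
    # so they are submitted together and can overlap.
    stage1 = {
        "objects": pool.submit(detect_objects, frame),
        "poses": pool.submit(estimate_poses, frame),
        "faces": pool.submit(detect_faces, frame),
    }
    results = {name: future.result() for name, future in stage1.items()}
    # Stage 2: recognition needs face boxes, so it waits for stage 1.
    results["embeddings"] = list(
        pool.map(lambda face: embed_face(frame, face["box"]), results["faces"])
    )
    return results


with ThreadPoolExecutor(max_workers=4) as pool:
    out = analyze_frame(frame=None, pool=pool)
```

On real hardware the scheduling would be handled by the accelerator runtime and not by host threads; the sketch only shows the dependency structure between the models.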
Host: High-performance edge workstation (Intel Core i9-12900) managing 8K stream ingestion.
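To give a sense of the ingestion load on the host, the arithmetic below estimates raw pixel throughput. It assumes that "8K" means 8K UHD (7680×4320) and that frames are held as uncompressed 8-bit RGB; neither detail is stated in this brief. The 30 FPS figure is the one reported in the results.

```python
WIDTH, HEIGHT = 7680, 4320  # assumed 8K UHD resolution
FPS = 30                    # frame rate reported in the results
BYTES_PER_PIXEL = 3         # assumed uncompressed 8-bit RGB

pixels_per_frame = WIDTH * HEIGHT
pixels_per_second = pixels_per_frame * FPS
raw_bytes_per_second = pixels_per_second * BYTES_PER_PIXEL

print(pixels_per_frame)            # 33177600
print(pixels_per_second)           # 995328000
print(raw_bytes_per_second / 1e9)  # 2.985984 (GB/s, decimal)
```

Under these assumptions, one 8K frame carries four times the pixels of a 4K UHD frame, and the decoded stream approaches 3 GB/s before any model runs.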
Results
- Full 8K video processed at 30 FPS on a single edge device
- Person-to-face association maintained across a live scene, with a companion operator grid displaying tracked faces in fixed positions while the live feed shows subjects in motion
- All inference runs fully on-device; no raw video leaves the local network
- Deployment requires no cloud infrastructure, no high-bandwidth uplink, and no off-premises data transfer
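The person-to-face association and fixed-position operator grid could be approximated as follows. This is a minimal sketch and not the method used in the demonstration: it assumes boxes in `(x1, y1, x2, y2)` form, matches a face to the smallest person box containing the face center, and assumes an upstream tracker supplies stable track IDs. The names `associate_faces` and `FaceGrid` are hypothetical.

```python
def center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def area(box):
    x1, y1, x2, y2 = box
    return (x2 - x1) * (y2 - y1)


def contains(box, point):
    x1, y1, x2, y2 = box
    px, py = point
    return x1 <= px <= x2 and y1 <= py <= y2


def associate_faces(person_boxes, face_boxes):
    """Map face index -> person index; the smallest containing person box wins."""
    pairs = {}
    for fi, face in enumerate(face_boxes):
        c = center(face)
        candidates = [pi for pi, p in enumerate(person_boxes) if contains(p, c)]
        if candidates:
            pairs[fi] = min(candidates, key=lambda pi: area(person_boxes[pi]))
    return pairs


class FaceGrid:
    """Keeps each track ID in the same grid cell for as long as it is tracked."""

    def __init__(self, rows, cols):
        self.cols = cols
        self.free = list(range(rows * cols))
        self.slot_of = {}

    def update(self, active_track_ids):
        # Release cells whose tracks have left the scene.
        for tid in list(self.slot_of):
            if tid not in active_track_ids:
                self.free.append(self.slot_of.pop(tid))
        self.free.sort()
        # Give new tracks the lowest free cell; existing tracks never move.
        for tid in active_track_ids:
            if tid not in self.slot_of and self.free:
                self.slot_of[tid] = self.free.pop(0)
        return {tid: divmod(slot, self.cols) for tid, slot in self.slot_of.items()}


persons = [(0, 0, 100, 300), (150, 0, 250, 300)]
faces = [(170, 10, 210, 60), (30, 20, 70, 70), (400, 400, 420, 420)]
pairs = associate_faces(persons, faces)

grid = FaceGrid(rows=2, cols=3)
first = grid.update([7, 9])
second = grid.update([9, 12])
```

In the usage above, track 9 keeps its cell between updates while track 12 takes the cell freed by track 7, which is the behavior an operator view needs when subjects move in the live feed.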
Why it matters
This demonstration shows that high-resolution crowd analytics combining person detection, pose estimation, face recognition, object detection, and object tracking across 8K video can run at the edge today, at a power and cost profile that makes it practical outside of enterprise-scale deployments. The Metis AIPU provides the parallel processing headroom that multi-model, high-resolution inference demands. The Voyager SDK provides the tiling strategy, model compatibility, and pipeline control that make the system configurable for real deployment scenarios.



