AI-Powered Visual Intelligence

Axelera AI explores edge-native crowd and face analytics in real time, powered by the Metis® 4-chip PCIe AI Accelerator Card and Voyager® SDK for on-device inference.

What this document covers

This solution brief describes how Axelera AI used its Metis 4-chip PCIe AI Accelerator Card and Voyager SDK to demonstrate real-time crowd analytics at the edge, processing a full 8K video stream with multiple simultaneous AI models, without sending any data to the cloud.

The problem

Large venues, stadiums, festivals, and public events generate enormous volumes of visual data. Security teams and operators need to monitor crowd density, track movement, and respond in real time, but the tools to do this at scale have historically required costly GPU hardware, cloud connectivity, and high bandwidth, all of which introduce latency, privacy risk, and operational complexity that most organizations cannot absorb.

Key challenges addressed

Processing 8K video in real time at the edge, without cloud dependency
Maintaining detection accuracy for individuals at distance across wide fields of view
Running multiple AI models simultaneously on a single accelerator card
Keeping biometric data on-device to satisfy privacy and data sovereignty requirements
Associating a person's body position in a live feed with their face in a stable operator view

The solution: Metis + Voyager SDK

Hardware: Metis PCIe card with four AIPUs, delivering up to 856 TOPS in total (214 TOPS per chip) at a typical power draw of 30-58 W.

Software: Voyager SDK orchestrates multiple models in parallel and enables configurable tiling, where smaller tiles cover distant areas for fine-detail inference and larger tiles handle close-up regions where detail is already sufficient.
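
The tiling idea above can be illustrated with a minimal sketch. This is not the Voyager SDK API; it only shows one way to lay out smaller tiles over the distant part of a frame and larger tiles over the close-up part. The 7680x4320 frame size, the horizon row, the tile sizes, the overlaps, and every name (Tile, starts, band_tiles, perspective_tiles) are assumptions made for illustration, and the sketch assumes a camera mounted so that distant subjects appear in the upper part of the image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    x: int     # left edge in frame pixels
    y: int     # top edge in frame pixels
    size: int  # side length of the square tile


def starts(lo: int, hi: int, size: int, overlap: int) -> list[int]:
    """Start offsets of tiles that cover [lo, hi) with the given overlap."""
    if size > hi - lo or overlap >= size:
        raise ValueError("tile must fit the span and exceed the overlap")
    last = hi - size  # the final tile is clamped so it ends exactly at hi
    out = list(range(lo, last, size - overlap))
    out.append(last)
    return out


def band_tiles(frame_w: int, y0: int, y1: int, size: int, overlap: int) -> list[Tile]:
    """Cover the horizontal band of rows [y0, y1) with square tiles."""
    return [
        Tile(x, y, size)
        for y in starts(y0, y1, size, overlap)
        for x in starts(0, frame_w, size, overlap)
    ]


def perspective_tiles(frame_w: int = 7680, frame_h: int = 4320,
                      horizon: int = 1440) -> list[Tile]:
    """Small tiles above the (hypothetical) horizon row, large tiles below it."""
    # Distant subjects are small in the image, so small tiles preserve detail
    # when each tile is resized to the detector's input resolution.
    far = band_tiles(frame_w, 0, horizon, size=640, overlap=64)
    # Close-up subjects are already large, so fewer, larger tiles suffice.
    near = band_tiles(frame_w, horizon, frame_h, size=1920, overlap=192)
    return far + near


if __name__ == "__main__":
    tiles = perspective_tiles()
    print(len(tiles), "tiles per frame")
```

With these illustrative values the layout yields 52 tiles per frame (42 small, 10 large), so at 30 FPS a detector applied to every tile would need to sustain roughly 1,560 tile inferences per second. Real deployments would tune the band boundaries and tile sizes to the camera geometry.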

Models running simultaneously in this demonstration:

Ultralytics YOLO11 for object detection, segmentation, and people detection
Ultralytics YOLOv8 for keypoint/pose detection
RetinaFace for high-precision face detection
FaceNet for face recognition

Host: High-performance edge workstation (Intel Core i9-12900) managing 8K stream ingestion.

Results

Full 8K video processed at 30 FPS on a single edge device
Person-to-face association maintained across a live scene, with a companion operator grid displaying tracked faces in fixed positions while the live feed shows subjects in motion
All inference runs fully on-device; no raw video leaves the local network
Deployment requires no cloud infrastructure, no high-bandwidth uplink, and no off-premises data transfer

Why it matters

This demonstration shows that high-resolution crowd analytics combining person detection, pose estimation, face recognition, object detection, and object tracking across 8K video can run at the edge today, at a power and cost profile that makes it practical outside of enterprise-scale deployments. The Metis AIPU provides the parallel processing headroom that multi-model, high-resolution inference demands. The Voyager SDK provides the tiling strategy, model compatibility, and pipeline control that make the system configurable for real deployment scenarios.

Real-Time Crowd Analytics at the Edge
