AI TECH INSIGHT
Exec Summary: The Metis AI Processing Unit (AIPU) is an inference-optimized accelerator for the edge. We are proud to showcase up to a 5x performance boost over competitive accelerators in raw inference performance for key families of computer vision neural networks, along with state-of-the-art accuracy. As significant as 5x is, though, we believe the best measure of performance is application-level performance, which is a much better proxy for what the user will actually realize. For example, if the AIPU can infer that a cat is a cat at 900 fps, but post-processing slows the pipeline down so significantly that the user only sees 20 fps, the 900 fps is nearly useless. Thanks to our easy-to-use Voyager™ SDK, which optimizes the entire data pipeline, we also showcase that Axelera AI's application performance brings world-class speed to computer vision applications.
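The arithmetic behind that cat example is simple serial-pipeline math; here is a minimal sketch (the 48.9 ms post-processing time is an illustrative assumption chosen to match the 20 fps figure, not a measured value):

```python
def effective_fps(inference_fps: float, post_ms: float) -> float:
    """End-to-end FPS when inference and post-processing run serially per frame."""
    return 1.0 / (1.0 / inference_fps + post_ms / 1000.0)

# 900 fps of raw inference, but ~49 ms of post-processing per frame:
print(round(effective_fps(900, 48.9)))  # → 20, the fps the user actually sees
```

However fast the accelerator is, the slowest serial stage dominates what the user experiences, which is why optimizing the whole pipeline matters.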
Three years ago we started Axelera AI with a singular mission: to empower everyone with the best performance for AI inference. Since then, we have taped out three chips, built the Voyager SDK, and are fulfilling that promise.
Today we are pleased to release the latest performance benchmarks, based on the upcoming public release of our Voyager SDK, which will be available via our GitHub repo in March. All Metis data was measured on our own products, in our own labs. Competitor data is taken from their own published sources, as noted below.
Performance Results: Metis vs. Competition
When compared to other AI accelerators, Metis consistently comes out ahead on key benchmarks. The chart and table below show the frames per second (FPS) processed by Metis, compared to the throughput of other AI accelerators.
These are just some of the benchmarks we have tested, and a few of the more than 50 models available for immediate use in our Model Zoo. Software is extremely important to us at Axelera AI, and we invest significant resources in ensuring we are always improving: we continue to add optimized models and capabilities that ease development and integration within AI pipelines. Having the highest performance only matters if users can trust the accuracy of the inference being performed. We are thrilled to say that, thanks to the mixed-precision architecture of Metis and the quantization capabilities of our SDK, the achieved accuracy is state of the art.
In the following table we list the accuracy measured for various models running at full numerical precision (32-bit floating point, a.k.a. 'FP32') and compare it with the accuracy of the same models running on Metis after being quantized by the Voyager SDK. As you can see, the accuracy reduction with Metis is negligible in many cases. Our software team continues to work on optimizations and will deliver further improvements in our public releases.
Voyager SDK
Without a robust and easy-to-use software stack, AI hardware is useless. There, we said it! So, to ensure developers can get the most out of our performance-leading hardware, we built the Voyager™ SDK, which facilitates the development of high-performance applications. Developers build their computer vision pipelines in YAML, a straightforward, high-level declarative language. A Voyager pipeline may include one or more neural networks along with their associated pre- and post-processing tasks, which can include complex image processing operations. The SDK automatically compiles, optimizes, and deploys the entire pipeline: while the neural networks run on the Metis AI Processing Unit (AIPU), the SDK also generates code for the non-neural operations of the pipeline, such as image preprocessing and inference post-processing, to take advantage of the acceleration offered by the host CPU, integrated GPU, or media accelerator. Additionally, thanks to the architecture of our chip, developers can choose how to allocate the Digital In-Memory Computing (D-IMC) cores to the application: multiple models can be loaded across the cores in parallel, or cascaded; the decision is yours. This means that if you have a very compute-heavy model, you may dedicate three of the four cores to it. Likewise, if you have four models you want to run in parallel, that is also possible.
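To make the idea concrete, a declarative pipeline of this kind might look something like the following. This is an illustrative sketch only: the field names and structure below are our own invention for this article, not the Voyager SDK's actual YAML schema.

```yaml
# Hypothetical pipeline sketch; not the actual Voyager SDK schema.
pipeline:
  - preprocess:            # generated for the host CPU / integrated GPU
      resize: [640, 640]
      normalize: imagenet
  - model:                 # compiled for the Metis AIPU
      name: yolov5s
      cores: 3             # dedicate 3 of the 4 D-IMC cores to this model
  - postprocess:           # also runs on the host
      nms:
        confidence: 0.25
```

The point of a declarative description like this is that the SDK, not the developer, decides how each stage is compiled and where it runs.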
Application-level performance
Running a computer vision application is much more than just running inference. At Axelera AI we believe it's important to understand the realized performance: how long it takes to get the answer the user is looking for, the full end-to-end measurement. The Axelera AI Voyager SDK helps optimize the entire data pipeline, including the parts that run on the host CPU or embedded GPU. Why does this matter? Both the developer and the user get a better experience: the SDK handles the optimization work for the developer, and the user gets faster results.
As the table shows, the Voyager SDK delivers the raw inference performance all the way to the end-to-end application: by optimizing the execution of the non-neural operations in the computer vision pipeline, we ensure that the application can take full advantage of the unmatched capabilities of Metis.
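One way to see the gap between raw inference throughput and realized throughput is to time both yourself. The sketch below is generic benchmarking code, not Voyager SDK code; the three stage functions are placeholders standing in for your own pipeline stages:

```python
import time

def measure_fps(stage_fns, frames, warmup=10):
    """Frames per second for a serial chain of pipeline stages."""
    # Warm-up passes so caches and lazy initialization don't skew the timing
    for _ in range(warmup):
        x = frames[0]
        for fn in stage_fns:
            x = fn(x)
    start = time.perf_counter()
    for frame in frames:
        x = frame
        for fn in stage_fns:
            x = fn(x)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

# Placeholder stages; replace with your real pipeline:
preprocess  = lambda f: f   # e.g. resize + normalize on the host
infer       = lambda f: f   # e.g. the network on the accelerator
postprocess = lambda f: f   # e.g. NMS / decoding on the host

frames = list(range(100))
inference_only = measure_fps([infer], frames)
end_to_end     = measure_fps([preprocess, infer, postprocess], frames)
```

A large gap between the two numbers means the host-side stages, not the accelerator, are the bottleneck; that is exactly the gap pipeline-level optimization closes.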
The Voyager SDK is compatible with a variety of host architectures and platforms to accommodate different application environments. Additionally, the SDK allows embedding a pipeline into an inference service, providing various preconfigured solutions for use cases ranging from fully embedded applications to distributed processing of multiple 4K streams.
State-of-the-Art Digital In-Memory Computing
Why is Metis so powerful? One of the key innovations that sets Metis apart from its competition is its use of Digital In-Memory Computing (D-IMC) technology. D-IMC processes data inside the memory cells where it is stored, enabling extremely high-throughput, power-efficient matrix-vector multiplication. This approach is particularly beneficial for AI workloads, which require high-speed data access and intensive computation, and all with an average power consumption below 10 watts!
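The operation D-IMC accelerates is the matrix-vector multiplication at the heart of nearly every neural-network layer. In plain Python it is just a nest of multiply-accumulates; this reference sketch shows the computation (on a conventional processor every weight must be fetched from memory for each multiply, whereas D-IMC performs the multiply-accumulate inside the memory array holding the weights):

```python
def matvec(W, x):
    """y = W @ x: the multiply-accumulate pattern that dominates inference."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

W = [[1, 2], [3, 4]]  # weights (tiny toy example)
x = [10, 1]           # input activations
print(matvec(W, x))   # → [12, 34]
```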
What is next?
We have been shipping the Axelera AI technology stack, the Metis AIPU and SDK, to our customers for months! We have a roadmap of continuous software updates, and in March Axelera AI will publish the Voyager SDK on GitHub and welcome the community to try the full tech stack with their own Metis M.2 or PCIe card!
Want your own card to see the power of Metis?
References