AI TECH INSIGHT

2024-07-10

How our quantization methods make the Metis AIPU highly efficient and accurate

To create a high-performing and highly energy efficient AI processing unit (AIPU) that obsoletes extensive model retraining, our engineers took a radically different approach to data processing. Through unique quantization methods and a proprietary system architecture, Axelera is able to offer the most powerful AI accelerator for the edge you can buy today. In this blog, you can read all about our unique quantization techniques.

Bram Verhoef | Head of Machine Learning & Co-Founder at AXELERA AI

Martino Dazzi | Algorithm and Quantization Researcher & Co-Founder at AXELERA AI

How our quantization methods make the Metis AIPU highly efficient and accurate

To create a high-performing and highly energy efficient AI processing unit (AIPU) that obsoletes extensive model retraining, our engineers took a radically different approach to data processing. Through unique quantization methods and a proprietary system architecture, Axelera is able to offer the most powerful AI accelerator for the edge you can buy today. In this blog, you can read all about our unique quantization techniques.

AI TECH INSIGHT

2024-07-10

How our quantization methods make the Metis AIPU highly efficient and accurate

To create a high-performing and highly energy efficient AI processing unit (AIPU) that obsoletes extensive model retraining, our engineers took a radically different approach to data processing. Through unique quantization methods and a proprietary system architecture, Axelera is able to offer the most powerful AI accelerator for the edge you can buy today. In this blog, you can read all about our unique quantization techniques.

Bram Verhoef | Head of Machine Learning & Co-Founder at AXELERA AI

Martino Dazzi | Algorithm and Quantization Researcher & Co-Founder at AXELERA AI

Industry-leading performance and usability

Our Metis acceleration hardware leads the industry, because of our unique combination of advanced technologies. This is how our sophisticated quantization flow methodology enables Metis’ high performance and efficiency.

  1. Metis is very user-friendly, not in the least because of the quantization techniques that are applied. Axelera AI uses Post-Training-Quantization (PTQ) techniques. These quantization techniques do not require the user to perform any retraining of the model, which would be time-, compute- and cost-intensive. Instead, PTQ can be performed quickly, automatically, and with very little data.

  2. Metis is also fast, energy-efficient and cost-effective. This is the result of innovative hardware design, like digital in-memory-computation and RISC-V, but also from the efficiency of the algorithms running on it. Our efficient digital in-memory-computation works hand in hand with quantization of the AI algorithms. The quantization process casts the numerical format of the AI algorithm elements into a more efficient format, compatible with Metis. For this, Axelera AI has developed an accurate, fast and easy-to-use quantization technique.

ModelDeviation from FP32 accuracy
ResNet-34 -0.1%
ResNet-50v1.5-0.1%
SSD-MobileNetV1  -0.3%
YoloV5s-ReLu-0.9%

Accuracy drop @ INT8

Highly accurate quantization technique

In combination with the mixed–precision arithmetic of the Axelera Metis AIPU, our AI accelerators can deliver an accuracy practically indistinguishable from a reference 32-bit floating point model. As an example, Metis AIPU can run the ResNet50v1.5 neural network processing, at a full processing speed of 3,200 frames per second, with a relative accuracy of 99.9%.

Technical details of our post-training quantization method

To reach high performance, AI accelerators often deploy 8-bit integer processing of the most compute-intensive parts of neural network calculations instead of using 32-bit floating-point arithmetic. To do so, a quantization of the data from 32-bit to 8-bit needs to be done.

The Post-Training Quantization (PTQ) technique begins with the user providing around hundred images. These images are processed through the full-precision model while detailed statistics are collected. Once this process is complete, the gathered statistics are used to compute quantization parameters, which are then applied to quantize the weights and activations to INT8 and other precisions in both hardware and software.

Additionally, the quantization technique modifies the compute graph to enhance quantization accuracy. This may involve operator folding and fusion, as well as reordering graph nodes.

Axelera AI's D-IMC Chip

Our radically different approach to data processing

From the outset, we designed our quantization method with two primary goals in mind. The first goal is achieving high efficiency, the second is high accuracy. Our quantized models typically maintain accuracy comparable to full-precision models.
To ensure this high accuracy, we begin with a comprehensive understanding of our hardware, as the quantization techniques employed depend on the specific hardware in use. Additionally, we utilize various statistical and graph optimization techniques, many of which were developed in-house.

Compatible with Various Neural Networks

By employing a generic quantization flow methodology, our systems can be applied to a wide variety of neural networks while minimizing accuracy loss.

Our quantization scheme and hardware allow developers to efficiently deploy an extremely wide variety of operators. This means that Axelera AI's hardware and quantization methods can support many different types of neural network architectures and applications.

Evaluate industry defining AI inference technology today. 1/3

This field is required!
This field is required!
This field is required!
This is not correct
This is not correct.
This is not correct

Your contact details2/3.

This field is required!
This field is required!
This field is required!
  • United States
  • Canada
  • Afghanistan
  • Albania
  • Algeria
  • American Samoa
  • Andorra
  • Angola
  • Anguilla
  • Antarctica
  • Antigua and Barbuda
  • Argentina
  • Armenia
  • Aruba
  • Australia
  • Austria
  • Azerbaijan
  • Bahamas
  • Bahrain
  • Bangladesh
  • Barbados
  • Belarus
  • Belgium
  • Belize
  • Benin
  • Bermuda
  • Bhutan
  • Bolivia
  • Bosnia and Herzegovina
  • Botswana
  • Brazil
  • British Indian Ocean Territory
  • British Virgin Islands
  • Brunei
  • Bulgaria
  • Burkina Faso
  • Burundi
  • Cambodia
  • Cameroon
  • Cape Verde
  • Cayman Islands
  • Central African Republic
  • Chad
  • Chile
  • China
  • Christmas Island
  • Cocos (Keeling) Islands
  • Colombia
  • Comoros
  • Congo
  • Cook Islands
  • Costa Rica
  • Croatia
  • Cuba
  • Curaçao
  • Cyprus
  • Czech Republic
  • Côte d’Ivoire
  • Democratic Republic of the Congo
  • Denmark
  • Djibouti
  • Dominica
  • Dominican Republic
  • Ecuador
  • Egypt
  • El Salvador
  • Equatorial Guinea
  • Eritrea
  • Estonia
  • Ethiopia
  • Falkland Islands
  • Faroe Islands
  • Fiji
  • Finland
  • France
  • French Guiana
  • French Polynesia
  • French Southern Territories
  • Gabon
  • Gambia
  • Georgia
  • Germany
  • Ghana
  • Gibraltar
  • Greece
  • Greenland
  • Grenada
  • Guadeloupe
  • Guam
  • Guatemala
  • Guernsey
  • Guinea
  • Guinea-Bissau
  • Guyana
  • Haiti
  • Honduras
  • Hong Kong S.A.R., China
  • Hungary
  • Iceland
  • India
  • Indonesia
  • Iran
  • Iraq
  • Ireland
  • Isle of Man
  • Israel
  • Italy
  • Jamaica
  • Japan
  • Jersey
  • Jordan
  • Kazakhstan
  • Kenya
  • Kiribati
  • Kuwait
  • Kyrgyzstan
  • Laos
  • Latvia
  • Lebanon
  • Lesotho
  • Liberia
  • Libya
  • Liechtenstein
  • Lithuania
  • Luxembourg
  • Macao S.A.R., China
  • Macedonia
  • Madagascar
  • Malawi
  • Malaysia
  • Maldives
  • Mali
  • Malta
  • Marshall Islands
  • Martinique
  • Mauritania
  • Mauritius
  • Mayotte
  • Mexico
  • Micronesia
  • Moldova
  • Monaco
  • Mongolia
  • Montenegro
  • Montserrat
  • Morocco
  • Mozambique
  • Myanmar
  • Namibia
  • Nauru
  • Nepal
  • Netherlands
  • New Caledonia
  • New Zealand
  • Nicaragua
  • Niger
  • Nigeria
  • Niue
  • Norfolk Island
  • North Korea
  • Northern Mariana Islands
  • Norway
  • Oman
  • Pakistan
  • Palau
  • Palestinian Territory
  • Panama
  • Papua New Guinea
  • Paraguay
  • Peru
  • Philippines
  • Pitcairn
  • Poland
  • Portugal
  • Puerto Rico
  • Qatar
  • Romania
  • Russia
  • Rwanda
  • Réunion
  • Saint Barthélemy
  • Saint Helena
  • Saint Kitts and Nevis
  • Saint Lucia
  • Saint Pierre and Miquelon
  • Saint Vincent and the Grenadines
  • Samoa
  • San Marino
  • Sao Tome and Principe
  • Saudi Arabia
  • Senegal
  • Serbia
  • Seychelles
  • Sierra Leone
  • Singapore
  • Slovakia
  • Slovenia
  • Solomon Islands
  • Somalia
  • South Africa
  • South Korea
  • South Sudan
  • Spain
  • Sri Lanka
  • Sudan
  • Suriname
  • Svalbard and Jan Mayen
  • Swaziland
  • Sweden
  • Switzerland
  • Syria
  • Taiwan
  • Tajikistan
  • Tanzania
  • Thailand
  • Timor-Leste
  • Togo
  • Tokelau
  • Tonga
  • Trinidad and Tobago
  • Tunisia
  • Turkey
  • Turkmenistan
  • Turks and Caicos Islands
  • Tuvalu
  • U.S. Virgin Islands
  • Uganda
  • Ukraine
  • United Arab Emirates
  • United Kingdom
  • United States Minor Outlying Islands
  • Uruguay
  • Uzbekistan
  • Vanuatu
  • Vatican
  • Venezuela
  • Viet Nam
  • Wallis and Futuna
  • Western Sahara
  • Yemen
  • Zambia
  • Zimbabwe
This is not correct.
This field is required!
This field is required!

Your project info3/3.

This field is required!
This field is required!
This is not correct

Thank you for your ordering your Axelera Metis Evaluation Kit!


We've received your order, and a confirmation email has been sent to the provided email address. Our team is excited to review your order.

After evaluating your input, we will be in touch within the next 2 business days to discuss the next steps and how your order can benefit your innovative projects.
Stay tuned for more details coming your way soon!

Continuously innovating our quantization methods

Axelera AI is currently developing very accurate quantization techniques for the most recent AI algorithms. We are constantly improving the algorithms to further improve accuracy. This is especially important as more recent algorithms, like large language models, require special handling when it comes to quantization. This means our future products will use enhanced quantization methods.