I met Marian Verhelst in the summer of 2019, and she immediately struck me with her passion for and competence in computing architecture design. We started collaborating right away, and today she is here with us to share her insights on the future of computing.
Fabrizio: Can you please introduce yourself, your experience and your field of study?
Marian: My name is Marian Verhelst, and I am a professor at the MICAS lab of KU Leuven[i].
I studied electrical engineering and received my PhD in microelectronics in 2008. After completing my studies, I joined Intel Labs in Portland, Oregon, USA, where I worked as a research scientist. I then became a professor at KU Leuven in 2012, focusing on efficient processing architectures for embedded sensor processing and machine learning. My lab regularly tapes out processor chips using innovative and advanced technologies. I am also active in international initiatives, helping to organise IC conferences such as ISSCC, DATE, ESSCIRC and AICAS, and I serve as the Director of the tinyML Foundation.
Most recently, I was honoured to receive the André Mischke YAE Prize[ii] for Science and Policy, and I have been shortlisted for the 2021 Belgium Inspiring Fifty list[iii].
F: What is the focus of your most recent research?
M: My research currently focuses on three areas. First, I am looking at implementing an efficient processor chip for embedded DNN workloads. Our latest tape-out, the Diana chip, combines a digital AI accelerator with an analogue compute-in-memory AI accelerator in a common RISC-V-based processing system. This allows the host processor to offload neural network layers to the most suitable accelerator core, depending on parallelisation opportunities and precision needs. We plan to present this chip at ISSCC 2022[iv].
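To make the offloading idea concrete, here is a minimal, purely hypothetical sketch (not the Diana toolchain) of a host-side heuristic that assigns each layer to an accelerator core based on assumed precision and parallelism thresholds; all names and numbers are illustrative assumptions.

```python
# Hypothetical sketch: pick an accelerator core per neural-network layer.
# Assumption: the analogue compute-in-memory array favours very low-precision,
# massively parallel layers; the digital core handles everything else.

from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    weight_bits: int   # precision required by the quantised weights
    macs: int          # multiply-accumulate count, as a proxy for parallelism


def choose_accelerator(layer: Layer) -> str:
    """Return the core the host would offload this layer to (illustrative rule)."""
    if layer.weight_bits <= 2 and layer.macs > 1_000_000:
        return "analogue compute-in-memory accelerator"
    return "digital accelerator"


for layer in [Layer("conv1", 8, 2_000_000), Layer("conv2", 2, 5_000_000)]:
    print(layer.name, "->", choose_accelerator(layer))
```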
The second research area is improving the efficiency of designing and programming such processors. We developed a new framework called ZigZag[v], which enables rapid design space exploration of processor architectures and algorithm-to-processor mapping schedules for a suite of ML workloads.
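The sketch below shows only the general idea of design space exploration in the spirit of such frameworks, not ZigZag's actual interface: enumerate architecture and mapping combinations and keep the one with the lowest estimated cost. The architectures, mappings and cost model are made-up assumptions.

```python
# Illustrative brute-force design-space exploration with a toy cost model.
# Real frameworks such as ZigZag use detailed memory-hierarchy and mapping
# models; this only demonstrates the search over architecture x mapping pairs.

from itertools import product

architectures = {"8x8_array": 64, "16x16_array": 256}           # PE counts (assumed)
mappings = {"weight_stationary": 1.0, "output_stationary": 0.8}  # relative energy/MAC (assumed)


def estimated_cost(pes: int, energy_per_mac: float, workload_macs: int) -> float:
    # Toy model: energy dominates, with a small penalty for larger, harder-to-fill arrays.
    utilisation_penalty = 1.0 + (pes / 1024)
    return workload_macs * energy_per_mac * utilisation_penalty


workload_macs = 10_000_000
best = min(
    ((arch, mapping, estimated_cost(pes, energy, workload_macs))
     for (arch, pes), (mapping, energy) in product(architectures.items(), mappings.items())),
    key=lambda candidate: candidate[2],
)
print("best (architecture, mapping, estimated cost):", best)
```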
My last research area is exploring processor architectures for beyond-NN workloads. Neural networks on their own cannot sufficiently perform complex reasoning, planning or perception tasks; they must be complemented with probabilistic and logic-based reasoning models. However, such models do not map well onto CPUs, GPUs or NPUs. In my lab, we are starting to develop processors and compilers for these emerging ML workloads.
F: There are different approaches and trends in new computing designs for artificial intelligence workloads: increasing the number of computing cores from a few to tens, thousands or even hundreds of thousands of small, efficient cores, as well as near-memory processing, computing-in-memory, or in-memory computing. What is your opinion about these architectures? What do you think is the most promising approach? Are there any other promising architecture developments?
M: Having seen the substantial divergence in ML algorithmic workloads and the general trends in the processor architecture field, I am a firm believer in very heterogeneous multi-core solutions. This means that future processing systems will have a large number of cores with very different natures. Eventually, such cores will include (digital) in- or near-memory processing cores, coarse-grained reconfigurable systolic arrays and more traditional flexible SIMD cores. Of course, the challenge is to build compilers and mappers that can grasp all opportunities from such heterogeneous and widely parallel fabrics. To ensure excellent efficiency and memory capabilities, it will be especially important to exploit the cores in a streaming fashion, where one core immediately consumes the data produced by another.
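As a rough illustration of that streaming pattern, the sketch below uses Python generators to stand in for hardware FIFOs between cores, so each "core" consumes tiles as soon as the previous one produces them instead of staging full tensors in shared memory; the core names and operations are purely illustrative assumptions.

```python
# Minimal sketch of core-to-core streaming: downstream work starts as soon as
# each tile is produced, rather than after the full tensor is written to memory.

def conv_core(tiles):
    for t in tiles:
        yield t * 2            # stand-in for a convolution on one tile


def activation_core(tiles):
    for t in tiles:
        yield max(t, 0)        # stand-in for a ReLU on one tile


input_tiles = range(-3, 4)
pipeline = activation_core(conv_core(input_tiles))  # tiles stream core-to-core
print(list(pipeline))
```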
F: Computing design researchers are working on low-power and ultra-low-power designs, using metrics such as TOPS/W as a key performance indicator and low-precision networks trained mainly on small datasets. However, we also see neural network research increasingly focusing on large networks, particularly transformer networks, which are gaining traction in field deployment and seem to deliver very promising results. How can we reconcile these trends? How far are we from running these networks at the edge? What kind of architecture do you think can make this happen?
M: There will always be people working to improve energy efficiency for the edge and people pushing for throughput across the stack. The latter typically starts in the data centre but gradually trickles down to the edge, where improved technology and architectures enable better performance. It is never a story of choosing one option over another.
Over the past years, developers have introduced increasingly distributed solutions, dividing the workload between the edge and the data centre. The vital aspect of these solutions is that they need scalable processor architectures: architectures that can be deployed with a small core count at the extreme edge and scaled up to larger core counts for the edge and a massive core count for the data centre. This will require processing architectures and memory systems that rely on a mesh-type distributed processor fabric, rather than being centrally controlled by a single host.
F: How do you see the future of computing architecture for the data centre? Will it be dominated by standard computing, GPU, heterogeneous computing, or something else?
M: As I noted earlier, I believe we will see an increasing amount of heterogeneity in the field. The data centre will host a wide variety of processors and differently-natured accelerator arrays to cover widely different workloads in the most efficient manner possible. For a hardware architect, the exciting and still-open challenge is which library of (configurable) processing tiles can cover all workloads of interest. Most intriguing is that, due to the slow nature of hardware development, this processor library should cover not only the algorithms we know of today but also those that researchers will develop in the years to come.
As Scientific Advisor, Marian Verhelst advises the Axelera AI Team on the scientific aspects of its research and development. To learn more about Marian’s work, please visit her biography page.