DESIGN & PRODUCTS EDGE COMPUTING & AI
velop without the hardware amenities available in a data-center
environment. These models need to continue tuning themselves
while minimizing processing power and energy consumption,
while at the same time making the most of the limited
local storage. Some applications minimize this requirement
by improving the AI model in a cloud environment, and then frequently
updating the edge device with the latest model version.
More interesting, though, are hybrid approaches, such as
Google's federated learning, which enables the model to
be optimized using local data. This requires robust edge compute
power to support frequent exchanges of neural-network model
updates with the cloud. However, since the learning is never
"complete," the device must continually devote substantial
processing power and memory to keep improving the model.
This is precisely the problem: AI models struggle to
accomplish these goals at the edge.
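The hybrid loop described above can be sketched as follows. This is a minimal federated-averaging toy on a linear model, not Google's actual implementation; the model, data, and hyperparameters are all illustrative placeholders.

```python
import numpy as np

def local_update(weights, X, y, lr=0.01, epochs=5):
    """One client's on-device training step (plain SGD on a linear model)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def federated_round(global_weights, clients):
    """Each edge device trains on its own local data; only the weight
    updates (never the raw data) are averaged back into the global model."""
    local = [local_update(global_weights, X, y) for X, y in clients]
    return np.mean(local, axis=0)

# Two simulated edge devices, each holding private samples of y = 3x
rng = np.random.default_rng(0)
clients = []
for _ in range(2):
    X = rng.normal(size=(32, 1))
    clients.append((X, 3.0 * X[:, 0]))

w = np.zeros(1)
for _ in range(50):
    w = federated_round(w, clients)
# After enough rounds, the shared model approaches the true slope of 3
```

Even in this toy, the cost structure the article describes is visible: every round re-runs training locally, so the device must keep both the model and its working activations in memory indefinitely.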
Memory: it’s where the power goes
As the apocryphal story goes, when the infamous bank robber
Willie Sutton was asked why he chose banks as his targets, he
replied “because that’s where
the money is.” For many edge
AI devices, most of the power is
consumed in the memory system.
AI processing—especially
training—is very memory-hungry
and utilizing off-chip memory
has become a necessity to keep
up with performance improvements.
Google has found that
in a mobile system over 60% of
the total system power budget is
used to transfer data back and
forth between on- and off-chip
memories. This is more than the
processing, sensing, and all other
functions combined.
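To make the stakes concrete, here is an illustrative back-of-envelope comparison. The per-access energies below are assumed ballpark figures of the kind often quoted in the computer-architecture literature, not Google's measurements; the roughly two-orders-of-magnitude gap between off-chip DRAM and small on-chip SRAM accesses is the point, not the exact values.

```python
# Rough, illustrative per-access energies (assumed ballpark figures;
# actual values vary widely by process node and memory configuration):
E_SRAM_PJ = 5.0      # ~pJ per 32-bit read from a small on-chip SRAM
E_DRAM_PJ = 640.0    # ~pJ per 32-bit read from off-chip DRAM

accesses = 1_000_000  # hypothetical number of 32-bit weight fetches

off_chip_uj = accesses * E_DRAM_PJ * 1e-6  # picojoules -> microjoules
on_chip_uj = accesses * E_SRAM_PJ * 1e-6

print(f"off-chip: {off_chip_uj:.0f} uJ, on-chip: {on_chip_uj:.0f} uJ, "
      f"ratio: {off_chip_uj / on_chip_uj:.0f}x")
```

Under these assumptions, the same million weight fetches cost over a hundred times more energy off-chip than on-chip, which is why the data-movement budget can dwarf the compute itself.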
The obvious answer is
therefore to eliminate these
data transfers by putting all of
the memory on-chip. However,
the current on-chip memory of
choice, SRAM, is simply too large and power-hungry. If transferring
data off-chip is the biggest power hog, close behind it is
the power consumed by the SRAM on-chip memory. And due
to SRAM’s large size, one quickly runs out of area on the chip to
add enough memory for AI applications.
To make AI at the edge truly successful, memory must be
able to address performance demands on-chip and perform
perception tasks locally, with high accuracy and energy efficiency.
New memory for the edge
All of these factors have made the AI landscape a fertile ground
for experimentation and innovation with new memories that
have unique or improving characteristics. Hardware is becoming
the key performance bottleneck, and solutions to that
bottleneck become differentiators. That's why leading
internet players, such as Google, Facebook, Amazon, and
Apple, are rushing to become silicon designers in search of a
hardware competitive edge. Hardware has emerged as the new
AI battlefield. Necessity begets invention, and the necessity for
faster AI chips that use less power has opened opportunities for
potentially denser, more efficient memory technologies.
One such promising technology is magnetic RAM (MRAM), a
memory that’s bound to cross paths with AI as it rapidly moves
toward higher density, energy efficiency, endurance, and yields.
The semiconductor industry is beginning to invest heavily in
MRAM, as the technology’s potential slowly becomes reality.
Initial research has shown it offers a number of benefits that are
ideal for intelligent edge applications.
The ubiquitous on-chip working memory today is SRAM, but
it has flaws. It’s the largest memory type, meaning it’s the most
expensive per bit, and every bit “leaks” (wastes power) whenever
the memory is powered on. MRAM is the only promising new
memory that has the speed and endurance to replace SRAM.
Since MRAM uses a very small memory bitcell, it can be
three to four times denser than SRAM, allowing for more memory
to reside on-chip and thus eliminating or reducing the need to
shuttle data off-chip. MRAM is also non-volatile, meaning that
data is retained even when the power is shut off. This virtually
eliminates memory leakage, which is critical for applications
where the AI chip remains idle for extended periods of time.
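A toy calculation shows why non-volatility matters for such duty-cycled devices. The leakage figure and duty cycle below are hypothetical, chosen only to illustrate the effect, not measurements of any real part.

```python
# Hypothetical (assumed) figures for an edge sensor that is mostly idle:
SRAM_LEAKAGE_MW = 2.0   # standby leakage of an always-powered SRAM array
ACTIVE_DUTY = 0.01      # chip is active only 1% of the time
DAY_S = 24 * 3600       # seconds in a day

idle_s = DAY_S * (1 - ACTIVE_DUTY)

# SRAM must stay powered to retain data, so it leaks for every idle second
sram_idle_j = SRAM_LEAKAGE_MW * 1e-3 * idle_s  # milliwatts * s -> joules

# A non-volatile MRAM array can be powered down entirely while idle,
# retaining its contents at (ideally) zero standby power
mram_idle_j = 0.0
```

With these assumed numbers, idle leakage alone costs the SRAM-based design on the order of 170 J per day, a budget a battery-powered or energy-harvesting device may simply not have.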
MRAM isn’t the only memory getting attention.
The demand for AI applications and intelligence at the edge
is leading a memory revolution within the semiconductor industry
for a wide variety of applications.
Other new high-density non-volatile memories, such as 3D
XPoint (Intel’s Optane), phase-change memory (PCM), and resistive
memory (ReRAM) also bring new possibilities and unique
advantages for storage applications. While not as fast or
high-endurance as MRAM, and therefore not replacements for
SRAM, these non-volatile technologies are extremely dense
and provide unique speed and power advantages over flash
memory.
In addition, significant neuromorphic research is investigating
using these new memory bitcells directly as the synapses
and/or neurons of a neural net. Most research is focusing on
ReRAM, although other technologies such as MRAM are also
being explored.
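The bitcell-as-synapse idea can be sketched with an idealized simulation: weights are stored as crossbar cell conductances, inputs are applied as row voltages, and the column currents sum into an analog matrix-vector product via Ohm's and Kirchhoff's laws. Real devices add noise, nonlinearity, and wire resistance, all ignored here; the numbers are illustrative.

```python
import numpy as np

def crossbar_mvm(conductances, voltages):
    """Idealized ReRAM crossbar: each cell at (row i, col j) passes
    current I = G[i, j] * V[i] (Ohm's law), and each column wire sums
    its cells' currents (Kirchhoff's current law), so the output
    currents are G.T @ V. One analog 'read' of the array thus
    computes an entire matrix-vector product in place."""
    return conductances.T @ voltages

# A 3x2 weight matrix encoded as cell conductances (siemens, illustrative)
G = np.array([[1e-6, 2e-6],
              [3e-6, 4e-6],
              [5e-6, 6e-6]])
V = np.array([0.1, 0.2, 0.3])  # input activations applied as row voltages

I = crossbar_mvm(G, V)  # column currents = the layer's pre-activation output
```

The appeal for the edge is that the multiply-accumulate happens inside the memory array itself, eliminating exactly the weight-fetch traffic discussed earlier.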
For the first time in decades, silicon startups are shaping our
future with new, innovative memory technologies. MRAM and
the other memories are the catalysts that will change the possibilities
of modern technology and applications.
This article first appeared on Electronic Design –
www.electronicdesign.com
36 News April 2020 @eeNewsEurope www.eenewseurope.com