DESIGN & PRODUCTS EDGE COMPUTING & AI
velop without the hardware amenities available in a data-center
environment. These models need to continue tuning themselves
while minimizing processing power and energy consumption,
while at the same time making the most of the limited
local storage. Some applications minimize this requirement
by improving the AI model in a cloud environment, and then frequently
updating the edge device with the latest model version.
More interesting, though, are hybrid approaches, such as
Google's federated learning, which enables the model to
be optimized using local data. This requires robust edge compute
power to support frequent exchanges of neural-network model
updates with the cloud. However, since the learning is never
"complete," the device must continually devote substantial
processing power and memory to keep improving the model.
This is precisely the problem: AI models struggle to
accomplish these goals at the edge.
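The hybrid loop described above can be sketched as follows. This is a minimal federated-averaging toy on a linear model, not Google's actual implementation; the model, data, and hyperparameters are all illustrative placeholders.

```python
import numpy as np

def local_update(weights, X, y, lr=0.01, epochs=5):
    """One client's on-device training step (plain SGD on a linear model)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def federated_round(global_weights, clients):
    """Each edge device trains on its own local data; only the weight
    updates (never the raw data) are averaged back into the global model."""
    local = [local_update(global_weights, X, y) for X, y in clients]
    return np.mean(local, axis=0)

# Two simulated edge devices, each holding private samples of y = 3x
rng = np.random.default_rng(0)
clients = []
for _ in range(2):
    X = rng.normal(size=(32, 1))
    clients.append((X, 3.0 * X[:, 0]))

w = np.zeros(1)
for _ in range(50):
    w = federated_round(w, clients)
# After enough rounds, the shared model approaches the true slope of 3
```

Even in this toy, the cost structure the article describes is visible: every round re-runs training locally, so the device must keep both the model and its working activations in memory indefinitely.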
Memory: it’s where the power goes
As the apocryphal story goes, when the infamous bank robber
Willie Sutton was asked why he chose banks as his targets, he
replied “because that’s where
the money is.” For many edge
AI devices, most of the power is
consumed in the memory system.
AI processing—especially
training—is very memory-hungry
and utilizing off-chip memory
has become a necessity to keep
up with performance improvements.
Google has found that
in a mobile system over 60% of
the total system power budget is
used to transfer data back and
forth between on- and off-chip
memories. This is more than the
processing, sensing, and all other
functions combined.
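To make the stakes concrete, here is an illustrative back-of-envelope comparison. The per-access energies below are assumed ballpark figures of the kind often quoted in the computer-architecture literature, not Google's measurements; the roughly two-orders-of-magnitude gap between off-chip DRAM and small on-chip SRAM accesses is the point, not the exact values.

```python
# Rough, illustrative per-access energies (assumed ballpark figures;
# actual values vary widely by process node and memory configuration):
E_SRAM_PJ = 5.0      # ~pJ per 32-bit read from a small on-chip SRAM
E_DRAM_PJ = 640.0    # ~pJ per 32-bit read from off-chip DRAM

accesses = 1_000_000  # hypothetical number of 32-bit weight fetches

off_chip_uj = accesses * E_DRAM_PJ * 1e-6  # picojoules -> microjoules
on_chip_uj = accesses * E_SRAM_PJ * 1e-6

print(f"off-chip: {off_chip_uj:.0f} uJ, on-chip: {on_chip_uj:.0f} uJ, "
      f"ratio: {off_chip_uj / on_chip_uj:.0f}x")
```

Under these assumptions, the same million weight fetches cost over a hundred times more energy off-chip than on-chip, which is why the data-movement budget can dwarf the compute itself.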
The obvious answer is
therefore to eliminate these
data transfers by putting all of
the memory on-chip. However,
the current on-chip memory of
choice, SRAM, is simply too large and power-hungry. If transferring
data off-chip is the biggest power hog, close behind it is
the power consumed by the SRAM on-chip memory. And due
to SRAM’s large size, one quickly runs out of area on the chip to
add enough memory for AI applications.
To make AI at the edge truly successful, memory must be
able to address performance demands on-chip and perform
perception tasks locally, with high accuracy and energy efficiency.
New memory for the edge
All of these factors have made the AI landscape a fertile ground
for experimentation and innovation with new memories that
have unique or improving characteristics. Hardware is becoming
the key performance bottleneck, and solutions to that
bottleneck become differentiators. That's why leading
internet players, such as Google, Facebook, Amazon, and
Apple, are rushing to become silicon designers in search of a
hardware competitive edge. Hardware has emerged as the new
AI battlefield. Necessity begets invention, and the necessity for
faster AI chips that use less power has opened opportunities for
potentially denser, more efficient memory technologies.
One such promising technology is magnetic RAM (MRAM), a
memory that’s bound to cross paths with AI as it rapidly moves
toward higher density, energy efficiency, endurance, and yields.
The semiconductor industry is beginning to invest heavily in
MRAM, as the technology’s potential slowly becomes reality.
Initial research has shown it offers a number of benefits that are
ideal for intelligent edge applications.
The ubiquitous on-chip working memory today is SRAM, but
it has flaws. It’s the largest memory type, meaning it’s the most
expensive per bit, and every bit “leaks” (wastes power) whenever
the memory is powered on. MRAM is the only promising new
memory that has the speed and endurance to replace SRAM.
Since MRAM uses a very small memory bitcell, it can be
three to four times denser than SRAM, allowing for more memory
to reside on-chip and thus eliminating or reducing the need to
shuttle data off-chip. MRAM is also non-volatile, meaning that
data is retained even when the power is shut off. This virtually
eliminates memory leakage, which is critical for applications
where the AI chip remains idle for extended periods of time.
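A toy calculation shows why non-volatility matters for such duty-cycled devices. The leakage figure and duty cycle below are hypothetical, chosen only to illustrate the effect, not measurements of any real part.

```python
# Hypothetical (assumed) figures for an edge sensor that is mostly idle:
SRAM_LEAKAGE_MW = 2.0   # standby leakage of an always-powered SRAM array
ACTIVE_DUTY = 0.01      # chip is active only 1% of the time
DAY_S = 24 * 3600       # seconds in a day

idle_s = DAY_S * (1 - ACTIVE_DUTY)

# SRAM must stay powered to retain data, so it leaks for every idle second
sram_idle_j = SRAM_LEAKAGE_MW * 1e-3 * idle_s  # milliwatts * s -> joules

# A non-volatile MRAM array can be powered down entirely while idle,
# retaining its contents at (ideally) zero standby power
mram_idle_j = 0.0
```

With these assumed numbers, idle leakage alone costs the SRAM-based design on the order of 170 J per day, a budget a battery-powered or energy-harvesting device may simply not have.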
MRAM isn’t the only memory getting attention.
The demand for AI applications and intelligence at the edge
is leading a memory revolution within the semiconductor industry
for a wide variety of applications.
Other new high-density non-volatile memories, such as 3D
XPoint (Intel’s Optane), phase-change memory (PCM), and resistive
memory (ReRAM) also bring new possibilities and unique
advantages for storage applications. While not as fast or
high-endurance as MRAM, and therefore not replacements for
SRAM, these non-volatile technologies are extremely dense
and provide unique speed and power advantages over flash
memory.
In addition, significant neuromorphic research is investigating
using these new memory bitcells directly as the synapses
and/or neurons of a neural net. Most research is focusing on
ReRAM, although other technologies such as MRAM are also
being explored.
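The bitcell-as-synapse idea can be sketched with an idealized simulation: weights are stored as crossbar cell conductances, inputs are applied as row voltages, and the column currents sum into an analog matrix-vector product via Ohm's and Kirchhoff's laws. Real devices add noise, nonlinearity, and wire resistance, all ignored here; the numbers are illustrative.

```python
import numpy as np

def crossbar_mvm(conductances, voltages):
    """Idealized ReRAM crossbar: each cell at (row i, col j) passes
    current I = G[i, j] * V[i] (Ohm's law), and each column wire sums
    its cells' currents (Kirchhoff's current law), so the output
    currents are G.T @ V. One analog 'read' of the array thus
    computes an entire matrix-vector product in place."""
    return conductances.T @ voltages

# A 3x2 weight matrix encoded as cell conductances (siemens, illustrative)
G = np.array([[1e-6, 2e-6],
              [3e-6, 4e-6],
              [5e-6, 6e-6]])
V = np.array([0.1, 0.2, 0.3])  # input activations applied as row voltages

I = crossbar_mvm(G, V)  # column currents = the layer's pre-activation output
```

The appeal for the edge is that the multiply-accumulate happens inside the memory array itself, eliminating exactly the weight-fetch traffic discussed earlier.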
For the first time in decades, silicon startups are shaping our
future with new, innovative memory technologies. MRAM and
the other memories are the catalysts that will change the possibilities
of modern technology and applications.
This article first appeared on Electronic Design –
www.electronicdesign.com
36 News April 2020 @eeNewsEurope www.eenewseurope.com