Easing FPGA Integration in Data Centres
Intel
The significant performance and power efficiency of FPGAs are driving their growing use in multiple applications, such as signal processing, cryptography and deep-learning inference. FPGAs use massive parallelism and deep pipelining to execute highly compute-intensive loops in hardware, offloading these “kernels” from software and enabling substantial improvements in execution speed.
The growth of cloud computing means that more and more such compute-intensive applications, including artificial intelligence (AI), big-data analysis and traditional supercomputing workloads, are being undertaken within cloud and enterprise data centres. FPGA acceleration offers some obvious advantages in this environment, which some public cloud providers have already recognised, but traditional data-centre development tools, processes and culture inhibit widespread adoption of FPGAs.
This article looks at the two areas where FPGA and traditional data-centre development are most mismatched – programming, and management and orchestration (MANO) – and then proposes an approach to bridging this gap.

Figure 1. An adaptation stack represents the tasks and tools necessary to make FPGA acceleration accessible to users.
Inhibitors to FPGA integration
The term “programming” may be used for both FPGAs and traditional CPUs, but the tasks involved and the skills required are very different for each type of device. With a traditional CPU, developers write code in high-level languages, at a level of abstraction from the “black-box” CPU, and the development environment translates the high-level code into machine-level instructions. This type of programming requires no knowledge of the underlying hardware and, with certain exceptions, the programmer doesn’t have to worry about the timing and sequencing of instructions.
FPGA programming also starts with human-readable languages, such as C++, but it involves descriptions of hardware structures and logic blocks rather than sequences of instructions. FPGA programming requires the developer to specify timing constraints and sequencing, and the process is much closer to hardware configuration than to CPU programming.
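The contrast can be illustrated with a simple compute-intensive loop of the kind the article calls a “kernel”. The sketch below is ordinary C++ that runs sequentially on a CPU; the comments indicate how a high-level-synthesis (HLS) flow would instead map the same loop onto pipelined hardware. The pragma shown in the comment is illustrative of the style only, not any specific vendor’s syntax.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// A compute-intensive "kernel": a dot product over fixed-size arrays.
// On a CPU this compiles to a sequence of instructions executed one
// after another. An HLS compiler targeting an FPGA would instead build
// a hardware pipeline for the loop body, issuing one multiply-accumulate
// per clock cycle (and unrolling the loop to add spatial parallelism).
template <std::size_t N>
int dot_product(const std::array<int, N>& a, const std::array<int, N>& b) {
    int acc = 0;
    // Illustrative HLS-style hints (not real vendor syntax):
    // #pragma unroll      -- replicate the loop body in hardware
    // #pragma ii 1        -- pipeline with an initiation interval of 1
    for (std::size_t i = 0; i < N; ++i) {
        acc += a[i] * b[i];  // becomes a multiply-accumulate stage in hardware
    }
    return acc;
}
```

The same source text thus describes instructions on a CPU but a spatial circuit on an FPGA, which is why the skills and constraints (timing, pipelining) differ so much between the two flows.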
Whilst the development culture therefore differs greatly between traditional data-centre technologies and FPGAs, established data-centre administration techniques also struggle to accommodate FPGAs.
Modern data centres are highly automated environments built on a multitude of identical racks containing similar servers that share network connections and storage nodes. Increasingly, virtualisation is used to enable rapid switching of data-centre resources across servers, and MANO tools enable resource tracking, scheduling and billing. FPGAs do not fit easily into this environment; unlike traditional servers they are not identical – their power comes from their variety. Consequently, although they can be reconfigured when required, this cannot be done within the nanosecond windows required by data-centre MANO tools.
The challenge, therefore, is to find ways of making the power of FPGAs available to application developers and end users by bridging this mismatch between tools, processes and culture.
The Stack
If the use of FPGAs is to be made transparent to data-centre developers at the high level, it will not suffice simply to insert a card with an FPGA into a server slot; more work is required to achieve the necessary level of abstraction in this environment.
The OSI (Open Systems Interconnection) model adopted in the telecommunications world serves as a useful tool to address this bridging challenge, and for this purpose a five-layer model or “stack” is defined, as shown in figure 1. The five layers required – physical, configuration, abstraction, environmental and MANO – are described in the following sections.
Physical Layer
To gain access to the full capabilities of the FPGA, it should be placed on its own card and plugged into the data-centre server rack, following the approach adopted by Intel with its Intel Stratix 10 Programmable Acceleration Card. This gives the FPGA’s serial ports access to the backplane network and facilitates communication with a particular CPU via PCI Express. In this configuration, the FPGA can work as a slave accelerator to the CPU or, alternatively, can stream data directly from the network.
Further work is required at this level, however, to make the FPGA a useful accelerator: FPGA, CPU and server-board designers need to cooperate to create a hardware “gasket”, or signal bridge, containing user-designed functions that enable the FPGA to talk to its host CPU, maintain security, transfer data and manage execution.
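As a rough illustration of what such a gasket exposes to software, the sketch below models the host-side sequence the article describes – transfer a buffer to the card, trigger the offloaded kernel, and wait for completion. All names here (`FpgaGasket`, `submit`, `wait`) are hypothetical, not a real vendor API, and the accelerator is stubbed in software so the example is self-contained; a real gasket would drive DMA descriptors and control registers over PCI Express.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side view of the hardware "gasket" described in the
// article. The FPGA side is stubbed in software (the "kernel" just
// doubles each value) so the control flow can be shown end to end.
class FpgaGasket {
public:
    // DMA the input buffer to the card and start the offloaded kernel.
    // Stubbed here; a real implementation would program DMA descriptors
    // and a doorbell register over PCI Express.
    void submit(const std::vector<std::uint32_t>& input) {
        result_.clear();
        for (auto v : input) {
            result_.push_back(v * 2);  // stand-in for the hardware kernel
        }
        done_ = true;
    }

    // Poll for completion; real hardware would signal this via an
    // interrupt or a memory-mapped status register.
    bool wait() const { return done_; }

    // Read back the output buffer after completion.
    const std::vector<std::uint32_t>& result() const { return result_; }

private:
    std::vector<std::uint32_t> result_;
    bool done_ = false;
};
```

In the slave-accelerator configuration the CPU drives this submit/wait cycle; in the streaming configuration the same data path would instead be fed directly from the backplane network.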
6 Embedded September 2019 www.eenewsembedded.com News @eeNewsEurope