Easing FPGA Integration in Data Centres
Intel
The significant performance and power efficiency of FPGAs are driving their growing use in multiple applications, such as signal processing, cryptography and deep-learning inference. FPGAs use massive parallelism and deep pipelining to execute highly compute-intensive loops in hardware, offloading these “kernels” from software and enabling substantial improvements in execution speed.
The growth of cloud computing means that more and more such compute-intensive applications, including artificial intelligence (AI), big-data analysis and traditional supercomputing workloads, are being undertaken within cloud and enterprise data centres. FPGA acceleration offers some obvious advantages in this environment, which some public cloud providers have already recognised, but traditional data-centre development tools, processes and culture inhibit widespread adoption of FPGAs.
This article looks at the two areas where FPGA and traditional data-centre development are most mismatched – programming, and management and orchestration (MANO) – and then proposes an approach to bridging this gap.

Figure 1. An adaptation stack represents the tasks and tools necessary to make FPGA acceleration accessible to users.
Inhibitors to FPGA integration
The term “programming” may be used for both FPGAs and traditional CPUs, but the tasks involved and the skills required are very different for each type of device. With a traditional CPU, developers write code in high-level languages, at a level of abstraction from the “black-box” CPU, and the development environment translates the high-level code into machine-level instructions. This type of programming requires no knowledge of the underlying hardware and, with certain exceptions, the programmer doesn’t have to worry about the timing and sequencing of instructions.
FPGA programming also starts with human-readable languages, such as C++, but it involves descriptions of hardware structures and logic blocks rather than sequences of instructions. FPGA programming requires the developer to specify timing constraints and sequencing, and the process is much closer to hardware configuration than to CPU programming.
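The contrast can be illustrated with a simple compute-intensive loop of the kind the article calls a “kernel”. The sketch below is ordinary C++ that runs sequentially on a CPU; the comments indicate how a high-level-synthesis (HLS) flow would instead map the same loop onto pipelined hardware. The pragma shown in the comment is illustrative of the style only, not any specific vendor’s syntax.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// A compute-intensive "kernel": a dot product over fixed-size arrays.
// On a CPU this compiles to a sequence of instructions executed one
// after another. An HLS compiler targeting an FPGA would instead build
// a hardware pipeline for the loop body, issuing one multiply-accumulate
// per clock cycle (and unrolling the loop to add spatial parallelism).
template <std::size_t N>
int dot_product(const std::array<int, N>& a, const std::array<int, N>& b) {
    int acc = 0;
    // Illustrative HLS-style hints (not real vendor syntax):
    // #pragma unroll      -- replicate the loop body in hardware
    // #pragma ii 1        -- pipeline with an initiation interval of 1
    for (std::size_t i = 0; i < N; ++i) {
        acc += a[i] * b[i];  // becomes a multiply-accumulate stage in hardware
    }
    return acc;
}
```

The same source text thus describes instructions on a CPU but a spatial circuit on an FPGA, which is why the skills and constraints (timing, pipelining) differ so much between the two flows.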
Whilst the development culture therefore differs greatly between traditional data-centre technologies and FPGAs, established data-centre administration techniques also struggle to accommodate FPGAs.
Modern data centres are highly automated environments built on a multitude of identical racks containing similar servers that share network connections and storage nodes. Increasingly, virtualisation is used to enable rapid switching of data-centre resources across servers, and MANO tools enable resource tracking, scheduling and billing. FPGAs do not fit easily into this environment; unlike traditional servers they are not identical – their power comes from their variety. Consequently, although they can be reconfigured when required, this cannot be done within the nanosecond windows required by data-centre MANO tools.
The challenge, therefore, is to find ways of making the power of FPGAs available to application developers and end users by bridging this mismatch between tools, processes and culture.
The Stack
If the use of FPGAs is to be made transparent to data-centre developers at the high level, it will not suffice simply to insert a card with an FPGA into a server slot; more work is required to achieve the necessary level of abstraction in this environment.
The OSI (Open Systems Interconnection) model adopted in the telecommunications world serves as a useful tool to address this bridging challenge, and for this purpose a five-layer model or “stack” is defined, as shown in figure 1. The five layers required – physical, configuration, abstraction, environmental and MANO – are described in the following sections.
Physical Layer
To gain access to the full capabilities of the FPGA, it should be placed on its own card and plugged into the data-centre server rack, following the approach adopted by Intel with its Intel Stratix 10 Programmable Acceleration Card. This gives the FPGA’s serial ports access to the backplane network and facilitates communication with a particular CPU via PCI Express. In this configuration, the FPGA can work as a slave accelerator to the CPU or, alternatively, can stream data directly from the network.
Further work is required at this level, however, to make the FPGA a useful accelerator: FPGA, CPU and server-board designers need to cooperate to create a hardware “gasket”, or signal bridge, containing user-designed functions that enable the FPGA to talk to its host CPU, maintain security, transfer data and manage execution.
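As a rough illustration of what such a gasket exposes to software, the sketch below models the host-side sequence the article describes – transfer a buffer to the card, trigger the offloaded kernel, and wait for completion. All names here (`FpgaGasket`, `submit`, `wait`) are hypothetical, not a real vendor API, and the accelerator is stubbed in software so the example is self-contained; a real gasket would drive DMA descriptors and control registers over PCI Express.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side view of the hardware "gasket" described in the
// article. The FPGA side is stubbed in software (the "kernel" just
// doubles each value) so the control flow can be shown end to end.
class FpgaGasket {
public:
    // DMA the input buffer to the card and start the offloaded kernel.
    // Stubbed here; a real implementation would program DMA descriptors
    // and a doorbell register over PCI Express.
    void submit(const std::vector<std::uint32_t>& input) {
        result_.clear();
        for (auto v : input) {
            result_.push_back(v * 2);  // stand-in for the hardware kernel
        }
        done_ = true;
    }

    // Poll for completion; real hardware would signal this via an
    // interrupt or a memory-mapped status register.
    bool wait() const { return done_; }

    // Read back the output buffer after completion.
    const std::vector<std::uint32_t>& result() const { return result_; }

private:
    std::vector<std::uint32_t> result_;
    bool done_ = false;
};
```

In the slave-accelerator configuration the CPU drives this submit/wait cycle; in the streaming configuration the same data path would instead be fed directly from the backplane network.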
6 Embedded September 2019 www.eenewsembedded.com News @eeNewsEurope