NEURAL COMPUTING
THE DEVIL IS IN THE DETAIL OF DEEP LEARNING HARDWARE
BY JAMES MORRA
To identify skin cancer, perceive human speech, and run other deep learning tasks, chipmakers are adapting processors to work with lower-precision numbers. These numbers contain fewer bits than higher-precision ones, which demand “heavier lifting” from computers.
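The bit-count difference is easy to see in code. A minimal sketch using only Python's standard-library struct module (the value 3.14159 is just an illustrative choice) packs the same number at single and half precision:

```python
import struct

# Pack the same value at single (32-bit) and half (16-bit) precision.
fp32_bytes = struct.pack("<f", 3.14159)
fp16_bytes = struct.pack("<e", 3.14159)

print(len(fp32_bytes))  # 4 bytes
print(len(fp16_bytes))  # 2 bytes

# The smaller number carries less detail: unpacking the half-precision
# bytes recovers only an approximation of the original value.
print(struct.unpack("<e", fp16_bytes)[0])  # 3.140625
```

Half the storage, half the memory traffic — but the last digits of the value are gone, which is exactly the trade chipmakers are weighing.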
Intel’s Nervana unit plans to release a special processor before the end of 2017 that trains neural networks faster than other architectures. But in addition to improving memory and interconnects, Intel created a new way of formatting numbers for lower-precision maths. The numbers occupy fewer bits, so the hardware can use less silicon, less computing power, and less electricity.
Intel’s numerology is an example of the dull and yet strangely elegant ways that chip companies are coming to grips with deep learning. It is still unclear whether ASICs, FPGAs, CPUs, GPUs, or other chips will be best at handling calculations the way the human brain does. But every chip appears to be using lower-precision maths to get the job done.
Still, companies pay a surcharge for using
numbers with less detail. “You are giving up
something, but the question is whether it’s
significant or not,” said Paulius Micikevicius,
principal engineer in Nvidia’s computer
architecture and deep learning research group.
“At some point you start losing accuracy, and
people start playing games to recover it.”
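One of those games is gradient (or loss) scaling: in half precision, very small gradient values round away to zero, so training code multiplies them by a constant before the cast and divides it back out afterwards. A minimal standard-library Python sketch of the idea (illustrative only, not Nvidia's implementation; the scale factor 1024 is an arbitrary choice):

```python
import struct

def to_half(x):
    # Round-trip a Python float through IEEE 754 half precision.
    return struct.unpack("<e", struct.pack("<e", x))[0]

def rescued_gradients(gradients, scale=1024.0):
    # Scale tiny gradients up so they survive the cast to half
    # precision, then unscale them for the weight update.
    survived = [to_half(g * scale) for g in gradients]
    return [g / scale for g in survived]

tiny = 1e-8  # below fp16's smallest subnormal (~6e-8)
print(to_half(tiny))                 # 0.0 -- lost without scaling
print(rescued_gradients([tiny])[0])  # ~1e-8 -- recovered with scaling
```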
Shedding precision is nothing new, he said. For over five years, oil and gas companies have stored drilling and geological data as half-precision numbers – 16-bit floating point – and run calculations in single precision – 32-bit floating point – on Nvidia’s graphics chips, which are the current gold standard for training and running deep learning.
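That storage/compute split exists because a half-precision accumulator quickly runs out of resolution. A standard-library Python sketch (the 0.001 increment and 5,000-step loop are arbitrary illustration values) shows a running sum stalling in 16-bit arithmetic while a wider accumulator keeps counting:

```python
import struct

def to_half(x):
    # Round-trip through IEEE 754 half precision (16-bit float).
    return struct.unpack("<e", struct.pack("<e", x))[0]

increment = to_half(0.001)
half_total = 0.0    # accumulator kept in half precision
single_total = 0.0  # accumulator kept in wider precision

for _ in range(5000):
    half_total = to_half(half_total + increment)
    single_total += increment

# With only ~10 mantissa bits, the half-precision running total
# stalls once each tiny addition rounds away to nothing.
print(half_total)    # stalls near 4.0
print(single_total)  # ~5.002
```

Storing data in half precision saves memory; doing the arithmetic in single precision avoids this kind of silent error accumulation.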
In recent years, Nvidia has adapted its graphics chips to reduce the computing power wasted in training deep learning programs. Its older Pascal architecture performs 16-bit maths at twice the throughput of 32-bit operations. Its latest Volta architecture runs 16-bit operations inside custom tensor cores, which speedily move data through the layers of a neural network.
Intel’s new format, FlexPoint, maximizes the precision that can be stored in 16 bits. It can represent a slightly wider range of numbers than traditional fixed-point formats, which can be handled with less computing power and memory. But it appears to provide less flexibility than the floating-point numbers commonly used with neural networks.
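The article describes FlexPoint only at a high level, but the family it resembles — fixed-point mantissas that share one adjustable exponent per tensor — can be sketched in standard-library Python. Everything below (16-bit signed mantissas, one power-of-two scale chosen from the largest magnitude) is an illustrative assumption, not Intel's specification:

```python
import math

def quantize_shared_exponent(values, mantissa_bits=16):
    # Encode a tensor as signed fixed-point mantissas that all share
    # one power-of-two scale, chosen so the largest magnitude fits.
    # Generic block floating-point sketch, not Intel's actual format.
    max_mag = max(abs(v) for v in values)
    exponent = math.frexp(max_mag)[1]          # max_mag < 2**exponent
    scale = 2.0 ** (exponent - (mantissa_bits - 1))
    lo, hi = -(1 << (mantissa_bits - 1)), (1 << (mantissa_bits - 1)) - 1
    mantissas = [max(lo, min(hi, int(round(v / scale)))) for v in values]
    return mantissas, scale

def dequantize(mantissas, scale):
    return [m * scale for m in mantissas]

weights = [0.75, -0.5, 0.125]
mantissas, scale = quantize_shared_exponent(weights)
print(dequantize(mantissas, scale))  # recovers [0.75, -0.5, 0.125]
```

Because every value in the tensor shares the exponent, the hardware only needs cheap integer arithmetic on the mantissas — the appeal of fixed point — while the adjustable scale recovers some of floating point's range.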
Different parts of deep learning need different levels of precision. Training entails going through, for example, thousands of photographs without explicit programming. An algorithm automatically adjusts millions of connections between the layers of the neural network.
36 DesignNews NOVEMBER 2017 www.eedesignnewseurope.com