Deep learning techniques have by now achieved unprecedented levels of accuracy in important tasks such as speech translation and image recognition, despite their known failures on properly selected adversarial examples. The operation of deep neural networks can be interpreted as the extraction, across successive layers, of approximate minimal sufficient statistics from the data, with the aim of preserving as much information as possible with respect to the desired output.
A deep neural network encodes a learned task in the synaptic weights between connected neurons. The weights define the transformation between the statistics produced by successive layers. Learning requires updating all the synaptic weights, which typically number in the millions, and inference on a new input, e.g., an audio file or an image, generally involves computations at all neurons. As a result, the energy required to run a deep neural network is currently incompatible with an implementation on mobile devices.
The economic incentive to offer mobile users applications such as Siri has hence motivated the development in recent years of computation offloading schemes, whereby computation is migrated from mobile devices to remote servers accessed via a wireless interface. Accordingly, users' data is processed on servers located within the wireless operator's network rather than on the devices. This reduces energy consumption at the mobiles but, at the same time, entails latency (a significant issue for applications such as Augmented Reality) and a potential loss of privacy.
The terminology used to describe deep learning methods, with its neurons and synapses, reveals the ambition to capture at least some of the brain's functionalities via artificial means. But the contrast between the apparent efficiency of the human brain, which operates with five orders of magnitude (100,000 times) less power than the most powerful current supercomputers, and the state of the art in neural networks remains jarring.
Current deep learning methods rely on second-generation neurons, which consist of simple static non-linear functions. In contrast, neurons in the human brain are known to communicate by means of sparse spiking processes. As a result, neurons are mostly inactive and energy is consumed sporadically and only in limited areas of the brain at any given time. Third-generation neural networks, or Spiking Neural Networks (SNNs), aim at harnessing the efficiencies of spike-domain processing by building on computing elements that operate on, and exchange, spikes. In an SNN, spiking neurons determine whether to output a spike to the connected neurons based on the incoming spikes.
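To make the spiking behaviour concrete, the following is a minimal sketch of a single spiking neuron in Python. It assumes a discrete-time leaky integrate-and-fire model, which is one common simplification and is not prescribed by the discussion above; the function name `lif_neuron` and the parameters `leak` and `threshold` are illustrative choices.

```python
def lif_neuron(input_spikes, weights, leak=0.9, threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron (an assumed, simplified model).

    input_spikes: list of time steps, each a list of 0/1 values, one per input synapse.
    weights: one synaptic weight per input synapse.
    Returns a list with one output spike (0 or 1) per time step.
    """
    potential = 0.0  # membrane potential, the only state the neuron keeps
    output = []
    for spikes_t in input_spikes:
        # The potential decays by the leak factor, then integrates the weighted incoming spikes.
        potential = leak * potential + sum(w * s for w, s in zip(weights, spikes_t))
        if potential >= threshold:
            output.append(1)   # emit a spike to the connected neurons
            potential = 0.0    # reset after firing
        else:
            output.append(0)   # stay silent
    return output


# Two input synapses; the neuron fires only once enough input has accumulated.
print(lif_neuron([[1, 0], [1, 0], [0, 0], [1, 1]], weights=[0.6, 0.6]))
```

With these inputs the neuron stays silent on the first step, fires on the second once two successive input spikes have accumulated, is idle when no input arrives, and fires again when both synapses are active together.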
Neuromorphic hardware that can natively implement SNNs is currently being developed. Unlike in traditional CPUs or GPUs running deep learning algorithms, processing and communication are not "clocked" to take place across all computing elements at regular intervals. Rather, neuromorphic hardware consists of spiking neurons that are active only asynchronously, whenever excited by input spikes, potentially increasing the energy efficiency by orders of magnitude.
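The difference between clocked and event-driven operation can be illustrated in software with a small sketch. It is only an analogy for what neuromorphic hardware does natively, under simplifying assumptions: integrate-and-fire neurons without leak, a unit delay on every connection, and a priority queue standing in for asynchronous spike delivery. The function `run_event_driven` and its arguments are hypothetical names introduced for this example.

```python
import heapq


def run_event_driven(initial_spikes, fanout, weights, threshold=1.0, max_time=10):
    """Process spikes as events, touching only the neurons that receive them.

    initial_spikes: list of (time, neuron) events injected from outside.
    fanout: dict mapping a neuron to the list of neurons it connects to.
    weights: dict mapping (source, target) to a synaptic weight.
    Returns (list of (time, neuron) spikes, number of neuron updates performed).
    """
    queue = list(initial_spikes)
    heapq.heapify(queue)          # events are served in time order
    potential = {}                # state exists only for neurons that were ever excited
    spikes = []
    updates = 0
    while queue:
        t, src = heapq.heappop(queue)
        if t > max_time:
            break
        spikes.append((t, src))
        for dst in fanout.get(src, []):
            updates += 1          # work is done only where a spike arrives
            potential[dst] = potential.get(dst, 0.0) + weights[(src, dst)]
            if potential[dst] >= threshold:
                potential[dst] = 0.0                  # reset after firing
                heapq.heappush(queue, (t + 1, dst))   # assumed unit delay
    return spikes, updates


# Neurons 0 and 1 both feed neuron 2, which feeds neuron 3.
fanout = {0: [2], 1: [2], 2: [3]}
weights = {(0, 2): 0.6, (1, 2): 0.6, (2, 3): 1.0}
spikes, updates = run_event_driven([(0, 0), (0, 1)], fanout, weights)
print(spikes, updates)
```

In this toy network the event-driven loop performs 3 neuron updates, whereas a clocked scheme that visited all 4 neurons at each of the 3 time steps would perform 12. The gap grows as spiking activity becomes sparser.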
If the promises of neuromorphic hardware and SNNs are realized and neuromorphic chips find their place within mobile devices, we could soon see the emergence of revolutionary new applications under enhanced privacy guarantees.