Study and implementation of non-recurrent deep learning algorithms for temporal sequences processing

Published : 15 July 2019

Recurrent neural networks – and notably the Long-Short Term Memory (LSTM) variant – are today at the state of the art for solving many temporal sequence classification problems and in particular used in speech recognition applications (from 2015 for Android) and automatic translation (from 2016 at Google, Apple and Facebook). This type of algorithm is also successfully applied in various applications such as audio event recognition, denoising, language modeling, sequences generations, etc.

The success of these approaches comes however with the cost of huge computing power requirements. This is why most of this algorithms are run on the Cloud, and not on the Edge. Moreover, recurrent neural networks are very sensitive to training parameters and can be difficult to converge because gradients internal to their recurrent structure can easily explode or vanish to zero. The adaptation of these algorithms for an embedded implementation is therefore not straightforward, because the recurrence requires a high precision and partially sequential (large latency) computing.

Some technics for overcoming these difficulties are starting to appear, but are still in their infancy. Among them, a non-recurrent technic allowing sequence processing with less constrain than LSTM seems promising: hierarchical networks. Temporal convolution networks (TCN) are one of their application. The advantages and drawbacks of this model are studied notably in “An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling” (Shaojie Bai, J. Zico Kolter, Vladlen Koltun). A basic implementation of each structure showed that TCN are more efficient in almost every test cases. Internal gradients are much more stable and computation can be easily parallelized thanks to the elimination of the recurrence.

More information