This week we posted a new Tech Note in which Jesse Engel discusses a technique for speeding up the training of deep recurrent neural networks. This is Part II of a multi-part series detailing some of the techniques we’ve used here at Baidu’s Silicon Valley AI Lab (SVAIL) to accelerate the training of recurrent neural networks. While Part I focused on the role that minibatch size and memory layout play in recurrent GEMM performance, we shift our focus here to tricks we can use to optimize the algorithms themselves.
There are two main takeaways in this blog post. First, differentiable graphs are a simple and useful tool for visually calculating complicated derivatives. Second, these graphs can also inspire algorithmic optimizations. As an example, we show how to accelerate Gated Recurrent Units (GRUs) by up to 40 percent.
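To make the GRU example concrete, here is a minimal sketch of a single GRU timestep in NumPy (a hypothetical illustration; the actual optimization is described in the linked post). The several small matrix multiplies against the input and hidden state are the recurrent GEMMs whose cost the Tech Note targets.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU timestep (Cho et al. formulation); biases omitted for brevity."""
    z = sigmoid(x @ Wz + h @ Uz)               # update gate
    r = sigmoid(x @ Wr + h @ Ur)               # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate hidden state
    return (1.0 - z) * h + z * h_tilde         # interpolate old and new state

# Toy dimensions and random parameters, just to show the shapes involved.
rng = np.random.default_rng(0)
d_in, d_h = 4, 3
x = rng.standard_normal((1, d_in))
h = np.zeros((1, d_h))
Wz, Wr, Wh = (rng.standard_normal((d_in, d_h)) for _ in range(3))
Uz, Ur, Uh = (rng.standard_normal((d_h, d_h)) for _ in range(3))
h_next = gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh)
print(h_next.shape)  # (1, 3)
```

Note that the three input-side products `x @ Wz`, `x @ Wr`, and `x @ Wh` share the same `x`, which is one reason such cells admit fusion into fewer, larger GEMMs.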
The post is aimed at:
– Researchers developing new iterative algorithms. (We develop variations of iterative algorithms, such as RNNs, that parallelize more efficiently.)
– Authors of Deep Learning frameworks that apply auto-differentiation, such as Theano, TensorFlow, Torch Autograd, or Neon. (We hope these methods provide inspiration for implicit graph optimizations, moving toward systems that better balance the tradeoffs between memory usage and computation.)
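The differentiable graphs the post visualizes are the same structures auto-differentiation frameworks build internally. As a hypothetical sketch (not any framework's actual implementation), a scalar reverse-mode autodiff can record each node's parents and local gradients, then apply the chain rule in reverse topological order:

```python
class Node:
    """A scalar value in a differentiable graph (toy sketch)."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # list of (parent_node, local_gradient) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Node(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Node(self.value * other.value,
                    [(self, other.value), (other, self.value)])

def backward(out):
    """Accumulate d(out)/d(node) for every node, in reverse topological order."""
    topo, visited = [], set()
    def build(n):
        if id(n) not in visited:
            visited.add(id(n))
            for parent, _ in n.parents:
                build(parent)
            topo.append(n)
    build(out)
    out.grad = 1.0
    for node in reversed(topo):
        for parent, local in node.parents:
            parent.grad += local * node.grad  # chain rule

x, y = Node(2.0), Node(3.0)
f = x * y + x            # f = x*y + x, so df/dx = y + 1, df/dy = x
backward(f)
print(x.grad, y.grad)    # 4.0 2.0
```

Optimizations like the GRU speedup amount to rewriting such a graph into an equivalent one with cheaper or better-fused operations.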
Read the full post here: http://svail.github.io/diff_graphs/
SVAIL Tech Notes are written by engineers for engineers on topics related to AI technologies, techniques, tips and trends.
Around the World in 60 Days, by Ryan Prenger and Tony Han
Deploying Deep Neural Networks Efficiently, by Chris Fougner
Optimizing RNN Performance, by Erich Elsen
SVAIL GitHub Blog