DilatedRNN: A TensorFlow Implementation of Dilated Recurrent Neural Networks


As is well known, learning on long sequences with recurrent neural networks is a difficult task. There are three main challenges: 1) extracting complex dependencies, 2) vanishing and exploding gradients, and 3) efficient parallelization. In this paper, we introduce a simple yet effective RNN connection structure, the DilatedRNN, which tackles all of these challenges at once.

Paper: [1710.02224] Dilated Recurrent Neural Networks

Notoriously, learning with recurrent neural networks (RNNs) on long sequences is a difficult task. There are three major challenges: 1) extracting complex dependencies, 2) vanishing and exploding gradients, and 3) efficient parallelization. In this paper, we introduce a simple yet effective RNN connection structure, the DILATEDRNN, which simultaneously tackles all these challenges. The proposed architecture is characterized by multi-resolution dilated recurrent skip connections and can be combined flexibly with different RNN cells. Moreover, the DILATEDRNN reduces the number of parameters and enhances training efficiency significantly, while matching state-of-the-art performance (even with Vanilla RNN cells) in tasks involving very long-term dependencies. To provide a theory-based quantification of the architecture's advantages, we introduce a memory capacity measure, the mean recurrent length, which is more suitable for RNNs with long skip connections than existing measures. We rigorously prove the advantages of the DILATEDRNN over other recurrent neural architectures.
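The structural idea behind the architecture is the dilated recurrent skip connection. Paraphrasing the paper's formulation (so treat the notation below as our sketch rather than a quotation): at layer l with dilation s^{(l)}, the cell state at time t is updated from the state s^{(l)} steps back instead of the immediately preceding step,

% x_t^{(l)}: input to layer l at time t (the output of layer l-1);
% f(.): any RNN, LSTM, or GRU cell; s^{(l)}: dilation of layer l.
c_t^{(l)} = f\!\left(x_t^{(l)},\; c_{t - s^{(l)}}^{(l)}\right)

Stacking layers whose dilations grow exponentially (1, 2, 4, ...) is what gives the multi-resolution view of long sequences.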

Code: code-terminator/DilatedRNN

Getting Started

The lightweight demo demonstrates how to construct a multi-layer DilatedRNN with different cells, hidden structures, and dilations for the task of permuted sequence classification on MNIST. Although most of the code is straightforward, we provide examples for the different network constructions below.
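For context, permuted sequential MNIST feeds each 28x28 image to the network pixel by pixel, with a single fixed random permutation applied to every image. A minimal data-preparation sketch (not the repo's exact demo code; the seed and function name are our own) could look like this:

import numpy as np

# One fixed permutation shared across the whole dataset (seed is an assumption).
rng = np.random.RandomState(100)
permutation = rng.permutation(784)

def to_permuted_sequence(images):
    # images: [batch, 28, 28] -> [batch, 784, 1] permuted pixel sequences.
    flat = images.reshape(-1, 784)
    return flat[:, permutation][..., np.newaxis]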

Below is an example that constructs a 9-layer DilatedRNN with vanilla RNN cells. Each layer has a hidden dimension of 20, and the dilation rate starts at 1 in the bottom layer and grows to 256 at the top layer.

cell_type = "RNN"

hidden_structs = [20] * 9

dilations = [1, 2, 4, 8, 16, 32, 64, 128, 256]
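If you are curious how a dilated recurrence can be computed efficiently, the paper's trick is to regroup time steps so that steps a dilation apart become one ordinary sequence processed by a standard cell. Below is a TensorFlow 1.x sketch of that idea; the function name, padding handling, and interface are our own and may differ from the repo's implementation.

import tensorflow as tf

def dilated_rnn_layer(cell, inputs, dilation, scope="dilated_rnn"):
    # One dilated recurrent layer (illustrative sketch, not the repo's exact code).
    # `inputs` is a list of n_steps tensors of shape [batch, input_dim],
    # following the TF 1.x static-RNN convention.
    n_steps = len(inputs)
    if n_steps % dilation != 0:
        # Zero-pad so the sequence length is a multiple of the dilation rate.
        pad = dilation - (n_steps % dilation)
        inputs = inputs + [tf.zeros_like(inputs[0])] * pad
    # Stack every group of `dilation` consecutive steps into the batch dimension,
    # so sub-sequences that are `dilation` steps apart form one ordinary sequence.
    dilated_inputs = [tf.concat(inputs[i:i + dilation], axis=0)
                      for i in range(0, len(inputs), dilation)]
    dilated_outputs, _ = tf.nn.static_rnn(cell, dilated_inputs,
                                          dtype=tf.float32, scope=scope)
    # Undo the grouping: split the batch dimension and restore the time order.
    unrolled = []
    for output in dilated_outputs:
        unrolled.extend(tf.split(output, dilation, axis=0))
    return unrolled[:n_steps]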

The current version of the code supports three cell types: "RNN", "LSTM", and "GRU". Of course, the code also supports the case where the dilation rate at the bottom layer is greater than 1 (as shown on the right-hand side of Figure 2 in our paper).
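One plausible way the cell_type string could map to TensorFlow 1.3 cells is sketched below; the helper function itself is an assumption for illustration (the repo may organize this differently), but the three cell classes do exist in tf.contrib.rnn.

import tensorflow as tf

def make_cell(cell_type, hidden_dim):
    # Map a cell_type string to a TF 1.3 cell (illustrative helper, assumed name).
    if cell_type == "RNN":
        return tf.contrib.rnn.BasicRNNCell(hidden_dim)
    elif cell_type == "LSTM":
        return tf.contrib.rnn.LSTMCell(hidden_dim)
    elif cell_type == "GRU":
        return tf.contrib.rnn.GRUCell(hidden_dim)
    raise ValueError("cell_type must be 'RNN', 'LSTM', or 'GRU'")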

An example of constructing a 4-layer DilatedRNN with GRU cells is shown below, with the dilation starting at 4. It is worth mentioning that the dilations do not need to increase in powers of 2, and the hidden dimensions of the different layers do not need to be the same.

cell_type = "GRU"

hidden_structs = [20, 30, 40, 50]

dilations = [4, 8, 16, 32]
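To show how these settings come together, here is an assumed end-to-end wiring for the permuted-MNIST classifier, reusing the make_cell and dilated_rnn_layer sketches above. The repo's demo has its own construction functions, so treat this only as an outline of the data flow.

import tensorflow as tf

n_steps, input_dim, n_classes = 784, 1, 10

x = tf.placeholder(tf.float32, [None, n_steps, input_dim])
x_steps = tf.unstack(x, n_steps, axis=1)   # list of [batch, input_dim] tensors

layer_outputs = x_steps
for i, (hidden_dim, dilation) in enumerate(zip(hidden_structs, dilations)):
    cell = make_cell(cell_type, hidden_dim)
    layer_outputs = dilated_rnn_layer(cell, layer_outputs, dilation,
                                      scope="dilated_layer_%d" % i)

# Classify from the final step of the top layer.
logits = tf.layers.dense(layer_outputs[-1], n_classes)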

Tested environment: TensorFlow 1.3 and NumPy 1.13.1.

Final Words

That"s all for now and hope this repo is useful to your research. For any questions, please create an issue and we will get back to you as soon as possible.


