Reading notes: "Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation"

## Contributions

• a novel RNN Encoder–Decoder

It can handle variable-length sequences.

• a novel hidden unit

  • reset gate
  • update gate

## RNN Encoder–Decoder

From a probabilistic perspective, this new model is a general method to learn the conditional distribution over a variable-length sequence conditioned on yet another variable-length sequence.
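In symbols, following the paper's notation, where $T$ and $T'$ are the input and output lengths and may differ:

$$
p(y_1, \dots, y_{T'} \mid x_1, \dots, x_T)
$$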

### Decoder

The decoder is another RNN, trained to generate the output sequence by predicting the next symbol $y_t$ given the hidden state $h_t$.
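The decoder equations from the paper, rewritten with this note's $h_t$ subscripts (the paper writes $h_{\langle t \rangle}$). Here $c$ is the fixed-length summary of the input sequence produced by the encoder, $f$ is the recurrent activation, and $g$ is an output function that must yield valid probabilities (for example via a softmax):

$$
h_t = f(h_{t-1}, y_{t-1}, c)
$$

$$
P(y_t \mid y_{t-1}, \dots, y_1, c) = g(h_t, y_{t-1}, c)
$$

Unlike a plain RNN, both $h_t$ and $y_t$ are conditioned on the previous output $y_{t-1}$ and on the summary $c$.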

## Hidden Unit

### reset gate

In this formulation, when the reset gate is close to 0, the hidden state is forced to ignore the previous hidden state and reset with the current input only. This effectively allows the hidden state to drop any information that is found to be irrelevant later on, thus allowing a more compact representation.
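The formulation referred to here, written in vector form (the paper states it per hidden unit $j$). $x$ is the current input, $h_{t-1}$ the previous hidden state, $\sigma$ the logistic sigmoid, $\odot$ the elementwise product, and $W_r$, $U_r$, $W$, $U$ learned weight matrices:

$$
r = \sigma(W_r x + U_r h_{t-1})
$$

$$
\tilde{h}_t = \phi\big(W x + U (r \odot h_{t-1})\big)
$$

When $r \approx 0$, the second term vanishes and the candidate state $\tilde{h}_t$ depends on $x$ alone.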

### update gate

The update gate controls how much information from the previous hidden state will carry over to the current hidden state.
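A minimal sketch of one step of the gated hidden unit, to make the role of both gates concrete. It follows the paper's equations $z = \sigma(W_z x + U_z h_{t-1})$ and $h_t = z \odot h_{t-1} + (1 - z) \odot \tilde{h}_t$, with no bias terms. The function names (`gru_step`, `matvec`, `sigmoid`), the dictionary of weights, and the choice of `tanh` for $\phi$ are assumptions made for illustration, and plain Python lists stand in for vectors and matrices.

```python
import math


def sigmoid(a):
    """Numerically stable logistic sigmoid for a scalar."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def gru_step(x, h_prev, params):
    """One step of the gated hidden unit (illustrative, no bias terms).

    params is a dict with keys W, U, W_r, U_r, W_z, U_z (hypothetical layout).
    """
    # Reset gate: decides how much of h_prev enters the candidate state.
    r = [sigmoid(a + b)
         for a, b in zip(matvec(params["W_r"], x), matvec(params["U_r"], h_prev))]
    # Update gate: decides how much of h_prev is copied to the new state.
    z = [sigmoid(a + b)
         for a, b in zip(matvec(params["W_z"], x), matvec(params["U_z"], h_prev))]
    # Candidate state, computed from the input and the reset-gated previous state.
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_tilde = [math.tanh(a + b)
               for a, b in zip(matvec(params["W"], x), matvec(params["U"], gated))]
    # Interpolate: z close to 1 keeps h_prev, z close to 0 takes the candidate.
    return [zi * hp + (1.0 - zi) * ht for zi, hp, ht in zip(z, h_prev, h_tilde)]
```

Calling `gru_step(x, h_prev, params)` repeatedly over an input sequence, feeding each returned state back in as `h_prev`, gives the recurrent computation. With the update gate saturated near 1 the state is carried over unchanged, and with both gates near 0 the new state depends on the current input only.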

## Future

