Deep Recurrent Neural Networks #
Vanilla RNNs introduce the hidden-state idea, but they struggle on longer and more complex sequences because gradients can vanish (or explode) as they are propagated across many time steps. Deep recurrent models extend the RNN idea in two important ways:
- make the recurrent architecture richer, for example by stacking multiple recurrent layers or by processing the sequence in both directions,
- use gates and memory cells to control what should be remembered, forgotten, updated, and exposed.
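The second idea, gating, can be sketched with a minimal GRU-style cell in NumPy. This is an illustrative toy, not a library implementation: the parameter names (`Wz`, `Uz`, etc.), dimensions, and initialization are made up for the example, and biases are omitted for brevity. The key point is that the update gate `z` decides how much of the old state to keep versus overwrite.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_cell(x, h, p):
    """One GRU-style step: gates control what is kept, reset, and updated."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)               # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)               # reset gate
    h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h))   # candidate state
    return z * h + (1 - z) * h_tilde                     # gated blend of old and new

rng = np.random.default_rng(1)
d_in, d_hid = 3, 4  # toy dimensions chosen for the example
# W* matrices act on the input, U* matrices act on the hidden state.
p = {k: rng.standard_normal((d_hid, d_in if k[0] == "W" else d_hid)) * 0.1
     for k in ["Wz", "Uz", "Wr", "Ur", "Wh", "Uh"]}

h = np.zeros(d_hid)
for t in range(5):
    h = gru_cell(rng.standard_normal(d_in), h, p)
print(h.shape)
```

With `z` close to 1 the cell mostly carries the old state forward unchanged, which is exactly what lets gradients survive over long spans.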
This is why practical recurrent modelling usually moves from a simple RNN to stacked RNNs, bidirectional RNNs, GRUs, or LSTMs.
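The stacking idea above can also be sketched directly: a second recurrent layer simply treats the first layer's hidden states as its input sequence. This is a minimal NumPy sketch with made-up dimensions and plain tanh cells; real deep RNNs would use GRU or LSTM cells and a framework's optimized kernels.

```python
import numpy as np

def rnn_layer(xs, W_x, W_h, b):
    """Run one tanh RNN layer over a sequence; return the hidden state at each step."""
    h = np.zeros(W_h.shape[0])
    outs = []
    for x in xs:
        h = np.tanh(W_x @ x + W_h @ h + b)
        outs.append(h)
    return outs

rng = np.random.default_rng(0)
T, d_in, d_hid = 5, 3, 4  # toy sequence length and dimensions
xs = [rng.standard_normal(d_in) for _ in range(T)]

# Layer 1 maps raw inputs to hidden states; layer 2's "inputs" are layer 1's outputs.
params1 = (rng.standard_normal((d_hid, d_in)) * 0.1,
           rng.standard_normal((d_hid, d_hid)) * 0.1,
           np.zeros(d_hid))
params2 = (rng.standard_normal((d_hid, d_hid)) * 0.1,
           rng.standard_normal((d_hid, d_hid)) * 0.1,
           np.zeros(d_hid))

h1 = rnn_layer(xs, *params1)   # first recurrent layer
h2 = rnn_layer(h1, *params2)   # stacked second layer
print(len(h2), h2[-1].shape)
```

A bidirectional variant would run a second copy of each layer over the reversed sequence and concatenate the two hidden states at every step.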