Sequence Modelling

Recurrent Neural Networks #

Recurrent Neural Networks (RNNs) are neural networks designed for sequential data, where the order of inputs matters and the model must use information from earlier time steps to interpret later ones. Unlike a feedforward network, an RNN does not process each input in isolation. It carries a hidden state from one time step to the next, so the network can build a running summary of what it has seen so far.
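The running-summary idea can be sketched in a few lines of NumPy. This is a minimal illustration, not a trainable model: the sizes and the randomly initialised weights `W_xh`, `W_hh`, and `b_h` are hypothetical, and the point is only that each step's hidden state depends on both the current input and the previous hidden state.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration.
input_size, hidden_size, seq_len = 4, 8, 5

# Parameters of a single recurrent layer (randomly initialised, not trained).
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

def rnn_forward(xs, h0):
    """Run a vanilla RNN over a sequence, returning all hidden states."""
    h = h0
    states = []
    for x in xs:  # one step per element of the sequence
        # The new state mixes the current input with the previous state,
        # so h accumulates a summary of everything seen so far.
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        states.append(h)
    return states

xs = rng.normal(size=(seq_len, input_size))
states = rnn_forward(xs, h0=np.zeros(hidden_size))
print(len(states), states[-1].shape)  # → 5 (8,)
```

Note that the same weight matrices are reused at every time step; only the hidden state changes as the sequence is consumed.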

Deep Recurrent Neural Networks #

Vanilla RNNs introduce the hidden-state idea, but they struggle on longer and more complex sequences because gradients can vanish or explode as they are propagated back through many time steps. Deep recurrent models extend the RNN idea in two important ways:

  1. make the recurrent architecture richer, for example by stacking multiple recurrent layers or by processing the sequence in both directions,
  2. use gates and memory cells to control what is remembered, forgotten, updated, and exposed at each step.

This is why practical recurrent modelling usually moves from a simple RNN to stacked RNNs, bidirectional RNNs, GRUs, or LSTMs.
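The gating idea from point 2 can be made concrete with a single GRU step. The sketch below uses the standard GRU update equations, but the sizes and the randomly initialised weights are hypothetical placeholders: the reset gate `r` controls how much of the old state feeds the candidate, and the update gate `z` interpolates between keeping the old state and adopting the candidate.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes for illustration.
input_size, hidden_size = 4, 8

def init(shape):
    # Small random weights; a real model would learn these.
    return rng.normal(scale=0.1, size=shape)

# One weight set per gate: update (z), reset (r), and the candidate state.
W_z, U_z, b_z = init((hidden_size, input_size)), init((hidden_size, hidden_size)), np.zeros(hidden_size)
W_r, U_r, b_r = init((hidden_size, input_size)), init((hidden_size, hidden_size)), np.zeros(hidden_size)
W_h, U_h, b_h = init((hidden_size, input_size)), init((hidden_size, hidden_size)), np.zeros(hidden_size)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x, h_prev):
    """One GRU step: gates decide what to forget, update, and expose."""
    z = sigmoid(W_z @ x + U_z @ h_prev + b_z)              # update gate
    r = sigmoid(W_r @ x + U_r @ h_prev + b_r)              # reset gate
    h_tilde = np.tanh(W_h @ x + U_h @ (r * h_prev) + b_h)  # candidate state
    # Interpolate between the previous state and the candidate.
    return (1 - z) * h_prev + z * h_tilde

h = np.zeros(hidden_size)
for x in rng.normal(size=(5, input_size)):
    h = gru_step(x, h)
print(h.shape)  # → (8,)
```

An LSTM follows the same gating principle with a separate memory cell and an extra output gate; stacked and bidirectional variants simply compose cells like this one across layers or directions.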