In neural networks, a gating mechanism is an architectural motif for controlling the flow of activation and gradient signals. Gating mechanisms are most prominently used in recurrent neural networks (RNNs), but have also found applications in other architectures.

Gating mechanisms are the centerpiece of long short-term memory (LSTM).[1] The LSTM was proposed to address the vanishing gradient problem, which made training RNNs on long sequences unstable. An LSTM contains three gates: the input gate, the forget gate, and the output gate. The input gate controls how much new information flows into the memory cell, the forget gate controls how much of the cell's content is retained from the previous time step, and the output gate controls how much of the cell's content is passed on as the hidden state to the next time step and the next layer.

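The equations for LSTM are:[2]

$$
\begin{aligned}
\mathbf{I}_t &= \sigma(\mathbf{X}_t \mathbf{W}_{xi} + \mathbf{H}_{t-1} \mathbf{W}_{hi} + \mathbf{b}_i) \\
\mathbf{F}_t &= \sigma(\mathbf{X}_t \mathbf{W}_{xf} + \mathbf{H}_{t-1} \mathbf{W}_{hf} + \mathbf{b}_f) \\
\mathbf{O}_t &= \sigma(\mathbf{X}_t \mathbf{W}_{xo} + \mathbf{H}_{t-1} \mathbf{W}_{ho} + \mathbf{b}_o) \\
\tilde{\mathbf{C}}_t &= \tanh(\mathbf{X}_t \mathbf{W}_{xc} + \mathbf{H}_{t-1} \mathbf{W}_{hc} + \mathbf{b}_c) \\
\mathbf{C}_t &= \mathbf{F}_t \odot \mathbf{C}_{t-1} + \mathbf{I}_t \odot \tilde{\mathbf{C}}_t \\
\mathbf{H}_t &= \mathbf{O}_t \odot \tanh(\mathbf{C}_t)
\end{aligned}
$$

Here, $\odot$ represents elementwise multiplication and $\sigma$ is the sigmoid function. $\mathbf{X}_t$ is the input at time step $t$, $\mathbf{H}_t$ is the hidden state, $\mathbf{C}_t$ is the cell state, $\mathbf{I}_t$, $\mathbf{F}_t$, and $\mathbf{O}_t$ are the input, forget, and output gates, and the $\mathbf{W}$ and $\mathbf{b}$ terms are learned weights and biases.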
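As an illustration, a single LSTM time step can be sketched in plain Python. This is a minimal sketch that assumes the common formulation in which each gate is a sigmoid of an affine function of the current input and the previous hidden state; the function names (`sigmoid`, `affine`, `lstm_step`) and the `params` layout are chosen here for illustration and do not come from any particular library.

```python
import math

def sigmoid(v):
    """Elementwise logistic sigmoid of a list of floats."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def affine(x, h, W_x, W_h, b):
    """Return x @ W_x + h @ W_h + b, with row vectors stored as lists."""
    return [
        b[j]
        + sum(x[i] * W_x[i][j] for i in range(len(x)))
        + sum(h[i] * W_h[i][j] for i in range(len(h)))
        for j in range(len(b))
    ]

def lstm_step(x, h_prev, c_prev, params):
    """One time step. params maps "i", "f", "o", "c" to (W_x, W_h, b)."""
    pre = {k: affine(x, h_prev, *params[k]) for k in ("i", "f", "o", "c")}
    i = sigmoid(pre["i"])                       # input gate
    f = sigmoid(pre["f"])                       # forget gate
    o = sigmoid(pre["o"])                       # output gate
    c_tilde = [math.tanh(a) for a in pre["c"]]  # candidate cell state
    # New cell state: keep a fraction f of the old cell state and
    # admit a fraction i of the candidate.
    c = [f_k * c_k + i_k * g_k
         for f_k, c_k, i_k, g_k in zip(f, c_prev, i, c_tilde)]
    # New hidden state: the output gate scales the squashed cell state.
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c

# Toy check: input size 2, hidden size 1, all weights and biases zero,
# so every gate equals sigmoid(0) = 0.5 and the candidate is tanh(0) = 0.
zero = ([[0.0], [0.0]], [[0.0]], [0.0])
params = {"i": zero, "f": zero, "o": zero, "c": zero}
h, c = lstm_step([1.0, -1.0], [0.0], [2.0], params)
print(h, c)  # the new cell state is half of the previous one
```

In the toy check the forget gate is exactly 0.5, so half of the previous cell state survives; a strongly positive forget-gate bias would instead carry the cell state through almost unchanged.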

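The gated recurrent unit (GRU) simplifies the LSTM.[3] Compared to the LSTM, the GRU has just two gates, the reset gate and the update gate, and it merges the cell state and the hidden state into a single state. The reset gate roughly corresponds to the forget gate, and the update gate roughly corresponds to the input gate. The output gate is removed. There are several variants of GRU. One particular variant has these equations:[4]

$$
\begin{aligned}
\mathbf{R}_t &= \sigma(\mathbf{X}_t \mathbf{W}_{xr} + \mathbf{H}_{t-1} \mathbf{W}_{hr} + \mathbf{b}_r) \\
\mathbf{Z}_t &= \sigma(\mathbf{X}_t \mathbf{W}_{xz} + \mathbf{H}_{t-1} \mathbf{W}_{hz} + \mathbf{b}_z) \\
\tilde{\mathbf{H}}_t &= \tanh(\mathbf{X}_t \mathbf{W}_{xh} + (\mathbf{R}_t \odot \mathbf{H}_{t-1}) \mathbf{W}_{hh} + \mathbf{b}_h) \\
\mathbf{H}_t &= \mathbf{Z}_t \odot \mathbf{H}_{t-1} + (1 - \mathbf{Z}_t) \odot \tilde{\mathbf{H}}_t
\end{aligned}
$$

where $\mathbf{R}_t$ is the reset gate, $\mathbf{Z}_t$ is the update gate, $\tilde{\mathbf{H}}_t$ is the candidate hidden state, and $\odot$ denotes elementwise multiplication.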
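The following is a minimal sketch of one GRU time step in plain Python. It assumes the variant in which the reset gate is applied to the previous hidden state before the recurrent matrix multiplication, and in which an update gate close to 1 keeps the previous state; other variants swap these conventions. The names `vecmat`, `add`, `gru_step`, and `make_params` are illustrative only.

```python
import math

def sigmoid(v):
    """Elementwise logistic sigmoid of a list of floats."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def vecmat(x, W):
    """Row vector times matrix: returns x @ W."""
    return [sum(x[i] * W[i][j] for i in range(len(x)))
            for j in range(len(W[0]))]

def add(*vectors):
    """Elementwise sum of several equal-length vectors."""
    return [sum(terms) for terms in zip(*vectors)]

def gru_step(x, h_prev, params):
    """One time step. params maps "r", "z", "h" to (W_x, W_h, b)."""
    W_xr, W_hr, b_r = params["r"]
    W_xz, W_hz, b_z = params["z"]
    W_xh, W_hh, b_h = params["h"]
    r = sigmoid(add(vecmat(x, W_xr), vecmat(h_prev, W_hr), b_r))  # reset gate
    z = sigmoid(add(vecmat(x, W_xz), vecmat(h_prev, W_hz), b_z))  # update gate
    # The reset gate decides how much of the old state feeds the candidate.
    reset_h = [r_k * h_k for r_k, h_k in zip(r, h_prev)]
    h_tilde = [math.tanh(a)
               for a in add(vecmat(x, W_xh), vecmat(reset_h, W_hh), b_h)]
    # The update gate interpolates between the old state and the candidate.
    return [z_k * h_k + (1.0 - z_k) * g_k
            for z_k, h_k, g_k in zip(z, h_prev, h_tilde)]

def make_params(b_z, b_h):
    """Hypothetical 1x1 toy parameters: zero weights, chosen biases."""
    zero_w = [[0.0]]
    return {"r": (zero_w, zero_w, [0.0]),
            "z": (zero_w, zero_w, [b_z]),
            "h": (zero_w, zero_w, [b_h])}

# A saturated update gate (z near 1) keeps the previous state,
# while z near 0 overwrites it with the candidate tanh(b_h).
keep = gru_step([1.0], [0.7], make_params(b_z=20.0, b_h=0.5))
overwrite = gru_step([1.0], [0.7], make_params(b_z=-20.0, b_h=0.5))
print(keep, overwrite)
```

Because the hidden state is a convex combination of the old state and the candidate, the GRU needs no separate cell state.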

Gated Linear Units (GLUs)[5] adapt the gating mechanism for use in feedforward networks, often within Transformer-based architectures. They are defined as:

$$
\mathrm{GLU}(a, b) = a \odot \sigma(b)
$$

where $a$ is the first input and $b$ is the second input. The $\sigma$ represents the sigmoid activation function.

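Replacing $\sigma$ with other activation functions leads to variants of GLU:

$$
\begin{aligned}
\mathrm{ReGLU}(a, b) &= a \odot \mathrm{ReLU}(b) \\
\mathrm{GEGLU}(a, b) &= a \odot \mathrm{GELU}(b) \\
\mathrm{SwiGLU}(a, b, \beta) &= a \odot \mathrm{Swish}_\beta(b)
\end{aligned}
$$

where ReLU, GELU, and Swish are different activation functions (see their main pages for definitions).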
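The GLU family differs only in the activation applied to the gating input, which a short sketch makes concrete. The code below assumes the standard definitions of ReLU, GELU (exact form using the error function), and Swish with parameter β; the helper name `gated_unit` is illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def gelu(x):
    # Exact GELU: x times the standard normal CDF evaluated at x.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def swish(x, beta=1.0):
    return x * sigmoid(beta * x)

def gated_unit(a, b, activation=sigmoid):
    """Elementwise a * activation(b); the default activation gives the GLU."""
    return [a_k * activation(b_k) for a_k, b_k in zip(a, b)]

a = [1.0, 2.0, 3.0]   # first input (the values being gated)
b = [0.0, -1.0, 2.0]  # second input (drives the gate)

glu = gated_unit(a, b)
reglu = gated_unit(a, b, relu)
geglu = gated_unit(a, b, gelu)
swiglu = gated_unit(a, b, lambda v: swish(v, beta=1.0))
print(glu, reglu, geglu, swiglu)
```

Only the sigmoid gate is confined to the interval (0, 1); the ReLU, GELU, and Swish gates are unbounded above, so these variants can scale the first input up as well as down.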

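In a Transformer, such gating units are often used in the feedforward modules. For a single vector input $x$, this results in:[6]

$$
\begin{aligned}
\mathrm{GLU}(x, W, V, b, c) &= \sigma(xW + b) \odot (xV + c) \\
\mathrm{ReGLU}(x, W, V, b, c) &= \mathrm{ReLU}(xW + b) \odot (xV + c) \\
\mathrm{GEGLU}(x, W, V, b, c) &= \mathrm{GELU}(xW + b) \odot (xV + c) \\
\mathrm{SwiGLU}(x, W, V, b, c, \beta) &= \mathrm{Swish}_\beta(xW + b) \odot (xV + c)
\end{aligned}
$$

where $W$ and $V$ are weight matrices and, in this context, $b$ and $c$ are bias vectors. The projection $xW + b$ drives the gate and the projection $xV + c$ is the value being gated.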
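A gated feedforward module can be sketched as follows. This is a minimal illustration under stated assumptions: the gate uses the Swish activation (the SwiGLU variant), and the final projection `W_out` back to the model dimension is an assumption about how the gated layer is typically embedded in a feedforward block. All names and the toy weights are hypothetical.

```python
import math

def vecmat(x, W):
    """Row vector times matrix: returns x @ W."""
    return [sum(x[i] * W[i][j] for i in range(len(x)))
            for j in range(len(W[0]))]

def swish(v, beta=1.0):
    # Swish_beta(v) = v * sigmoid(beta * v)
    return v / (1.0 + math.exp(-beta * v))

def gated_ffn(x, W, V, b, c, W_out, activation=swish):
    """Compute (activation(x W + b) * (x V + c)) W_out for one vector x."""
    gate = [activation(p + q) for p, q in zip(vecmat(x, W), b)]
    value = [p + q for p, q in zip(vecmat(x, V), c)]
    hidden = [g * u for g, u in zip(gate, value)]  # elementwise gating
    return vecmat(hidden, W_out)                   # project back down

# Toy sizes: model dimension 2, hidden dimension 3.
x = [1.0, 2.0]
W = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]]       # x W = [1, 2, -1]
V = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]        # x V = [1, 1, 1]
b = [0.0, 0.0, 0.0]
c = [0.0, 0.0, 0.0]
W_out = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # keeps the first two hidden units

y = gated_ffn(x, W, V, b, c, W_out)
print(y)
```

Compared with an ungated feedforward block, the gated form uses two input projections instead of one, so the hidden dimension is often reduced to keep the parameter count comparable.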

Gating mechanisms are used in highway networks, which were designed by unrolling an LSTM.
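A highway layer can be sketched as a gated mix of a transformed input and the untouched input. The sketch below assumes the usual formulation in which a sigmoid transform gate T(x) weights the transformed signal H(x) and the complementary carry gate 1 − T(x) weights the input itself; the choice of tanh for H and all names are illustrative assumptions.

```python
import math

def vecmat(x, W):
    """Row vector times matrix: returns x @ W."""
    return [sum(x[i] * W[i][j] for i in range(len(x)))
            for j in range(len(W[0]))]

def highway_layer(x, W_h, b_h, W_t, b_t):
    """y = T(x) * H(x) + (1 - T(x)) * x, all operations elementwise."""
    # Transformed signal H(x); tanh is an assumed choice of nonlinearity.
    h = [math.tanh(p + q) for p, q in zip(vecmat(x, W_h), b_h)]
    # Transform gate T(x) in (0, 1).
    t = [1.0 / (1.0 + math.exp(-(p + q)))
         for p, q in zip(vecmat(x, W_t), b_t)]
    return [t_k * h_k + (1.0 - t_k) * x_k
            for t_k, h_k, x_k in zip(t, h, x)]

zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [0.3, -0.8]

# A strongly negative gate bias closes the transform gate: the layer
# carries its input through almost unchanged.
carried = highway_layer(x, zeros, [0.0, 0.0], zeros, [-20.0, -20.0])
# A strongly positive gate bias opens it: the output is close to H(x),
# which here is tanh(0.5) in both coordinates.
transformed = highway_layer(x, zeros, [0.5, 0.5], zeros, [20.0, 20.0])
print(carried, transformed)
```

The carry path plays the same role as the LSTM cell state: when the gate is closed, both activations and gradients pass through the layer unmodified.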

Channel gating[7] uses a gate to control the flow of information through different channels inside a convolutional neural network (CNN).

