AI · 4 min read · May 3, 2026

Selective-Update RNNs Match Transformers While Using Less Memory

A new RNN architecture learns when to update internal state, preserving memory across long sequences and reducing computational waste on redundant input.

Source: arxiv/cs.LG · Bojian Yin, Shurong Wang, Haoyu Tan, Sander Bohte, Federico Corradi, Guoqi Li

Selective-Update RNNs preserve memory during low-information periods, matching Transformer accuracy with lower computational cost.

  • Standard RNNs update state at every step, causing memory decay and wasting computation on static input.
  • suRNNs use neuron-level binary switches that activate only for informative events, decoupling updates from sequence length.
  • Preserved memory during silence or noise creates direct gradient paths to distant past events.
  • Experiments show suRNNs match or exceed Transformer performance on Long Range Arena and WikiText benchmarks.
  • Each neuron learns its own update timescale, aligning model behavior with actual information density in data.
  • Memory is held exactly unchanged during low-information intervals, reducing overwriting and signal loss.
  • More efficient long-term storage than Transformers while retaining competitive accuracy on long-range dependencies.
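The selective-update rule described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's exact parameterization: the weight names (`W_h`, `W_x`, `W_g`), the threshold form of the switch, and the `tanh` candidate are all assumptions made for clarity.

```python
import numpy as np

def surnn_step(h, x, W_h, W_x, W_g, b_g):
    """One step of a selective-update RNN (illustrative sketch).

    Each neuron carries a binary switch. When the switch is 0, that
    neuron's state is copied forward exactly, so low-information
    inputs leave memory untouched.
    """
    candidate = np.tanh(W_h @ h + W_x @ x)       # proposed new state
    gate = (W_g @ x + b_g > 0).astype(h.dtype)   # per-neuron binary switch (assumed form)
    return gate * candidate + (1.0 - gate) * h   # switch open: update; closed: hold exactly
```

With a silent input (all zeros) and a negative gate bias, every switch stays closed and the function returns `h` bit-for-bit, which is the decoupling of updates from sequence length that the bullets describe.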

Frequently asked

  • How do suRNNs avoid the memory decay that standard RNNs face? suRNNs use neuron-level binary switches that activate only when the input contains new information. During low-information periods (silence, noise, or static input), the switches remain closed and the internal memory stays unchanged. This prevents the model from overwriting past information and creates a direct path for gradients to flow backward across time.
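The direct-gradient claim can be checked with a toy simulation (illustrative, not from the paper): while a neuron's switch is closed, each step is the identity map on that neuron's state, so a perturbation to the state, and by the same token a gradient, passes through an arbitrarily long low-information interval without decaying.

```python
import numpy as np

def hold_step(h, gate, candidate):
    # gate = 0 neurons copy their state forward exactly (identity Jacobian),
    # so gradients through them are multiplied by 1 at every step and
    # neither vanish nor explode across the hold interval.
    return gate * candidate + (1.0 - gate) * h

h0 = np.array([0.5, -1.2, 2.0])
eps = np.array([1e-3, 0.0, 0.0])   # small perturbation to the first neuron
gate = np.zeros(3)                 # all switches closed ("silence")
cand = np.ones(3)                  # candidate is irrelevant while closed

h, hp = h0, h0 + eps
for _ in range(1000):              # 1000 consecutive low-information steps
    h = hold_step(h, gate, cand)
    hp = hold_step(hp, gate, cand)

print(hp - h)  # the 1e-3 perturbation survives all 1000 steps
```

A standard RNN applies a full nonlinear transition at every step, so the same perturbation would typically be shrunk (or amplified) a thousand times over; here the effective Jacobian across the interval is exactly the identity.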
