Engineering · 8 min read · April 27, 2026

Sequential decision-making reduces error drift in modular digital twins

Researchers frame error propagation in digital twins as a Markov decision process, comparing model-based and model-free approaches to optimize maintenance interventions.

Source: arxiv/cs.LG · Annice Najafi, Shokoufeh Mirzaei · open original ↗

Najafi and Mirzaei use Markov decision processes to decide when and how to intervene in digital twins to prevent error accumulation.

  • Hidden Markov Models infer latent error regimes from surrogate-physics residuals in modular digital twins.
  • MDP formulation treats inferred regimes as states and corrective actions as decisions with cost-benefit rewards.
  • POMDP extension accounts for imperfect regime classification using Bayesian belief updates and confusion matrices.
  • Dynamic programming solves both the MDP and the POMDP; policies are validated via Gillespie stochastic simulation.
  • Q-learning and REINFORCE tested as model-free alternatives to assess learning without explicit model knowledge.
  • MDP policy achieves highest cumulative reward; POMDP recovers 95% of MDP performance under observation noise.
  • Information value quantified: gap between MDP and POMDP guides investment in classification accuracy improvements.
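The MDP formulation above (inferred regimes as states, corrective actions as decisions, cost-benefit rewards, solved by dynamic programming) can be sketched with value iteration on a toy regime model. All states, actions, transition probabilities, and rewards below are invented for illustration; they are not the paper's parameters.

```python
import numpy as np

# Hypothetical regime MDP: latent error regimes as states,
# maintenance interventions as actions. Numbers are invented.
STATES = ["nominal", "drift", "fault"]
ACTIONS = ["continue", "recalibrate"]

# P[a][s, s']: transition probabilities under each action.
P = {
    "continue":    np.array([[0.90, 0.09, 0.01],
                             [0.00, 0.80, 0.20],
                             [0.00, 0.00, 1.00]]),   # fault is absorbing
    "recalibrate": np.array([[0.99, 0.01, 0.00],
                             [0.90, 0.10, 0.00],
                             [0.70, 0.20, 0.10]]),   # intervention resets
}
# R[a][s]: twin-accuracy benefit minus intervention cost.
R = {
    "continue":    np.array([1.0, 0.2, -2.0]),
    "recalibrate": np.array([0.5, 0.4, -0.5]),
}

def value_iteration(gamma=0.95, tol=1e-8):
    """Dynamic programming over the regime MDP (infinite horizon)."""
    V = np.zeros(len(STATES))
    while True:
        Q = np.array([R[a] + gamma * P[a] @ V for a in ACTIONS])
        V_new = Q.max(axis=0)
        if np.abs(V_new - V).max() < tol:
            return V_new, [ACTIONS[i] for i in Q.argmax(axis=0)]
        V = V_new

V, policy = value_iteration()
print(dict(zip(STATES, policy)))
# → {'nominal': 'continue', 'drift': 'recalibrate', 'fault': 'recalibrate'}
```

Even with made-up numbers the cost-benefit trade-off shows through: intervening is worth its fixed cost only once the twin has left the nominal regime.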

Frequently asked

  • How does the POMDP differ from the MDP here? An MDP assumes you observe the true error regime perfectly; a POMDP accounts for classification uncertainty by maintaining a probability distribution over regimes, updated via Bayesian filtering. In the paper, the POMDP recovers 95% of MDP performance under realistic noise, showing that imperfect sensing is tolerable for maintenance decisions.
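The belief update described above (Bayesian filtering with a confusion matrix as the observation model) can be sketched in a few lines. The transition and confusion matrices here are invented for the example, not taken from the paper.

```python
import numpy as np

# Regime transition model under a "continue" action (invented numbers).
P_continue = np.array([[0.90, 0.09, 0.01],
                       [0.00, 0.80, 0.20],
                       [0.00, 0.00, 1.00]])
# C[s, o]: probability the classifier reports regime o when the truth is s,
# i.e. its confusion matrix serving as the POMDP observation model.
C = np.array([[0.90, 0.08, 0.02],
              [0.10, 0.85, 0.05],
              [0.05, 0.15, 0.80]])

def belief_update(b, obs, P=P_continue):
    """One step of Bayesian filtering: predict the next regime through
    the dynamics, then correct with the noisy classifier output."""
    predicted = b @ P                  # prior over the next regime
    posterior = predicted * C[:, obs]  # weight by observation likelihood
    return posterior / posterior.sum()

b = np.array([1.0, 0.0, 0.0])  # start certain the twin is "nominal"
b = belief_update(b, obs=1)    # classifier reports "drift"
print(np.round(b, 3))
```

Planning over these beliefs rather than over raw classifier labels is what lets the POMDP policy tolerate misclassification: a single noisy "drift" report shifts probability mass without committing the controller to an intervention.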
