Engineering · 8 min read · April 27, 2026
Sequential decision-making reduces error drift in modular digital twins
Researchers frame error propagation in digital twins as a Markov decision process, comparing model-based and model-free approaches to optimize maintenance interventions.
Najafi and Mirzaei use Markov decision processes to decide when and how to intervene in digital twins to prevent error accumulation.
- Hidden Markov models infer latent error regimes from surrogate-physics residuals in the modular digital twin (a Bayesian filtering sketch follows this list).
- The MDP formulation treats inferred regimes as states and corrective actions as decisions with cost-benefit rewards.
- A POMDP extension accounts for imperfect regime classification using Bayesian belief updates and classifier confusion matrices.
- Dynamic programming solves both the MDP and the POMDP; the policies are validated via Gillespie stochastic simulation (see the value-iteration sketch below).
- Q-learning and REINFORCE are tested as model-free alternatives to assess learning without explicit model knowledge (a Q-learning sketch closes the examples below).
- The MDP policy achieves the highest cumulative reward; the POMDP recovers 95% of MDP performance under observation noise.
- The value of information is quantified: the gap between MDP and POMDP performance guides investment in classification-accuracy improvements.
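The inference and belief-update machinery in the first and third points reduces to Bayesian filtering over a small set of error regimes. The sketch below is a minimal illustration of that step, assuming three hypothetical regimes and made-up transition and confusion-matrix values; it is not the paper's calibrated model.

```python
import numpy as np

def update_belief(belief, transition, confusion, observed_label):
    """One Bayesian filtering step over latent error regimes.

    belief         : (n,) current probability over regimes
    transition     : (n, n) regime dynamics, P(next regime | current regime)
    confusion      : (n, n) classifier confusion matrix, P(reported label | true regime)
    observed_label : int, index of the regime the classifier reported
    """
    # Predict: propagate the belief through the regime dynamics.
    predicted = belief @ transition
    # Correct: weight by the likelihood of the (possibly wrong) reported label.
    likelihood = confusion[:, observed_label]
    posterior = predicted * likelihood
    return posterior / posterior.sum()

# Illustrative three-regime example; all numbers are placeholders, not from the paper.
regimes = ["nominal", "drifting", "faulty"]
T = np.array([[0.90, 0.08, 0.02],    # regime transitions under "do nothing"
              [0.05, 0.80, 0.15],
              [0.00, 0.10, 0.90]])
C = np.array([[0.85, 0.10, 0.05],    # rows: true regime, cols: reported label
              [0.15, 0.75, 0.10],
              [0.05, 0.15, 0.80]])

belief = np.array([1.0, 0.0, 0.0])                       # start certain of "nominal"
belief = update_belief(belief, T, C, observed_label=1)   # classifier reports "drifting"
print(dict(zip(regimes, belief.round(3))))
```

The same update serves two roles: as the HMM forward step when inferring regimes from residuals, and as the POMDP belief update when the classifier's output feeds the maintenance policy.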
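For the fully observed MDP, dynamic programming amounts to iterating the Bellman optimality backup until regime values converge. The following value-iteration sketch uses an invented action set ("wait", "recalibrate", "overhaul") and invented transition and reward numbers; the paper's actual state and action definitions may differ.

```python
import numpy as np

# Toy MDP: states are error regimes, actions are maintenance interventions.
# All numbers are illustrative placeholders, not the paper's calibrated values.
states = ["nominal", "drifting", "faulty"]

# P[a][s, s'] : regime transition probabilities under each action.
P = {
    "wait":        np.array([[0.90, 0.08, 0.02],
                             [0.05, 0.80, 0.15],
                             [0.00, 0.10, 0.90]]),
    "recalibrate": np.array([[0.95, 0.04, 0.01],
                             [0.60, 0.35, 0.05],
                             [0.10, 0.50, 0.40]]),
    "overhaul":    np.array([[0.99, 0.01, 0.00],
                             [0.95, 0.04, 0.01],
                             [0.90, 0.08, 0.02]]),
}
# R[a][s] : immediate reward = twin-accuracy benefit minus intervention cost.
R = {
    "wait":        np.array([ 1.0, -0.5, -3.0]),
    "recalibrate": np.array([ 0.5,  0.3, -1.5]),
    "overhaul":    np.array([-2.0, -1.0,  0.0]),
}

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Dynamic programming: iterate the Bellman optimality backup to convergence."""
    n = len(states)
    V = np.zeros(n)
    while True:
        Q = {a: R[a] + gamma * P[a] @ V for a in P}            # action values
        V_new = np.max(np.stack(list(Q.values())), axis=0)     # greedy backup
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    policy = {s: max(Q, key=lambda a: Q[a][i]) for i, s in enumerate(states)}
    return V, policy

V, policy = value_iteration(P, R)
print(policy)   # e.g. {'nominal': 'wait', 'drifting': 'recalibrate', 'faulty': 'overhaul'}
```

The resulting greedy policy is the model-based benchmark: it intervenes more aggressively as the believed regime degrades, which is the qualitative behavior the paper's reward gap measures the other methods against.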
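The model-free baseline can be illustrated with tabular Q-learning on the same toy setup: the agent never sees the transition matrices directly, only sampled regime changes and rewards. Again, all numbers and hyperparameters here are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same toy regime/intervention setup as the value-iteration sketch above;
# transition and reward numbers are illustrative, not the paper's.
P = np.array([  # P[a, s, s'] : regime transitions under each action
    [[0.90, 0.08, 0.02], [0.05, 0.80, 0.15], [0.00, 0.10, 0.90]],   # wait
    [[0.95, 0.04, 0.01], [0.60, 0.35, 0.05], [0.10, 0.50, 0.40]],   # recalibrate
    [[0.99, 0.01, 0.00], [0.95, 0.04, 0.01], [0.90, 0.08, 0.02]],   # overhaul
])
R = np.array([  # R[a, s] : accuracy benefit minus intervention cost
    [ 1.0, -0.5, -3.0],
    [ 0.5,  0.3, -1.5],
    [-2.0, -1.0,  0.0],
])

def q_learning(episodes=2000, horizon=50, alpha=0.1, gamma=0.95, eps=0.1):
    """Model-free control: learn Q(s, a) from sampled transitions only."""
    n_s, n_a = 3, 3
    Q = np.zeros((n_s, n_a))
    for _ in range(episodes):
        s = 0  # start each episode in the "nominal" regime
        for _ in range(horizon):
            # epsilon-greedy action selection
            a = int(rng.integers(n_a)) if rng.random() < eps else int(Q[s].argmax())
            s_next = rng.choice(n_s, p=P[a, s])       # sample the environment
            r = R[a, s]
            # temporal-difference update toward the bootstrapped target
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next
    return Q

Q = q_learning()
print(Q.argmax(axis=1))   # greedy action index per regime
```

In a toy environment like this the learned greedy policy typically matches value iteration given enough episodes; the comparison the paper draws is about how much cumulative reward is sacrificed when the model is not available and must be learned from interaction.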
Frequently asked
- What is the difference between the MDP and POMDP formulations? An MDP assumes you observe the true error regime perfectly; a POMDP accounts for classification uncertainty by maintaining a probability distribution over regimes, updated via Bayesian filtering. In the paper, the POMDP policy recovers 95% of MDP performance under realistic noise, showing that imperfect sensing is tolerable for maintenance decisions.