Artificial Intelligence · 3 min read · April 24, 2026
VLAA-GUI: Framework Stops Agents from Looping and Guessing
A modular GUI automation system uses verification, loop detection, and search to prevent autonomous agents from declaring false success or repeating failed actions.
Source: arxiv/cs.AI · Qijun Han, Haoqin Tu, Zijun Wang, Haoyue Dai, Yiyang Zhou, Nancy Lau, Alvaro A. Cardenas, Yuhui Xu, Ran Xu, Caiming Xiong, Zeyu Zheng, Huaxiu Yao, Yuyin Zhou, Cihang Xie
VLAA-GUI adds mandatory verification, loop-breaking, and search modules to prevent GUI agents from premature success claims and repetitive failures.
- Completeness Verifier enforces visual evidence before agents declare task completion.
- Loop Breaker detects repeated failures and forces strategy or interaction mode changes.
- Search Agent queries LLMs for unfamiliar workflows when agents encounter unknown tasks.
- Coding Agent and Grounding Agent handle specialized actions on demand.
- Achieves 77.5% on OSWorld and 61.0% on WindowsAgentArena benchmarks.
- Three of five tested backbones exceed human performance (72.4%) on OSWorld.
- Ablation shows Loop Breaker cuts wasted steps by roughly half for loop-prone models.
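The Loop Breaker idea from the list above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `LoopBreaker`, the `window` threshold, the screen-hash comparison, and the two-stage escalation (`switch_mode`, then `change_strategy`) are all assumptions made for illustration. It only shows the general pattern of spotting identical steps that produce no screen change and forcing a different approach.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LoopBreaker:
    """Hypothetical loop detector; names and thresholds are illustrative only."""

    window: int = 3  # identical no-progress steps tolerated before intervening
    history: deque = field(default_factory=deque)
    escalations: int = 0

    def observe(self, action: str, screen_hash: str) -> str:
        """Record one step; return 'continue', 'switch_mode', or 'change_strategy'."""
        self.history.append((action, screen_hash))
        if len(self.history) > self.window:
            self.history.popleft()

        # Stuck: the last `window` steps used the same action on the same screen.
        stuck = len(self.history) == self.window and len(set(self.history)) == 1
        if not stuck:
            return "continue"

        # Reset the window so the next intervention needs fresh evidence.
        self.history.clear()
        self.escalations += 1
        # Assumed policy: first try another interaction mode, then replan.
        return "switch_mode" if self.escalations == 1 else "change_strategy"


if __name__ == "__main__":
    breaker = LoopBreaker(window=3)
    for _ in range(6):
        print(breaker.observe("click(10, 20)", "screen-a"))
```

In this sketch the first detected loop asks the agent to switch interaction mode (for example, from mouse clicks to keyboard shortcuts), and any later loop asks for a new strategy. A real system would likely compare screenshots more robustly than by exact hash.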
Frequently asked questions
- The Completeness Verifier is a mandatory component that checks whether a GUI task is truly complete before the agent stops. It uses an agent-level verifier to cross-examine the agent's completion claim against decision rules, rejecting claims that lack direct visual evidence from the UI. This prevents agents from declaring success prematurely.
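The rule described in this answer, rejecting completion claims that lack direct visual evidence, can be sketched in Python as follows. The `Evidence` type, the `"screenshot"` and `"inference"` source labels, and the `verify_completion` function are hypothetical names chosen for this example; the actual decision rules used by VLAA-GUI are not given in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence record attached to a completion claim."""

    source: str  # assumed labels: "screenshot" (seen in the UI) or "inference"
    description: str


def verify_completion(claimed_done: bool, evidence: list[Evidence]) -> tuple[bool, str]:
    """Accept a completion claim only if it is backed by direct UI observation."""
    if not claimed_done:
        return False, "agent has not claimed completion"

    # Keep only evidence observed directly on screen; inferred evidence is ignored.
    visual = [e for e in evidence if e.source == "screenshot"]
    if not visual:
        return False, "rejected: no direct visual evidence from the UI"
    return True, f"accepted: {len(visual)} visual observation(s)"


if __name__ == "__main__":
    guess = [Evidence("inference", "save command was issued, so the file should exist")]
    seen = [Evidence("screenshot", "file manager lists report.pdf")]
    print(verify_completion(True, guess))
    print(verify_completion(True, seen))
```

Under these assumptions, a claim supported only by the agent's own reasoning is sent back for more work, while one backed by something visible in the interface is allowed to end the task.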