AI · 4 min read · April 21, 2026

LLMs complement but don't replace classical hyperparameter optimization

A study comparing LLM agents to classical algorithms like CMA-ES and TPE finds hybrid approaches work best for tuning model hyperparameters under compute constraints.

Source: arxiv/cs.LG · Fabio Ferreira, Lucca Wobbe, Arjun Krishnakumar, Frank Hutter, Arber Zela

Classical hyperparameter optimizers outperform pure LLM agents, but hybrid methods combining both achieve superior results.

  • CMA-ES and TPE consistently beat LLM-based agents when searching fixed hyperparameter spaces.
  • LLMs struggle to maintain optimization state across multiple trials and experiments.
  • Allowing LLMs to edit training code directly narrows but doesn't close the performance gap.
  • Centaur, a hybrid pairing CMA-ES state with an LLM, outperforms all pure methods tested.
  • Even a 0.8B parameter LLM in Centaur beats frontier models used alone.
  • Classical methods lack domain knowledge that LLMs possess about code and tuning strategies.
  • Search diversity matters less than avoiding out-of-memory failures under fixed budgets.
  • LLMs work best as complements to classical optimizers, not replacements.
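The division of labor described above — a classical optimizer keeping persistent search state while an LLM contributes domain knowledge such as vetoing configurations likely to fail — can be sketched in miniature. This is a hypothetical illustration, not Centaur's actual interface: the `ask`/`tell` loop stands in for CMA-ES's Gaussian search distribution, and `llm_adjust` is a stubbed placeholder for the LLM's role.

```python
import random

def ask(mean, sigma, n=4):
    # Classical side (CMA-ES-like): sample candidates from a Gaussian
    # search distribution whose state (mean, sigma) persists across trials,
    # exactly the state-tracking that pure LLM agents struggle with.
    return [[random.gauss(m, sigma) for m in mean] for _ in range(n)]

def tell(mean, sigma, candidates, losses):
    # Move the distribution toward the best candidate and shrink the step
    # size; a toy stand-in for CMA-ES's mean/step-size update.
    best = candidates[min(range(len(losses)), key=losses.__getitem__)]
    new_mean = [0.5 * m + 0.5 * b for m, b in zip(mean, best)]
    return new_mean, sigma * 0.9

def llm_adjust(candidate):
    # Hypothetical LLM hook: in a Centaur-style hybrid the LLM injects
    # domain knowledge, e.g. rejecting configs likely to hit out-of-memory
    # failures. Stubbed here as a clamp to a "safe" range.
    return [max(-5.0, min(5.0, x)) for x in candidate]

def loss(x):
    # Toy objective standing in for a training run's validation loss,
    # with its optimum at (1, 1).
    return sum((xi - 1.0) ** 2 for xi in x)

random.seed(0)
mean, sigma = [0.0, 0.0], 2.0
for _ in range(30):
    cands = [llm_adjust(c) for c in ask(mean, sigma)]
    mean, sigma = tell(mean, sigma, cands, [loss(c) for c in cands])

print(round(loss(mean), 3))  # loss shrinks well below the starting value
```

The key design point mirrors the study's finding: the numerical update (`tell`) never leaves the classical optimizer, so optimization state survives every trial, while the LLM only filters or repairs proposals.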

Frequently asked

  • Do LLM agents replace classical hyperparameter optimizers? No. The study shows classical methods like CMA-ES and TPE consistently outperform pure LLM agents on fixed hyperparameter spaces, and LLMs struggle to track optimization state across trials. However, hybrid approaches that combine classical optimizers with LLMs achieve the best results, suggesting LLMs work best as complements rather than replacements.
