AI · 3 min read · April 17, 2026

Framework uses AI outputs as features, not as proxies for labeled data

Generative Augmented Inference treats LLM predictions as informative signals rather than direct substitutes, reducing human labeling needs by 75–90% across operations tasks.

Source: arXiv (cs.LG) · Cheng Lu, Mengxin Wang, Dennis J. Zhang, Heng Zhang

The GAI framework incorporates AI-generated outputs as features when estimating human-labeled outcomes, cutting labeling costs while maintaining accuracy.

  • Treats LLM outputs as informative features, not direct proxies for true labels.
  • Uses orthogonal moment construction for consistent, valid inference with nonparametric relationships.
  • Guarantees weak improvement over human-only estimators; strict gains when auxiliary data is predictive.
  • Conjoint analysis: 50% error reduction, 75% fewer human labels required.
  • Retail pricing: outperforms alternatives even with identical input access.
  • Health insurance: cuts labeling by 90% while preserving decision accuracy.
  • Maintains valid confidence intervals without widening bounds.
  • Scales to diverse operations management and data-driven decision tasks.

Frequently asked

  • How does GAI differ from using AI outputs as labels directly? GAI treats AI outputs as informative features rather than ground truth. It learns the relationship between AI signals and human labels on a labeled subset of the data, then uses that learned relationship to estimate outcomes for unlabeled cases. This avoids the bias that comes from a misspecified AI-to-label mapping and guarantees efficiency gains whenever the AI signal is predictive.
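The two-step procedure in the answer above can be sketched in a small simulation. This is a minimal illustration, not the paper's implementation: the linear fit, the synthetic data, and the debiased-mean construction (impute everywhere, then correct with labeled residuals, in the style of orthogonal-moment estimators) are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: true outcome Y and an LLM signal S that is
# correlated with Y but biased -- using S directly as a label would
# skew any estimate of the mean outcome.
N = 10_000                                   # total units
n = 1_000                                    # units with human labels
Y = rng.normal(5.0, 2.0, N)                  # true outcomes
S = 0.6 * Y + 1.5 + rng.normal(0, 1.0, N)    # biased, noisy AI signal
labeled = np.arange(n)                       # indices with human labels

# Step 1: learn the signal-to-label relationship g(S) ≈ E[Y | S]
# on the labeled subset only (here, a simple linear fit).
a, b = np.polyfit(S[labeled], Y[labeled], 1)
g = a * S + b                                # imputed outcomes for ALL units

# Step 2: debiased estimate of the mean outcome -- average the
# imputations over everyone, then add the mean labeled residual so
# the estimator stays consistent even if g is misspecified.
theta_gai = g.mean() + (Y[labeled] - g[labeled]).mean()

# Baseline: use the human labels alone.
theta_human = Y[labeled].mean()

print(f"GAI estimate:        {theta_gai:.3f}")
print(f"Human-only estimate: {theta_human:.3f}")
```

Because the correction term only needs the residuals `Y - g` on the labeled subset, the estimator inherits the smaller residual variance whenever the AI signal is predictive, which is the source of the "weak improvement, strict gains when predictive" guarantee summarized above.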
