- AI · arxiv/cs.AI · 8 min
LATTICE: Measuring Crypto Agent Quality Beyond Accuracy
New benchmark evaluates how well AI agents support user decisions in crypto, not just whether they get answers right.
April 30, 2026
- AI · arxiv/cs.AI · 8 min
AI Bias in Code Decisions: Prompt Wording Shifts Model Choices
Researchers find that small phrasing changes in prompts push AI systems toward poor software engineering decisions, and standard prompt techniques don't fix it.
April 23, 2026