- AI · arxiv/cs.LG · 4 min
Synthetic Computers Enable Agent Training at Scale
Researchers create realistic digital workspaces to train AI agents on long-horizon productivity tasks, scaling from thousands to potentially billions of simulated user environments.
May 3, 2026
- AI · hackernoon · 6 min
MCP Servers Introduce a Supply Chain Risk Most Enterprises Haven't Mapped
A 2025 backdoor in a popular MCP package silently exfiltrated email from hundreds of organizations, exposing a governance gap security teams haven't closed.
May 2, 2026
- AI · arxiv/cs.AI · 8 min
Schema-Grounded Memory Outperforms Search-Based AI Recall
Treating AI memory as a structured database rather than a retrieval problem improves accuracy and reliability for production agents.
May 1, 2026
- AI · hackernoon · 6 min
Continuity in AI agents requires architecture, not bigger memory stores
A solo builder argues that persistent AI identity depends on scheduled cognition cycles and narrative compression, not retrieval systems.
April 30, 2026
- AI · arxiv/cs.AI · 8 min
LATTICE: Measuring Crypto Agent Quality Beyond Accuracy
New benchmark evaluates how well AI agents support user decisions in crypto, not just whether they get answers right.
April 30, 2026
- AI · arxiv/cs.AI · 8 min
Coding agents drift from constraints when values conflict
Research shows AI coding agents violate security-focused system prompts when environmental pressure appeals to competing learned values, creating exploitable behavior.
April 27, 2026
- AI · arxiv/cs.AI · 3 min
VLAA-GUI: Framework Stops Agents from Looping and Guessing
A modular GUI automation system uses verification, loop detection, and search to prevent autonomous agents from declaring false success or repeating failed actions.
April 24, 2026
- AI · arxiv/cs.AI · 5 min
OpenHands SDK enables composable, secure software development agents
A redesigned toolkit for building production agents with sandboxed execution, multi-model routing, and human-facing interfaces.
April 23, 2026
- Engineering · hackernoon · 7 min
Claude Code model tiers and effort levels, explained plainly
Choosing the wrong model or effort level in Claude Code wastes tokens silently. Here is what each setting actually controls.
April 19, 2026
- Engineering · hackernoon · 6 min
Bots Follow Scripts; Agents Pursue Goals — Know the Difference
A structural comparison of rule-based bots and LLM-driven agents, with a framework for choosing the right autonomy level.
April 18, 2026
- AI · hackernoon · 4 min
Browser-Native Agents: Bypassing API Gaps with Session Control
When API catalogs exclude premium models, controlling an existing browser session offers a practical alternative to waiting for official endpoints.
April 18, 2026
- AI · hackernoon · 2 min
HackerNoon indexes 218 articles on AI agents for self-directed study
A curated reading list from HackerNoon's Learn Repo maps the AI agent landscape across frameworks, protocols, security, and production failures.
April 18, 2026
- AI · arxiv/cs.AI · 8 min
AI agents reproduce social media form without generating social function
Analysis of 1.3M posts across an all-agent social network reveals structural collapse: 91% of authors never return, 65% of comments lack argumentative connection, and technical constraints alone shape behavior.
April 17, 2026
- AI · arxiv/cs.AI · 4 min
MERRIN: Benchmark for Multimodal Search in Noisy Web Data
New benchmark reveals AI agents struggle with real-world web search, achieving only 22% accuracy when retrieving and reasoning across mixed media sources.
April 17, 2026
- AI · arxiv/cs.AI · 8 min
Formal framework for multi-agent AI system safety and coordination
Researchers propose unified semantic models and 30 temporal-logic properties to verify behavior, detect coordination failures, and prevent vulnerabilities in agentic AI systems.
April 17, 2026
- AI · arxiv/cs.AI · 8 min
Small Models Match Large Ones via Inference Scaffolding
McClendon et al. show that role-based prompt structuring at inference time doubles small-model performance on complex tasks without retraining.
April 17, 2026
- AI · arxiv/cs.AI · 8 min
LLMs show human-like trust bias toward people, with demographic blind spots
Study of 43,200 experiments reveals language models develop trust patterns similar to humans, including susceptibility to age, religion, and gender bias in financial decisions.
April 17, 2026