- Artificial Intelligence · arxiv/cs.AI · 8 min
Benchmark Rubrics Shift LLM Scores in Financial NLP Tasks
How wording changes in evaluation criteria and metric selection alter model rankings on financial text benchmarks, requiring governance over gold-label assumptions.
May 2, 2026
- Engineering · arxiv/cs.AI · 8 min
Automated SysML generation bridges text to engineering models
Hendricks and Cicirello propose a five-step pipeline using NLP and LLMs to convert unstructured documents into SysML diagrams and executable dynamical system models.
April 23, 2026
- Engineering · arxiv/cs.AI · 4 min
Dual Transformers Improve Bug Assignment Accuracy by 10%+
TriagerX uses two transformer models and developer interaction history to recommend the right engineer for bug fixes, outperforming single-model approaches.
April 20, 2026