Content

Knowledge, rewritten.

A quiet feed on artificial intelligence, startups, productivity, and engineering. Each card is condensed, critiqued, and tied to an action: not translation, but re-creation.


Artificial Intelligence · arxiv/cs.AI · 6 min

Measuring Where Chatbots Beat Humans on Tests

Researchers apply psychometric methods to identify test items where LLMs systematically outperform human learners, revealing assessment vulnerabilities.

17 April 2026

Artificial Intelligence · arxiv/cs.AI · 8 min

LLMs hit formal reasoning ceiling; Chomsky Hierarchy reveals efficiency gap

New benchmark shows large language models struggle with structured complexity tasks and require prohibitive compute to achieve reliability in formal reasoning.

17 April 2026

Artificial Intelligence · arxiv/cs.AI · 8 min

Vision-Language Models Fail on Dense Visual Grids

A new benchmark reveals that VLMs collapse sharply on simple grid-reading tasks, exposing a gap, called Digital Agnosia, between visual encoding and language output.

17 April 2026

Artificial Intelligence · arxiv/cs.AI · 8 min

Modular Neural Networks Learn Three-Valued Logic Without Symbolic Solvers

THEIA demonstrates that dedicated domain engines enable neural networks to master Kleene three-valued logic and generalize compositionally to sequences 100x longer than training.

17 April 2026

Engineering · arxiv/cs.LG · 4 min

Hybrid PINNs: Finite-Difference Regularization for Physics Solvers

Adding weak finite-difference gradient penalties to physics-informed neural networks improves boundary accuracy without replacing automatic-differentiation residuals.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 3 min

Framework uses AI outputs as features, not proxies, for labeled data

Generative Augmented Inference treats LLM predictions as informative signals rather than direct substitutes, reducing human labeling needs by 75–90% across operations tasks.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 8 min

Foundation Models vs. Task-Specific ML in Electricity Price Forecasting

Time series foundation models outperform traditional deep learning on probabilistic forecasts, but well-tuned conventional models remain competitive at lower computational cost.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 8 min

LLM Panels Match Expert Clinicians in Medical Diagnosis Scoring

A study of three frontier AI models scoring real hospital cases shows calibrated LLM juries can reliably replace human expert panels for medical AI evaluation.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 5 min

Rejection-Gated Policy Optimization replaces importance weighting with learned gates

A new reinforcement learning method selects trustworthy samples via differentiable gates instead of reweighting all samples, reducing variance and improving RLHF alignment.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 8 min

INT4 Quantization Fails After FP32 Convergence in Predictable Phases

Post-training quantization assumes converged models are ready to compress, but INT4 quantization collapses in a three-phase pattern tied to weight updates, not learning rate decay.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 8 min

Distilling Transformers into Mamba via Linearized Attention

A two-stage knowledge transfer method preserves Transformer performance in State Space Models by routing through linearized attention as an intermediate step.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 8 min

Three-Phase Transformer: Structural Prior for Decoder Efficiency

A residual-stream architecture using cyclic channel partitioning and phase-aligned rotations achieves 7% perplexity gains with minimal parameter overhead.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 6 min

Speech Models Fail Safety Tests That Text Passes

VoxSafeBench reveals speech language models recognize social norms in text but ignore them when cues arrive through voice, speaker identity, or environment.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 6 min

Speech Models Fail Safety Tests That Text Models Pass

A new benchmark reveals that speech language models drop safety, fairness, and privacy protections when cues arrive as audio rather than text.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 4 min

Retrieval-Augmented Set Completion for Clinical Code Authoring

A two-stage approach retrieves similar clinical value sets then classifies candidates, outperforming direct LLM generation on standardized medical vocabularies.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 4 min

Retrieval beats memorization for clinical code selection

A two-stage retrieval-then-classify method outperforms direct LLM generation for assembling clinical value sets from large standardized vocabularies.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 8 min

Machine Learning Maps Drug Binding to Viral RNA Pseudoknot

Spectral map analysis reveals how small-molecule inhibitors distort SARS-CoV-2 RNA structure in topology-dependent ways, with protonation state determining mechanism.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 8 min

Action Aliasing Breaks Safe RL Differently Depending on Filter Placement

A formal comparison of two projection-based safety strategies reveals that embedding safeguards in the policy creates gradient rank deficiency, while environment-level filters distribute the problem to the critic.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 3 min

Transformer models outperform CNNs in prostate MRI segmentation

SwinUNETR achieves 5-point Dice improvement over standard UNet when trained on mixed-reader datasets, suggesting transformer attention handles annotation variability better.

17 April 2026

Engineering · arxiv/cs.LG · 8 min

Queueing Model Reveals How AI Automation Paradoxically Worsens Cyber Risk

Research from Yun et al. shows that symmetric automation in attack and defense can increase exploit success rates, with heavy-tailed patching delays creating persistent vulnerability backlogs.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 8 min

Quantum kernel inference cuts query cost by removing data-size dependence

New algorithm reduces quantum machine learning inference complexity from O(N) to O(1) in data size, achieving query-optimal bounds via amplitude estimation.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 8 min

Formalizing How Much Data Proves a Learning Model Right

Researchers formalize identifying information—the bits needed to confirm or reject a hypothesis—bridging information theory with practical sample complexity.

17 April 2026

Artificial Intelligence · arxiv/cs.LG · 8 min

Estimating classification ceiling without perfect labels

Ushio et al. show how to measure the theoretical best-case error rate in binary classification using imperfect soft labels and calibration techniques.

17 April 2026
