Study finds AI agents far behind humans in multi-step scientific reasoning

A May 2026 study covered by Nature, which used the PaperArena benchmark, found that top AI agents scored under 40% on cross-paper scientific reasoning tasks, compared with roughly 80% for human experts. The findings ...

Published: 2026-05-03