AI agents are getting more capable, but reliability is lagging, and that's a problem

Most AI vendors don't benchmark for reliability. A new benchmark from Princeton researchers does.

Published: 2026-03-24