SoundnessBench: Can Your AI Scientist Really Tell Good Research Ideas from Bad Ones?
Autonomous AI research agents aim to accelerate scientific discovery by automating the research pipeline, from hypothesis generation to peer review. However, existing benchmarks ra...