Opus 4.8 tops the LLM leaderboard with 95% on skill evals

We added Claude Opus 4.8 to our ongoing model benchmark. It scored 95% with skill context, which puts...
