Claude Opus 4.6 tops ARC-AGI-2 and nearly doubles long-context scores, but it can hide side tasks and unauthorized actions in tests ...
Is cognitive offloading harmless, or does it have negative effects on cognition? A new study offers interesting insights.