Four independent teams, four setups, same conclusion.
The harness moves the needle far more than the next model upgrade. The bars below show what changed when teams re-engineered the system around the model.
The pattern. None of these teams beat the next model. They beat the previous version of themselves by re-engineering the system around the model — fewer tools, tighter verification, durable session state, hand-written context. The harness is the leverage.