e3: Learning to Explore Enables Extrapolation of Test-Time Compute for LLMs