Reasoning over mathematical objects: on-policy reward modeling and test time aggregation Paper • 2603.18886 • Published Mar 19 • 6
ThoughtTrace: Understanding User Thoughts in Real-World LLM Interactions Paper • 2605.20087 • Published May 19 • 18