Here’s this week’s Ritual Research Digest, a newsletter covering the latest in the world of LLMs and the intersection of Crypto x AI.
With hundreds of papers published weekly, staying current is impossible. We do the reading so you don’t have to.

Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models
Dynamic latent iteration is hard: it needs full context, adaptive objectives, and parameter reuse, and the coupling between the iteration policy and output quality causes training instability.


This work introduces Think-at-Hard (TaH), a dynamic latent-thinking method that applies extra latent iterations only to hard tokens, built on a specialized model architecture and a stable training method.
Finetuned from Qwen3-0.6B/1.7B-Base, TaH gains +4% on average across 5 reasoning benchmarks.
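To make the idea concrete, here is a minimal PyTorch sketch of selective latent iteration, assuming a per-token hardness head and a shared refinement block; the names and architecture are illustrative, not the paper’s actual implementation:

```python
import torch
import torch.nn as nn

class ToySelectiveIterator(nn.Module):
    """Toy sketch of "think at hard tokens": a per-token hardness head decides
    which positions get extra latent refinement passes (with shared parameters)
    before the LM head fires. Illustrative only, not the paper's architecture."""

    def __init__(self, d_model=64, vocab=1000, max_iters=3, threshold=0.5):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.hardness = nn.Linear(d_model, 1)  # hypothetical hardness scorer
        self.lm_head = nn.Linear(d_model, vocab)
        self.max_iters, self.threshold = max_iters, threshold

    def forward(self, h):  # h: (batch, seq, d_model) hidden states
        for _ in range(self.max_iters):
            hard = torch.sigmoid(self.hardness(h)).squeeze(-1) > self.threshold
            if not hard.any():          # every token judged easy: stop iterating
                break
            refined = self.block(h)     # one more latent iteration, reused weights
            h = torch.where(hard.unsqueeze(-1), refined, h)  # refine hard tokens only
        return self.lm_head(h)

print(ToySelectiveIterator()(torch.randn(2, 8, 64)).shape)  # (2, 8, 1000)
```

The point of the selectivity is that easy tokens pay for one forward pass while hard tokens get extra refinement, so the compute budget tracks difficulty.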

P1: Mastering Physics Olympiads with Reinforcement Learning
This work introduces P1, a family of open-source physics reasoning models. They integrate both train-time and test-time scaling, so that stronger reasoning ability can be deployed adaptively at inference.

P1 models are trained purely through RL post-training on base LMs in a multi-stage RL framework. At test time, they combine P1 models with the PhysicsMinions agent framework.
Their model P1-235B-A22B achieves gold-medal performance on IPhO 2025.
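To illustrate the test-time side in its simplest form, here is a generic best-of-n / self-consistency sketch; the generate interface is an assumption, and P1’s actual test-time setup (PhysicsMinions) is a full agent framework rather than plain voting:

```python
import collections

def best_of_n(problem, generate, n_samples=8):
    """Generic self-consistency sketch of test-time scaling (not PhysicsMinions).
    `generate(problem) -> answer` is an assumed model call; we sample several
    candidate solutions and majority-vote on the final answer."""
    answers = [generate(problem) for _ in range(n_samples)]
    winner, votes = collections.Counter(answers).most_common(1)[0]
    return winner, votes / n_samples  # answer plus a rough confidence score
```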

MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling
The paper introduces a research agent that pushes performance along three dimensions: model size, context length, and interaction depth.

To sustain deep reasoning processes, the model is equipped with a 256K context window and up to 600 tool calls per task.
MiroThinker v1.0, equipped with a simple ReAct agent, achieves SOTA performance among open-source research agents.
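For readers unfamiliar with ReAct, here is a minimal sketch of the loop such an agent runs; the llm and tools interfaces are assumptions, and the step cap mirrors the reported budget of up to 600 tool calls per task:

```python
def react_loop(question, llm, tools, max_steps=600):
    """Minimal ReAct sketch: the model interleaves thoughts, tool calls, and
    observations until it emits a final answer. `llm(history) -> str` and
    `tools[name](arg) -> str` are assumed interfaces."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        step = llm("\n".join(history))  # e.g. "Thought: ...\nAction: search[query]"
        history.append(step)
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        if "Action:" in step:
            name, _, arg = step.split("Action:", 1)[1].strip().partition("[")
            observation = tools[name.strip()](arg.rstrip("]"))
            history.append(f"Observation: {observation}")
    return None  # interaction budget exhausted without an answer
```

Interaction scaling in the paper’s sense is essentially raising max_steps (and the context window that must hold the growing history) so the agent can pursue deeper research trajectories.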


What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity
This paper proposes methods to quantify and control an agent’s ideation diversity, finding that the choice of agentic scaffold significantly influences it.


Through a controlled experimental design, they establish a causal relationship: increasing ideation diversity improves performance on MLE-bench tasks, and the result holds up under alternative performance metrics.
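As a concrete illustration, one plausible way to score ideation diversity is the mean pairwise cosine distance among embeddings of the ideas an agent proposes; this metric and the function below are assumptions for illustration, not necessarily the paper’s exact measure:

```python
import numpy as np

def ideation_diversity(idea_embeddings):
    """One plausible diversity score (an assumption, not necessarily the
    paper's metric): mean pairwise cosine distance among embeddings of the
    ideas an agent proposed during a run."""
    E = np.asarray(idea_embeddings, dtype=float)
    E = E / np.linalg.norm(E, axis=1, keepdims=True)  # unit-normalize each idea
    sims = E @ E.T                                    # pairwise cosine similarity
    n = len(E)
    mean_sim = (sims.sum() - np.trace(sims)) / (n * (n - 1))  # drop self-pairs
    return 1.0 - mean_sim  # higher = more diverse ideation

print(ideation_diversity(np.random.randn(5, 16)))
```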

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
This paper introduces Deep Research Tulu (DR Tulu-8B), a model trained for open-ended, long-form deep research tasks.

To address verification in long-form tasks, DR Tulu is first finetuned on high-quality user data, then trained via RL with evolving rubrics (RLER), in which rubrics co-evolve with the policy model during training. It obtains better results than the strongest open 8B-32B models.
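Here is a hand-wavy sketch of what one RLER training pass could look like; every interface here (policy, judge, update_rubrics) is an assumption for illustration, not DR Tulu’s actual code:

```python
def rler_epoch(policy, judge, update_rubrics, prompts, rubrics):
    """Sketch of RL with Evolving Rubrics (RLER): responses are rewarded
    against the current rubric set, and the rubrics themselves are revised
    as the policy improves. All interfaces are assumed."""
    for prompt in prompts:
        response = policy.generate(prompt)
        # Reward: fraction of current rubric criteria the judge says are met.
        reward = sum(judge(prompt, response, r) for r in rubrics) / len(rubrics)
        policy.reinforce(prompt, response, reward)  # e.g. a PPO/GRPO-style update
        # Rubrics co-evolve: revise them to capture failure modes the
        # improving policy newly exposes.
        rubrics = update_rubrics(rubrics, prompt, response)
    return rubrics
```

The evolving rubric set is what substitutes for a fixed verifier: since long-form research outputs have no single checkable answer, the grading criteria must keep pace with the policy.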

Follow us @ritualdigest for more on all things crypto x AI research, and
@ritualnet to learn more about what Ritual is building.

