Agent E3


Our humble attempts to build o1-like autonomous AI agents that can:

  • strategically Explore the current environment
  • perform Error recovery when facing unexpected outcomes
  • exhibit human-like Execution in modern computer tasks

Recent Projects

Agentic-o1: Teaching AI Agents to Search with Reflective-MCTS and Exploratory Learning

We present R-MCTS and Exploratory Learning for building o1-like models for agentic applications. Our R-MCTS agent extends traditional MCTS with 1) contrastive reflection, which lets the agent learn from past successes and mistakes, and 2) a multi-agent debate value function. Exploratory Learning is a novel learning strategy that trains models to explore the environment, evaluate a state, and backtrack to viable states when they encounter unpromising ones. Our R-MCTS agent and Exploratory Learning demonstrate compute scaling properties at both training and test time.
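
To make the search idea concrete, here is a minimal sketch of an MCTS loop in which the usual random rollout is replaced by a value estimate averaged over several evaluators, a crude stand-in for a multi-agent debate. It is an illustration under stated assumptions, not the R-MCTS implementation: the toy integer environment, `TARGET`, `ACTIONS`, both evaluator functions, and every helper name are hypothetical, and contrastive reflection and Exploratory Learning training are not shown.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Sequence

# Hypothetical toy task: start from an integer state, add 1 or 2 per step,
# and succeed by landing exactly on TARGET (overshooting fails).
TARGET = 5
ACTIONS = (1, 2)

def step(state: int, action: int) -> int:
    return state + action

def is_terminal(state: int) -> bool:
    return state >= TARGET

# Two hypothetical evaluators standing in for debating agents.
def progress_evaluator(state: int) -> float:
    return 0.0 if state > TARGET else state / TARGET

def overshoot_evaluator(state: int) -> float:
    return 0.0 if state > TARGET else 1.0

def debate_value(state: int, evaluators: Sequence[Callable[[int], float]]) -> float:
    """Aggregate several opinions into one value (here: a plain average)."""
    return sum(e(state) for e in evaluators) / len(evaluators)

@dataclass
class Node:
    state: int
    parent: Optional["Node"] = None
    children: Dict[int, "Node"] = field(default_factory=dict)
    visits: int = 0
    total_value: float = 0.0

def select_child(node: Node, c: float = 1.4) -> Node:
    """Pick the child with the highest UCT score."""
    def uct(child: Node) -> float:
        mean = child.total_value / child.visits
        return mean + c * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children.values(), key=uct)

def search(root_state: int, evaluators, iterations: int = 200) -> int:
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and non-terminal.
        while not is_terminal(node.state) and len(node.children) == len(ACTIONS):
            node = select_child(node)
        # Expansion: add one untried action, if any.
        if not is_terminal(node.state):
            action = next(a for a in ACTIONS if a not in node.children)
            node.children[action] = Node(step(node.state, action), parent=node)
            node = node.children[action]
        # Evaluation: aggregated value estimate instead of a random rollout.
        value = debate_value(node.state, evaluators)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total_value += value
            node = node.parent
    # Return the most-visited root action.
    return max(root.children, key=lambda a: root.children[a].visits)

evaluators = [progress_evaluator, overshoot_evaluator]
best_action = search(4, evaluators, iterations=50)
```

From state 4, the search should settle on action 1, since action 2 overshoots the target and every evaluator scores it 0. In a real agent the evaluators would be model calls and the state a browser or desktop observation; the plain average is only one simple way to combine their opinions.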