ML-Agent: New Framework Trains 7B Model to Match GPT-5 Performance on Autonomous Machine Learning Tasks

Researchers develop reinforcement learning framework that trains smaller language models for autonomous ML engineering at lower computational cost.

Researchers have introduced ML-Agent, a new framework that trains smaller language models to perform autonomous machine learning engineering tasks through online reinforcement learning, according to a paper published on arxiv.org.

The framework addresses limitations in current prompt-based approaches, where smaller models struggle to learn from execution trajectories while large proprietary models incur high computational costs. According to the paper, ML-Agent introduces three key components: exploration-enriched fine-tuning for diverse action generation, step-wise reinforcement learning for accelerated experience collection, and an ML-specific reward module that unifies feedback signals for optimization.
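To make the third component more concrete, the following is a minimal sketch of how a reward module might fold different kinds of execution feedback into a single scalar for reinforcement learning. It is not the paper's implementation: the `StepFeedback` structure, the three status labels, the fixed values 0.0 and 0.5, the sigmoid mapping, and the `scale` parameter are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepFeedback:
    """Outcome of one agent action (hypothetical structure, not from the paper)."""
    status: str                            # "error", "no_edit", or "success"
    metric_before: Optional[float] = None  # task metric before the action
    metric_after: Optional[float] = None   # task metric after the action
    higher_is_better: bool = True          # False for loss-style metrics


def unified_reward(fb: StepFeedback, scale: float = 1.0) -> float:
    """Map heterogeneous feedback to one scalar in [0, 1] (illustrative only)."""
    if fb.status == "error":
        # Assumption: a failed or invalid action earns no reward.
        return 0.0
    if fb.status == "no_edit":
        # Assumption: an action that leaves the solution unchanged is neutral.
        return 0.5
    if fb.status == "success":
        if fb.metric_before is None or fb.metric_after is None:
            raise ValueError("a successful step needs both metric values")
        delta = fb.metric_after - fb.metric_before
        if not fb.higher_is_better:
            delta = -delta  # a drop in loss counts as an improvement
        # Clamp before exponentiating to avoid overflow on extreme values.
        z = max(-60.0, min(60.0, scale * delta))
        # Sigmoid: improvements land above 0.5, regressions below it.
        return 1.0 / (1.0 + math.exp(-z))
    raise ValueError(f"unknown status: {fb.status!r}")
```

In this sketch, an accuracy gain such as `StepFeedback("success", 0.70, 0.80)` yields a reward above the neutral 0.5, while a crash yields 0.0, so one optimizer can consume both signals on the same scale.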

The researchers built ML-Agent on a 7B-parameter Qwen-2.5 language model and trained it on only 9 ML tasks. According to the paper's abstract on arxiv.org, the resulting agent “achieves comparable performance to agents using much larger proprietary LLMs (e.g., GPT-5) but at significantly lower computational cost, demonstrating strong performance and cross-task generalization.”

According to the researchers, the work explores, for “the first time,” “the paradigm of learning-based agentic ML, where an LLM agent learns through interactive experimentation on ML tasks using online reinforcement learning.”

The paper was published on May 5, 2026. It pursues autonomous machine learning engineering through reinforcement learning rather than traditional prompting methods, according to the arxiv.org abstract.