Three New Agent Frameworks Advance Domain-Specific AI Automation

Three separate research teams have published agent frameworks designed to improve reliability and automation in specialized domains, according to papers posted on arxiv.org on May 18, 2026.

CAX-Agent targets reliability problems in Ansys Mechanical APDL (MAPDL) finite-element simulation by introducing an “agent harness” that provides structured execution control and fault recovery, according to arxiv.org. The system organizes execution into three layers (LLM service, agent harness, and solver backend) and adds a recovery ladder that escalates from deterministic rule patching, through model-driven regeneration, to human intervention. In tests on 50 structural benchmarks totaling 450 case-runs, the model-driven recovery strategy achieved a completion rate of 0.9267 and a task score of 3.59 out of 4, outperforming the rule-only and no-recovery approaches with “large effect sizes,” the paper states.
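The escalation pattern described for CAX-Agent can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the error strings, and the toy solver below are all hypothetical, and a fixed string stands in for the LLM call. The sketch only shows the general shape of a ladder in which each rung is tried in order and a rung that has no fix passes the failure up to the next one.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical types: a solver returns None on success or an error message.
Solver = Callable[[str], Optional[str]]
# A fixer takes (script, error) and returns a new script, or None if it has no fix.
Fixer = Callable[[str, str], Optional[str]]


def run_with_recovery(
    script: str, solve: Solver, ladder: List[Tuple[str, Fixer]]
) -> Tuple[bool, List[str]]:
    """Run the script, escalating through the ladder until the solver succeeds."""
    trail = ["initial"]
    error = solve(script)
    for name, fix in ladder:
        if error is None:
            break  # solved; no further escalation needed
        trail.append(name)
        candidate = fix(script, error)
        if candidate is None:
            continue  # this rung has no fix; escalate to the next rung
        script = candidate
        error = solve(script)
    return error is None, trail


def toy_solver(script: str) -> Optional[str]:
    """Stand-in for a solver backend (not MAPDL): rejects two simple mistakes."""
    if "BAD" in script:
        return "unknown command"
    if "FINISH" not in script:
        return "missing FINISH"
    return None


def rule_patch(script: str, error: str) -> Optional[str]:
    """Deterministic rung: only knows how to repair one specific error."""
    if error == "missing FINISH":
        return script + "\nFINISH"
    return None


def model_regeneration(script: str, error: str) -> Optional[str]:
    """Stand-in for asking an LLM to rewrite the script (assumed, not a real call)."""
    return "/PREP7\nFINISH"


def human_intervention(script: str, error: str) -> Optional[str]:
    """Last rung: a real system would pause for an operator; the toy gives up."""
    return None


LADDER = [
    ("rule_patch", rule_patch),
    ("model_regeneration", model_regeneration),
    ("human_intervention", human_intervention),
]

ok_a, trail_a = run_with_recovery("/PREP7", toy_solver, LADDER)
ok_b, trail_b = run_with_recovery("BAD\nFINISH", toy_solver, LADDER)
print(ok_a, trail_a)
print(ok_b, trail_b)
```

In this toy run the first script is repaired by the rule rung alone, while the second falls through the rule rung and is fixed only by the regeneration rung, which mirrors the ordering the paper describes.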

ColPackAgent provides an agent framework for Monte Carlo simulations of colloidal packing, according to arxiv.org. The system uses a Model Context Protocol (MCP) tool server that wraps HOOMD-blue's hard-particle Monte Carlo simulation and encodes a “four-stage workflow contract.” The researchers note that “without dedicated simulation tools and workflow instructions, general-purpose Large Language Model (LLM) agents tend to describe such workflows rather than execute them reliably.”

FORGE (Failure-Optimized Reflective Graduation and Evolution) lets LLM agents improve their decision-making through self-generated memory, without gradient updates to the underlying model, according to arxiv.org. Tested on the CybORG CAGE-2 network-defense benchmark, FORGE improved average evaluation returns by a factor of 1.7 to 7.7 over zero-shot baselines and by 29% to 72% over Reflexion baselines across 12 model-representation conditions, the paper reports.
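The general idea of improving through memory rather than weight updates can be sketched in a few lines. This is a generic reflective-memory loop under stated assumptions, not FORGE's actual algorithm: the policy, the reflection step, the action names, and the return values are all hypothetical toys, and no LLM is called. The only thing that changes between episodes is a list of text lessons that the policy reads.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signatures: the policy reads memory and returns (return, trace);
# the reflector turns a trace into an optional text lesson.
Policy = Callable[[List[str]], Tuple[float, str]]
Reflector = Callable[[str, float], Optional[str]]


def run_episodes(
    policy: Policy, reflect: Reflector, n_episodes: int
) -> Tuple[List[float], List[str]]:
    """Run episodes, storing lessons from poor outcomes in a plain-text memory."""
    memory: List[str] = []
    returns: List[float] = []
    for _ in range(n_episodes):
        episode_return, trace = policy(memory)
        returns.append(episode_return)
        lesson = reflect(trace, episode_return)
        if lesson is not None and lesson not in memory:
            memory.append(lesson)  # the agent's only form of "learning"
    return returns, memory


def toy_policy(memory: List[str]) -> Tuple[float, str]:
    """Toy agent: takes costly action A unless memory warns against it."""
    action = "B" if "avoid action A" in memory else "A"
    episode_return = -2.0 if action == "B" else -10.0
    return episode_return, f"took action {action}"


def toy_reflect(trace: str, episode_return: float) -> Optional[str]:
    """Toy reflection: after a poor episode involving A, record a lesson."""
    if episode_return < -5.0 and "action A" in trace:
        return "avoid action A"
    return None


returns, memory = run_episodes(toy_policy, toy_reflect, n_episodes=3)
print(returns)
print(memory)
```

Here the first episode scores poorly, a lesson is written to memory, and later episodes score better without any model parameter being touched.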