New Research Explores Model Merging, Web Navigation Agents, and LLM Security Risks

Three recent arXiv papers address merging expert models, autonomous web navigation, and sandbox escape by LLM agents.

OptMerge: Unifying Multimodal Capabilities

In arXiv paper 2505.19892v3, researchers propose OptMerge, a method for combining multiple expert models into a single, more capable model. The paper notes that “foundation models update slowly due to resource-intensive training, whereas domain-specific models evolve rapidly between releases.” Merging the experts into one model aims to reduce storage and serving costs while maintaining performance.
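For readers unfamiliar with the idea, the sketch below illustrates weight-space merging in general using task arithmetic, a widely used baseline. It is not the OptMerge algorithm, whose details are not covered here. Each expert's difference from a shared base model is scaled and added back onto the base. The function name, the `scale` parameter, and the use of plain Python lists in place of real tensors are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical stand-in for a model checkpoint: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def merge_task_vectors(base: Weights, experts: List[Weights], scale: float = 1.0) -> Weights:
    """Add the scaled sum of (expert - base) deltas back onto the base weights."""
    merged: Weights = {}
    for name, base_values in base.items():
        merged_values = list(base_values)  # start from a copy of the base weights
        for expert in experts:
            expert_values = expert[name]
            if len(expert_values) != len(base_values):
                raise ValueError(f"shape mismatch for parameter {name!r}")
            for i, (e, b) in enumerate(zip(expert_values, base_values)):
                merged_values[i] += scale * (e - b)
        merged[name] = merged_values
    return merged


if __name__ == "__main__":
    base = {"w": [1.0, 2.0]}
    experts = [{"w": [2.0, 2.0]}, {"w": [1.0, 4.0]}]
    print(merge_task_vectors(base, experts))  # {'w': [2.0, 4.0]}
```

In this toy case each expert changed a different weight, so the merged model keeps both changes. When experts modify the same weights in conflicting directions, a plain sum like this can degrade performance, which is one reason more careful merging methods are studied.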

Web Navigation Agent Research

A second paper, arXiv 2603.02626v1, introduces a multimodal agent designed for autonomous web navigation. The researchers observe that “current Large Language Model (LLM) based agents often struggle with spatial disorientation and navigation loops.” The paper proposes ways to help agents “perceive complex visual environments and maintain long-term context.”

LLM Security Vulnerabilities

A third paper (arXiv 2603.02277v1) examines the security risks that arise as LLMs gain autonomy. “Large language models (LLMs) increasingly act as autonomous agents, using tools to execute code, read and write files, and access networks,” the authors write. The research quantifies how capable frontier LLMs are at escaping container sandboxes, noting that agents are “commonly deployed and evaluated in isolated” environments to mitigate these risks.