Three New arXiv Papers Address Action Feasibility, Medical Reasoning, and LLM Training Metrics

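Three new arXiv preprints address distinct challenges in artificial intelligence systems: action feasibility for embodied agents, evidence-grounded chest X-ray interpretation, and the effect of Pass@k optimization on Pass@1 performance in language models.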

Action Feasibility in Embodied Agents

According to arXiv paper 2602.22452v1, researchers have introduced CWM (Contrastive World Models), which addresses a “critical bottleneck in embodied agent pipelines.” The paper states that agents must identify “which candidate actions are physically executable in the current state” before planning or reasoning can occur.

Evidence-Grounded Medical Diagnosis

A second paper (arXiv:2602.23276v1) presents CXReasonAgent, a system designed for chest X-ray interpretation. According to the abstract, “Chest X-ray plays a central role in thoracic diagnosis, and its interpretation inherently requires multi-step, evidence-grounded reasoning.” The researchers note that current large vision-language models (LVLMs) “often generate plausible responses that are not faithfully grounded” in evidence.

LLM Training Optimization Issues

A third paper (arXiv:2602.21189v2) examines why optimizing for Pass@k metrics can harm Pass@1 performance in language models. According to the abstract, Pass@k is “a widely used performance metric for verifiable large language model tasks, including mathematical reasoning, code generation, and short-answer reasoning,” defining success when “any of k independently sampled solutions passes a verification.”
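
To make the metric concrete, here is a minimal Python sketch of how Pass@k is commonly estimated. It is not taken from the paper, and the function names are illustrative. The first function uses the widely used combinatorial estimator: given n sampled solutions of which c pass verification, it returns the fraction of size-k subsets that contain at least one passing solution. The second assumes, purely for illustration, that each sample passes independently with a fixed probability p.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate Pass@k from n sampled solutions, c of which passed verification.

    Computes 1 - C(n - c, k) / C(n, k): one minus the fraction of
    size-k subsets that contain no passing solution.
    """
    if not (0 <= c <= n):
        raise ValueError("c must satisfy 0 <= c <= n")
    if not (1 <= k <= n):
        raise ValueError("k must satisfy 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate is 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_at_k_from_rate(p: float, k: int) -> float:
    """Pass@k under the illustrative assumption that each sample passes
    independently with probability p: 1 - (1 - p)**k."""
    if not (0.0 <= p <= 1.0):
        raise ValueError("p must be in [0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    # Hypothetical numbers: 10 samples per problem, 3 of them verified correct.
    print(pass_at_k(n=10, c=3, k=1))
    print(pass_at_k(n=10, c=3, k=5))
```

With k = 1 the first function reduces to the plain success rate c / n, which is Pass@1. Larger k rewards a model for having at least one correct answer among several attempts, so the two metrics can diverge for the same set of samples.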

All three papers represent ongoing research and have not yet undergone peer review.