Three new research papers published on arXiv report advances in neuro-symbolic approaches that pair large language models with symbolic reasoning and verification. They cover automated systems verification, referring expression comprehension, and the autoformalization of mathematics.
According to arxiv.org, Stepwise is a neuro-symbolic framework for automated systems verification that performs best-first tree search over proof states while querying LLMs for candidate proof steps. The LLMs are fine-tuned on pairs of proof states and proof steps, and the system uses interactive theorem proving (ITP) tools to repair rejected steps and discharge subgoals. On the FVEL seL4 benchmark, Stepwise proved up to 77.6% of theorems, “substantially surpassing previous LLM-based approaches and standalone Sledgehammer.”
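The best-first search described above can be illustrated with a minimal sketch. It is not Stepwise's implementation: the function names, the scoring by cumulative negative log-probability, and the toy integer "proof states" are all assumptions made for illustration, and the ITP-based repair of rejected steps is left out.

```python
import heapq
from itertools import count

def best_first_proof_search(initial_state, propose_steps, apply_step,
                            is_proved, max_expansions=100):
    """Expand the most promising proof state first (hypothetical sketch).

    propose_steps(state) stands in for the LLM and yields (step, log_prob).
    apply_step(state, step) stands in for the proof checker and returns
    the next state, or None if the step is rejected.
    """
    tie_breaker = count()  # keeps heap entries comparable on equal cost
    frontier = [(0.0, next(tie_breaker), initial_state, [])]
    seen = {initial_state}
    for _ in range(max_expansions):
        if not frontier:
            return None
        cost, _tie, state, path = heapq.heappop(frontier)
        if is_proved(state):
            return path
        for step, log_prob in propose_steps(state):
            next_state = apply_step(state, step)
            if next_state is None or next_state in seen:
                continue  # rejected by the checker, or already queued
            seen.add(next_state)
            # Lower cumulative cost means a more likely step sequence.
            heapq.heappush(frontier, (cost - log_prob, next(tie_breaker),
                                      next_state, path + [step]))
    return None

# Toy stand-ins: a state is an integer, and 0 counts as "proved".
def toy_propose(state):
    return [("half", -0.1), ("dec", -1.0)]

def toy_apply(state, step):
    if step == "half":
        return state // 2 if state % 2 == 0 else None  # reject on odd states
    return state - 1

proof = best_first_proof_search(6, toy_propose, toy_apply, lambda s: s == 0)
print(proof)
```

In this toy run the search prefers the higher-scoring "half" step whenever the checker accepts it, and falls back to "dec" otherwise.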
A separate paper on arxiv.org describes VIRO (Verification-Integrated Reasoning Operators), a framework for Referring Expression Comprehension that addresses cascading errors in compositional reasoning. The system embeds “lightweight operator-level verifiers within reasoning steps” to validate intermediate outputs such as object existence and spatial relationships. VIRO achieved 61.1% balanced accuracy across target-present and no-target settings and was accepted to CVPR 2026.
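For readers unfamiliar with the metric, the short sketch below assumes the usual definition of balanced accuracy as the mean of the per-setting accuracies; the paper's exact definition may differ. The counts are invented for illustration and are not VIRO's results.

```python
def balanced_accuracy(present_correct, present_total,
                      absent_correct, absent_total):
    """Mean of the accuracies on target-present and no-target cases."""
    present_acc = present_correct / present_total
    absent_acc = absent_correct / absent_total
    return (present_acc + absent_acc) / 2

# Hypothetical counts: strong on target-present, weak on no-target cases.
pooled = (90 + 2) / (100 + 10)
balanced = balanced_accuracy(90, 100, 2, 10)
print(f"pooled accuracy:   {pooled:.3f}")
print(f"balanced accuracy: {balanced:.3f}")
```

Under this definition, a system that rarely recognizes a missing target cannot hide that weakness behind a large number of easy target-present cases.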
FormalEvolve, detailed in a third arxiv.org paper, tackles autoformalization, the task of translating natural-language mathematics into machine-checkable statements. The framework uses “LLM-driven mutation and crossover with bounded patch repair” alongside Abstract Syntax Tree rewrite operations. On the CombiBench and ProofNet benchmarks, with a generator-call budget of T=100, FormalEvolve reached semantic hit rates of 58.0% and 84.9%, respectively.
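The idea of evolutionary search under a fixed generator-call budget can be sketched as follows. This is a toy under stated assumptions, not FormalEvolve's method: random string edits stand in for the LLM-driven mutation and crossover, character overlap with a known target stands in for the fitness signal, and patch repair and AST rewrites are omitted. All names are hypothetical.

```python
import random

def evolve(seeds, mutate, crossover, fitness, budget=100, pop_size=8, rng=None):
    """Toy evolutionary search that stops after `budget` generator calls."""
    rng = rng or random.Random(0)
    population = list(dict.fromkeys(seeds))  # drop duplicates, keep order
    calls = 0
    while calls < budget:
        if len(population) >= 2 and rng.random() < 0.5:
            parent_a, parent_b = rng.sample(population, 2)
            child = crossover(parent_a, parent_b, rng)
        else:
            child = mutate(rng.choice(population), rng)
        calls += 1  # each mutation or crossover counts as one generator call
        if child not in population:
            population.append(child)
        # Elitist truncation: keep only the fittest candidates.
        population = sorted(population, key=fitness, reverse=True)[:pop_size]
    return population[0], calls

# Toy stand-ins for the LLM operators and the semantic check.
TARGET = "a + b = b + a"
ALPHABET = "ab+= "

def toy_fitness(candidate):
    return sum(x == y for x, y in zip(candidate, TARGET))

def toy_mutate(candidate, rng):
    i = rng.randrange(len(candidate))
    return candidate[:i] + rng.choice(ALPHABET) + candidate[i + 1:]

def toy_crossover(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

seeds = ["a = a = a = a", "b + b + b + b"]
best, calls_used = evolve(seeds, toy_mutate, toy_crossover, toy_fitness, budget=100)
print(best, toy_fitness(best), calls_used)
```

Counting every operator call against the budget makes runs comparable across methods, which is presumably why the paper reports results at a fixed T.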