New Research Examines Methods for Enhancing Large Language Models Through Skill Transfer and Improved Training

Four new papers explore techniques for improving LLM capabilities, from cross-modal skill injection to optimal learning rates and aggregation methods.

Researchers have published four studies examining different approaches to improving the capabilities and efficiency of large language models.

According to a paper on arxiv.org, investigators studied cross-modal skill injection, which aims to transfer domain-specific expertise from large language models (LLMs) to vision-language models (VLMs) without additional training data or significant computational overhead. The research found that cross-modal skill injection “generally performs well in instruction-following and cross-lingual settings, yet struggles with mathematical reasoning,” and that classic merging approaches such as Task Arithmetic (TA) and DARE “consistently achieve superior performance over alternative merging methods.”
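
For readers unfamiliar with the two merging methods named above, the sketch below shows the general idea on toy data. It is an illustration under stated assumptions, not the paper's implementation: it assumes the expert LLM and the VLM's language backbone share parameter names and shapes with a common base model, uses flat Python lists in place of tensors, and every weight value and function name (`task_vector`, `dare_sparsify`, `inject`) is made up for the example.

```python
import random

def task_vector(finetuned, base):
    """Per-parameter difference between fine-tuned and base weights."""
    return {name: [f - b for f, b in zip(finetuned[name], base[name])]
            for name in base}

def dare_sparsify(delta, drop_rate, rng):
    """Randomly zero out delta entries and rescale survivors by 1 / (1 - drop_rate)."""
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    keep = 1.0 - drop_rate
    return {name: [d / keep if rng.random() < keep else 0.0 for d in values]
            for name, values in delta.items()}

def inject(target, delta, scale=1.0):
    """Add a scaled task vector to the target model's matching parameters."""
    return {name: [t + scale * d for t, d in zip(target[name], delta[name])]
            for name in target}

# Toy weights (made-up numbers; flat lists stand in for tensors).
base_llm = {"layer.w": [1.0, 2.0, 3.0, 4.0]}
expert_llm = {"layer.w": [1.5, 1.0, 3.0, 4.25]}    # hypothetical domain expert
vlm_backbone = {"layer.w": [1.1, 2.2, 2.9, 4.0]}   # hypothetical VLM language weights

delta = task_vector(expert_llm, base_llm)

# Task Arithmetic: add the expert's delta directly to the VLM backbone.
merged_ta = inject(vlm_backbone, delta, scale=1.0)

# DARE-style: sparsify and rescale the delta first, then add it.
merged_dare = inject(vlm_backbone, dare_sparsify(delta, 0.5, random.Random(0)))
```

Neither step involves gradient updates or training data, which is consistent with the paper's framing of skill injection as low-overhead. The scaling factor and drop rate here are arbitrary illustrative choices.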

Another arxiv.org paper, accepted to ICML 2026, addressed how to aggregate answers from multiple LLMs beyond simple majority voting. The researchers introduced two algorithms called Optimal Weight and Inverse Surprising Popularity that leverage “both first-order and second-order information” and “consistently outperform standard baselines” across benchmarks including UltraFeedback, MMLU, and real-world healthcare applications.
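
To illustrate why aggregation can beat simple majority voting, the sketch below compares a majority vote with a textbook accuracy-weighted vote that uses log-odds weights. This is not the paper's Optimal Weight or Inverse Surprising Popularity algorithm, whose details are not given here; it is a classic baseline that assumes the models err independently and that per-model accuracy estimates (the made-up numbers below) are available.

```python
import math
from collections import Counter, defaultdict

def majority_vote(answers):
    """Return the most frequent answer (the baseline)."""
    return Counter(answers).most_common(1)[0][0]

def weighted_vote(answers, accuracies):
    """Weight each model's answer by the log-odds of its estimated accuracy."""
    scores = defaultdict(float)
    for answer, acc in zip(answers, accuracies):
        if not 0.0 < acc < 1.0:
            raise ValueError("accuracy estimates must lie strictly between 0 and 1")
        scores[answer] += math.log(acc / (1.0 - acc))
    return max(scores, key=scores.get)

# Three hypothetical models answer one multiple-choice question.
answers = ["A", "B", "B"]
accuracies = [0.95, 0.60, 0.60]  # assumed estimates, not measured values

baseline = majority_vote(answers)
weighted = weighted_vote(answers, accuracies)
```

In this toy case the two weaker models outvote the stronger one under majority voting, while the weighted vote sides with the stronger model. When all accuracy estimates are equal, the weighted vote reduces to the majority vote.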

A separate study on arxiv.org challenged assumptions about Low-Rank Adaptation (LoRA) variants. Through extensive hyperparameter searches, researchers found that “once learning rates are properly tuned, all methods achieve similar peak performance (within 1-2%),” suggesting that “vanilla LoRA remains a competitive baseline.”
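
The point about tuning can be made concrete with a small sketch. The code below sweeps a log-spaced learning-rate grid separately for each method and compares the methods at their own best setting rather than at one shared learning rate. The two score curves are synthetic stand-ins invented for this example, as are the method names and grid; they are not results from the study.

```python
import math

def tune(evaluate, learning_rates):
    """Return (best_lr, best_score) for one method over a learning-rate grid."""
    scores = {lr: evaluate(lr) for lr in learning_rates}
    best_lr = max(scores, key=scores.get)
    return best_lr, scores[best_lr]

def toy_curve(peak_lr, peak_score):
    """Build a synthetic score curve that falls off away from peak_lr (illustrative only)."""
    def evaluate(lr):
        distance = math.log10(lr) - math.log10(peak_lr)
        return peak_score - 0.05 * distance ** 2
    return evaluate

# Hypothetical methods whose best learning rates differ by a factor of ten.
methods = {
    "method_a": toy_curve(peak_lr=1e-4, peak_score=0.80),
    "method_b": toy_curve(peak_lr=1e-3, peak_score=0.81),
}
grid = [1e-5, 1e-4, 1e-3]

# Comparison at a single shared learning rate.
shared = {name: fn(1e-4) for name, fn in methods.items()}

# Comparison after tuning each method on its own.
tuned = {name: tune(fn, grid) for name, fn in methods.items()}

shared_gap = abs(shared["method_a"] - shared["method_b"])
tuned_gap = abs(tuned["method_a"][1] - tuned["method_b"][1])
```

In this synthetic setup the gap between the two methods shrinks once each is evaluated at its own best learning rate, which is the kind of effect the study describes. In practice `evaluate` would be a full fine-tuning and evaluation run.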

Finally, according to arxiv.org, researchers introduced “Formal Skill,” a runtime-native abstraction for representing reusable LLM agent capabilities with JSON metadata and Python executors. The abstraction is implemented in an open-source system called FairyClaw, which achieved competitive results on Harness-Bench “while using substantially fewer tokens.”
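
As a rough picture of what pairing JSON metadata with a Python executor can look like, consider the sketch below. It is a hypothetical illustration only: the metadata fields, the `REGISTRY` structure, and the `run_skill` helper are invented for this example and do not reflect FairyClaw's actual schema or API.

```python
import json

# Hypothetical skill metadata; field names are assumptions for illustration.
SKILL_METADATA = json.loads("""
{
  "name": "word_count",
  "description": "Count whitespace-separated words in a text.",
  "parameters": {"text": "string"}
}
""")

def word_count(text: str) -> int:
    """Executor: the Python function that actually performs the skill."""
    return len(text.split())

# Map each skill name to its metadata and executor.
REGISTRY = {SKILL_METADATA["name"]: (SKILL_METADATA, word_count)}

def run_skill(name, **kwargs):
    """Check the call against the declared parameters, then run the executor."""
    metadata, executor = REGISTRY[name]
    missing = set(metadata["parameters"]) - set(kwargs)
    if missing:
        raise ValueError(f"missing parameters for {name}: {sorted(missing)}")
    return executor(**kwargs)

result = run_skill("word_count", text="reusable agent capabilities")
```

One plausible benefit of this split is that an agent only needs the compact metadata in its context to decide which skill to call, while the executor runs outside the model. Whether that accounts for the reported token savings is not stated in the summary above.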