Four research papers published to arXiv on April 2, 2026, present distinct approaches to improving AI model performance and efficiency.
Brainstacks: Modular Continual Learning
According to arxiv.org, researchers introduced Brainstacks, a modular architecture for continual multi-domain fine-tuning of large language models. The system uses frozen adapter stacks that compose additively at inference, incorporating MoE-LoRA with top-2 routing and QLoRA 4-bit quantization. Testing on TinyLlama-1.1B and Gemma 3 12B IT showed MoE-LoRA achieved “2.5x faster convergence than parameter-matched single LoRA.” The researchers reported that “domain stacks encode transferable cognitive primitives (instruction-following clarity, numerical reasoning, procedural logic, chain-of-thought structure) rather than domain-specific knowledge,” with medical prompts routing to chat+math stacks in 97% of cases.
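To make the routing and composition ideas above concrete, here is a minimal pure-Python sketch of top-2 expert routing over LoRA-style low-rank pairs, with several frozen stacks added onto a base layer's output. It is an illustration under stated assumptions, not the Brainstacks implementation: the function names (`top2_route`, `moe_lora_delta`, `forward`), the per-stack router matrix, and the choice to softmax over only the two selected experts are all hypothetical, and 4-bit quantization is left out entirely.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def top2_route(logits):
    """Pick the two highest-scoring experts and softmax over just those two.

    Assumption: renormalising over the selected pair is one common choice;
    the paper's exact gating rule is not given in the summary above.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:2]
    peak = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_lora_delta(x, experts, router):
    """Weighted sum of B @ (A @ x) over the two routed experts.

    experts: list of (A, B) pairs, A is rank x d_in, B is d_out x rank.
    router:  hypothetical matrix producing one logit per expert from x.
    """
    logits = matvec(router, x)
    delta = [0.0] * len(experts[0][1])
    for index, weight in top2_route(logits):
        a, b = experts[index]
        update = matvec(b, matvec(a, x))
        delta = [d + weight * u for d, u in zip(delta, update)]
    return delta


def forward(x, base_weight, stacks):
    """Frozen base output plus the additive contribution of every stack."""
    out = matvec(base_weight, x)
    for experts, router in stacks:
        delta = moe_lora_delta(x, experts, router)
        out = [o + d for o, d in zip(out, delta)]
    return out
```

Because each stack only adds a delta to the base output, a new domain stack can be appended to `stacks` without touching the ones already trained, which is the property the additive composition described above relies on.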
Science-T2I: Scientific Accuracy in Images
According to arxiv.org, researchers addressed “scientific illusions” in image generation with Science-T2I, featuring over 20,000 adversarial image pairs across 16 scientific domains. The study found that none of 18 evaluated models scored above 50 out of 100 on implicit scientific prompts, while explicit prompts yielded scores “roughly 35 points higher.” The researchers developed SciScore, a CLIP-H-based reward model described as “surpassing GPT-4o and experienced human evaluators by roughly 5 points,” and reported a “relative improvement exceeding 50%” when it was applied to FLUX.1[dev].
PixelPrune: Visual Token Reduction
According to arxiv.org, PixelPrune exploits pixel-level redundancy in document and GUI applications, where “only 22–71% of image patches are pixel-unique.” The training-free method prunes redundant patches before Vision Transformer encoding, delivering “up to 4.2× inference speedup and 1.9× training acceleration” while maintaining competitive accuracy.
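The idea of dropping pixel-duplicate patches before encoding can be sketched in a few lines. The example below is a simplified illustration, not PixelPrune's actual algorithm: it assumes exact-match deduplication over non-overlapping square tiles, keeps the first occurrence of each tile along with its grid position, and uses hypothetical names (`split_patches`, `prune_duplicate_patches`). How the real method selects patches and handles positions may differ.

```python
def split_patches(image, patch):
    """Yield ((grid_row, grid_col), tile) for non-overlapping patch x patch tiles.

    image is a list of rows of pixel values; tiles are nested tuples so they
    can be hashed. Edge pixels that do not fill a whole tile are ignored.
    """
    height, width = len(image), len(image[0])
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            tile = tuple(
                tuple(image[top + dy][left:left + patch]) for dy in range(patch)
            )
            yield (top // patch, left // patch), tile


def prune_duplicate_patches(image, patch):
    """Keep only the first occurrence of each pixel-identical tile.

    Returns (kept, total): kept is a list of (position, tile) pairs that
    would be passed on to the encoder, total is the original tile count.
    """
    seen = set()
    kept = []
    total = 0
    for position, tile in split_patches(image, patch):
        total += 1
        if tile not in seen:
            seen.add(tile)
            kept.append((position, tile))
    return kept, total


# A mostly blank 4x4 "page" with one distinct 2x2 region.
page = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kept, total = prune_duplicate_patches(page, 2)
unique_fraction = len(kept) / total
```

On this toy page, two of the four tiles are pixel-unique, so only half the patches would reach the encoder. Documents and GUI screenshots with large uniform regions are the cases where this kind of redundancy is most likely to appear.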
ORCA: Test-Time Calibration
According to arxiv.org, Online Reasoning Calibration (ORCA) uses meta-learning to update calibration modules for each input. At risk level δ=0.1, ORCA improved the efficiency of Qwen2.5-32B, with “savings up to 47.5% with supervised labels and 40.7% with self-consistency labels,” and raised savings on MATH-500 from 24.8% to 67.0% in zero-shot settings.