Three research papers published on arXiv on May 18, 2026, present advances in large language model applications and training methodologies.
Cross-Domain Recommendation Enhancement
According to arxiv.org, researchers introduced LLM-EDT (Large Language Model Enhanced Cross-domain Sequential Recommendation with Dual-phase Training), which addresses challenges in Cross-domain Sequential Recommendation (CDSR) systems. The paper identifies two key issues: an “imbalance issue” where interactions in one domain dominate user behavior, and a “transition issue” affecting cross-domain preference capture. The proposed system includes a “transferable item augmenter” to generate cross-domain behaviors and a “domain-aware profiling module” to create comprehensive user profiles. The code has been released online, according to the paper.
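To make the "imbalance issue" concrete, here is a minimal sketch that measures how much of one user's history falls in each domain. It is an illustration only, not code from the LLM-EDT paper: the `domain_shares` helper, the `(domain, item_id)` tuple format and the sample history are all hypothetical.

```python
from collections import Counter


def domain_shares(interactions):
    """Return each domain's fraction of a user's interactions.

    `interactions` is a hypothetical list of (domain, item_id) pairs;
    this format is an assumption, not the paper's data schema.
    """
    counts = Counter(domain for domain, _item in interactions)
    total = sum(counts.values())
    return {domain: n / total for domain, n in counts.items()}


# Hypothetical user history in time order.
history = [
    ("books", "b1"),
    ("books", "b2"),
    ("movies", "m1"),
    ("books", "b3"),
]

shares = domain_shares(history)
# Three of the four interactions are in "books", so that domain dominates.
print(shares)
```

A history in which one domain holds most of the interactions, as "books" does here, is the kind of skew the paper describes.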
Learning from User Interaction Logs
A separate arxiv.org paper presents UNO (User log-driveN Optimization), a framework for improving LLM systems using user interaction logs. According to the researchers, the method addresses challenges in learning from “unstructured and noisy” user logs by distilling them into semi-structured rules and preference pairs. The paper states that UNO “significantly outperforms Retrieval Augmented Generation (RAG) and memory-based baselines,” with code open-sourced on GitHub.
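As a rough illustration of what a preference pair drawn from logs might look like, the sketch below pairs accepted and rejected responses to the same prompt. It is not the UNO implementation: the log fields (`prompt`, `response`, `feedback`), the `PreferencePair` class and the pairing rule are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    """Hypothetical container: one preferred and one dispreferred response."""
    prompt: str
    chosen: str
    rejected: str


def pairs_from_log(entries):
    """Pair every accepted response with every rejected one per prompt.

    Assumes each entry is a dict with "prompt", "response" and a
    "feedback" value of "accept" or "reject"; other feedback is ignored.
    """
    by_prompt = {}
    for entry in entries:
        bucket = by_prompt.setdefault(
            entry["prompt"], {"accept": [], "reject": []}
        )
        if entry["feedback"] in bucket:
            bucket[entry["feedback"]].append(entry["response"])

    pairs = []
    for prompt, bucket in by_prompt.items():
        for good in bucket["accept"]:
            for bad in bucket["reject"]:
                pairs.append(PreferencePair(prompt, good, bad))
    return pairs


# Hypothetical log: one rejected answer, one accepted, one with no signal.
log = [
    {"prompt": "q", "response": "a1", "feedback": "reject"},
    {"prompt": "q", "response": "a2", "feedback": "accept"},
    {"prompt": "q", "response": "a3", "feedback": "none"},
]
pairs = pairs_from_log(log)
```

Real logs rarely carry explicit accept or reject labels, which is part of why the paper calls them "unstructured and noisy"; the clean labels here are a simplification.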
Advanced Reasoning Training
According to a third arxiv.org paper, researchers developed a four-stage post-training workflow for LLM reasoning. Using a Qwen3-1.7B model, the workflow achieved 79.3% on MATH and 25.2% on AIME 2024, compared with 75.9% and 19.8%, respectively, for direct GRPO training, a gain of 3.4 and 5.4 percentage points. The four stages are sparse-reward reinforcement learning, forward-KL warmup, on-policy distillation, and an optional additional round of RL training.
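For readers unfamiliar with the term, forward KL divergence is KL(p || q) = sum over x of p(x) log(p(x) / q(x)), where p is the reference distribution and q is the model being fitted. The sketch below computes it for two small categorical distributions. Treating p as a teacher and q as a student is an assumption for illustration, and the `forward_kl` helper and the example probabilities are hypothetical, not taken from the paper.

```python
import math


def forward_kl(p, q):
    """Forward KL divergence KL(p || q) for two categorical distributions.

    Terms with p_i == 0 contribute nothing by convention. Assumes q_i > 0
    wherever p_i > 0; otherwise the divergence is infinite.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical next-token distributions over a two-token vocabulary.
teacher = [0.9, 0.1]  # assumed reference distribution p
student = [0.5, 0.5]  # assumed model distribution q

kl_forward = forward_kl(teacher, student)
kl_reverse = forward_kl(student, teacher)  # arguments swapped: reverse KL

# Identical distributions have zero divergence.
kl_same = forward_kl(teacher, teacher)
```

The divergence is asymmetric, so swapping the arguments gives a different value, which is why the direction is worth naming.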