Three New arXiv Papers Explore LLM Applications in Time Series, Gaming, and Medical Practice

Three new preprints on arXiv address different challenges in applying Large Language Models (LLMs) to specialized domains: time series forecasting, decision-making in games, and medical practice.

Time Series Forecasting: The preprint arXiv:2512.04871v1 introduces STELLA, a system designed to guide LLMs in time series forecasting using semantic abstractions. The paper notes that “recent adaptations of Large Language Models (LLMs) for time series forecasting often fail to effectively enhance information for raw series, leaving LLM reasoning capabilities underutilized.” According to the authors, existing prompting strategies rely on static correlations.

Gaming Decision-Making: A paper on what-if analysis (arXiv:2509.04791v2) examines LLM limitations in gaming environments. According to the abstract, while “Large Language Models (LLMs) are effective at reasoning and information retrieval,” they “remain unreliable for decision-making in dynamic, partially observable, high-stakes environments such as MOBA games.” The researchers identify “weak counterfactual reasoning” as a key limitation.

Medical Applications: Research on multidimensional rubric-oriented reward model learning (arXiv:2511.16139v3) addresses LLM integration in medical practice. The paper identifies critical alignment issues, including “a misalignment between static evaluation benchmarks” and real-world clinical needs. The source material does not provide further details.