Three recent papers on arXiv examine critical challenges facing large language models (LLMs) across different domains.
A study titled “Memorization in Large Language Models in Medicine” (arXiv:2509.08604v3) investigates memorization patterns in LLMs adapted for medical applications. According to the abstract, the research examines the prevalence, characteristics, and implications of memorization in models that have undergone continued pre-training or fine-tuning on medical data to enhance “domain-specific accuracy and safety.”
A second paper, “Jailbreaking Large Language Models through Iterative Tool-Disguised Attacks via Reinforcement Learning” (arXiv:2601.05466v1), addresses security vulnerabilities in LLMs. The research focuses on jailbreak attacks that can “elicit harmful responses violating human values and safety guidelines,” despite LLMs demonstrating “remarkable capabilities across diverse applications.”
The third study, “An Evaluation on Large Language Model Outputs: Discourse and Memorization” (arXiv:2304.08637v2), presents an empirical evaluation of outputs from nine widely available LLMs. According to the abstract, the analysis was conducted using “off-the-shelf, readily-available tools” and identified “a correlation between percentage of m[emorization]” in the outputs, though the quoted excerpt of the abstract is cut off at that point.
These papers collectively highlight ongoing research into LLM reliability, safety, and memorization behaviors across different application contexts.