Three New arXiv Papers Address LLM Quantization and Specialized Applications

Researchers publish papers on pedestrian behavior inference using LLMs, self-explanation capabilities after quantization, and ultra-low-bit quantization methods.

Three new research papers on arXiv explore different aspects of large language model (LLM) capabilities and optimization.

The first paper (arXiv:2601.00694v1) presents a vision-and-knowledge enhanced LLM for inferring pedestrian crossing behavior. The paper states that existing approaches, including statistical models and supervised learning methods, “demonstrate limited generalizability and perform inadequately on new sites.”

The second paper (arXiv:2601.00282v1) investigates how quantization affects LLMs’ ability to generate self-explanations, the justifications that models provide for their own outputs. The abstract notes that “quantization is widely used to accelerate inference and streamline the deployment of large language models,” but says its effect on self-explanations, which “require reasoning about model outputs,” has not previously been explored.

The third paper (arXiv:2509.16989v3) presents PTQTP (Post-Training Quantization to Trit-Planes), a method for quantizing LLMs to extremely low bit-widths. According to the researchers, post-training quantization of LLMs “to extremely low bit-widths remains challenging due to the fundamental trade-off between computational efficiency and representational capacity.”

Together, the papers reflect ongoing research in two directions: extending LLMs to specialized domains and improving their computational efficiency through quantization.