According to a paper published on arXiv, SkinGPT-X is a multimodal, collaborative multi-agent system for dermatological diagnosis that integrates a “self-evolving dermatological memory mechanism.” The system simulates the diagnostic workflow of dermatologists, with the aim of delivering transparent and trustworthy diagnoses for complex and rare dermatological cases.
The researchers validated SkinGPT-X through a three-tier comparative experiment. According to the arXiv paper, when benchmarked against four state-of-the-art large language models across four public datasets, SkinGPT-X improved accuracy by 9.6% on the DDI31 dataset and weighted F1 by 13% on Dermnet over existing state-of-the-art models.
For fine-grained classification, the team constructed a large-scale multi-class dataset covering 498 distinct dermatological categories. The researchers also curated what they describe as “the first benchmark to address the scarcity of clinical rare skin diseases,” comprising 564 clinical samples across eight rare dermatological diseases. On this rare-disease dataset, SkinGPT-X improved accuracy by 9.8%, weighted F1 by 7.1%, and Cohen’s Kappa by 10%.
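For readers unfamiliar with the three reported metrics, the following is a minimal sketch of their standard textbook definitions. It is not the paper's evaluation code; the function names, class labels, and toy predictions are hypothetical and chosen only for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's share of true samples."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


def cohens_kappa(y_true, y_pred):
    """Agreement between predictions and labels, corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that both pick the same class independently.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1.0:
        return 1.0  # degenerate case: only one class appears on both sides
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy labels (not data from the paper).
y_true = ["eczema", "eczema", "psoriasis", "psoriasis", "vitiligo", "vitiligo"]
y_pred = ["eczema", "eczema", "psoriasis", "vitiligo", "vitiligo", "vitiligo"]

print(f"accuracy:    {accuracy(y_true, y_pred):.3f}")
print(f"weighted F1: {weighted_f1(y_true, y_pred):.3f}")
print(f"kappa:       {cohens_kappa(y_true, y_pred):.3f}")
```

Weighted F1 and Cohen's Kappa are commonly reported alongside accuracy on datasets with many imbalanced classes, because accuracy alone can look strong when a model mostly predicts the frequent categories.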
The research addresses limitations in existing systems, which, according to the paper, “frequently struggle with fine-grained, large-scale multi-class diagnostic tasks and rare skin disease diagnosis owing to training data sparsity, while also lacking the interpretability and traceability essential for clinical reasoning.”