AI Pathology Foundation Models Struggle with Inter-Center Variability

Artificial Intelligence in Pathology: Challenges in the Robustness of Foundation Models
Foundation Models (FMs) in Artificial Intelligence (AI) offer promising possibilities for medical diagnostics, particularly in pathology. However, before these models can be used in clinical practice, their robustness against variations between different medical centers must be ensured. Differences in staining procedures, scanners, and other factors can lead to so-called "Medical Center Signatures," which influence the results of the FMs.
A recent study (de Jong, Marcus, & Teuwen, 2025) investigates the robustness of ten publicly available pathology FMs and finds that all of them strongly encode the medical center from which the data originates. For nine of the ten models, the medical center is represented even more strongly than biological information such as tissue type or cancer type. This raises the question of whether the FMs are actually recognizing biological features or are instead reacting to technical differences between the centers.
To quantify robustness, the study introduces the "Robustness Index," which indicates the extent to which biological features dominate over confounding medical-center features in a model's representations. An index greater than one means that biological features dominate. Of the ten models examined, only one achieved a robustness index above one, and even then only barely.
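To make the idea of such an index concrete, here is a minimal sketch in Python. It assumes a k-nearest-neighbor formulation: for every image, count the neighbors that share its biological class but come from another center, and divide by the number of neighbors that share its center but belong to another class. The function name, the toy data, and this exact formulation are illustrative assumptions; the study's own definition may differ in its details.

```python
import math

def robustness_index(embeddings, bio_labels, center_labels, k=1):
    """Hypothetical kNN-based ratio: biology-driven vs. center-driven neighbors."""
    bio_hits = 0      # neighbors with same biological class, other center
    center_hits = 0   # neighbors with same center, other biological class
    n = len(embeddings)
    for i in range(n):
        # Rank all other samples by Euclidean distance to sample i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(embeddings[i], embeddings[j]),
        )
        for j in others[:k]:
            same_bio = bio_labels[j] == bio_labels[i]
            same_center = center_labels[j] == center_labels[i]
            if same_bio and not same_center:
                bio_hits += 1
            elif same_center and not same_bio:
                center_hits += 1
    if center_hits == 0:
        return math.inf  # no center-driven neighbors at all
    return bio_hits / center_hits

# Made-up 2D embeddings in which the medical center dominates the geometry.
embeddings = [(0.0, 0.0), (0.1, 1.0), (5.0, 0.0), (5.1, 1.0)]
tissues = ["tumor", "normal", "tumor", "normal"]
centers = ["A", "A", "B", "B"]

index = robustness_index(embeddings, tissues, centers, k=1)
print(index)  # 0.0
```

In this toy example every nearest neighbor comes from the same center but a different tissue class, so the index is 0.0, far below the threshold of one. If the embeddings were instead organized by tissue type, the same function would return a value above one.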
The study also analyzes how the lack of robustness affects the classification performance of downstream models. It shows that errors in cancer type classification do not occur randomly but are specifically attributable to confounding by medical center: images that belong to other classes but come from the same center are incorrectly assigned to the class being predicted.
Visualizations of the FM embedding spaces illustrate this problem. The embeddings are organized more strongly by medical center than by biological factors. Consequently, the center of origin of an image can be predicted more accurately than its tissue type or cancer type.
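One simple way to check this kind of claim on a set of embeddings is to compare how well a basic classifier predicts the center versus the biological label. The sketch below uses a leave-one-out 1-nearest-neighbor classifier on made-up 2D points; the function name and the toy data are assumptions for illustration and do not reproduce the study's evaluation protocol.

```python
import math

def loo_1nn_accuracy(embeddings, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier."""
    correct = 0
    for i, query in enumerate(embeddings):
        # Find the closest other sample and check whether its label matches.
        nearest = min(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: math.dist(query, embeddings[j]),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(embeddings)

# Hypothetical toy embeddings: the first axis separates the centers widely,
# the second axis separates the tissue types by a much smaller margin.
embeddings = [(0.0, 0.0), (0.1, 1.0), (5.0, 0.0), (5.1, 1.0)]
centers = ["A", "A", "B", "B"]
tissues = ["tumor", "normal", "tumor", "normal"]

center_acc = loo_1nn_accuracy(embeddings, centers)
tissue_acc = loo_1nn_accuracy(embeddings, tissues)
print(f"center accuracy: {center_acc:.2f}, tissue accuracy: {tissue_acc:.2f}")
```

On this toy data the center is predicted perfectly while the tissue type is never predicted correctly, an exaggerated version of the pattern the study reports for real FM embeddings.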
These results underscore the importance of developing more robust pathology FMs. The robustness index provides a valuable metric to measure progress in this area and promote the development of reliable AI models for medical diagnostics. Future research should focus on improving the robustness of FMs against variations between medical centers to enable their clinical application.
For companies like Mindverse, which develop AI-powered solutions for various fields, these findings are particularly relevant. The development of robust and reliable AI models is crucial for successful practical application, especially in the sensitive field of medical diagnostics. Considering factors such as the differences between medical centers is therefore essential to create trustworthy and effective AI solutions.
Bibliography:
- de Jong, E. D., Marcus, E., & Teuwen, J. (2025). Current Pathology Foundation Models are unrobust to Medical Center Differences. arXiv preprint arXiv:2501.18055.
- https://www.linkedin.com/posts/jonasteuwen_current-pathology-foundation-models-are-unrobust-activity-7291079777015267331-qEej
- https://www.jmaj.jp/detail.php?id=10.31662%2Fjmaj.2024-0206
- https://www.researchgate.net/publication/382738982_Pathology_Foundation_Models
- https://www.researchgate.net/publication/382738982_Pathology_Foundation_Models/fulltext/66aafe2f75fcd863e5ed5bac/Pathology-Foundation-Models.pdf?origin=scientificContributions
- https://www.nature.com/articles/s41591-024-03141-0
- https://www.sciencedirect.com/science/article/pii/S0010482524014124
- https://www.modernpathology.org/article/S0893-3952(25)00011-0/pdf
- https://www.nature.com/articles/s41586-023-05881-4
- https://pmc.ncbi.nlm.nih.gov/articles/PMC11384335/