FIGURE 1.
Workflow of the deep learning model. (A) Data Source and Division: This study used 5062 H&E-stained WSIs from four centers. Data from Liuzhou Hospital served as the internal dataset for model training, and data from Xijing Hospital were used as an external test set. Two publicly available datasets were used to construct additional external datasets for evaluating the model’s generalization capability. (B) Construction and Optimization of the Encoders: The image encoder and text encoder were trained by contrastive learning on large-scale pathology image-text pairs. Pathology experts refined the text content so that the encoders capture more robust pathological representations, which improves the model’s performance in practical applications. (C) Data Preprocessing: After the slides were digitized, tissue regions were segmented and each WSI was divided into patches for subsequent feature extraction and analysis. (D) Model Computation Process: The core computation of the deep learning model has three stages: (1) image-based slide-level feature generation and prediction; (2) text-based slide-level feature generation and prediction; (3) loss calculation, dynamically adjusted according to the loss gradient to balance the contributions of image and text features and thereby optimize the final classification performance.
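
The patch division in panel (C) can be illustrated with a minimal sketch. It assumes a binary tissue mask has already been produced by the segmentation step, and that patches are non-overlapping squares kept only when enough of their area is tissue. The function name `tile_coordinates`, the `min_tissue_fraction` threshold, and the toy 8 x 8 mask are hypothetical and are not taken from the study.

```python
def tile_coordinates(tissue_mask, patch_size, min_tissue_fraction=0.5):
    """Return top-left (row, col) corners of non-overlapping patches
    whose tissue fraction is at least min_tissue_fraction.

    tissue_mask is a 2D list of 0/1 values (1 = tissue). The threshold
    is an assumed, illustrative value, not one reported in the study.
    """
    height = len(tissue_mask)
    width = len(tissue_mask[0]) if height else 0
    coords = []
    # Step across the mask in strides of patch_size; partial patches at
    # the right and bottom edges are skipped.
    for row in range(0, height - patch_size + 1, patch_size):
        for col in range(0, width - patch_size + 1, patch_size):
            tissue_pixels = sum(
                tissue_mask[r][c]
                for r in range(row, row + patch_size)
                for c in range(col, col + patch_size)
            )
            if tissue_pixels / (patch_size * patch_size) >= min_tissue_fraction:
                coords.append((row, col))
    return coords


# Toy mask: full tissue in the top-left quadrant, half tissue in the
# bottom-right quadrant, background elsewhere.
mask = [[0] * 8 for _ in range(8)]
for r in range(4):
    for c in range(4):
        mask[r][c] = 1
for r in range(4, 8):
    for c in range(4, 6):
        mask[r][c] = 1

patches = tile_coordinates(mask, patch_size=4)
print(patches)
```

In practice the mask would come from the tissue segmentation of a downsampled slide, and the retained coordinates would be mapped back to full resolution before the patches are passed to the image encoder.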
