Abstract
Breast cancer diagnosis is crucial due to the high prevalence and mortality rate associated with the disease. However, mammography involves ionizing radiation and has compromised sensitivity in radiographically dense breasts, ultrasonography lacks specificity and has operator-dependent image quality, and magnetic resonance imaging faces high cost and patient exclusion. Photoacoustic computed tomography (PACT) offers a promising solution by combining light and ultrasound for high-resolution imaging that detects tumour-related vasculature changes. Here we introduce a workflow using panoramic PACT for breast lesion characterization, offering detailed visualization of vasculature irrespective of breast density. Analysing PACT features of 78 breasts in 39 patients, we develop learning-based classifiers to distinguish between normal and suspicious tissue, achieving a maximum area under the receiver operating characteristic curve of 0.89, which is comparable with that of conventional imaging standards. We further differentiate malignant and benign lesions using 13 features. Finally, we develop a learning-based model to segment breast lesions. Our study identifies PACT as a non-invasive and sensitive imaging tool for breast lesion evaluation.
In the United States, about 12–13% of women will be diagnosed with breast cancer during their lifetime1, and more than half of women will develop benign breast diseases2. Breast cancer is the second most common cause of cancer-related deaths in women, and breast cancer screening and early detection are associated with reduced mortality. A critical need remains for improving accurate classification of breast findings by imaging3. The Breast Imaging Reporting and Data System (BI-RADS) provides standardized terminology and assessment criteria for breast imaging findings, with scores ranging from zero to six that guide clinical management and aid in the diagnosis of breast cancer4. When screening mammography identifies an abnormality, patients typically undergo further diagnostic imaging. For lesions that cannot be cleared as completely benign, women are counselled to undergo either close interval follow-up (BI-RADS 3, probably benign) or breast biopsy (BI-RADS 4, suspicious; BI-RADS 5, highly suspicious for malignancy). Biopsies, although diagnostic, are invasive procedures with associated risks, such as pain, bleeding and scarring5. The side effects, costs and delays of these workups cause additional stress to patients. In addition, mammography involves exposure to ionizing radiation6, and its sensitivity is compromised in women with radiographically dense breast tissue (<50% in heterogeneously dense breasts compared with >60% in fatty breasts)7–9. Therefore, there is an urgent demand for a regular breast imaging method that is both rapid and cost-efficient, providing high diagnostic accuracy without the risks associated with ionizing radiation or the need for contrast agents.
Imaging modalities used in conjunction with mammography include handheld ultrasound and breast magnetic resonance imaging (MRI). Ultrasonography (US) is a widely used clinical tool that serves as an adjunct to mammography by providing valuable morphological and functional insights into breast tissue10. However, the image quality, interpretation and effectiveness of US in lesion characterization rely heavily on the skill and experience of the operator. Moreover, conventional US features are not always conclusive in distinguishing benign and malignant lesions, often necessitating further follow-up or biopsy11–13. Emerging investigational US technologies, such as US elastography and microvascular US, enhance diagnostic capabilities by assessing tissue stiffness and detailed blood flow, but they warrant further validation for routine clinical use14,15. MRI is often used as a screening study for women at high risk of developing breast cancer (>20% lifetime risk). MRI has the advantage of not using ionizing radiation. However, this imaging modality requires administration of intravenous contrast agent, which carries risks such as nephrotoxicity16, and may be problematic for women with claustrophobic tendencies or MRI-incompatible ferromagnetic metal implants17,18. Diffuse optical tomography has been investigated as a new means of providing highly sensitive functional optical contrast. However, its clinical use is limited by its low spatial resolution19. In sum, despite the benefits of the currently available imaging techniques, they also possess drawbacks that can affect their diagnostic performance and patient experience.
Photoacoustic computed tomography (PACT) is emerging as a complementary imaging technique with the potential to overcome many of these limitations20. PACT combines the functional optical contrast of diffuse optical tomography and the high spatial resolution of US21. For breast imaging, PACT offers whole-breast (panoramic) images with rich functional contrasts, high spatial and temporal resolution and non-ionizing optical penetration to depths of up to 4 cm (refs. 22–26). A distinct advantage of PACT is the capability to selectively image various endogenous and exogenous contrast agents by tuning the illumination wavelength. For example, using haemoglobin as an intrinsic contrast agent in the near-infra-red region enables the visualization of angiogenesis27–29 and hypoxia30,31, both of which are critical in understanding tumour development and metastatic processes32. Moreover, PACT relies solely on light absorption, minimizing unwanted background noise from surrounding tissues and avoiding speckle artefacts, which enhances sensitivity in detecting small vessels33. There have been qualitative investigations into the appearance of breast lesions in PACT images22,34–37, and several PACT image features have been developed and proven to be useful in detecting breast lesions quantitatively38–42. Recently, machine-learning-based techniques have been applied in breast PACT to enhance image quality and diagnosis43–45. However, these studies have encountered several limitations. First, the field of view and image quality can be compromised by the suboptimal angular coverage of commercial ultrasound transducers. Second, a comprehensive feature base to systematically study breast lesions in panoramic PACT images has yet to be developed. Third, there is an absence of quantitative methodologies in PACT to distinguish between suspicious lesions and normal breast tissue. Fourth, there is currently no systematic model for localizing and segmenting the breast lesions in the PACT images. Thus, definitive evidence is still lacking on the clinical potential of breast PACT to enhance diagnosis and monitoring of lesions detected by standard breast imaging46.
Here we conducted a 2-year clinical study of women with abnormal breast mammograms to establish a comprehensive workflow to evaluate the capability of breast PACT in clinical tasks, including classification (that is, suspicious versus normal and benign versus malignant), monitoring (that is, follow-up assessment of the same lesion) and segmentation (that is, localization of lesion and its boundary). Using a panoramic PACT system that captures images of the entire breast with high spatial resolution, we imaged 39 patients (78 breasts). Our PACT system also allows for rapid imaging within a single breath-hold of 13 s without the need for ionizing radiation or exogenous contrast agents. The acquired PACT images were qualitatively assessed and compared with existing corresponding diagnostic images, such as mammography and MRI. We then categorized the breast quadrant images into those representing healthy tissue and those indicating lesions. A total of 42 features were then extracted for a detailed quantitative comparison between the two groups. Machine-learning-based classifiers were then developed to distinguish suspicious-lesion-containing quadrants (SQs) from healthy quadrants (HQs). For lesions categorized as BI-RADS 3 or lower, PACT proves to be an effective tool for regular monitoring. For cases classified as BI-RADS 4 or higher, PACT achieves a maximum area under the receiver operating characteristic curve (AUROC) of 0.89 in discerning between SQs and HQs, which is comparable or superior to that of standard-of-care imaging47,48. Moreover, we selected a subset of the top-13 features from the classifier to avoid overfitting. This subset was then applied to all patients with SQs to distinguish biopsy-confirmed benign lesions from malignant ones. Finally, we investigated the potential of localizing and segmenting the lesions in PACT images using a semi-automatic learning-based model. The panoramic PACT images and their extracted features reveal the breast lesions both qualitatively and quantitatively, showing PACT’s promise as a fast, safe and comfortable imaging modality, eventually leading to a more streamlined and accurate workup.
Results
Patient recruitment and imaging procedure
The patient cohort of our study consists of patients recruited from two clinical studies (Methods). The workflow of patient imaging, feature extraction and classification is shown in Fig. 1. In the patient recruitment phase (Fig. 1a), 39 participants were enrolled based on the study eligibility criterion of having an abnormal mammogram or MRI with a BI-RADS 3 or higher lesion. A detailed summary of the demographic and clinical characteristics of the study participant cohort can be found in Supplementary Table 1.
Fig. 1 |. Patient breast PACT workflow.

a, Participant recruitment. Women with abnormal or suspicious lesions (BI-RADS 3–5) in at least one of their breasts provided consent and were recruited for the study; for those with BI-RADS 4–5 results, baseline PACT imaging was obtained before biopsy. b, Participant imaging. Multiple single-breath-hold scans of each breast were acquired. c, PACT images of the patient. The panoramic PACT images were reconstructed and rendered as 2D MAPs for analysis. Each whole-breast image was divided into four quadrants. d, Image processing and feature extraction. The images were processed to extract various features, such as basic (1D) features and morphological (2D) features. e, Feature comparison. The quadrants were categorized into HQs and SQs in the clinical reports. Statistical differences were investigated to assess PACT’s capability to distinguish the two groups. f, Lesion classification. The features were used to train the learning-based classifiers to differentiate SQs from HQs, and further distinguish the biopsy-proven malignant quadrants (MQs) from the benign ones (BQs). g, Classifier evaluation and feature selection. The classifiers were applied to the testing set to quantitatively evaluate their performances, and the most important features were selected to form an explainable set for quantitative analysis. h, Lesion localization and segmentation. The centroid (centre of mass) of the lesion was automatically detected and the lesion segmentation mask was inferred. LIQ, lower-inner quadrant; LOQ, lower-outer quadrant; UIQ, upper-inner quadrant; UOQ, upper-outer quadrant.
After confirming eligibility and obtaining informed consent, participants were scheduled to undergo PACT of the breast (Fig. 1b). The PACT imaging process used light at 2 different wavelengths, 1,064 nm and 755 nm, which were combined using a dichroic mirror. The light was then diffused and directed to uniformly illuminate the breast (Supplementary Fig. 1). Participants lay on a custom-built bed in the prone position with the breast to be imaged placed in a large aperture. The breast was wrapped in a disposable membrane to support the breast. The photoacoustic (PA) signals were then collected with z-scanning of the full-ring ultrasound transducer array beneath the bed. Each imaging scan took 13 s with breath-hold. Each participant was imaged on both the contralateral unaffected breast and the affected breast in multiple positions per breast. For some patients with BI-RADS 3 and 4 lesions, up to 3 follow-up visits were scheduled ~6, 12 and 24 months after the first visit.
Each patient visit involved up to 2 volumetric scans under 1,064 nm and 755 nm illumination (Fig. 1c). In volumetric scanning, the transducer array was scanned along the elevational direction to form a volumetric (three-dimensional (3D)) image of the breast under dual-wavelength illumination. After acquisition, each 3D image was reconstructed, projected as two-dimensional (2D) maximum amplitude projections (MAPs) and divided into four quadrants for analysis (details in Methods and Supplementary Fig. 2). The panoramic image revealed the breast vasculature non-invasively and was used for both qualitative and quantitative radiological analyses.
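The projection and quadrant division described above can be sketched in a few lines. This is a minimal illustration rather than the study's pipeline: the function names, the axis order (depth, row, column), the generic quadrant labels and the default split at the image centre are all assumptions, and the split actually used is described in Methods.

```python
import numpy as np

def max_amplitude_projection(volume, axis=0):
    """Project a 3D volume to 2D by keeping the largest value along `axis`."""
    return np.max(volume, axis=axis)

def split_into_quadrants(image, centre=None):
    """Split a 2D image into four quadrants around `centre` (row, col).

    Assumption: the split point defaults to the image centre. Mapping the
    generic labels below to UOQ/UIQ/LOQ/LIQ depends on breast laterality.
    """
    rows, cols = image.shape
    r, c = centre if centre is not None else (rows // 2, cols // 2)
    return {
        "upper_left": image[:r, :c],
        "upper_right": image[:r, c:],
        "lower_left": image[r:, :c],
        "lower_right": image[r:, c:],
    }

# Toy volume with axes (depth, row, col) and one bright voxel at depth 2.
volume = np.zeros((4, 6, 8))
volume[2, 1, 5] = 3.0
map_2d = max_amplitude_projection(volume, axis=0)
quadrants = split_into_quadrants(map_2d)
```

In this toy case the bright voxel survives the projection and lands in the upper-right quadrant, whatever its depth.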
The PACT system features deep penetration and high spatial resolution, visualizing vasculatures down to a diameter of ~258 μm and an imaging depth of ~4 cm (ref. 22). Breast- and tumour-mimicking phantoms were prepared23 and imaged as deep as 4 cm (Supplementary Fig. 3). A total of 42 image features were extracted from the images, including basic (one-dimensional (1D)), morphological (2D) and dynamic (2D plus time) features, as shown in Fig. 1d. The features were then categorized according to the lesion information from clinical imaging and compared with the contralateral normal breast (Fig. 1e).
In addition to the direct comparison of features, learning-based binary classifiers were developed to differentiate the breast quadrants as SQs versus HQs (Fig. 1f). The quadrants from all patients were divided into the training, validation, and testing sets, and the performance of the classifiers was evaluated primarily based on the ROC curves (Fig. 1g). Next, 13 features were selected from the classifier to avoid overfitting. The feature subset was then applied to all the patients with SQs to distinguish the biopsy-proven benign lesions from the malignant ones, providing more detailed information for quantitative diagnosis. Finally, learning-based models were combined to automatically localize the centroid of the lesion and perform lesion segmentation (Fig. 1h).
Comparison of PACT with other clinical imaging modalities
The reconstructed images allow for both qualitative and quantitative analyses of breast lesions. Figure 2 exemplifies the results qualitatively.
Fig. 2 |. Representative breast images from PACT and conventional imaging modalities.

Representative images from participants undergoing PACT before neoadjuvant therapy for locally advanced breast cancer (Methods). a,b, Mammography (top), gadolinium-enhanced MRI (second row), depth-encoded PACT (third row) and feature-encoded PACT (bottom) images from Patient 1 (a) and Patient 2 (b), both with IDC, are shown. White solid arrows indicate lesions. Vessels detected by both PACT and MRI are marked by white dotted arrows with numbers. Vessels detected by PACT only are marked by yellow dashed arrows with letters. Nipples are marked by light blue dashed contours. Scale bars, 1 cm. MLO, mediolateral oblique view.
Figure 2a presents a side-by-side comparison of mammography, gadolinium-enhanced MRI (Gd-MRI), depth-encoded PACT and feature-encoded PACT images (using 1,064 nm illumination) for a patient with an invasive ductal carcinoma (IDC) measuring approximately 3.5 cm along its longest axis. The tumour (lower) and its satellite lesion (upper) are highlighted in the images from all three modalities, as indicated by white solid arrows. The volumetric Gd-MRI images are rotated and projected to better correlate with the PACT images (Methods). The surrounding vasculatures in Gd-MRI and PACT images correlate well (indicated by the white dotted arrows) despite differences in soft tissue deformation between PACT and MRI, whereas mammography does not provide vasculature visualization. Moreover, whereas conventional two-dimensional (B-mode) US focuses on the region surrounding the lesion, PACT images the breast panoramically, allowing for inclusion of both the lesion and its satellite region with a single acquisition. Notably, PACT reveals a more detailed vascular network around the tumour (highlighted by the yellow dashed arrows). These penetrating vessels indicate tumour angiogenesis, a critical process in tumour growth and progression27,42,49, and are identified only in PACT images due to PACT’s sensitivity to haemoglobin and high spatial resolution. Although the lesions are revealed in the depth-encoded PACT images, they appear more clearly in the automation-assisted, feature-encoded PACT images (details in later sections).
Figure 2b compares mammography, Gd-MRI, depth-encoded PACT and feature-encoded PACT of another patient’s affected breast with an IDC measuring approximately 4 cm along its long axis. The tumour is identifiable (marked by the white solid arrows) in the images from Gd-MRI and PACT, whereas more extensive vasculatures surrounding the tumour are captured by PACT. Notably, some of the vessels in PACT images appear more tortuous or irregular than the straighter vessels typically associated with benign lesions, serving as another indicator of malignancy (more example images in Supplementary Fig. 4)50. Moreover, the patient has extremely dense breast tissue (classified as level D according to Supplementary Table 4), which tends to obscure tumour visibility in the mammogram. In contrast, PACT effectively visualizes the mass and its surrounding vasculatures, even when mammography is limited by breast density.
Qualitative analysis of the breast lesions
Following the comparison between PACT and the other modalities, we analysed the visual differences across lesions captured in PACT images. Representative images are shown in Fig. 3.
Fig. 3 |. Representative PACT images of breasts with suspicious lesions and follow-up assessments.

a–d, Stromal fibrosis versus IDC. Example PACT images of the regions around two stromal fibrosis lesions (a and b, from two study participants) and two IDCs (c and d, from two study participants). The feeding vessels around the cancers are highlighted by orange lines. e,f, Sequential benign imaging. e, Serial images of a BI-RADS 3 lesion from 3 visits over 1 year in a participant with a benign mass. f, Feature comparison of the same lesion over three visits. n = 6,052 is the resolvable pixel count in the lesion mask. g–i, Sequential imaging of ductal carcinoma in situ. g, First of the serial images of the unaffected (left) and affected (right) breasts of a study participant with a benign mass. h, Serial images of the unaffected (left) and affected (right) breasts of the same patient after 6 months. Changes by standard-of-care imaging at 6 months led to a BI-RADS 4 classification and subsequent biopsy that diagnosed ductal carcinoma in situ; changes by PACT before biopsy are marked by the pink dotted arrows. i, Feature comparison of the same lesion over two visits. n = 13,971 is the resolvable pixel count in the lesion mask. All other lesions are marked by white arrows. Nipples are marked by grey dotted contours. Scale bars, 1 cm. All the P values are computed through two-sided Student’s t-tests. In the violin plots, the white dot represents the median, the thick bar the IQR, the thin line 1.5× the IQR and the side curves the kernel density plots of the data. Identified lesions are indicated by colour using the combined feature value.
Figure 3a,b presents the PACT images of two patients with stromal fibrosis. The presence of fibrosis can lead to architectural distortion of the normal tissue, which makes vessels appear stretched or displaced51, as marked by the white arrows in the images. In comparison, Fig. 3c,d presents the enlarged regions around the IDCs shown in Fig. 2. Compared with fibrosis, the malignant lesions of those patients are highlighted in the PACT images by (1) larger area, (2) enhanced signal amplitude, and more importantly, (3) increased surrounding feeding vessel density. The increased feeding vessels are marked by orange lines in Fig. 3c,d.
Other than cross-sectional imaging of patients from one visit, the accessibility and non-invasiveness of PACT make it well suited for monitoring changes in lesions over extended periods. A total of 26 participants with BI-RADS 3, or BI-RADS 4 with benign biopsies, underwent sequential imaging for up to 2 years from the baseline visit. Follow-up standard-of-care imaging remained benign in 25 patients, with a representative example in Fig. 3e. Figure 3e shows the serial images of the 3 PACT imaging sessions of a patient with a BI-RADS 3 breast mass, conducted at the initial baseline, 6th and 12th month visits, respectively. The follow-up PACT images show no notable increase in vessel density around the lesion over the year, and the comparison of the anisotropy-modulated entropy (AME) feature over three visits does not show significant differences through two-sided t-test, suggesting stability in the lesion’s characteristics over time (Fig. 3f).
Figure 3g,h offers a contrasting example and shows images from 2 visits spaced 6 months apart for a patient initially classified as having a BI-RADS 4 mass during the first visit. The ~5 mm lesion at 11:00 and ~3 cm from the nipple, marked by white arrows, was biopsied and diagnosed as a papillary lesion with atypia. The patient had a delay in clinical follow-up, and a follow-up mammogram at 6 months showed a right breast lesion at 1:00, classified as BI-RADS 4; the biopsy diagnosed ductal carcinoma in situ. This progression is reflected in the PACT images, where subtle differences around the 1:00 region were already visible during the first visit but became more prominent at the second visit (highlighted by pink arrows in Fig. 3g,h, bottom). Notably, the PACT images from the second visit capture an increase in the density and AME of surrounding vessels (Fig. 3i). These cases show the value of PACT in providing insights into the progression or regression of breast lesions through non-invasive monitoring.
Quantitative comparison of PACT features
Following the qualitative analysis, we quantified the sensitivity of PACT to distinguish suspicious from healthy tissues. We divided each whole-breast image into quadrants and batch processed all the images. These quadrants were then categorized into two groups: HQs (tissue characterized as BI-RADS 3 or lower) and SQs (tissue of BI-RADS 4 or higher). From the quadrant images, we extracted three groups of features, summarized as basic (1D), morphological (2D) and dynamic (2D plus time) features (outlined in Fig. 4).
Fig. 4 |. Feature comparison of HQs and lesion quadrants.

a, Feature extraction based on quadrants. Each breast quadrant was captured as a depth-encoded image. b, Extraction of the basic (1D) features. c,d, Examples of the 1D feature comparison using violin plots, based on s.d. (c) and IQR (d), between HQs and SQs. e, Example morphological (2D) feature map of the vessel skeleton, from which the vessel density map was acquired. f,g, Violin plots of vessel density (f) and AME (g). h, Hu moment invariant (HMI)-based feature comparison. i, Extraction of GLCM-based features. j,k, Examples of the GLCM feature comparison using violin plots, based on the contrast (Con; j) and energy (E; k) at 0° orientation of neighbouring pixels, between HQs and SQs. All P values are computed through two-sided permutation tests. nHQ = 554 and nSQ = 121 are the number of quadrant images. In the violin plots, the white dot represents the median, the thick bar the IQR, the thin line 1.5× the IQR and the side curves the kernel density plots of the data.
To compute the basic features, each 2D image was first reshaped into a 1D vector; that is, its spatial information was neglected. The features were then calculated from the pixel value distribution (Fig. 4b, details in Methods). Figure 4c,d exemplifies two such features: image s.d. (Fig. 4c) and image interquartile range (IQR; Fig. 4d). The distribution of each feature across the 2 groups is visualized using violin plots52, and 2-sided permutation tests (10⁴ permutations) were performed against the null hypothesis that the mean value of each feature from SQs is equal to that from HQs53. Significant differences are observed in both features, as illustrated by the small P values. Additional examples of the basic features can be found in Supplementary Fig. 5.
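The basic features and the permutation test can be illustrated with the standard library alone. This sketch makes several assumptions: the function names are hypothetical, the s.d. is the population s.d., the quartiles use Python's default interpolation, the P value uses an add-one correction, and the feature values are invented purely for demonstration.

```python
import random
import statistics

def basic_features(pixels):
    """Return the s.d. and IQR of a flattened list of pixel values."""
    q1, _, q3 = statistics.quantiles(pixels, n=4)
    return {"sd": statistics.pstdev(pixels), "iqr": q3 - q1}

def permutation_test(a, b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabelling of the pooled samples
        diff = abs(statistics.fmean(pooled[:len(a)])
                   - statistics.fmean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated P value strictly positive.
    return (hits + 1) / (n_permutations + 1)

# Invented per-quadrant s.d. values for two well-separated groups.
hq_sd = [0.8, 1.0, 1.1, 0.9, 1.2, 1.0]
sq_sd = [1.9, 2.2, 2.0, 2.4, 2.1, 2.3]
p_value = permutation_test(hq_sd, sq_sd, n_permutations=2000)
```

Taking the absolute difference of means makes the test two-sided, matching the null hypothesis of equal means stated above.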
Morphological features were derived from the 2D feature maps. These maps were generated through scanning window analysis, automatic segmentation, morphological operations or texture analysis. For instance, Fig. 4e shows the vessel skeleton map of the quadrant image in Fig. 4a (left)23, from which the vessel density map was acquired through a scanning window (details in Methods). The features derived from the maps, such as the vessel density and AME, are shown in Fig. 4f,g and Supplementary Fig. 6. From each 2D feature map, we could acquire the basic features, such as mean value and s.d. Moreover, we could apply multilevel threshold-based segmentation before computing the basic features and selecting the most representative ones (Methods and Supplementary Fig. 7). These features provide a more nuanced view of the underlying tissue characteristics in PACT images.
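A scanning-window density map of the kind described above might look as follows. This is a sketch under stated assumptions: density is taken as the fraction of skeleton pixels inside a square window centred on each pixel, the window is odd-sized, the image is zero-padded at the borders, and `vessel_density_map` is a hypothetical name. The definition used in the study is given in Methods.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def vessel_density_map(skeleton, window=5):
    """Fraction of skeleton pixels in an odd-sized window around each pixel."""
    pad = window // 2
    padded = np.pad(skeleton.astype(float), pad, mode="constant")
    windows = sliding_window_view(padded, (window, window))
    return windows.mean(axis=(-2, -1))

# Toy skeleton: a single horizontal vessel centreline.
skeleton = np.zeros((7, 7), dtype=bool)
skeleton[3, :] = True
density = vessel_density_map(skeleton, window=3)
```

Basic statistics such as the mean and s.d. of `density` would then serve as scalar features of the map.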
Beyond the features above, we also investigated features based on the grey-level co-occurrence matrix (GLCM; details in Methods, Fig. 4i–k and Supplementary Fig. 8), nth-order Hu moment invariants (Methods, Fig. 4h and Supplementary Fig. 6) and more. A summary of all features used for comparison or classification can be found in Supplementary Fig. 9 and Supplementary Table 5.
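For readers unfamiliar with GLCM features, the sketch below computes contrast and energy at 0° orientation for an image already quantized to integer grey levels. The function names, the symmetric normalization and the choice of energy as the sum of squared entries are assumptions; some toolboxes report the square root of that sum instead.

```python
import numpy as np

def glcm_0deg(image, levels):
    """Normalized, symmetric GLCM for horizontally adjacent pixels (0 degrees)."""
    left = image[:, :-1].ravel()
    right = image[:, 1:].ravel()
    glcm = np.zeros((levels, levels), dtype=float)
    np.add.at(glcm, (left, right), 1.0)  # count each (left, right) pair
    glcm += glcm.T                       # count pairs in both directions
    return glcm / glcm.sum()

def glcm_contrast(p):
    """Sum of p(i, j) weighted by the squared grey-level difference."""
    i, j = np.indices(p.shape)
    return float(np.sum(p * (i - j) ** 2))

def glcm_energy(p):
    """Sum of squared GLCM entries (assumed definition of energy)."""
    return float(np.sum(p ** 2))

# Vertical stripes: every horizontal neighbour pair differs by one level.
stripes = np.tile(np.array([0, 1, 0, 1]), (4, 1))
p = glcm_0deg(stripes, levels=2)
contrast, energy = glcm_contrast(p), glcm_energy(p)
```

A uniform image would give a contrast of 0 and an energy of 1, whereas the striped image gives 1 and 0.5, which illustrates how the two features respond to texture.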
Learning-based breast lesion classification
As significant differences were found in the studied features, it was advantageous to develop a classifier that integrates multiple features to better distinguish SQs from HQs. A summary of classifier development and performance is shown in Fig. 5.
Fig. 5 |. Classifier training, evaluation and feature selection workflow.

a, Schematic of the feature extraction, feature selection and classification. For each image, 42 features were preselected based on their significance and independence and combined into a single vector for classification. The feature vectors were then split into training, validation and testing sets. From cross-validation, the best model was selected and the features were ranked based on their Gini importance score. The 13 most important features (inset) were selected to further distinguish the biopsy-proven benign lesions from the malignant ones. b, ROC curves of the XGBoost classifier. The solid line corresponds to the ROC curve with the highest AUROC. The dotted lines correspond to the other ROC curves from the five rounds of cross-validation (R1–R5). The dashed line corresponds to the baseline ROC curve from random guess (RG). The cross denotes the optimal operating point (OOP) of the optimal ROC curve82. For the testing set, nHQ = 53 and nSQ = 39 are the number of quadrant images. c, ROC curves based on the sum of the first six principal components (PCs) of biopsy-proven BQs and MQs through PCA. The dotted lines correspond to the lower and upper bounds of the 95% CI. The insets show the violin plots for the feature. The white dot represents the median, the thick bar the IQR, the thin line 1.5× the IQR and the side curves the kernel density plots of the data. The P value is computed through a two-sided permutation test. d, t-Distributed stochastic neighbour embedding (t-SNE) visualization of the clustering of the 13-dimensional features from the BQs and MQs in the 2D subspace. For c and d, nBQ = 85 and nMQ = 56 are the number of quadrant images. Con, contrast; Cor, correlation; H, homogeneity.
On the basis of the quadrants from 39 patients, we extracted and preselected 42 PACT features from each image, forming a feature vector for each quadrant (Fig. 5a). The feature vectors were then divided into training, validation and testing sets for binary classification. We investigated the performance of five classifiers: naive Bayes, random forest, support vector machine, adaptive boosting (AdaBoost)54 and extreme gradient boosting (XGBoost)55. For each classifier, we performed cross-validation by shuffling the training and validation sets five times to estimate its average performance. Among all the evaluation metrics (some of which are summarized in Supplementary Table 6), we focus on the AUROC. To avoid overfitting, we ranked all the features by their averaged Gini importance and retrained the models with increasing numbers of features step by step. The training and validation accuracy and AUROC are shown in Supplementary Fig. 10. Although including more features improves training accuracy, validation accuracy plateaus due to overfitting. The number of features (13) was decided based on the peak of the ratio of validation AUROC (or accuracy) over training AUROC (or accuracy); the Gini importance scores for these features are shown in the inset of Fig. 5a. Among all the models from five cross-validation trials, XGBoost achieved the highest maximal (0.89) and mean (0.87) AUROC (Fig. 5b and Supplementary Fig. 11). More details on the model training and testing can be found in Methods and Supplementary Fig. 12. The AUROCs from most models are above 0.8, indicating the high sensitivity and specificity of the PACT features.
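The feature-count criterion described above (the peak of the validation-to-training AUROC ratio) can be sketched as follows. Both function names are hypothetical, the AUROC is computed from pairwise ranks rather than by the study's software, and the AUROC sequences are invented solely to show the mechanics.

```python
def auroc(scores, labels):
    """AUROC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def pick_feature_count(train_auroc, val_auroc):
    """Return the feature count with the largest validation/training ratio.

    Entry k-1 of each list is the AUROC obtained with the top-k features.
    """
    ratios = [v / t for t, v in zip(train_auroc, val_auroc)]
    return ratios.index(max(ratios)) + 1

# Invented AUROCs for models retrained with the top 1, 2 and 3 features.
train = [0.80, 0.90, 0.99]
val = [0.70, 0.85, 0.86]
best_k = pick_feature_count(train, val)
```

In this toy case the ratio peaks at two features: adding the third raises the training AUROC much more than the validation AUROC, which is the overfitting signature the criterion is meant to catch.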
To show the potential of PACT to reduce benign biopsies, we further examined the data from all patients with SQs. The 13 selected features (Supplementary Fig. 13) quantitatively described each quadrant and were used to differentiate between biopsy-proven malignant and benign lesions. By applying principal component analysis (PCA), we used the sum of the first six principal components as a binary classifier to differentiate the biopsy-proven malignant quadrants from benign quadrants (Fig. 5c). Further, to visualize the clustering capabilities of the feature set, we applied t-distributed stochastic neighbour embedding56 on the SQ-carrying patient data. t-Distributed stochastic neighbour embedding, as a nonlinear dimension reduction approach, mapped the high-dimensional data points (each denoted as a vector f ∈ ℝ¹³, where ℝ denotes the real number field) onto the 2D subspace (denoted as 2D vectors g = (g₁, g₂) ∈ ℝ²) and formed clusters with malignant lesions (red dots in Fig. 5d) and benign ones (blue dots). By applying the 13 features as indicators, PACT shows potential to differentiate malignant tumours.
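The principal-component score can be illustrated with a singular value decomposition. This sketch assumes that the features are z-scored before PCA and that no feature is constant; `pc_sum_score` is a hypothetical name and the input data are random. Note that the sign of each component is arbitrary in an SVD, so a practical pipeline would need a fixed sign convention before summing components.

```python
import numpy as np

def pc_sum_score(features, n_components=6):
    """Sum of the first `n_components` principal-component scores per sample."""
    x = features - features.mean(axis=0)
    x = x / x.std(axis=0)  # assumption: z-score each feature before PCA
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    scores = x @ vt[:n_components].T
    return scores.sum(axis=1)

# Random stand-in for a (quadrants x 13 features) matrix.
rng = np.random.default_rng(0)
features = rng.normal(size=(20, 13))
score = pc_sum_score(features)
```

Thresholding `score` yields a binary decision, and sweeping the threshold traces an ROC curve of the kind shown in Fig. 5c.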
Learning-based lesion localization and segmentation
As the classifiers above only give binary information about the whole quadrant, we further investigated the possibility to localize and segment the lesions in the PACT images. By combining automatic lesion centroid localization and manual bounding box selection, we developed a learning-based semi-automatic roadmap to segment the lesions from each affected breast (Fig. 6).
Fig. 6 |. Lesion localization and segmentation workflow.

a, Schematic of the lesion segmentation. The raw PA image was enhanced by 2D features determined by XGBoost and then stacked with a series of Gabor-filtered images and the x and y grids for K-means clustering to get the rough lesion mask. The rough bounding box of the lesion was determined after morphological cleaning of the mask. The enhanced PA images and the rough bounding boxes were distributed to three trained readers to manually assign finer bounding boxes independently. The enhanced PA images and the finer bounding boxes were then fed into the pretrained MedSAM model to acquire the finer lesion masks. From the three masks, the long axes of the lesion were averaged and compared with the clinical reports. b, Examples of four breast quadrant images colour-encoded by the weighted product of vessel density, entropy and anisotropy. Scale bars, 1 cm. c, Distribution of the lesion size (in terms of the long axis) estimated through PACT versus that from the clinical report. Data are plotted as means ± s.e.m. (n = 3 is the number of independent trained readers).
Starting from the raw PA images, we enhanced the contrast of the lesions by 2D features. Assisted by the XGBoost classifier, we selected the three most important features based on the Gini importances, namely entropy, density and anisotropy (which is defined as A = exp(−5 × directionality)). The combined feature map represents the weighted product of the three maps (details in Methods). For visualization, we applied thresholding on the combined feature map and used it to selectively colour-encode and highlight the lesions in the breast (shown in Fig. 2 (last row) and Fig. 6b). For downstream tasks such as lesion localization and segmentation, we multiplied the feature map with the raw PA image to form the enhanced PA images.
The enhanced PA images were then stacked with a series of Gabor-filtered images and the x and y grids (to maintain spatial correlation) for collaborative K-means clustering to get the rough lesion mask. After morphological cleaning, a rough bounding box of each lesion was automatically determined. The images and the rough bounding boxes were then distributed to three readers to manually assign finer bounding boxes around the lesion independently. The readers were trained with some example images that were excluded from the study. The enhanced images and the finer bounding boxes were fed into the MedSAM model that has been fine-tuned on ~1.57 million medical images57. The output of the model was a finer mask of the lesion. From the masks from the three readers, we estimated the dimension of each lesion using the average long axis of the segmentation mask. We then compared the size estimated from PACT with the clinical reports from other modalities, such as US, mammography and MRI. As shown in Fig. 6c, the findings from our PACT and the other imaging modalities in the clinical reports are linearly correlated (r² = 0.77 from linear regression) over a wide range from ~3 mm to ~50 mm.
Discussion
This work represents an advancement in non-invasively assessing breast lesions, offering high clarity for qualitative analysis and enhanced sensitivity for quantitative studies using PA tomography. Our proposed workflow features machine-learning-based lesion classification and segmentation, paving the way for standardization and full automation. Our PACT system features a balance between spatial resolution and penetration depth, visualizing detailed vasculatures down to ~300 μm and covering more than 93% of the lesion depths (Supplementary Table 3). Unlike previous reports in PA tomography with small patient cohorts and suboptimal image quality, our study cohort of 39 participants in 2 longitudinal clinical studies delivers high-quality panoramic images, comprehensive feature analysis, learning-based models for classification and segmentation, and system improvements that enhance diagnostic accuracy. From the visual appearances, we categorize the tissues into four subtypes (shown in Supplementary Fig. 14). Healthy tissue typically presents with regular blood vessels (Supplementary Fig. 17). Benign lesions have variable appearances in images due to the absence of vasculature49,51,58. For example, some lesions may show tortuous surrounding vessels, yet the overall vessel density remains relatively stable (Fig. 3a,b,e and Supplementary Fig. 14b). As lesions progress from benign to malignant, they often develop more feeding vessels. Compared with benign and in situ lesions, invasive lesions generally show increased micro-vessel density, larger area, enhanced vasculature irregularity and more feeding vessels59. Therefore, PACT features such as vessel density, area occupancy and entropy might be higher with invasive breast cancer cases. Despite our small sample sizes, some differences were observed between IDC versus ductal carcinoma in situ (Supplementary Fig. 15), and stromal fibrosis versus malignant lesions (Supplementary Fig. 16), as hypothesized. 
The vascular changes are detected by PACT as enhancement around the lesion (Fig. 3i and Supplementary Fig. 14c). Depending on the centre frequency and bandwidth of the ultrasound transducers, individual feeding vessels become distinctly visible in PACT images (Fig. 3c,d and Supplementary Fig. 14d). The ability of PACT to non-invasively capture these diverse tissue characteristics underscores its potential use as a valuable diagnostic tool for early-stage breast cancer.
Other than 1,064 nm illumination with static images, we further investigated 2 technical additions to the study, including dual-wavelength illumination and PA elastography. First, the incorporation of a second (or more) illumination wavelength in PACT enhances its diagnostic capability, adding extra imaging information with minimal additional cost, time and patient discomfort. Supplementary Fig. 18a,b compares the 1,064 nm and 755 nm images of the same breast with stromal fibrosis, approximately 9 mm in size. Despite subtle visual differences, the lesion is identified in both images. A detailed comparison of the attenuation and peak signal-to-noise ratio curves can be found in Supplementary Fig. 18c,d. Compared with the 1,064 nm wavelength, the 755 nm light has much less attenuation in water and tissue and is more sensitive to melanin and oxyhaemoglobin60,61. For example, from a patient with darker skin colour, the attenuation of 755 nm light became more notable, whereas the penetration depth of 1,064 nm was minimally affected (Supplementary Fig. 19). For those patients, time-gain compensation with different coefficients should be considered. The combination of both images allows for extracting additional functional features, such as the tissue oxygenation level62. Although our current feature set does not include oxygenation measurement, the potential value of such information is important and warrants further investigation, particularly with advancements in fluence calibration and normalization techniques. Second, whereas basic and morphological features were extracted from the volumetric image, the breathing frames allow for extracting the relative tissue deformation map during breathing, referred to as PA elastography22. As shown in Supplementary Fig. 20, the breathing frames were registered and rasterized into triangle grids. 
The averaged area change of the triangles describes the tissue strain from the compression and relaxation during breathing (details in Methods). Lesions typically show distinct mechanical characteristics compared with surrounding normal tissue63,64. Although PA elastography is currently performed one slice at a time, axial scanning can be implemented during the first visit to ensure comprehensive lesion capture. Furthermore, knowing the lesion’s axial position from the first visit allows elastography to be especially valuable for targeted monitoring during follow-up imaging sessions (shown in Supplementary Fig. 21).
This study extracts multiple features from PACT images using various approaches, with each feature offering a clear physical interpretation. For example, among the basic (1D) features, the mean value focuses on the PA signal amplitude, indicating the overall activity within the tissue. Other features such as s.d., IQR and mean absolute deviation (MAD) focus on the distribution spread of the image pixel values. Such values generally increase as the vessel density increases around the lesion. Morphological features delve deeper into the vasculature structure, examining aspects such as vessel density and skeleton endpoints (Methods). These features are particularly insightful as they consider the topology of the blood vessels, offering more localized and detailed information about the lesions. The GLCM features, in contrast, assess the texture of the tissue, providing insights into its heterogeneity and patterns. Moment-invariant-based features with varying orders capture different image aspects, including area, centroid and spread. The dynamic feature assesses the mechanical properties of the tissue, with lesions generally showing different strain patterns compared with the surrounding tissue. Although each feature can be used as a single-factor binary classifier (Supplementary Fig. 22), combining these diverse features enhances the performance of the classifier and the diagnostic capabilities of PACT. This holistic approach not only augments the precision of lesion detection but also contributes new insights towards breast cancer diagnostics. Moreover, we expect the feature set and classification models can be useful when adapted to other PACT systems or imaging modalities. For example, we have applied the same model and feature set on a four-arc-based PACT system within our laboratory, which features different geometry and is operated by different personnel (Supplementary Fig. 23)24. 
Despite the much smaller data size, the resulting average AUROC of 0.82 indicates some degree of generalizability of our model to other PACT systems. When applied to other modalities such as MRI and mammography, fine-tuning or transfer learning might be needed due to the different views and image appearances, although good generalizability of our model can be inferred from the correlation shown in Fig. 2.
It is also noted that the breasts from different patients vary in appearance in the PACT images (as evident in Figs. 2 and 3 and Supplementary Fig. 17). These differences arise despite the standardization of the imaging sessions. Factors contributing to this variability include individual differences in age, breast density, cup size, among others (shown in Supplementary Table 1). For example, although PACT images of different breast cup sizes look similar (Supplementary Fig. 24a), there are systematic differences in the features, which might affect classification accuracy (Supplementary Fig. 24b). To mitigate the impact of the inter-patient variability, we self-normalized the features of each patient based on the averaged value from all her HQs. The results with and without self-normalization are compared in Supplementary Figs. 24c and 25. It is clearly shown that self-normalization enhances the generalizability of our method, ensuring that the features and classifiers are robust across a diverse patient population. This is crucial for the practical application of PACT in clinical settings, where patient diversity is the norm. To further standardize the imaging workflow with different breast cup sizes, a series of rigid breast-holding cups made with materials such as polyvinyl chloride or polymethylpentene through thermoforming can be considered.
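The self-normalization step described above can be illustrated with a minimal Python sketch. It assumes that normalization means dividing each feature by the average of that feature over the same patient's healthy quadrants; the text does not state whether division or subtraction was used, and the function name, dictionary layout and feature names here are hypothetical.

```python
from statistics import mean

def self_normalize(quadrants):
    """Divide every feature by the patient's own healthy-quadrant average.

    `quadrants` is a list of dicts with keys "healthy" (bool) and
    "features" (dict of feature name -> float). Dividing (rather than
    subtracting) is an assumption of this sketch.
    """
    healthy = [q["features"] for q in quadrants if q["healthy"]]
    if not healthy:
        raise ValueError("need at least one healthy quadrant as reference")
    names = list(healthy[0].keys())
    # Per-feature baseline: average over this patient's healthy quadrants.
    baseline = {n: mean(h[n] for h in healthy) for n in names}
    return [
        {"healthy": q["healthy"],
         "features": {n: q["features"][n] / baseline[n] for n in names}}
        for q in quadrants
    ]

# Toy patient with two healthy quadrants and one suspicious quadrant.
patient = [
    {"healthy": True, "features": {"density": 2.0, "entropy": 4.0}},
    {"healthy": True, "features": {"density": 4.0, "entropy": 6.0}},
    {"healthy": False, "features": {"density": 6.0, "entropy": 10.0}},
]
normalized = self_normalize(patient)
```

After this step, a value of 1 means "the same as this patient's healthy average", so features from patients with different baselines become comparable.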
We compared the performance of PACT with other modalities in two aspects. First, in the task to differentiate suspicious quadrants from healthy ones, our best-achieved AUROC of 0.89 from XGBoost is comparable or superior to those from mammography or US (~0.8)7,65–68. A more comprehensive comparison of the performance metrics across multiple modalities can be found in Supplementary Table 7. In addition, PACT performs consistently well regardless of breast density. Our sensitivity of 72% at the optimal operating point far exceeds that of mammography for radiographically dense breasts, which is typically less than 50% (refs. 7,69). At the optimal operating point, the classifiers generally achieve higher specificity than sensitivity, which is most likely due to the existence of biopsy-proven benign lesions in the testing sets. Second, our technology shows potential to further distinguish the malignant lesions from benign ones qualitatively (that is, graphically) and quantitatively, thus reducing unnecessary biopsies. In the PCA-based model (Fig. 5c), when our sensitivities are set to those of mammography and MRI, our respective specificities are comparable8. When our sensitivity is set to that of US (~84%), our specificity of ~44% is superior8.
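The comparison above fixes the sensitivity at a reference value and reads off the resulting specificity. A minimal sketch of that procedure, together with a rank-based AUROC, is given below; the scores and labels are made-up toy values rather than study data, and the function names are hypothetical.

```python
def auroc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney U statistic."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Each (positive, negative) pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def specificity_at_sensitivity(scores, labels, target_sensitivity):
    """Return (threshold, specificity, sensitivity) with the highest
    specificity among thresholds whose sensitivity meets the target."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = None
    for t in sorted(set(scores)):
        sens = sum(p >= t for p in pos) / len(pos)   # score >= t is "positive"
        spec = sum(n < t for n in neg) / len(neg)
        if sens >= target_sensitivity and (best is None or spec > best[1]):
            best = (t, spec, sens)
    return best

# Toy classifier outputs (illustrative only).
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
area = auroc(scores, labels)
operating_point = specificity_at_sensitivity(scores, labels, 0.75)
```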
This pilot study imaged 39 women. From power analysis, we achieved a high Z-score and low P value, indicating that our test has sufficient power to detect an AUROC significantly greater than 0.5 with the current sample size (details in Methods)70. However, the 95% confidence interval (CI) of our AUROC is relatively wide due to the limited patient size. Adding more samples, especially positive cases, would reduce the s.e., narrow the CI and increase the precision of our AUROC estimate. In the 39 patients, there was a total of 31 positive findings, which exceeds the requisite number for statistical robustness (Supplementary Table 1). In addition, we studied both breasts with the unaffected, normal quadrants and the contralateral unaffected breast for self-normalization. Given the number of patients enrolled in the study, analysing the 2D MAPs acquired from the 3D image, segmenting each breast into quadrants and comparing the abnormal quadrants to the normal quadrants not only yield clinically relevant insights but also simplify the complexity of classifier design. For lesions that cover multiple quadrants, our proposed approach would classify all affected quadrants as suspicious. However, the sequential segmentation process we developed (Fig. 6) can help localize the lesion more accurately, potentially resolving some of the ambiguity (Supplementary Fig. 26). Finally, five specific classifier types and K-means clustering were investigated based on their proven robustness and efficiency with relatively small datasets. The choices above ensure reliable outcomes from our study and set a foundation for future research. As we progress and expand our patient sample size, there is potential to explore more complex tasks and models71,72. For example, as automation is essential for robust clinical translation, we experimented with bypassing the fine box selection and sending the rough bounding boxes directly to the MedSAM model (shown in Supplementary Fig. 27). 
Although this approach ensured complete automation, the decreased segmentation accuracy highlights the importance of fine-tuning to balance automation and accuracy and the need for a larger dataset for more advanced models.
In summary, this study introduces a comprehensive workflow and methodology of using panoramic PACT as both a qualitative and quantitative tool for the characterization, monitoring and segmentation of various breast lesions. The high spatial resolution and sensitivity to endogenous contrast agents allow for direct and non-invasive visualization of lesions and surrounding tissues, and the accessibility and non-ionizing nature make it possible to monitor the breast over multiple visits. Moreover, the detailed image of the vasculature allows for feature extraction, classification and segmentation of suspicious lesions, suggesting potential for PACT to reduce the number of unnecessary benign breast biopsies. This study warrants further investigation using a larger patient dataset, a more complex model that considers the molecular subtype and volumetric information from the 3D images rather than 2D MAPs, and a more accurate modelling of the optical fluence in deep tissue for precise oxygenation measurement. We also expect to achieve deeper penetration through techniques such as multiple-side illumination, moderate breast compression and averaging over repeated measurements. In addition, we aim to standardize imaging procedures across different PACT systems, which would allow us to apply our model more broadly and improve generalizability across different set-ups. Finally, future steps towards clinics include reducing PACT costs through mass production and leveraging lower operational expenses due to non-ionizing radiation. The system’s compact set-up (takes less than 4″ × 6″) simplifies integration without requiring specialized infrastructure, and machine learning can streamline image interpretation, easing the learning curve for clinical adoption. 
As PACT undergoes standardization and further refinement, we anticipate the use of PACT not only as a complementary tool to existing mainstream diagnostic methods but also as a unique modality with the potential to transform the landscape of breast cancer diagnostics.
Methods
System construction
In the PACT system, a 1,064 nm laser beam from an Nd:YAG laser (LPY 7875-20; Litron Lasers) was combined with a 755 nm laser beam from an Alexandrite laser (Alex-Q; Beamtech Optronics) using a dichroic mirror (DMLP900L; Thorlabs). The combined beams were then expanded by an engineered diffuser (EDC-10; RPC Photonics) to form a circular light beam. The laser radiant exposure (20.37 mJ cm−2 for 1,064 nm and 5.09 mJ cm−2 for 755 nm) and irradiance (407.44 mW cm−2 for 1,064 nm and 50.93 mW cm−2 for 755 nm) were within the American National Standards Institute safety limits for laser exposure73. For panoramic acoustic detection, a 512-element full-ring ultrasonic transducer array (2.25 MHz central frequency; Imasonic) was connected to 4 sets of 128-channel preamplifiers and data acquisition systems (SonixDAQ; Ultrasonix Medical) placed around the water tank22. A linear stage (KR4610D; THK America) was fixed beneath the water tank and controlled by a customized LabVIEW (2018) program.
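As a consistency check on the quoted exposure values, the time-averaged irradiance equals the per-pulse radiant exposure multiplied by the pulse repetition rate. The repetition rates are not stated in this section; the values below are inferred from the quoted numbers and should be read as an assumption rather than a system specification.

```latex
f_{1064} = \frac{407.44\ \mathrm{mW\,cm^{-2}}}{20.37\ \mathrm{mJ\,cm^{-2}}} \approx 20.0\ \mathrm{Hz},
\qquad
f_{755} = \frac{50.93\ \mathrm{mW\,cm^{-2}}}{5.09\ \mathrm{mJ\,cm^{-2}}} \approx 10.0\ \mathrm{Hz}
```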
Study oversight
The human studies were completed under institutional approval and oversight by both the California Institute of Technology (Committee for the Protection of Human Subjects, 18–0785 and 20–1040) and City of Hope National Medical Center (Institutional Review Board, 17315 and 19552). We are reporting breast PACT imaging results from two clinical studies: (1) a study involving women with locally advanced breast cancer initiating therapy with neoadjuvant chemotherapy (Caltech CPHS 18-0785/COH IRB 17315), and (2) a study involving women with abnormal breast screening imaging BI-RADS 3–5 (Caltech CPHS 20-1040/COH IRB 19552). The clinical studies were conducted in accordance with institutional guidelines. Study participants were informed of the investigational nature of the study and provided informed consent.
Patient selection and exclusion criteria
For participants associated with Caltech CPHS 20-1040/COH IRB 19552, all recruited participants met the following inclusion criteria: (1) women newly identified to have abnormal mammograms, breast ultrasound and/or breast MRI as part of breast cancer screening, with BI-RADS 3–5 breast lesions for which diagnostic biopsy (BI-RADS 4–5) or close interval radiologic follow-up (BI-RADS 3) is recommended; (2) who were >18 years of age; (3) able to understand and willing to sign a written informed consent document; and (4) willing and able to undergo PA imaging before the standard-of-care biopsy procedure for a BI-RADS 4 or 5 breast imaging. The exclusion criteria included (1) weight exceeding 300 lb (weight limit of the steps to the PACT examination table); (2) pregnancy or lactation; (3) uncontrolled intercurrent illness including, but not limited to, ongoing or active infection of the breast and/or axilla, symptomatic congestive heart failure, unstable angina pectoris, cardiac arrhythmia, or psychiatric illness or social situations that would limit compliance with study requirements; and (4) use of photosensitizing medication.
For participants associated with Caltech CPHS 18-0785/COH IRB 17315, all recruited participants met the following inclusion criteria: (1) women newly diagnosed with breast cancers; (2) who were ≥18 years of age; (3) able to understand and willing to sign a written informed consent document; and (4) must have intact skin in the area that is to be imaged (that is, no skin cuts, open wounds or ulcers). The exclusion criteria included (1) weight exceeding 300 lb; (2) pregnancy; and (3) uncontrolled intercurrent illness including, but not limited to, ongoing or active infection of the breast and/or axilla, symptomatic congestive heart failure, unstable angina pectoris, cardiac arrhythmia, or psychiatric illness or social situations that would limit compliance with study requirements.
Standard PACT imaging procedure
Breast imaging was performed at California Institute of Technology in a dedicated human imaging room installed with privacy curtains. Before PACT imaging, the imaging bed and the imaging system were thoroughly sanitized using disinfecting wipes. The examination table was covered by single-use paper that was discarded after each use. Participants were provided privacy to change into hospital gowns. During PACT imaging, a female study coordinator assisted the patient in a private space enclosed by curtains. All other researchers were outside the private space to operate the PACT device. The patient was positioned prone, with 1 breast placed in the preheated 35 °C water tank through a large aperture in the bed top.
For optional elastography measurement after completing standard PACT, the transducer array was fixed at an elevational position ~2.5 cm from the skin surface. The patient breathed normally, compressing the breast naturally against the food-safe plastic wrap periodically. The system then captured cross-sections (2D) of the breast at 20 Hz to form time-lapsed image frames.
Image reconstruction and post-processing
The dual speed-of-sound universal back-projection (dualSoS-UBP) algorithm74 was used to reconstruct all images in this work. The ultrasonic transducer array scanned the entire breast from the chest wall to the nipple, back-projecting the time domain PA signals at all elevational scanning steps into the 3D space. Each volumetric image was first reconstructed with a voxel size of 1 mm in the elevational direction and 0.15 mm × 0.15 mm on the horizontal plane. All the reconstructed images were further batch-processed to improve contrast. In each horizontal slice, we applied the same Hessian-based Frangi vesselness filtration75 to enhance the contrast of blood vessels. In each filtered slice, adaptive thresholding was used to segment blood vessels, followed by morphology filtration for removing the isolated pixels. In the elevational direction of each filtered volumetric image, we selected voxels with the largest PA amplitudes and then projected their depths to form a 2D depth map. We applied median filtration with a window size of 8 pixel × 8 pixel to the depth map. Different RGB (red, green, blue) colour values were assigned to discrete depths. Finally, the 2D colour-encoded, depth-resolved image was multiplied by the MAP image pixel by pixel to represent the maximum amplitudes. The workflow of image post-processing can be found in Supplementary Fig. 2.
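The depth-encoding step at the end of this pipeline can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' MATLAB code: it omits the Frangi filtering, adaptive thresholding and the median filtration of the depth map, and the colour table passed in is a placeholder.

```python
import numpy as np

def depth_encoded_map(volume, depth_colors):
    """Build a colour-encoded, depth-resolved MAP from a 3D volume.

    volume: array of shape (n_z, n_y, n_x); axis 0 is the elevational axis.
    depth_colors: array of shape (n_z, 3), one RGB triple per depth
    (the colour table itself is a placeholder in this sketch).
    """
    amplitude = volume.max(axis=0)        # maximum amplitude projection
    depth_index = volume.argmax(axis=0)   # depth of the brightest voxel
    rgb = depth_colors[depth_index]       # (n_y, n_x, 3) colour lookup
    # Pixel-by-pixel product of the colour-coded depth and the MAP.
    return rgb * amplitude[..., None], amplitude, depth_index

# Toy volume: one bright voxel at depth 2 and one at depth 1.
volume = np.zeros((3, 2, 2))
volume[2, 0, 0] = 5.0
volume[1, 1, 1] = 2.0
depth_colors = np.eye(3)                  # red, green, blue for depths 0-2
image, amplitude, depth_index = depth_encoded_map(volume, depth_colors)
```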
Co-registration between MRI and PACT images
The Gd-MRI images were acquired and shared as DICOM files. We used the built-in function dicomread in MATLAB to load the files as volumetric images and performed 3D rotation using the PHOVIS software76 to find the optimal view angle with the best vasculature correlation with PACT MAPs. The MRI images were then rendered as maximum intensity projections to be shown in Fig. 2. Although the two images generally correlate well, small differences remain, primarily from tissue deformation, with the breast fully dependent in MRI and slightly compressed in PACT.
Measurement of basic features
Upon batch processing, each image I(i, j) (i = 1, 2, …, m, j = 1, 2, …, n, where m and n are the height and width of the image in pixels, respectively) was first stretched into a 1D vector I1D (with index k = 1, 2, …, mn) using the built-in function reshape in MATLAB. On the basis of the vectorized images, we computed the basic features Fb, such as the mean value
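$$F_{\mathrm{mean}} = \frac{1}{mn}\sum_{k=1}^{mn} I_{\mathrm{1D}}(k), \tag{1}$$
the (uncorrected) sample s.d.
$$F_{\mathrm{s.d.}} = \sqrt{\frac{1}{mn}\sum_{k=1}^{mn}\left[I_{\mathrm{1D}}(k) - F_{\mathrm{mean}}\right]^{2}}, \tag{2}$$
the (uncorrected) skewness
$$F_{\mathrm{skewness}} = \frac{\frac{1}{mn}\sum_{k=1}^{mn}\left[I_{\mathrm{1D}}(k) - F_{\mathrm{mean}}\right]^{3}}{F_{\mathrm{s.d.}}^{3}}, \tag{3}$$
and (uncorrected) kurtosis
$$F_{\mathrm{kurtosis}} = \frac{\frac{1}{mn}\sum_{k=1}^{mn}\left[I_{\mathrm{1D}}(k) - F_{\mathrm{mean}}\right]^{4}}{F_{\mathrm{s.d.}}^{4}}. \tag{4}$$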
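Moreover, by sorting the vectorized image as $I_{\mathrm{sorted}}(k')$ ($k'$ is the sorted index), we computed other basic features, such as the IQR:
$$F_{\mathrm{IQR}} = I_{\mathrm{sorted}}\left(\left\lceil \tfrac{3}{4}mn \right\rceil\right) - I_{\mathrm{sorted}}\left(\left\lceil \tfrac{1}{4}mn \right\rceil\right). \tag{5}$$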
A complete list of the basic features investigated can be found in Supplementary Table 5.
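The basic features above can be sketched in plain Python as follows. The moments use the uncorrected (divide by the number of pixels) convention named in the text; the quartile convention for the IQR (taking the sorted element at the ceiling of 25% and 75% of the length) is an assumption of this sketch, and the function name is hypothetical.

```python
import math

def basic_features(pixels):
    """Uncorrected moments and IQR of a flattened (1D) image."""
    n = len(pixels)
    mean = sum(pixels) / n
    centered = [p - mean for p in pixels]
    # "Uncorrected" means dividing by n rather than n - 1.
    std = math.sqrt(sum(c ** 2 for c in centered) / n)
    skewness = sum(c ** 3 for c in centered) / (n * std ** 3)
    kurtosis = sum(c ** 4 for c in centered) / (n * std ** 4)
    ordered = sorted(pixels)
    # Quartiles by ceiling index (1-based), an assumed convention.
    q1 = ordered[math.ceil(0.25 * n) - 1]
    q3 = ordered[math.ceil(0.75 * n) - 1]
    return {"mean": mean, "std": std, "skewness": skewness,
            "kurtosis": kurtosis, "iqr": q3 - q1}

features = basic_features([1, 2, 3, 4, 5])
```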
Measurement of morphological features
Measurement of blood vessel density, occupancy, endpoint and branching.
Blood vessel skeletons were first extracted by generating vessel centerlines77 from the threshold-based binary vessel masks from the MAP quadrant images. The vessel centerlines were broken into independent vessels at junction points. Independent vessels with lengths less than 3 pixels were then removed to reduce noise. To generate the blood vessel density map V, a 4 mm × 4 mm window was scanned across the entire vessel skeleton image. The vessel density was quantified as the number of vessels in the window divided by the window area. The vessel density of the window area was then assigned to the window’s centre pixel. The vessel density of the quadrant was then computed as the mean value of the vessel density map, and the vessel area occupancy of the quadrant was computed as the area ratio of the vessel mask over the entire quadrant. Furthermore, the MATLAB function bwmorph was applied to acquire the total number of branch points Fbranch and endpoints Fendpoint of the skeleton.
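A minimal sketch of the sliding-window density map follows. It assumes the skeleton has already been split into independent vessels and stored as an integer label image (0 for background, k for the k-th vessel); that representation, the function name and the handling of image borders (truncated windows divided by the full window area) are assumptions of this sketch.

```python
import numpy as np

def vessel_density_map(labels, window_px, pixel_mm):
    """Number of distinct vessels per unit area in a sliding window.

    labels: 2D int array, 0 = background, k > 0 = k-th vessel segment.
    window_px: odd window width in pixels; pixel_mm: pixel size in mm.
    """
    half = window_px // 2
    area_mm2 = (window_px * pixel_mm) ** 2
    density = np.zeros(labels.shape, dtype=float)
    for i in range(labels.shape[0]):
        for j in range(labels.shape[1]):
            win = labels[max(i - half, 0): i + half + 1,
                         max(j - half, 0): j + half + 1]
            # Count distinct non-zero labels, i.e. vessels touching the window.
            n_vessels = np.count_nonzero(np.unique(win))
            density[i, j] = n_vessels / area_mm2   # assigned to centre pixel
    return density

# Toy skeleton with two vessels on a 5 x 5 grid of 1 mm pixels.
labels = np.zeros((5, 5), dtype=int)
labels[0, 0:2] = 1
labels[4, 4] = 2
density = vessel_density_map(labels, window_px=3, pixel_mm=1.0)
quadrant_density = density.mean()   # quadrant-level feature
```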
Measurement of entropy, directionality and AME.
To mitigate the background noise and single-pixel artefacts, thresholding was first applied to the MAPs of the batch-processed images. The threshold was selected as the maximum PA amplitude within the selected background (that is, a region in the coupling medium outside the breast). A 2 mm × 2 mm window was then used to scan across every pixel in the image. For each image subset Isub, we calculated the entropy within the window as
$$H(I_{\mathrm{sub}}) = -\sum_{b=1}^{B} P_b \log_2 P_b, \tag{6}$$
where B denotes the number of discrete bins in the window and Pb denotes the probability for a pixel in the window to have a value falling in the bth bin. The acquired entropy was then assigned to the centre pixel of the window, forming an entropy map H. The mean entropy of the quadrant Fentropy was then computed as the mean value of the entropy map.
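The windowed entropy can be sketched as below. The number of bins, the use of a base-2 logarithm and the global (rather than per-window) bin edges are assumptions of this sketch, since the text does not specify them.

```python
import numpy as np

def entropy_map(image, window_px, n_bins):
    """Shannon entropy of the grey-level histogram in a sliding window."""
    half = window_px // 2
    lo, hi = image.min(), image.max()
    # Quantize once so every window shares the same bin edges.
    if hi > lo:
        scaled = (image - lo) / (hi - lo)
    else:
        scaled = np.zeros_like(image, dtype=float)
    bins = np.minimum((scaled * n_bins).astype(int), n_bins - 1)
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            win = bins[max(i - half, 0): i + half + 1,
                       max(j - half, 0): j + half + 1]
            p = np.bincount(win.ravel(), minlength=n_bins) / win.size
            p = p[p > 0]                       # empty bins contribute nothing
            out[i, j] = -np.sum(p * np.log2(p))
    return out

# Toy image: left half dark, right half bright.
image = np.array([[0, 0, 1, 1]] * 4)
entropy = entropy_map(image, window_px=3, n_bins=2)
mean_entropy = entropy.mean()                  # quadrant-level feature
```

Windows that lie entirely in one region give zero entropy, while windows straddling the edge give a positive value, which is why the map highlights heterogeneous tissue.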
Similarly, to measure the directionality, the same window was scanned and a singular-value-decomposition-based method was applied to the rotated subset to acquire the normalized singular value decomposition dominancy term, which led to the directionality map D23,78. The mean directionality of the quadrant Fdirectionality was then computed as the mean value of the directionality map.
The AME was calculated from the entropy and directionality maps using the following formula:
$$\mathrm{AME} = H \cdot \exp(-\alpha D), \tag{7}$$
where the coefficient $\alpha$ was set to 5. The mean AME of the quadrant FAME was then computed as the mean value of the AME map.
Measurement of Hu moment invariants.
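The image central moment of orders p and q can be computed as
$$\mu_{pq} = \sum_{i=1}^{m}\sum_{j=1}^{n} (i - \bar{i})^{p} (j - \bar{j})^{q} I(i, j), \tag{8}$$
where $\bar{i}$ and $\bar{j}$ denote the centroid of the image. From the central moments, we computed the nth order Hu moment invariants $h_n$ (ref. 79), the first two of which are
$$h_1 = \eta_{20} + \eta_{02}, \qquad h_2 = (\eta_{20} - \eta_{02})^{2} + 4\eta_{11}^{2}, \tag{9}$$
where $\eta_{pq} = \mu_{pq} / \mu_{00}^{1 + (p+q)/2}$ are the translation and scale invariants.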
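A short NumPy sketch of the central moments and the first two Hu invariants is given below, using the standard definitions from Hu's formulation. Only the first two invariants are shown for brevity, and the function name is hypothetical.

```python
import numpy as np

def hu_first_two(image):
    """First two Hu moment invariants of a 2D intensity image."""
    image = np.asarray(image, dtype=float)
    i, j = np.mgrid[:image.shape[0], :image.shape[1]]
    m00 = image.sum()
    i_bar = (i * image).sum() / m00          # centroid row
    j_bar = (j * image).sum() / m00          # centroid column

    def mu(p, q):
        """Central moment of order (p, q)."""
        return ((i - i_bar) ** p * (j - j_bar) ** q * image).sum()

    def eta(p, q):
        """Central moment normalized for scale."""
        return mu(p, q) / m00 ** (1 + (p + q) / 2)

    h1 = eta(2, 0) + eta(0, 2)
    h2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return h1, h2

# The same 2 x 4 rectangle at two positions, plus a transposed copy.
img = np.zeros((8, 8))
img[1:3, 1:5] = 1
shifted = np.zeros((8, 8))
shifted[4:6, 3:7] = 1
h_img = hu_first_two(img)
h_shifted = hu_first_two(shifted)
h_transposed = hu_first_two(img.T)
```

The three results agree, which illustrates the translation and 90-degree rotation invariance that makes these moments useful as shape descriptors.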
Measurement of GLCM properties
GLCM describes how often a pixel with a given intensity value occurs in a specific spatial relationship to a pixel with another value. The image of interest was first scaled to eight grey levels. The spatial relationship was defined as the pixel of interest and its horizontally (0°), diagonally (45°), vertically (90°) and anti-diagonally (135°) adjacent pixels. Each element (l, m) in the resultant GLCM was the sum of the number of times that the pixel with value l occurred in the specified spatial relationship to a pixel with value m in the input image.
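After computing the GLCMs, four properties were calculated to describe the texture of the image. The contrast ($F_{\mathrm{contrast}}$) measures the intensity contrast between a pixel and its neighbours over the whole GLCM:
$$F_{\mathrm{contrast}} = \sum_{l,m} |l - m|^{2}\, p(l, m), \tag{10}$$
where $p(l, m)$ is the (l, m) element of the GLCM normalized by the sum of all elements. The correlation ($F_{\mathrm{correlation}}$) measures how correlated a pixel is to its neighbour over the whole GLCM:
$$F_{\mathrm{correlation}} = \sum_{l,m} \frac{(l - \mu_{\mathrm{GLCM}})(m - \mu_{\mathrm{GLCM}})\, p(l, m)}{\sigma_{\mathrm{GLCM}}^{2}}, \tag{11}$$
where μGLCM and σGLCM are the mean and s.d. of the GLCM, respectively. The energy ($F_{\mathrm{energy}}$) refers to the sum of squared elements in the GLCM:
$$F_{\mathrm{energy}} = \sum_{l,m} p(l, m)^{2}. \tag{12}$$
The homogeneity ($F_{\mathrm{homogeneity}}$) measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal:
$$F_{\mathrm{homogeneity}} = \sum_{l,m} \frac{p(l, m)}{1 + |l - m|}. \tag{13}$$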
In total, there were 16 (4 orientations × 4 properties) features from the GLCM analysis.
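The GLCM analysis can be sketched in NumPy as follows. The sketch assumes the image has already been quantized to integer grey levels 0 to 7, uses non-symmetric counting with offsets that follow the common MATLAB convention, and computes the correlation with separate row and column means and s.d. (these coincide with a single mean and s.d. when the GLCM is symmetric). The toy image is illustrative only.

```python
import numpy as np

# Pixel offsets (d_row, d_col) for the four orientations, in degrees.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(image, levels, offset):
    """Count co-occurrences of grey levels for one pixel offset."""
    dr, dc = offset
    counts = np.zeros((levels, levels), dtype=float)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r, c], image[r2, c2]] += 1
    return counts

def glcm_properties(counts):
    """Contrast, correlation, energy and homogeneity of one GLCM."""
    p = counts / counts.sum()
    l, m = np.indices(p.shape)
    mu_l, mu_m = (l * p).sum(), (m * p).sum()
    sd_l = np.sqrt(((l - mu_l) ** 2 * p).sum())
    sd_m = np.sqrt(((m - mu_m) ** 2 * p).sum())
    return {
        "contrast": ((l - m) ** 2 * p).sum(),
        "correlation": ((l - mu_l) * (m - mu_m) * p).sum() / (sd_l * sd_m),
        "energy": (p ** 2).sum(),
        "homogeneity": (p / (1 + np.abs(l - m))).sum(),
    }

# Toy image already scaled to eight grey levels (0-7).
image = np.array([[0, 0, 4, 5, 7],
                  [1, 2, 4, 6, 0],
                  [3, 4, 6, 0, 1],
                  [7, 4, 0, 1, 4]])
features = {(angle, name): value
            for angle, off in OFFSETS.items()
            for name, value in glcm_properties(glcm(image, 8, off)).items()}
```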
Measurement of dynamic features
To conduct PACT elastography of the breast, patients were asked to breathe normally. The chest wall pushed the breast against the plastic film, generating a deformation of the breast in the coronal plane. To assess deformations over time, the first frame was taken as a reference. Other frames were registered to the first frame through a non-rigid demons algorithm80 in MATLAB. The entire image was then segmented into 2 mm × 2 mm squares. One randomly selected pixel was chosen from each square, and triangular grids were further generated from these registered pixels. The triangular grids were mapped back to the original unregistered frames and their areas were calculated. For each grid, Fourier transformation was applied to quantify the area variation at the frequency of periodic compression, and amplitudes were assigned to the pixels inside this triangle to generate the deformation map22. The procedure above was repeated 10 times, and the final image was averaged to reduce noise. To account for the mechanical differences between the lesion and healthy tissue, the absolute value of the differences between the deformation map and its median was averaged over the whole map to acquire the strain feature Fstrain.
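The last steps of this procedure (triangle areas, Fourier amplitude at the breathing frequency, and the strain feature) can be sketched as below. Registration and grid generation are omitted. Normalizing each triangle's area by its time average and the synthetic 0.3 Hz breathing rate are assumptions of this sketch; the 20 Hz frame rate matches the acquisition rate quoted for the elastography measurement.

```python
import numpy as np

def triangle_area(p0, p1, p2):
    """Shoelace area of a triangle from three (x, y) vertices."""
    return 0.5 * abs((p1[0] - p0[0]) * (p2[1] - p0[1])
                     - (p2[0] - p0[0]) * (p1[1] - p0[1]))

def breathing_amplitude(areas, frame_rate_hz, breathing_hz):
    """Relative area oscillation amplitude at the breathing frequency.

    areas: array of shape (n_frames, n_triangles).
    """
    areas = np.asarray(areas, dtype=float)
    rel = areas / areas.mean(axis=0) - 1.0            # relative area change
    spectrum = np.abs(np.fft.rfft(rel, axis=0)) * 2 / areas.shape[0]
    freqs = np.fft.rfftfreq(areas.shape[0], d=1.0 / frame_rate_hz)
    return spectrum[np.argmin(np.abs(freqs - breathing_hz))]

def strain_feature(deformation):
    """Mean absolute deviation of the deformation values from their median."""
    deformation = np.asarray(deformation, dtype=float)
    return np.mean(np.abs(deformation - np.median(deformation)))

# Synthetic data: 10 s at 20 Hz, three triangles, one deforming 5x more.
t = np.arange(200) / 20.0
true_amp = np.array([0.01, 0.05, 0.01])
areas = 1.0 + true_amp * np.sin(2 * np.pi * 0.3 * t)[:, None]
amp = breathing_amplitude(areas, frame_rate_hz=20.0, breathing_hz=0.3)
feature = strain_feature(amp)
```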
Classifier training and testing
The complete dataset for classifier training and testing consists of 675 section feature vectors with binary labels. Input 1D feature vectors consist of 1D basic features and statistical metrics describing the 2D morphological features extracted from each self-normalized section image. Patients were randomly split into approximately 80%, 10% and 10% groups for training, validation and testing, respectively. Sections from a given patient were placed exclusively into the same set to prevent data leakage. Positive cases were randomly repeated to balance the classes in the training set only.
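A minimal sketch of the patient-level split and training-set balancing is shown below. The function names, random seeds and toy data are hypothetical; the point is that the split is drawn over patients rather than sections, and that positives are repeated only within the training indices.

```python
import random

def split_by_patient(patient_ids, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign whole patients to train/val/test; return section indices."""
    rng = random.Random(seed)
    patients = sorted(set(patient_ids))
    rng.shuffle(patients)
    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))
    groups = {"train": set(patients[:n_train]),
              "val": set(patients[n_train:n_train + n_val]),
              "test": set(patients[n_train + n_val:])}
    return {name: [i for i, pid in enumerate(patient_ids) if pid in members]
            for name, members in groups.items()}

def oversample_positives(indices, labels, seed=0):
    """Randomly repeat positive sections until the classes are balanced."""
    rng = random.Random(seed)
    pos = [i for i in indices if labels[i] == 1]
    neg = [i for i in indices if labels[i] == 0]
    if not pos or len(pos) >= len(neg):
        return list(indices)
    extra = [rng.choice(pos) for _ in range(len(neg) - len(pos))]
    return list(indices) + extra

# Toy data: 10 patients x 8 sections, one positive section per patient.
patient_ids = [p for p in range(10) for _ in range(8)]
labels = [1 if k % 8 == 0 else 0 for k in range(80)]
split = split_by_patient(patient_ids)
balanced_train = oversample_positives(split["train"], labels)
```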
Classifier models were implemented in Python using the scikit-learn and XGBoost libraries. The models used as binary classifiers were naive Bayes, random forest, support vector machine, AdaBoost and XGBoost. Model hyperparameters were optimized using a grid search. For example, for random forest, the number of tree estimators, split criterion function, maximum tree depth, minimum number of samples at each leaf node, minimum number of samples required to split a node and whether bootstrap samples were used were tuned. For support vector machine, the kernel type, regularization parameter and kernel coefficient were tuned, with a maximum of 1,000 iterations for early stopping. For AdaBoost, the number of estimators, base decision tree classifier depth and learning rate were adjusted. For XGBoost, the number of trees, maximum tree depth, learning rate and L2 regularization term on weights were tuned.
The best models were chosen based on validation AUROC, recall and precision, as evaluated across the grid search on the validation set after model training on the training set. Final models were retrained using the combined training and validation sets and evaluated using the hold-out testing set. Due to potential inter-patient variability, the average performance of the models was assessed by randomly shuffling the training and validation sets five times with a fixed hold-out test set. To avoid overfitting and further improve model performance, the input feature set was reduced to a subset of top features to reduce model complexity. Top features were determined by their Gini importance scores in the trained XGBoost models. The subset of 13 features was determined by evaluating the training and validation AUROCs across varying numbers of top features used, as shown in Supplementary Fig. 10.
For classification between benign and malignant lesions, we applied PCA based on the 13-feature subset vector for each image. We then used the sum of the first six principal components as a binary classifier, which did not require training.
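The training-free PCA score can be sketched in NumPy as below. Two details are assumptions of this sketch because the text does not specify them: each feature is z-scored before PCA, and the sign of each principal component (which is otherwise arbitrary) is fixed so that its largest-magnitude loading is positive. The random input stands in for the 13-feature vectors and is not study data.

```python
import numpy as np

def pca_sum_score(features, n_components=6):
    """Sum of the first principal-component scores for each sample."""
    x = np.asarray(features, dtype=float)
    x = (x - x.mean(axis=0)) / x.std(axis=0)          # z-score (assumed)
    _, _, vt = np.linalg.svd(x, full_matrices=False)  # rows of vt = PCs
    # Fix the arbitrary sign of each PC: largest loading made positive.
    idx = np.abs(vt).argmax(axis=1)
    vt = vt * np.sign(vt[np.arange(vt.shape[0]), idx])[:, None]
    scores = x @ vt[:n_components].T                  # PC scores per sample
    return scores.sum(axis=1)

# Stand-in data: 20 quadrants x 13 features.
rng = np.random.default_rng(0)
features = rng.normal(size=(20, 13))
score = pca_sum_score(features)
```

Thresholding `score` at different values then traces out a ROC curve without fitting any supervised model.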
Image enhancement and lesion segmentation
From the post-processed MAP of the affected breast, the vessel density map V (Gaussian filtered with s.d. of 5), the entropy map H and the anisotropy map A (that is, A = exp(−5 × directionality)), we acquired the combined feature C as
| (14) |
where Gini refers to the average Gini importances for each feature computed from the XGBoost model. We then multiplied the MAP with the combined feature map and applied manual thresholding to enhance the contrast of lesions in the image.
Next, we supplemented the enhanced image with information about the texture in the neighbourhood of each pixel. We filtered the image using a set of 12 Gabor filters covering 3 wavelengths and 4 orientations (MATLAB function imgaborfilt). We also generated the x- and y-coordinate grids of the enhanced image so that the K-means clustering algorithm (MATLAB function imsegkmeans) would prefer groupings that are spatially close together. Finally, we concatenated the enhanced image with the Gabor-filtered images and the x and y grids for K-means segmentation (K = 2).
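The feature stack and clustering described here can be sketched in NumPy as below. This is not the MATLAB implementation: the Gabor kernels, the circular FFT filtering, the wavelength values, the feature standardization and the simple two-centre Lloyd iteration are all stand-in assumptions for imgaborfilt and imsegkmeans, and all function names are hypothetical.

```python
import numpy as np


def gabor_kernel(wavelength, theta):
    """Complex Gabor kernel; sigma tied to the wavelength (assumption)."""
    sigma = 0.5 * wavelength
    half = int(3 * sigma)
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.exp(2j * np.pi * xr / wavelength)


def gabor_magnitude(image, kernel):
    """Magnitude of the circular FFT convolution, same shape as the image."""
    kh, kw = kernel.shape
    padded = np.zeros(image.shape, dtype=complex)
    padded[:kh, :kw] = kernel
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.abs(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))


def build_features(image, wavelengths=(3, 6, 12), n_orient=4):
    """Stack intensity, 12 Gabor magnitudes and x/y grids; standardize columns."""
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    layers = [image.astype(float)]
    for lam in wavelengths:
        for k in range(n_orient):
            layers.append(gabor_magnitude(image, gabor_kernel(lam, k * np.pi / n_orient)))
    layers += [xx.astype(float), yy.astype(float)]
    feats = np.stack([layer.ravel() for layer in layers], axis=1)
    return (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-12)


def kmeans2(feats, n_iter=20):
    """Two-cluster Lloyd iteration seeded at the darkest and brightest pixels."""
    centers = feats[[np.argmin(feats[:, 0]), np.argmax(feats[:, 0])]].copy()
    for _ in range(n_iter):
        dist = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for c in range(2):
            if np.any(labels == c):
                centers[c] = feats[labels == c].mean(axis=0)
    return labels


# Hypothetical enhanced image: a bright blob on a noisy background.
rng = np.random.default_rng(0)
enhanced = 0.1 * rng.random((64, 64))
enhanced[20:36, 24:44] += 1.0
feats = build_features(enhanced)
mask = kmeans2(feats).reshape(enhanced.shape)
print(feats.shape, mask.shape)
```

Including the coordinate grids as features penalizes clusters that are spread across the image, which is the spatial preference described in the text.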
From the segmented masks, we applied morphological cleaning (MATLAB function bwmorph), removed the masks from the nipple and selected the largest mask as the rough lesion segmentation mask. The centroid of the mask was used to localize the lesion and a bounding box was drawn manually around the centroid. The image and the mask were then fed into the pretrained MedSAM model to output the finer segmentation mask, from which the long axis of the lesion was computed (MATLAB function regionprops) and compared with the clinical reports.
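For illustration, the centroid and a long-axis estimate can be derived from the second central moments of a binary mask, as sketched below. This assumes the long axis is defined as the major axis of the ellipse with the same normalized second moments as the region, which we understand to be the definition behind the major-axis length reported by regionprops; the function name, the pixel_size parameter and the rectangular test mask are hypothetical.

```python
import numpy as np


def centroid_and_long_axis(mask, pixel_size=1.0):
    """Centroid (row, col) and moment-based major-axis length of a binary mask."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask is empty")
    cy, cx = rows.mean(), cols.mean()
    # Normalized second central moments; 1/12 is the second moment of a unit pixel.
    uxx = ((cols - cx) ** 2).mean() + 1.0 / 12.0
    uyy = ((rows - cy) ** 2).mean() + 1.0 / 12.0
    uxy = ((cols - cx) * (rows - cy)).mean()
    common = np.sqrt((uxx - uyy) ** 2 + 4.0 * uxy ** 2)
    major = 2.0 * np.sqrt(2.0) * np.sqrt(uxx + uyy + common)
    return (cy, cx), major * pixel_size


# Hypothetical mask: a 6 x 20 pixel rectangle.
demo_mask = np.zeros((40, 40), dtype=bool)
demo_mask[10:16, 5:25] = True
centroid, long_axis = centroid_and_long_axis(demo_mask)
print(centroid, round(long_axis, 3))
```

Passing the physical pixel spacing as pixel_size converts the result to length units for comparison with reported lesion sizes.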
Statistical analysis
Statistical analysis was performed using MATLAB (R2021a). Data are presented as mean ± s.e.m. in all figure parts in which shadows or error bars are shown. For power analysis, we used the method proposed by Hanley and McNeil to estimate the variance of the AUROC (ref. 70):
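$$\mathrm{Var}(\mathrm{AUROC}) = \frac{\mathrm{AUROC}\,(1-\mathrm{AUROC}) + (n_1-1)\,(Q_1-\mathrm{AUROC}^2) + (n_0-1)\,(Q_2-\mathrm{AUROC}^2)}{n_0\,n_1} \qquad (15)$$
where $Q_1 = \mathrm{AUROC}/(2-\mathrm{AUROC})$, $Q_2 = 2\,\mathrm{AUROC}^2/(1+\mathrm{AUROC})$, and $n_0$ and $n_1$ are the numbers of negative and positive cases, respectively.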
We tested the null hypothesis that the true AUROC is 0.5 (no better than random chance) against the alternative hypothesis that the AUROC is greater than 0.5. From the estimated variance and s.e., we derived the Z-score as
$$Z = \frac{\mathrm{AUROC} - 0.5}{\mathrm{s.e.}} \qquad (16)$$
which leads to a one-sided P value of less than 0.0001. The 95% CI can be calculated as
$$\mathrm{CI} = \mathrm{AUROC} \pm z_{0.975} \times \mathrm{s.e.} \qquad (17)$$
where $z_{0.975} \approx 1.96$. With only six positive cases, the s.e. is relatively large, leading to a wide CI.
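The Hanley and McNeil estimate, the Z-score and the CI can be reproduced with a few lines of Python, as sketched below. The AUROC value and the number of negative cases used here are hypothetical inputs chosen only to exercise the formulas, not results from this study; clipping the CI to [0, 1] is an added convenience.

```python
import math


def hanley_mcneil_se(auc, n_neg, n_pos):
    """Standard error of the AUROC following Hanley and McNeil (1982)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_neg * n_pos)
    return math.sqrt(var)


def one_sided_p(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical inputs: AUROC of 0.90 with 30 negative and 6 positive cases.
auc, n_neg, n_pos = 0.90, 30, 6
se = hanley_mcneil_se(auc, n_neg, n_pos)
z = (auc - 0.5) / se          # test of H0: AUROC = 0.5
p = one_sided_p(z)
# 95% CI, clipped to the valid AUROC range (clipping is an added assumption).
ci = (max(0.0, auc - 1.96 * se), min(1.0, auc + 1.96 * se))
print(round(se, 4), round(z, 2), p, ci)
```

Because the positive-case count enters the denominator directly, a small n_pos inflates the s.e. and widens the interval, consistent with the remark above.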
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Supplementary Material
The online version contains supplementary material available at https://doi.org/10.1038/s41551-025-01435-3.
Acknowledgements
We thank Y. Aborahama and G. Zhao for machine-learning-related discussion, and R. Nelson, D. Schmolze and L. Vora for their interpretation of the radiology findings. This project has been made possible in part by National Institutes of Health grants R35 CA220436 (Outstanding Investigator Award), R01 CA282505, U01 EB029823 and R01 EB028277. Y.Z. was sponsored by the National Institutes of Health grant K99 EB035645. C.Z.L. was sponsored by the National Institutes of Health grants NIGMS T32 GM008042 and NIGMS T32 GM152342.
Footnotes
Competing interests
L.V.W. has a financial interest in Microphotoacoustics Inc., CalPACT LLC and Union Photoacoustic Technologies Ltd., which, however, did not support this work. The other authors declare no competing interests.
Data availability
The calculated features for all patients and the classification results are available on Figshare at https://doi.org/10.6084/m9.figshare.28675031 (ref. 81). The rest of the main data supporting the results in this study is available within the article and its Supplementary Information. The PA data are available for research purposes from the corresponding author on reasonable request.
Code availability
The code for data analysis is available on Figshare at https://doi.org/10.6084/m9.figshare.28675031 (ref. 81). The original code for MedSAM is available on GitHub (ref. 82). We applied this code to our dataset with the customized settings described in Methods. We have opted not to make the reconstruction and post-processing code (described in detail in Methods and ref. 74) publicly available because the code is proprietary and used for other projects.
References
- 1. Feuer EJ et al. The lifetime risk of developing breast cancer. J. Natl Cancer Inst 85, 892–897 (1993).
- 2. Hockenberger SJ Fibrocystic breast disease: every woman is at risk. Plast. Aesthet. Nurs 13, 37–40 (1993).
- 3. Siegel RL, Miller KD, Fuchs HE & Jemal A Cancer statistics, 2021. CA: Cancer J. Clin 71, 7–33 (2021).
- 4. Spak DA, Plaxco JS, Santiago L, Dryden MJ & Dogan BE BI-RADS fifth edition: a summary of changes. Diagn. Interv. Imaging 98, 179–190 (2017).
- 5. Pesapane F et al. Will traditional biopsy be substituted by radiomics and liquid biopsy for breast cancer diagnosis and characterisation? Med. Oncol 37, 29 (2020).
- 6. Popli MB, Teotia R, Narang M & Krishna H Breast positioning during mammography: mistakes to be avoided. Breast Cancer 8, 119–124 (2014).
- 7. Kolb TM, Lichy J & Newhouse JH Comparison of the performance of screening mammography, physical examination, and breast US and evaluation of factors that influence them: an analysis of 27,825 patient evaluations. Radiology 225, 165–175 (2002).
- 8. Berg WA et al. Diagnostic accuracy of mammography, clinical examination, US, and MR imaging in preoperative assessment of breast cancer. Radiology 233, 830–849 (2004).
- 9. von Euler-Chelpin M, Lillholm M, Vejborg I, Nielsen M & Lynge E Sensitivity of screening mammography by density and texture: a cohort study from a population-based screening program in Denmark. Breast Cancer Res. 21, 111 (2019).
- 10. Brem RF, Lenihan MJ, Lieberman J & Torrente J Screening breast ultrasound: past, present, and future. Am. J. Roentgenol 204, 234–240 (2015).
- 11. Lehman CD et al. Cancer yield of mammography, MR, and US in high-risk women: prospective multi-institution breast cancer screening study. Radiology 244, 381–388 (2007).
- 12. Corsetti V et al. Breast screening with ultrasound in women with mammography-negative dense breasts: evidence on incremental cancer detection and false positives, and associated cost. Eur. J. Cancer 44, 539–544 (2008).
- 13. Raza S, Chikarmane SA, Neilsen SS, Zorn LM & Birdwell RL BI-RADS 3, 4, and 5 lesions: value of US in management—follow-up and outcome. Radiology 248, 773–781 (2008).
- 14. Errico C et al. Ultrafast ultrasound localization microscopy for deep super-resolution vascular imaging. Nature 527, 499–502 (2015).
- 15. Sigrist RMS, Liau J, Kaffas AE, Chammas MC & Willmann JK Ultrasound elastography: review of techniques and clinical applications. Theranostics 7, 1303–1329 (2017).
- 16. Perazella MA Gadolinium-contrast toxicity in patients with kidney disease: nephrotoxicity and nephrogenic systemic fibrosis. Curr. Drug Saf 3, 67–75 (2008).
- 17. Eshed I, Althoff CE, Hamm B & Hermann K-GA Claustrophobia and premature termination of magnetic resonance imaging examinations. J. Magn. Reson. Imaging 26, 401–404 (2007).
- 18. Faris OP & Shein MJ Government viewpoint: US Food & Drug Administration: pacemakers, ICDs and MRI. Pacing Clin. Electrophysiol 28, 268–269 (2005).
- 19. Leff DR et al. Diffuse optical imaging of the healthy and diseased breast: a systematic review. Breast Cancer Res. Treat 108, 9–22 (2008).
- 20. Lin L & Wang LV The emerging role of photoacoustic imaging in clinical oncology. Nat. Rev. Clin. Oncol 19, 365–384 (2022).
- 21. Wang LV Multiscale photoacoustic microscopy and computed tomography. Nat. Photonics 3, 503–509 (2009).
- 22. Lin L et al. Single-breath-hold photoacoustic computed tomography of the breast. Nat. Commun 9, 2352 (2018).
- 23. Lin L et al. Photoacoustic computed tomography of breast cancer in response to neoadjuvant chemotherapy. Adv. Sci 10.1002/advs.202003396 (2021).
- 24. Lin L et al. High-speed three-dimensional photoacoustic computed tomography for preclinical research and clinical translation. Nat. Commun 12, 882 (2021).
- 25. Han S, Lee H, Kim C & Kim J Review on multispectral photoacoustic analysis of cancer: thyroid and breast. Metabolites 12, 382 (2022).
- 26. Dantuma M et al. Fully three-dimensional sound speed-corrected multi-wavelength photoacoustic breast tomography. Preprint at https://arxiv.org/abs/2308.06754 (2023).
- 27. Weidner N, Semple JP, Welch WR & Folkman J Tumor angiogenesis and metastasis—correlation in invasive breast carcinoma. N. Engl. J. Med 324, 1–8 (1991).
- 28. Schneider BP & Miller KD Angiogenesis of breast cancer. J. Clin. Oncol 23, 1782–1790 (2005).
- 29. Reynolds AR et al. Stimulation of tumor growth and angiogenesis by low concentrations of RGD-mimetic integrin inhibitors. Nat. Med 15, 392–400 (2009).
- 30. Vaupel P, Mayer A, Briest S & Höckel M in Oxygen Transport to Tissue XXVI (eds Okunieff P et al.) 333–342 (Springer, 2005); 10.1007/0-387-26206-7_44
- 31. Gilkes DM & Semenza GL Role of hypoxia-inducible factors in breast cancer metastasis. Future Oncol. 9, 1623–1636 (2013).
- 32. Folkman J Role of angiogenesis in tumor growth and metastasis. Semin. Oncol 29, 15–18 (2002).
- 33. Wang LV & Hu S Photoacoustic tomography: in vivo imaging from organelles to organs. Science 335, 1458–1462 (2012).
- 34. Ermilov SA et al. Laser optoacoustic imaging system for detection of breast cancer. J. Biomed. Opt 14, 024007 (2009).
- 35. Li X, Heldermon CD, Yao L, Xi L & Jiang H High resolution functional photoacoustic tomography of breast cancer. Med. Phys 42, 5321–5328 (2015).
- 36. Heijblom M et al. Photoacoustic image patterns of breast carcinoma and comparisons with magnetic resonance imaging and vascular stained histopathology. Sci. Rep 5, 11778 (2015).
- 37. Schoustra SM et al. Imaging breast malignancies with the Twente Photoacoustic Mammoscope 2. PLoS ONE 18, e0281434 (2023).
- 38. Menezes GLG et al. Downgrading of breast masses suspicious for cancer by using optoacoustic breast imaging. Radiology 288, 355–365 (2018).
- 39. Neuschler EI et al. A pivotal study of optoacoustic imaging to diagnose benign and malignant breast masses: a new evaluation tool for radiologists. Radiology 287, 398–412 (2018).
- 40. Dogan BE et al. Optoacoustic imaging and gray-scale US features of breast cancers: correlation with molecular subtypes. Radiology 292, 564–572 (2019).
- 41. Nyayapathi N et al. Photoacoustic dual-scan mammoscope: results from 38 patients. Biomed. Opt. Express 12, 2054–2063 (2021).
- 42. Abeyakoon O et al. An optoacoustic imaging feature set to characterise blood vessels surrounding benign and malignant breast lesions. Photoacoustics 10.1016/j.pacs.2022.100383 (2022).
- 43. Zheng W et al. Deep learning enhanced volumetric photoacoustic imaging of vasculature in human. Adv. Sci 10, 2301277 (2023).
- 44. Rodrigues J et al. Machine learning enabled photoacoustic spectroscopy for noninvasive assessment of breast tumor progression in vivo: a preclinical study. ACS Sens. 9, 589–601 (2024).
- 45. Li G et al. Deep learning combined with attention mechanisms to assist radiologists in enhancing breast cancer diagnosis: a study on photoacoustic imaging. Biomed. Opt. Express 15, 4689–4704 (2024).
- 46. Center for Devices and Radiological Health. Imagio Breast Imaging System—P200003 (FDA, 2021).
- 47. Pereira R et al. Evaluation of the accuracy of mammography, ultrasound and magnetic resonance imaging in suspect breast lesions. Clinics 75, e1805 (2020).
- 48. Fitzjohn J, Zhou C & Chase JG Critical assessment of mammography accuracy. IFAC-PapersOnLine 56, 5620–5625 (2023).
- 49. Park AY et al. An innovative ultrasound technique for evaluation of tumor vascularity in breast cancers: superb micro-vascular imaging. J. Breast Cancer 19, 210–213 (2016).
- 50. Chang Y-C, Huang Y-H, Huang C-S & Chang R-F Vascular morphology and tortuosity analysis of breast tumor inside and outside contour by 3-D power Doppler ultrasound. Ultrasound Med. Biol 38, 1859–1869 (2012).
- 51. Park AY et al. A prospective study on the value of ultrasound microflow assessment to distinguish malignant from benign solid breast masses: association between ultrasound parameters and histologic microvessel densities. Korean J. Radiol 20, 759–772 (2019).
- 52. Hintze JL & Nelson RD Violin plots: a box plot-density trace synergism. Am. Stat 52, 181–184 (1998).
- 53. Good P Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses (Springer Science & Business Media, 2013).
- 54. Freund Y & Schapire RE A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci 55, 119–139 (1997).
- 55. Chen T & Guestrin C XGBoost: a scalable tree boosting system. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (Association for Computing Machinery, 2016); 10.1145/2939672.2939785
- 56. van der Maaten L & Hinton G Visualizing data using t-SNE. J. Mach. Learn. Res 9, 2579–2605 (2008).
- 57. Ma J et al. Segment anything in medical images. Nat. Commun 15, 654 (2024).
- 58. Wu C, Pineda F, Hormuth DA II, Karczmar GS & Yankeelov TE Quantitative analysis of vascular properties derived from ultrafast DCE-MRI to discriminate malignant and benign breast tumors. Magn. Reson. Med 81, 2147–2160 (2019).
- 59. Weidner N Intratumor microvessel density as a prognostic factor in cancer. Am. J. Pathol 147, 9–19 (1995).
- 60. Smith AM, Mancini MC & Nie S Second window for in vivo imaging. Nat. Nanotechnol 4, 710–711 (2009).
- 61. Jacques SL Optical properties of biological tissues: a review. Phys. Med. Biol 58, R37–R61 (2013).
- 62. Li L et al. Single-impulse panoramic photoacoustic computed tomography of small-animal whole-body dynamics at high spatiotemporal resolution. Nat. Biomed. Eng 1, 0071 (2017).
- 63. Fenner J et al. Macroscopic stiffness of breast tumors predicts metastasis. Sci. Rep 4, 5512 (2014).
- 64. Ramião NG et al. Biomechanical properties of breast tissue, a state-of-the-art review. Biomech. Model. Mechanobiol 15, 1307–1323 (2016).
- 65. Berg WA et al. Combined screening with ultrasound and mammography vs mammography alone in women at elevated risk of breast cancer. JAMA 299, 2151–2163 (2008).
- 66. Sardanelli F et al. Multicenter surveillance of women at high genetic breast cancer risk using mammography, ultrasonography, and contrast-enhanced magnetic resonance imaging (the High Breast Cancer Risk Italian 1 Study): final results. Invest. Radiol 46, 94–105 (2011).
- 67. Shen S et al. A multi-centre randomised trial comparing ultrasound vs mammography for screening breast cancer in high-risk Chinese women. Br. J. Cancer 112, 998–1004 (2015).
- 68. Guo R, Lu G, Qin B & Fei B Ultrasound imaging technologies for breast cancer detection and management: a review. Ultrasound Med. Biol 44, 37–70 (2018).
- 69. Devolli-Disha E, Manxhuka-Kërliu S, Ymeri H & Kutllovci A Comparative accuracy of mammography and ultrasound in women with breast symptoms according to age and breast density. Biomol. Biomed 9, 131–136 (2009).
- 70. Hanley JA & McNeil BJ The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 29–36 (1982).
- 71. Qian X et al. Prospective assessment of breast cancer risk from multimodal multiview ultrasound images via clinically applicable deep learning. Nat. Biomed. Eng 5, 522–532 (2021).
- 72. Witowski J et al. Improving breast cancer diagnostics with deep learning for MRI. Sci. Transl. Med 14, eabo4802 (2022).
- 73. American National Standards Institute. ANSI Z136.1-2014—American National Standard for Safe Use of Lasers. (Laser Institute of America, 2014).
- 74. Xu M & Wang LV Universal back-projection algorithm for photoacoustic computed tomography. Phys. Rev. E 71, 016706 (2005).
- 75. Frangi AF, Niessen WJ, Vincken KL & Viergever MA Multiscale vessel enhancement filtering. In Proc. Medical Image Computing and Computer-Assisted Intervention (eds Wells WM et al.) 130–137 (Springer, 1998).
- 76. Cho S, Baik J, Managuli R & Kim C 3D PHOVIS: 3D photoacoustic visualization studio. Photoacoustics 18, 100168 (2020).
- 77. Lam L, Lee SW & Suen CY Thinning methodologies—a comprehensive survey. IEEE Trans. Pattern Anal. Mach. Intell 14, 869–885 (1992).
- 78. Tong X et al. Non-invasive 3D photoacoustic tomography of angiographic anatomy and hemodynamics of fatty livers in rats. Adv. Sci 10, 2205759 (2023).
- 79. Hu M-K Visual pattern recognition by moment invariants. IRE Trans. Inf. Theory 8, 179–187 (1962).
- 80. Thirion J-P Image matching as a diffusion process: an analogy with Maxwell’s demons. Med. Image Anal 2, 243–260 (1998).
- 81. Tong X & Liu C Panoramic photoacoustic computed tomography with learning-based classification and segmentation enhances breast lesion characterization. Datasets. figshare 10.6084/m9.figshare.28675031 (2025).
- 82. Ma J et al. MedSAM: segment anything model for medical image analysis. Source code. GitHub https://github.com/bowang-lab/MedSAM (2023).
