Supplementary Vignette 1

Consistent API for loading images from diverse modalities and file formats

PathML supports loading a wide array of imaging modalities and file formats under a standardized syntax. In this vignette, we present code snippets for loading image types that span brightfield H&E and IHC, highly multiplexed immunofluorescence, and spatial gene expression and proteomics, at scales from small images to gigapixel whole-slide images:

| Imaging modality | File format | Source | Image dimensions (X, Y, Z, C, T) |
|---|---|---|---|
| Brightfield H&E | Aperio SVS | OpenSlide example data | (32914, 46000, 1, 3, 1) |
| Brightfield H&E | Generic tiled TIFF | OpenSlide example data | (32914, 46000, 1, 3, 1) |
| Brightfield IHC | Hamamatsu NDPI | OpenSlide example data | (73728, 126976, 1, 3, 1) |
| Brightfield H&E | Hamamatsu VMS | OpenSlide example data | (76288, 102400, 1, 3, 1) |
| Brightfield H&E | Leica SCN | OpenSlide example data | (153470, 53130, 1, 3, 1) |
| Fluorescence | MIRAX | OpenSlide example data | (170960, 76324, 1, 3, 1) |
| Brightfield IHC | Olympus VSI | OpenSlide example data | (6753, 13196, 1, 3, 1) |
| Brightfield H&E | Trestle TIFF | OpenSlide example data | (25408, 61504, 1, 3, 1) |
| Brightfield H&E | Ventana BIF | OpenSlide example data | (93951, 105813, 1, 3, 1) |
| Fluorescence | Zeiss ZVI | OpenSlide example data | (1388, 1040, 13, 3, 1) |
| Brightfield H&E | DICOM | Orthanc example data | (30462, 78000, 1, 3, 1) |
| Fluorescence (CODEX spatial proteomics) | TIFF | Schurch et al., Cell 2020 | (1920, 1440, 17, 4, 23) |
| Fluorescence (time-series + volumetric) | OME-TIFF | OME-TIFF example data | (512, 512, 10, 2, 43) |
| Fluorescence (MERFISH spatial gene expression) | TIF | Zhuang et al., 2020 | (2048, 2048, 7, 1, 40) |
| Fluorescence (Visium 10x spatial gene expression) | TIFF | 10x Genomics | (25088, 26624, 1, 1, 4) |

All images used in these examples are publicly available for download from the sources listed in the table above.

Note that across the wide diversity of modalities and file formats, the syntax for loading images is consistent (see examples below).

Aperio SVS
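
A minimal sketch of loading a brightfield H&E whole-slide image, assuming PathML is installed and that its `HESlide` convenience class is used. The helper name `load_he_slide` and the file path are hypothetical placeholders, not taken from the original vignette.

```python
from pathlib import Path


def load_he_slide(path, name=None):
    """Load a brightfield H&E image as a PathML slide object."""
    # Imported lazily so that defining the helper does not require PathML.
    from pathml.core import HESlide

    path = Path(path)
    # Use the file stem as the slide name unless one is given explicitly.
    return HESlide(str(path), name=name or path.stem)


# Hypothetical local copy of the Aperio SVS example; loading is skipped if absent.
svs_path = Path("./data/example.svs")
he_slide = load_he_slide(svs_path) if svs_path.exists() else None
```

Under the same assumptions, only the path should need to change for the other brightfield H&E formats in this vignette.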

Generic tiled TIFF

Hamamatsu NDPI

The labels field can be used to store slide-level metadata; in this case we store the target gene, Ki-67.
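
Storing slide-level metadata at load time might look like the following sketch, which assumes the generic `SlideData` class with its `labels` and `slide_type` arguments. The helper name `load_ihc_slide`, the `"target"` key, and the file path are illustrative assumptions.

```python
from pathlib import Path


def load_ihc_slide(path, target, name=None):
    """Load a brightfield IHC image and record its staining target."""
    # Imported lazily so that defining the helper does not require PathML.
    from pathml.core import SlideData, types

    path = Path(path)
    return SlideData(
        str(path),
        name=name or path.stem,
        labels={"target": target},  # slide-level metadata
        slide_type=types.IHC,       # marks the slide as brightfield IHC
    )


# Hypothetical local copy of the Hamamatsu NDPI example; skipped if absent.
ndpi_path = Path("./data/example.ndpi")
ihc_slide = load_ihc_slide(ndpi_path, target="Ki-67") if ndpi_path.exists() else None
```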

Hamamatsu VMS

Leica SCN

MIRAX

Olympus VSI

Again, we use the labels field to store slide-level metadata such as the name of the target gene.

Trestle TIFF

Ventana BIF

Zeiss ZVI

Again, we use the labels field to store slide-level metadata such as the name of the target gene.

DICOM

Volumetric + time-series OME-TIFF
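
For an image with both a Z axis and a time axis, a sketch along these lines may apply. It assumes the Bio-Formats backend is selected explicitly and that the `volumetric` and `time_series` flags of `SlideData` describe the extra dimensions; the helper name `load_4d_slide` and the file path are hypothetical.

```python
from pathlib import Path


def load_4d_slide(path, labels=None):
    """Load a volumetric, time-series image through the Bio-Formats backend."""
    # Imported lazily so that defining the helper does not require PathML.
    from pathml.core import SlideData

    return SlideData(
        str(path),
        labels=labels,
        backend="bioformats",  # assumed backend for OME-TIFF data
        volumetric=True,       # image has more than one Z plane
        time_series=True,      # image has more than one time point
    )


# Hypothetical local copy of the OME-TIFF example; skipped if absent.
ome_path = Path("./data/example.ome.tif")
slide_4d = load_4d_slide(ome_path) if ome_path.exists() else None
```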

CODEX spatial proteomics

The labels field can store arbitrary slide-level metadata; here we specify the tissue type.
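
A sketch of the CODEX case, assuming PathML's `CODEXSlide` convenience class. The helper name `load_codex_slide`, the `"tissue type"` key, the example value, and the file path are illustrative assumptions.

```python
from pathlib import Path


def load_codex_slide(path, tissue_type):
    """Load a CODEX image and record the tissue type as slide-level metadata."""
    # Imported lazily so that defining the helper does not require PathML.
    from pathml.core import CODEXSlide

    return CODEXSlide(str(path), labels={"tissue type": tissue_type})


# Hypothetical local copy of one CODEX region; skipped if absent.
codex_path = Path("./data/example_codex.tif")
codex_slide = (
    load_codex_slide(codex_path, tissue_type="CRC")  # illustrative value
    if codex_path.exists()
    else None
)
```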

MERFISH spatial gene expression

Visium 10x spatial gene expression

Here we load an image with accompanying expression data in AnnData format.
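
One way to attach expression data is sketched below. It assumes the counts are already available as an `anndata.AnnData` object and are passed through the `counts` argument of `SlideData`. The helper name `load_visium_slide`, both file paths, and the choice of an `.h5ad` file for the counts are hypothetical.

```python
from pathlib import Path


def load_visium_slide(image_path, counts):
    """Load an image together with its expression matrix (an AnnData object)."""
    # Imported lazily so that defining the helper does not require PathML.
    from pathml.core import SlideData

    return SlideData(str(image_path), counts=counts, backend="bioformats")


# Hypothetical local files; loading is skipped unless both are present.
image_path = Path("./data/visium_example.tif")
counts_path = Path("./data/visium_counts.h5ad")

visium_slide = None
if image_path.exists() and counts_path.exists():
    import anndata

    # Assumes the counts were previously saved in .h5ad format.
    adata = anndata.read_h5ad(counts_path)
    visium_slide = load_visium_slide(image_path, adata)
```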

Summary

The PathML API provides a consistent, easy-to-use interface for loading a wide range of imaging data.

The output from all of the code snippets above is a SlideData object compatible with the PathML preprocessing module.
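
Because every loader returns the same kind of object, a batch of heterogeneous files can in principle be handled by one loop. The sketch below assumes the generic `SlideData` constructor accepts each set of keyword arguments shown; the helper name `load_available`, the paths, and the slide names are hypothetical.

```python
from pathlib import Path


def load_available(specs):
    """Load each (path, kwargs) pair whose file exists; return {path: slide}."""
    present = [(p, kw) for p, kw in specs if Path(p).exists()]
    if not present:
        return {}
    # Imported only when there is something to load.
    from pathml.core import SlideData

    return {p: SlideData(p, **kw) for p, kw in present}


# Hypothetical files in two different formats, loaded with one call pattern.
specs = [
    ("./data/example.svs", {"name": "aperio"}),
    (
        "./data/example.ome.tif",
        {
            "name": "ome_4d",
            "backend": "bioformats",
            "volumetric": True,
            "time_series": True,
        },
    ),
]
slides = load_available(specs)
```

Files that are missing are skipped, so the returned dictionary contains only the slides that could actually be opened.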

Full documentation of the PathML API is available at https://pathml.org.

Full code for this vignette is available at https://github.com/Dana-Farber-AIOS/pathml/tree/master/examples/vignettes/