Journal of Pathology Informatics. 2018 May 25;9:20. doi: 10.4103/jpi.jpi_69_17

Optimized JPEG 2000 Compression for Efficient Storage of Histopathological Whole-Slide Images

Henrik Helin 1, Teemu Tolonen 2, Onni Ylinen 1, Petteri Tolonen 1, Juha Näpänkangas 3, Jorma Isola 1
PMCID: PMC5989536  PMID: 29910969

Abstract

Background:

Whole slide images (WSIs, digitized histopathology glass slides) are large data files whose long-term storage remains a significant cost for pathology departments. Currently used WSI formats are based on lossy image compression algorithms, using either JPEG or its more efficient successor JPEG 2000. While the advantages of the JPEG 2000 algorithm (JP2) are commonly recognized, its compression parameters have not been fully optimized for pathology WSIs.

Methods:

We defined an optimized parametrization for JPEG 2000 image compression, designated JP2-WSI, to be used specifically with histopathological WSIs. Our parametrization is based on allowing a very high degree of compression on the background part of the WSI while using a conventional amount of compression on the tissue-containing part of the image, resulting in high overall compression ratios.

Results:

When comparing the compression power of JP2-WSI to the commonly used fixed 35:1 compression ratio JPEG 2000 and the default image formats of proprietary Aperio, Hamamatsu, and 3DHISTECH scanners, JP2-WSI produced the smallest file sizes and highest overall compression ratios for all 17 slides tested. The image quality, as judged by visual inspection and peak signal-to-noise ratio (PSNR) measurements, was equal to or better than the compared image formats. The average file size by JP2-WSI amounted to 15, 9, and 16 percent, respectively, of the file sizes of the three commercial scanner vendors' proprietary file formats (3DHISTECH MRXS, Aperio SVS, and Hamamatsu NDPI). In comparison to the commonly used 35:1 compressed JPEG 2000, JP2-WSI was three times more efficient.

Conclusions:

JP2-WSI allows very efficient and cost-effective data compression for whole slide images without loss of image information required for histopathological diagnosis.

Keywords: Digital pathology, image compression, JPEG 2000, virtual microscopy, whole-slide imaging

INTRODUCTION

Whole-slide images (WSIs), representing entire digitized histopathological tissue sections, are very large image files, usually over 20 gigabytes (GB) in size without image compression.[1] Large-scale use of whole-slide imaging in a pathology department generates tens [2,3] or hundreds [4,5] of terabytes (1 terabyte = 1000 GB) of image data each year, not including storage redundancy or backup, which further increase the storage footprint.[6,7] Owing to the high costs of storing WSIs,[8] digital archiving in a clinical setting may necessitate some form of image lifecycle management, such as deleting older WSIs from hard disks, or moving them to cheaper storage media, for example, magnetic tape.[2,3,6,9] This, in turn, counteracts one of the main advantages of WSIs over glass slides, namely, ease of access. To save storage space WSIs are compressed, usually irreversibly using so-called lossy compression algorithms such as JPEG or its successor JPEG 2000.[7,10] Lossy image compression is mathematically irreversible, meaning some image information is lost during the compression. Lossless image compression, on the other hand, is reversible, and no image information is lost in the process.[1] The degree of data compression is generally expressed as a compression ratio defined as the uncompressed file size divided by the compressed file size. Although there is no consensus regarding acceptable degrees of image compression for pathology WSIs,[11,12] JPEG is thought to allow 10:1–20:1 and JPEG 2000 30:1–50:1 data compression without loss of diagnostic information.[10] JPEG compression is often defined by a nonstandard compression quality level, usually expressed as a value between 0 and 100, where the bigger the value, the better the resulting image quality is, and the less compression is applied. 
The compression ratio achieved with a given compression quality level depends on the image content and therefore the compression quality level is not directly proportional to compression ratio. Even when using image data compression, the costs of storing WSIs remain high in pathology laboratories using WSI in routine practice.
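The definition of compression ratio given above (uncompressed file size divided by compressed file size) can be written out as a minimal sketch. The function name and the 20 GB example slide are illustrative assumptions, not values taken from the study's tables.

```python
def compression_ratio(uncompressed_bytes: float, compressed_bytes: float) -> float:
    """Return the compression ratio: uncompressed size divided by compressed size."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return uncompressed_bytes / compressed_bytes


GB = 1000 ** 3  # decimal gigabyte, consistent with 1 terabyte = 1000 GB

# Hypothetical example: a 20 GB uncompressed WSI stored as a 1 GB file.
ratio = compression_ratio(20 * GB, 1 * GB)
print(f"{ratio:.0f}:1")
```

A ratio is written here as N:1 with the uncompressed size first, so larger values mean stronger compression.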

The JPEG 2000 image compression method has been designed to allow a highly customizable way of compressing image data by user-controlled parametrization.[13] The compression parameters of the JPEG 2000 algorithm (JP2) have so far not been fully optimized for digital pathology.[11] We have previously employed a fixed compression ratio to allow fast remote viewing of WSIs over the internet using JPIP (JPEG 2000 Interactive Protocol).[13] Newer image server software is able to read and decompress JPEG 2000 files on the fly and send image tiles to the client through hypertext transfer protocol. This allows testing a wider set of compression parameters for producing maximal file size reduction while avoiding image compression artifacts. The present study demonstrates a novel image content-sensitive strategy for pathology WSI-optimized JPEG 2000 compression, designated JP2-WSI. The new compression parametrization is compared to commercial WSI formats and the commonly used JPEG and constant compression ratio JPEG 2000 in terms of file sizes and visual image quality.

PROCEDURE

The concept of whole slide image-optimized parametrization for JPEG 2000 image compression

The prevailing method of defining JPEG 2000 image compression is to choose a fixed compression ratio for the scanned microscope slides.[7] However, due to the highly variable amount of diagnostically irrelevant background on the slides [Figure 1], fixed ratio compression has not turned out to produce optimal file size reduction for WSIs. When using fixed ratio compression, the more tissue there is in relation to empty slide area, the more details must be discarded to achieve the desired compression ratio. Thus, a fixed compression ratio will result in variable image quality on tissue-containing image areas. To avoid unacceptably low image quality on any WSI, we have previously defined compression ratios of 25–30:1,[13] and subsequently 35:1, as suitable for JPEG 2000 compression of histopathological WSIs. When using standard JPEG compression, the image quality associated with level 80 compression out of 100 has been considered suitable for large-scale applications of WSI in pathology.[2,3] Compression quality level is not a concept that is defined in the JPEG standard and as such is not unambiguous. In our experience, level 80 JPEG compression typically produces compression ratios of about 20:1 in pathology WSIs.
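The effect of a fixed overall ratio on tissue quality can be illustrated with a simplified two-region model. This is a sketch under stated assumptions rather than the authors' analysis: the image is split into a tissue fraction and a background fraction, and the background is assumed to compress at a hypothetical, very high ratio (1000:1 here). Whatever remains of the fixed byte budget is spent on tissue.

```python
def effective_tissue_ratio(overall_ratio: float,
                           tissue_fraction: float,
                           background_ratio: float) -> float:
    """Compression ratio applied to tissue pixels under a fixed overall ratio.

    All quantities are per uncompressed byte, so the total byte budget is
    1 / overall_ratio. The background ratio is a hypothetical assumption.
    """
    budget = 1.0 / overall_ratio
    background_cost = (1.0 - tissue_fraction) / background_ratio
    tissue_budget = budget - background_cost
    if tissue_budget <= 0:
        raise ValueError("background alone exceeds the byte budget")
    return tissue_fraction / tissue_budget


# Fixed 35:1 overall ratio, hypothetical 1000:1 on background.
for fraction in (0.1, 0.5, 0.9):
    ratio = effective_tissue_ratio(35.0, fraction, 1000.0)
    print(f"tissue fraction {fraction:.1f}: about {ratio:.1f}:1 on tissue")
```

In this model, a slide that is mostly tissue is compressed on its tissue at nearly the full 35:1, while a slide that is mostly background leaves far more bytes per tissue pixel, which is the variable image quality described above.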

Figure 1.

A macroscopic view of the seventeen routine histopathological slides used in the study. The shaded rectangle on slide fourteen demonstrates the area that makes up the whole-slide image containing both tissue and empty slide area

We defined JP2-WSI compression to match the image quality of hematoxylin and eosin-stained tissue sections scanned and stored with JPEG level 80 compression, followed by maximal compression of the empty slide area (the background). For assessing the image quality of the tissue, we used peak signal-to-noise ratio (PSNR) measurements [14] and visual inspection by two senior pathologists (TT and JI). In our validation data of seventeen routine histopathological slides, the mean PSNR values of JPEG level 80 and JP2-WSI compression were 33.4 dB and 35.8 dB, respectively (with the lossless image used as reference). The corresponding mean value for fixed 35:1 ratio JPEG 2000 was 40.8 dB. In visual inspection by two senior pathologists, there was no significant loss of overall image quality associated with JP2-WSI compression compared to JPEG level 80 compression.
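For readers unfamiliar with PSNR, the standard definition is 10·log10(MAX² / MSE), where MSE is the mean squared error between the reference and the compressed image and MAX is the largest possible sample value. The sketch below assumes 8-bit samples (MAX = 255) and treats the image as a flat sequence of sample values. The paper does not state how the color channels were combined, so that choice is an assumption made here.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], test: Sequence[int], max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized sample sequences."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every sample differs from the reference by exactly 1.
lossless = [10, 20, 30, 40]
compressed = [11, 19, 31, 39]
print(f"{psnr(lossless, compressed):.2f} dB")
```

Higher values indicate a smaller deviation from the lossless reference.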

Aside from the variable compression ratio, the main codestream parameters for JP2-WSI were essentially the same as we have used before: eight wavelet decomposition levels, no tiling, precinct size 256 × 256, code-block size 64 × 64, progression order resolution-position-component-layer, and one quality layer.[13]
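The codestream parameters listed above can be collected in a small configuration object. This is a sketch: the class and method names are hypothetical, and the rendered attribute strings follow Kakadu-style parameter naming (`Clevels`, `Cprecincts`, `Cblk`, `Corder`, `Clayers`) as an assumption that should be checked against the documentation of the encoder actually used. The source does not give the command line or the mechanism behind the variable compression ratio, so neither is reproduced here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Jp2WsiCodestream:
    """Main codestream parameters as listed in the text (names are hypothetical)."""
    decomposition_levels: int = 8
    precinct_size: Tuple[int, int] = (256, 256)
    code_block_size: Tuple[int, int] = (64, 64)
    progression_order: str = "RPCL"  # resolution-position-component-layer
    quality_layers: int = 1
    tiled: bool = False  # "no tiling": the image is coded as a single tile

    def kakadu_style_attributes(self) -> List[str]:
        """Render Kakadu-style attribute strings (assumed spelling; verify locally)."""
        p, b = self.precinct_size, self.code_block_size
        return [
            f"Clevels={self.decomposition_levels}",
            f"Cprecincts={{{p[0]},{p[1]}}}",
            f"Cblk={{{b[0]},{b[1]}}}",
            f"Corder={self.progression_order}",
            f"Clayers={self.quality_layers}",
        ]


print(" ".join(Jp2WsiCodestream().kakadu_style_attributes()))
```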

Comparison of JP2-WSI to standard JPEG and JPEG 2000 compression methods

For evaluating JP2-WSI, we digitized a set of seventeen histopathological slides selected from a university hospital pathology archive. These slides reflect the routine workload of a general pathologist. The slides included needle biopsies (n = 10) as well as surgical sections (n = 7) and were stained with hematoxylin and eosin (n = 15) and Giemsa (n = 2). Figure 1 presents an overview of the slide set. As an additional test of diagnostically challenging WSI image quality, we digitized a gastric biopsy slide to verify the image quality of JP2-WSI in visualizing Helicobacter pylori. The slides were digitized with whole-slide scanners from four different vendors, including two line scanners (Aperio and Hamamatsu) and two tile-based scanners (3DHISTECH and Jilab). The scanner setups were as follows:

  1. Aperio ScanScope AT2 (Leica Biosystems, Nussloch, Germany) brightfield line scanner, Piranha Color 2k PC-30-02K80 camera (Teledyne DALSA, Ontario, Canada) with 2048 × 3 pixel resolution, pixel size 14 × 14 μm, ×20 Olympus Plan-Apo objective lens with a numerical aperture (NA) of 0.75, and scanning resolution 0.5 μm/pixel

  2. Hamamatsu NanoZoomer XR (Hamamatsu Photonics, Hamamatsu, Japan) brightfield line scanner, charge-coupled device (CCD) camera with 4096 × 64 pixel resolution, pixel size 8 × 8 μm, ×20 Olympus Plan-Apo objective lens (NA 0.75) and ×1.75 relay lens, scanning resolution 0.46 μm/pixel

  3. Pannoramic SCAN (3DHISTECH Ltd, Budapest, Hungary) brightfield tile-based scanner, CIS 3CCD camera with 2048 × 2048 pixel resolution, pixel size 5.5 × 5.5 μm, ×20 Carl Zeiss Plan-Apochromat objective lens (NA 0.8) and ×1 phototube, scanning resolution 0.24 μm/pixel

  4. SlideStrider (Jilab Inc, Tampere, Finland) brightfield tile-based scanner, Lumenera Lt1265R CCD camera (Lumenera Corporation, Ottawa, Ontario, Canada) with 4240 × 2832 pixel resolution, pixel size 3.1 μm × 3.1 μm, ×10 (NA 0.4) and ×20 (NA 0.75) Olympus UPLSAPO objective lenses, scanning resolution 0.16–0.31 μm/pixel.

The area included in the WSI was the smallest rectangle covering all individual tissue fragments on the slide. For the SlideStrider scans, the nonscanned empty slide areas required to fill in the WSI rectangle were copied automatically from a standard empty slide image tile.
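The "smallest rectangle covering all individual tissue fragments" is an axis-aligned bounding box over the fragments. The sketch below assumes each fragment has already been located and is given as a hypothetical `(left, top, right, bottom)` rectangle in slide coordinates. How the scanners detect tissue is not described in the text and is outside this example.

```python
from typing import Iterable, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom)


def enclosing_rectangle(fragments: Iterable[Rect]) -> Rect:
    """Smallest axis-aligned rectangle that covers every fragment rectangle."""
    rects = list(fragments)
    if not rects:
        raise ValueError("at least one tissue fragment is required")
    left = min(r[0] for r in rects)
    top = min(r[1] for r in rects)
    right = max(r[2] for r in rects)
    bottom = max(r[3] for r in rects)
    return (left, top, right, bottom)


# Two hypothetical fragments on one slide.
print(enclosing_rectangle([(10, 20, 50, 60), (40, 5, 120, 30)]))
```

Any part of this rectangle that contains no tissue is the empty slide area that JP2-WSI compresses most heavily.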

All WSIs were scanned as 24-bit RGB color images. WSIs from the Aperio, Hamamatsu, and Pannoramic scanners were saved without compression and then compressed with JP2-WSI. The same scanned WSIs were also saved using the manufacturers' proprietary file formats and their default compression schemes. Aperio SVS format used JPEG tile compression with compression level set at 70/100. Both Hamamatsu NDPI format and Pannoramic MRXS format employed JPEG tile compression with quality level 80/100. The WSIs scanned with SlideStrider were first saved losslessly and then converted to either fixed 35:1 ratio JPEG 2000 or the developed JP2-WSI compression. The compression method of the SlideStrider software is based on the Kakadu software development kit library implementation of JPEG 2000 (version 7.5, Kakadu Software Inc., NewSouth Innovations Pty Limited, Sydney, Australia).[15] The four scanners all had different sampling resolutions: Aperio 0.5 μm/pixel, Hamamatsu 0.46 μm/pixel, Pannoramic 0.24 μm/pixel, and SlideStrider 0.16–0.31 μm/pixel.

RESULTS

Table 1 presents the pixel dimensions, the ratios of empty slide to tissue area, and the file sizes of the 17 slides digitized with the SlideStrider scanner at 0.31 μm/pixel. The uncompressed file sizes ranged from 2.6 to 30 GB. Lossless JPEG 2000 compression yielded compression ratios ranging from 3:1 to 56:1 and file sizes from 341 megabytes (MB) to 5.9 GB. The fixed ratio JPEG 2000 algorithm compressed all images to 35:1, except for two cases (slides 14 and 16), for which higher compression ratios were already achieved with the lossless algorithm. The file sizes ranged from 74 MB to 686 MB. The developed JP2-WSI compression produced overall compression ratios varying from 41:1 to 1487:1, and file sizes of 8 MB to 442 MB. On average, using JP2-WSI, we obtained file sizes that were 33% of those of fixed-ratio lossy compressed JPEG 2000. Of the individual scanned histopathology test slides, JP2-WSI reduced file sizes most effectively in biopsy slides containing multiple small tissue fragments and abundant empty slide area (slides 14, 16, and 17 in our test set). The ratio of empty slide area to tissue-containing slide area showed an approximately linear relationship with the overall compression ratio achieved with JP2-WSI [Figure 2].
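One way to see why the relationship is approximately linear is a simplified two-region model, offered here as a sketch rather than as the authors' analysis. Let E be the ratio of empty slide area to tissue area, and assume tissue is compressed at a constant ratio while the background is compressed at a much higher one. The overall ratio is then (1 + E) / (1/r_tissue + E/r_background), which approaches r_tissue × (1 + E) as the background cost becomes negligible. The specific ratios below are hypothetical.

```python
import math


def overall_ratio(empty_to_tissue: float,
                  tissue_ratio: float,
                  background_ratio: float) -> float:
    """Overall compression ratio of a two-region image (simplified model).

    Areas are expressed relative to one unit of tissue, so the uncompressed
    size is 1 + E and the compressed size is 1/r_tissue + E/r_background.
    """
    uncompressed = 1.0 + empty_to_tissue
    compressed = 1.0 / tissue_ratio + empty_to_tissue / background_ratio
    return uncompressed / compressed


# Hypothetical values: 30:1 on tissue, negligible cost for background.
for e in (0, 1, 5, 20):
    print(f"E = {e:>2}: about {overall_ratio(e, 30.0, math.inf):.0f}:1 overall")
```

With a finite background ratio the curve bends below the straight line at large E, which is consistent with the relationship being described as approximately rather than exactly linear.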

Table 1.

Comparison of whole-slide image file sizes produced by three different parametrizations of JPEG 2000

Figure 2.

The compression ratio obtained by JP2-WSI plotted against the ratio of empty slide to tissue area in the set of seventeen digitized slides. The approximately linear relation signifies JP2-WSI's sensitivity to image content – the greater the ratio of empty slide to tissue area the higher the overall compression ratio that is achieved

Figures 3 and 4 allow comparison of the visual image quality obtained with JP2-WSI compression with that of JPEG quality level 80 compression and fixed 35:1 ratio JPEG 2000 compression. In Figure 3, visually detectable differences can be seen only with zoom levels well over 100%, which represent purely digital magnification. The magnified screenshots come from WSIs with file sizes of 528 MB and 21 MB (JPEG 2000 35:1 and JP2-WSI, respectively) and 611 MB and 18 MB (JPEG 2000 35:1 and JP2-WSI, respectively). They were scanned using the SlideStrider whole-slide scanner with a ×10 objective lens and a charge-coupled device camera with 3.1 μm pixel size, resulting in 0.31 μm/pixel scanning resolution. Figure 4 presents zoomed screenshots of Helicobacteria in a gastric biopsy scanned with resolutions of 0.31 μm/pixel and 0.16 μm/pixel (Plan-Apo ×10 and ×20 objective lenses, respectively). At 0.31 μm/pixel, there are subtle visible differences in the image quality. JP2-WSI eliminates random noise, resulting in a smooth or blurry image appearance, whereas JPEG produces a grainier or noisier image. With the higher optical scanning resolution, we were unable to detect diagnostic differences in image quality.

Figure 3.

Comparison of image quality between JP2-WSI (a and b), JPEG quality level 80 (c and d), and fixed 35:1 ratio JPEG 2000 compression (e and f). Digitally magnified screenshots of whole-slide images 14 (a, c, e) and 16 (b, d, f)

Figure 4.

Effects of image compression and whole-slide image scanning resolution on detecting Helicobacteria in a gastric biopsy. JP2-WSI (a and c) and JPEG quality level 80 (b and d) compressed images scanned at 0.31 μm/pixel (a and b) and 0.16 μm/pixel (c and d) sampling resolutions

Table 2 shows the file sizes resulting from digitizing the set of 17 slides with 3DHISTECH, Aperio, and Hamamatsu scanners. For each scanner, three different file sizes are shown per slide: the raw uncompressed file size, the file size using the scanner's default compression method, and the file size using JP2-WSI compression. The file sizes are not comparable between scanners because of different scan area dimensions and different scanning resolutions. The uncompressed file sizes ranged from 4.84 GB to 63.15 GB. 3DHISTECH default compression produced compression ratios of 10–158 while JP2-WSI compressed the same images with ratios of 66–2250. For Aperio, the compression ratios were 7–32 for default compression and 48–1289 for JP2-WSI. Hamamatsu default compression produced compression ratios of 10–42 with JP2-WSI producing compression ratios of 49–1342. JP2-WSI compression had the widest range of overall compression ratios, 66–2250, 48–1289, and 49–1342 for 3DHISTECH, Aperio, and Hamamatsu scanned images, respectively. JP2-WSI compressed images were smallest and had the highest overall compression ratios in every case. All of the compression methods produced the highest compression ratios for the biopsy slides 14–17. These slides had the highest ratios of empty slide to tissue area [Figure 1]. Table 3 summarizes the mean file sizes and mean compression ratios of the slides digitized with 3DHISTECH, Aperio, and Hamamatsu scanners. JP2-WSI compressed file sizes were 15%, 9%, and 16% of the file sizes produced by 3DHISTECH, Aperio, and Hamamatsu default compression methods, respectively.
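The relative file sizes reported above (15%, 9%, and 16%) can be restated as reduction factors, which some readers find easier to compare. The sketch below only rearranges the reported percentages, and the function name is illustrative.

```python
def reduction_factor(relative_size: float) -> float:
    """How many times smaller a file is, given its size relative to a baseline."""
    if not 0 < relative_size <= 1:
        raise ValueError("relative size must be in (0, 1]")
    return 1.0 / relative_size


# Mean JP2-WSI file size relative to each vendor's default format, from the text.
relative_sizes = {
    "3DHISTECH MRXS": 0.15,
    "Aperio SVS": 0.09,
    "Hamamatsu NDPI": 0.16,
}

for fmt, rel in relative_sizes.items():
    print(f"{fmt}: about {reduction_factor(rel):.1f} times smaller with JP2-WSI")
```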

Table 2.

File sizes of the slides digitized with different scanner and compression method combinations

Table 3.

Mean whole-slide images file sizes produced by different scanner and compression method combinations

DISCUSSION

Due to the costly hard disk storage of modern pathology WSIs, it is essential to optimize data compression for routine applications of digital pathology. Many commercial whole slide scanners still employ JPEG compression in their proprietary file formats, probably mainly due to its computational simplicity. Its successor, JPEG 2000, offers more powerful wavelet-based compression, and its random data access, among other features, makes it well suited for server-based remote WSI viewing.[10,13] An optimal WSI data compression method should take into account the type of the images to be stored.[16] In the case of pathology slides, this means taking advantage of the abundant empty space on the slide. With this in mind, we designed our novel WSI-optimized JPEG 2000 parametrization, JP2-WSI, which compresses the empty slide area very heavily but retains image quality on the tissue-containing parts of the image. This not only minimizes the effect of empty slide area on WSI file size but also adjusts the overall compression ratio to the amount of image detail, meaning that the image quality rather than the overall compression ratio is kept constant. A standard practice in histopathology is to distribute several serial sections of small biopsies on the slide. This inevitably leads to biopsy WSIs containing larger empty slide areas than WSIs of surgical specimens, which have been scanned with tight margins. JP2-WSI was designed to handle both WSI types but was found particularly effective in reducing the file size of biopsy WSIs.

When designing a novel compression method for WSIs, it was self-evident that the image quality may not be compromised. We designed JP2-WSI to keep the inevitable information loss associated with all lossy image compression algorithms essentially the same as that obtained with JPEG level 80 compression, which has been considered a satisfactory image quality in histopathological WSIs.[2,3] Using this definition, JP2-WSI yielded file sizes that were < 20% of those obtained by proprietary JPEG 80 compression, and only 33% of conventional fixed-ratio JPEG 2000 compression. This permits significant cost savings in the routine use of WSI, where tens or even hundreds of terabytes of image data are generated each year.[2,3,4,5]

CONCLUSION

We developed a novel parametrization of JPEG 2000 image compression designed for histopathological WSIs. The main advantage of JP2-WSI is its sensitivity to image content. Our optimized JP2-WSI conforms to the JPEG 2000 image coding standard, maintaining its open, nonproprietary nature. JPEG 2000 encoding is an allowed image compression method in the DICOM standard, and it is also allowed for the WSI storage class. A commercial picture archiving and communications system has recently implemented JPEG 2000 encoded WSI storage class in full workflow (Neagen, Oulu, Finland). This demonstrates that JPEG 2000 encoding is applicable also in the DICOM context. A large-scale clinical study is underway to confirm the diagnostic accuracy of JP2-WSI-compressed WSIs compared to conventional glass slides and optical microscopes.

Financial support and sponsorship

Nil.

Conflicts of interest

There are no conflicts of interest.

REFERENCES

  • 1. Pantanowitz L, Dickinson K, Evans AJ, Hassell LA, Henricks WH, Lennerz JK, et al. American telemedicine association clinical guidelines for telepathology. J Pathol Inform. 2014;5:39. doi: 10.4103/2153-3539.143329.
  • 2. Stathonikos N, Veta M, Huisman A, van Diest PJ. Going fully digital: Perspective of a Dutch academic pathology lab. J Pathol Inform. 2013;4:15. doi: 10.4103/2153-3539.114206.
  • 3. Huisman A, Looijen A, van den Brink SM, van Diest PJ. Creation of a fully digital pathology slide archive by high-volume tissue slide scanning. Hum Pathol. 2010;41:751–7. doi: 10.1016/j.humpath.2009.08.026.
  • 4. Clunie DA, Dennison DK, Cram D, Persons KR, Bronkalla MD, Primo HR, et al. Technical challenges of enterprise imaging: HIMSS-SIIM collaborative white paper. J Digit Imaging. 2016;29:583–614. doi: 10.1007/s10278-016-9899-4.
  • 5. Lundström C, Thorstenson S, Waltersson M, Persson A, Treanor D. Summary of 2nd nordic symposium on digital pathology. J Pathol Inform. 2015;6:5. doi: 10.4103/2153-3539.151889.
  • 6. Chlipala E, Elin J, Eichhorn O, Krishnamurti M, Long RE, Sabata B, et al. Archival and Retrieval in Digital Pathology Systems. [Last accessed on 2017 Oct 21]. Available from: http://www.digitalpathologyassociation.org/_data/files/Archival_and_Retrieval_in_Digital_Pathology_Systems.pdf
  • 7. García-Rojo M. International clinical guidelines for the adoption of digital pathology: A review of technical aspects. Pathobiology. 2016;83:99–109. doi: 10.1159/000441192.
  • 8. Häger S. How to build a business case to justify the investment in digital pathology. [Last accessed on 2017 Oct 21]. Available from: https://www.sectra.com/medical/resources/how-to-build-a-business-case-to-justify-theinvestment-in-digital-pathology/
  • 9. Thorstenson S, Molin J, Lundström C. Implementation of large-scale routine diagnostics using whole slide imaging in Sweden: Digital pathology experiences 2006-2013. J Pathol Inform. 2014;5:14. doi: 10.4103/2153-3539.129452.
  • 10. Digital Imaging and Communications in Medicine (DICOM) – Supplement 145: Whole Slide Microscopic Image IOD and SOP Classes. DICOM Standards Committee, Working Group 26, Pathology. [Last accessed on 2017 Oct 21]. Available from: http://www.medical.nema.org/medical/dicom/final/sup145_ft.pdf
  • 11. Krupinski EA, Johnson JP, Jaw S, Graham AR, Weinstein RS. Compressing pathology whole-slide images using a human and model observer evaluation. J Pathol Inform. 2012;3:17. doi: 10.4103/2153-3539.95129.
  • 12. Kalinski T, Zwönitzer R, Grabellus F, Sheu SY, Sel S, Hofmann H, et al. Lossless compression of JPEG2000 whole slide images is not required for diagnostic virtual microscopy. Am J Clin Pathol. 2011;136:889–95. doi: 10.1309/AJCPYI1Z3TGGAIEP.
  • 13. Tuominen VJ, Isola J. The application of JPEG2000 in virtual microscopy. J Digit Imaging. 2009;22:250–8. doi: 10.1007/s10278-007-9090-z.
  • 14. Wang Z, Bovik AC, Lu L. Why is image quality assessment so difficult? In: IEEE International Conference on Acoustics, Speech, and Signal Processing. Orlando, FL, USA: IEEE; 2002. pp. 3313–6.
  • 15. Kakadu Software. [Last accessed on 2017 Oct 21]. Available from: http://www.kakadusoftware.com
  • 16. Muir P, Li S, Lou S, Wang D, Spakowicz DJ, Salichos L, et al. The real cost of sequencing: Scaling computation to keep pace with data generation. Genome Biol. 2016;17:53. doi: 10.1186/s13059-016-0917-0.

Articles from Journal of Pathology Informatics are provided here courtesy of Elsevier
