Bioinformatics. 2011 Feb 23;27(8):1179–1180. doi: 10.1093/bioinformatics/btr095

Improved structure, function and compatibility for CellProfiler: modular high-throughput image analysis software

Lee Kamentsky 1, Thouis R Jones 1, Adam Fraser 1, Mark-Anthony Bray 1, David J Logan 1, Katherine L Madden 1, Vebjorn Ljosa 1, Curtis Rueden 2, Kevin W Eliceiri 2, Anne E Carpenter 1,*
PMCID: PMC3072555  PMID: 21349861

Abstract

Summary: There is a strong and growing need in the biology research community for accurate, automated image analysis. Here, we describe CellProfiler 2.0, which has been engineered to meet the needs of its growing user base. It is more robust and user friendly, with new algorithms and features to facilitate high-throughput work. ImageJ plugins can now be run within a CellProfiler pipeline.

Availability and Implementation: CellProfiler 2.0 is free and open source, available at http://www.cellprofiler.org under the GPL v. 2 license. It is available as a packaged application for Macintosh OS X and Microsoft Windows and can be compiled for Linux.

Contact: anne@broadinstitute.org

Supplementary information: Supplementary data are available at Bioinformatics online.

1 INTRODUCTION

CellProfiler is freely available, open-source software that enables researchers without training in computer programming to measure biological phenotypes quantitatively and automatically from thousands of images. With an interface designed by biologists and underlying algorithms developed by computer scientists, CellProfiler bridges the gap between advanced image analysis algorithms and scientists who lack computational expertise. In the 4 years since its publication (Carpenter et al., 2006; Lamprecht et al., 2007), it has been rapidly adopted by the worldwide biological community and cited in more than 250 articles. Roughly half of its users are outside the USA. CellProfiler was initially designed for high-throughput image analysis but is often used for small-scale projects. This highlights the trend toward quantifying information in images regardless of experiment size.

CellProfiler's interface lets researchers build customized chains of interoperable image analysis modules to identify and measure biological objects and features in images. These modular pipelines can be saved and shared with colleagues. CellProfiler has been used to measure individual cells, colonies of cells and whole organisms in a wide range of assays (e.g. counting cells, measuring staining intensities and scoring complex phenotypes with machine learning) and at many experimental scales (from a few to hundreds of thousands of images). A variety of cell types have been analyzed, including budding yeast, Drosophila, mouse, rat and dozens of human cell types. The diverse measurements generated by CellProfiler provide raw material for machine-learning algorithms that can identify challenging phenotypes (Jones et al., 2009; Misselwitz et al., 2010; Ramo et al., 2009).

CellProfiler fills a unique role in the software landscape. It is a modular, high-throughput, open-source biological image analysis package, and it won the 2009 Bio-IT World Best Practices Award in IT & Informatics. CellProfiler 2.0 improves upon the design of the original version, resulting in professionally engineered software with improved usability and functionality, as well as integration with other open-source image-related software.

2 IMPROVEMENTS IN CELLPROFILER 2.0

Robust infrastructure and interoperability: we redesigned the software's infrastructure while porting it from the proprietary MATLAB language to the open-source Python language, making use of the high-performance scientific libraries NumPy and SciPy (Oliphant, 2007). While retaining the successful attributes of CellProfiler 1.0 (Supplementary Fig. 1 and Supplementary Table 1), CellProfiler 2.0 compares favorably to CellProfiler 1.0 in terms of performance (Supplementary Fig. 2) and features (Supplementary Table 2). Object-oriented design and professional software practices were integral to the porting effort, including version control, a continuous build process and the development of an extensive validation suite.

CellProfiler 2.0 is designed to be extensible and interoperable; its plug-in interface allows outside developers to write and distribute new CellProfiler modules. We use Cython (http://www.cython.org) both to implement computationally intensive algorithms and to bridge to precompiled libraries, including Java via the Java Native Interface. The Java/Python bridge allows CellProfiler 2.0 to load nearly 100 image formats via the Open Microscopy Environment Consortium's Bio-Formats library (http://www.loci.wisc.edu/software/bio-formats). Because 5% of CellProfiler-citing papers also used ImageJ (http://rsbweb.nih.gov/ij), we built a bridge to run ImageJ macros in the context of a CellProfiler pipeline. In our own research, we have used third-party ImageJ plug-ins via CellProfiler to enhance neurites in images (Supplementary Fig. 1A) and to detect focal planes in three-dimensional images.

User-oriented improvements: CellProfiler 2.0 has a much-enhanced user interface for editing pipelines (Fig. 1), including drag-and-drop operations, context-sensitive menus, undo capabilities, user-friendly error reporting and context-dependent warnings for mistakes in a pipeline's settings (Supplementary Fig. 1B). A newly designed test mode allows a researcher to step through a pipeline and repeatedly adjust settings (Supplementary Fig. 1C) to optimize image analysis. Within each module, CellProfiler shows only those settings relevant to the user's existing choices, resulting in a concise and comprehensible display. Extensive context-dependent help guides users in choosing settings for their assay (Supplementary Fig. 1D). Pipelines are now saved in a human-readable text format (Supplementary Material: Example CellProfiler 2.0 pipeline file).

Fig. 1. User interface for CellProfiler 2.0.

New and improved algorithms: for neuron image analysis, CellProfiler 2.0 includes operations to enhance neurites and to measure their branching, and algorithms for neuron-specific metrics are in development. An updated time-lapse object-tracking module implements a recently developed algorithm based on a linear assignment approach (Jaqaman et al., 2008). New morphological operations can find the convex hull of foreground objects and enhance dark holes in images. Illumination correction options now include spline fitting (Lindblad and Bengtsson, 2001), and thresholding options have been extended to partition intensities into three classes instead of the typical two. Other changes include an algorithm for more accurate operations on masked images (Knutsson and Westin, 1993), faster measurement of Zernike-based shape features (Supplementary Fig. 2) and improved measurement of Gabor (Supplementary Fig. 3) and Haralick texture features (Supplementary Table 3).
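The masked-image operation mentioned above can be illustrated with normalized convolution (Knutsson and Westin, 1993): a filter is applied to the image with masked pixels zeroed and, separately, to the mask itself, and the first result is divided by the second. The following is a minimal sketch in plain Python using a box filter. The function name `masked_mean_filter`, the list-of-lists image representation and the example values are assumptions made for illustration; they are not taken from CellProfiler's source, which would more plausibly operate on NumPy arrays.

```python
def masked_mean_filter(image, mask, radius=1):
    """Box-filter `image`, ignoring pixels where `mask` is False.

    Normalized convolution: filter(image * mask) / filter(mask).
    Output pixels whose window contains no valid input are set to None.
    """
    rows, cols = len(image), len(image[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0  # box filter applied to image * mask
            weight = 0   # box filter applied to the mask itself
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    if mask[rr][cc]:
                        total += image[rr][cc]
                        weight += 1
            if weight:
                out[r][c] = total / weight
    return out


# Illustrative 3x3 image with one bright artifact in the center.
image = [[1.0, 1.0, 1.0],
         [1.0, 100.0, 1.0],
         [1.0, 1.0, 1.0]]
mask = [[True, True, True],
        [True, False, True],   # exclude the artifact
        [True, True, True]]
all_valid = [[True] * 3 for _ in range(3)]

smoothed = masked_mean_filter(image, mask)
unmasked = masked_mean_filter(image, all_valid)
print(smoothed[1][1])  # 1.0: the artifact does not leak into the result
print(unmasked[1][1])  # 12.0: an ordinary mean filter is skewed by it
```

Dividing by the filtered mask is what keeps excluded pixels from biasing their neighbors. Simply zeroing them before an ordinary filter would pull nearby values toward zero instead.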

Enhancements for high-throughput use: CellProfiler can be run in batch mode, in which sets of images are partitioned between CellProfiler instances running on separate computing cores or cluster nodes in a distributed environment. In CellProfiler 2.0, images can be loaded via HTTP or located using a comma-delimited text file of image file locations, which might be generated by automated microscopes or laboratory information systems. Image metadata can be loaded in the same way. CellProfiler 2.0 has enhanced database capabilities and can now upload directly to MySQL or SQLite databases during image processing. CellProfiler 2.0's FlagImage module can exclude images from analysis based on measurements of image quality, such as blurriness and the presence of debris. Images can be grouped for aggregate operations, such as illumination correction of images on a per-plate basis or analysis of multiple time-lapse movies or three-dimensional image stacks. More detailed information on CellProfiler and high-throughput screening is available at http://www.cellprofiler.org/hcs.html.
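The batch workflow described above can be sketched with the Python standard library alone: a comma-delimited image list is read, split into batches, and a row is written to an SQLite database for each image as its batch is processed. This is only an illustration under stated assumptions. The column names (`url`, `plate`, `well`), the `per_image` table, the example URLs and the helper functions are all hypothetical and do not reflect CellProfiler's actual file format, database schema or API. The "analysis" step is a stand-in that records only which batch handled each image.

```python
import csv
import io
import sqlite3

# Hypothetical image list, as an automated microscope or laboratory
# information system might export it. Column names are illustrative only.
IMAGE_LIST_CSV = """url,plate,well
http://example.org/P1/A01.tif,P1,A01
http://example.org/P1/A02.tif,P1,A02
http://example.org/P1/A03.tif,P1,A03
http://example.org/P2/A01.tif,P2,A01
http://example.org/P2/A02.tif,P2,A02
"""


def read_image_sets(csv_text):
    """Parse the comma-delimited list into one dict per image set."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def make_batches(image_sets, batch_size):
    """Split image sets into consecutive batches, one per worker job."""
    return [image_sets[i:i + batch_size]
            for i in range(0, len(image_sets), batch_size)]


def process_batch(batch_index, batch, connection):
    """Stand-in for image analysis: record which batch handled each image."""
    rows = [(s["url"], s["plate"], s["well"], batch_index) for s in batch]
    with connection:  # commit once per batch
        connection.executemany(
            "INSERT INTO per_image (url, plate, well, batch) "
            "VALUES (?, ?, ?, ?)",
            rows)


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE per_image (url TEXT, plate TEXT, well TEXT, batch INTEGER)")

image_sets = read_image_sets(IMAGE_LIST_CSV)
batches = make_batches(image_sets, batch_size=2)
for index, batch in enumerate(batches):
    # In a real deployment each batch would run on its own core or node.
    process_batch(index, batch, connection)

# Count stored rows per plate, e.g. as a basis for per-plate aggregate steps.
images_per_plate = dict(connection.execute(
    "SELECT plate, COUNT(*) FROM per_image GROUP BY plate"))
print(len(batches), images_per_plate)
```

Here the batches run sequentially against one in-memory database to keep the sketch self-contained. In a distributed setting each worker would open its own connection to a shared database.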

Future directions: we will use the improved infrastructure and design of CellProfiler 2.0 as the basis for our future work. Where feasible, we will continue to leverage existing open-source projects to add functionality, such as software for workflow management (e.g. OMERO and KNIME) and classification of pixels or whole images by machine learning (e.g. Wndchrm and Ilastik). While supporting contributions from other developers, we will also develop novel algorithms for CellProfiler based on our ongoing research, including time-lapse and three-dimensional image analysis; metrics and corrections for assay quality control and performance evaluation; and algorithms for Caenorhabditis elegans image-based screens (Riklin-Raviv et al., 2010; Wählby et al., 2010).

ACKNOWLEDGEMENTS

The authors thank the members of their laboratories for contributing to the development of the software and this article, especially Shravas Rao and Emily Schloff.

Funding: National Institutes of Health (R01 GM089652-01 to A.E.C., RC2 GM092519-01 to K.W.E. and NIH RL1 HG004671, which is administratively linked to RL1 CA133834, RL1 GM084437 and UL1 RR024924).

Conflict of Interest: none declared.

REFERENCES

  1. Carpenter A.E., et al. CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol. 2006;7:R100. doi: 10.1186/gb-2006-7-10-r100.
  2. Jaqaman K., et al. Robust single-particle tracking in live-cell time-lapse sequences. Nat. Methods. 2008;5:695–702. doi: 10.1038/nmeth.1237.
  3. Jones T.R., et al. Scoring diverse cellular morphologies in image-based screens with iterative feedback and machine learning. Proc. Natl Acad. Sci. USA. 2009;106:1826–1831. doi: 10.1073/pnas.0808843106.
  4. Knutsson H., Westin C.-F. Normalized and differential convolution: methods for interpolation and filtering of incomplete and uncertain data. Proc. IEEE Conf. Comput. Vis. Pattern Recognit. 1993:515–523.
  5. Lamprecht M.R., et al. CellProfiler: free, versatile software for automated biological image analysis. Biotechniques. 2007;42:71–75. doi: 10.2144/000112257.
  6. Lindblad J., Bengtsson E. A comparison of methods for estimation of intensity nonuniformities in 2D and 3D microscope images of fluorescence stained cells. In: Proceedings of the 12th Scandinavian Conference on Image Analysis (SCIA). Bergen, Norway: Norwegian Society for Image Processing and Pattern Recognition; 2001. pp. 264–271.
  7. Misselwitz B., et al. Enhanced CellClassifier: a multi-class classification tool for microscopy images. BMC Bioinformatics. 2010;11:30. doi: 10.1186/1471-2105-11-30.
  8. Oliphant T.E. Python for scientific computing. Comput. Sci. Eng. 2007;9:10–20.
  9. Ramo P., et al. CellClassifier: supervised learning of cellular phenotypes. Bioinformatics. 2009;25:3028–3030. doi: 10.1093/bioinformatics/btp524.
  10. Riklin-Raviv T., et al. Morphology-guided graph search for untangling objects: C. elegans analysis. In: Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). Beijing, China: Springer; 2010.
  11. Wählby C., et al. Resolving clustered worms via probabilistic shape models. In: IEEE International Symposium on Biomedical Imaging (ISBI): From Nano to Macro. Rotterdam, The Netherlands: 2010.

Articles from Bioinformatics are provided here courtesy of Oxford University Press