PLOS ONE. 2021 Aug 12;16(8):e0252000. doi: 10.1371/journal.pone.0252000

easyXpress: An R package to analyze and visualize high-throughput C. elegans microscopy data generated using CellProfiler

Joy Nyaanga 1,2, Timothy A Crombie 1, Samuel J Widmayer 1, Erik C Andersen 1,*
Editor: Heidi A Tissenbaum 3
PMCID: PMC8360505  PMID: 34383778

Abstract

High-throughput imaging techniques have become widespread in many fields of biology. These powerful platforms generate large quantities of data that can be difficult to process and visualize efficiently using existing tools. We developed easyXpress to process and review C. elegans high-throughput microscopy data in the R environment. The package provides a logical workflow for the reading, analysis, and visualization of data generated using CellProfiler’s WormToolbox. We equipped easyXpress with powerful functions to customize the filtering of noise in data, specifically by identifying and removing objects that deviate from expected animal measurements. This flexibility in data filtering allows users to optimize their analysis pipeline to match their needs. In addition, easyXpress includes tools for generating detailed visualizations, allowing the user to interactively compare summary statistics across wells and plates with ease. Researchers studying C. elegans benefit from this streamlined and extensible package as it is complementary to CellProfiler and leverages the R environment to rapidly process and analyze large high-throughput imaging datasets.

Introduction

Developments in high-throughput imaging techniques have led to a rapid increase in the volume of imaging data. Researchers are able to move away from the laborious manual collection of images that typically limits large-scale analyses [1]. Furthermore, these advances have enabled scientists to collect data of intact cells, tissues, and whole-organisms with increased temporal and spatial resolution [2]. However, typical users require software methods for efficient handling, analysis, and visualization to make the most of these extensive image datasets.

C. elegans is a globally distributed, free-living roundworm nematode that is amenable to many types of experimental biology. The C. elegans cell lineage is completely characterized [3], and the C. elegans connectome is completely mapped [4], making these animals an exemplary model for developmental biology and neurobiology. The species can also be rapidly reared in large, genetically diverse populations in laboratory settings, providing unparalleled statistical power for experimental biology compared to any other metazoan [5]. Furthermore, metabolic and developmental pathways in C. elegans are conserved in humans [6]. High-throughput imaging technologies can improve C. elegans studies by increasing experimental efficiency, scalability, and quality. Existing systems for automated image acquisition, such as the Molecular Devices ImageXpress platforms, generate images of nematodes that can be analyzed with software like CellProfiler’s WormToolbox [7] to extract nematode phenotype information. This software uses probabilistic nematode models trained on user-selected animals to automate the segmentation of nematodes from the background of images in high-throughput. As a result, CellProfiler’s WormToolbox is able to measure hundreds of phenotypes related to animal shape, intensity, and texture. Implementing this software for large-scale imaging experiments can generate large quantities of data that require additional analysis software for reliable and reproducible handling, processing, and visualization. CellProfiler Analyst was developed to offer tools for the analysis of image-based datasets, but this software is not integrated with modern statistical environments. We sought to design a resource that facilitates the exploration of CellProfiler data in the R environment [8], where this limitation can be eliminated. The R language provides extensive open-source statistical and data visualization tools that are well supported by the user community. In leveraging R, we are able to create a flexible tool that can be rapidly integrated with other statistical R packages to suit project-specific analysis needs.

We developed easyXpress, a software package for the R statistical programming language, to assist in the processing, analysis, and visualization of C. elegans data generated using CellProfiler. easyXpress provides tools for quality control, summarization, and visualization of image-based C. elegans phenotype data. Built to be complementary to CellProfiler, this package provides a streamlined workflow for the rapid quantitative analysis of high-throughput imaging datasets.

Methods

Preparation of animals for imaging

Bleach-synchronized animals were fed E. coli HB101 bacteria suspended and allowed to develop at 20°C with continuous shaking. Animals in 96-well microtiter plates were titered to approximately 30 animals per well. Prior to imaging, animals were treated with sodium azide (50 mM in 1X M9) for 10 minutes to paralyze and straighten their bodies.

Imaging

Animals in microtiter plates were imaged at 2X magnification with an ImageXpress Nano (Molecular Devices, San Jose, CA). The ImageXpress Nano acquires brightfield images with a 4.7 megapixel CMOS camera, and the images are stored in 16-bit TIFF format. The images were processed using CellProfiler software (for details see https://github.com/AndersenLab/CellProfiler).

Paraquat dose response

A 1.5 M solution of paraquat (Methyl viologen dichloride, Sigma, 856177-1G) was prepared in sterile water, aliquoted, and frozen at -20°C until used. Experimental animals were grown at 20°C and fed OP50 bacteria spotted on modified nematode growth medium, containing 1% agar and 0.7% agarose to prevent animals from burrowing. After three generations of passaging, animals were bleach-synchronized and embryos were transferred to the wells of 96-well microplates. Each well contained approximately 30 embryos in 50 μL of K medium [9]. Microplates were incubated overnight at 20°C with continuous shaking. The following day, arrested L1 animals were fed HB101 bacteria suspended in K medium. At the time of feeding, the animals were also exposed to paraquat at one of six concentrations (0, 7.81, 31.25, 125, 500, 2000 μM) by serial dilution of a freshly thawed aliquot of 1.5 M paraquat solution. The final volume in each well after dosing and feeding was 75 μL. The animals were then grown for 48 hours at 20°C with continuous shaking, after which the microplates were imaged to assess the effects of paraquat exposure on nematode development.

Naming conventions

Several functions in the easyXpress package require specific naming conventions to work properly. For full details regarding essential file naming and directory structure see the package repository (https://github.com/AndersenLab/easyXpress). Importantly, when using the Metadata module in CellProfiler to extract information describing your images, specific column names are suggested (Table 1).

Table 1. Suggested naming conventions for CellProfiler metadata.

| Image_FileName_RawBF | Image_PathName_RawBF | Metadata_Date | Metadata_Experiment | Metadata_Plate | Metadata_Magnification | Metadata_Well |
| --- | --- | --- | --- | --- | --- | --- |
| 20191119-growth-p05-m2X_C03.TIF | /CellProfiler/example/raw_images | 20191119 | growth | p05 | m2X | C03 |
| 20191119-growth-p06-m2X_C09.TIF | /CellProfiler/example/raw_images | 20191119 | growth | p06 | m2X | C09 |
| 20191119-growth-p09-m2X_C06.TIF | /CellProfiler/example/raw_images | 20191119 | growth | p09 | m2X | C06 |

The column names “Metadata_Plate” and “Metadata_Well” are essential to the setFlags(), viewPlate(), viewWell(), and viewDose() functions. Additionally, “Image_FileName_RawBF” and “Image_PathName_RawBF” are necessary for the proper function of viewDose().
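The example rows in Table 1 suggest that each metadata column can be read directly from the image file name. The following is a minimal sketch of that mapping, assuming the date-experiment-plate-magnification_well.TIF layout seen in those rows. The pattern and the function name parse_image_name are hypothetical illustrations and are not part of easyXpress or CellProfiler.

```python
import re

# Hypothetical pattern inferred from the example rows in Table 1:
# <date>-<experiment>-<plate>-<magnification>_<well>.TIF
NAME_PATTERN = re.compile(
    r"^(?P<Metadata_Date>\d{8})-"
    r"(?P<Metadata_Experiment>[^-]+)-"
    r"(?P<Metadata_Plate>p\d+)-"
    r"(?P<Metadata_Magnification>m\d+X)_"
    r"(?P<Metadata_Well>[A-H]\d{2})\.TIF$"
)


def parse_image_name(file_name):
    """Return the Table 1 metadata fields for one image file name."""
    match = NAME_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name}")
    return match.groupdict()
```

A file name that does not follow the assumed layout raises an error rather than producing partially filled metadata, which is one way to catch naming mistakes before the downstream functions are called.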

Results

Design and implementation

The easyXpress package is designed to be simple and accessible to users familiar with the R environment. The easyXpress package comprises nine functions for reading, processing, and visualizing large high-throughput image-based datasets acquired from microplate-based assays processed with CellProfiler (Fig 1). Because our software is built to handle CellProfiler data as input, we suggest users review the overview and applications of CellProfiler as a prerequisite description of data generation [7]. Below, we describe the workflow for users to analyze their image data with easyXpress.

Fig 1. easyXpress workflow.


The suggested workflow for using the easyXpress package starts with raw data generated from CellProfiler. For more information on implementing CellProfiler to generate data, see https://github.com/CellProfiler and https://github.com/AndersenLab/CellProfiler. Reading steps are shown in blue, processing steps are shown in green, and visualization steps are shown in yellow.

Data import and model assignment

To read in CellProfiler data files, we provide readXpress(). Measurements calculated by CellProfiler can be exported in a comma-separated value (csv) file and accessed using readXpress(). For large-scale, high-throughput experiments, users can employ a computing cluster for increased analysis speed (https://github.com/AndersenLab/CellProfiler). In this case, CellProfiler data stored in .RData format is accessed using readXpress(). Additionally, the function can optionally import a design file created by the user containing experimental treatments and conditions. This design file is joined to the CellProfiler data and output as a single dataframe.
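The join between the design file and the CellProfiler measurements can be pictured with a short Python sketch. It is an illustration only and not the readXpress() implementation: the function name join_design is hypothetical, and the choice of Metadata_Plate and Metadata_Well as join keys is an assumption, because the text does not state which columns are used.

```python
def join_design(measurements, design, keys=("Metadata_Plate", "Metadata_Well")):
    """Left-join design rows onto measurement rows using the given key columns.

    Both inputs are lists of dicts. The key columns are an assumption made
    for this sketch.
    """
    lookup = {tuple(row[k] for k in keys): row for row in design}
    joined = []
    for row in measurements:
        extra = lookup.get(tuple(row[k] for k in keys), {})
        # Measurement values take precedence if a column name appears in both.
        joined.append({**extra, **row})
    return joined
```

Measurements from wells that are absent from the design file are kept unchanged, so no imaged object is silently dropped by the join.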

CellProfiler’s WormToolbox detects and measures the phenotypes of individual animals based on user-calibrated models of variability in body size and shape [7]. To effectively detect animals in a mixed-stage population, multiple worm models must be used. However, using multiple worm models creates a one-to-many relationship between real animals and their measured phenotype (S1 Fig). We have included the function modelSelection() to annotate this information for downstream analysis. In instances where multiple worm model objects are assigned to a single primary object, modelSelection() will identify the best fitting model. Models are first ranked by frequency in the dataset such that the smallest model is classified as the most frequently occurring and the largest model is the least frequently occurring. In our experience, the most frequently occurring model in the dataset has the smallest size because it is often repeatedly assigned to a single primary object. Conversely, the least frequently occurring model in the dataset has the largest size as it is able to define the entire size of an animal, and is thus assigned to a primary object only once. The largest ranked model occurring within a single primary object is then selected as the best fitting model for that primary object. If necessary, modelSelection() will also specify whether the selected model object was repeatedly assigned to the same primary object and flag this event as a cluster. This problem occurs in instances where a model object is repeatedly assigned to a single primary object. If the largest model object is found to occur repeatedly in a single primary object, this model will be selected and a cluster flag will be added (S1C Fig). The modelSelection() step is essential to resolve cases where multiple instances of a selected model object are assigned to a single primary object, thus contributing to inaccurate phenotype measurements.
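The ranking and selection rule described above can be sketched in a few lines of Python. This is an illustration of the described logic under stated assumptions, not the modelSelection() source: the function name select_models, the input layout (one dict per model object with a "parent" primary object id and a "model" name), and the alphabetical tie-break between equally frequent models are all hypothetical.

```python
from collections import Counter


def select_models(objects):
    """Pick the best fitting model for each primary object.

    Returns {parent: (selected_model, cluster_flag)}.
    """
    # Rank models by frequency across the whole dataset: the most frequent
    # model gets rank 1 (treated as smallest), the least frequent model gets
    # the highest rank (treated as largest). Ties are broken alphabetically,
    # which is an arbitrary choice made for this sketch.
    frequency = Counter(obj["model"] for obj in objects)
    ordered = sorted(frequency, key=lambda m: (-frequency[m], m))
    rank = {model: i + 1 for i, model in enumerate(ordered)}

    # Count the model objects assigned to each primary object.
    per_parent = {}
    for obj in objects:
        per_parent.setdefault(obj["parent"], Counter())[obj["model"]] += 1

    selection = {}
    for parent, counts in per_parent.items():
        best = max(counts, key=lambda m: rank[m])
        # A cluster is flagged when the selected model occurs more than once
        # within the same primary object.
        selection[parent] = (best, counts[best] > 1)
    return selection
```

Applied to the three animals described in the S1 Fig legend, this rule selects the L4 model for animal A, the Adult model for animal B, and the L2/L3 model with a cluster flag for animal C.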

Data pruning and summarization

Once the data are read into the R statistical environment, it is crucial to optimize data quality before in-depth analysis. Uneven well illumination can hinder the performance of CellProfiler’s object identification and phenotype extraction. Despite correcting for uneven illumination within a well, discerning foreground objects from background can be especially challenging near the periphery of the well and can add noise to nematode phenotype data (S2 Fig). The function edgeFlag() was written to identify and flag animals located near the edge of circular wells using the centroid coordinates of the selected model object. By default, the function sets the radius of even illumination from the image center to 825 pixels, but this parameter can be adjusted by the user to serve project-specific analyses.
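The edge test amounts to a distance check between each object centroid and the image center. The sketch below illustrates that check in Python and is not the edgeFlag() source. The 825 pixel radius is the default stated above, whereas the image center of (1024, 1024), the column names "x" and "y", and the flag name well_edge_flag are assumptions made for this example.

```python
import math


def edge_flag(objects, radius=825, center_x=1024, center_y=1024):
    """Mark objects whose centroid lies outside the evenly illuminated circle.

    The center coordinates are assumed values for this sketch; they depend
    on the dimensions of the acquired images.
    """
    flagged = []
    for obj in objects:
        distance = math.hypot(obj["x"] - center_x, obj["y"] - center_y)
        flagged.append({**obj, "well_edge_flag": distance > radius})
    return flagged
```

Reducing the radius flags more objects near the well periphery, and increasing it retains more of them, which is the trade-off a user tunes for a given imaging setup.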

We also developed setFlags() in conjunction with edgeFlag() to further address data points that deviate from the expected animal measurements. The function setFlags() takes the output of edgeFlag() and detects outlier measurements among all measurements within a well using Tukey’s fences [10]. By default, outlier calculations are performed by excluding data identified by modelSelection() as part of a cluster as well as data in close proximity to the well edge. However, setFlags() is customizable, allowing the user to specify which filters to include. edgeFlag() and setFlags() were designed to allow for analysis-specific optimization when handling various experimental datasets. This flexibility in data filtering makes easyXpress extensible to many unique projects.
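Tukey’s fences mark a measurement as an outlier when it falls more than k interquartile ranges below the first quartile or above the third quartile. The Python sketch below illustrates the rule for the measurements of a single well. It is not the setFlags() source: the function name tukey_outliers is hypothetical, k = 1.5 is the conventional multiplier rather than a value stated in the text, and the quartile method is an assumption that can shift the fences slightly.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return one boolean per value, True when it lies outside Tukey's fences."""
    if len(values) < 2:
        # Quartiles are undefined for fewer than two points; flag nothing.
        return [False] * len(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]
```

To mirror the default behavior described above, the function would be called once per well on the measurements that remain after cluster-flagged and edge-flagged objects are set aside.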

Once data are adequately flagged, the function process() organizes the data into a list containing four elements: raw data, processed data, and summaries for both datasets. The raw data element is the CellProfiler data following modelSelection() and flag annotation. The processed data are generated by default after subsequent removal of all cluster, edge, and outlier flags. If a user includes data annotated as clusters or edge cases in setFlags(), cluster and edge cases will be retained in the processed data output. Finally, it is often useful to summarize data by well to interpret patterns specific to experimental variables. Alternatively, measurements may be summarized by other experimental factors according to the individual experimenter’s plate design. process() aids in the summarization of both the raw and processed data elements. This function comprehensively calculates the means, variances, quantiles, minimum, and maximum values of animal length for any experimental unit (e.g. well). We have also included the wrapper function Xpress() to accelerate the import and processing of CellProfiler data. Xpress() will perform the above functions with all default settings, but a user can alter input arguments to better suit project-specific needs.
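The well-level summary can be illustrated with a short Python sketch that groups length measurements by plate and well. It is not the process() source: the function name summarize_by_well, the column name "length", and the output layout are assumptions, and only a subset of the listed statistics is computed.

```python
import statistics
from collections import defaultdict


def summarize_by_well(rows, value="length"):
    """Summarize one measurement per (plate, well) experimental unit."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["Metadata_Plate"], row["Metadata_Well"])].append(row[value])

    summary = {}
    for key, values in groups.items():
        summary[key] = {
            "n": len(values),
            "mean": statistics.fmean(values),
            "median": statistics.median(values),
            # The sample variance needs at least two observations.
            "var": statistics.variance(values) if len(values) > 1 else float("nan"),
            "min": min(values),
            "max": max(values),
        }
    return summary
```

Grouping by a different pair of columns, for example strain and dose, would give the alternative summaries by experimental factor mentioned above.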

Visualization

The easyXpress package provides several plotting functions to allow users to explore the data through detailed and elegant visualizations. After data summarization, it is often useful to inspect the values of the summary statistics in order to recognize patterns or identify potential outlier data. We provide viewPlate() to assist with the visualization of mean animal length within each well across a microtiter plate (Fig 2). This function accepts either raw or processed data to generate an interactive plot that allows users to scan across a plate to determine the number of objects present within individual wells.

Fig 2. Example plot generated by viewPlate().


Well-wise plot of mean animal length (μm) from the summarized processed data. Interactive feature enables the assessment of the number of animals per well.

To complement the top-level data visualization provided by viewPlate(), we have included viewWell() to allow users to deeply explore data within individual wells. This function generates a plot of the well image following CellProfiler analysis with all objects annotated with their assigned class (Fig 3). Additionally, viewWell() can optionally generate a boxplot of the length values for each object. This plotting function is especially useful because it enables rapid qualitative assessment of object classification performance. By overlaying the model object classifications on the well image, users can quickly determine whether CellProfiler classified objects as expected or whether errors in model selection or data flags occurred.

Fig 3. Example plots generated by viewWell().


The function viewWell() facilitates the exploration of data within an individual well. Well images displaying easyXpress raw (A) and processed (B) data are annotated with the location of each model object centroid (circles) and are colored by object class in the legend (left). Animals are outlined in different colors to indicate the model object(s) identified for each primary object (see S1 Fig). The length of each object is displayed as a boxplot (right). Well edge circumference defined by the function edgeFlag() is shown in red.

Lastly, we have developed the function viewDose() to allow for the visualization of dose response data. C. elegans are often used to study conserved responses to various compounds [11–15]. viewDose() allows a user to visually examine the effect of a compound on animal size and shape over a range of concentrations (Fig 4). By specifying the strain and compound of interest, a plot of representative wells will be generated that includes labels for each identified object.

Fig 4. Example plots generated by viewDose().


The function viewDose() plots representative raw (A) or processed (B) well images with objects annotated by model class for each dose of a selected drug and strain. The length measurements of the raw (C) and processed (D) data are also shown.

Application to C. elegans growth data

We evaluated easyXpress using data collected from a C. elegans growth experiment [16]. Animals were imaged throughout the entire life cycle, beginning at the first larval (L1) stage and continuing until adulthood. Images were then processed with CellProfiler’s WormToolbox and analyzed using easyXpress. During the implementation of easyXpress, four unique worm models representing C. elegans life stages were calibrated and applied: L1, L2/L3, L4, and Adult. These worm models do not designate stage assignments but rather represent the approximate sizes of animals that fall within the respective age groups (S1 Fig). The function modelSelection() assigned the appropriate model object to animals at each life stage, edgeFlag() and setFlags() identified outlier data points, and viewWell() provided clear visualizations of both the processed (Fig 5) and raw (S3 Fig) data.

Fig 5. easyXpress applied to C. elegans growth data.


A subset of well images acquired during C. elegans development displaying easyXpress processed data are shown here. Images taken at (A) 9 hours indicating the L1 stage, (B) 28 hours indicating the L2/L3 stage, (C) 46 hours indicating the L4 stage, and (D) 63 hours indicating the adult stage were analyzed with CellProfiler using four worm models. The easyXpress workflow was then used to process and visualize the data. The length of each object identified after processing is shown in (E).

Conclusions

The easyXpress package presents an organized workflow for managing C. elegans phenotype data generated using CellProfiler. This package provides tools for the reading, processing, and visualization of these data in a simple and efficient way. By leveraging existing R infrastructure, easyXpress enables reproducible analysis, integration with other statistical R packages, and extensibility to many research projects using an open-source analysis pipeline.

Supporting information

S1 Fig. Multiple model objects assigned to a single primary object.

When running CellProfiler’s WormToolbox with multiple worm models, multiple model objects can be assigned to a single primary object (real animal). Different colors are used to outline each worm model object. Here four unique models were used: L1, L2/L3, L4, and Adult. These worm models represent the approximate sizes of animals at each life stage. For example, some mutant or diverse wild genetic backgrounds might have differently sized adult animals as compared to the laboratory-adapted N2 strain. We have included this “soft matching” to account for small differences in the sizes of developmental stages across different genetic backgrounds, laboratories, and environmental conditions. (A) An animal detected by CellProfiler as a primary object has been assigned three unique worm models: two L1 model objects, one L2/L3 model object, and one L4 model object. modelSelection() classifies this animal as an L4 model object. (B) An animal detected as a primary object has been assigned four unique worm models: three L1 model objects, two L2/L3 model objects, one L4 model object, and one Adult model object. Here, modelSelection() identifies the Adult model as the best fitting model object. (C) An animal detected as a primary object has been assigned two unique worm models: three L1 model objects, and two L2/L3 model objects. In this case, modelSelection() classifies this animal as an L2/L3 model object and adds a cluster flag annotation to indicate the repeated assignment of the selected model object to the primary object.

(TIFF)

S2 Fig. Uneven illumination along well edge hinders CellProfiler’s ability to segment animals from background.

(A) Left is raw intensity values across well. (B) Right is with background correction. Intensities of object illumination are displayed on each z-axis. Objects near the edge of the well (y < 500 and y > 1500) have similar raw detected intensities (int) to more medial objects (y ~ 1000) in (A) but lower corrected intensities in (B) because of uneven background correction. Raw and background-corrected image segments are displayed in (C). Notice animals on the edges of the well do not stand out from the background as much as animals in the center of the well and therefore are more challenging to discern.

(TIFF)

S3 Fig. Raw data from C. elegans growth experiment displayed by the function viewWell().

Similar to Fig 5, well images taken at (A) 9 hours indicating the L1 stage, (B) 28 hours indicating the L2/L3 stage, (C) 46 hours indicating the L4 stage, and (D) 63 hours indicating the adult stage were analyzed. Here, the raw data results are displayed. The length of each identified object is shown in (E).

(TIFF)

Acknowledgments

We would like to thank members of the Andersen laboratory for their helpful suggestions and feedback developing easyXpress.

Data Availability

All data are contained within the manuscript and are also available in a public repository: https://github.com/AndersenLab/easyXpress.

Funding Statement

J.N. and E.C.A. received support from the NSF-Simons Center for Quantitative Biology at Northwestern University (awards Simons Foundation/SFARI 597491-RWC and the National Science Foundation 1764421). This project and equipment were funded by an NIH grant (ES029930) from the National Institute of Environmental Health Sciences to E.C.A.

References

1. Swedlow JR. Innovation in biological microscopy: current status and future directions. Bioessays. 2012;34: 333–340. doi: 10.1002/bies.201100168
2. Cassidy PJ, Radda GK. Molecular imaging perspectives. J R Soc Interface. 2005;2: 133–144. doi: 10.1098/rsif.2005.0040
3. Sulston JE, Horvitz HR. Post-embryonic cell lineages of the nematode, Caenorhabditis elegans. Dev Biol. 1977;56: 110–156. doi: 10.1016/0012-1606(77)90158-0
4. White JG, Southgate E, Thomson JN, Brenner S. The structure of the nervous system of the nematode Caenorhabditis elegans. Philos Trans R Soc Lond B Biol Sci. 1986;314: 1–340. doi: 10.1098/rstb.1986.0056
5. Teotónio H, Estes S, Phillips PC, Baer CF. Experimental Evolution with Nematodes. Genetics. 2017;206: 691–716. doi: 10.1534/genetics.115.186288
6. Kim W, Underwood RS, Greenwald I, Shaye DD. OrthoList 2: A New Comparative Genomic Analysis of Human and Caenorhabditis elegans Genes. Genetics. 2018;210: 445–461. doi: 10.1534/genetics.118.301307
7. Wählby C, Kamentsky L, Liu ZH, Riklin-Raviv T, Conery AL, O’Rourke EJ, et al. An image analysis toolbox for high-throughput C. elegans assays. Nat Methods. 2012;9: 714–716. doi: 10.1038/nmeth.1984
8. Team RC. R: A Language and Environment for Statistical Computing. http://www.R-project.org. 2014.
9. Boyd WA, Smith MV, Freedman JH. Caenorhabditis elegans as a Model in Developmental Toxicology. In: Harris C, Hansen JM, editors. Developmental Toxicology: Methods and Protocols. Totowa, NJ: Humana Press; 2012. pp. 15–24.
10. Tukey JW, Others. Exploratory data analysis. Reading, Mass.; 1977.
11. Zdraljevic S, Fox BW, Strand C, Panda O, Tenjo FJ, Brady SC, et al. Natural variation in C. elegans arsenic toxicity is explained by differences in branched chain amino acid metabolism. 2019. [cited 5 May 2021]. doi: 10.7554/eLife.40260
12. Evans KS, Andersen EC. The Gene scb-1 Underlies Variation in Caenorhabditis elegans Chemotherapeutic Responses. G3 Genes|Genomes|Genetics. 2020;10: 2353–2364. doi: 10.1534/g3.120.401310
13. Dilks CM, Hahnel SR, Sheng Q, Long L, McGrath PT, Andersen EC. Quantitative benzimidazole resistance and fitness effects of parasitic nematode beta-tubulin alleles. Int J Parasitol Drugs Drug Resist. 2020;14: 28–36. doi: 10.1016/j.ijpddr.2020.08.003
14. Wit J, Rodriguez BC, Andersen EC. Natural variation in Caenorhabditis elegans responses to the anthelmintic emodepside. Int J Parasitol Drugs Drug Resist. 2021;16: 1–8. doi: 10.1016/j.ijpddr.2021.04.001
15. Andersen EC, Shimko TC, Crissman JR, Ghosh R, Bloom JS, Seidel HS, et al. A Powerful New Quantitative Genetics Platform, Combining Caenorhabditis elegans High-Throughput Fitness Assays with a Large Collection of Recombinant Strains. G3. 2015;5: 911–920. doi: 10.1534/g3.115.017178
16. Nyaanga J, Goss C, Zhang G, Ahmed HN, Andersen EJ, Miller IR, et al. Physical constraints on growth dynamics guide C. elegans developmental trajectories and animal shape. bioRxiv. 2021. p. 2021.04.01.438121. doi: 10.1101/2021.04.01.438121

Decision Letter 0

Heidi A Tissenbaum

12 Jul 2021

PONE-D-21-15039

easyXpress: An R package to analyze and visualize high-throughput C. elegans microscopy data generated using CellProfiler

PLOS ONE

Dear Dr. Andersen,

Thank you for submitting your manuscript to PLOS ONE. Thank you for your patience! After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by August 27, 2021. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Heidi A. Tissenbaum

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: No

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Authors has developed a package easyXpress based on R-Statistical Environment platform that can facilitate the analysis with CellProfiler. This study has potential and could be beneficiary for many scientists looking for a solution. However, I find it difficult to assess the methods that they have developed due to limitation posed with unavailability of the code as well as statistical description of functions. In addition, I have few questions to authors as follows.

Q.1: The authors mention that the code is available on GitHub; however, I was unable to see the code or validate that it works as proposed.

Q.2: In many places the reader is referred to the manual; however, the manual is not accessible, and the information is not provided in the manuscript. Since this is a package-based article, I suggest that the authors either provide the code to validate their claims or write a full description in the manuscript.

Q.3: The authors mention that they have developed functions to customize the filtering of noise in data; however, neither a description of these functions nor the statistics underlying them is given anywhere in the manuscript.

Q.4: I am unable to find the explanation or statistics for summarization in the methods, although it is mentioned.

Reviewer #2: This paper presents a piece of software meant to aid in the management and analysis of data first examined by CellProfiler. Beyond offering a method to visualize the outputs of CellProfiler, with the option of overlaying annotations, it provides further functionality for refining annotations and removing C. elegans deemed outliers by custom filtering routines. Flexible control of functions and hyperparameters makes this software amenable to an array of imaging situations. The contributions of this software, although limited when compared to a package such as CellProfiler itself, seem meaningful because it automates processes that are otherwise menial and time-consuming. Further, inter-experimental comparisons will benefit from the consistency offered by using this software to manage how outlier removal occurs within a dataset. This paper makes meaningful contributions to the field of study; however, a few outstanding questions remain after reading the paper. Those questions can be found below.

Lines 130-143 begin to describe the modelSelection function; however, some items remain unclear. Based on the description starting at line 136, it seems that modelSelection would select the most infrequent model as the best model fit. If this is the correct interpretation, it runs against intuition: if one were to rely on model frequency alone to resolve cases of ambiguous labeling, the most frequent model may be better suited. If the least frequent model is indeed the one chosen in cases of ambiguity, it would be helpful to elaborate on why this is the case. If this was understood incorrectly, it would be helpful to expand upon this sentence and provide further details on how models are ranked and selected.

The following sentence, starting on line 139, raised another question. The previous sentences suggested that a "primary object" refers to a C. elegans; however, the statement "modelSelection() will also specify whether the selected model object was repeatedly assigned to the same primary object..." makes this conclusion seem unlikely. It is understood that a single C. elegans could be assigned multiple instances of different models; however, it is unclear how a C. elegans could be assigned multiple instances of the exact same model. Because of this confusion, it would be helpful to have a statement on either i) what a primary object refers to, if it is not a C. elegans, or ii) in what cases one would expect a C. elegans to receive duplicates of the same model label.

The authors indicate that easyXpress will provide the statistical tools necessary for quality control and quantitative analysis of high-throughput imaging datasets. Although Figure 4 shows a study of the effect of paraquat on worm measurements, the authors neither provide many details nor present quantitative results from these data. It is advisable to give the experimental details in the methods section and show a detailed statistical analysis of these data. To show the full power of this platform, the authors should present the quantitative data of the study, the worm body sizes, and the necessary statistical analysis.

Figure 3 identifies a few animals as L4_model, although they appear to be similar in size to the adult animals according to the box plot on the right. Were these animal images confirmed with high-resolution imaging or manual inspection? If these annotations were erroneous, the authors are encouraged to discuss possible reasons for this error and whether it will affect the measurements.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Decision Letter 1

Heidi A Tissenbaum

2 Aug 2021

easyXpress: An R package to analyze and visualize high-throughput C. elegans microscopy data generated using CellProfiler

PONE-D-21-15039R1

Dear Dr. Erik Andersen,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

As you can see, the reviewers were pleased with the revisions. Congratulations and thanks for your patience!

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Heidi A. Tissenbaum

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have answered all my questions and made the code accessible. I have no further questions.

Reviewer #2: All comments have been addressed. The paper is well written and provides useful information to the community of C. elegans researchers running high-throughput screens.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Adela Ben-Yakar

Acceptance letter

Heidi A Tissenbaum

4 Aug 2021

PONE-D-21-15039R1

easyXpress: An R package to analyze and visualize high-throughput C. elegans microscopy data generated using CellProfiler

Dear Dr. Andersen:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Heidi A. Tissenbaum

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Multiple model objects assigned to a single primary object.

    When running CellProfiler’s WormToolbox with multiple worm models, multiple model objects can be assigned to a single primary object (real animal). Different colors are used to outline each worm model object. Here four unique models were used: L1, L2/L3, L4, and Adult. These worm models represent the approximate sizes of animals at each life stage. For example, some mutant or diverse wild genetic backgrounds might have differently sized adult animals as compared to the laboratory-adapted N2 strain. We have included this “soft matching” to account for small differences in the sizes of developmental stages across different genetic backgrounds, laboratories, and environmental conditions. (A) An animal detected by CellProfiler as a primary object has been assigned three unique worm models: two L1 model objects, one L2/L3 model object, and one L4 model object. modelSelection() classifies this animal as an L4 model object. (B) An animal detected as a primary object has been assigned four unique worm models: three L1 model objects, two L2/L3 model objects, one L4 model object, and one Adult model object. Here, modelSelection() identifies the Adult model as the best fitting model object. (C) An animal detected as a primary object has been assigned two unique worm models: three L1 model objects, and two L2/L3 model objects. In this case, modelSelection() classifies this animal as an L2/L3 model object and adds a cluster flag annotation to indicate the repeated assignment of the selected model object to the primary object.

    (TIFF)
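The three panels of S1 Fig can be read as worked examples of a selection rule. What follows is a minimal Python sketch of one rule that reproduces all three outcomes; it is not the package's R implementation. The function name `select_model`, the `MODEL_RANK` size ordering, and the rule itself (prefer the model with the fewest instances, break ties by taking the larger model, and flag a cluster when the selected model was assigned more than once) are assumptions made for illustration. A simpler rule, such as always taking the largest model present, would match the same three panels, so the caption alone does not settle which rule modelSelection() uses.

```python
from collections import Counter

# Assumed size ordering of the four worm models named in the caption.
MODEL_RANK = {"L1": 1, "L2/L3": 2, "L4": 3, "Adult": 4}


def select_model(model_objects):
    """Return (selected_model, cluster_flag) for one primary object.

    Hypothetical rule: fewest instances first, larger model on ties.
    The cluster flag is True when the selected model occurs more than once.
    """
    counts = Counter(model_objects)
    selected = min(counts, key=lambda m: (counts[m], -MODEL_RANK[m]))
    return selected, counts[selected] > 1


# Model objects assigned to the primary object in each panel of S1 Fig.
panel_a = ["L1", "L1", "L2/L3", "L4"]
panel_b = ["L1", "L1", "L1", "L2/L3", "L2/L3", "L4", "Adult"]
panel_c = ["L1", "L1", "L1", "L2/L3", "L2/L3"]

for name, objects in [("A", panel_a), ("B", panel_b), ("C", panel_c)]:
    print(name, select_model(objects))
```

Under these assumptions, panel (A) resolves to the L4 model, panel (B) to the Adult model, and panel (C) to the L2/L3 model with the cluster flag set, matching the caption.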

    S2 Fig. Uneven illumination along well edge hinders CellProfiler’s ability to segment animals from background.

    (A) Left: raw intensity values across the well. (B) Right: intensity values with background correction. Intensities of object illumination are displayed on each z-axis. Objects near the edge of the well (y < 500 and y > 1500) have similar raw detected intensities (int) to more medial objects (y ~ 1000) in (A) but lower corrected intensities in (B) because of uneven background correction. Raw and background-corrected image segments are displayed in (C). Notice that animals at the edges of the well do not stand out from the background as much as animals in the center of the well and are therefore more challenging to discern.

    (TIFF)

    S3 Fig. Raw data from C. elegans growth experiment displayed by the function viewWell().

    Similar to Fig 5, well images taken at (A) 9 hours indicating the L1 stage, (B) 28 hours indicating the L2/L3 stage, (C) 46 hours indicating the L4 stage, and (D) 63 hours were analyzed. Here, the raw data results are displayed. The length of each identified object is shown in (E).

    (TIFF)

    Attachment

    Submitted filename: Response to Reviewers.pdf

    Data Availability Statement

    All data are contained within the manuscript and are also available in a public repository: https://github.com/AndersenLab/easyXpress.


    Articles from PLoS ONE are provided here courtesy of PLOS
