Unlocking the potential of plant phenotyping data through integration and data-driven approaches

Frederik Coppens; Nathalie Wuyts; Dirk Inzé; Stijn Dhondt

2017 Aug;4:58–63. doi: 10.1016/j.coisb.2017.07.002


PMCID: PMC7477990 PMID: 32923745

Abstract

Plant phenotyping has emerged as a comprehensive field of research as the result of significant advancements in the application of imaging sensors for high-throughput data collection. The flip side is the risk of drowning in the massive amounts of data generated by automated phenotyping systems. Currently, the major challenge lies in data management, on the level of data annotation and proper metadata collection, and in progressing towards synergism across data collection and analyses. Progress in data analyses includes efforts towards the integration of phenotypic and -omics data resources for bridging the phenotype–genotype gap and obtaining in-depth insights into fundamental plant processes.

Keywords: Plant phenotyping, Data management, Data integration, Data-driven analysis

Highlights

• Imaging methodologies used in plant phenotyping generate huge amounts of complex data.
• The major challenge is data management: metadata collection and data annotation.
• Implementation of standard ontologies is key to integrate data efficiently.
• Data-driven approaches are promising to generate new scientific insights.

Introduction

During the past decade, plant phenomics has evolved from an emerging niche to a thriving research field, both in academia and industry. This can be largely attributed to the use of imaging for the non-invasive analysis of structural, physiological and performance-related plant traits [1]. Automated image analysis procedures allow substantial increases in the throughput of trait measurements, thereby countering the so-called phenotyping bottleneck, which considers phenotypic measurements the rate-limiting factor in the functional analysis of specific genotypes or the assessment of genotype performance in plant breeding [2]. Improvements in plant imaging have been accompanied by technological advancements in plant handling and camera positioning to keep up with the speed of image acquisition. Plant-to-sensor systems, utilizing conveyors and grippers to present the plant to the camera, and sensor-to-plant systems, which move the camera to the plants, have been developed in growth cabinets, chambers and greenhouses [3]. While the vast majority of the phenotyping is still done manually under field conditions, automated image acquisition always occurs in a sensor-to-plant fashion, assisted by manual or engine-driven ‘phenomobiles’, gantry systems on the ground, or unmanned aerial vehicles (UAVs) [4].

Undoubtedly, it is the development of digital image sensors that underlies this remarkable evolution in plant phenotyping. The sensitivity of the sensor to a specific part of the electromagnetic spectrum, in combination with appropriate filters, defines which traits can be extracted. Typical Red Green Blue (RGB) color sensors are sensitive to wavelengths in a range from 400 to 1000 nm. Most color cameras provide an infrared (IR) cut-off filter for imaging specifically in the visible spectrum, but without this filter, they allow near-IR imaging, and as such image acquisition of plants in the dark [5, 6]. Indium gallium arsenide (InGaAs) sensors respond to wavelengths from approximately 900 to 1700 nm. These sensors are used in Short Wave InfraRed (SWIR) cameras, which can be adopted for the measurement of water content in plants [7]. Long Wave Infrared (LWIR) sensors with a spectral range of 3–14 μm, on the other hand, are used for thermal imaging of shoots as a proxy for stomatal conductance or water use behavior in general [8].
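The spectral ranges quoted above can be collected in a small lookup table. The following is a minimal sketch that uses only the approximate ranges stated in the text; the names `SENSOR_RANGES_NM` and `sensors_covering` are illustrative and do not belong to any imaging library.

```python
# Approximate spectral ranges as quoted in the text, in nanometres.
# The LWIR range is given as 3-14 µm, i.e. 3000-14000 nm.
SENSOR_RANGES_NM = {
    "RGB color sensor (without IR cut-off filter)": (400, 1000),
    "SWIR (InGaAs)": (900, 1700),
    "LWIR (thermal)": (3000, 14000),
}


def sensors_covering(wavelength_nm):
    """Return the sensor classes whose quoted range includes the wavelength."""
    return [
        name
        for name, (low, high) in SENSOR_RANGES_NM.items()
        if low <= wavelength_nm <= high
    ]
```

A wavelength of 950 nm falls in the overlap between the RGB and SWIR ranges, whereas 2000 nm is covered by none of the three sensor classes listed here.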

The use of advanced imaging systems has drastically increased the volume of data from a couple of bytes, e.g. manually scored traits in a spreadsheet, to several megabytes (MB) or sometimes more than 100 MB, e.g. in the case of hyperspectral imaging or scene characterization by means of video capture. Data are also stored in a myriad of formats on diverse types of media ranging from a researcher's hard drive to local server stations or in “the cloud”. Proper annotation of data to ensure their continued relevance after acquisition is thus essential. Furthermore, because the plant's phenotype is the result of a strong interaction between its genotype and the environment in which it grows (G × E) [9], plant phenotyping efforts should include the logging of environmental conditions, which in turn requires the collection of metadata on the sensors in use. Because of the tremendous amounts and diversity of data produced within the plant phenotyping research field, data management, storage and analysis are currently considered as the major challenges. On the other hand, large datasets may also create opportunities for data modeling and machine learning towards “Big Data” analyses.

Data management to enable data integration

The current technologies and methods used in plant phenotyping generate a huge amount of complex, unstructured “Big Data”, which can give the impression that much of the phenotype data can no longer be retrieved [10]. In the first instance, phenotypic data management requires the use of ontology terms for the unique and repeatable annotation of data, so that they remain traceable and can be reused through data sharing and meta-analyses. The use of ontologies therefore promotes synergism. Moreover, in contrast to repositories such as the European Nucleotide Archive (ENA) [11] or Sequence Read Archive (SRA) [12] for sequencing data, there is currently no central, structured repository for phenotyping data or metadata. Although data can be uploaded to general purpose repositories such as Zenodo (https://zenodo.org/), FigShare (https://figshare.com) and Dryad (http://datadryad.org), these do not provide services to facilitate the description of, access to and integration of data. Because a central repository is lacking, advanced data mining and discovery depend on the error-prone scavenging of scientific literature. Its absence has also led individual research groups and consortia to develop a plethora of resources, ranging from resources dedicated to one species or one type of phenotyping system to more generic platforms allowing the integration of several data types. AraPheno provides a central repository of population-scale phenotypes for Arabidopsis accessions [13], whereas the Plant Genomics and Phenomics (PGP) research data repository is an infrastructure to comprehensively publish plant research data covering cross-domain datasets [14]. The Phenomics Ontology Driven Data (PODD) repository was developed to handle and distribute phenotyping data and metadata from Australian facilities [15]. ClearedLeaves DB functions as an online database of cleared plant leaf images [16]. Phenopsis DB is an information system for sharing data generated by the PHENOPSIS plant phenotyping platform [17] and PhenoFront is a web-server front end to the LemnaTec Phenotyper platform [18]. Whereas BreeDB hosts datasets of tomato and potato populations (https://www.eu-sol.wur.nl), Genoplante Information System (GnpIS) is a multispecies integrative information system dedicated to plant and fungi pests, bridging genetic and genomic data [19]. This non-exhaustive list illustrates the variety of available resources, which, in some cases, provide the data for download and further analysis.

Many of these data resources have been built to organize a huge amount of collected phenotypic data. In the light of high-throughput phenotyping, however, data need to be managed at the moment they are generated (Figure 1). Besides data derived from experiments, provisions are made for metadata related to the environment sensors in use, and to the imaging sensors themselves, including the type of sensor, the camera systems and their optical properties. The latter are required for image analysis, whereas the whole ensures traceability and quality assurance. These functionalities are built into PIPPA, the PSB (Plant Systems Biology) Interface for Plant Phenotype Analysis (https://pippa.psb.ugent.be), a web-based framework for the analysis, visualization and management of phenotypic data, which enables biologists to perform dedicated image processing and (statistical) analyses of data generated by Weighing, Imaging and Watering Machine (WIWAM) phenotyping platforms or of externally imported data. Frameworks harboring comparable functionalities include Integrated Analysis Platform (IAP) and Plant Computer Vision (PlantCV) [18, 20].
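To make the idea of capturing metadata at acquisition time concrete, the sketch below stores one image record together with the sensor and environment metadata mentioned above. It is a minimal illustration only: the class names, field names and units are assumptions chosen for this example and do not reflect the actual schema of PIPPA or of any other platform.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SensorInfo:
    # Hypothetical fields describing the imaging sensor and its optics.
    sensor_type: str        # e.g. "RGB", "SWIR", "LWIR"
    camera_model: str       # free-text identifier of the camera system
    focal_length_mm: float
    pixel_size_um: float


@dataclass(frozen=True)
class ImageRecord:
    # Hypothetical fields linking an image to its experimental context.
    plant_id: str
    genotype: str
    acquired_at: str        # ISO 8601 timestamp
    image_path: str
    sensor: SensorInfo
    air_temperature_c: float
    relative_humidity_pct: float


def to_json(record):
    """Serialize a record, including the nested sensor metadata, to JSON."""
    return json.dumps(asdict(record), sort_keys=True)
```

Writing such a record alongside every image keeps the optical properties needed for image analysis and the environmental context needed for interpretation attached to the data from the start.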

**Figure 1. A systems biology approach in phenotypic data management.** A scientific hypothesis leads to new experiments including image-based plant phenotyping or other -omics approaches. Active vision systems can feed back directly into the image acquisition. Image acquisition features like the spatial and temporal resolution can also be optimized after data analysis. Sanity checks on the generated data help to quickly validate the image analysis. The analyzed data and images are saved along with the metadata and the experimental design in a dedicated data repository. Additional value is created by the integration of -omics data coming from private or public data resources, after which new hypotheses are generated through data-driven approaches like modeling, machine learning and meta-analysis.

Image data extraction

The advanced development of imaging in plant phenotyping enables multi-dimensional, high-throughput monitoring of plants at an increasing pace. Although numerous image analysis software tools are available for the extraction of biologically meaningful phenotypic or physiological parameters from these images [21, 22], they mainly focus on the analysis and are often disconnected from data management. To address this, dedicated analysis platforms have been developed: IAP [20], PlantCV [18], InfraPhenoGrid [23], OMERO [24], BisQue on CyVerse [25], and PIPPA (https://pippa.psb.ugent.be). These systems offer a user-friendly interface to a grid compute cluster, which enables researchers without a computer science background to run image analysis pipelines. Moreover, they also cater for bioinformaticians as they are inherently flexible, allowing custom analysis pipelines through extensions or Application Programming Interfaces (APIs). These platforms ensure provenance through metadata and thus play an important role in data management. Data visualization is also an important aspect, both for reporting and interpretation, as well as for quality control of the input data (Figure 1). For example, PIPPA deploys several ‘sanity check’ algorithms to flag outliers for further inspection.
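The text does not describe how such sanity checks are implemented. As one possible illustration, the sketch below flags outliers with a generic robust rule, the modified z-score based on the median absolute deviation (MAD). The function name and the threshold of 3.5 are assumptions for this example, not PIPPA's algorithm.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread to judge against, so nothing is flagged.
        return []
    # The factor 0.6745 makes the MAD comparable to a standard deviation
    # for normally distributed data.
    return [
        i
        for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]
```

Applied to projected leaf areas of replicate plants on the same day, for instance, a single plant whose segmentation failed would stand out from the others and could be queued for visual inspection rather than silently entering downstream statistics.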

As our capacity to extract information from images increases, so do the size and complexity of the derived data and downstream analyses. Therefore, the computing infrastructure needs to keep pace. Usage of Graphical Processing Units (GPUs) has the potential to dramatically increase the efficiency of image analysis algorithms, but programming GPUs is notoriously hard. Libraries such as OpenCV (http://opencv.org) or the Quasar programming language [26] encapsulate the usage of GPUs. However, the availability of tools for easier analysis optimization will be important to efficiently process the vast amount of data generated.

Value creation through integration

International projects such as transPLANT (Trans-national Infrastructure for Plant Genomic Science, http://transplantdb.eu) and EPPN (European Plant Phenotyping Network, http://plant-phenotyping-network.eu) recognize the need for metadata improvement and alignment [27]. They propose the Minimal Information About a Plant Phenotyping Experiment (MIAPPE, http://www.miappe.org) as the emerging standard for the description of a phenotyping experiment. Next to source material and experimental design, MIAPPE also allows detailed description of the environmental conditions, which has been shown to be crucial for comparison and interpretation [28, 29]. During the development of the standard, it became clear that available metadata, the usage and interpretation of ontologies, as well as the method of access, differ between resources. Further development through community engagement and implementation of MIAPPE as a standard will be instrumental for the integration of phenotypic data from different providers and to promote synergism.

On the systems biology side, the next step is the integration of image-derived data and various -omics datasets (Figure 1). In particular, the combined analysis of datasets that were never intended to be integrated is a promising target for value creation. This requires a rigorous curation of input data, and more importantly, a harmonization of metadata. The use of different measurement methodologies that are not inherently interoperable makes this a challenging task, but efforts to map them will contribute to an increased alignment in the future. The BioSamples database serves as a central hub for metadata, which allows these different data types to be linked, and provides a query interface and computational access through an API [30]. The Breeding API (BrAPI, http://docs.brapi.apiary.io) specifies such an interface for phenotype and genotype databases and is emerging as the standard in the field. Community-wide adoption of these technologies is essential for the identification of relevant data and efficient data integration. Currently, the number of publicly available datasets that can be readily integrated is limited.
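Once datasets share a harmonized sample identifier, linking them reduces to a join on that identifier. The sketch below shows this with plain Python records. The key name `sample_id`, the function name and the example values are hypothetical, and no call to the BioSamples or BrAPI interfaces is implied.

```python
def join_on_sample(phenotypes, expression, key="sample_id"):
    """Inner-join two lists of records on a shared sample identifier."""
    expression_by_sample = {row[key]: row for row in expression}
    merged = []
    for row in phenotypes:
        match = expression_by_sample.get(row[key])
        if match is not None:
            # Phenotype fields take precedence if a field name occurs in both.
            merged.append({**match, **row})
    return merged


# Hypothetical records: image-derived traits and transcript abundances.
phenotypes = [
    {"sample_id": "S1", "leaf_area_mm2": 412.0},
    {"sample_id": "S2", "leaf_area_mm2": 388.5},
]
expression = [
    {"sample_id": "S2", "gene_x_tpm": 17.3},
    {"sample_id": "S3", "gene_x_tpm": 4.1},
]
linked = join_on_sample(phenotypes, expression)
```

Only sample `S2` appears in both inputs, so it is the only record in `linked`. The records that drop out illustrate why harmonized identifiers matter: data without a shared key cannot be integrated, however rich they are individually.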

This adoption constitutes the crucial next step and challenge for the plant community to make all data Findable, Accessible, Interoperable and Re-usable (FAIR) [31], both by humans and computer systems. ELIXIR, an infrastructure aimed at coordinating and integrating bioinformatics resources, recognized this challenge and has put this forward as one of the use cases in the H2020 ELIXIR-EXCELERATE project (https://www.elixir-europe.org/excelerate/plants).

Data-driven approaches aid in hypothesis generation

The speed of data generation in plant phenotyping has reached such a level that the question can be raised whether data-driven approaches can replace traditional hypothesis-driven analyses. The vast amount of data may indeed provide us with new insights, for example by means of machine learning approaches in data analysis [32]. Machine learning allows the development of algorithms that can learn from a dedicated training set and make decisions on newly presented data. Data associations can then be uncovered, which may lead to new insights and further developments in fields such as marker-assisted breeding. Machine learning has been applied for the identification, classification, quantification and prediction of plant stress, in which each level builds on the previous one. As an example, disease symptoms of three Alternaria species have been classified in oilseed rape based on thermal and hyperspectral imaging [33]. The severity of Verticillium wilt in olives has also been quantified using these imaging technologies [34]. One should, however, realize that the resulting procedure will only be as good as the training dataset used. The number of publicly available datasets is currently too limited for this to be widely applicable. In particular, advancing from associations to causal relations requires specifically designed experiments, e.g. detailed time series.
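The principle of learning from a training set and deciding on new data can be shown with one of the simplest classifiers, a nearest-centroid rule. This is a didactic sketch only, not the method used in the cited studies; the two features and the class labels in the usage note are hypothetical.

```python
from math import dist


def fit_centroids(features, labels):
    """Average the feature vectors of each class in the training set."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda y: dist(centroids[y], x))
```

With hypothetical training vectors of a spectral index and a canopy temperature, such as `[0.2, 30.1]` labeled `"healthy"` and `[0.8, 27.0]` labeled `"diseased"`, `predict` assigns a new plant to whichever class centroid lies closer. The example also makes the limitation stated above tangible: the centroids, and therefore every prediction, are determined entirely by the training data.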

Nowadays, machine learning also has its role at the level of image analysis [35]. Deep learning approaches can automatically determine useful features for image classification, deciding on whether an image patch contains a specific plant part, such as a root tip or wheat ears [36]. Such algorithms can help in the localization of these plant parts in entire images. Furthermore, machine learning can also aid in the segmentation of plants from their background, as exemplified for maize shoots [37]. With further advances and data availability, machine learning will undoubtedly prove to be a valuable resource to generate new hypotheses (Figure 1).
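For comparison, the sketch below implements a classical, non-learning baseline for separating plants from their background: thresholding the excess green index, 2g − r − b, computed on normalized color channels. It is not the machine learning approach of the cited work, and the threshold of 0.1 is an assumption that would need tuning for a given imaging setup.

```python
def excess_green(r, g, b):
    """Excess green index on chromatic coordinates: 2g - r - b."""
    total = r + g + b
    if total == 0:
        return 0.0  # black pixel: no color information
    return (2 * g - r - b) / total


def segment_plant(image, threshold=0.1):
    """Return a binary mask (1 = plant) for an image given as rows of (R, G, B) tuples."""
    return [
        [1 if excess_green(*pixel) > threshold else 0 for pixel in row]
        for row in image
    ]
```

A fixed color threshold of this kind tends to break down under variable illumination or with non-green backgrounds, which is where approaches that learn the plant-versus-background decision from annotated examples become attractive.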

Images and plant sensors provide “Big Data” information about the structure and function of whole plants and plant organs throughout development. These data form the basis of functional-structural plant models (FSPMs) that describe the development and physiology of growing plants over time. Furthermore, transcriptomic, metabolomic, proteomic, and possibly other -omics data continue to reveal potential control mechanisms in regulatory nodes of plant growth by providing insights into the molecular basis underlying major events during plant development. The integration of molecular networks into whole-plant level models allows the simulation of environmental and genetic perturbations [38], enabling a data-driven systems biology approach to advance our insights into plant growth and development (Figure 1). Furthermore, from a more applied point of view, such integrated, multi-scale FSPMs will need to be validated across genotypes and field environments and ultimately could form the basis of what we define as ‘prescription agriculture’. Plant image analysis, sensors and possible biomarkers could be used to alert the farmer that crops are experiencing suboptimal conditions, and the FSPM would serve as a decision tool to predict the potential yield gained by the application of extra resources, such as irrigation and nitrogen fertilization.

Future challenges and perspectives

Deep integration of image analysis in high-throughput phenotyping will allow for on-the-fly feedback and decision-making. As such, image analysis can assist in the optimization of information generation already during an experiment, rather than weeks or even months after it is finished. Active vision systems allow for the repositioning of the camera or object in such a way that the most additional information can be extracted from an image at its new position (Figure 1) [39]. Such technologies have the potential to reduce the amount of data captured and the requirements for data storage and analysis, while ensuring and increasing the relevance of what is generated.

Hence, image analysis technologies can pave the way for an agile systems biology approach that guides researchers to create value, for example the combination of feature selection and growth modeling supporting the biological interpretation of plant growth and stress tolerance in barley [40]. New and highly repeatable traits, such as maximum growth rate and stress elasticity, which are related to these complex agronomic phenotypes, have permitted the identification of stable QTLs controlling their expression.
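A trait such as maximum growth rate can be derived from an image-based time series in several ways. The cited study relies on growth modeling; the sketch below uses a simpler finite-difference estimate instead, purely for illustration. The function name and the example values are hypothetical.

```python
def max_growth_rate(days, areas):
    """Return (rate, day) for the steepest interval of a growth time series.

    The rate is the change in area per day between consecutive observations;
    the reported day is the midpoint of that interval.
    """
    if len(days) != len(areas) or len(days) < 2:
        raise ValueError("need at least two paired observations")
    best = None
    for i in range(len(days) - 1):
        rate = (areas[i + 1] - areas[i]) / (days[i + 1] - days[i])
        if best is None or rate > best[0]:
            best = (rate, (days[i] + days[i + 1]) / 2)
    return best


# Hypothetical projected shoot areas (cm^2) measured every two days.
rate, day = max_growth_rate([0, 2, 4, 6, 8], [1.0, 2.0, 6.0, 9.0, 10.0])
```

Here the steepest interval lies between days 2 and 4, giving a rate of 2.0 cm² per day at day 3. Finite differences amplify measurement noise, which is one reason why fitting a growth model to the full series is generally preferred for deriving repeatable traits.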

Integration of available datasets holds much potential to further deepen our knowledge. However, the amount of data readily available for such a meta-analysis within one resource is often insufficient to draw strong conclusions or to provide solid evidence for a specific hypothesis. Therefore, linking several data resources across phenotyping platforms to enable large-scale meta-analyses would be a major step forward in data integration. This is envisioned within the ELIXIR-EXCELERATE project, which aims to annotate datasets and make phenotypic databases discoverable and interoperable through usage of ontologies and a standardized API (https://www.elixir-europe.org/excelerate/plants).

The main challenge is to engage with the broad plant phenotyping community, across academia and industry, to converge on these standards for the description of and access to the vast amount of currently distributed data. The future of plant phenotyping lies in synergism, as the comprehensive integration and analysis of this “Big Data” will allow us to unravel the biological processes governing plant growth and development, and to advance plant breeding for much-needed climate-resilient and high-yielding crops.

Acknowledgements

We thank Pascal Braun and Marc Vidal for their invitation to contribute to this issue. We also thank Annick Bleys for her help in preparing the manuscript. This work was supported by funding from the European Research Council under the European Community's Seventh Framework Programme [FP7/2007–2013] under ERC grant agreement n° [339341-AMAIZE], from Ghent University (“Bijzonder Onderzoeksfonds Methusalem project” no. BOF08/01M00408), from ELIXIR-EXCELERATE, which is funded by the European Commission within the Research Infrastructures programme of Horizon 2020, grant agreement number 676559, and from Research Foundation-Flanders for a postdoctoral fellowship to S.D.

This review comes from a themed issue on Big data acquisition and analysis (2017)

Edited by Pascal Falter-Braun and Michael A. Calderwood

References

1. Dhondt S., Wuyts N., Inzé D. Cell to whole-plant phenotyping: the best is yet to come. Trends Plant Sci. 2013;18:428–439. doi: 10.1016/j.tplants.2013.04.008.
2. Furbank R.T., Tester M. Phenomics – technologies to relieve the phenotyping bottleneck. Trends Plant Sci. 2011;16:635–644. doi: 10.1016/j.tplants.2011.09.005.
•3. Fahlgren N., Gehan M.A., Baxter I. Lights, camera, action: high-throughput plant phenotyping is ready for a close-up. Curr Opin Plant Biol. 2015;24:93–99. doi: 10.1016/j.pbi.2015.02.006. A recent review discussing platform design, imaging modalities, and image data extraction and management in high-throughput plant phenotyping.
4. Araus J.L., Cairns J.E. Field high-throughput phenotyping: the new crop breeding frontier. Trends Plant Sci. 2014;19:52–61. doi: 10.1016/j.tplants.2013.09.008.
5. Dhondt S., Gonzalez N., Blomme J., De Milde L., Van Daele T., Van Akoleyen D., Storme V., Coppens F., Beemster G.T.S., Inzé D. High-resolution time-resolved imaging of in vitro Arabidopsis rosette growth. Plant J. 2014;80:172–184. doi: 10.1111/tpj.12610.
6. Apelt F., Breuer D., Nikoloski Z., Stitt M., Kragler F. Phytotyping4D: a light-field imaging system for non-invasive and accurate monitoring of spatio-temporal plant growth. Plant J. 2015;82:693–706. doi: 10.1111/tpj.12833.
7. Munns R., James R.A., Sirault X.R.R., Furbank R.T., Jones H.G. New phenotyping methods for screening wheat and barley for beneficial responses to water deficit. J Exp Bot. 2010;61:3499–3507. doi: 10.1093/jxb/erq199.
8. Merlot S., Mustilli A.-C., Genty B., North H., Lefebvre V., Sotta B., Vavasseur A., Giraudat J. Use of infrared thermal imaging to isolate Arabidopsis mutants defective in stomatal regulation. Plant J. 2002;30:601–609. doi: 10.1046/j.1365-313x.2002.01322.x.
9. Poorter H., Fiorani F., Pieruschka R., Wojciechowski T., van der Putten W.H., Kleyer M., Schurr U., Postma J. Pampered inside, pestered outside? Differences and similarities between plants growing in controlled conditions and in the field. New Phytol. 2016;212:838–855. doi: 10.1111/nph.14243.
10. Zamir D. Where have all the crop phenotypes gone? PLoS Biol. 2013;11:e1001595. doi: 10.1371/journal.pbio.1001595.
11. Toribio A.L., Alako B., Amid C., Cerdeño-Tarrága A., Clarke L., Cleland I., Fairley S., Gibson R., Goodgame N., ten Hoopen P. European Nucleotide Archive in 2016. Nucleic Acids Res. 2017;45:D32–D36. doi: 10.1093/nar/gkw1106.
12. Leinonen R., Sugawara H., Shumway M., on behalf of the International Nucleotide Sequence Database Collaboration. The Sequence Read Archive. Nucleic Acids Res. 2011;39:D19–D21. doi: 10.1093/nar/gkq1019.
13. Seren Ü., Grimm D., Fitz J., Weigel D., Nordborg M., Borgwardt K., Korte A. AraPheno: a public database for Arabidopsis thaliana phenotypes. Nucleic Acids Res. 2017;45:D1054–D1059. doi: 10.1093/nar/gkw986.
14. Arend D., Junker A., Scholz U., Schüler D., Wylie J., Lange M. PGP repository: a plant phenomics and genomics data publication infrastructure. Database. 2016;2016:baw033. doi: 10.1093/database/baw033.
15. Li Y.-F., Kennedy G., Davies F., Hunter J. PODD: an ontology-driven data repository for collaborative phenomics research. Lect Notes Comput Sci. 2010;6102:179–188.
16. Das A., Bucksch A., Price C.A., Weitz J.S. ClearedLeavesDB: an online database of cleared plant leaf images. Plant Methods. 2014;10:8. doi: 10.1186/1746-4811-10-8.
17. Fabre J., Dauzat M., Nègre V., Wuyts N., Tireau A., Gennari E., Neveu P., Tisné S., Massonnet C., Hummel I. PHENOPSIS DB: an Information System for Arabidopsis thaliana phenotypic data in an environmental context. BMC Plant Biol. 2011;11:77. doi: 10.1186/1471-2229-11-77.
18. Fahlgren N., Feldman M., Gehan M.A., Wilson M.S., Shyu C., Bryant D.W., Hill S.T., McEntee C.J., Warnasooriya S.N., Kumar I. A versatile phenotyping system and analytics platform reveals diverse temporal responses to water availability in Setaria. Mol Plant. 2015;8:1520–1535. doi: 10.1016/j.molp.2015.06.005.
19. Steinbach D., Alaux M., Amselem J., Choisne N., Durand S., Flores R., Keliet A.-O., Kimmel E., Lapalu N., Luyten I. GnpIS: an information system to integrate genetic and genomic data from plants and fungi. Database. 2013;2013:bat058. doi: 10.1093/database/bat058.
20. Klukas C., Chen D., Pape J.-M. Integrated Analysis Platform: an open-source information system for high-throughput plant phenotyping. Plant Physiol. 2014;165:506–518. doi: 10.1104/pp.113.233932.
21. Lobet G., Draye X., Périlleux C. An online database for plant image analysis software tools. Plant Methods. 2013;9:38. doi: 10.1186/1746-4811-9-38.
22. Cobb J.N., DeClerck G., Greenberg A., Clark R., McCouch S. Next-generation phenotyping: requirements and strategies for enhancing our understanding of genotype–phenotype relationships and its relevance to crop improvement. Theor Appl Genet. 2013;126:867–887. doi: 10.1007/s00122-013-2066-0.
23. Pradal C., Artzet S., Chopard J., Dupuis D., Fournier C., Mielewczik M., Nègre V., Neveu P., Parigot D., Valduriez P. InfraPhenoGrid: a scientific workflow infrastructure for plant phenomics on the Grid. Future Gener Comput Syst. 2017;67:341–353.
24. Allan C., Burel J.-M., Moore J., Blackburn C., Linkert M., Loynton S., MacDonald D., Moore W.J., Neves C., Patterson A. OMERO: flexible, model-driven data management for experimental biology. Nat Methods. 2012;9:245–253. doi: 10.1038/nmeth.1896.
25. Kvilekval K., Fedorov D., Obara B., Singh A., Manjunath B.S. Bisque: a platform for bioimage analysis and management. Bioinformatics. 2010;26:544–552. doi: 10.1093/bioinformatics/btp699.
26. Goossens B., De Vylder J., Philips W. IEEE International Conference on Image Processing (ICIP) proceedings; Paris. 2014. Quasar – a new heterogeneous programming framework for image and video processing algorithms on CPU and GPU; pp. 2183–2185.
••27. Krajewski P., Chen D., Ćwiek H., van Dijk A.D.J., Fiorani F., Kersey P., Klukas C., Lange M., Markiewicz A., Nap J.P. Towards recommendations for metadata and data handling in plant phenotyping. J Exp Bot. 2015;66:5417–5427. doi: 10.1093/jxb/erv271. Formulates the need for standardization in plant phenotyping data and proposes recommendations to improve the current situation: adherence to Minimal Information About Plant Phenotyping Experiment (MIAPPE), use of standard ontologies and data formats. The authors provide an implementation of MIAPPE in ISA-TAB.
28. Massonnet C., Vile D., Fabre J., Hannah M.A., Caldana C., Lisec J., Beemster G.T.S., Meyer R.C., Messerli G., Gronlund J.T. Probing the reproducibility of leaf growth and molecular phenotypes: a comparison of three Arabidopsis accessions cultivated in ten laboratories. Plant Physiol. 2010;152:2142–2157. doi: 10.1104/pp.109.148338.
29. Poorter H., Fiorani F., Stitt M., Schurr U., Finck A., Gibon Y., Usadel B., Munns R., Atkin O.K., Tardieu F. The art of growing plants for experimental purposes: a practical guide for the plant biologist. Funct Plant Biol. 2012;39:821–838. doi: 10.1071/FP12028.
30. Faulconbridge A., Burdett T., Brandizi M., Gostev M., Pereira R., Vasant D., Sarkans U., Brazma A., Parkinson H. Updates to BioSamples database at European Bioinformatics Institute. Nucleic Acids Res. 2014;42:D50–D52. doi: 10.1093/nar/gkt1081.
••31. Wilkinson M.D., Dumontier M., Aalbersberg I.J., Appleton G., Axton M., Baak A., Blomberg N., Boiten J.-W., da Silva Santos L.B., Bourne P.E. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016;3. doi: 10.1038/sdata.2016.18. Describes the principles and provides guidelines on how data can be made FAIR (Findable, Accessible, Interoperable and Re-usable) both for machines and humans.
•32. Singh A., Ganapathysubramanian B., Singh A.K., Sarkar S. Machine learning for high-throughput stress phenotyping in plants. Trends Plant Sci. 2016;21:110–124. doi: 10.1016/j.tplants.2015.10.015. This review showcases the utility of machine learning for high-throughput data-driven plant phenotyping.
33. Baranowski P., Jedryczka M., Mazurek W., Babula-Skowronska D., Siedliska A., Kaczmarek J. Hyperspectral and thermal imaging of oilseed rape (Brassica napus) response to fungal species of the genus Alternaria. PLoS One. 2015;10:e0122913. doi: 10.1371/journal.pone.0122913.
34. Calderón R., Navas-Cortés J.A., Zarco-Tejada P.J. Early detection and quantification of verticillium wilt in olive using hyperspectral and thermal imagery over large areas. Remote Sens. 2015;7:5584–5610.
35. Tsaftaris S.A., Minervini M., Scharr H. Machine learning for plant phenotyping needs image processing. Trends Plant Sci. 2016;21:989–991. doi: 10.1016/j.tplants.2016.10.002.
36. Pound M.P., Burgess A.J., Wilson M.H., Atkinson J.A., Griffiths M., Jackson A.S., Bulat A., Tzimiropoulos G., Wells D.M., Murchie E.H. Deep Machine Learning provides state-of-the-art performance in image-based plant phenotyping. bioRxiv. 2016. doi: 10.1101/053033.
37. Donné S., Luong Q., Goossens B., Dhondt S., Wuyts N., Inzé D., Philips W. Proceedings of the 25th Belgian-Dutch conference on machine learning; 12–13 September 2016. 2016. Machine learning for maize plant segmentation.
38. Wuyts N., Dhondt S., Inzé D. Measurement of plant growth in view of an integrative analysis of regulatory networks. Curr Opin Plant Biol. 2015;25:90–97. doi: 10.1016/j.pbi.2015.05.002.
39. Gibbs J.A., Pound M., Wells D.M., Murchie E., French A., Pridmore T. Three-dimensional reconstruction of plant shoots from multiple images using an active vision system. In: Kootstra G., Edan Y., van Henten E., Bergerman M., editors. Proceedings of the IROS workshop on agri-food robotics, October 2, 2015, Hamburg, Germany. 2015.
40. Chen D., Neumann K., Friedel S., Kilian B., Chen M., Altmann T., Klukas C. Dissecting the phenotypic components of crop plant growth and drought responses based on high-throughput image analysis. Plant Cell. 2014;26:4636–4655. doi: 10.1105/tpc.114.129601.

