PLOS ONE. 2020 Mar 13;15(3):e0230458. doi: 10.1371/journal.pone.0230458

Cultivar-specific nutritional status of potato (Solanum tuberosum L.) crops

Zonlehoua Coulibali 1, Athyna Nancy Cambouris 2, Serge-Étienne Parent 1,*
Editor: Paul Esker 3
PMCID: PMC7069643  PMID: 32168339

Abstract

Gradients in the elemental composition of potato leaf tissue (i.e., its ionome) can be linked to crop potential. Because the ionome is a function of genetics and environmental conditions, practitioners aim to fine-tune fertilization to obtain an optimal ionome based on the needs of potato cultivars. Our objective was to assess the validity of cultivar grouping and predict potato tuber yields using foliar ionomes. The dataset comprised 3382 observations in Québec (Canada) from 1970 to 2017. The first mature leaves from the top were sampled at the beginning of flowering for total N, P, K, Ca, and Mg analysis. We preprocessed nutrient concentrations (ionomes) by centering each nutrient on the geometric mean of all nutrients and a filling value, a transformation known as row-centered log ratios (clr). A density-based clustering algorithm (dbscan) on these preprocessed ionomes failed to delineate groups of high-yield cultivars. We also used the preprocessed ionomes to assess their effects on tuber yield classes (high and low yields) on a cultivar basis using k-nearest neighbors, random forest and support vector machine classification algorithms. Our machine learning models returned an average accuracy of 70%, a fair diagnostic potential to detect in-season nutrient imbalance of potato cultivars using clr variables, considering potential confounding factors. Optimal ionomic regions of new cultivars could be assigned to those of the closest documented cultivar.

1 Introduction

Potato cultivars are commonly classified into maturity groups based on the number of days from planting to maturity [1]. Compared to other maturity groups, cultivars with longer maturity generally show yield potential that is similar or higher [2–4] because of higher genetic potential [5] related to higher foliar nitrogen status [6] and root acquisition rate [7]. Hence, nutrient management of potato cultivars often considers the cultivar maturity group. However, nutrient profiles or ionomes [8, 9] may vary among potato cultivars of the same maturity group because cultivars inherit specific traits for nutrient absorption and assimilation from a diversity of parents [10]. Indeed, White et al. [11] provided evidence of important ionome variations in angiosperm species and stated that plant families could be distinguished by their shoot ionomes. Successful classifications of plant species based on axis-reductions have been implemented on compositionally preprocessed plant ionomes [12, 13]. Potato cultivars may also be classified similarly, allowing newly introduced cultivars to benefit from the documented nutrient management of older cultivars. Hence, the foliar ionome, easily collected from field trials, could provide a tool for the fertilization of newly introduced cultivars.

Tissue ionome portrays plant nutritional status [13] under the assumption of causal relationships between plant growth rate and nutrient concentration in a tissue [14, 15]. In survey datasets, reference compositions are those that are nutritionally balanced [12]. Imbalanced ionomes could be rebalanced theoretically through a perturbation operation [16] i.e., a change in tissue composition after nutrient stress has been applied. Any factor impacting yield response to nutrients can perturb leaf composition [17]. Fertilization perturbs soil composition [18] by supplying readily available plant-nutrients [19].

Because nutrients interact in the plant, Baxter [20] suggested that the ionome could be treated as a combination of elements rather than elements taken in isolation. Parent [13] described ionomes as multivariate balance systems of isometric log-ratios [16]. Isometric log-ratios map vectors of concentrations, which are strictly positive data constrained to the measurement unit that convey only relative information, to a real space of orthonormal coordinates [21]. Indeed, ionomes are intrinsically multivariate: each part cannot be interpreted without being related to the other parts of the whole [22]. Parent and Dafir [23] developed the compositional nutrient diagnosis in plants using row-centered log-ratios (clr). Thereafter, compositional data transformation has been used to preprocess combined nutrient traits of plant species and cultivars [13, 24–26] as well as animal species [27], and human food [28, 29].

The first objective of this study was to identify clusters of potato cultivars based on their leaf ionomes. The second objective was to develop, evaluate and compare the performance of machine learning algorithms in predicting yield categories using ionomes. The third objective was to develop a conceptual workflow to adjust the ionome of potato cultivars using compositional perturbations. Our hypotheses were that (1) nutritionally balanced leaf ionomes differ among potato cultivars, (2) tuber yield is impacted by specific leaf compositional traits, and (3) cultivar-specific leaf ionomes could be rebalanced using a perturbation operation.

2 Methodology

2.1 Data set

The data set is a collection of potato surveys, and nitrogen (N), phosphorus (P) and potassium (K) fertilizer trials conducted in the province of Québec (Canada) from 1970 to 2017 (S1 Table) between the US border (45th parallel) and the northern limit of cultivation (49th parallel). The data set was filtered to remove foliar samples collected too early or too late relative to the beginning (10%) of flowering, as reported by scouting teams, and samples where three or more of the five major elements (N, P, K, Ca and Mg) had not been quantified. The complete data set comprised 3382 observations of 151 field trials. Five maturity classes were represented, and we matched the duration from planting to harvest described by the Canadian Food Inspection Agency [1], although the names differed: early season (65–70 days), early mid-season (70–90 days), mid-season (90–110 days), mid-season late (110–130 days) and late season (130 days and more) cultivars. The number of samples per cultivar and the corresponding maturity classes are reported in S2 Table.

2.2 Diagnostic tissue composition

The potato diagnostic tissue is the first mature leaf (4th leaf from the top) sampled at the beginning (10%) of the blooming stage [15, 30]. Twenty to 30 leaves were collected at random in each plot, composited, dried at 65°C, ground to pass through a 1 mm sieve, and analyzed for N, P, K, Ca and Mg concentrations after dissolution or combustion. Total N was determined by micro-Kjeldahl or Dumas combustion (Leco CNS-2000 analyzer, St. Joseph, MI, USA). After acid dissolution [31], K, Ca, and Mg concentrations were quantified by atomic absorption spectrometry or inductively coupled plasma spectroscopy (ICP), and P by colorimetry or ICP. We made no distinction between methodologies in the analysis of ionomes.

2.3 Processing nutrient composition to nutrient balances

The compositional space [16] of the leaf tissue comprised five nutrients (N, P, K, Mg, Ca) and undetermined components amalgamated into a filling value (Fv) computed by difference between the measurement unit and the sum of quantified nutrients. Tissue components were preprocessed using the row-centered log-ratio transformation, as follows [23]:

$$\mathrm{clr}_i = \ln\left(\frac{x_i}{g(x)}\right) \qquad (1)$$

where $x_i$ is the raw concentration of the $i$th component and $g(x)$ is the geometric mean across components, including the filling value.
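As an illustration of Eq 1, the following minimal Python sketch computes the filling value by difference and then the clr coordinates. The function name `clr_with_filling`, the choice of mass fractions as the unit, and the sample values are assumptions made for this example; they are not taken from the study's R code.

```python
import math

def clr_with_filling(nutrients, unit=1.0):
    """Row-centered log-ratios of a nutrient composition plus its filling value.

    nutrients: dict of strictly positive concentrations, all in the same unit
    (here mass fractions); unit is the total of the measurement unit.
    """
    parts = dict(nutrients)
    # Filling value (Fv): everything that was not quantified.
    parts["Fv"] = unit - sum(nutrients.values())
    if any(v <= 0 for v in parts.values()):
        raise ValueError("all parts, including Fv, must be strictly positive")
    logs = {k: math.log(v) for k, v in parts.items()}
    # The log of the geometric mean g(x) is the arithmetic mean of the logs.
    log_gmean = sum(logs.values()) / len(logs)
    return {k: v - log_gmean for k, v in logs.items()}

# Hypothetical leaf sample expressed as mass fractions (g per g dry matter).
leaf = {"N": 0.050, "P": 0.003, "K": 0.045, "Mg": 0.004, "Ca": 0.010}
clr = clr_with_filling(leaf)
print({k: round(v, 3) for k, v in clr.items()})
```

By construction the clr coordinates of a sample sum to zero, which is a convenient check after preprocessing.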

2.4 Clustering cultivars

Yield thresholds are useful for decision-making. Because tuber yield potential varies widely among cultivars, we proceeded by discretizing tuber yields into low- and high-productivity categories [12] by ranking the marketable yield in ascending order within a given cultivar, and selecting the yield corresponding to the 65th percentile as the cut-off between the two subgroups. Hence, each cultivar had its own cut-off with respect to its yield potential, as shown in S2 Table. The ionomes of the high-yielding subpopulation were used to assess how well cultivars cluster. This subgroup comprised 1190 occurrences (after the exclusion of 144 outliers) across 151 trials and 47 cultivars. A density-based clustering method [32] was used to delineate cultivar groups of similar compositions using clr variables.
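The cultivar-specific cut-off can be sketched as follows. This is an illustrative Python example, not the authors' R code: the linear-interpolation percentile, the rule that a yield equal to the cut-off counts as high, and the toy records are all assumptions.

```python
from collections import defaultdict

def percentile(sorted_values, q):
    """q-th percentile (0-100) of an ascending list, by linear interpolation."""
    if not sorted_values:
        raise ValueError("empty list")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def label_yield_classes(records, q=65):
    """records: list of (cultivar, marketable_yield). Returns (cutoffs, labels)."""
    by_cultivar = defaultdict(list)
    for cultivar, y in records:
        by_cultivar[cultivar].append(y)
    # One cut-off per cultivar, computed on that cultivar's ranked yields.
    cutoffs = {c: percentile(sorted(ys), q) for c, ys in by_cultivar.items()}
    # Assumption: a yield equal to the cut-off is labelled as high.
    labels = ["high" if y >= cutoffs[c] else "low" for c, y in records]
    return cutoffs, labels

# Hypothetical yields (Mg per ha) for two made-up cultivars "A" and "B".
records = [("A", y) for y in (20, 25, 30, 35, 40)] + [("B", y) for y in (40, 50, 60)]
cutoffs, labels = label_yield_classes(records)
print(cutoffs, labels)
```

Because the cut-off is relative to each cultivar, a yield of 40 is labelled high for cultivar "A" but low for cultivar "B" in this toy data.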

2.5 Ionome effect and yield prediction

Machine learning algorithms can either regress to predict continuous variables or classify to predict categories [33]. Tuber yield categories were predicted using clr variables and information on ionomic groups of the full data set (high and low yielders i.e. 3382 rows). Three machine learning algorithms were compared: k-nearest neighbors, random forest and support vector machines.

We estimated the relative influence of variables in the model and their rank by examining how much the prediction error increases when data for a variable are permuted while all others are left unchanged [34, 35]. A variable can score zero or a very small value compared to others. Deleting such a variable from the dataset should not affect the results. The random forest model was used for feature selection to assess the importance of each clr variable in predicting tuber yield, but none of the variables was removed.
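The permutation idea can be sketched in a few lines of Python. The example is a generic illustration under stated assumptions: `predict` is a toy stand-in for any fitted classifier (it is not a random forest), the loss in accuracy is used as the increase in prediction error, and the data are synthetic.

```python
import random

def accuracy(predict, X, y):
    """Share of rows whose predicted class equals the observed class."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when one column is shuffled and the others kept."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy stand-in for a fitted model: it only looks at the first feature.
def predict(row):
    return "high" if row[0] > 0 else "low"

rng = random.Random(1)
# Feature 0 alternates in sign (informative); feature 1 is pure noise.
X = [[(i + 1) / 40 if i % 2 else -(i + 1) / 40, rng.uniform(-1, 1)] for i in range(40)]
y = [predict(row) for row in X]
importances = permutation_importance(predict, X, y)
print(importances)
```

A feature the model never uses scores exactly zero here, which corresponds to the case where removing the variable would not change the results.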

The data were split into training (75%) and testing (remaining 25%) sets at cultivar level i.e., for each cultivar the samples were randomly separated according to these proportions. The performance of the classification models was assessed using accuracy computed with the testing set. Applied to the context, the four quadrants defined by Swets [36] in binary system diagnosis to delineate the response classes are presented in the contingency table (Table 1). The accuracy is the proportion of correctly-predicted instances:

$$\mathrm{Accuracy} = \frac{TN + TP}{TN + TP + FN + FP} \qquad (2)$$

Table 1. Term definitions used for the study.

| Predicted yield | Observed yield: Low (unbalanced) | Observed yield: High (balanced) |
| --- | --- | --- |
| Low | True positive (TP): observed low-yielders correctly predicted as low-yielders. | False positive (FP): observed high-yielders incorrectly predicted as low-yielders. |
| High | False negative (FN): observed low-yielders incorrectly predicted as high-yielders. | True negative (TN): observed high-yielders correctly predicted as high-yielders. |

As in medical sciences, the negative term is used in cases where no intervention is needed after diagnosis.
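The convention of Table 1, where the low-yield (imbalanced) class is the positive class, can be made concrete with a short Python sketch. The label vectors below are hypothetical and only serve to show how the four quadrants and Eq 2 are computed.

```python
def confusion_counts(observed, predicted, positive="low"):
    """Count TP, FP, FN, TN with low yield (imbalanced) as the positive class."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for obs, pred in zip(observed, predicted):
        if pred == positive:
            counts["TP" if obs == positive else "FP"] += 1
        else:
            counts["FN" if obs == positive else "TN"] += 1
    return counts

def accuracy(c):
    """Eq 2: proportion of correctly predicted instances."""
    return (c["TN"] + c["TP"]) / (c["TN"] + c["TP"] + c["FN"] + c["FP"])

# Hypothetical observed and predicted yield classes for eight samples.
observed = ["low", "low", "high", "high", "low", "high", "low", "high"]
predicted = ["low", "high", "high", "low", "low", "high", "low", "high"]
counts = confusion_counts(observed, predicted)
print(counts, accuracy(counts))
```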

2.6 Rebalancing a composition: The enchanting islands

A compositional perturbation is a translation in the compositional space [37, 38]. A perturbation vector expressed as clr values contains a series of deltas (differences). Once back-transformed into the compositional space, the perturbation vector alters a composition through a perturbation (⊕) operation as follows [37]:

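$$A \oplus B = [a_1, a_2, \ldots, a_D] \oplus [b_1, b_2, \ldots, b_D] = \mathcal{C}(a_1 \times b_1, a_2 \times b_2, \ldots, a_D \times b_D) \qquad (3)$$

where a $D$-part composition $A$ is perturbed ($\oplus$) by a $D$-part composition $B$, and $\mathcal{C}$ is the closure operator to constant sum.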
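Eq 3 amounts to an element-wise product followed by closure. A minimal Python sketch, using hypothetical three-part compositions and a constant sum of 1 as assumptions:

```python
def closure(parts, total=1.0):
    """Rescale strictly positive parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]

def perturb(a, b):
    """Perturbation of composition a by composition b (Eq 3)."""
    if len(a) != len(b):
        raise ValueError("compositions must have the same number of parts")
    return closure([ai * bi for ai, bi in zip(a, b)])

a = [0.2, 0.3, 0.5]
b = [0.5, 0.25, 0.25]
c = perturb(a, b)
print(c)

# The uniform composition is the neutral element: it leaves a unchanged.
neutral = [1 / 3, 1 / 3, 1 / 3]
print(perturb(a, neutral))
```

Perturbing by the closed element-wise ratio of two compositions moves the first onto the second, which is the property used later to derive a rebalancing vector.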

We used the testing set to display the effect of a perturbation across the simplex. We selected two elements (N and P) and simulated an increase of their initial (observed) clr values by 20% (theoretically). The observed (ionome of the instance) and new clr vector (perturbed ionome) were back-transformed into N, P, K, Ca, Mg and Fv compositional space for comparison using familiar concentration units.
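The simulation can be sketched as below. Two points are assumptions of this example rather than statements about the authors' code: the 20% increase is read as multiplying the clr value by 1.2, and the back-transformation is the closed exponential of the clr vector. The composition is hypothetical and ordered as N, P, K, Mg, Ca, Fv.

```python
import math

PARTS = ["N", "P", "K", "Mg", "Ca", "Fv"]

def clr(x):
    """Row-centered log-ratios of a strictly positive composition."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def clr_inverse(z):
    """Back-transform clr values to proportions summing to 1."""
    e = [math.exp(v) for v in z]
    s = sum(e)
    return [v / s for v in e]

# Hypothetical observed ionome (mass fractions), not a sample from the study.
observed = [0.050, 0.003, 0.045, 0.004, 0.010, 0.888]
z = clr(observed)

# Assumed reading of "increase by 20%": multiply the N and P clr values by 1.2.
z_new = list(z)
for name in ("N", "P"):
    i = PARTS.index(name)
    z_new[i] = 1.2 * z[i]

perturbed = clr_inverse(z_new)
for name, before, after in zip(PARTS, observed, perturbed):
    print(f"{name}: {before:.4f} -> {after:.4f}")
```

Under this reading, a nutrient whose clr value is negative (here P, which lies below the geometric mean) moves further from zero, so its proportion drops after back-transformation, while every other proportion shifts as well because of closure.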

The high yielders of the training set correctly diagnosed as balanced (true negative specimens) by the most accurate model were used as the reference subpopulations. The clr values of these reference specimens were used as the reference nutritional status at high yield potential. A potato nutrient imbalance index was computed as the distance from the closest high-yielding specimen using the Aitchison distance, i.e., the Euclidean distance between compositions using clr-transformed concentrations [39]. For any imbalanced or new specimen of a given cultivar, the closest true negative (closest reference composition) was identified as the sample with the minimum Aitchison distance from the new composition. The nutrient clr differences defining the Aitchison distance may be considered as an apparent excess or deficiency of the nutrient requiring corrective measures in a multivariate and compositional data perspective [40]. Hence, the clr space of nutrient components (N, P, K, Ca, Mg) was described not as an ellipsoidal hyper-space [41] but as islands of high-yielding specimens dispersed in the hyper-space of differently yielding specimens. The closer a specimen is to an enchanting island, the higher its chance of becoming a high-yielder [40]. The clr difference was converted into a perturbation vector between two nutrient compositions expressed as familiar nutrient concentrations.
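The search for the closest reference specimen can be sketched as follows. This is a minimal Python illustration with hypothetical compositions (ordered N, P, K, Mg, Ca, Fv); the function names and the sign convention for the clr difference (reference minus sample, so that positive values suggest an apparent deficiency) are assumptions of the example.

```python
import math

PARTS = ["N", "P", "K", "Mg", "Ca", "Fv"]

def clr(x):
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr vectors of two compositions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))

def nearest_reference(sample, references):
    """Return (index, distance) of the closest reference composition."""
    distances = [aitchison_distance(sample, r) for r in references]
    i = min(range(len(references)), key=distances.__getitem__)
    return i, distances[i]

def clr_difference(sample, reference):
    """Reference minus sample, per part; positive suggests apparent deficiency."""
    return [r - s for s, r in zip(clr(sample), clr(reference))]

# Hypothetical true-negative (reference) specimens and one sample to diagnose.
references = [
    [0.040, 0.005, 0.060, 0.008, 0.020, 0.867],
    [0.052, 0.003, 0.031, 0.004, 0.011, 0.899],
]
sample = [0.050, 0.003, 0.030, 0.004, 0.010, 0.903]

idx, dist = nearest_reference(sample, references)
diff = clr_difference(sample, references[idx])
print(idx, round(dist, 4), dict(zip(PARTS, (round(d, 4) for d in diff))))
```

In this toy case the second reference is the closest island, and Ca shows the largest positive clr difference, i.e., the largest apparent deficiency relative to that reference.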

2.7 Statistical analysis

Statistical computations were performed in the R statistical environment version 3.6.1 [42]. Compositional data analysis was conducted using the R compositions package version 1.40–2 [43]. Multivariate outliers were removed for robust multivariate analysis [44] using the Mahalanobis distance at a 0.01 level of significance with the R mvoutlier package version 2.0.9 [45]. Clustering was performed using the dbscan package version 1.1–3 [32]. Linear discriminant analysis (LDA) was conducted using the R ade4 package version 1.7–13 [46], which allows computing linear combinations of clr coordinates that best discriminate cultivar ionome centroids. Supervised analysis was conducted using the caret package version 6.0–84 [47]. Our results are reproducible by using the R computation codes and data given as supplementary information and available online in a GitHub repository (https://git.io/Jvt2r).

3 Results

3.1 Cluster analysis

The data set used for clustering is described in S2 Table. The AC Chaleur cultivar showed the lowest tuber marketable yield cut-off (65th percentile) at 17.4 Mg ha⁻¹ and Red-Maria, the highest at 64.6 Mg ha⁻¹. Average marketable yield was 40.5 Mg ha⁻¹ for high yielders and 24.8 Mg ha⁻¹ for low yielders. In comparison, average potato tuber yields in Canada and Québec were 31.2 Mg ha⁻¹ and 32.2 Mg ha⁻¹ respectively, in 2018 [48].

The dbscan clustering function looked for dense regions in the clr-space, and detected a single cluster of cultivars, i.e., cultivars were scattered without any particular dense region. A principal components analysis allowed us to map cultivars and nutrients in the biplot shown in Fig 1. The principal components scores mapped on the distance biplot (Fig 1A) showed no particular pattern that would allow partitioning into groups. The clr correlation loadings (Fig 1B) showed a negative relationship between K and Mg, and between P and Ca, and a positive relationship between N and P, in agreement with concentration changes with time as the plant matures [49]. Discrepancies between cultivars were driven mainly by Mg and K on the first axis, and by P and Ca on the second axis (right-hand side plot).

Fig 1. Principal components biplot of potato ionome showing (A) scores in distance scaling and (B) loadings in correlation scaling.

3.2 Predicting tuber yield

Classification models related explanatory clr variables to two tuber marketable yield categories: high and low yielders. The random forest algorithm allowed us to rank the importance of variables in the model. The clr of nitrogen appeared to be the most discriminant variable between tuber yield categories, followed by the amalgamated unknown components (Fv), then Ca, Mg and, finally, P.

After splitting data into training (75%) and testing (25%) data sets, we used a ten-fold cross-validation process that sequentially splits the training data set into ten parts, using nine parts for calibration and the remainder for validation. The k-nearest neighbors, the random forest and the support vector machine models returned practically similar predictive accuracies (although slightly lower for the support vector machine algorithm), with a mean accuracy of 70%, representing 591 successful and 254 unsuccessful classifications with the testing set. The null hypothesis for a random classifier, i.e., a non-informative classification consisting of an equal distribution of 50% successful and 50% unsuccessful cases, was rejected after a χ² homogeneity test (χ² = 69.135, p < 2.2 × 10⁻¹⁶). Since all the models returned practically similar accuracy over the testing set, predictions with the k-nearest neighbors model were used for interpretation. There was high variation in model fit by cultivar, as shown in Fig 2. The accuracy at testing varied from 25% for Estima and Waneta, to 100% for Ambra, Carolina, Dark Red Chieftain, Harmony, Peribonka and Viking. All these cultivars had small sample sizes in the dataset, as shown in S2 Table.
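The reported accuracy follows directly from the counts given above, and the homogeneity test can be approximated by hand. The Python sketch below is illustrative only: the exact contingency table and options used by the authors are not stated, so the 2 × 2 layout (observed split versus an even split of the same total) and the Yates continuity correction are assumptions.

```python
successes, failures = 591, 254
n = successes + failures
accuracy = successes / n

def chi_square_2x2(table, yates=True):
    """Pearson chi-square statistic for a 2 x 2 table (1 degree of freedom)."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            diff = abs(table[i][j] - expected)
            if yates:
                # Continuity correction, assumed here; not confirmed by the text.
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / expected
    return stat

# Assumed table: row 1 is the model, row 2 a random classifier (50/50 split).
table = [[successes, failures], [n / 2, n / 2]]
chi2 = chi_square_2x2(table)
print(round(accuracy, 3), round(chi2, 2))
```

Under these assumptions the statistic is about 69.2, close to the reported 69.135 and far above the 5% critical value of 3.84 for one degree of freedom; the small gap presumably reflects details of the original test that are not given here.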

Fig 2. The k nearest neighbors model evaluation accuracies for cultivars.


3.3 Ionome perturbation

The true negative specimens (correctly diagnosed as balanced), comprising 783 occurrences in the training data set, provided the clr reference values required to compute the Aitchison distance, which is equal to the Euclidean distance across clr-transformed compositions. S3 Table displays mean values for each cultivar. Using the Aitchison metric, the closest true negative specimen was set as the reference composition for each imbalanced specimen. In the clr-space, the difference between the reference and the imbalanced compositions returns a perturbation vector. Fig 3 shows the imbalanced sample with the highest Aitchison distance from its reference and the perturbation to apply as a translation to reach a balanced ionome.

Fig 3. Perturbation vector example mapped using the most imbalanced sample.


The nutrient composition of the most imbalanced observation was (0.0601, 0.0037, 0.0355, 0.0032, 0.0048, 0.8919), the nearest reference composition was (0.0561, 0.0036, 0.0603, 0.0052, 0.0184, 0.8565), and the corresponding perturbation vector was (0.0919, 0.0965, 0.1696, 0.1629, 0.3832, 0.0959) for N, P, K, Mg, Ca and Fv, respectively. The Aitchison distance computed between the observation and its associated true negative was 1.135.
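The reported perturbation vector is consistent with the closed element-wise ratio of the reference to the observed composition. The Python sketch below recomputes it from the four-decimal values quoted above; treating the small discrepancies as rounding of the published values is an assumption of this example.

```python
PARTS = ["N", "P", "K", "Mg", "Ca", "Fv"]

# Values as reported in the text (rounded to four decimals).
observed = [0.0601, 0.0037, 0.0355, 0.0032, 0.0048, 0.8919]
reference = [0.0561, 0.0036, 0.0603, 0.0052, 0.0184, 0.8565]
reported = [0.0919, 0.0965, 0.1696, 0.1629, 0.3832, 0.0959]

def closure(parts):
    """Rescale strictly positive parts so that they sum to 1."""
    s = sum(parts)
    return [p / s for p in parts]

# Perturbation that moves the observation onto the reference.
perturbation = closure([r / o for o, r in zip(observed, reference)])

# Applying it to the observation recovers the (closed) reference composition.
rebalanced = closure([o * p for o, p in zip(observed, perturbation)])

for name, p, rep in zip(PARTS, perturbation, reported):
    print(f"{name}: recomputed {p:.4f}, reported {rep:.4f}")
```

Values above the neutral 1/6 (here K, Mg and especially Ca) indicate parts that must increase relative to the others to reach the reference, and values below it indicate parts that must decrease.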

4 Discussion

4.1 Clustering potato cultivars

The Canadian Food Inspection Agency classified potato cultivars broadly into maturity groups based on the time elapsed between planting and maturity [1]. However, nutrient requirements, especially nitrogen, vary widely between cultivars of the same maturity group. In New Brunswick (Canada), Zebarth et al. [50] recommended 200–208 kg N ha⁻¹ for Russet Norkota (early-season cultivar) and Russet Burbank (late-season cultivar), 190 kg N ha⁻¹ for Superior (early-mid-season cultivar) and Goldrush (mid-season cultivar), 175 kg N ha⁻¹ for Shepody (mid-season), 135 kg N ha⁻¹ for early cultivars for the table market, 160–180 kg N ha⁻¹ for other mid-season, 180–200 kg N ha⁻¹ for other late, and 135–160 kg N ha⁻¹ for low N requirement cultivars. Such large discrepancies within the same cultivar maturity group were attributed to differential foliar gene expression [6] and root development [7]. Hence, information additional to maturity grouping is needed to assess nutrient requirements of potato cultivars. Huang and Salt [51] reported that ionomics allows the discovery of genes controlling natural variation in the plant ionome and, for Salt et al. [9], ionomics could capture information about the functional state of an organism driven by genetic and environmental factors. The content of plant tissue reflects what the plant can absorb from the soil and, for each nutrient, there is a correlation between its concentration and yield. Moreover, since tissue analysis is also carried out to observe the effect of fertilizer applications, and to determine the in-season or next-season nutrient requirement [52, 53], ionomes could be useful in discriminating potato cultivars. Indeed, using a small data set of eight potato cultivars, Hernandes et al. [10] showed that foliar nutrient profiles varied widely among cultivars of the same maturity group. According to Parent et al. [12], variations in ionomes could be interpreted only partly as a genotypic effect, and phenotypic plasticity can also be driven by nutrient supply capacity specific to agroecosystems, while breeding programs are conducted under relatively luxurious environments to reach high productivity. The N, Mg and K clr values, which dominated the principal components (Fig 1), could reflect the abilities of individual cultivars to acquire and use those nutrients more efficiently [54, 55]. Natale et al. [56] provided evidence that, in general, macronutrient contents differ among species and cultivars and within the same species for fruit trees. For N, K and Ca, this range is wider because of the higher requirement of these elements by plants, and narrower for P, Mg and S, indicating smaller demand for the latter.

To cluster is to recognize that objects are sufficiently similar to be put in the same group, and to identify distinctions or separations between groups of objects [57, 58]. Based on the assumptions of differential genotypic potential, root development, nutrient requirements, nutrient uptake and use efficiency, the goal was to discover interesting structures in the N, P, K, Mg and Ca contents of the diagnostic tissue in order to decipher dissimilarities between cultivars [33]. However, the process failed to discriminate groups of cultivars along the clr coordinates. Hernandes et al. [10] reached similar results with overlapping nutrient profiles between cultivar groups depending on isometric log ratio (ilr) coordinates. They found similar nutrient profiles between cultivar groups along some ilr coordinates and very different ones along others. While ionome dissimilarities are not numerically compelling, they could assist in classifying new cultivars into an appropriate ionomic group to benefit from costly fertilizer trials conducted on cultivars of the same group.

4.2 Tuber marketable yield prediction

The P content of the diagnostic leaf did not appear useful in predicting potato tuber yield classes. Other elements (N, K, Ca and Mg) showed an important contribution of their clr values to the prediction quality metric, especially N, which is directly related to photosynthesis [59]. Since the fertilization trials were conducted over a time span of 47 years (1970–2017), the question arises whether the different methodologies for quantifying P (colorimetry/ICP) may have contributed to depreciating this variable in predicting tuber yield classes. The ICP method has been shown to be faster and to give higher results for total phosphorus content in soil extracts in comparison to the colorimetric method. However, there are exceptions and controversial results [60–62]. Ivanov et al. [61] found that the two methods for total P determination in plant material were highly correlated, and the results were generally within 5% to 10% of one another. Moreover, Valkama et al. [63] reported that, although agricultural practices, soil conditions and analytical techniques have undergone substantial changes over time, the differences between old and recent experiments in yield responses to P application were not statistically important. For all these reasons, we consider the two analytical methods equally relevant to the analysis. The low importance of the P clr variable in predicting tuber yield classes may come from its correlation with Ca. In general, the selection of relevant features is achieved by first checking the correlation between features and response, to select the features that have a correlation above a selected level (e.g., 0.5). Then, the independent variables need to be uncorrelated with one another. If some features are correlated, only one is kept. The process selected the clr_Ca variable (alphabetical order) instead of clr_P since these features are correlated, as shown in Fig 1B. In this study, no element was discarded from the process on the basis of its importance.

The tested algorithms (k-nearest neighbors, random forest and support vector machine) returned similar accuracies in the prediction of yield classes using clr variables as predictors and showed fair diagnostic potential to detect nutrient imbalance. The correctly predicted high and low yielders reached 70% in the testing data set. The models classified the yield categories more accurately than a random classifier [64]. Specimens classified as false negatives (i.e., low yielders incorrectly classified as high yielders) are attributable to limiting conditions other than N, P, K, Mg, and Ca nutrition: soil physical and chemical properties [65, 66], fertilization [67], management failures, diseases [68] or weather events [69] impacting plant growth and yield potential. False positive specimens (i.e., high yielders incorrectly classified as low yielders) indicate luxury consumption when nutrient concentrations are higher [12, 70], or other particularly favorable growth conditions. The confusion matrix built for cultivars revealed poor predictive accuracy for certain cultivars (e.g., 25% for Estima and Waneta) and conversely an accuracy of 100% for others (e.g., Ambra, Peribonka), as shown in Fig 2. These cultivars mainly had small sample sizes (only one, two or three high-yielders and five, six or slightly more low-yielders). The problems of small data in machine learning are numerous, but mainly revolve around over-fitting. The division into training and testing datasets could aggregate observations of only one class in the training set, so that the model would train to always predict this dominant class [71]. The model could also memorize labels, which is not ideal for generalizing to new data. Brownlee [72] explained that imbalanced classifications (one or fewer examples in a minority class for hundreds or more examples in the other) pose a challenge for predictive modeling, as most of the machine learning algorithms used for classification were designed around the assumption of an equal number of examples for each class. This results in models that have poor predictive performance, specifically for the minority class. The questionable accuracy level for some cultivars (especially the low levels) could also come from other yield-limiting factors specific to the experiments but not involved in this study, as for false positive specimens. Our model was not effective for these cultivars treated separately.

The differential nutrition of potato cultivars could be addressed objectively using mineral analysis of the diagnostic leaf. More data are needed for poorly documented cultivars. Moreover, dedicated models could be trained for cultivars for which sufficient data are available (e.g., Goldrush, Superior, FL 1207, Chieftain). Other algorithmic, sampling and quality measurement approaches could further be implemented to deal with the problems of small-data and unequal distribution of classes [71, 72]. One could extend the predictors to the experimental conditions (soils, weather data), fitting a site-and-cultivar-specific nutrients diagnosis model.

4.3 Perturbation vector for fertilizer recommendation

Rational fertilization requires information on the nutrients that are available in the soil, and the nutritional status of the plant [14] as portrayed by the diagnostic tissue composition [14, 15]. However, the diagnosis of deficiency and toxicity of mineral nutrients may be complicated in field-grown plants where more than one mineral nutrient is deficient or where there is a deficiency of one nutrient and simultaneously toxicity of another [14]. The scientific principle behind tissue analysis is that healthy plants contain predictable concentrations of analytical nutrients [73]. The values are compared to established norms for inadequate, adequate and excess levels. However, Parent et al. [13] proved that this concept of growth-limiting nutrient concentrations supported by the Law of minimum and illustrated by Liebig’s barrel, should be replaced by a concept of growth-limiting nutrient balances illustrated by a pan balance design, where groups of elements are balanced optimally against each other in weighing pans.

The difference between two equal-length compositional vectors can only be computed using tools of compositional data analysis. The perturbation vector concept applied to foliar tissue diagnosis returns a scaling operator [21] that when applied to an imbalanced composition translates it (theoretically) into a balanced composition with high yield potential (i.e., true negative). Although the closure of the simplex implies that a perturbation on the clr of a specific nutrient is methodologically not a change in proportion of a single nutrient, perturbations expressed in the clr space appear suitable for interpretation. Indeed, the difference measured between clr values of the diagnosed sample and reference (true negative) specimen can be ranked using the sign of that difference [10, 74, 75], hence indicating which components are at excessive or deficient levels. As provided by Parent [40], K and Mg were apparently deficient while N, P and Ca were apparently in excess compared to the closest reference specimen (Fig 3). Using the same approach, ionomes of newly introduced cultivars with unknown nutrient requirements could be assigned to the cultivars of known nutrient requirements showing the closest ionomes.

A perturbation such as the one shown in Fig 3 should not be interpreted as shifts of individual components, since the operation on a single component resonates on the whole simplex [40]. For instance, an offset in the simplex S (N, P, K, Ca, Mg, Fv) composition following the (theoretical) increase by 20% of N and P clr values is displayed in Fig 4. The K, Ca and Mg concentrations seemed more stable than the others. Although P clr values were increased, the P proportion decreased globally for the new equilibrium of the simplex. The offset was higher for the selected components, followed by the filling value (Fv).

Fig 4. Effect of the perturbation of N and P clr coordinates on the other element proportions.


‘Observation’ stands for the element’s original proportion; ‘Perturbation’ designates the new proportion after the clr value of the observed vector was offset. Greyed boxplots show the distribution of the perturbed elements of the simplex.

Perturbation (as defined in Eq 3) is the measure of compositional change from one composition to another [37]. Because foliar composition belongs to the compositional data family, Fig 4 illustrates the principle that changing one proportion of such data affects at least one other proportion of the simplex [16]. The result displayed variable offsets for other elements, decreasing or increasing to reach another balance in the simplex.

5 Conclusion

Since the concept of compositional data analysis was applied to plant tissues, several studies have classified plant species and cultivars using multivariate analysis of nutrient compositions. This study is, to our knowledge, the third (following Parent et al. [49] and Hernandes et al. [10]) to use statistical tools to address the differential nutrition of potato cultivars using combinations of nutrient concentrations in the diagnostic leaf, and the first to use machine learning tools to predict tuber marketable yield. The potato ionomes showed some dissimilarities in principal components analysis, but these were not compelling enough to separate definite density-based clusters of cultivars on the basis of the clr values. However, the ionome showed a determinant effect on tuber yield. Used as predictors in machine learning tools, clr variables showed diagnostic potential to detect in-season nutrient imbalance and to address objectively the differential response of cultivars to fertilization. The perturbation vector of the leaf compositional space could indicate cultivar sensitivity to fertilization and address specific problems of nutrient imbalance in new cultivars. Tissue testing remains an informative, diagnostic and preventive tool with real-world applications for growers in evaluating the effectiveness of their nutrient management program. When correctly interpreted, timely tissue testing helps diagnose the presence and magnitude of suspected nutrient deficiencies. By using the compositional perturbation vector involving interactions among nutrients, our study provided a useful tool for potato precision fertilization in Quebec. The perturbation vector can help identify limiting nutrients requiring corrective measures as a season progresses or for subsequent seasons. Moreover, our study implicitly provided robust multi-nutrient norms for potato crops, gathering more cultivars of different maturity classes than previous works. 
These norms are sets of true-negative or nutritionally-balanced compositions per cultivar (enchanting islands) with high-yield potential. More data are needed to fine-tune the models, especially for poorly-documented cultivars. New algorithms, other sampling methods and model quality measures could be tested to deal with the problem of small-data and imbalanced classification. Further studies extending predictive features to site-specific conditions could improve the diagnosis with a site- and cultivar-specific nutrient diagnosis model.
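As a rough illustration of how such norms might be used, the sketch below compares a diagnosed leaf composition with a small set of reference compositions in clr space and picks the closest one. It is only a sketch under stated assumptions: the reference values, the sample, and the helper names `with_filling`, `clr`, `aitchison_distance` and `closest_reference` are hypothetical, and the Euclidean distance between clr vectors (the Aitchison distance) is assumed as the measure of closeness.

```python
import math

def with_filling(nutrients):
    """Append the filling value so that the parts sum to 1."""
    return nutrients + [1.0 - sum(nutrients)]

def clr(parts):
    """Centered log-ratio: log of each part over the geometric mean of all parts."""
    logs = [math.log(part) for part in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr vectors of two compositions."""
    return math.dist(clr(x), clr(y))

def closest_reference(sample, references):
    """Return the reference composition nearest to the sample in clr space."""
    return min(references, key=lambda ref: aitchison_distance(sample, ref))

# Hypothetical balanced (true-negative) compositions: N, P, K, Ca, Mg (g per g).
references = [
    with_filling([0.048, 0.0032, 0.042, 0.009, 0.0035]),
    with_filling([0.038, 0.0025, 0.050, 0.012, 0.0045]),
]

# Hypothetical leaf sample to diagnose.
sample = with_filling([0.040, 0.0026, 0.049, 0.0115, 0.0043])

nearest = closest_reference(sample, references)
print("distance to nearest reference:", aitchison_distance(sample, nearest))
```

Working in clr space makes the comparison independent of the measurement scale, because multiplying all parts of a composition by a constant leaves its clr vector unchanged.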

Supporting information

S1 Table. Quebec potato leaves ionome data set.

raw_leaf_df.csv file available online in data repository at https://git.io/Jvt2r.

(CSV)

S2 Table. Potato data set used for cluster analysis.

(DOCX)

S3 Table. True negatives mean clr values for cultivars.

(DOCX)

Data Availability

All relevant data are within the paper and its Supporting Information files. There is no restriction on sharing of data and/or materials.

Funding Statement

ZC is partly funded by the Natural Sciences and Engineering Research Council of Canada (CRDPJ 385199-09 and DG-2254 - https://www.nserc-crsng.gc.ca), the Quebec Ministry of Agriculture, Fisheries and Food (IA216581 - https://www.mapaq.gouv.qc.ca), Centre SEVE (https://centreseve.recherche.usherbrooke.ca/), Patate Dolbec Inc. (https://patatesdolbec.com/), Groupe Gosselin FG (http://gosseling2.com), Agriparmentier Inc., Ferme Daniel Bolduc Inc. (http://fermedanielbolduc.com/), Patate Laurentienne, Ferme Bergeron-Niquet, and Patates Lac-St-Jean (http://plsj.ca/). There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. CFIA. Potato plants characteristics, maturity. Canadian Food Inspection Agency; 2015. Available from: http://www.inspection.gc.ca/plants/potatoes/characteristics/eng/1326490397702/1326490477981#mature.
  • 2. Eschemback V, Kawakami J, Melo PEd. Performance of modern and old, European and national potato cultivars in different environments. Horticultura Brasileira. 2017;35:377–84.
  • 3. Kawakami J, Iwama K, Jitsuyama Y, Zheng X. Effect of cultivar maturity period on the growth and yield of potato plants grown from microtubers and conventional seed tubers. American Journal of Potato Research. 2004;81(5):327–33.
  • 4. Söğüt T, Öztürk F. Effects of harvesting time on some yield and quality traits of different maturing potato cultivars. African Journal of Biotechnology. 2011;10(38):7349–55.
  • 5. Saric MR. Theoretical and practical approaches to the genetic specificity of mineral-nutrition of plants. Plant and Soil. 1983;72(2–3):137–50.
  • 6. Zebarth BJ, Tai HL, Luo SN, Millard P, De Koeyer D, Li XQ, et al. Differential gene expression as an indicator of nitrogen sufficiency in field-grown potato plants. Plant and Soil. 2011;345(1–2):387–400.
  • 7. Sattelmacher B, Klotz F, Marschner H. Influence of the nitrogen level on root growth and morphology of two potato varieties differing in nitrogen acquisition. Plant and Soil. 1990;123(2):131–7.
  • 8. Lahner B, Gong JM, Mahmoudian M, Smith EL, Abid KB, Rogers EE, et al. Genomic scale profiling of nutrient and trace elements in Arabidopsis thaliana. Nature Biotechnology. 2003;21(10):1215–21. 10.1038/nbt865
  • 9. Salt DE, Baxter I, Lahner B. Ionomics and the study of the plant ionome. Annual Review of Plant Biology. 2008;59:709–33. 10.1146/annurev.arplant.59.032607.092942
  • 10. Hernandes A, Parent S-É, Veillette J-P, Parent P, Leblanc M, Roy G, et al. Compositional meta-analysis of the nutrient profile of potato cultivars. 2011.
  • 11. White PJ, Broadley MR, Thompson JA, McNicol JW, Crawley MJ, Poulton PR, et al. Testing the distinctness of shoot ionomes of angiosperm families using the Rothamsted Park Grass Continuous Hay Experiment. New Phytologist. 2012;196(1):101–9. 10.1111/j.1469-8137.2012.04228.x
  • 12. Parent SE, Parent LE, Rozane DE, Natale W. Plant ionome diagnosis using sound balances: case study with mango (Mangifera Indica). Frontiers in Plant Science. 2013;4:1–12. 10.3389/fpls.2013.00001
  • 13. Parent SE, Parent LE, Egozcue JJ, Rozane DE, Hernandes A, Lapointe L, et al. The plant ionome revisited by the nutrient balance concept. Frontiers in Plant Science. 2013;4.
  • 14. Marschner H. Diagnosis of deficiency and toxicity of mineral nutrients. In: Marschner H, editor. Mineral Nutrition of Higher Plants. 2nd ed. Academic Press, London; 1995. p. 461–79.
  • 15. Jones JJB, Wolf B, Mills HA. Plant analysis handbook. A practical sampling, preparation, analysis, and interpretation guide: Micro-Macro Publishing, Inc.; 1991.
  • 16. Aitchison J. The statistical analysis of compositional data. London: Chapman and Hall; 1986.
  • 17. Dumenil LC. Relationship between the chemical composition of corn leaves and yield responses from nitrogen and phosphorus fertilizer. Iowa State University Capstones; 1958.
  • 18. McKenzie RH, Stewart JWB, Dormaar JF, Schaalje GB. Long-term crop rotation and fertilizer effects on phosphorus transformations: I. In a Chernozemic soil. Canadian Journal of Soil Science. 1992;72(4):569–79.
  • 19. McKenzie R. Crop nutrition and fertilizer requirements. Alberta Agriculture, Food and Rural Development Lethbridge. 1998:1–7.
  • 20. Baxter I. Should we treat the ionome as a combination of individual elements, or should we be deriving novel combined traits? Journal of Experimental Botany. 2015;66(8):2127–31. 10.1093/jxb/erv040
  • 21. Pawlowsky-Glahn V, Buccianti A. Compositional data analysis. Theory and applications: A John Wiley & Sons, Ltd, Publication; 2011. 378 p.
  • 22. Tolosana-Delgado R, Van den Boogaart KG. Linear models with compositions in R. In: Pawlowsky-Glahn V, Buccianti A, editors. Compositional data analysis: Theory and applications: New York: John Wiley and Sons; 2011. p. 356–71.
  • 23. Parent LE, Dafir M. A theoretical concept of compositional nutrient diagnosis. Journal of the American Society for Horticultural Science. 1992;117(2):239–42.
  • 24. de Deus JAL, Neves JCL, Correa MCD, Parent SE, Natale W, Parent LE. Balance design for robust foliar nutrient diagnosis of "Prata" banana (Musa spp.). Scientific Reports. 2018;8:1–7. 10.1038/s41598-017-17765-5
  • 25. Nicolas O, Charles MT, Jennie S, Toussaint V, Parent SE, Beaulieu C. The ionomics of lettuce infected by Xanthomonas campestris pv. vitians. Frontiers in Plant Science. 2019;10:1–10. 10.3389/fpls.2019.00001
  • 26. Melo GW, Rozane DE, Brunetto G, Lattuada DS. Discriminant analysis in the selection of groups of peach cultivars. In: Mimmo T, Pii Y, Scandellari F, editors. VIII International Symposium on Mineral Nutrition of Fruit Crops. Acta Horticulturae. 1217; 2018. p. 335–42.
  • 27. Prater C, Scott DE, Lance SL, Nunziata SO, Sherman R, Tomczyk N, et al. Understanding variation in salamander ionomes: A nutrient balance approach. Freshwater Biology. 2019;64(2):294–305.
  • 28. Leite MLC, Prinelli F. A compositional data perspective on studying the associations between macronutrient balances and diseases. European Journal of Clinical Nutrition. 2017;71(12):1365–9. 10.1038/ejcn.2017.126
  • 29. Leite MLC. Applying compositional data methodology to nutritional epidemiology. Statistical Methods in Medical Research. 2016;25(6):3057–65. 10.1177/0962280214560047
  • 30. Westermann DT, Davis JR. Potato nutritional management changes and challenges into the next century. American Potato Journal. 1992;69(11):753–67.
  • 31. Mills HAJJ Walsh LMB, James D, CottenieE A, Faithfull NT, Larrahondo JE, et al. Plant analysis handbook II: a practical preparation, analysis, and interpretation guide: Potash and Phosphate Institute; 1996.
  • 32. Hahsler M, Piekenbrock M, Arya S, Mount D. dbscan: Density based clustering of applications with noise (DBSCAN) and related algorithms. R package version 1.1–3. 2017.
  • 33. James G, Witten D, Hastie T, Tibshirani R. An introduction to statistical learning-with applications in R. New York, NY: Springer; 2013.
  • 34. Liaw A, Wiener M. Classification and regression by randomForest. R News. 2002;2(3):18–22.
  • 35. Breiman L. Manual on setting up, using, and understanding random forests v3.1. Statistics Department University of California Berkeley, CA, USA: 2002;1:58.
  • 36. Swets JA. Measuring the accuracy of diagnostic systems. Science. 1988;240(4857):1285–93. 10.1126/science.3287615
  • 37. Aitchison J, Ng KW. The role of perturbation in compositional data analysis. Statistical Modelling. 2005;5(2):173–85.
  • 38. Monna F, Marques AN, Guillon R, Losno R, Couette S, Navarro N, et al. Perturbation vectors to evaluate air quality using lichens and bromeliads: a Brazilian case study. Environmental Monitoring and Assessment. 2017;189(11).
  • 39. Egozcue JJ, Pawlowsky-Glahn V. Simplicial geometry for compositional data. In: Buccianti A, Mateu-Figueras G, Pawlowsky-Glahn V, editors. Compositional Data Analysis in the Geosciences: From Theory to Practice. Geological Society Special Publication; 264; 2006. p. 145–59.
  • 40. Parent SE. Why we should use balances and machine learning to diagnose ionomes. Authorea [Internet]. 2020. Available from: https://www.authorea.com/users/23640/articles/281937-why-we-should-use-balances-and-machine-learning-to-diagnose-ionomes.
  • 41. Hron K. Analytical representation of ellipses in the Aitchison geometry and its application. Acta Universitatis Palackianae Olomucensis Facultas Rerum Naturalium Mathematica. 2009;48(1):53–60.
  • 42. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria: 2019.
  • 43. Van den Boogaart KG, Raimon T, Bren M. compositions: compositional data analysis. R package version 1.40–1. 2014.
  • 44. Filzmoser P, Hron K. Robust statistical analysis. Chapter 5. In: Pawlowsky-Glahn V, Buccianti A, editors. Compositional Data Analysis: Theory and Applications: John Wiley and Sons, New York, NY; 2011. p. 59–72.
  • 45. Filzmoser P, Gschwandtner M. mvoutlier: Multivariate Outlier Detection Based on Robust Methods. R package version 2.0.9. 2018.
  • 46. Dray S, Dufour AB. The ade4 package: implementing the duality diagram for ecologists. Journal of Statistical Software. 2007;22(4):1–20.
  • 47. Kuhn M, Wing J, Weston S, Williams A. Caret package: classification and regression training. Journal of Statistical Software. 2008;28(5):1–26.
  • 48. Statistics Canada. Area, production and farm value of potatoes. 2017. Available from: https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=3210035801&pickMembers%5B0%5D=1.6&request_locale=en.
  • 49. Parent LE, Cambouris AN, Muhawenimana A. Multivariate diagnosis of nutrient imbalance in potato crops. Soil Science Society of America Journal. 1994;58(5):1432–8.
  • 50. Zebarth BJ, Karemangingo C, Scott P, Savoie D, Moreau G. Nitrogen management for potato: general fertilizer recommendations. New-Brunswick Ministry of Agriculture, Fisheries and Aquaculture, Fredericton, NB, Canada. 2007.
  • 51. Huang XY, Salt DE. Plant Ionomics: From Elemental Profiling to Environmental Adaptation. Molecular Plant. 2016;9(6):787–97. 10.1016/j.molp.2016.05.003
  • 52. Hochmuth GJ, Maynard D, Vavrina C, Hanlon E, Simonne E. Plant tissue analysis and interpretation for vegetable crops in Florida. 2004. p. 1–48.
  • 53. Cottenie A. Soil and plant testing as a basis of fertilizer recommendations. FAO Soils Bulletin. 1980;38(2):1–118.
  • 54. White PJ, Wheatley RE, Hammond JP, Zhang K. Minerals, soils and roots. Potato Biology and Biotechnology: Elsevier; 2007. p. 739–52.
  • 55. Giletto CM, Echeverría HE. Critical nitrogen dilution curve in processing potato cultivars. American Journal of Plant Sciences. 2015;6(19):3144–56.
  • 56. Natale W, Lima Neto AJ, Rozane DE, Parent L-É, Corrêa MCM. Mineral nutrition evolution in the formation of fruit tree rootstocks and seedlings. Revista Brasileira de Fruticultura. 2018;40(6):(e-133).
  • 57. Legendre P, Legendre L. Cluster analysis. In: Legendre P, Legendre L, editors. Developments in environmental modelling Numerical ecology. 24: Elsevier; 2012. p. 337–424.
  • 58. Borcard D, Gillet F, Legendre P. Numerical ecology with R: Springer; 2018.
  • 59. Andrews M, Raven JA, Lea PJ. Do plants need nitrate? The mechanisms by which nitrogen form affects plants. Annals of Applied Biology. 2013;163(2):174–99.
  • 60. Sikora FJ, Howe PS, Hill LE, Reid DC, Harover DE. Comparison of colorimetric and ICP determination of phosphorus in Mehlich3 soil extracts. Communications in Soil Science and Plant Analysis. 2005;36(7–8):875–87.
  • 61. Ivanov K, Zaprjanova P, Angelova V, Bekjarov G, Dospatliev L, editors. ICP determination of phosphorous in soils and plants. 19th World Congress of Soil Science, Soil Solutions for a Changing World; 2010.
  • 62. Adesanwo OO, Ige DV, Thibault L, Flaten D, Akinremi W. Comparison of Colorimetric and ICP Methods of Phosphorus Determination in Soil Extracts. Communications in Soil Science and Plant Analysis. 2013;44(21):3061–75.
  • 63. Valkama E, Uusitalo R, Ylivainio K, Virkajarvi P, Turtola E. Phosphorus fertilization: a meta-analysis of 80 years of research in Finland. Agriculture Ecosystems & Environment. 2009;130(3–4):75–85.
  • 64. Hollander M, Wolfe DA, Chicken E. Nonparametric statistical methods. Third ed. Hoboken, New Jersey: John Wiley & Sons, Inc; 2013. 837 p.
  • 65. Stalham MA, Allen EJ, Herry FX. Effects of soil compaction on potato growth and its removal by cultivation. Research review. 2005(R261):1–60.
  • 66. Boiteau G, Goyer C, Rees HW, Zebarth BJ. Differentiation of potato ecosystems on the basis of relationships among physical, chemical and biological soil parameters. Canadian Journal of Soil Science. 2014;94(4):463–76.
  • 67. Zebarth BJ, Leclerc Y, Moreau G, Botha E. Rate and timing of nitrogen fertilization of Russet Burbank potato: Yield and processing quality. Canadian Journal of Plant Science. 2004;84(3):855–63.
  • 68. Rich AE. Potato diseases. New York: Academic Press; 1983. xiv, 238 p.
  • 69. Herman DJ, Knowles LO, Knowles NR. Heat stress affects carbohydrate metabolism during cold-induced sweetening of potato (Solanum tuberosum L.). Planta. 2017;245(3):563–82. 10.1007/s00425-016-2626-z
  • 70. Parent SE, Parent LE, Rozane DE, Hernandes A, Natale W. Nutrient balance as paradigm of plant and soil chemometrics. Chapter 4. In: Issaka RN, editor. Soil Fertility: Tech Publ, NY; 2012. p. 83–114.
  • 71. Kuhn M, Johnson K. Applied predictive modeling. New York, NY: Springer; 2013. Available from: 10.1007/978-1-4614-6849-3.
  • 72. Brownlee J. Imbalanced classification with Python: better metrics, balance skewed classes, cost-sensitive learning. Machine Learning Mastery; 2020. 463 p.
  • 73. Campbell CR. Reference sufficiency ranges for plant analysis in the southern region of the United States. SAAESD, editor: SAAESD; 2000. 134 p.
  • 74. Rozane DE, Mattos Junior Dd, Parent SE, Natale W, Parent LE, editors. Compositional meta-analysis of citrus varieties in the state of São Paulo, Brazil. 4th International Workshop on Compositional Data Analysis; 2011; Sant Feliu de Guíxols, Girona, Spain.
  • 75. Rozane DE, Mattos D, Parent SE, Natale W, Parent LE. Meta-analysis in the selection of groups in varieties of citrus. Communications in Soil Science and Plant Analysis. 2015;46(15):1948–59.

Decision Letter 0

Paul Esker

27 Dec 2019

PONE-D-19-27444

Cultivar-specific nutritional status of potato (Solanum tuberosum L.) crops

PLOS ONE

Dear Dr. Parent,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by Feb 10 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Paul Esker

Academic Editor

PLOS ONE

Additional Editor Comments:

This paper presents a machine learning approach to modeling tuber yields as a function of foliar ionomes. The concept is interesting and the database extensive considering time and cultivar diversity. Overall, the paper does add something new to the literature, but as indicated by the primary reviewer, revisions are required before the article is ready for publication. I would like to apologize for the delay in returning this review since there were challenges with finding reviewers, as well as having several situations where the reviewer was released from finalizing the review due to non-response. Nonetheless, both the primary reviewer and I are in agreement regarding areas for improvement for this manuscript.

In my case, I was somewhat confused by the description of how trials were selected and what the yield range was: in the methods, trials that yielded less than 28 Mg ha⁻¹ were dropped from inclusion, yet the low-yielding group (high versus low) averaged 24.8 Mg ha⁻¹. I assume this means that within the selected trials, there were still many low-yielding cultivars, correct?

Also, the sample sizes by cultivar were quite variable and, in some cases, there appeared to be very few observations of one class or the other in the supplementary material. As indicated by Reviewer 1, I think it is important to provide further details or context, since this may partially explain the high variation in fit by cultivar, which is not necessarily addressed well in the discussion.

Some more specific observations include:

Lines 229-231: Confusing statement

Line 269 (and in other statements), the citation style was rather odd, with a double mention of the author.

Lines 300-304: Confused with pre-selection procedure - also ties in to the question that Reviewer 1 had for lines 305-307.

Lines 319-323: Seems like a transition is missing to connect the different thoughts.

Journal Requirements:

When submitting your revision, we need you to address these additional requirements:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.plosone.org/attachments/PLOSOne_formatting_sample_main_body.pdf and http://www.plosone.org/attachments/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating in your Funding Statement:

"ZC is partly funded by the Natural Sciences and Engineering Council of Canada (CRDPJ 385199-09 and DG-2254), the Quebec Ministry of Agriculture, Fisheries and Food (IA216581), Centre SEVE, Patate Dolbec Inc., Groupe Gosselin FG, Agriparmentier Inc., Ferme Daniel Bolduc Inc., Patate Laurentienne, Ferme Bergeron-Niquet, and Patates Lac-St-Jean. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

a. Please provide an amended statement that declares *all* the funding or sources of support (whether external or internal to your organization) received during this study, as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now.  Please also include the statement “There was no additional external funding received for this study.” in your updated Funding Statement.

b. Please include your amended Funding Statement within your cover letter. We will change the online submission form on your behalf.

3. Thank you for stating the following in the Competing Interests section:

'The authors have declared that no competing interests exist.'

We note that you received funding from commercial sources: Patate Dolbec Inc., Groupe Gosselin FG, Agriparmentier Inc., Ferme Daniel Bolduc Inc.

a. Please provide an amended Competing Interests Statement that explicitly states this commercial funder, along with any other relevant declarations relating to employment, consultancy, patents, products in development, marketed products, etc.

Within this Competing Interests Statement, please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests).  If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

b. Please include your amended Competing Interests Statement within your cover letter. We will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

4. We note that Figure 1 in your submission contains map images which may be copyrighted.

All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For these reasons, we cannot publish previously copyrighted maps or satellite images created using proprietary data, such as Google software (Google Maps, Street View, and Earth). For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (a) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (b) remove the figures from your submission:

a.    You may seek permission from the original copyright holder of Figure(s) [#] to publish the content specifically under the CC BY 4.0 license.

We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:

“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”

Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission.

In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

b.    If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

The following resources for replacing copyrighted map figures may be helpful:

USGS National Map Viewer (public domain): http://viewer.nationalmap.gov/viewer/

The Gateway to Astronaut Photography of Earth (public domain): http://eol.jsc.nasa.gov/sseop/clickmap/

Maps at the CIA (public domain): https://www.cia.gov/library/publications/the-world-factbook/index.html and https://www.cia.gov/library/publications/cia-maps-publications/index.html

NASA Earth Observatory (public domain): http://earthobservatory.nasa.gov/

Landsat: http://landsat.visibleearth.nasa.gov/

USGS EROS (Earth Resources Observatory and Science (EROS) Center) (public domain): http://eros.usgs.gov/#

Natural Earth (public domain): http://www.naturalearthdata.com/

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In lines 305-307, the authors state that the sample size is small and results should be carefully interpreted. It would be misleading to the reader to indicate any relationship or conclusions at this stage until further investigation/research is done to explore such implications from this data.

Line 25-26, “The scarcity of data, in particular for new cultivars, constrains to group cultivars into maturity groups.” Awkward sentence – consider revision.

Line 51 “In particular, foliar gene expression….” Incomplete sentence- revise grammar.

Line 109, Can the maturity classes be further described by their range of days from planting to maturity?

Line 113-114, If I understand the protocol correctly, this leaf sampling was taken at different times throughout the growing season due to the differences in maturity of the potato cultivars?

Line 115-116, “ground to less than 1mm” what does that mean exactly? The plant material was ground to less than 1mm particle diameter? Just curious.

Line 265-266, “…information additional to maturity grouping is needed to assess nutrient requirements of potato cultivars.” Can you provide some additional considerations and why? Are they practical?

Line 278-280, The paragraph starts by describing the variation in cultivars and foliar nutrient profiles, and it would help to have more of a transition explaining to the reader how this variation could be explained through the clustering process. Sentence 278-280 is a stark transition in thought. Consider the addition of another sentence to guide the reader.

Line 290, could the different methodology of quantifying P (colorimetry/ICP) have contributed to the insignificance of its content in predicting tuber yield classes?

Lines 305-307, Expand on why the predictive accuracy was very high for some cultivars but not for others. Would these potential factors need to be considered in future yield prediction models?

Lines 367-369, could the perturbation vector of leaf compositional space assist with correcting for in-season nutrient imbalances (per cultivar) to improve yield potential?

Conclusion section –

This section needs further expansion and discussion. What are the implications of the study on potato fertility and management? What about future directions and next steps? Does this research provide a positive direction in precision potato production in Canada? Does this mean that cultivar-specific nutrition recommendations may need revision or can be more precise in the future? What are the economic implications of this research for the potato industry? Are there any?

Why use potato? Has there been significant background research already conducted in this species or has there been cultivar-specific fertility variability that has led to the investigations with potato? Has there been economic impact of varied fertility regimes on potato cultivars that more precise fertility recommendations based on this model could address?

Could the dataset have been more robust if collected from other geographical sites?

Would it be possible to do some similar analyses with select potato cultivars grown in very controlled conditions and compare these results also to the analyses here?

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

Decision Letter 1

Paul Esker

2 Mar 2020

Cultivar-specific nutritional status of potato (Solanum tuberosum L.) crops

PONE-D-19-27444R1

Dear Dr. Parent,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

Paul Esker

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Thank you for taking the time and energy to consider the reviewer comments. The manuscript reads very well and I am satisfied that it can be moved along for publication.

Reviewers' comments:

None required.

Acceptance letter

Paul Esker

3 Mar 2020

PONE-D-19-27444R1

Cultivar-specific nutritional status of potato (Solanum tuberosum L.) crops

Dear Dr. Parent:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Paul Esker

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    S1 Table. Quebec potato leaves ionome data set.

    raw_leaf_df.csv file available online in data repository at https://git.io/Jvt2r.

    (CSV)

    S2 Table. Potato data set used for cluster analysis.

    (DOCX)

    S3 Table. True negatives mean clr values for cultivars.

    (DOCX)

    Attachment

    Submitted filename: Response to Reviewers.docx

    Data Availability Statement

    All relevant data are within the paper and its Supporting Information files. There is no restriction on sharing of data and/or materials.


    Articles from PLoS ONE are provided here courtesy of PLOS
