Abstract
The Atlantic forest is one of the world's major tropical biomes due to its rich biodiversity. Its vast diversity of plant species poses challenges for floristic surveys. Fourier transform infrared (FTIR) spectroscopy enables rapid and residue-free data collection, with diverse applications in the analysis of organic samples. The quality of FTIR spectra depends on the sample preparation methodology. However, no research on FTIR spectroscopy methodology for taxonomy has been conducted with tropical tree species. Hence, this study addresses the influence of sample preparation on FTIR spectra for the taxonomic classification of 12 tree species collected in the Serra do Mar State Park (PESM), Cunha Nucleus, São Paulo State, Brazil. Spectra were obtained from intact fresh (FL), intact dried (DL), and heat-dried ground (GL) leaves. The spectra were evaluated through chemometrics using Principal Component Analysis (PCA), Hierarchical Cluster Analysis (HCA), and Linear Discriminant Analysis (LDA), with validation by LDA-PCA. The results demonstrate that sample preparation directly influences how well the FTIR spectra of tropical species can be classified. The best taxonomic classification result for all techniques, validated by LDA-PCA, was obtained from GL. Evaluating FTIR spectra through PCA, HCA, and LDA allows phylogenetic relationships among the species to be observed. FTIR spectroscopy proves to be a viable technique for the taxonomic evaluation of tree species in floristic exploration of tropical biomes, and it can complement traditional tools used in taxonomic studies.
Keywords: FTIR, Plant taxonomy, Multivariate analysis, Atlantic forest, Tropical trees
Highlights
• The sample preparation technique directly influences how well the FTIR spectra of tropical trees can be classified.
• FTIR spectra obtained from heat-dried ground (GL) leaves can be taxonomically classified by PCA, HCA and LDA-PCA.
• The FTIR technique is a viable complementary alternative for tropical tree taxonomy research.
1. Introduction
The Atlantic forest is considered one of the main biodiversity hotspots due to its high number of species, high endemism, and extensive degradation [1,2]. The Serra do Mar State Park (PESM) is one of the last remnants of primary native forests in the Atlantic Forest biome in São Paulo state, encompassing submontane, montane, and high-montane forest formations, which harbor a great diversity of tree species [3]. Floristic surveys conducted in primary and secondary forest areas in PESM have revealed a diversity of 100–200 tree species per hectare and 562 species of higher plants in the altitudinal gradient from 10 to 1100 m [4–6].
Consequently, due to the high biodiversity of the Atlantic Forest, taxonomic identification of tree species represents a challenge in fieldwork [3]. Taxonomic knowledge is essential to understand the ecological dynamics of species that can be reproduced in reforestation projects in the Atlantic Forest biome. This identification traditionally requires a deep knowledge of botany and usually involves the complete collection of leaves, flowers, and fruits of the species for accurate identification through exsiccates [3,6]. On the other hand, chemical identification methodologies, such as spectroscopic techniques, have the advantage of being objective and quantitative and do not require the complete collection of plant samples: tissue fragments such as leaves are sufficient for the evaluation [7–9].
Among spectroscopic techniques, Fourier Transform Infrared Vibrational Spectroscopy with Attenuated Total Reflectance (FTIR-ATR) stands out as an effective tool for the analysis of plant organic material, aiming at sample classification [10–12]. FTIR is a non-invasive collection technique that does not generate laboratory waste, is fast and low-cost, and allows the characterization and quantification of organic functional groups that reflect the biochemical composition of plants [8–10].
Due to the high versatility of the technique, FTIR-based studies have been applied in various areas of plant research. Examples include taxonomic identification of species using leaf samples [7,10,13], pollen [12,14,15], and roots [16]; classification of invasive species hybrids [17]; study of the influence of different land uses on grasses [18]; evaluation of plant-environment-atmosphere interaction [14,19,20]; influence of environmental pollution on pollen composition [21]; change in plant biochemical composition after disease infection [22,23]; influence of toxic soils on plants [24]; evaluation of hydrogel use as plant substrate [25]; and evaluation of species with pharmaceutical potential through in situ collection of spectra [8], among others.
The quality of the obtained FTIR spectrum depends on various factors, such as the field sample collection process, sample storage, sample preparation method, laboratory environmental conditions during spectra collection, spectral processing methods, and the statistical methodology used to analyze the data [9,27]. Most of the research on plant material using FTIR spectroscopy has been conducted with temperate region plants [7,10,12,13]. It is also important to consider that the distance between field sites and laboratories may make it necessary to analyze dried and ground samples [9].
Consequently, it is necessary to evaluate sample preparation methodologies for the FTIR analysis of tropical species. Tropical areas present a greater temperature range, biodiversity, and morphological diversity than temperate areas, which may have a significant impact on the FTIR collection and sampling process. Therefore, the present study aims to evaluate which sample preparation methodology allows for the best taxonomic classification of the FTIR spectra of twelve Atlantic Forest tree species by the chemometric techniques of Principal Component Analysis (PCA), Hierarchical Cluster Analysis (HCA), and Linear Discriminant Analysis (LDA).
2. Methodology
2.1. Species selection
Twelve native species were selected from the PESM in the municipality of Cunha, São Paulo state, Brazil (23°14′22.05″S; 45°1′26.05″W) (Appendix A. Supplementary Material), namely: Araucaria angustifolia (Aa); Campomanesia guaviroba (Cb); Guapira opposita (Go); Inga sessilis (Is); Myrsine gardneriana (Mg); Myrsine lineata (Ml); Myrsine umbellata (Mu); Nectandra lanceolata (Nl); Psychotria suterella (Ps); Schinus terebinthifolia (St); Senna multijuga (Sm); Sapium glandulosum (Sg). The selection made at the collection site was based on the list of species recommended for reforestation in the Atlantic Forest biome in the state of São Paulo, and the selected species represent the most common species per family found in floristic surveys in the PESM [3–5,26]. Araucaria angustifolia was also evaluated because it is a representative threatened species found in the Cunha PESM nucleus [27].
2.2. Sample collection and storage
Leaf samples of the twelve species were collected in the field during the summer season between 10:00 a.m. and 2:00 p.m., as this period is ideal to avoid differences caused by the plants' circadian cycle [28]. The leaves were collected from the lower part of the tree canopy. Because leaf diseases can alter the FTIR spectroscopic signal and compromise the final results, only healthy leaves were selected from the collected samples [22,23]. After collection, the fresh leaf samples were stored in plastic bags to prevent water loss during transportation to the laboratory, which occurred on the same day as collection. In the laboratory, the samples were kept in a refrigerator at 10 °C; they were not frozen prior to sampling, as has been done in research with temperate region species [9].
2.3. Sample preparation
For comparison purposes, three different types of sample preparation were performed: i) intact fresh leaves, in natura (FL); ii) intact oven-dried leaves (DL); and iii) heat-dried leaves ground in a mill (GL). FL stored for a maximum of 24 h after collection were evaluated directly on the spectrophotometer. The samples were not hydrated in the laboratory prior to FTIR spectroscopy analysis [9], because the objective was to observe the spectral characteristics of FL as close as possible to field conditions, taking into account the time between collection and laboratory analysis.
For the analysis of DL, the FL samples were placed in an oven at 60 °C for 72 h to remove moisture [9]. The same leaf specimens were evaluated in the DL category and, after processing, in the GL group. The DL spectra were acquired from spots adjacent to those previously measured for FL, to avoid the effects of the pressure applied during the earlier measurement. Excessively high temperatures were avoided during the drying step so as not to compromise the biochemical composition of the plants [29]. The spectra of FL and DL were collected on the abaxial side at three points per leaf [9,25]. Spectra were collected from 6 leaves per plant, totaling 18 FTIR spectra per species.
The selected DL were ground using an IKA A11 basic analytical mill and a pestle for complete pulverization of the samples, and then sifted through a 10-mesh sieve. The finer the particle size of the analyzed material, the better the contact with the spectrophotometer sample holder and, consequently, the more intense the FTIR spectrum signal [9]. As in the FL and DL analyses, the GL samples were analyzed directly on the ATR sample holder, and the same pressure was applied to all samples.
2.4. Preprocessing of FTIR spectra
The 216 analyzed spectra were obtained in mid-infrared absorbance with FTIR in the range of 4000 cm−1 to 450 cm−1, with a resolution of 4 cm−1, 32 scans at room temperature, and a data spacing of 2 cm−1 on the Bruker Optik GmbH Alpha II FTIR spectrophotometer equipped with a diamond crystal ATR (Appendix B. Supplementary Material). The spectra were preprocessed using the Bruker OPUS 8.5 software, including baseline correction by the rubber band method and spectral smoothing using the Savitzky-Golay algorithm (9 points) [30].
During preprocessing, the spectra were also normalized in the range of 1690 cm−1 to 1620 cm−1, which showed the least variation among the different evaluated species. This range corresponds to Amide I, related to plants’ structural proteins [10,11,31]. The analysis of the spectra focused on the fingerprint region of the samples (1770 cm−1 - 700 cm−1).
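The preprocessing chain described above was run in Bruker OPUS, whose internal implementation is not given here. As an illustration only, a comparable chain can be sketched in Python with NumPy and SciPy. The function names, the polynomial order of the Savitzky-Golay filter (2), and the choice to normalize by the maximum absorbance inside the Amide I window are assumptions, since the text specifies only the 9-point window and the normalization range.

```python
import numpy as np
from scipy.signal import savgol_filter


def rubber_band_baseline(x, y):
    """Baseline as the lower convex hull of the spectrum (x must be ascending)."""
    hull = []  # indices of the points that form the lower hull (monotone chain)
    for i in range(len(x)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
            if cross <= 0:  # point b lies on or above the chord a -> i: discard it
                hull.pop()
            else:
                break
        hull.append(i)
    return np.interp(x, x[hull], y[hull])


def preprocess(wavenumbers, absorbance, window=9, polyorder=2,
               norm_range=(1620.0, 1690.0), fingerprint=(700.0, 1770.0)):
    """Baseline-correct, smooth, normalize and crop one spectrum."""
    order = np.argsort(wavenumbers)              # instruments often export 4000 -> 450
    x = np.asarray(wavenumbers, dtype=float)[order]
    y = np.asarray(absorbance, dtype=float)[order]
    y = y - rubber_band_baseline(x, y)           # rubber band baseline correction
    y = savgol_filter(y, window_length=window, polyorder=polyorder)  # 9-point smoothing
    in_band = (x >= norm_range[0]) & (x <= norm_range[1])
    y = y / y[in_band].max()                     # assumed: scale Amide I maximum to 1
    keep = (x >= fingerprint[0]) & (x <= fingerprint[1])
    return x[keep], y[keep]
```

Calling `preprocess(wavenumbers, absorbance)` on one exported spectrum returns the fingerprint-region wavenumbers in ascending order together with the corrected, smoothed and normalized absorbance.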
The application of the first derivative in FTIR spectra is widely used in plant research [9,10]. However, it was observed that in the spectra of the tropical species evaluated in this study, there was amplification of noise in the fingerprint area. Therefore, the spectra were analyzed without that procedure.
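The noise amplification mentioned above can be illustrated with a purely synthetic example. The sketch below uses a single made-up Gaussian band with added random noise (not data from this study) and compares the signal-to-noise ratio before and after a finite-difference first derivative. The band position, width, and noise level are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(700.0, 1772.0, 2.0)                    # fingerprint region, 2 cm-1 spacing
clean = np.exp(-0.5 * ((x - 1030.0) / 20.0) ** 2)    # one synthetic absorption band
noisy = clean + rng.normal(scale=0.01, size=x.size)  # add small random noise


def snr(signal_clean, signal_noisy):
    """Peak signal amplitude divided by the standard deviation of the noise."""
    return np.abs(signal_clean).max() / np.std(signal_noisy - signal_clean)


snr_raw = snr(clean, noisy)
snr_deriv = snr(np.gradient(clean, x), np.gradient(noisy, x))
print(f"SNR of spectrum: {snr_raw:.1f}; SNR of first derivative: {snr_deriv:.1f}")
```

Differentiation scales down the broad band more than it scales down the point-to-point noise, so the derivative has a markedly lower signal-to-noise ratio than the original spectrum.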
2.5. Statistical analysis of FTIR spectra
One of the crucial statistical issues in multivariate analysis is the presence of collinearity in the data. Although this factor is not prevalent in FTIR data, where the variables are measured independently, the preprocessing procedure can induce collinearity [8]. Therefore, prior to PCA, the data were mean-centered using the function ‘pca.fit_transform’ in a Python script run in the Jupyter environment, and the Variance Inflation Factor (VIF) was calculated to check whether the dataset was affected by multicollinearity.
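The script used for the VIF calculation is not reproduced in the text. A minimal sketch of one common way to compute it is given below, using the identity that the VIF of variable j equals the j-th diagonal element of the inverse correlation matrix, which is the same as 1/(1 − R²) from regressing that variable on the others. The function name is hypothetical. The sketch also assumes a reduced set of variables (for example selected bands or component scores), because a full spectrum has more wavenumbers than samples and its correlation matrix cannot be inverted.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF of each column of X (rows = samples, columns = variables).

    VIF_j is the j-th diagonal element of the inverse correlation matrix.
    Requires more samples than variables and no perfectly collinear columns.
    """
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)   # variables are in columns
    return np.diag(np.linalg.inv(corr))
```

Values close to 1 indicate little collinearity. A common rule of thumb treats values above 5 or 10 as a warning sign, although the threshold applied in this study is not stated.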
PCA is a multivariate statistical method that reduces a data matrix by combining the original variables into uncorrelated indices ordered by the variance they explain [32]. The coefficients (loadings) determine how much each original variable contributes to each principal component extracted in PCA, and the combination of the original data weighted by the coefficients determines the scores [33]. The PCA scores result from a linear transformation of the source data and provide information about the distribution of the samples, helping to reveal existing patterns in the original data [33]. Therefore, PCA was conducted to observe patterns in the spectral data. The score and loading plots were produced using the Spectroscopy Data PCA package (v1.30) in the software OriginPro 2022.
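The relation between scores and loadings described above can be sketched with scikit-learn instead of OriginPro, which is a substitution made here for illustration only. The function name is hypothetical, and the loadings are taken as the unit-length component vectors (`components_`); some packages scale loadings by the square root of the eigenvalues, so the numbers may differ from those plotted by other software.

```python
import numpy as np
from sklearn.decomposition import PCA


def pca_scores_loadings(spectra, n_components=2):
    """Return scores, loadings and explained variance ratio for a spectra matrix.

    spectra: array of shape (n_spectra, n_wavenumbers).
    """
    pca = PCA(n_components=n_components)
    scores = pca.fit_transform(spectra)   # data are mean-centered internally
    loadings = pca.components_            # shape (n_components, n_wavenumbers)
    return scores, loadings, pca.explained_variance_ratio_
```

The scores are the mean-centered spectra projected onto the loadings, so `scores` equals `(spectra - spectra.mean(axis=0)) @ loadings.T` up to floating-point error.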
HCA is a multivariate data analysis technique that allows the use of variable values to establish a grouping scheme for objects (samples) into similar classes in order to observe the formation of groups [32]. Therefore, to observe the similarity relationships in the fingerprint region (1770 cm−1 - 700 cm−1) of the FTIR spectra of the species, Hierarchical Cluster Analysis (HCA) was also performed in OriginPro 2022 by using Euclidean distance and the Ward method for each of the sample treatments [17].
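The clustering settings named above (Euclidean distance with the Ward method) can be sketched with SciPy rather than OriginPro, again as an illustration and not as the procedure actually run. The function name and the choice to cut the tree into a fixed number of clusters are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def ward_clusters(spectra, n_clusters):
    """Ward linkage on Euclidean distances, cut into at most n_clusters groups.

    spectra: array of shape (n_spectra, n_wavenumbers).
    Returns the linkage matrix and one cluster label per spectrum.
    """
    Z = linkage(spectra, method="ward", metric="euclidean")
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    return Z, labels
```

The linkage matrix `Z` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree. The similarity percentages reported in the dendrograms of this study come from OriginPro and are not reproduced by this sketch.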
LDA takes into account the classes or categories of the source data and seeks a projection that maximizes the separation between classes relative to the variability within each class, which makes it suited to supervised evaluation [32]. The validation parameters of LDA-PCA indicate how well a dataset can be classified. Accuracy measures the proportion of correct predictions over all data in the model, specificity measures the ability to correctly identify negative cases, recall measures sensitivity to positive cases, precision measures how accurate the positive predictions are, and the F1-score combines recall with precision for a comprehensive analysis of the false positives and false negatives that influence the model [34].
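The quality parameters defined above can all be derived from a confusion matrix. The sketch below shows the arithmetic with NumPy for a multi-class case treated one class against the rest. The function name is hypothetical and the three-class matrix is a made-up example, not a result of this study.

```python
import numpy as np


def classification_metrics(cm):
    """Metrics from a confusion matrix (rows = true class, columns = predicted)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                  # correctly predicted members of each class
    fn = cm.sum(axis=1) - tp          # members of the class predicted as another
    fp = cm.sum(axis=0) - tp          # other classes predicted as this class
    tn = cm.sum() - tp - fn - fp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": tp.sum() / cm.sum(),   # single overall value
        "specificity": tn / (tn + fp),     # one value per class
        "recall": recall,
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Made-up example: 3 classes with 18 spectra each, one spectrum misclassified.
example = [[18, 0, 0],
           [0, 17, 1],
           [0, 0, 18]]
metrics = classification_metrics(example)
```

In this example one spectrum of the second class is predicted as the third class, which lowers the recall of the second class to 17/18 and the precision of the third class to 18/19, while the overall accuracy is 53/54. A class that is never predicted would cause a division by zero in the precision, which the sketch does not handle.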
Therefore, LDA was applied to the components obtained in PCA to reduce the dimensionality of the data, remove correlation, and classify the clusters based on classes [32]. Cross-validation was conducted using the cross_val_score function of the scikit-learn library in Python, with the dataset split into 10 folds, and the parameters describing the model's ability to classify the data were calculated for all sample treatments [35].
The first 4 principal components were used for validation with LDA-PCA, as they were statistically sufficient to explain the variance of the data without underfitting or overfitting. Cross-validation gives a more reliable estimate of classification performance than a single train/test split, which may induce bias, especially in small datasets [36,37]. To assess the predictive ability of the taxonomic classification, the cross-validation test was conducted by calculating the quality parameters and plotting the confusion matrices for the different sample treatments evaluated.
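The validation described above (LDA on the first four principal components, 10-fold cross-validation, confusion matrix) can be sketched with scikit-learn as follows. The original script is not shown in the text, so the function name, the use of a pipeline that refits PCA inside each fold, and the shuffled stratified folds with a fixed seed are assumptions made for this illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import (StratifiedKFold, cross_val_predict,
                                     cross_val_score)
from sklearn.pipeline import make_pipeline


def lda_pca_cross_validate(spectra, species, n_pcs=4, n_folds=10, seed=0):
    """10-fold cross-validated accuracy and confusion matrix for LDA on PCA scores.

    spectra: array (n_spectra, n_wavenumbers); species: one label per spectrum.
    """
    # PCA sits inside the pipeline, so it is refit on the training part of each fold.
    model = make_pipeline(PCA(n_components=n_pcs), LinearDiscriminantAnalysis())
    folds = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    fold_accuracy = cross_val_score(model, spectra, species, cv=folds)
    predicted = cross_val_predict(model, spectra, species, cv=folds)
    return fold_accuracy, confusion_matrix(species, predicted)
```

Fitting PCA inside the pipeline keeps the held-out fold out of the dimensionality reduction step. If PCA was instead computed once on the full dataset before cross-validation, the reported scores could be slightly optimistic; the text does not say which variant was used.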
3. Results and discussion
3.1. Spectral peak analysis
There is good reproducibility of species-specific spectra in the three sampling methods evaluated: FL, DL and GL (Fig. 1). The spectral analysis focuses on the FTIR fingerprint region (1770 cm−1 – 700 cm−1), where the main defining molecules of plant samples are found [9–11]. Due to the sample conditions, it was not possible to collect FL spectra of Senna multijuga (Sm).
Fig. 1.
Average FTIR spectra of plant samples fingerprint region (1770 cm−1 – 700 cm−1): A = FL; B = DL and C = GL.
The main peaks observed in the three sampling methods are at 1640 cm−1, corresponding to the stretching of the C=O and C=N bonds of the proteins in Amide I, and at 1030 cm−1, corresponding to the stretching of hydroxyls and carboxyls of polysaccharides [10,11,17,18,38,39].
The grinding process enables a better definition of bands, especially in the region of 1200 cm−1 to 1000 cm−1, which contains saccharides that discriminate among plant species as well as internal structural proteins and lipids of the plant cell wall [8,10–12,40].
The band at 1440 cm−1 also appears with better definition in GL samples, as also observed by Holden [28]. The grinding procedure promotes greater homogenization, reflecting the total biochemical content of the leaves more accurately, compared to FL and DL, where especially the composition of the leaf outer surface is reflected in the FTIR spectrum [11,20,27].
There is a slight variation in peak position among the samples from different treatments; however, no significant alteration in the main peaks is observed. This result is expected since the samples were not subjected to chemical extraction, which can significantly alter the position and intensity of FTIR bands [41]. The tentative main assignments of the spectra collected for the twelve tropical trees are gathered in Table 1.
Table 1.
Peak frequency assignments in the fingerprint region (1770–700 cm−1).
| Wavenumber (cm−1) | Vibration Assignment | Approximate Components | References |
|---|---|---|---|
| 1730 | ʋ of the C=O ester group | Pectin, Polysaccharides (Triglycerides) | [9,10,12,17,21,38] |
| 1670(a) | ʋ of β-turns in the C=O bond | Proteins of Amide I | [17] |
| 1640 | ʋ of the C=O and C–N bonds | Amide I/Proteins | [10,17,18,38,39] |
| 1540 | ʋ of the C=N and N–H bonds | Amide II/Lipids and particularly Proteins | [8,11,12,17] |
| 1460 | δass of the C–H bond in -CH2 and -CH3 groups | Amide III/Cell Wall Proteins | [11,17,21] |
| 1440 | ʋ of carbon bonds in aromatic rings | Lipids/Fatty Acids/Proteins | [17,27] |
| 1370 | δs of the C–H bond in -CH2 and -CH3 groups | Cell Wall Proteins/Lipids | [9,11,17] |
| 1320 | δ of the –CH bond | Hemicellulose/Cellulose | [11] |
| 1240 | ʋ of the C=N, N–H, and C=O bonds | Amide IV/Proteins/Hemicellulose | [7,21] |
| 1150 | ʋass of the C–O and C–N bonds in the –COOH group | Chlorophyll/Cell Wall Polysaccharides | [11,40] |
| 1100 | ʋ of the C–O–C bond in esters and the C–N bond | Polysaccharides/Cutin | [9,10,12,39] |
| 1030 | ʋ of the O–H and C–OH bonds | Polysaccharides/Glucomannan | [10,11] |
| 830(b) | δoop of C–H bonds in rings | Polyphenols | [8,38] |
| 730(c) | δass of the -CH2 bond in-plane “rocking” | Lignin (Rings) | [38] |
Caption: ʋ: Stretching. ʋs: Symmetric stretching. ʋass: Asymmetric stretching. δ: Bending. δs: Symmetric bending. δass: Asymmetric bending. δoop: Out-of-plane bending. (a) present in N. lanceolata; (b) present in P. suterella, M. gardneriana, S. terebinthifolia, and A. angustifolia; (c) present in M. umbellata, P. suterella, M. gardneriana, S. terebinthifolia, S. glandulosum, and A. angustifolia.
Studies exist on the biochemical composition of some of the species evaluated here, especially regarding their applications (Appendix C. Supplementary Material). However, the composition of other species analyzed here remains poorly known and requires further study, for which FTIR spectroscopy can be an important tool to describe biomolecular components (Table 1).
3.2. PCA and HCA
The preliminary analysis of the Variance Inflation Factor (VIF) does not reveal evidence of collinearity in the data, which allows the analysis to proceed with PCA and HCA. Fig. 2 presents the score and loading plots of the PCA and the HCA for each of the three sampling methods evaluated. The score plots represent 89.8%, 85.1% and 79.2% of the total variance of the data for FL (Fig. 2A), DL (Fig. 2D) and GL (Fig. 2G), respectively.
Fig. 2.
Score plot and loadings from PCA with 95% confidence ellipse and HCA graph plot of the fingerprint region (1770 cm−1 – 700 cm−1) of FL (A, B, C), DL (D, E, F) and GL (G, H, I).
The loadings demonstrate that polysaccharides, especially glucomannan (1150 cm−1 - 1030 cm−1), are the main influence on PC1 and PC2 in all sampling methods (Fig. 2B, E and H). This suggests that the spectra reflect the composition of the wax cuticle of the samples, particularly in FL, as also observed by Chen et al. [8] in the analysis of the medicinal plant Lonicera japonica. The polysaccharide pectin (1730 cm−1) is influential in PC2, especially in the intact samples, which also suggests that it may be a major compound of the leaf epidermis [28].
Some species present morphological attributes that may affect the contact between the sample and the spectrophotometer. This was observed in the species Aa and Nl, both of which have coriaceous leaves with a thicker wax cuticle [42], a possible factor in the higher dispersion observed in the HCA of FL and DL (Fig. 2C, F). The coupling of the FL samples of Aa, Sg and the Myrsine species in the HCA suggests a possible chemical similarity among the volatile compounds present in the leaf epidermis of these species [27,43,44]. However, further FTIR studies of FL are needed in the next steps of this research to assess the capacity to identify cuticle compounds.
The FL and DL samples cannot be separated by species, as shown by the high dispersion in the PCA and HCA (Fig. 2A, C, D, F), which corroborates the premise that plant leaf spectra cannot be explained by the cuticle composition and structure alone [30]. Other factors, such as the distance between the field areas and the laboratory, storage, and leaf morphology, can also affect the quality of FTIR spectra of intact leaves [11]. The HCA of FL and DL also indicates a low capability to classify the spectra of intact leaves, where only the species Cb, Mu and Mg present a grouping tendency.
Better classification results with in natura samples have been obtained through in loco spectra collection with a portable FTIR spectrophotometer [8]. However, especially in the context of tropical biomes, collecting data in loco is difficult, as observed during this research, which calls for processing methods such as sample grinding for a better evaluation of the samples.
In the present analysis, the GL samples separate into homogeneous groups according to species in the PCA and HCA (Fig. 2G and I). PCA can successfully separate samples whose features have some linear correlation [45]. The clades in phylogenetic classification establish groups that share the same evolutionary ancestor [46]. The PCA scores of GL indicate a tendency of separation between species representing the Rosids (Cb) and the Asterids (Mu) clades along the PC1 axis [47].
The HCA can classify GL into species clusters with similarity greater than 95%. Clusters of phylogenetically close species are also formed. By family: the Myrsinaceae species Mu and Mg (82.5%) and the Fabaceae species Sm and Is (90%). By clade, among the Asterid families: Ml (Myrsinaceae) and Go (Nyctaginaceae) (95%); Ps (Rubiaceae) with Mu and Mg (Myrsinaceae) (70%); and Mu, Mg, Ps, Ml and Go (47.5%). Among the Rosid families: St (Anacardiaceae) and Sg (Euphorbiaceae) (72.5%); and Cb (Myrtaceae) with Sg (Euphorbiaceae) and St (Anacardiaceae) (60%) [47]. These results support the applicability of FTIR in taxonomic research with tropical plants.
The similarity observed between phylogenetically distant species, especially between Nl and Aa and the remaining species, may be related to similar internal compounds shared by those species that became more prominent after the grinding process [28]. Future studies are needed, however, to clarify the capability of FTIR to evaluate the biomolecular compounds of tropical plants.
3.3. LDA-PCA validation
The combination of LDA with the scores obtained from PCA allows the ability to classify samples by group (label) to be evaluated in a supervised manner [33]. Therefore, LDA-PCA is applied to the different methods evaluated (Fig. 3). The quality parameters for the prediction per species and per method are presented in Fig. 4.
Fig. 3.
LDA-PCA score plots with 95% confidence ellipses (A, C, E) and LDA-PCA prediction confusion matrices (B, D, F) - Fingerprint Region (1770 cm−1 – 700 cm−1): FL (A, B); DL (C, D); GL (E, F).
Fig. 4.
LDA-PCA classification quality parameters for the three different sampling methods (A) and for each species: FL (B); DL (C) and GL (D) – 18 spectra for each species in the cross-validation dataset. Caption: Araucaria angustifolia (Aa); Campomanesia guaviroba (Cb); Guapira opposita (Go); Inga sessilis (Is); Myrsine gardneriana (Mg); Myrsine lineata (Ml); Myrsine umbellata (Mu); Nectandra lanceolata (Nl); Psychotria suterella (Ps); Schinus terebinthifolia (St); Senna multijuga (Sm); Sapium glandulosum (Sg).
The first two LDA-PCA components explain 94.8%, 84.4% and 86.1% of the variance for FL (Fig. 3A), DL (Fig. 3C) and GL (Fig. 3E), respectively. The prediction summarized in the confusion matrix is conducted with the first four PCs to avoid underfitting or overfitting. As also observed in the PCA/HCA, LDA-PCA cannot classify all FL and DL samples by species (Fig. 3A and C). However, some species presented good classification of FL in the LDA, especially Cb, which has membranaceous leaves covered abaxially with tector trichomes that aid humidity retention [48]. That morphological adaptation might be an important factor in the better classification of Cb FL and DL. For FL only Cb and St, and for DL only Go, Mg and Mu, were 100% correctly predicted by the LDA-PCA validation (Fig. 3B, D – Fig. 4B, C).
Factors such as leaf composition and morphology may be altered by the drying process [29] and consequently alter the FTIR signal. The reduction in prediction from FL to DL particularly affected species with membranaceous leaves [42,49], notably St and Ps (Fig. 3B and D). Both species also share biochemical similarities, such as the presence of volatile monoterpenes in the wax cuticle, which might have been lost in DL and affected the prediction [50,51]. Therefore, it is necessary to consider the morphological and biochemical aspects of the species during leaf processing. More research is required to better understand the role of biochemical composition in the classification of FTIR spectra.
However, the LDA-PCA of GL demonstrates a good capacity to separate the samples by species (Fig. 3E), as indicated by the quality parameters (Fig. 4A). LDA-PCA provides a better separation of the species Aa and Nl from the other, phylogenetically distant species with which they overlapped in the PCA (Fig. 2G vs Fig. 3E). The confidence ellipses overlap only between Go and Ml, both species from families of the Asterid clade [47].
The Asterid clade families represented in the samples are closely related phylogenetically [47]. The LDA-PCA components show this in the present study through the similarity of the Asterid species in LD2 (Mg, Mu, Ml, Go and Ps). The Rubiaceae species Ps is more distant from the Asterids of Myrsinaceae and Nyctaginaceae in LD2, which is consistent with the clade phylogeny [47,52].
Despite the broader representation of Rosid families in the samples, proximity between closely related species is observed between the Fabaceae Sm and Is in LD2 and between the Euphorbiaceae Sg and the Fabaceae Sm in LD1. The proximity of these families is also confirmed by the genetic sequencing of species from these orders [52]. St is separated farther from the other Rosid species, which is consistent with the fact that its order, Sapindales, is phylogenetically more distant from the other Rosid orders represented in the samples [46]. The order Myrtales has a closer relationship with the order Sapindales according to genetic sequencing studies of species belonging to these orders [53]. The LDA-PCA here also indicates that the Myrtales species Cb is closer to the Sapindales species St along LD1.
Therefore, the best classification is achieved with GL. In Fig. 3F, it is observed that only one sample of Go was misclassified as Ml, two species belonging to closely related families (Myrsinaceae and Nyctaginaceae) within the Asterid clade [47,54]. This treatment was the only one to achieve 100% in all parameters for the classification of the evaluated tropical tree species (Fig. 4D). Holden [28] also observed that analysis of GL reduces variation between samples, resulting in higher accuracy, specificity, and sensitivity of results in the validation through LDA-PCA, as also demonstrated in the present analysis.
This analysis highlights the importance of defining the sample preparation method when collecting FTIR spectra of tropical plant species. A treatment to homogenize the samples proves necessary. Grinding heat-dried leaves proves to be an efficient method for the classification of FTIR spectra by species, as validated by LDA-PCA analysis. Traditional phylogenetic studies using genetic sequencing are costly, and there are many gaps in understanding the classification of many species [52,53]. Hence, as a rapid technique, FTIR has potential as a complementary technique for phylogenetic studies when combined with the multivariate statistics of PCA, HCA and LDA-PCA cross-validation [9,10,17].
Despite the longer processing time, the analysis of ground samples proves to be reliable for the analysis of FTIR spectra of plant samples in studies where the distance between laboratory and field is unavoidable. Finally, grinding provides insights into the internal composition of plants, which can be important for future studies characterizing the biochemistry of tropical trees for a broad variety of applications, such as botany, pharmacology and environmental studies.
The results obtained here suggest that the taxonomic classification of FTIR spectra of intact in natura leaves is limited under natural field conditions, especially by the time lapse between sample collection and spectra acquisition. FL evaluation will be important to consider how the environment influences the plants; for that reason, future analysis in the field with a portable FTIR spectrophotometer may overcome the distance limitation and improve the results.
Finally, considering that the leaves were collected during the summer season, new studies examining the seasonal impact on the FTIR spectra of tropical plants are needed. The precise analysis capability of FTIR may enable the evaluation of the responses of individual plant specimens to degradation, which can substantiate the use of FTIR spectroscopy in environmental studies, especially in forest recovery projects. These studies are the main future targets of the ongoing research project.
4. Conclusion
Sample collection, transportation, and storage were identified as limiting factors for obtaining classifiable FTIR spectra of tropical trees. The FTIR spectra of the intact leaves (FL, DL) do not contain enough information to support the taxonomic classification of the tropical trees studied by PCA, HCA and LDA. In contrast, the analysis indicates that the spectra of heat-dried ground (GL) leaves can be taxonomically classified, as validated by LDA-PCA. The multivariate analysis suggests that FTIR spectroscopy provides information on the phylogenetic relationships of tropical tree species. Combining FTIR with other techniques used for plant taxonomic studies may contribute to a better understanding of tropical species classification, especially in floristic surveys for reforestation projects in endangered tropical biomes such as the Atlantic forest.
Data availability statement
All data evaluated here are available from the corresponding author upon request.
CRediT authorship contribution statement
Douglas Cubas Pereira: Writing – review & editing, Writing – original draft, Validation, Software, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Breno Pupin: Writing – review & editing, Data curation. Laura de Simone Borma: Writing – review & editing, Supervision, Project administration, Conceptualization.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This work received financial support from the Coordination of Superior Level Staff Improvement - Brazil (CAPES) – Finance Code 001. The authors are thankful to the Graduate Program in Earth System Science (PGCST) at the National Institute for Space Research (INPE) for assisting the research to continue, to Dr. Luciana Maria Ferrer for helping with the text review, and to Dr. Kelly Cristina Tonello for the support at Cunha Nucleus, where the samples were collected.
Footnotes
Supplementary data to this article can be found online at https://doi.org/10.1016/j.heliyon.2024.e27232.
Contributor Information
Douglas Cubas Pereira, Email: douglas.pereira@inpe.br.
Breno Pupin, Email: breno0891@hotmail.com.
Laura de Simone Borma, Email: laura.borma@inpe.br.