Skip to main content
PLOS ONE logoLink to PLOS ONE
. 2023 Dec 14;18(12):e0295773. doi: 10.1371/journal.pone.0295773

Genetic differentiation of a southern Africa tepary bean (Phaseolus acutifolius A Gray) germplasm collection using high-density DArTseq SNP markers

Saul Eric Mwale 1,2,*, Hussein Shimelis 1, Wilfred Abincha 3, Wilson Nkhata 1,4, Abel Sefasi 5, Jacob Mashilo 1
Editor: Aditya Pratap6
PMCID: PMC10721083  PMID: 38096255

Abstract

Genetic resources of tepary bean (Phaseolus acutifolius A. Gray) germplasm collections are not well characterized due to a lack of dedicated genomic resources. There is a need to assemble genomic resources specific to tepary bean for germplasm characterization, heterotic grouping, and breeding. Therefore, the objectives of this study were to deduce the genetic groups in tepary bean germplasm collection using high-density Diversity Array Technology (DArT) based single nucleotide polymorphism (SNP) markers and select contrasting genotypes for breeding. Seventy-eight tepary bean accessions were genotyped using 10527 SNPs markers, and genetic parameters were estimated. Population structure was delineated using principal component and admixture analyses. A mean polymorphic information content (PIC) of 0.27 was recorded, indicating a relatively low genetic resolution of the developed SNPs markers. Low genetic variation (with a genetic distance [GD] = 0.32) existed in the assessed tepary bean germplasm collection. Population structure analysis identified five sub-populations through sparse non-negative matrix factorization (snmf) with high admixtures. Analysis of molecular variance indicated high genetic differentiation within populations (61.88%) and low between populations (38.12%), indicating high gene exchange. The five sub-populations exhibited variable fixation index (FST). The following genetically distant accessions were selected: Cluster 1:Tars-Tep 112, Tars-Tep 10, Tars-Tep 23, Tars-Tep-86, Tars-Tep-83, and Tars-Tep 85; Cluster 3: G40022, Tars-Tep-93, and Tars-Tep-100; Cluster 5: Zimbabwe landrace, G40017, G40143, and G40150. The distantly related and contrasting accessions are useful to initiate crosses to enhance genetic variation and for the selection of economic traits in tepary bean.

Introduction

Tepary bean (Phaseolus acutifolius A. Gray) is a climate-smart legume crop that provides food and nutrition security in arid and semi-arid regions of the world [1]. It is a self-pollinating diploid (2n = 2x = 22) species with a genome size of approximately 647 million base pairs (Mbp) [2]. The grains are a source of proteins, lipids and essential mineral elements [3]. The crop is tolerant to drought and heat stress [1,4,5], allowing its cultivation in dry environments. It is highly resistant to diseases including bacterial blight [Xanthomonas campestris pv. Phaseoli] [6], Fusarium wilt (Fusarium oxysporum) [7], and bean golden mosaic virus [8]. As a result, the crop served as a useful gene donor for disease-resistance breeding in common bean [9,10].

In Southern Africa, tepary bean is mostly cultivated by smallholder farmers in Botswana, South Africa, Zimbabwe, Zambia, and Malawi using genetically unimproved landrace varieties [11,12]. This is attributed to limited breeding efforts to develop improved varieties with farmer, consumer and market-preferred traits [13]. The crop yield in the region is approximately 500 kilograms per hectare [14,15] compared to the potential yield of greater than 2000 kilograms per hectare [16]. To address the imminent challenge of low productivity in Southern Africa, the University of KwaZulu-Natal’s African Center for Crop Improvement (ACCI) acquired a diverse germplasm panel of tepary bean from the International Centre for Tropical Agriculture (CIAT-Columbia) [17]. The collected genetic resources of tepary bean are yet to be explored using dedicated genomic resources. Reportedly, genetic analysis using various molecular markers developed for common bean (Phaseolus vulgaris L.) such as simple sequence repeat (SSR), random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), and sequenced characterized amplified region (SCAR) revealed a low to moderate genetic variation when applied in tepary bean [1825]. There is a need to assemble genomic resources specific to tepary bean for germplasm characterization, selection, and breeding.

The reference genome to which the molecular marker covers the genome influences germplasm characterization accuracy and predictability [26]. Single nucleotide polymorphism (SNP) markers are widely abundant across the genome and quickly unravel diversity; hence, they are the most preferred for diversity and association mapping studies [27,28]. Gene-based SNP markers, which mainly uncover functional variation, were used to assess the population structure and genetic diversity of wild and cultivated tepary bean genotypes using the common bean reference genome [26]. However, the genotyping assay was limited in capturing several loci (< 60%) in tepary bean since it was aligned to the common bean reference genome [26]. A draft tepary bean reference genome was developed recently, providing enormous opportunities for developing tepary bean genomic resources [1]. Genotyping-by-sequencing (GBS) is one of the most cost-effective approaches for concurrent SNPs identification and genotyping [29]. Diversity Array Technology Sequencing (DArTSeq) is a GBS platform that combines the principles of genome complexity reduction methods with high throughput sequencing. It allows the simultaneous identification of numerous SNPs across a genome at an affordable cost [30]. DArTseq SNP markers have been used for quantitative trait loci (QTL) mapping, genome-wide association studies, the development of linkage maps, genetic diversity, and population structure analyses in several legume crops, including pigeonpea, chickpea, soybean, and common bean [3034]. Genetic diversity and population structures were examined in Interspecific Mesoamerican X Wild Tepary (IMAWT) population using GBS with high-density SNP markers aligned to the common bean reference genome [5]. Further, GBS with high-density SNP markers mapped to the tepary bean reference genome in a tepary bean diversity panel has been used to determine the extent of genetic diversity and population structure [35]. Nevertheless, the southern Africa tepary bean diversity panel collection, comprising landraces, released lines, and breeding lines, has not yet been characterized with high-density DArTSeq SNP markers matched to the tepary bean reference genome. Genetic analysis of the southern Africa tepary bean germplasm collection could potentially capture a broader range of genetic variation and provide useful information for heterotic grouping, genome-wide association mapping, marker-assisted selection for precision, and speed breeding. Therefore, the objectives of this study were to deduce the genetic groups in tepary bean germplasm collection using high-density DArT-based single nucleotide polymorphism markers and select contrasting genotypes for breeding.

Materials and methods

Plant genetic materials

A panel of 78 tepary bean germplasm collection comprising released varieties, breeding lines, and landraces were used for the study. The accessions were sourced from the International Center for Tropical Agriculture (CIAT)/Colombia, the United States Department of Agriculture (USDA), and farmers in South Africa and Zimbabwe. The names of the genotypes and their sources of origin are presented in Table 1.

Table 1. Names and origins of tepary bean germplasm collection used in the study.

Genotype Code Genotype name or designation Origin Genotype Code Genotype name or designation Origin Genotype Code Genotype name or designation Origin
G1 G40001 Mexico G27 G40129 Mexico G55 TARS-TEP-32 USDA
G2 G40005 El Salvador G28 G40132 Mexico G56 TARS-TEP-49A USDA
G3 G40013 Nicaragua G29 G40133 Mexico G57 TARS-TEP-49B USDA
G4 G40014 Nicaragua G30 G40134 Mexico G58 TARS-TEP-51 USDA
G5 G40017 El Salvador G31 G40135 Mexico G59 TARS-TEP-52 USDA
G6 G40019 Mexico G32 G40136 Mexico G60 TARS-TEP-54 USDA
G7 G40020 Mexico G33 G40137 Mexico G61 TARS-TEP-58A USDA
G8 G40022 USA G34 G40138 Mexico G62 TARS-TEP-58B USDA
G9 G40023 USA G35 G40139 Mexico G63 TARS-TEP-60 USDA
G10 G40031 Mexico G36 G40140 Mexico G64 TARS-TEP-64 USDA
G11 G40032 Guatemala G37 G40143 Mexico G65 TARS-TEP-73 USDA
G12 G40033 Mexico G38 G40144A Mexico G66 TARS-TEP-77 USDA
G13 G40035 Mexico G39 G40145 Mexico G67 TARS-TEP-83 USDA
G14 G40036 Mexico G40 G40147 Mexico G68 TARS-TEP-85 USDA
G15 G40042 USA G41 G40148 Mexico G69 TARS-TEP-86 USDA
G16 G40059 El Salvador G42 G40150 Mexico G71 TARS-TEP 23 USDA
G17 G40062 Nicaragua G43 G40157 Mexico G72 PI-310801 Nicaragua
G18 G40063 USA G44 G40158 Mexico G73 G40119 Mexico
G19 G40065 USA G45 Zimbabwe landrace Zimbabwe G74 PI-440786 USDA
G20 G40066A USA G46 G40173A Mexico G75 G40200 Costa Rica
G21 G40068 USA G48 G40201 Costa Rica G79 TARS-TEP97 USDA
G22 G40069 USA G49 G40237 Mexico G80 TARS-TEP112 USDA
G23 G40084 Mexico G50 Uchokwane South Africa G81 TARS-TEP101 USDA
G24 G40111 Mexico G51 SONORA Sonora G83 TARS-TEP93 USDA
G25 G40125 Mexico G53 TARS-TEP-10 USDA G84 TARS-TEP51 USDA
G26 G40127 Mexico G54 TARS-TEP-22 USDA G85 TARS-TEP100 USDA

USDA = United States Department of Agriculture.

DNA extraction and SNP genotyping

The above tepary bean germplasm were genotyped at SEQART Africa in the International Livestock Research Institute (1.2693° S, 36.7216° E) in Nairobi. Genomic DNA was extracted using the TANBEAD Plant Extraction Kit (Diagnocine, Hackensack, NJ, USA). The genomic DNA extracted was in the range of 50–100 ng/ul. DNA quality and quantity were checked on 0.8% Agarose Gel. The genomic DNA was subjected to restriction digestion using Mst1 and Pst1 as rare and frequent cutters, respectively. Ligation of the digested DNA fragments was accomplished by both common and barcode adapters. This was followed by selective amplification via polymerase chain reaction (PCR) of the adapter-ligated fragments. Pooling and purification of the PCR products were accomplished through a QIAquick PCR purification kit (QIAGEN GmbH, Hilden, Germany). An Illumina Hiseq 2500 (Macrogen, Seoul, Korea) that utilizes single reads was used to sequence the purified PCR products. DNA libraries were constructed according to Kilian et al. [36]. DArTseq marker scoring was achieved using DArTsoft 14, an in-house marker scoring pipeline based on algorithms [3739]. Two types of DArTseq markers were scored: silica-DArT markers and SNP markers, which were both scored as binary for the presence (1) or absence (0) of the restriction fragment with the marker sequence in the genomic representation of the sample. 11, 318 SNP markers were aligned to 11 chromosomes of the reference genome of P. acutifolius_580_v1.0 to identify chromosome positions [1]. The markers used in the current study were highly reproducible, with polymorphic information content (PIC) that varied from 0.01 to 0.05 and a mean call rate of 0.93 ranging from 0.81 to 1.00.

SNP quality control

Quality control was implemented for the SNP data using the raw.data function of the snpReady package [40] in R statistical software version 4.3.1 [41]. SNP markers were filtered by eliminating markers with no chromosomal position and a minor allele frequency (MAF) of <5%. The percent similarity among identical markers was assessed, and markers with the highest missing data were eliminated. A total of 10527 markers with a SNP call rate greater than 95% were retained and used for genetic and population structure analysis. The MVP.report function of rMVP package in R was used to develop SNP density plot of the filtered SNPs [42]. Previously this analysis was done using CMplot package which has now been integrated in rMVP package.

Genetic diversity and population structure analysis

Genetic diversity analyses were conducted to determine the polymorphic information content (PIC), minor allele frequency (MAF), observed heterozygosity (Ho), genetic diversityGD), fixation index, additive variance and dominance variance. The genetic parameters were computed using the snpReady package [40] utilizing the popgen function in R statistical software [41]. A Landscape and Ecological Associations (LEA) R package was used to determine the population genetic structure [4345]. Using snmf function of LEA, ancestry coefficients were estimated where Cross-entropy criteria was used to identify the optimum number of K populations where K = 1 to 10 with six repetitions [46,47]. Principal component analysis was implemented in LEA package using pca function to validate the identified subpopulation through the elbow method [48]. The individuals in the subpopulations were identified using the developed Q matrix in R statistical software [41]. Admixture plot of individuals were plotted using the developed Q matrix using plot function of base R. The results obtained from population analysis were subjected to Principal Coordinate Analysis (PCoA) and analysis of molecular variance (AMOVA) using the poppr package [49] in R statistical software [41]. Neighbor joining hierarchical cluster analysis was performed in R statistical software using a dendextend package [41].

Results

Marker characterization

Genetic parameters derived by SNP markers are presented in Table 2. PIC ranged from 0.10 to 0.40, with a mean of 0.27. Most SNP markers were found within the following PIC categories; 0.2 to 0.03 followed by 0.3–0.4 and the least was 0–0.1 (“Fig 1). The mean Ho and He were 0.30 and 0.32, respectively. The mean Va and Vd were 3415.19 and 1227.16, respectively. Markers revealed a mean effective population size (Ne) of 488.28. The majority of the SNP markers had minor allele frequencies ranging from 0.1 to 0.2 and the least MAF ranging from 0.4 to 0.5. The SNP markers that explained the most genetic diversity were within the category 0.2 to 0.4 and the least informative were between 0.0 to 0.1 (Fig 1). The SNP markers were widely distributed across the 11 chromosomes with a SNP density ranging from 0 to > 65 (Fig 2).

Table 2. Genetic parameters computed from genetic diversity assessment of tepary bean germplasm using high-density SNP markers.

Genetic parameter Overall Mean Minimum Maximum
GD 0.32 0.10 0.50
PIC 0.27 0.09 0.38
MAF 0.22 0.05 0.5
Ho 0.30 0.06 0.81
He 0.32 0.10 0.50
Va 3415.19
Vd 1227.16
Ne 488.28

GD = Genetic Diversity; PIC = Polymorphic information content; MAF = Minor allele frequency; Ho = observed heterozygosity; He = Expected heterozygosity; Va = additive variance; Vd = dominance variance; Ne = effective population size.

Fig 1. Single nucleotide polymorphism markers characteristics: A = Polymorphic Information Content (PIC), B = Minor allele frequency (MAF) and C = Gene diversity (GD).

Fig 1

Fig 2. Single nucleotide polymorphism density plot mapped in 78 tepary bean genotypes.

Fig 2

Population structure of the assessed tepary bean germplasm

Population structure analyses based on cross-entropy criteria revealed that the population was optimally structured into 5 sub-populations (K = 5) (Fig 3”). The result was further validated by observing the principal component screen plot, which showed five principal components (PCs). Results from admixture analysis revealed the presence of genetic admixture across the five subpopulations (Fig 4”).

Fig 3. Tepary bean population structure based on cross-entropy (left) and principal component analysis (right) employing 10527 high-density SNP markers.

Fig 3

Fig 4. Population admixtures in 78 tepary bean genotypes using 10527 high-density SNP markers.

Fig 4

The mean coefficient of inbreeding (F) for the five subpopulations was 0.08 and variable among the populations (Fig 5). Further, the extent of genetic divergence was highest between subpopulations 1 and 5 (0.23), subpopulations 1 and 2 (0.21), subpopulations 3 and 5, and moderate for subpopulations 1 and 3 (0.08) (Table 3).

Fig 5. Fixation index (FST) of five tepary bean subpopulations determined from population structure analysis based on SNP markers.

Fig 5

Table 3. Population pair-wise fixation index of 78 tepary bean genotypes genotyped using SNPs markers.

Subpopulations Subpopulation 1 Subpopulation 2 Subpopulation 3 Subpopulation 4 Subpopulation 5
Subpopulation 1 1 0.21 0.08 0.11 0.23
Subpopulation 2 1 0.18 0.14 0.20
Subpopulation 3 1 0.09 0.21
Subpopulation 4 1 0.06
Subpopulation 5 1

Analysis of molecular variance

Analysis of molecular variance revealed a 38.12% variation between the population and 61.87% within population (Table 4). This was further visualized using principal coordinate analysis (PCoA), which showed that clusters 1, 3, and 4 showed genetic lineage, thus making between-population variance less than within-population variance (Fig 6).

Table 4. Analysis of molecular variance in tepary bean germplasm collection based on SNP markers.

Source of variation DF SS MS Variance Estimated % variance
Between population 4 138098.70 34524.68 2097.93 38.12%
Within population 73 248584.00 3405.26 3405.26 61.87%
Total 77 386682.80 5021.85 5503.19 100%

DF, degrees of freedom; SS, sum of squares; MS, mean square.

Fig 6. Principal coordinate analysis (PCoA) of tepary bean genotypes based on population structure using SNPs markers.

Fig 6

Genetic grouping of tepary bean

Neighbor-joining hierarchical clustering assorted the population into five clusters with different numbers of genotypes as follows: Cluster 1 (24 genotypes), Cluster 2 (8 genotypes), Cluster 3 (13 genotypes), Cluster 4 (11 genotypes), and Cluster 5 (22 genotypes).

The landraces G50 and G45 were clustered with released varieties and breeding lines in Clusters 1 and 5, respectively (Fig 7). Most genotypes sourced from Mexico were grouped in Clusters 2, 3, 4 and 5. Similarly, genotypes from USDA were primarily grouped in Cluster 1 (Fig 7).

Fig 7. Dendrogram showing the five genetic clusters (Cluster 1 to 5) among 78 tepary bean genotypes based on SNPs markers.

Fig 7

See codes of genotypes in Table 1.

Discussion

A well-characterized genetic resource is useful for breeding and genetic analysis. Genetic analysis using molecular markers complements phenotypic selection to develop heterotic groups and genotype selection for breeding. Genetic gains for yield, grain quality, and biotic and abiotic stress tolerance depend on the magnitude of the response to selection. However, limited genomic resources are developed for tepary bean for genetic differentiation and selection. Previously, germplasm characterization in this crop has been based on the use of molecular markers developed for common bean [19,26]. Therefore, the present study aimed at deducing the genetic groups in tepary bean germplasm collection using high-density DArT-based single nucleotide polymorphism (SNPs) markers and selecting contrasting genotypes for breeding.

In the present study, the least number of the derived SNPs markers recorded PIC values ranging between 0 to 0.2 and were considered less informative (Fig 1). The low PIC value could be attributed to the bi-allelic nature of the SNP markers, which limits them to values of less than or equal to 0.5 [50,51]. The majority of the developed SNPs markers recorded PIC values ranging from 0.2 to 0.3, and 0.3 to 0.5 (Fig 1). These SNPs markers are considered informative. The SNPs markers revealed moderate genetic diversity in the studied tepary bean germplasm (Table 2). The AMOVA (Table 4) indicated fewer variations between populations than within the population and supplemented the moderate genetic diversity.

The moderate genetic divergence typical of cultivated tepary bean in our study corroborates with the study findings of Blair et al. [18], Gujaria-Verma et al. [26], and Mhlaba et al. [19]. This is attributed to localized domestication and distribution of the crop in the USA and Mexico, which culminated in a narrow genetic base. The low natural hybridization and outcrossing rates in tepary bean could also be attributed to its moderate genetic diversity [18,19]. However, the existing genetic diversity within the studied population (Table 4) implies that sufficient genetic variation is present for crop improvement. Moghaddam et al. [1] also reported high within-population variation in tepary bean, concurring with the present findings. Farmers and breeders may have selected tepary bean genotypes for certain agronomic traits, resulting in considerable genetic differentiation among populations [52].

The observed and expected heterozygosity were marginally different (Table 2), suggesting the preponderance of homozygous alleles in dominant and recessive forms at many loci in the assessed tepary bean collection. The high additive variance supports this compared to the dominance variance observed in the present study. The higher tendency favouring homozygosity is expected as the tepary bean is a self-pollinated crop with a limited outcrossing rate [13,18]. The occurrence of both dominant and recessive alleles has implications for selection and breeding. Specifically, the upregulation of dominant alleles may enhance the selection of well-adapted genotypes, while the same may hinder the selection of recessive alleles [32]. The moderate fixation index value or coefficient of inbreeding (Table 3) suggests high gene exchange emanating from artificial and natural hybridization, which culminates in moderate genetic differentiation. Fixation indices are classified as low (< 0.05), moderate (0.05–0.15), and high (> 0.15) and show the level of genetic divergence or similarity in the assessed population [53]. Therefore, strategic crossing for the development of new varieties should target genetically divergent genotypes selected in different clusters (Fig 7). These can enhance genetic gains for economic traits, including grain yield, nutrient compositions, heat and drought tolerance, and disease resistance.

The clustering or grouping of tepary bean genotypes into five sub-populations following their geographical origins in the USA and Mexico reiterated the single domestication theory, which apparently caused a genetic bottleneck and limited genetic variation in tepary bean [21]. The grouping of landraces like Uchokwane (G50) and Zimbabwe landrace (G45) with released and breeding varieties in Clusters 1 and 2 signifies the presence of admixtures that could have arisen due to historical exchanges of seeds through the informal seed system or the exchange of germplasm between breeding programs. The number of subpopulations in our study was slightly lower than the six subpopulations reported by Bornowski et al. [35]. This could be attributed to differences in the diversity panels and genotyping platform used.

The following genotypes are recommended for new variety design: Tars-Tep 112, Tars-Tep 10, Tars-Tep 23 selected from Cluster 1, G40022, Tars-Tep-93 and Tars-Tep -100 from Cluster 3, Zimbabwe landrace, G40017, G40143 and G40150 from Cluster 5. Genotypes, including Tars-Tep 23 and G40150, have high yield potential and tolerate drought and heat [54,55]. G40022 is tolerant to salt [56]. G40150 is resistant to common bacterial blight (CBB) while Tars-Tep 23 has broad resistance to common bacterial blight and rust [54,57].

Conclusion

The present study appraised SNP markers and determined their usefulness in assessing the extent of genetic diversity and population structure in tepary bean germplasm collections. The SNP markers revealed moderate genetic diversity in the germplasm collection. The study identified five distinctive sub-populations that were genetically differentiated. This will guide genotype selection and subsequent crosses to develop breeding populations and enhance genetic gains for economic traits. The following distantly related genotypes were selected, namely: Tars-Tep 112, Tars-Tep 10, Tars-Tep 23, Tars-Tep-86, Tars-Tep-83, and Tars-Tep 85 from Cluster 1, G40022, Tars-Tep-93, and Tars-Tep-100 from Cluster 3, Zimbabwe landrace, G40017, G40143, and G40150 from Cluster 5. The selected and contrasting accessions are valuable genetic resources to initiate crosses to enhance genetic variation and integrate traits in tepary bean and genetically related legume crops.

Supporting information

S1 Table. Hapman file for DArT- based SNP markers.

(CSV)

S2 Table. Divgenos file.

(CSV)

Acknowledgments

The authors would like to express their deepest gratitude to the staff of SEQART Africa for the Dartseq genotyping service and to the CIAT-Malawi staff for processing the seed samples.

Data Availability

All relevant data are within the paper and its Supporting information files.

Funding Statement

Yes. Kirkhouse Trust SCIO provided the funding for the genotyping of tepary bean germplasm.

References

  • 1.Moghaddam SM, Oladzad A, Koh C, Ramsay L, Hart J, Mamidi S, et al. The tepary bean genome provides insight into evolution and domestication under heat stress. Nat. Commun. 2021; 12:1–14. doi: 10.1038/s41467-021-22858-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Arumuganathan K, Earle ED. Nuclear DNA content of some important plant species. Mol. Biol. Rep.1991; 9, 208–221. [Google Scholar]
  • 3.Bhardwaj HL, Hamama AA. Oil and fatty acid composition of tepary bean seed. Hort Sci. 2005; 40, 1436–1438. [Google Scholar]
  • 4.Muñoz LC, Debouck DG, Rivera M, Muñoz JE, Alpala D, Sarsu F, et al. Mutation breeding for heat and drought tolerance in tepary bean (Phaseolus acutifolius A. Gray). Aust. J. Crop Sci. 2021; 60–68. [Google Scholar]
  • 5.Cruz S, Lobató n J, Urban MO, Ariza, Suarez D, Raatz B, et al. Interspecific common bean population derived from Phaseolus acutifolius using a bridging genotype demonstrate useful adaptation to heat tolerance. Front. Plant Sci. 2023; 14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Miklas PN, Smith JR., Singh SP. Registration of common bacterial blight resistant dark red kidney bean germplasm line USDK-CBB-15. Crop Sci. 2006; 46, 1005–1007 [Google Scholar]
  • 7.Salgado MO, Schwartz HF, Brick MA, Pastor-Corrales MA. Resistance to Fusarium oxysporum f.sp. phaseoli in tepary bean (Phaseolus acutifolius). Plant Dis. 1994; 78, 357–360. [Google Scholar]
  • 8.Miklas PN, Santiago J. Reaction of select tepary bean to bean golden mosaic virus. Hort Sci. 1996; 31, 430–432. [Google Scholar]
  • 9.Thomas CV, Waines JG. Fertile backcross and allotetraploid plants from crosses between tepary beans and common beans. J. Hered. 1984; 75, 93–98. [Google Scholar]
  • 10.Rao I, Beebe S, Polania J, Ricaurte J, Cajiao C, Garcia R, et al. Can tepary bean be a model for improvement of drought resistance in common bean? Afr. Crop Sci. J. 2013; 21, 265–281. [Google Scholar]
  • 11.Molosiwa OO, Kgokong SB, Makwala B, Gwafila CM, & Ramokapane MG. Genetic diversity in tepary bean (Phaseolus acutifolius) landraces grown in Botswana. J. Plant Breed. Crop Sci. 2014; 6, 194–199. [Google Scholar]
  • 12.Gwata ET, Shimelis H, & Matova PM. Potential of improving agronomic attributes in tropical Legumes using two mutation breeding techniques in Southern Africa. Alternative crops and cropping systems, Petr Konvalina, IntechOpen. 2016.
  • 13.Mhlaba ZB, Mashilo J, Shimelis H, Assefa AB, and Modi AT. Progress in genetic analysis and breeding of tepary bean (Phaseolus acutifolius A. Gray): A review. Sci. Hortic. 2018a; 112–119.
  • 14.Konvalina P. Alternative Crops and Cropping Systems. IntechOpen. 2016 10.5772/61408. [DOI]
  • 15.Mwale SE, Shimelis H, Mafongoya P, Mashilo J. Breeding tepary bean (Phaseolus acutifolius) for drought adaptation: A review. Plant Breed. 2020; 1–13. doi: 10.1111/pbr.12806 [DOI] [Google Scholar]
  • 16.Pratt RC, Grant L, Velasco-Cruz C, & Lauriault L. Field performance of selected and landrace tepary bean varieties in diverse southwestern USA irrigated production environments. Legum. Sci. 2023; 5(1), e157. doi: 10.1002/leg3.157 [DOI] [Google Scholar]
  • 17.Mhlaba ZB, Amelework B, Shimelis HA, Modi AT, Mashilo J. Variance components and heritability of yield and yield-related traits in tepary bean (Phaseolus acutifolius). S. Afr. J. Plant Soil. 2018b; 1–12. [Google Scholar]
  • 18.Blair MW, Pantoja W, Muñoz LC. First use of microsatellite markers in a large collection of cultivated and wild accessions of tepary bean (Phaseolus acutifolius A. Gray). Theor Appl. Genet. 2012; 125:1137–47. [DOI] [PubMed] [Google Scholar]
  • 19.Mhlaba ZB, Amelework B, Shimelis HA, Modi AT, Mashilo J. Genetic interrelationship among tepary bean (Phaseolus acutifolius A. Gray) genotypes revealed through SSR markers. Aust. J. Crop Sci. 2018c; 12(10): 1587–1595. [Google Scholar]
  • 20.Gepts P, Llaca V, Nodari RO, PanellaL. Analysis of Seed Proteins, Isozymes, and RFLPs for Genetic and Evolutionary Studies in Phaseolus. In: Linskens, H.F., Jackson, J.F. (eds) Seed Analysis. Modern Methods of Plant Analysis. 1992. vol 14. Springer, Berlin, Heidelberg. 10.1007/978-3-662-01639-8_4. [DOI]
  • 21.Schinkel C, Gepts P. Phaseolin diversity in the tepary bean, Phaseolus acutifolius A. Gray. Plant Breed. 1988;101, 292–301. [Google Scholar]
  • 22.Pedraza F, Gallego G, Beebe S, Tohme J. Marcadores SCAR y RAPD para la resistencia a la bacteriosis comun (CBB). In: Singh, S.P., Voysest, O. (Eds.), Taller de mejoramiento de frijol para el Siglo XXI: Bases para una estrategia para America Latina. CIAT, Cali, Colombia. 1997; 130–134.
  • 23.Park SO, Coyne DP, Dursun A, Jung G. Identifying randomly amplified polymorphic DNA (RAPD) markers linked to major genes for common bacterial blight resistance in tepary bean. J. Am. Soc. Hortic. Sci. 1998;123, 278–282. [Google Scholar]
  • 24.Muñoz LC, Duque MC, Debouck DG, Blair MW. Taxonomy of tepary bean (Phaseolus acutifolius) and wild relatives as determined by amplified fragment length polymorphism (AFLP) markers. Crop Sci. 2006; 46:1744–54. [Google Scholar]
  • 25.Yu K, Park SJ, Poysa V. 2000. Marker-assisted selection of common beans for resistance to common bacterial blight: efficacy and economics. Plant Breed. 2000; 119, 411–415. [Google Scholar]
  • 26.Gujaria-Verma N, Ramsay L, Sharpe AG, Sanderson LA, Debouck DG, Tar’an B, et al. Gene-based SNP discovery in tepary bean (Phaseolus acutifolius) and common bean (P. vulgaris) for diversity analysis and comparative mapping. BMC Genomics. 2016; 17, 239. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Cichy KA, Wiesinger JA, Mendoza FA. Genetic diversity and genome-wide association analysis of cooking time in dry bean (Phaseolus vulgaris L.). Theor. Appl. Genet. 2015; 128 (8):1555–67. doi: 10.1007/s00122-015-2531 [DOI] [PubMed] [Google Scholar]
  • 28.Raatz B, Mukankusi C, Lobaton JD, Male A, Chisale V, Amsalu B, et al. Analyses of African common bean (Phaseolus vulgaris L.) germplasm using a SNP fingerprinting platform: diversity, quality control and molecular breeding. Genet. Resour. Crop Evol. 2019; 66(3):707–22. doi: 10.1007/s10722-019-00746 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, et al. A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species. PLoS One. 2011; 6(5):e19379. doi: 10.1371/journal.pone.0019379 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Valdisser PAMR Pereira WJ, Almeida JE MuüLler BSF, Coelho GRC, and de Menezes IPP. In-depth genome characterization of a Brazilian common bean core collection using dartseq high-density Snp genotyping. BMC Genomics. 2017; 18:423. doi: 10.1186/s12864-017-3805-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Yohane EN, Shimelis H, Laing M, Shayanowako A. Genetic diversity and grouping of pigeonpea [Cajanus cajan Millspaugh] Germplasm using SNP markers and agronomic traits. PLoS ONE. 2022; 17(11): e0275060. doi: 10.1371/journal.pone.0275060 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Ahmed SM, Alsamman AM, Jighly A, Mubarak MH, Al-Shamaa K, Istanbuli T, et al. Genome-wide association analysis of chickpea germplasms differing for salinity tolerance based on DArTseq markers. PLoS ONE. 2021; 16(12): e0260709. doi: 10.1371/journal.pone.0260709 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Lukanda MM, Dramadri IO, Adjei EA. et al. Genetic Diversity and Population Structure of Ugandan Soybean (Glycine max L.) Germplasm Based on DArTseq. Plant Mol Biol Rep. 2023; 41: 417–426. doi: 10.1007/s11105-023-01375-9 [DOI] [Google Scholar]
  • 34.Nkhata W, Shimelis H, Melis R, Chirwa R, Mzengeza T, Mathew I, et al. Population structure and genetic diversity analyses of common bean germplasm collections of East and Southern Africa using morphological traits and high-density SNP markers. PLoS ONE. 2020; 15(12). doi: 10.1371/journal.pone.0243238 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Bornowski N, Hart JP, Vargas Palacios A, Ogg B, Brick MA, Hamilton JP, et al. Genetic variation in a tepary bean (Phaseolus acutifolius A. Gray) diversity panel reveals loci associated with biotic stress resistance. The Plant Genome. 2023; 16, e20363. doi: 10.1002/tpg2.20363 [DOI] [PubMed] [Google Scholar]
  • 36.Kilian A, Wenzl P, Huttner E, Carling J, Xia L, Blois H. et al. Diversity arrays technology: a generic genome profiling technology on open platforms. Methods Mol. Biol. 2012; 888, 67–89. doi: 10.1007/978-1-61779-870-2_5 [DOI] [PubMed] [Google Scholar]
  • 37.Alam M, Neal J, O’Connor K, Kilian A, Topp B. Ultra-high-throughput DArTseq-based silicoDArT and SNP markers for genomic studies in macadamia. PLOS ONE. 2018; 13(8): e0203465. doi: 10.1371/journal.pone.0203465 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Adu BG, Akromah R, Amoah S, Nyadanu D, Yeboah A, et al. High-density DArT-based SilicoDArT and SNP markers for genetic diversity and population structure studies in cassava (Manihot esculenta Crantz). PLOS ONE. 2021; 16(7): e0255290. doi: 10.1371/journal.pone.0255290 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Gelaw YM, Eleblu JSY, Ofori K. et al. High-density DArTSeq SNP markers revealed wide genetic diversity and structured population in common bean (Phaseolus vulgaris L.) germplasm in Ethiopia. Mol Biol Rep. 2023; 50: 6739–6751. doi: 10.1007/s11033-023-08498-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Granato, Neto-Fritsche. ‘snpReady package’. 2017. https://cran.r-project.org/web/packages/snpReady/vignettes/snpReady-vignette.html.
  • 41.R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing. 2020. http://www. R-project.org.
  • 42.Yin L, Zhang H, Tang Z, Xu J, Yin D, Zhang Z, et al. rMVP: A Memory-efficient, Visualization-enhanced, and Parallel-accelerated Tool for Genome-wide Association Study. Genomics, Proteomics and Bioinformatics. 2021; 19(4): 619–628. doi: 10.1016/j.gpb.2020.10.007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Frichot E, and Francois O. LEA: An R package for Landscape and Ecological Association studies. 2015.
  • 44.Patterson N, Price AL, and Reich D. Population structure and eigen analysis. PLoS Genet. 2006; 2:20. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Pritchard JK, Stephens M and Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000; 155:945–959. doi: 10.1093/genetics/155.2.945 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Alexander DH, and Lange K. Enhancements to the Admixture algorithm for individual ancestry estimation. BMC Bioinformatics. 2011; 12:246. doi: 10.1186/1471-2105-12-246 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Frichot E, Mathieu F, Trouillon T, Bouchard G and Fran¸cois O. Fast and efficient estimation of individual ancestry coefficients. Genetics. 2014; 196:973–983. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Byun J, Han Y, Gorlov IP, Busam JA, Seldin MF, & Amos CI. Ancestry inference using principal component analysis and spatial analysis: A distance-based analysis to account for population substructure. BMC Genomics. 2017;18(1):1–12. doi: 10.1186/s12864-017-4166-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Kamvar ZN, Tabima JF, Grünwald NJ. Poppr: An R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. Peer J. 2014. doi: 10.7717/peerj.281 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Luo Z, Brock J, Dyer JM, Kutchan TM, Augustin M, Schachtman DP, et al. Genetic diversity and population structure of a Camelina sativa spring panel. Front. Plant Sci. 2019; 10:184. doi: 10.3389/fpls.2019.00184 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Singh N, Choudhury DR, Singh AK, Kumar S, Srinivasan K, Tyagi RK, et al. Comparison of SSR and SNP markers in estimation of genetic diversity and population structure of Indian rice varieties. PLoS ONE. 2013;8(12):1–14. doi: 10.1371/journal.pone.0084136 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Kinhoègbè G, Djèdatin G, Saxena RK, Chitikineni A, Bajaj P, Molla J, et al. Genetic diversity and population structure of pigeonpea (Cajanus cajan [L.] Millspaugh) landraces grown in Benin revealed by Genotyping-By-Sequencing. PLoS ONE. 2022; 17(7): e0271565. doi: 10.1371/journal.pone.027156 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Vitorino CA, Nogueira F, Souza IL, Araripe J, Venere PC. Low genetic diversity and structuring of the Arapaima (Osteoglossiformes, Arapaimidae) population of the Araguaia-Tocantins basin. Front. genet. 2017; 8:159. doi: 10.3389/fgene.2017.00159 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Porch T, Barrera S, Berny J, Díaz-Ramírez J, Pastor-Corrales M, Gepts P, et al. Release of tepary bean TARS-Tep 23 germplasm with broad abiotic stress tolerance and rust and common bacterial blight resistance. J. Plant Regist. 2021; 1–11. [Google Scholar]
  • 55.Mwale SE, Shimelis H, Nkhata W, Sefasi A, Fandika I, Mashilo J. Genotype-by-Environment Interaction in Tepary Bean (Phaseolus acutifolius A. Gray) for Seed Yield. Agronomy. 2023; 13(1):12. doi: 10.3390/agronomy1301001 [DOI] [Google Scholar]
  • 56.Bayuelo-Jiménez JS, Debouck DG, Lynch JP. Salinity tolerance in Phaseolus species during early vegetative growth. Crop Sci. 2002; 42, 2184–2192. [Google Scholar]
  • 57.Vargas A, Porch TG, Beaver J. Evaluation of the tepary bean (Phaseolus acutifolius) CIAT germplasm collection for response to common bacterial blight and bean common mosaic necrosis virus. Ann. Rep. Bean. Improv. Coop. 2014; 181–182. [Google Scholar]

Decision Letter 0

Aditya Pratap

6 Oct 2023

PONE-D-23-28131Genetic differentiation of tepary bean (Phaseolus acutifolius) germplasm collection using genotyping-by-sequencing with high-density Single Nucleotide Polymorphism markersPLOS ONE

Dear Dr. Mwale,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.Nov 20 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Aditya Pratap

Academic Editor

PLOS ONE

Journal Requirements:

1. When submitting your revision, we need you to address these additional requirements.

Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating the following in the Acknowledgments Section of your manuscript: 

The Kirkhouse Trust SCIO is sincerely thanked for the financial support to genotype the Tepary bean germplasm collection. Further, the African Center of Excellence in Neglected and Underutilized Biodiversity is also sincerely thanked for the financial support towards the completion of this manuscript. The authors would like to express their deepest gratitude to the staff of SEQART Africa for the Dartseq genotyping service and to the CIAT-Malawi staff for processing the seed samples.

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. 

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows: 

Yes. 

Kirkhouse Trust SCIO provided the funding for the genotyping of tepary bean germplasm.

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

3, In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

4. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments:

I agree with the suggestions of reviewers who have recommended minor revisions. Kindly go through all suggestions and submit a revised manuscript.

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Dear authors,

The manuscript titled “Genetic differentiation of tepary bean (Phaseolus acutifolius) germplasm collection using genotyping-by-sequencing with high-density Single Nucleotide Polymorphism markers” provides an interesting perspective to deduce the genetic groups in tepary bean germplasm collection using GBS. The authors reported a total of 5 genetic groups. These interesting results added to different similar initiatives, will help accelerate breeding programs by offering the status of population structure in the dataset used.

In general, the authors have a logical sequence of analysis, in addition to having a high number of SNP molecular markers despite the limited number of genotypes. However, it is strongly recommended to be more descriptive in the bioinformatic and analytic methodology, and be more informative in the figures.. Also, add more references for better context and discussion. The above in order to have more clarity on how the analyzes were carried out, and thus be more effective in the reproducibility of the results in subsequent works. For all the above reasons, I recommend this manuscript for publication in Plos One with minor revision. Thus, the authors should address the following recommendations to improve some aspects before the publication.

Best regards,

I attach a PDF with comments

Reviewer #2: Work on functional annotation and validation of targeted impending traits of Tepary bean could have made the work better intricate since draft genome sequence of the bean is already available. That would shed better light on utility of the high resolution markers for selection and improvement of such traits. Reason for incorporating varieties, advance lines and landraces together rather than landraces alone for the study could be elucidated.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Felipe López-Hernández

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Comments.pdf

PLoS One. 2023 Dec 14;18(12):e0295773. doi: 10.1371/journal.pone.0295773.r002

Author response to Decision Letter 0


11 Nov 2023

Response to PLoS-ONE Comments

Abstract

Reviewer comments: I suggest mentioning the method for selecting 5 sub-populations and adding punctuation signs for more clarity in the cluster list.

Response: Thank you very much for your comments. The method of selecting the 5 sub-populations has been specified in the manuscript and punctuation signs have been added to aid in clarity. The five sub-populations were identified using sparse non-negative matrix factorization (snmf), which computes an entropy criterion (Alexander & Lange, 2011). The entropy criterion helped in choosing the number of ancestral populations that explained the genotypic data. This was further validated using the elbow reference, which involved scree plot of the percentage of variation explained by each component of the PCA. The number of ancestral populations is closely linked to the number of PCA (Byun et al., 2017).

Reference:

Alexander, D. H., & Lange, K. (2011). Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinformatics, 12. https://doi.org/10.1186/1471-2105-12-246

Byun, J., Han, Y., Gorlov, I. P., Busam, J. A., Seldin, M. F., & Amos, C. I. (2017). Ancestry inference using principal component analysis and spatial analysis: A distance-based analysis to account for population substructure. BMC Genomics, 18(1), 1–12. https://doi.org/10.1186/s12864-017-4166-8

Reviewer comments: I suggest adding the most recent works in the topic. For instance, Bornowski et al. in 2023 published a study on genotyping many accessions of tepary beans. On the other hand, I recommend including more references to analytic tools that can be used in this work related to population structure, etc. Additionally, to provide context on how this study relates to previously published research, I suggest including other works on population stratification using GBS in common beans such as Cortés & Blair 2018 or Dartseq as Valdisser et al 2017; or studies involving interspecific panels (Common Bean × Tepary Bean) like those by Cruz et al. in 2023.

Response: The suggested articles have been added, and our study findings have been contextualized in line with published literature. We have included DArTseq, a more specific GBS platform used in the current study, for clarity.

Reviewer comments: In Figure 3, please list each subplot and change the Eigenvalues of each PC plot to the variance explained of each. PC. 2. In figure 4 organize from largest to smallest according to each percentage of subpopulation and add the accession code at the bottom. In figure 6 add the percentage of variance explained by each component.

Response: Figure 3: We have plotted a PC scree plot of PCs against the variance, explained as suggested. However, we could not change the PC into subpopulations because, in this method, the number of PCs are the subpopulations. By looking at the scree plot, the PCs can reliably inform us of the optimum number of K.

Response: Figure 4: As suggested, we have added accession codes at the bottom of the plot. With your advice, we have refined the admixture plot, and the groups are better visible.

Response: Figure 6: We have added percentage variation on each coordinate.

Reviewer comment: R//Yes, but please add the reference to the R-package used for plotting. For example, did the authors use the CMplot function for Figure 2?

Response: We have cited all referenced R packages used, including CMplot.

Reviewer comment: Are the experiments or interventions appropriate for addressing the research question?

R// The experiments and the analytic methodology are appropriate for addressing the research question. However, I suggest providing greater clarity regarding the function used for the ancestry analysis by LEA. Was the algorithm employed for this approach SMNF (e.g., SMNF is an optimization of Structure)? The variant calling process also needs a more detailed explanation. It's unclear how this process was conducted. While they used DArTsoft, a software dedicated to DArTseq, it's worth noting that the use of this tool for GBS is not common. Typically, the main bioinformatic approach for GBS involves tools such as GATK, TASSEL, Stacks, etc.

Response: Thank you very much for your comment. We have provided more information on the function snmf as used in the LEA package to estimate the co-ancestry coefficient of the population. The current GBS was done using the DArTseq platform, which is reliably analyzed using DArTsoft 14. DArTsoft is an automated genotypic data analysis program that was developed to analyze sequence data generated from DArTseq (Gelaw et al., 2023; Adu et al., 2021; Alam et al., 2018).

Reviewer comment: Is there enough data to draw a conclusion? R// Yes, but I advert that the using of this panel for future association analysis would need major number of genotypes accessions.

Response: Thank you for the comment. We will consider including several accessions in future association analysis.

Reviewer comment: Do the authors follow best practices for reporting? R// No. I suggest including the company name and country when reporting the use of kits and software.

Response: We have included the company for the TANBEAD Plant Extraction Kit (Diagnocine, Hackensack, NJ, USA) and a QIAquick PCR purification kit (QIAGEN GmbH, Hilden, Germany).

Reviewer comments: Could another researcher reproduce the study with the same methods? In other words, have the authors provided enough information to validate the study? R// Partially, because the authors have not provided with a link to SNP data or the scripts used in R or other software

Response: Supporting information on genotypic data has been provided.

Reviewer comments: Do the results support the conclusions? Yes, but I recommend specifying some analysis. In that sense, from data science, the visualization of the first two principal components does not suggest an optimal number of components (PCs) but rather it is a clustering validity analysis that suggests the optimal number of clusters.

Response: Thank you for the comment. The cross-entropy criteria was used to identify the optimum number of K populations and the identified subpopulations were validated by principal component analysis.

Attachment

Submitted filename: II Response to PLoS REVISED.docx

Decision Letter 1

Aditya Pratap

29 Nov 2023

Genetic differentiation of a southern African tepary bean (Phaseolus acutifolius A Gray) germplasm collection using high-density DArTseq Single Nucleotide Polymorphism markers

PONE-D-23-28131R1

Dear Dr. Mwale,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Aditya Pratap

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Aditya Pratap

4 Dec 2023

PONE-D-23-28131R1

Genetic differentiation of a southern African tepary bean (Phaseolus acutifolius A Gray) germplasm collection using high-density DArTseq SNP markers

Dear Dr. Mwale:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Aditya Pratap

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Table. Hapman file for DArT- based SNP markers.

    (CSV)

    S2 Table. Divgenos file.

    (CSV)

    Attachment

    Submitted filename: Comments.pdf

    Attachment

    Submitted filename: II Response to PLoS REVISED.docx

    Data Availability Statement

    All relevant data are within the paper and its Supporting information files.


    Articles from PLOS ONE are provided here courtesy of PLOS

    RESOURCES