eLife. 2023 May 19;12:e85725. doi: 10.7554/eLife.85725

Environment as a limiting factor of the historical global spread of mungbean

Pei-Wen Ong 1, Ya-Ping Lin 2,3, Hung-Wei Chen 2, Cheng-Yu Lo 2, Marina Burlyaeva 4, Thomas Noble 5, Ramakrishnan Madhavan Nair 6, Roland Schafleitner 3, Margarita Vishnyakova 4, Eric Bishop-von-Wettberg 7,8, Maria Samsonova 8, Sergey Nuzhdin 9, Chau-Ti Ting 10, Cheng-Ruei Lee 1,2,
Editors: Detlef Weigel 11, Detlef Weigel 12
PMCID: PMC10299821  PMID: 37204293

Abstract

While the domestication process has been investigated in many crops, the detailed route of cultivation range expansion and the factors governing this process have received relatively little attention. Here, using mungbean (Vigna radiata var. radiata) as a test case, we investigated the genomes of more than 1000 accessions to illustrate the role of climatic adaptation in dictating the unique routes of cultivation range expansion. Despite the geographical proximity between South and Central Asia, genetic evidence suggests that mungbean cultivation first spread from South Asia to Southeast Asia, then to East Asia, and finally reached Central Asia. Combining evidence from demographic inference, climatic niche modeling, plant morphology, and records from ancient Chinese sources, we showed that the specific route was shaped by the unique combinations of climatic constraints and farmer practices across Asia, which imposed divergent selection favoring higher yield in the south but short-season and more drought-tolerant accessions in the north. Our results suggest that mungbean did not radiate from the domestication center as expected purely under human activity; instead, the spread of mungbean cultivation was highly constrained by climatic adaptation, echoing the idea that human commensals are more difficult to spread along the south-north axis of continents.

Research organism: Other

eLife digest

Mungbean, also known as green gram, is an important crop plant in China, India, the Philippines and many other countries across Asia. Archaeological evidence suggests that humans first cultivated mungbeans from wild relatives in India over 4,000 years ago. However, it remains unclear how cultivation has spread to other countries and whether human activity alone dictated the route of the cultivated mungbean’s expansion across Asia, or whether environmental factors, such as climate, also had an impact.

To understand how a species of plant has evolved, researchers may collect specimens from the wild or from cultivated areas. Each group of plants of the same species they collect in a given location at a single point in time is known collectively as an accession. Ong et al. used a combination of genome sequencing, computational modelling and plant biology approaches to study more than 1,000 accessions of cultivated mungbean and trace the route of the crop’s expansion across Asia.

The data support the archaeological evidence that mungbean cultivation first spread from South Asia to Southeast Asia, then spread northwards to East Asia and afterwards to Central Asia. Computational modelling of local climates and the physical characteristics of different mungbean accessions suggest that the availability of water in the local area likely influenced the route. Specifically, accessions from arid Central Asia were better adapted to drought conditions than accessions from wetter South Asia. However, these drought adaptations decreased the yield of the plants, which may explain why the more drought tolerant accessions have not been widely grown in wetter parts of Asia.

This study shows that human activity has not solely dictated where mungbean has been cultivated. Instead, both human activity and the various adaptations accessions evolved in response to their local environments shaped the route the crop took across Asia. In the future these findings may help plant breeders to identify varieties of mungbean and other crops with drought tolerance and other potentially useful traits for agriculture.

Introduction

Domestication is a process in which a species is cultivated by humans, leading to associated genetic and morphological changes. These changes may be intentional, resulting from human selection, or unintentional, resulting from adaptation to the environments of cultivation (Fuller, 2007). Later, the cultivated plants spread out from their initial geographical range (Meyer and Purugganan, 2013), and elucidating the factors affecting the range expansion of crops is another focus of active research (Gutaker et al., 2020). In the Old World, during the process of ‘prehistoric food globalization’ (Jones et al., 2011), crops originating from distinct regions were transported and grown across Eurasia. Archeological evidence has shown that such ‘trans-Eurasian exchange’ had happened by 1500 BC (Liu et al., 2019), and the spread routes proposed by archeological studies were supported by modern genetic evidence, especially in rice (Gutaker et al., 2020) and barley (Lister et al., 2018). Interestingly, the spread may be accompanied by genetic changes for adaptation to novel environments. For example, in barley, variations in the gene Photoperiod-H1 (Ppd-H1) resulting in non-responsiveness to longer daylengths were likely associated with the historical expansion to high-latitude regions (Jones et al., 2008; Jones et al., 2016). While these mid-latitude cereals have been extensively studied, investigations of crops originating from other climate zones are needed. Using the South Asian (SA) legume mungbean as a test case, here, we investigate how climatic adaptation might affect the crop spread route and the evolutionary changes making such spread possible.

Mungbean (Vigna radiata [L.] Wilczek var. radiata), also known as green gram, is an important grain legume in Asia (Nair and Schreinemachers, 2020), providing carbohydrates, protein, folate, and iron for local diets and thereby contributing to food security (Kim et al., 2015). Among pulses, mungbean is capable of tolerating moderate drought or heat stress and has a significant role in rainfed agriculture across arid and semiarid areas (Pratap et al., 2019), which are likely to have increased vulnerabilities to climate change. Although there have been studies about the genetic diversity of cultivated and wild mungbean (Ha et al., 2021; Kang et al., 2014; Noble et al., 2018; Sangiri et al., 2007), the evolutionary history of cultivated mungbean after domestication still lacks genetic studies. Existing archeological evidence suggests that South Asia is the probable area of mungbean domestication, and at least two independent domestication events have been suggested, including Maharashtra and the eastern Harappan zone (Fuller and Harvey, 2006). The early archeological records suggest that the selection of large seed sizes occurred in the eastern Harappan zone by the third millennium BC and in Maharashtra, dating to the late second to early first millennium BC (Fuller and Harvey, 2006). This pulse later spread to mainland Southeast Asia and has been reported in southern Thailand dating to the late first millennium BC (Castillo et al., 2016). Further north, the earliest record of mungbean in China was from the book Qimin Yaoshu (齊民要術, 544 AD). While mungbean is also cultivated in Central Asia today, it was not identified in archaeobotanical evidence ranging from several millennia BC to the medieval period (Miller, 1999; Spengler et al., 2018b; Spengler et al., 2017), suggesting later arrival. While the archaeobotanical studies elucidated the route of mungbean cultivation range expansion, research is still needed to identify the genetic evidence and the factors shaping such a spread route.

A recent genetic study revealed that present-day cultivated mungbeans have the same haplotype in the promoter region, reducing the expression of VrMYB26a (Lin et al., 2022), a candidate gene controlling the important domestication trait, pod shattering, in several Vigna species (Takahashi et al., 2020). This suggests that the loss of the pod-shattering phenotype in cultivated mungbean may have a common origin and that, despite the archaeobotanical findings of several independent early cultivations of mungbean in South Asia (Fuller and Harvey, 2006), descendants from one of these cultivation origins might have dominated South Asia before the pan-Asia expansion. Since large regions remain archaeologically unexplored, genetic data can be a crucial complement for reconstructing crop evolutionary history. Using seed proteins (Tomooka et al., 1992) and isozymes (Dela Vina and Tomooka, 1994), previous studies proposed two expansion routes out of India, one in the south to Southeast Asia and the other in the north along the Silk Road to China. While later studies used DNA markers to investigate mungbean population structure (Breria et al., 2020; Gwag et al., 2010; Islam and Blair, 2018; Noble et al., 2018; Sandhu and Singh, 2021; Sangiri et al., 2007), few have examined these hypothesized routes in detail. Therefore, genomic examination of the cultivation range expansion proposed by archaeobotanical studies and the elucidation of its contributing factors are strongly needed.

In this study, we compiled an international effort, reporting a global mungbean diversity panel of more than 1100 accessions derived from (i) the mungbean mini-core collection of the World Vegetable Center (WorldVeg) genebank, (ii) the Australian Diversity Panel (ADP), and (iii) the Vavilov Institute (VIR), which hosts a one-century-old collection enriched with mid-latitude Asian accessions that are underrepresented in other genebanks, many of which were old landraces collected by Nikolai I. Vavilov and his teams in the early 20th century (Burlyaeva et al., 2019). These germplasms harbor a wide range of morphological variations (Figure 1A) and constitute the most comprehensive representation of worldwide mungbean genetic variation. We used this resource to investigate the global history of mungbean after domestication to reveal a spread route highly affected by climatic constraints across Asia, eventually shaping the phenotypic characteristics for local adaptation to distinct environments.

Figure 1. Diversity of worldwide mungbean.

(A) Variation in seed color. (B) ADMIXTURE ancestry coefficients, where accessions were grouped by group assignments (Q≥0.7). (C) Principal component analysis (PCA) plot of 1092 cultivated mungbean accessions. Accessions were colored based on their assignment to four inferred genetic groups (Q≥0.7), while accessions with Q<0.7 were colored gray. (D) Neighbor-joining (NJ) phylogenetic tree of 788 accessions with Q≥0.7 with wild mungbean as outgroup (black color).

Figure 1—figure supplement 1. Cross-validation (CV) errors of ADMIXTURE.

Means of CV errors were calculated for K values ranging from 1 to 10, with 10 independent runs.

Results

Population structure and spread of mungbean

Using DArTseq, we successfully obtained new genotype data of 290 mungbean accessions from VIR (Supplementary file 1a). Together with previous data (Breria et al., 2020; Noble et al., 2018), our final set included 1108 samples with 16 wild and 1092 cultivated mungbean. A total of 40,897 SNPs were obtained. Of these, 34,469 bi-allelic SNPs, with a missing rate less than 10%, were mapped on 11 chromosomes and retained for subsequent analyses.
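The missing-rate filter mentioned above can be illustrated with a minimal sketch. It assumes genotypes are stored as a samples × SNPs array with -1 marking a missing call; the function name `filter_snps`, the encoding, and the toy data are hypothetical and are not taken from the authors' pipeline.

```python
import numpy as np

def filter_snps(genotypes, max_missing=0.10):
    """Return a boolean mask of SNP columns whose missing rate is below max_missing.

    genotypes: (n_samples, n_snps) integer array; -1 marks a missing call
    (an assumed encoding for this sketch).
    """
    genotypes = np.asarray(genotypes)
    missing_rate = (genotypes == -1).mean(axis=0)
    return missing_rate < max_missing

# Toy data: 10 samples x 3 SNPs
toy = np.zeros((10, 3), dtype=int)
toy[0, 1] = -1    # SNP 1: 10% missing, not strictly below the threshold
toy[:3, 2] = -1   # SNP 2: 30% missing
keep = filter_snps(toy)
```

Only the first toy SNP passes, because the text specifies a missing rate strictly less than 10%.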

The genetic structure was investigated based on the 10,359 LD-pruned SNPs. Principal component analysis (PCA, Figure 1C) showed a triangular pattern of genetic variation among cultivated mungbeans, consistent with previous studies (Breria et al., 2020; Noble et al., 2018; Sokolkova et al., 2020) and ADMIXTURE K=3 (Figure 1B). The geographic distribution of these genetic groups is not random, as these three groups are distributed in South Asia (India and Pakistan), Southeast Asia (Cambodia, Indonesia, Philippines, Thailand, Vietnam, and Taiwan), and more northerly parts of Asia (China, Korea, Japan, Russia, and Central Asia). As K increased, the cross-validation (CV) error decreased only slightly after K=4 (Figure 1—figure supplement 1), where the north group could be further divided (Figure 1B). Therefore, worldwide diversity of cultivated mungbean could be separated into four major genetic groups corresponding to their geography: SA, Southeast Asian (SEA), East Asian (EA), and Central Asian (CA) groups. Note that the genetic groups were named after the region where most of their members are distributed, and exceptions exist. For example, many EA accessions are also distributed in Central Asia, and some SEA accessions were found near the eastern and northeastern coasts of India. Throughout this work, we make a clear distinction between genetic group names (e.g. SA) and geographic regions (e.g. South Asia). Therefore, unlike previous work in this species, this study incorporates global genetic variation of this important crop.
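The group assignment rule used throughout (an accession belongs to a genetic group when its ancestry coefficient Q reaches a threshold, otherwise it is treated as admixed) can be sketched as follows. The column order in `GROUPS`, the function name `assign_groups`, and the toy Q values are assumptions made for illustration only.

```python
import numpy as np

GROUPS = ["SA", "SEA", "EA", "CA"]  # column order is an assumption for this sketch

def assign_groups(q_matrix, threshold=0.7):
    """Assign each accession to the group with the largest ancestry coefficient,
    or to None (admixed) when that coefficient is below the threshold."""
    q = np.asarray(q_matrix, dtype=float)
    best = q.argmax(axis=1)
    best_q = q.max(axis=1)
    return [GROUPS[i] if v >= threshold else None for i, v in zip(best, best_q)]

toy_q = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.75, 0.10, 0.05],
    [0.05, 0.35, 0.55, 0.05],  # admixed at Q>=0.7, assigned to EA at Q>=0.5
]
labels_07 = assign_groups(toy_q, threshold=0.7)
labels_05 = assign_groups(toy_q, threshold=0.5)
```

Lowering the threshold from 0.7 to 0.5 only moves accessions from the admixed category into a group; it never changes the group of an accession that was already assigned.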

Using the wild progenitor V. radiata var. sublobata (Wild hereafter) as the outgroup, the accession- (Figure 1D) and population-level (Figure 2A) phylogenies both suggest CA to be genetically closest to EA. The SEA group is more distant, and SA is the most diverged. This relationship is supported by the outgroup f3 tests showing that CA shared the highest level of genetic drift with EA, followed by SEA and SA (Supplementary file 1b). Pairwise FST and dxy also give the same conclusion (Figure 2B). Similarly, the f4 tests (Figure 2C) strongly reject the cases where SEA and CA form a clade relative to SA and EA (f4[SA,EA;SEA,CA]=0.016, Z=9.519) or SEA and EA form a clade relative to SA and CA (f4[SA,CA;SEA,EA]=0.021, Z=13.956), again suggesting EA and CA to be closest. With regard to the relationship among Wild, SA, SEA, and EA, f4 tests suggest that SEA and EA form a clade relative to Wild and SA (non-significant results in f4[Wild,SA;EA,SEA] but significant results in other combinations). Notably, both TreeMix (Figure 2A) and the f4 test (Figure 2C, f4[SA,SEA;CA,EA]=0.005, Z=6.843) suggest gene flow between SEA and EA. Consistent with archeological evidence of SA domestication, the nucleotide diversity (π) decreased from SA (1.0×10⁻³) to SEA (7.0×10⁻⁴) and EA (5.0×10⁻⁴), while the CA group has the lowest diversity (3.0×10⁻⁴; Figure 2B). Linkage disequilibrium (LD) also decays the fastest in Wild and then the SA group (Figure 2D), followed by other genetic groups. In summary, all analyses are consistent with our proposed order of cultivated mungbean divergence.
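To make the logic of the f4 tests concrete, the sketch below computes f4(A,B;C,D) as the mean over SNPs of (pA − pB)(pC − pD) from population allele frequencies, with a Z-score from a simple delete-one-block jackknife. This is a simplified illustration under stated assumptions (equal-sized contiguous blocks, no sample-size correction, simulated toy frequencies); it is not the software or settings used in the study, and all function and variable names are hypothetical.

```python
import numpy as np

def f4(p_a, p_b, p_c, p_d):
    """Mean over SNPs of (pA - pB) * (pC - pD) for allele-frequency vectors."""
    p_a, p_b, p_c, p_d = (np.asarray(p, dtype=float) for p in (p_a, p_b, p_c, p_d))
    return float(np.mean((p_a - p_b) * (p_c - p_d)))

def f4_zscore(p_a, p_b, p_c, p_d, n_blocks=20):
    """Return (f4, Z) using a delete-one-block jackknife over contiguous SNP blocks."""
    freqs = np.vstack([p_a, p_b, p_c, p_d]).astype(float)
    estimate = f4(*freqs)
    blocks = np.array_split(np.arange(freqs.shape[1]), n_blocks)
    loo = np.array([f4(*np.delete(freqs, b, axis=1)) for b in blocks])
    g = len(blocks)
    se = np.sqrt((g - 1) / g * np.sum((loo - loo.mean()) ** 2))
    return estimate, estimate / se

# Simulated toy frequencies under the tree ((A, B), (C, D))
rng = np.random.default_rng(1)
n_snps = 2000
root = rng.uniform(0.2, 0.8, n_snps)
anc_ab = root + rng.normal(0, 0.05, n_snps)
anc_cd = root + rng.normal(0, 0.05, n_snps)
pop_a, pop_b = (np.clip(anc_ab + rng.normal(0, 0.02, n_snps), 0, 1) for _ in range(2))
pop_c, pop_d = (np.clip(anc_cd + rng.normal(0, 0.02, n_snps), 0, 1) for _ in range(2))

# Pairing A with C and B with D contradicts the simulated tree: f4 is clearly positive
est_wrong, z_wrong = f4_zscore(pop_a, pop_c, pop_b, pop_d)
# Pairing that matches the simulated tree: f4 is expected to be near zero
est_right, z_right = f4_zscore(pop_a, pop_b, pop_c, pop_d)
```

As in the text, a large |Z| for f4(A,B;C,D) rejects the topology in which A and B form a clade relative to C and D, while a value near zero is compatible with it.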

Figure 2. Fine-scale genetic relationship and admixture among four inferred genetic groups.

(A) TreeMix topologies with one suggested migration event. Colors on nodes represent support values after 500 bootstraps. (B) Diversity patterns within and between inferred genetic groups as estimated using nucleotide diversity (π in diagonal, where the size of the circle represents the level of π) and population differentiation (FST in upper diagonal and dxy in lower diagonal). (C) f4 statistics. Points represent the mean f4 statistic, and lines are the SE. Only f4 statistics with |Z-score|>3 are considered statistically significant. The dashed line denotes f4=0. (D) Linkage disequilibrium (LD) decay. (E) Isolation by distance plot of genetic distance versus geographic distance, with the southern group in red circles and the northern group in blue circles. (F) Relationship between Bio12 (annual precipitation) and nucleotide diversity (π) of the East Asian (EA) genetic group across the east-west axis of Asia. Dot colors represent the annual precipitation of each population.

Figure 2—figure supplement 1. Schematic representation to investigate presence of admixture in a target population from two source populations using admixture f3 statistic.

(A) f3(EA; SEA, CA), (B) f3(SEA; SA, EA), and (C) f3(CA; EA, SA). Colored circles indicate the geographic area occupied by distinct genetic groups. Arrows indicate the possible direction of expansion and admixture among populations.
Figure 2—figure supplement 2. Estimates of divergence time and inferred mungbean movement over time across Asia.

The histograms of the divergence times represent (A) split time between South Asian (SA) and (Southeast Asian [SEA], (East Asian [EA], Central Asian [CA])), (B) split time between SEA and (EA, CA), and (C) split time between EA and CA. (D) Geographic distribution of mungbean accessions and proposed mungbean spread routes. Exact locations for Vavilov Institute (VIR) accessions (filled circles) and Global Biodiversity Information Facility (GBIF) records (filled triangles) are provided. Each accession is colored according to its genetic group as inferred using ADMIXTURE in Figure 1. Arrows indicate the possible expansion directions. The map is shaded in gray to represent altitude (meters above sea level).

Our proposed demographic history could be confounded by factors such as complex hybridization among groups. For example, SEA and CA might have independently originated from SA and later generated a hybrid population in EA (Figure 2—figure supplement 1A). Other possibilities are that either SEA or CA is the hybrid of other populations (Figure 2—figure supplement 1B and C). We examined these possibilities using f3 statistics for all possible trios among the four groups. None of the tests gave a significantly negative f3 value (Supplementary file 1c), suggesting the lack of a strong alternative model to our proposed relationship among these four groups.
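The reasoning behind the admixture f3 test can be illustrated with a short sketch: f3(C; A, B) is the mean over SNPs of (pC − pA)(pC − pB), and it becomes negative when the target C has allele frequencies intermediate between the two sources. The version below omits the finite-sample correction used by dedicated software, and the function name and toy frequencies are hypothetical.

```python
import numpy as np

def f3(p_target, p_source1, p_source2):
    """Uncorrected f3(target; source1, source2) from allele-frequency vectors."""
    c, a, b = (np.asarray(p, dtype=float) for p in (p_target, p_source1, p_source2))
    return float(np.mean((c - a) * (c - b)))

src1 = [0.2, 0.8]
src2 = [0.6, 0.4]
mixed = [0.4, 0.6]     # exact 50:50 mixture of src1 and src2
drifted = [0.9, 0.1]   # not intermediate between the two sources

f3_mixed = f3(mixed, src1, src2)      # negative: signature of admixture
f3_drifted = f3(drifted, src1, src2)  # positive: no evidence of admixture
```

A non-negative f3 does not prove the absence of admixture, since drift in the target population after mixing can mask the signal; it only means the test provides no support for the hybrid-origin model.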

Based on the solid relationship among these genetic groups, we used fastsimcoal2 to model their divergence time, allowing population size change and gene flow at all time points (Figure 2—figure supplement 2A–D). According to this model, after initial domestication, the out-of-India event (when other groups diverged from SA) happened about 8.3 thousand generations ago (kga) with 75% parametric bootstrap range between 4.7 and 11.3 kga. Not until more than 5000 generations later (2.7 kga, 75% range 1.1–4.6 kga) did SEA diverge from the common ancestor of present-day EA and CA. CA diverged from EA only very recently (0.2 kga, 75% range 0.1–0.8 kga). Note that the divergence time was estimated in the number of generations, and the much longer growing seasons in the southern parts of Asia may allow more than one cropping season per year (Mishra et al., 2022; Vir et al., 2016).

Our results suggest the non-SA accessions have a common origin out of India (otherwise these groups would branch off independently from the SA group). Given this, the phylogenetic relationship (Figure 2A) is consistent with the following hypotheses. (1) The east hypothesis: mungbean expanded eastward and gave rise to the SEA group. This group might initially occupy northeast South Asia and later expanded to Southeast Asia either through the land or maritime route (Castillo et al., 2016; Fuller et al., 2011). The group later expanded northward as EA. EA expanded westward into Central Asia and gave rise to the CA group. (2) The north hypothesis: the group leaving South Asia first entered Central Asia as the EA group. EA expanded eastward into East Asia through the Inner Asian Mountain Corridor (Stevens et al., 2016). The eastern population of EA expanded southward as the SEA group, and later the western population of EA diverged as the CA group. (3) The northeast hypothesis: the group leaving South Asia (through either of the above-mentioned routes) was first successfully cultivated in northern East Asia without previously being established in Southeast Asia or Central Asia. The EA group then diverged southward as SEA and later expanded westward, giving rise to CA. Consistent with this model, the genetic variation of the EA group gradually declines from east to west, accompanied by the gentlest decline of precipitation per unit geographic distance across Asia (Figure 2F).

While all three hypotheses are consistent with the phylogeny (Figure 2A), the SEA group originated earlier than EA in the east hypothesis but later in the two other hypotheses. The former case predicts higher nucleotide diversity and faster LD decay in SEA than EA, which is supported by our results (Figure 2B and D). While populations that were established in a region for an extended time could accumulate genetic differentiation, generating patterns of isolation by distance, rapid-spreading populations in newly colonized regions could not (Lee et al., 2017; The 1001 Genomes Consortium, 2016). Following this idea, Mantel’s test revealed a significantly positive correlation between genetic and geographic distances for the SA genetic group (r=0.466, P=0.010), followed by SEA (r=0.252, although not as significant, P=0.069). No such association was found for EA (r=0.030, P=0.142) or CA (r=0.087, P=0.172). In addition, the southern groups (SA and SEA) together (r=0.737, P=0.001) have a much stronger pattern of isolation by distance than the northern groups (EA and CA, r=0.311, P=0.001; Figure 2E). Using Q≥0.5 instead of Q≥0.7 to assign individuals into genetic groups generated results that are largely consistent (Supplementary file 1d). These results are again consistent with the ‘east hypothesis’ that local accessions from the SA and SEA groups were established much earlier than those from EA and CA. Finally, the genetic variation of the EA group is highest in the eastern end and declines westward (Figure 2F). This does not support the north hypothesis where EA first existed in Central Asia and expanded eastward.
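The isolation-by-distance analysis rests on Mantel's test, which correlates the off-diagonal entries of two distance matrices and assesses significance by jointly permuting the rows and columns of one of them. The following is a minimal one-sided sketch under assumed settings (Pearson correlation, 999 permutations, simulated coordinates); it is not the implementation or the parameters used in the study.

```python
import numpy as np

def mantel_test(dist_x, dist_y, n_perm=999, seed=0):
    """One-sided Mantel test: Pearson r between the upper triangles of two
    distance matrices, with a permutation P-value for r_perm >= r_obs."""
    x = np.asarray(dist_x, dtype=float)
    y = np.asarray(dist_y, dtype=float)
    n = x.shape[0]
    upper = np.triu_indices(n, k=1)
    r_obs = np.corrcoef(x[upper], y[upper])[0, 1]
    rng = np.random.default_rng(seed)
    n_extreme = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        y_perm = y[np.ix_(perm, perm)]  # permute rows and columns together
        if np.corrcoef(x[upper], y_perm[upper])[0, 1] >= r_obs:
            n_extreme += 1
    return r_obs, (n_extreme + 1) / (n_perm + 1)

# Toy example: genetic distance tracks geographic distance with some noise
rng = np.random.default_rng(42)
coords = rng.uniform(0, 100, size=(12, 2))
geo = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2))
noise = rng.normal(0, 5, size=geo.shape)
gen = geo + (noise + noise.T) / 2   # keep the matrix symmetric
np.fill_diagonal(gen, 0.0)
r, p = mantel_test(geo, gen)
```

Permuting whole rows and columns, rather than individual matrix entries, preserves the dependence among distances that share an accession, which is why the test is preferred over an ordinary correlation test here.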

Environmental differentiation of the inferred genetic groups

We further examined the possible causes governing the expansion of mungbean cultivation ranges. For a crop to be successfully cultivated in a new environment, dispersal and adaptation are both needed. Because mungbean is a crop that has lost the ability of pod shattering, its spread was governed by commerce or seed exchange. While barriers such as the Himalayas or Hindu Kush may limit human activity, South and Central Asia were already connected by a complex exchange network linking the north of Hindu Kush, Iran, and the Indus Valley as early as about 4 thousand years ago (kya; Dupuy, 2016; Kohl, 2007; Kohl and Lyonnet, 2008; Lamberg‐Karlovsky, 2002; Lombard, 2020; Lyonnet, 2005), and some sites contain diverse crops originating across Asia (Spengler et al., 2021). Similarly, other ancient land or maritime exchange routes existed among South, Southeast, East, and Central Asia (Stevens et al., 2016). This suggests that mungbean could have been transported from South to Central Asia, but our genetic evidence suggests that the present-day CA group did not descend directly from the SA group. Therefore, we investigated whether climatic adaptation, that is, the inability of mungbean to establish in a geographic region after human-mediated long-range expansion, could be a contributing factor.

Multivariate ANOVA (MANOVA) of eight bioclimatic variables (after removing highly-correlated ones; Supplementary file 1e,f) indicated strong differentiation in the environmental niche space of the four genetic groups (Supplementary file 1g,h). PCA of climatic factors clearly reflects geographic structure, where the axis explaining most variation (PC1, 42%) separates north and south groups and is associated with both temperature- and precipitation-related factors (Figure 3A and Supplementary file 1i). Consistent with their geographic distribution, overlaps between EA and CA and between SA and SEA were observed. While these analyses were performed using bioclimatic variables from year-round data, we recognized that summer is the cropping season in the north. Parallel analyses using the temperature and precipitation of May, July, and September yielded similar results (Supplementary file 1j; Figure 3—figure supplement 1).
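The PCA of bioclimatic variables and the reported share of variation per axis can be sketched as below. The sketch standardizes each variable and uses a singular value decomposition; the standardization choice, the function name `pca_explained`, and the simulated 50 × 8 matrix are assumptions for illustration and do not reproduce the study's data or software.

```python
import numpy as np

def pca_explained(data):
    """PCA on standardized columns. Returns (scores, loadings, explained),
    where explained[i] is the fraction of total variance on component i."""
    x = np.asarray(data, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return u * s, vt, explained

# Toy data: 50 accessions x 8 climate variables sharing one latent gradient
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 1))
toy_climate = latent @ rng.normal(size=(1, 8)) + 0.5 * rng.normal(size=(50, 8))
scores, loadings, explained = pca_explained(toy_climate)
```

Standardizing first keeps variables measured in different units (for example, degrees and millimeters) from dominating the leading axis purely because of their scale.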

Figure 3. Environmental variation among genetic groups of mungbean.

(A) Principal component analysis (PCA) of the eight bioclimatic variables. Samples are colored according to four inferred genetic groups as indicated in the legend. (B) Predicted distribution at current climate conditions. Red color indicates high suitability, and blue indicates low suitability. Values between pairs represent niche overlap measured using Schoener’s D, and higher values represent higher overlaps. Abbreviations: SAw: South Asia (west), SAe: South Asia (east); SEA: Southeast Asia; EAe: East Asia (east); EAw: East Asia (west), and CA: Central Asia. (C) Environmental gradient across potential directions of expansion. The value on each arrow indicates a change in annual precipitation per kilometer. The background map is colored according to annual precipitation (Bio12, in mm).

Figure 3—figure supplement 1. Principal component analysis (PCA) of the growing season climatic data including temperature and precipitation of May, July, and September.

Samples are colored according to four inferred genetic groups, as indicated in the legend.
Figure 3—figure supplement 2. The distribution of accessions in major climate zones according to the Köppen climate classification (Köppen, 2011).

Figure 3—figure supplement 3. Predicted distributions of six groups based on monthly temperature and precipitation (May, July, and September) during the summer growing season.

Red color indicates high suitability, and blue indicates low suitability. Values between groups represent niche overlap measured using Schoener’s D. Abbreviations: SAw: South Asia (west), SAe: South Asia (east); SEA: Southeast Asia; EAe: East Asia (east); EAw: East Asia (west); and CA: Central Asia.
Figure 3—figure supplement 4. Monthly temperature and precipitation variations among the four genetic groups.

Monthly (A) maximum temperature, (B) minimum temperature, (C) mean temperature, and (D) precipitation were computed based on the median value among all accessions of a group. Genetic groups were colored the same as in Figure 1.
Figure 3—figure supplement 5. Environmental gradient across Asia.

The value on each arrow indicates a change in mean precipitation for May, July, and September (growth season) per kilometer. The background map is colored according to summer precipitation (Bio18, precipitation of warmest quarter, in mm).

Based on the Köppen climate classification (Köppen, 2011), we categorized the Asian mungbean cultivation range into six major climate zones (Figure 3—figure supplement 2): dry hot (BSh and BWh), dry cold (BSk and BWk), temperate dry summer (Csa), tropical savanna (Aw), continental (Dwb and Dfb), and temperate wet summer (Cfa and Cwa). The former three are relatively drier than the latter three zones. While SEA and CA are relatively homogeneous, SA and EA each have their samples split about evenly between the dry and non-dry zones (Figure 3—figure supplement 2). We, therefore, separated SA into SAe and SAw and EA into EAe and EAw, corresponding to the wetter eastern and drier western regions within the SA and EA ranges. Environmental niche modeling revealed distinct suitable regions of these six groups except for CA and EAw, whose geographical ranges largely overlap (Figure 3B). Pairwise Schoener’s D values are smallest between the northern and southern groups and largest (suggesting overlaps of niche space) between the eastern and western subsets within north and south (Figure 3B), consistent with the PCA result that the major axis of climatic difference is between the northern and southern parts of Asia. Analyses using temperature and precipitation from May, July, and September yielded similar results (Figure 3—figure supplement 3). Given a single out-of-India event (Figure 2A), the results suggest it might be easier to first cultivate mungbean in Southeast rather than Central Asia, supporting the east hypothesis.
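Schoener's D, the niche overlap measure reported between pairs of groups, is defined as D = 1 − ½ Σ|p1ᵢ − p2ᵢ|, where p1 and p2 are the two suitability surfaces normalized to sum to one over the same grid cells. A minimal sketch follows; the function name and the four-cell toy grids are hypothetical.

```python
import numpy as np

def schoeners_d(suitability_1, suitability_2):
    """Schoener's D between two suitability maps on the same grid.
    Returns 0 for no overlap and 1 for identical normalized surfaces."""
    p1 = np.asarray(suitability_1, dtype=float).ravel()
    p2 = np.asarray(suitability_2, dtype=float).ravel()
    p1 = p1 / p1.sum()
    p2 = p2 / p2.sum()
    return float(1.0 - 0.5 * np.abs(p1 - p2).sum())

d_identical = schoeners_d([1, 1, 0, 0], [2, 2, 0, 0])  # same shape, different scale
d_partial = schoeners_d([1, 1, 0, 0], [0, 1, 1, 0])    # half of the mass is shared
d_disjoint = schoeners_d([1, 1, 0, 0], [0, 0, 1, 1])   # no shared cells
```

Because both maps are normalized before comparison, D reflects how similarly suitability is distributed in space rather than the absolute suitability scores.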

While both temperature and precipitation variables differ strongly between north and south, one should note that these year-round temperature variables do not correctly reflect conditions in the growing seasons. In the north, mungbean is mostly grown in summer, when the temperature is close to that in the south (Figure 3—figure supplement 4A–C). On the other hand, precipitation differs drastically between the north and south, especially for the CA group, where the summer growing season is the driest of the year (Figure 3—figure supplement 4D). By estimating the regression slope of annual precipitation on geographical distance, we obtained a gradient of precipitation change per unit geographic distance between pairs of genetic groups (Figure 3C). Although the SA-SEA transect has the steepest gradient (slope = 0.21), the spread from SA to SEA was accompanied by an increase in precipitation and did not impose drought stress. However, the second highest slope (0.18) is associated with a strong precipitation decrease if the SA group were to disperse to Central Asia. Results from the precipitation of May, July, and September yielded a similar conclusion (Figure 3—figure supplement 5). This likely explains why no direct historic spread is observed from South to Central Asia.
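The precipitation gradient described above is an ordinary least-squares slope of precipitation on distance along a transect. The sketch below uses a signed slope (positive when precipitation increases along the direction of travel), which is an assumed convention; the toy transects reuse the magnitudes 0.21 and 0.18 from the text purely for illustration, and the intercept and distances are invented.

```python
import numpy as np

def precipitation_gradient(distance_km, precipitation_mm):
    """Least-squares slope of precipitation on distance (mm per km)."""
    slope, _intercept = np.polyfit(distance_km, precipitation_mm, deg=1)
    return float(slope)

distance = np.linspace(0, 3000, 31)   # hypothetical transect, in km
wetter = 900 + 0.21 * distance        # precipitation rises along the route
drier = 900 - 0.18 * distance         # precipitation falls along the route
slope_wetter = precipitation_gradient(distance, wetter)
slope_drier = precipitation_gradient(distance, drier)
```

Keeping the sign makes the argument in the text explicit: two transects can have similarly steep gradients while only the one with a negative slope exposes the crop to increasing drought.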

Trait variation among genetic groups

If environmental differences constrained the spread route of mungbean, the currently cultivated mungbean accessions occupying distinct environments should have locally adaptive traits for these environments. Indeed, PCA of four trait categories shows substantial differences among genetic groups (phenology, reproductive output, and size in field trials, as well as plant weight in lab hydroponic systems, Figure 4A). In the field, CA appears to have the shortest time to flowering, the lowest yield in terms of seed size and pod number, and the smallest leaf size (Figure 4B and Supplementary file 1k). On the other hand, SEA accessions maximize seed size, while SA accessions specialize in developing the largest number of pods (Figure 4B). These results suggest that CA has a shorter crop duration, smaller plant size, and less yield, consistent with drought escape phenotypes. This is consistent with the northern short-growing season constrained by temperature and daylength (below), as well as the low precipitation during the short season.

Figure 4. Quantitative trait differentiation among genetic groups.

(A) Principal component analysis (PCA) of four trait categories. (B) Trait variability from common gardens in field experiments. Sample sizes of SA, SEA, and CA are 18, 17, and 14, respectively. (C) Comparison of QST-FST for four drought-related traits under two environments. FST values (mean, 5%, and 1%) were indicated by black dashed lines. The QST for each trait was colored according to treatment and was calculated as Equation 2 in Materials and methods. Abbreviations: RDW: root dry weight; SDW: shoot dry weight; TDW: total dry weight; RSRDW: root:shoot ratio dry weight; c: control; p: PEG6000. (D) Effect of PEG6000 (–0.6 MPa) on RDW, SDW, TDW, and RSRDW among genetic groups. Sample sizes of SA, SEA, and CA are 20, 18, and 14, respectively. Data were expressed as the mean ± SE. Lowercase letters denote significant differences under Tukey’s honestly significant difference test in (B) and (D).

Figure 4—figure supplement 1. Comparison of QST-FST for four drought-related traits under two environments.

FST values (mean, 5% and 1%) were indicated by black dashed lines. The QST for each trait was colored according to treatment and was calculated as Equation 1 in Materials and methods. Abbreviations: RDW: root dry weight; SDW: shoot dry weight; TDW: total dry weight; RSRDW: root:shoot ratio dry weight; c: control; p: PEG6000.

In terms of seedling response to drought stress, the QST values of most traits (root, shoot, and whole plant dry weights under control and drought treatments) are higher than the tails of SNP FST, suggesting trait evolution driven by divergent selection (Figure 4C; Figure 4—figure supplement 1). Significant treatment, genetic group, and treatment-by-group interaction effects were observed except on a few occasions (Table 1). Consistent with field observation, SEA has the largest seedling dry weight (Figure 4D). While simulated drought significantly reduced shoot dry weight (SDW) for all groups, the effect on SEA is especially pronounced (treatment-by-group interaction effect, F(2,575) = 23.55, P<0.001, Table 1 and Figure 4D), consistent with its native habitats with abundant water supply (Figure 3—figure supplement 4D and Supplementary file 1l). All groups react to drought in the same way by increasing root:shoot ratio (Figure 4D), suggesting such plastic change may be a strategy to reduce transpiration. Despite the lack of treatment-by-group interaction (F(2,575) = 1.39, P>0.05), CA consistently exhibits a significantly higher root:shoot ratio, a phenotype that is potentially adaptive to its native environment of lower water supply (Figure 3—figure supplement 4D and Supplementary file 1l).
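For readers unfamiliar with the QST-FST comparison, the sketch below shows how QST can be computed from between-group and within-group variance components. Equations 1 and 2 of Materials and methods are not reproduced in this section, so the two textbook forms shown here, V_B/(V_B + 2V_W) and V_B/(V_B + V_W), are assumptions and may not match the authors' exact equations. The balanced one-way ANOVA estimator, the function names, and the toy trait values are likewise hypothetical.

```python
import numpy as np

def variance_components(groups):
    """Between- and within-group variance components from a balanced
    one-way ANOVA (every group must have the same number of values)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    grand_mean = np.mean(np.concatenate(groups))
    ms_between = n * sum((g.mean() - grand_mean) ** 2 for g in groups) / (k - 1)
    ms_within = sum(((g - g.mean()) ** 2).sum() for g in groups) / (k * (n - 1))
    v_between = max((ms_between - ms_within) / n, 0.0)
    return v_between, ms_within

def qst(groups, within_multiplier=2.0):
    """QST = V_B / (V_B + m * V_W); m = 2 and m = 1 are two common forms
    (which one applies to the study is an assumption, not taken from the text)."""
    v_b, v_w = variance_components(groups)
    return v_b / (v_b + within_multiplier * v_w)

toy_trait = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # three groups, three accessions each
qst_m2 = qst(toy_trait, within_multiplier=2.0)
qst_m1 = qst(toy_trait, within_multiplier=1.0)
```

A QST value above the upper tail of the neutral FST distribution, as reported for most traits here, indicates more trait differentiation among groups than drift alone would be expected to produce.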

Table 1. ANOVA F values for the dry weight (mg) of mungbean seedlings across three different genetic groups.

Source of variation | Degrees of freedom (df) | Root dry weight | Shoot dry weight | Total dry weight | Root:shoot ratio dry weight
--- | --- | --- | --- | --- | ---
Treatment | 1 | 2.65 n.s. | 133.26*** | 72.26*** | 978.76***
Genetic group | 2 | 60.63*** | 79.62*** | 76.54*** | 13.27***
Treatment × Genetic group | 2 | 3.29* | 23.55*** | 17.79*** | 1.39 n.s.

*P<0.05 and ***P<0.001; n.s. non-significant.

Support from ancient Chinese sources

Mungbean has been occasionally mentioned in ancient Chinese sources. Here, we report the records associated with our proposed mungbean spread route and the underlying mechanisms. The ‘Classic of Poetry’ (Shijing 詩經) contains poems dating between the 11th and 7th centuries BCE near the lower and middle reaches of the Yellow River. While crops (especially soybean, 菽), vegetables, and many other plants have been mentioned, mungbean was not recorded. This is consistent with our results that mungbean had not reached the northern parts of East Asia at that time (the EA group diverged from the SEA group at around 2.7 kga). The first written record of mungbean in China is in the agricultural encyclopedia Qimin Yaoshu (齊民要術, 544 AD, Chinese text and translation in Supplementary note), whose spatiotemporal background (~1.5 kya near the lower reaches of the Yellow River) is again consistent with our estimated origin of the EA group.

Our results suggest that the expansion of the mungbean cultivation range may be associated with the novel phenotypic characteristics potentially adaptive to the new environments. This proposal would be rejected if the novel phenotypic characteristics appeared very recently. In support of our proposal, Xiangshan Yelu (湘山野錄, an essay collection during 1068–1077 AD) recorded that mungbean from the southern parts of Asia had higher yield and larger grains than those in northern China (Chinese text and translation in Supplementary note). Similarly, Tiangong Kaiwu (天工開物, 1637 AD) mentioned that mungbean must be sown during July and August (Chinese text and translation in Supplementary note). The record suggests that the daylength requirement restricts the sowing period of mungbean in the north. Together with the dry summer (Figure 3—figure supplement 4D) and soon-arriving autumn frost, there might be a strong selection favoring accessions with the rapid life cycle. These records suggest the phenotypic characteristics of northern accessions did not originate very recently, and the unique distribution of climatic zones in Asia resulted in not only the specific patterns of expansion but also the evolution of novel phenotypic characteristics in mungbean.

Discussion

Using mungbean as a test case, we combined population genomics, environmental niche modeling, empirical field and laboratory investigation, and ancient Chinese text analyses to demonstrate the importance of climatic adaptation in dictating the unique patterns of cultivation range expansion after domestication. In this study, we focus on how or when mungbean could be established as part of local agriculture throughout Asia. We showed that after leaving South Asia, mungbean was likely first cultivated in Southeast Asia, East Asia, and finally Central Asia. We acknowledge that our data do not allow us to specify the number of previous out-of-India events that did not leave traces in modern genetic data or their exact routes (e.g. whether mungbean expanded from South to Southeast Asia through the land or maritime routes). While there might be multiple attempts to bring mungbean out of India as a commodity for consumption, our results suggest all present-day non-SA accessions have a common out-of-India origin.

The climate-driven spread route despite historical human activities

Combining archeological records, population genetics, and niche modeling (Figures 2 and 3), our results suggest that after the early cultivation of mungbean in northwestern or southern South Asia (Fuller, 2007; Kingwell-Banham et al., 2015), the large environmental difference may have restricted its northward spread to Central Asia. Mungbean may have first spread to eastern South Asia, and the subsequent expansion to Southeast Asia might have been facilitated by the environmental similarity between these two regions. This is supported by archaeobotanical remains from the Thai-Malay Peninsula dated to ca. 400–100 BCE (Castillo et al., 2016). It took more than 5000 generations until mungbean further spread to northeast Asia, again likely due to the environmental difference. The later appearance of mungbean in northern China is also supported by historical records. After that, the EA group spread across the northern part of Asia within a few thousand generations. Our proposed route suggests that mungbean reached Central Asia last, consistent with its absence from archeological sites in Central Asia, including Turkmenistan and Uzbekistan in the Chalcolithic and Bronze ages (fifth to second millennium BC; Miller, 1999), Southeastern Kazakhstan in the Iron age dating to the first millennium BC (Spengler et al., 2017), and eastern Uzbekistan during the medieval period (800–1100 AD; Spengler et al., 2018b). In addition, mungbean was only mentioned as late as the 18th and early 19th centuries as a pulse grown in the Khiva region of Uzbekistan (Annanepesov and Bababekov, 2003).

In this study, we suggest that the ability to disperse may not be an essential factor restricting mungbean spread from South to Central Asia. Cultivated mungbean has lost the natural ability of pod shattering to disperse seeds, and it mostly traveled through landscapes by human-mediated seed exchange or commerce. Evidence of long-distance human-mediated dispersal of mungbean is available. For example, mungbean seeds have been found near the Red Sea coast of Egypt during the Roman (AD 1–250) period (Van der Veen and Morales, 2015). As early as about 4 kya, the Bactria–Margiana Archaeological Complex civilization north of the Hindu Kush had extensive contact with the Indus Valley Civilization (Dupuy, 2016; Kohl, 2007; Kohl and Lyonnet, 2008; Lamberg‐Karlovsky, 2002; Lombard, 2020; Lyonnet, 2005). By 1500 BC, the ‘Trans-Eurasian Exchanges’ of major cereal crops had happened (Liu et al., 2019). The frequent crop exchange is evidenced by archaeobotanical findings in the Barikot site (ca. 1200 BC-50 AD) in northern Pakistan (Spengler et al., 2021), where diverse crops were cultivated, including those from West Asia (wheat, barley, pea, and lentil), South Asia (urdbean/mungbean), and likely East Asia (rice). Despite this, in Bronze-age archeological sites north of the Hindu Kush, legumes (such as peas and lentils) were observed to a lesser extent than cereals, and SA crops were not commonly found (Jeong et al., 2019; Spengler, 2015; Spengler et al., 2014a; Spengler et al., 2018a; Spengler et al., 2014b). Interestingly, archeologists suggested that the higher water requirement of legumes compared with cereals may be associated with this pattern, and that the role of pea and lentil as winter crops in Southwest Asia may be associated with their earlier appearance in northern Central Asia than other legumes (Spengler et al., 2014a; Spengler et al., 2018a; Spengler et al., 2014b). Therefore, despite the possibility of human-mediated seed dispersal between South and Central Asia, our results and archeological evidence concur that mungbean arrived in Central Asia last, likely restricted by environmental adaptation.

Local adaptation of mungbean genetic groups

Despite the profound impact of human-mediated dispersal on the spread of these and many other crops (Herniter et al., 2020; Kistler et al., 2018), in mungbean, we suggest adaptation to distinct climatic regimes to be an important factor in the establishment after dispersal. Mungbean is commonly grown under rainfed cultivation and depends on the residual moisture in the fields after the primary crop, thus responding to water stress (Douglas et al., 2020). In the south, a temperature range of 20–30°C and annual precipitation of 600–1000 mm is optimal for mungbean (Ha and Lee, 2019). In Central Asia, however, the annual precipitation could be as low as 286 mm, far below the lower limit required for the southern mungbean. This situation could be further exacerbated by the fact that mungbean might not be a highly valued crop under extensive care during cultivation. Indeed, the earliest record of mungbean in China (Qimin Yaoshu 齊民要術, 544 AD) emphasizes its use as green manure. In Central Asia, mungbean is a minor crop (Rani et al., 2018) grown with little input, only in the short duration between successive planting of main crops (which is also the dry season in Central Asia, Supplementary file 1 and Figure 3—figure supplement 4) and using residual soil moisture with little irrigation. We suggest that the lack of extensive input subjects mungbean to more substantial local climatic challenges than highly valued high-input crops that receive intensive management, including irrigation. Therefore, the combination of climatic constraints and cultural usage, instead of physical barriers, may have shaped the historical spread route of the mungbean despite extensive human activities across the continent.

In addition to the constraint of soil moisture, other factors may have contributed to the selection of short-season accessions in the north. In the short summer seasons of much of Central Asia, short crop cycling is a requirement. In Uzbekistan, mungbean is often sown in early July after the winter wheat season and harvested before mid-October to avoid delays in the next round of winter wheat and escape frost damage. Therefore, fast-maturing accessions are essential for this production system (Rani et al., 2018). Similar rotation systems using mungbean to restore soil fertility during the short summer season after the harvest of the main crop were also mentioned in ancient Chinese sources (Chen, 1980). Mungbean is a short-day species from the south, and daylength likely limits the window when mungbean could be grown in the north: Chinese texts during the 17th century (Tiangong Kaiwu 天工開物, 1637 AD) specifically mentioned the suitable duration to sow mungbean to control the flowering behavior for maximum yield (Supplementary note). Therefore, unlike in the south where yield appears to be an important selection target, the unique combination of daylength, agricultural practices, soil water availability, and frost damage in the north requires the selection for short-season accessions, likely limiting the direct adoption of southern accessions in the north. Consistent with this, CA accessions have a faster life cycle potentially adaptive to both short growing season and reduced soil water availability, with reduced plant size and lower yield as tradeoffs. These accessions also have increased root:shoot ratio for drought adaptation, similar to findings in rice (Xu et al., 2015), alfalfa (Zhang et al., 2018), and chickpea (Kumar et al., 2012).

Regarding accession sampling and climatic niche modeling, we recognize that not all samples have available spatial data, and we do not have samples from some parts of Asia. For example, while most samples of the SEA group were collected from Taiwan, Thailand, and the Philippines, we do not have many samples from the supposed contact zone between SA and SEA (Bangladesh and Myanmar) or between SEA and EA (southern China). If more samples were available from these contact zones, the modeled niche space between SA and SEA and between SEA and EA would be even more similar than the current estimate, strengthening our hypothesis that niche similarity might facilitate the cultivation expansion. On the other hand, clear niche differentiation between SA and CA was evident despite the dense sampling near their contact zone. Based on the Köppen climate classification, South Asia could be roughly separated into two major zones, with the eastern zone slightly more similar to Southeast Asia (Figure 3—figure supplement 2). This partially explains the existence of some SEA accessions on the northeastern coast of India. While the SEA genetic group was named after the geographic region where most of its members are found at present, we recognize the possibility that it first occupied northeastern South Asia when it diverged from SA. In that case, the SA-SEA divergence time (4.7–11.3 kga) might indicate the divergence between the two climate zones within South Asia rather than the expansion of mungbean into Southeast Asia, which may have occurred much later.

Conclusion

Our study demonstrates that mungbean’s cultivation range expansion is associated with climatic conditions, which shaped the genetic diversity and contributed to adaptive differentiation among genetic groups. The climatic differences likely also resulted in farmers’ differential emphasis on using it mainly as a grain or green manure crop, further intensifying the phenotypic diversification among regional mungbean accessions that could be used as an invaluable genetic resource for genetic improvement in the future.

Materials and methods

Plant materials and SNP genotyping

A total of 290 cultivated mungbean (V. radiata var. radiata) accessions were provided by the VIR. Most of the accessions are landraces collected during 1910–1960 and are considered the oldest cultivated mungbean collection from VIR (Burlyaeva et al., 2019). The term landrace, as we use it here, refers to locally adapted accessions coming from the countries traditionally cultivating them, which also lack modern genetic improvement. The complete list of materials can be found in Supplementary file 1a. Genomic DNA was extracted from a single plant per accession using the QIAGEN Plant Mini DNA kit according to the manufacturer’s instructions, with the minor modifications of pre-warming the AP1 buffer to 65°C and increasing the incubation time of the P3 buffer up to 2 hr on ice to increase DNA yield. DNA samples were sent to Diversity Arrays Technology Pty Ltd, Canberra, Australia for diversity array technology sequence (DArTseq) genotyping.

DArTseq data of 521 accessions from the ADP (Noble et al., 2018) and 297 accessions from the WorldVeg mini-core (Breria et al., 2020) were also included in this study. In total, our dataset contains more than 1000 accessions (1092) and covers worldwide diversity of cultivated mungbean representing a wide range of variation in seed color (Figure 1A). Sixteen wild mungbean (V. radiata var. sublobata) accessions were included as an outgroup. While all accessions used in this study have the country of origin information, only those from VIR have detailed longitude and latitude information. Therefore, for analyses connecting genetic information and detailed location (the isolation by distance analyses), only the VIR samples were used.

The major goal of this study is to investigate the patterns of population expansion and the underlying ecological causes instead of detailed haplotype analyses of specific genomic regions. For this goal, genome-wide SNPs provide similar information to whole-genome sequencing, as has been shown in other species. Compared to other genotyping-by-sequencing technologies, DArTseq has the additional advantage of less missing data among loci or individuals, providing a more robust estimation of population structure.

SNP calling

Trimmomatic version 0.38 (Bolger et al., 2014) was used to remove adapters based on the manufacturer’s adapter sequences. Reads for each accession were trimmed for low-quality bases with quality scores of Q≤10 using SolexaQA version 3.1.7.1 (Cox et al., 2010) and mapped to the mungbean reference genome (Vradiata_ver6, Kang et al., 2014) using the Burrows-Wheeler Aligner version 0.7.15 (Li and Durbin, 2009). Reads were then sorted and indexed using samtools version 1.4.1 (Li et al., 2009). We used Genome Analysis Toolkit (GATK) version 3.7-0-gcfedb67 (McKenna et al., 2010) to call all sites, including variant and invariant sites. We obtained 1,247,721 sites with a missing rate of <10% and a minimum quality score of 30. SNP calling was performed using GATK (McKenna et al., 2010). Finally, we used VCFtools version 0.1.13 (Danecek et al., 2011) to remove SNPs with more than two alleles or more than 10% missing data, resulting in 34,469 filtered SNPs. To reduce non-independence caused by LD among SNPs, SNPs were pruned based on a 50-SNP window with a step of five SNPs and an r² threshold of 0.5 in PLINK (Purcell et al., 2007). This dataset of 10,359 LD-pruned SNPs (up to 10% missing data) was used for all analyses related to population genomics unless otherwise noted. For TreeMix, which requires LD-pruned SNPs with no missing data, we used 4396 LD-pruned SNPs with no missing data.
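The window-based LD pruning described above can be illustrated with a minimal sketch. It is not PLINK's implementation: the function names, the 0/1/2 genotype coding with `None` for missing calls, and the rule of dropping the later SNP of a correlated pair are all assumptions made for illustration.

```python
from itertools import combinations

def r_squared(x, y):
    """Squared Pearson correlation of two genotype vectors (0/1/2, None = missing)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic among the jointly called samples
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    return sxy * sxy / (sxx * syy)

def ld_prune(genotypes, window=50, step=5, r2_max=0.5):
    """Return indices of SNPs kept after greedy sliding-window pruning.

    genotypes: list of SNPs in genome order, each a list of per-sample codes.
    Hypothetical simplification: the later SNP of a correlated pair is dropped.
    """
    removed = set()
    n_snps = len(genotypes)
    start = 0
    while start < n_snps:
        idx = [i for i in range(start, min(start + window, n_snps))
               if i not in removed]
        for i, j in combinations(idx, 2):
            if i in removed or j in removed:
                continue
            if r_squared(genotypes[i], genotypes[j]) > r2_max:
                removed.add(j)
        start += step
    return [i for i in range(n_snps) if i not in removed]
```

With the default arguments the sketch mirrors the 50-SNP window, five-SNP step, and r² threshold of 0.5 stated in the text, but the set of retained SNPs will generally differ from PLINK's output.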

Population genetics and differentiation analyses

Population structure was investigated based on 10,359 LD-pruned SNPs using ADMIXTURE (Alexander et al., 2009) with the number of clusters (K) ranging from 1 to 10. The analyses were run 10 times for each K value, and CV error was used to obtain the most probable K value for population structure analysis. ADMIXTURE plots were generated using ‘Pophelper’ in R (Francis, 2017). Genetic groups of accessions were assigned based on ancestry coefficient Q≥0.7, otherwise the accession was considered admixed. The population structure was also examined with PCA. The neighbor-joining phylogenetic tree was calculated using TASSEL (Trait Analysis by aSSociation, Evolution, and Linkage) software version 5.2.60 (Bradbury et al., 2007) and visualized using FigTree version 1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/).
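The group assignment rule (ancestry coefficient Q≥0.7, otherwise admixed) can be written as a short sketch. The function name and the order of group labels are hypothetical; in practice the column order of an ADMIXTURE Q matrix is arbitrary and has to be matched to groups by inspection.

```python
def assign_group(q_row, labels=("SA", "SEA", "EA", "CA"), threshold=0.7):
    """Assign one accession to a genetic group from its ancestry coefficients.

    q_row: ancestry coefficients for one accession, one value per cluster.
    labels: hypothetical cluster-to-group mapping (an assumption).
    Returns the group label if the largest coefficient reaches the threshold,
    otherwise "admixed".
    """
    best = max(range(len(q_row)), key=lambda k: q_row[k])
    return labels[best] if q_row[best] >= threshold else "admixed"
```

Lowering `threshold` to 0.5 gives the more permissive grouping that is also used for the Mantel tests.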

The relationships and gene flow among the four inferred genetic groups were further assessed by TreeMix version 1.12 (Pickrell et al., 2012) using 4396 LD-pruned SNPs with no missing data. The analysis was run for 0–3 migration events with V. radiata var. sublobata as an outgroup with a block size of 20 SNPs to account for the effects of LD between SNPs. We estimated one as the optimal number of migration events using the ‘OptM’ in R (Fitak, 2021). Bootstrap support for the resulting observed topology was obtained using 500 bootstrap replicates.

Nucleotide diversity (π) and genetic differentiation (dxy and FST) were estimated in 10 kb windows with pixy version 1.2.7.beta1 (Korunes and Samuk, 2021) using all 1,247,721 invariant and variant sites. LD decay for each genetic group was estimated based on 34,469 non-LD-pruned SNPs using PopLDdecay (Zhang et al., 2019). The curves were fitted by a LOESS function, and an LD decay plot was drawn using R.
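Including invariant sites matters because π is an average over all genotyped sites, not only the variable ones. The sketch below illustrates one common way to compute it, as the ratio of summed pairwise differences to summed pairwise comparisons, so that missing calls reduce the denominator rather than bias the estimate. The function name and the input layout are assumptions, and this is not the pixy source code.

```python
def pi_window(sites):
    """Nucleotide diversity for one window.

    sites: list of sites (variant and invariant), each a list of allele calls
           per sampled chromosome, with None for a missing call.
    Returns summed pairwise differences divided by summed pairwise comparisons.
    """
    diffs = 0
    comps = 0
    for alleles in sites:
        called = [a for a in alleles if a is not None]
        n = len(called)
        if n < 2:
            continue  # no comparison possible at this site
        counts = {}
        for a in called:
            counts[a] = counts.get(a, 0) + 1
        total = n * (n - 1) // 2                       # all pairs at this site
        same = sum(c * (c - 1) // 2 for c in counts.values())
        diffs += total - same                          # pairs that differ
        comps += total
    return diffs / comps if comps else float("nan")
```

Dropping the invariant site from the input would inflate the estimate, which is why all 1,247,721 sites are used for these statistics rather than the SNP-only dataset.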

To investigate the relationships among inferred genetic groups, f3 and f4 statistics were computed based on filtered SNPs using ADMIXTOOLS version 7.0 (Patterson et al., 2012). The f3 statistic compares allele frequencies in two populations (A and B) and a target population C. In the ‘outgroup f3 statistic,’ C is the outgroup, and positive values represent the shared genetic drift between A and B. In the ‘admixture f3 statistic,’ negative values indicate that C is admixed from A and B. For f4 statistics, f4(A, B; C, D) measures the genetic drift that population B shares with populations C and D after their divergence from outgroup A. A positive value indicates that population B shares more alleles with D, and a negative value indicates that population B shares more alleles with C. We used 2 Mb as the unit of block-jackknife resampling to compute SEs. Z-scores with absolute values greater than three are considered statistically significant.
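The sign conventions above follow from the definitions of the statistics as averages of products of allele-frequency differences. The sketch below is a simplified illustration, not ADMIXTOOLS: it omits the small-sample bias correction, and the unweighted delete-one-block jackknife is an assumption standing in for the weighted version. All function names are hypothetical.

```python
from math import sqrt

def f3_terms(c, a, b):
    """Per-SNP terms of f3(C; A, B) = E[(c - a)(c - b)]."""
    return [(ci - ai) * (ci - bi) for ci, ai, bi in zip(c, a, b)]

def f4_terms(a, b, c, d):
    """Per-SNP terms of f4(A, B; C, D) = E[(a - b)(c - d)]."""
    return [(ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d)]

def jackknife_z(values, blocks):
    """Estimate, SE and Z-score from a delete-one-block jackknife.

    values: per-SNP terms; blocks: block label of each SNP (e.g. its 2-Mb bin).
    Simplification: blocks are treated as equally weighted.
    """
    total, n = sum(values), len(values)
    estimate = total / n
    sums, counts = {}, {}
    for v, blk in zip(values, blocks):
        sums[blk] = sums.get(blk, 0.0) + v
        counts[blk] = counts.get(blk, 0) + 1
    # Leave-one-block-out estimates
    loo = [(total - sums[blk]) / (n - counts[blk]) for blk in sums]
    g = len(loo)
    mean_loo = sum(loo) / g
    se = sqrt((g - 1) / g * sum((t - mean_loo) ** 2 for t in loo))
    z = estimate / se if se > 0 else float("nan")
    return estimate, se, z
```

If B and D drifted together, (a - b) and (c - d) have the same sign at most SNPs and f4 is positive; if C is intermediate between A and B, (c - a) and (c - b) have opposite signs and f3 is negative.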

To examine the role of geographic distance in shaping spatial genetic differentiation, Mantel tests with 1000 permutations were performed for each of the ADMIXTURE-inferred genetic groups (separately for the groups defined by Q≥0.7 or Q≥0.5) using ‘ade4’ in R. Pairwise genetic distance between accessions was estimated based on all sites while the great circle geographic distance was determined using ‘fields’ in R. In addition, the same analysis was conducted for southern and northern groups to examine if there was a south-north pattern of differentiation.
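A Mantel test of isolation by distance can be sketched as a permutation test on the correlation between two distance matrices. The sketch below is a simplified stand-in for the R packages named above: the haversine formula with a mean Earth radius of 6371 km, the one-sided p-value, and all function names are assumptions.

```python
import random
from math import radians, sin, cos, asin, sqrt

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine great-circle distance between two points given in degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    dlat = p2 - p1
    dlon = radians(lon2 - lon1)
    h = sin(dlat / 2) ** 2 + cos(p1) * cos(p2) * sin(dlon / 2) ** 2
    return 2 * radius_km * asin(sqrt(min(1.0, h)))

def _upper(m):
    """Flatten the upper triangle of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def _pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sqrt(sxx * syy)

def mantel(gen_dist, geo_dist, permutations=1000, seed=0):
    """Mantel correlation and one-sided permutation p-value.

    Rows and columns of the genetic matrix are permuted together, which keeps
    the dependence structure among pairwise distances intact.
    """
    rng = random.Random(seed)
    n = len(gen_dist)
    geo_vec = _upper(geo_dist)
    r_obs = _pearson(_upper(gen_dist), geo_vec)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        perm = [[gen_dist[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if _pearson(_upper(perm), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)
```

Permuting accession labels rather than individual matrix cells is the step that distinguishes a Mantel test from an ordinary correlation test on the flattened distances.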

Based on the shape of the phylogenetic tree, we used fastsimcoal2 (Excoffier et al., 2021), which does not rely on whole-genome sequencing, to estimate the split time among genetic groups. Fifty accessions were randomly picked from each genetic group. Population size was allowed to change, and gene flow was allowed among populations. This analysis used all sites covered by the DArT tags (including monomorphic sites), and the mutation rate was set to 1×10⁻⁸, which was within the range of mutation rates used in eudicots (Barrera-Redondo et al., 2021; Zheng et al., 2022). The models were run using unfolded site frequency spectrum using the major allele in the wild progenitor population (V. radiata var. sublobata) as the ancestral allele. The model was run independently 100 times, each with 100,000 simulations. After obtaining the run with the highest likelihood, we performed parametric bootstrapping 100 times to obtain the 75% CIs of each parameter based on the previous study of Gutaker et al., 2020.

Ecological niche modeling

To understand whether the habitats of genetic groups are differentiated, 248 sampling sites (82 for EA, 45 for SEA, 49 for SA, and 72 for CA genetic groups), in combination with additional presence records obtained from the Global Biodiversity Information Facility (GBIF, https://www.gbif.org/), were used for the analysis. Using the longitude and latitude information, we extracted the Köppen climate zones (Köppen, 2011) using ‘kgc’ in R (Bryant et al., 2017). After excluding zones with fewer than 5 samples, the remaining 10 zones were grouped into 6 categories based on climate similarity: dry hot (BSh and BWh), dry cold (BSk and BWk), temperate dry summer (Csa), tropical savanna (Aw), continental (Dwb and Dfb), and temperate wet summer (Cfa and Cwa). The former three are relatively dry environments.

Climate layers comprising monthly minimum, maximum, mean temperature, precipitation, and 19 bioclimatic variables were downloaded from the WorldClim database version 1.4 (Hijmans et al., 2005). All climate layers available from WorldClim were created based on climate conditions recorded between 1960 and 1990 at a spatial resolution of 30 arc-seconds (approximately 1 km²). To minimize redundancy and model overfitting, pairwise Pearson correlations between the 19 bioclimatic variables were calculated using ENMTools version 1.4.4 (Warren et al., 2010), and one variable from each pair with a correlation above 0.8 was excluded. As a result, eight bioclimatic variables were used for all further analyses, including Bio1 (annual mean temperature), Bio2 (mean diurnal range), Bio3 (isothermality), Bio8 (mean temperature of wettest quarter), Bio12 (annual precipitation), Bio14 (precipitation of driest month), Bio15 (precipitation seasonality), and Bio19 (precipitation of coldest month). Bioclimatic variables were extracted for each occurrence point using ‘raster’ in R (Hijmans, 2021). PCA and MANOVA were conducted to examine whether there was a significant habitat difference among genetic groups. Ecological niche modeling (ENM) was performed using MAXENT version 3.3.1 (Phillips et al., 2006) to predict the geographic distribution of suitable habitats for cultivated mungbean. The ENM analysis was run with a random seed, a convergence threshold of 5000 and 10-fold CV. As a measure of the habitat overlap among the four genetic groups, pairwise Schoener’s D was calculated using ENMTools. The value ranges from 0 (no niche overlap) to 1 (complete niche overlap). In addition, we carried out the same analyses using monthly temperature and precipitation from May, July, and September.
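Schoener’s D compares two predicted suitability surfaces after each is rescaled to sum to one over the same grid cells: D = 1 − ½ Σ|p₁ᵢ − p₂ᵢ|. The sketch below illustrates the formula only; the function name and the flat list representation of a raster are assumptions, and ENMTools handles the raster input itself.

```python
def schoeners_d(suit_a, suit_b):
    """Schoener's D between two suitability surfaces on the same grid.

    suit_a, suit_b: non-negative suitability scores for the same cells, in the
    same order. Each surface is normalized to sum to 1 before comparison.
    """
    if len(suit_a) != len(suit_b):
        raise ValueError("surfaces must cover the same grid cells")
    total_a, total_b = sum(suit_a), sum(suit_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each surface needs positive total suitability")
    return 1.0 - 0.5 * sum(abs(a / total_a - b / total_b)
                           for a, b in zip(suit_a, suit_b))
```

Because both surfaces are normalized first, D reflects how similarly suitability is distributed across cells, not how much suitable area each group has in total.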

Field evaluation

Among the 52 accessions used for laboratory experiments, phenotyping of 49 accessions was conducted at WorldVeg, Taiwan in 1984 and 2018 and at Crop Sciences Institute, National Agricultural Research Centre, Pakistan in 2015. The traits related to phenology (days to 50% flowering), reproduction (100 seed weight, pod length, pods per plant, 1000 seed weight, seed yield per plant, and seeds per pod), and plant size (petiole length, plant height, plant height at flowering, plant height at maturity, primary leaf length, primary leaf width, terminal leaflet length, and terminal leaflet width) were included. Trait values were inverse normal transformed. ANOVA was performed to test for differences among inferred genetic groups for each trait using R software (version 4.1.0).
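A rank-based inverse normal transformation can be sketched as follows. The source does not state which variant was used, so the Blom offset (c = 3/8), the averaging of tied ranks, and the function name are assumptions, and missing values are assumed to have been removed beforehand.

```python
from statistics import NormalDist

def inverse_normal_transform(values, c=3.0 / 8.0):
    """Map trait values to normal scores through their ranks.

    values: list of numbers with no missing entries.
    c: rank offset (3/8 is the Blom offset; an assumption here).
    Ties receive the average of the ranks they span.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]
```

The transformation keeps the ordering of accessions within a trait while removing skew and scale differences, which makes traits measured in different units comparable in the ANOVA.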

Drought phenotyping

A total of 52 accessions with ancestry coefficients Q≥0.7 from three genetic groups (SEA, SA, and CA) were selected for experiments of seedling-stage drought response. The experiment was laid out in a completely randomized design with three replicates of each accession under two treatments (control/drought). The experiment was conducted in two independent batches, and the whole experiment included 624 plants (52 accessions × 2 treatments × 3 plants per treatment × 2 batches).

Mungbean seeds were surface-sterilized with 10% bleach for 10 min and rinsed three times with distilled water. Seeds were then treated with 70% ethanol for 5 min and washed three times in distilled water. The sterilized seeds were germinated on wet filter paper in petri dishes for 3 days. The experiment was conducted in a 740FLED-2D plant growth chamber (HiPoint, Taiwan) at a temperature of 25 ± 1°C and a 12 hr photoperiod (light ratios of red: green: blue 3: 1: 1) with a light intensity of 350 µmol m⁻² s⁻¹ and relative humidity of 60 ± 5%. The seedlings were then transplanted to a hydroponic system with half-strength Hoagland nutrient solution (Phytotechnology Laboratory, USA) and were grown for 6 days before drought stress started. The nutrient solution was changed on alternate days, and the pH of the solution was adjusted to 6.0 with 1 M KOH or 1 M HCl.

For drought treatment, seedlings of mungbean were exposed to polyethylene glycol (PEG)-induced drought stress for 5 days. The solution of PEG6000 with an osmotic potential of –0.6 MPa was prepared by adding PEG6000 (Sigma-Aldrich, Germany) to the nutrient solution according to Michel and Kaufmann, 1973, and pH was also adjusted to 6.0. The seedlings grown with the nutrient solution under the same environmental conditions were considered as controls.

At the end of the experiment, plants were evaluated for SDW and root dry weight, measured on a digital balance after oven-drying at 70°C for 48 hr. All traits were analyzed by mixed-model ANOVA with the treatment (control/drought) and the genetic group as fixed effects. The models included accessions as a random effect nested within genetic groups and a random effect of batches. Tukey’s test was conducted to compare genetic groups. All statistics were performed using JMP v13.0.0 (SAS Institute, 2016).

QST-FST comparisons

For each trait, quantitative trait divergence (QST) was calculated separately with respect to each treatment. Our root and shoot weight experiment used a selfed-progeny design, using the self-fertilized seeds from each accession as replicates, as recommended for partially inbred species (Goudet and Büchi, 2006). For the selfed-progeny design of inbred species, (Equation 1) QST = VB/(VB + VFam), where VB is the among-population variance component, and VFam is the within-population among-family variance component (Goudet and Büchi, 2006). Variance components were estimated using a model with genetic groups, accessions nested within genetic groups, and batches as random factors. To accommodate the possibility that mungbean is not completely selfing, we also applied (Equation 2) QST = (1+f)VB/([1+f]VB + 2VAW) (Goudet and Büchi, 2006), where f is the inbreeding coefficient (estimated by VCFtools as 0.8425), VB is the among-population variance component, and VAW is the additive genetic variance within genetic groups estimated by the kinship matrix using TASSEL software (Bradbury et al., 2007). The results and conclusions from the two equations are similar. The FST was calculated using only accessions in the phenotyping experiment.
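Equations 1 and 2 translate directly into code once the variance components are available. The sketch below takes the components as given inputs; the function and argument names are hypothetical, and estimating the components from a mixed model is outside its scope.

```python
def qst_selfed(v_between, v_family):
    """Equation 1: QST for a selfed-progeny design of an inbred species.

    v_between: among-population variance component (VB).
    v_family: within-population among-family variance component (VFam).
    """
    return v_between / (v_between + v_family)

def qst_partial_inbreeding(v_between, v_additive_within, f=0.8425):
    """Equation 2: QST allowing for incomplete selfing.

    v_additive_within: additive genetic variance within groups (VAW).
    f: inbreeding coefficient; the default is the value reported in the text.
    """
    scaled = (1.0 + f) * v_between
    return scaled / (scaled + 2.0 * v_additive_within)
```

A trait is then a candidate for divergent selection when its QST exceeds the upper tail of the genome-wide FST distribution, which is the comparison drawn in Figure 4C.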

Acknowledgements

We thank Chia-Yu Chen and Shang-Ying Tien for their assistance in sample preparation. C-RL was funded by 107-2923-B-002-004-MY3 and 110-2628-B-002-027 from the Ministry of Science and Technology, Taiwan. C-TT was funded by 107-2923-B-002-004-MY3 from the Ministry of Science and Technology, Taiwan. Y-PL was supported by 110-2313-B-125-001-MY3 from the Ministry of Science and Technology, Taiwan. RN and RS were funded by the Australian Centre for International Agricultural Research (ACIAR) through the projects on International Mungbean Improvement Network (CIM-2014-079 and CROP-2019-144) and by the strategic long-term donors to the World Vegetable Center: Republic of China (Taiwan), UK aid from the UK government, United States Agency for International Development (USAID), Australian Centre for International Agricultural Research (ACIAR), Germany, Thailand, Philippines, Korea, and Japan. EBvW was supported by USDA Multistate Hatch NE2210 and USDA NIFA award 2022-67013-37120. MS was supported by the Ministry of Science and Higher Education of the Russian Federation as part of the World-class Research Center program: Advanced Digital Technologies (contract No. 075-15-2022-311 dated 20.04.2022). SN was supported by the Zumberge foundation.

Appendix 1

Supplementary note

Text analysis and translation of ancient Chinese texts regarding mungbean

Qimin Yaoshu (齊民要術, about 544 AD)

Qimin Yaoshu, compiled by Sixie Jia (賈思勰), is one of the earliest and most complete agricultural sources in China, detailing agricultural techniques near the lower reaches of Yellow River at that era. This is the earliest record of mungbean in China, demonstrating mungbean has reached northern China at that time and is consistent with our estimates of population divergence time. The popularity of mungbean is demonstrated by it being mentioned multiple times under different contexts, most notably as a green manure:

「若糞不可得者,五六月中,穊種菉豆,至七月,八月,犁掩殺之。如以糞糞田,則良美與糞不殊,又省功力。」

Translation: “Should feces be unavailable, during May and June one could grow mungbean. In July or August, one could plow mungbean plants into the soil. This is equivalent to using feces to manure the land. This is as good as using feces and saves efforts.”

Notice that the months used in ancient China are slightly different from the Gregorian calendar.

Xiangshan Yelu (湘山野錄, 1068–1077 AD)

Xiangshan Yelu was written by a monk, Wen-Ying (文瑩), recording anecdotes during that era. Its records about the Emperor Zhenzong of Song (宋真宗, 968–1022 AD) detailed the phenotypes of Indian mungbean at that time:

「真宗深念稼穡,聞占城稻耐旱, 西天綠豆子多而粒大,各遣使以珍貨求其種。占城得種二十石,至今在處播之。西天中印土得菉豆種二石,不知今之菉豆是否?」

Translation: ‘Zhenzong of Song was deeply concerned about agriculture. He heard that Champa rice was drought tolerant and that mungbean from India produced numerous and large seeds. Diplomats were sent to exchange treasure for the seeds. Twenty dans of Champa rice were obtained and propagated everywhere. Two dans of mungbean were obtained from India, but it is unclear whether the mungbean today descended from these.’

‘Dan’ (石) is a unit of volume in ancient China and is called ‘Koku’ in Japanese. The exact amount varied with time.

The texts provide us with two pieces of important information. First, mungbean from South Asia (likely also includes the SEA genetic groups if accessions near eastern India and Bangladesh were included) at that time had higher yield and larger seeds than native mungbean accessions in northern China, consistent with our results on trait divergence. Second, compared to the clear success of Champa rice in China, it was unclear whether those southern mungbean accessions had prospered in northern China, likely suggesting an unsuccessful introduction of southern high-yield and large-seeded accessions to the north.

Tiangong Kaiwu (天工開物, 1637 AD)

Tiangong Kaiwu is a famous Chinese encyclopedia compiled by Song Yingxing (宋應星). While it mostly covers technologies at that time, a section about agricultural practices covers mungbean:

「綠豆必小暑方種,未及小暑而種,則其苗蔓延數尺,結莢甚稀。若過期至於處暑,則隨時開花結莢,顆粒亦少。」

Translation: ‘Mungbean must be sown at or after Xiaoshu (Gregorian 7–8 July). If sown before Xiaoshu, mungbean stems would spread for meters with few pods set. If sown as late as Chushu (Gregorian 23–24 August), the plants would flower and set pods at any time, also with low yield.’

Mungbean is a short-day plant. If sown too early, when the days are too long, it would have mostly vegetative growth. If sown too late, when the days are too short, flowering would be induced too quickly, before sufficient vegetative development. In addition to our results that short-season accessions were favored in the north due to the requirement for drought escape, this source provides further support that mungbean could only be sown in a narrow time window due to its daylength requirement. Given the autumn frost damage in the north, the inability to sow earlier restricts the growing season length there, limiting the adoption of southern long-season accessions.
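The daylength argument can be illustrated with a rough calculation. The following is a minimal sketch, not part of the original analysis: it uses Cooper's approximation for solar declination and the standard sunrise equation to estimate daylength on the two solar terms named in Tiangong Kaiwu. The latitude of 36°N is a hypothetical site chosen only for illustration, and the function name and day-of-year values (non-leap year) are assumptions introduced here.

```python
import math


def day_length_hours(latitude_deg: float, day_of_year: int) -> float:
    """Approximate daylength in hours for a latitude and day of year."""
    # Cooper's approximation of solar declination (degrees).
    declination_deg = 23.44 * math.sin(2 * math.pi * (284 + day_of_year) / 365)
    phi = math.radians(latitude_deg)
    delta = math.radians(declination_deg)
    # Cosine of the sunrise hour angle; clamped to cover polar day and night.
    cos_hour_angle = max(-1.0, min(1.0, -math.tan(phi) * math.tan(delta)))
    return 24 / math.pi * math.acos(cos_hour_angle)


# Hypothetical latitude for a northern China site (an assumption).
LATITUDE_DEG = 36.0

# Day of year for the two solar terms in a non-leap year.
SOLAR_TERMS = {"Xiaoshu (7 July)": 188, "Chushu (23 August)": 235}

for name, doy in SOLAR_TERMS.items():
    hours = day_length_hours(LATITUDE_DEG, doy)
    print(f"{name}: about {hours:.1f} h of daylight")
```

Under these assumptions, the estimated daylength is more than an hour longer at Xiaoshu than at Chushu, which is in line with the sowing window described in the source. The approximation ignores atmospheric refraction and twilight, so the values are indicative only.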

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Cheng-Ruei Lee, Email: chengrueilee@ntu.edu.tw.

Detlef Weigel, Max Planck Institute for Biology Tübingen, Germany.

Funding Information

This paper was supported by the following grants:

  • Ministry of Science and Technology, Taiwan 107-2923-B-002-004-MY3 to Chau-Ti Ting, Cheng-Ruei Lee.

  • Ministry of Science and Technology, Taiwan 110-2628-B-002-027 to Cheng-Ruei Lee.

  • Australian Centre for International Agricultural Research CROP-2019-144 to Ramakrishnan Madhavan Nair, Roland Schafleitner.

  • Ministry of Science and Technology, Taiwan 110-2313-B-125-001-MY3 to Ya-Ping Lin.

  • Australian Centre for International Agricultural Research CIM-2014-079 to Ramakrishnan Madhavan Nair, Roland Schafleitner.

  • U.S. Department of Agriculture Multistate Hatch NE2210 to Eric Bishop-von-Wettberg.

  • Ministry of Science and Higher Education of the Russian Federation 075-15-2022-311 to Maria Samsonova.

  • USDA National Institute of Food and Agriculture 2022-67013-37120 to Eric Bishop-von-Wettberg.

  • Zumberge foundation to Sergey Nuzhdin.

  • Russian Science Foundation 18-46-08001 to Eric Bishop-von-Wettberg, Marina Burlyaeva, Maria Samsonova, Margarita Vishnyakova.

Additional information

Competing interests

No competing interests declared.

Author contributions

Data curation, Formal analysis, Validation, Investigation, Visualization, Writing – original draft, Writing – review and editing.

Data curation, Formal analysis, Investigation, Writing – original draft.

Data curation, Formal analysis, Investigation, Writing – original draft.

Data curation, Validation, Investigation.

Data curation, Investigation, Writing – original draft, Resources.

Data curation, Investigation, Writing – original draft, Resources.

Data curation, Investigation, Writing – original draft, Resources.

Data curation, Investigation, Writing – original draft, Resources.

Data curation, Investigation, Writing – original draft, Resources.

Data curation, Investigation, Writing – original draft, Resources.

Data curation, Investigation, Writing – original draft, Resources.

Data curation, Investigation, Writing – original draft, Resources.

Data curation, Investigation, Writing – original draft, Resources.

Conceptualization, Resources, Data curation, Formal analysis, Supervision, Funding acquisition, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Additional files

MDAR checklist
Supplementary file 1. History of mungbean spread: genetic, environment, and traits data.

(a) Mungbean accessions from Vavilov Institute (VIR) collection. (b) Outgroup f3 statistics among all possible combinations of genetic group pairs. (c) Admixture f3 statistics among all possible population trios. (d) Mantel tests for isolation by distance of inferred genetic group (Q≥0.5). (e) Description of bioclimatic variables used in ecological niche modeling. (f) Pearson’s correlation coefficient between pairs of bioclimatic variables (denoted in lower triangle). (g) Comparison of bioclimatic variables among the four genetic groups analyzed with multivariate ANOVA (MANOVA). (h) Summary of ANOVA for bioclimatic variables. (i) Correlation between eight bioclimatic variables and climatic PC axes 1–4. (j) Comparison of summer growing season data including temperature and precipitation of May, July, and September among the four genetic groups analyzed with MANOVA. (k) ANOVA table for all evaluated field traits (phenology, reproduction, and size) as well as drought-related traits. (l) Mean of eight bioclimatic variables of the genetic groups.

elife-85725-supp1.docx (94.4KB, docx)

Data availability

Sequences generated in this study are available under NCBI BioProject PRJNA809503. Accession names, GPS coordinates, and NCBI accession numbers of the Vavilov Institute accessions are available in Supplementary file 1a. Plant trait data are available at Dryad https://doi.org/10.5061/dryad.d7wm37q3h. Sequences and accession information of the World Vegetable Center mini-core and the Australian Diversity Panel collections were obtained from the NCBI BioProject PRJNA645721 (Breria et al., 2020) and PRJNA963182 (Noble et al., 2018).

The following datasets were generated:

Ong P, Lin Y, Chen H, Lo C, Noble T, Nair R, Schafleitner R, Vishnyakova M, Bishop-von-Wettberg E, Samsonova M, Nuzhdin S, Ting C, Lee C. 2023. The climatic constrains of the historical global spread of mungbean. Dryad Digital Repository.

Ong P, Lin Y, Chen H, Lo C, Noble T, Nair R, Schafleitner R, Vishnyakova M, Bishop-von-Wettberg E, Samsonova M, Nuzhdin S, Ting C, Lee C. 2023. Vavilov Institute (VIR) mungbean collection - DArTseq. NCBI BioProject. PRJNA809503

The following previously published datasets were used:

Breria CM, Hsieh CH, Yen J-Y, Nair R, Lin C-Y, Huang S-M, Noble TJ, Schafleitner R. 2020. World Vegetable Center Mini Core Collection - DartSeq. NCBI BioProject. PRJNA645721

Noble TJ, Tao Y, Mace ES, Williams B, Jordan DR, Douglas CA, Mundree SG. 2023. Australian mungbean diversity panel collection - DArTseq. NCBI BioProject. PRJNA963182

References

  1. 1001 Genomes Consortium. 1,135 genomes reveal the global pattern of polymorphism in Arabidopsis thaliana. Cell. 2016;166:481–491. doi: 10.1016/j.cell.2016.05.063.
  2. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Research. 2009;19:1655–1664. doi: 10.1101/gr.094052.109.
  3. Annanepesov M, Bababekov HN. In: History of Civilizations of Central Asia, Volume 5: Development in Contrast, from the Sixteenth to the Mid-Nineteenth Century. Adle C, Habib I, editors. Paris: UNESCO Publishing; 2003. The Khanates of Khiva and Kokand and the relations between the Khanates and with other powers; pp. 64–89.
  4. Barrera-Redondo J, Sánchez-de la Vega G, Aguirre-Liguori JA, Castellanos-Morales G, Gutiérrez-Guerrero YT, Aguirre-Dugua X, Aguirre-Planter E, Tenaillon MI, Lira-Saade R, Eguiarte LE. The domestication of Cucurbita argyrosperma as revealed by the genome of its wild relative. Horticulture Research. 2021;8:109. doi: 10.1038/s41438-021-00544-9.
  5. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
  6. Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics. 2007;23:2633–2635. doi: 10.1093/bioinformatics/btm308.
  7. Breria CM, Hsieh CH, Yen JY, Nair R, Lin CY, Huang SM, Noble TJ, Schafleitner R. Population structure of the World Vegetable Center mungbean mini core collection and genome-wide association mapping of Loci associated with variation of seed coat luster. Tropical Plant Biology. 2020;13:1–12. doi: 10.1007/s12042-019-09236-0.
  8. Bryant C, Wheeler NR, Rubel F, French RH. Kgc: Köeppen-Geiger Climatic Zones. R package version 1.0.0.2. 2017. https://CRAN.R-project.org/package=kgc
  9. Burlyaeva M, Vishnyakova M, Gurkina M, Kozlov K, Lee CR, Ting CT, Schafleitner R, Nuzhdin S, Samsonova M, Wettberg E. Collections of mungbean [Vigna radiata (L.) R. Wilczek] and urdbean [V. mungo (L.) Hepper] in Vavilov Institute (VIR): traits diversity and trends in the breeding process over the last 100 years. Genetic Resources and Crop Evolution. 2019;66:767–781. doi: 10.1007/s10722-019-00760-2.
  10. Castillo CC, Bellina B, Fuller DQ. Rice, beans and trade crops on the early maritime Silk Route in Southeast Asia. Antiquity. 2016;90:1255–1269. doi: 10.15184/aqy.2016.175.
  11. Chen LT. A study of the systems of rotating crops in Chinese history 我國歷代輪種制度之研究. Bulletin of the Institute of History and Philology. 1980;51:281–313.
  12. Cox MP, Peterson DA, Biggs PJ. SolexaQA: at-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinformatics. 2010;11:485. doi: 10.1186/1471-2105-11-485.
  13. Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, McVean G, Durbin R, 1000 Genomes Project Analysis Group. The variant call format and VCFtools. Bioinformatics. 2011;27:2156–2158. doi: 10.1093/bioinformatics/btr330.
  14. Dela Vina AC, Tomooka N. Genetic diversity in mungbean [Vigna Radiata (L.) Wilczek] based on two enzyme systems. Philippine Journal of Crop Science. 1994;19:1–9.
  15. Douglas C, Pratap A, Rao BH, Manu B, Dubey S, Singh P, Tomar R. In: The Mungbean Genome. Nair RM, Schafleitner R, Lee SH, editors. Cham: Springer International Publishing; 2020. Breeding progress and future challenges: Abiotic stresses; pp. 81–96.
  16. Dupuy PD. The Oxford Handbook of Topics in Archaeology. Oxford University Press; 2016. Bronze Age Central Asia.
  17. Excoffier L, Marchi N, Marques DA, Matthey-Doret R, Gouy A, Sousa VC. Fastsimcoal2: demographic inference under complex evolutionary scenarios. Bioinformatics. 2021;37:4882–4885. doi: 10.1093/bioinformatics/btab468.
  18. Fitak RR. OptM: estimating the optimal number of migration edges on population trees using Treemix. Biology Methods and Protocols. 2021;6:bpab017. doi: 10.1093/biomethods/bpab017.
  19. Francis RM. Pophelper: an R package and web App to analyse and visualize population structure. Molecular Ecology Resources. 2017;17:27–32. doi: 10.1111/1755-0998.12509.
  20. Fuller DQ, Harvey EL. The archaeobotany of Indian pulses: identification, processing and evidence for cultivation. Environmental Archaeology. 2006;11:219–246. doi: 10.1179/174963106x123232.
  21. Fuller DQ. Contrasting patterns in crop domestication and domestication rates: recent archaeobotanical insights from the Old World. Annals of Botany. 2007;100:903–924. doi: 10.1093/aob/mcm048.
  22. Fuller DQ, Boivin N, Hoogervorst T, Allaby R. Across the Indian Ocean: the prehistoric movement of plants and animals. Antiquity. 2011;85:544–558. doi: 10.1017/S0003598X00067934.
  23. Goudet J, Büchi L. The effects of dominance, regular inbreeding and sampling design on QST, an estimator of population differentiation for quantitative traits. Genetics. 2006;172:1337–1347. doi: 10.1534/genetics.105.050583.
  24. Gutaker RM, Groen SC, Bellis ES, Choi JY, Pires IS, Bocinsky RK, Slayton ER, Wilkins O, Castillo CC, Negrão S, Oliveira MM, Fuller DQ, Guedes J d, Lasky JR, Purugganan MD. Genomic history and ecology of the geographic spread of rice. Nature Plants. 2020;6:492–502. doi: 10.1038/s41477-020-0659-6.
  25. Gwag JG, Dixit A, Park YJ, Ma KH, Kwon SJ, Cho GT, Lee GA, Lee SY, Kang HK, Lee SH. Assessment of genetic diversity and population structure in mungbean. Genes & Genomics. 2010;32:299–308. doi: 10.1007/s13258-010-0014-9.
  26. Ha J, Lee SH. In: Advances in Plant Breeding Strategies: Legumes. Al-Khayri JM, Jain SM, Johnson DV, editors. Springer; 2019. Mung bean (Vigna radiata (L.) R. Wilczek) breeding; pp. 371–407.
  27. Ha J, Satyawan D, Jeong H, Lee E, Cho KH, Kim MY, Lee SH. A near-complete genome sequence of mungbean (Vigna radiata L.) provides key insights into the modern breeding program. The Plant Genome. 2021;14:e20121. doi: 10.1002/tpg2.20121.
  28. Herniter IA, Muñoz‐Amatriaín M, Close TJ. Genetic, textual, and archeological evidence of the historical global spread of cowpea (Vigna unguiculata [L.] Walp.). Legume Science. 2020;2:leg3.57. doi: 10.1002/leg3.57.
  29. Hijmans RJ, Cameron SE, Parra JL, Jones PG, Jarvis A. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology. 2005;25:1965–1978. doi: 10.1002/joc.1276.
  30. Hijmans R. Raster: Geographic Data Analysis and Modeling. R package version 3.4-13. 2021. https://CRAN.R-project.org/package=raster
  31. Islam A, Blair MW. Molecular characterization of mung bean germplasm from the USDA core collection using newly developed KASP-based SNP markers. Crop Science. 2018;58:1659–1670. doi: 10.2135/cropsci2018.01.0044.
  32. Jeong C, Balanovsky O, Lukianova E, Kahbatkyzy N, Flegontov P, Zaporozhchenko V, Immel A, Wang CC, Ixan O, Khussainova E, Bekmanov B, Zaibert V, Lavryashina M, Pocheshkhova E, Yusupov Y, Agdzhoyan A, Koshel S, Bukin A, Nymadawa P, Turdikulova S, Dalimova D, Churnosov M, Skhalyakho R, Daragan D, Bogunov Y, Bogunova A, Shtrunov A, Dubova N, Zhabagin M, Yepiskoposyan L, Churakov V, Pislegin N, Damba L, Saroyants L, Dibirova K, Atramentova L, Utevska O, Idrisov E, Kamenshchikova E, Evseeva I, Metspalu M, Outram AK, Robbeets M, Djansugurova L, Balanovska E, Schiffels S, Haak W, Reich D, Krause J. The genetic history of admixture across inner Eurasia. Nature Ecology & Evolution. 2019;3:966–976. doi: 10.1038/s41559-019-0878-2.
  33. Jones H, Leigh FJ, Mackay I, Bower MA, Smith LMJ, Charles MP, Jones G, Jones MK, Brown TA, Powell W. Population-based resequencing reveals that the flowering time adaptation of cultivated barley originated east of the Fertile Crescent. Molecular Biology and Evolution. 2008;25:2211–2219. doi: 10.1093/molbev/msn167.
  34. Jones M, Hunt H, Lightfoot E, Lister D, Liu X, Motuzaite-Matuzeviciute G. Food globalization in prehistory. World Archaeology. 2011;43:665–675. doi: 10.1080/00438243.2011.624764.
  35. Jones H, Lister DL, Cai D, Kneale CJ, Cockram J, Peña-Chocarro L, Jones MK. The trans-Eurasian crop exchange in prehistory: discerning pathways from barley phylogeography. Quaternary International. 2016;426:26–32. doi: 10.1016/j.quaint.2016.02.029.
  36. Kang YJ, Kim SK, Kim MY, Lestari P, Kim KH, Ha BK, Jun TH, Hwang WJ, Lee T, Lee J, Shim S, Yoon MY, Jang YE, Han KS, Taeprayoon P, Yoon N, Somta P, Tanya P, Kim KS, Gwag JG, Moon JK, Lee YH, Park BS, Bombarely A, Doyle JJ, Jackson SA, Schafleitner R, Srinives P, Varshney RK, Lee SH. Genome sequence of mungbean and insights into evolution within Vigna species. Nature Communications. 2014;5:5443. doi: 10.1038/ncomms6443.
  37. Kim SK, Nair RM, Lee J, Lee SH. Genomic resources in mungbean for future breeding programs. Frontiers in Plant Science. 2015;6:626. doi: 10.3389/fpls.2015.00626.
  38. Kingwell-Banham E, Petrie CA, Fuller DQ. In: The Cambridge World History. Goucher C, Barker G, editors. Cambridge: Cambridge University Press; 2015. Early Agriculture in South Asia; pp. 261–288.
  39. Kistler L, Maezumi SY, Gregorio de Souza J, Przelomska NAS, Malaquias Costa F, Smith O, Loiselle H, Ramos-Madrigal J, Wales N, Ribeiro ER, Morrison RR, Grimaldo C, Prous AP, Arriaza B, Gilbert MTP, de Oliveira Freitas F, Allaby RG. Multiproxy evidence highlights a complex evolutionary legacy of maize in South America. Science. 2018;362:1309–1313. doi: 10.1126/science.aav0207.
  40. Kohl PL. In: The Making of Bronze Age Eurasia. Kohl PL, editor. Cambridge: Cambridge University Press; 2007. Entering a sown world of irrigation agriculture – from the steppes to Central Asia and beyond: processes of movement, assimilation, and transformation into the "civilized" world east of Sumer; pp. 182–243.
  41. Kohl PL, Lyonnet B. In: Intercultural Relations between South and Southwest Asia. Studies in Commemoration of E.C.L. During Caspers (1934–1996). BAR International Series 1826. Olijdam E, Spoor RH, editors. Oxford: Archaeopress; 2008. By land and by sea: the circulation of materials and peoples, ca. 3500-1800 B.C; pp. 29–42.
  42. Köppen W. The thermal zones of the earth according to the duration of hot, moderate and cold periods and to the impact of heat on the organic world. Meteorologische Zeitschrift. 2011;20:351–360. doi: 10.1127/0941-2948/2011/105.
  43. Korunes KL, Samuk K. Pixy: Unbiased estimation of nucleotide diversity and divergence in the presence of missing data. Molecular Ecology Resources. 2021;21:1359–1368. doi: 10.1111/1755-0998.13326.
  44. Kumar N, Nandwal AS, Waldia RS, Singh S, Devi S, Sharma KD, Kumar A. Drought tolerance in chickpea as evaluated by root characteristics, plant water status, membrane integrity and chlorophyll fluorescence techniques. Experimental Agriculture. 2012;48:378–387. doi: 10.1017/S0014479712000063.
  45. Lamberg‐Karlovsky CC. Archaeology and language: the Indo‐Iranians. Current Anthropology. 2002;43:63–88. doi: 10.1086/324130.
  46. Lee C-R, Svardal H, Farlow A, Exposito-Alonso M, Ding W, Novikova P, Alonso-Blanco C, Weigel D, Nordborg M. On the post-glacial spread of human commensal Arabidopsis thaliana. Nature Communications. 2017;8:14458. doi: 10.1038/ncomms14458.
  47. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
  48. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
  49. Lin YP, Chen HW, Yeh PM, Anand SS, Lin J, Li J, Noble T, Nair R, Schafleitner R, Samsonova M, Bishop-von-Wettberg E, Nuzhdin S, Ting CT, Lawn RJ, Lee CR. Distinct selection signatures during domestication and improvement in crops: a tale of two genes in mungbean. bioRxiv. 2022. doi: 10.1101/2022.09.08.506689.
  50. Lister DL, Jones H, Oliveira HR, Petrie CA, Liu X, Cockram J, Kneale CJ, Kovaleva O, Jones MK. Barley heads east: genetic analyses reveal routes of spread through diverse Eurasian landscapes. PLOS ONE. 2018;13:e0196652. doi: 10.1371/journal.pone.0196652.
  51. Liu X, Jones PJ, Motuzaite Matuzeviciute G, Hunt HV, Lister DL, An T, Przelomska N, Kneale CJ, Zhao Z, Jones MK. From ecological opportunism to multi-cropping: mapping food globalisation in prehistory. Quaternary Science Reviews. 2019;206:21–28. doi: 10.1016/j.quascirev.2018.12.017.
  52. Lombard P. In: The World of the Oxus Civilization. Lyonnet B, Dubova N, editors. Routledge; 2020. The Oxus civilization/BMAC and its interaction with the Arabian Gulf. A review of the evidences; pp. 607–634.
  53. Lyonnet B. In: South Asian Archaeology. Jarrige C, Lefevre V, editors. Paris: Editions Recherche sur les Civilisations; 2005. Another possible interpretation of the Bactro-Margiana Culture (BMAC) of Central Asia: the tin trade; pp. 191–200.
  54. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research. 2010;20:1297–1303. doi: 10.1101/gr.107524.110.
  55. Meyer RS, Purugganan MD. Evolution of crop species: genetics of domestication and diversification. Nature Reviews Genetics. 2013;14:840–852. doi: 10.1038/nrg3605.
  56. Michel BE, Kaufmann MR. The osmotic potential of polyethylene glycol 6000. Plant Physiology. 1973;51:914–916. doi: 10.1104/pp.51.5.914.
  57. Miller NF. Agricultural development in western Central Asia in the Chalcolithic and Bronze Ages. Vegetation History and Archaeobotany. 1999;8:13–19. doi: 10.1007/BF02042837.
  58. Mishra GP, Dikshit HK, Tripathi K, Aski MS, Pratap A, Dasgupta U, Nair RM, Gupta S. In: Fundamentals of Field Crop Breeding. Yadava DK, Dikshit HK, Mishra GP, Tripathi S, editors. Singapore: Springer Nature Singapore; 2022. Mungbean breeding; pp. 1097–1149.
  59. Nair R, Schreinemachers P. In: The Mungbean Genome. Nair R, Schafleitner R, Lee SH, editors. Berlin: Springer International Publishing; 2020. Global status and economic importance of mungbean; pp. 1–8.
  60. Noble TJ, Tao Y, Mace ES, Williams B, Jordan DR, Douglas CA, Mundree SG. Characterization of linkage disequilibrium and population structure in a mungbean diversity panel. Frontiers in Plant Science. 2018;8:2102. doi: 10.3389/fpls.2017.02102.
  61. Patterson N, Moorjani P, Luo Y, Mallick S, Rohland N, Zhan Y, Genschoreck T, Webster T, Reich D. Ancient admixture in human history. Genetics. 2012;192:1065–1093. doi: 10.1534/genetics.112.145037.
  62. Phillips SJ, Anderson RP, Schapire RE. Maximum entropy modeling of species geographic distributions. Ecological Modelling. 2006;190:231–259. doi: 10.1016/j.ecolmodel.2005.03.026.
  63. Pickrell JK, Pritchard JK, Tang H. Inference of population splits and mixtures from genome-wide allele frequency data. PLOS Genetics. 2012;8:e1002967. doi: 10.1371/journal.pgen.1002967.
  64. Pratap A, Gupta S, Basu PS, Tomar R, Dubey S, Rathore M, Prajapati US, Singh P, Kumari G. In: Genomic Designing of Climate-Smart Pulse Crops. Kole C, editor. Cham: Springer International Publishing; 2019. Towards development of climate smart mungbean: challenges and opportunities; pp. 235–264.
  65. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, Sham PC. PLINK: A tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics. 2007;81:559–575. doi: 10.1086/519795.
  66. Rani S, Schreinemachers P, Kuziyev B, Yildiz F. Mungbean as a catch crop for dryland systems in Pakistan and Uzbekistan: a situational analysis. Cogent Food & Agriculture. 2018;4:1499241. doi: 10.1080/23311932.2018.1499241.
  67. Sandhu K, Singh A. Strategies for the utilization of the USDA mung bean germplasm collection for breeding outcomes. Crop Science. 2021;61:422–442. doi: 10.1002/csc2.20322.
  68. Sangiri C, Kaga A, Tomooka N, Vaughan D, Srinives P. Genetic diversity of the mungbean (Vigna radiata, Leguminosae) genepool on the basis of microsatellite analysis. Australian Journal of Botany. 2007;55:837. doi: 10.1071/BT07105.
  69. Sokolkova A, Burlyaeva M, Valiannikova T, Vishnyakova M, Schafleitner R, Lee C-R, Ting C-T, Nair RM, Nuzhdin S, Samsonova M, von Wettberg E. Genome-wide association study in accessions of the mini-core collection of mungbean (Vigna radiata) from the World Vegetable Gene Bank (Taiwan). BMC Plant Biology. 2020;20:363. doi: 10.1186/s12870-020-02579-x.
  70. Spengler RN, Cerasetti B, Tengberg M, Cattani M, Rouse LM. Agriculturalists and pastoralists: Bronze Age economy of the Murghab alluvial fan, southern Central Asia. Vegetation History and Archaeobotany. 2014a;23:805–820. doi: 10.1007/s00334-014-0448-0.
  71. Spengler RN, Frachetti MD, Doumani PN. Late Bronze Age agriculture at Tasbas in the Dzhungar Mountains of eastern Kazakhstan. Quaternary International. 2014b;348:147–157. doi: 10.1016/j.quaint.2014.03.039.
  72. Spengler RN. Agriculture in the Central Asian Bronze Age. Journal of World Prehistory. 2015;28:215–253. doi: 10.1007/s10963-015-9087-3.
  73. Spengler RN, Miller NF, Neef R, Tourtellotte PA, Chang C. Linking agriculture and exchange to social developments of the Central Asian Iron Age. Journal of Anthropological Archaeology. 2017;48:295–308. doi: 10.1016/j.jaa.2017.09.002.
  74. Spengler RN, de Nigris I, Cerasetti B, Carra M, Rouse LM. The breadth of dietary economy in Bronze Age Central Asia: case study from Adji Kui 1 in the Murghab region of Turkmenistan. Journal of Archaeological Science. 2018a;22:372–381. doi: 10.1016/j.jasrep.2016.03.029.
  75. Spengler RN, Maksudov F, Bullion E, Merkle A, Hermes T, Frachetti M. Arboreal crops on the medieval Silk Road: archaeobotanical studies at Tashbulak. PLOS ONE. 2018b;13:e0201409. doi: 10.1371/journal.pone.0201409.
  76. Spengler RN, Tang L, Nayak A, Boivin N, Olivieri LM. The southern Central Asian mountains as an ancient agricultural mixing zone: new archaeobotanical data from Barikot in the Swat valley of Pakistan. Vegetation History and Archaeobotany. 2021;30:463–476. doi: 10.1007/s00334-020-00798-8.
  77. Stevens CJ, Murphy C, Roberts R, Lucas L, Silva F, Fuller DQ. Between China and South Asia: a Middle Asian corridor of crop dispersal and agricultural innovation in the Bronze Age. The Holocene. 2016;26:1541–1555. doi: 10.1177/0959683616650268.
  78. Takahashi Y, Kongjaimun A, Muto C, Kobayashi Y, Kumagai M, Sakai H, Satou K, Teruya K, Shiroma A, Shimoji M, Hirano T, Isemura T, Saito H, Baba-Kasai A, Kaga A, Somta P, Tomooka N, Naito K. Same locus for non-shattering seed pod in two independently domesticated legumes, Vigna angularis and Vigna unguiculata. Frontiers in Genetics. 2020;11:748. doi: 10.3389/fgene.2020.00748.
  79. Tomooka N, Lairungreang C, Nakeeraks P, Egawa Y, Thavarasook C. Center of genetic diversity and dissemination pathways in mung bean deduced from seed protein electrophoresis. Theoretical and Applied Genetics. 1992;83:289–293. doi: 10.1007/BF00224273.
  80. Van der Veen M, Morales J. The Roman and Islamic spice trade: new archaeological evidence. Journal of Ethnopharmacology. 2015;167:54–63. doi: 10.1016/j.jep.2014.09.036.
  81. Vir R, Lakhanpaul S, Malik S, Umdale S. In: Gene Pool Diversity and Crop Improvement. Rajpal VR, Rao SR, Raina SN, editors. Cham: Springer International Publishing; 2016. Utilization of germplasm for the genetic improvement of mung bean [Vigna radiata (L.) Wilczek]: the constraints and the opportunities; pp. 367–391.
  82. Warren DL, Glor RE, Turelli M. ENMTools: a toolbox for comparative studies of environmental niche models. Ecography. 2010;33:607–611. doi: 10.1111/j.1600-0587.2009.06142.x.
  83. Xu W, Cui K, Xu A, Nie L, Huang J, Peng S. Drought stress condition increases root to shoot ratio via alteration of carbohydrate partitioning and enzymatic activity in rice seedlings. Acta Physiologiae Plantarum. 2015;37:9. doi: 10.1007/s11738-014-1760-0.
  84. Zhang C, Shi S, Wang B, Zhao J. Physiological and biochemical changes in different drought-tolerant alfalfa (Medicago sativa L.) varieties under PEG-induced drought stress. Acta Physiologiae Plantarum. 2018;40:25. doi: 10.1007/s11738-017-2597-0.
  85. Zhang C, Dong SS, Xu JY, He WM, Yang TL. PopLDdecay: a fast and effective tool for linkage disequilibrium decay analysis based on variant call format files. Bioinformatics. 2019;35:1786–1788. doi: 10.1093/bioinformatics/bty875.
  86. Zheng X, Wang T, Cheng T, Zhao L, Zheng X, Zhu F, Dong C, Xu J, Xie K, Hu Z, Yang L, Diao Y. Genomic variation reveals demographic history and biological adaptation of the ancient relictual, lotus (Nelumbo Adans). Horticulture Research. 2022;9:uhac029. doi: 10.1093/hr/uhac029.

Editor's evaluation

Detlef Weigel

This is an important interdisciplinary effort, with compelling genetic evidence, that informs on the spread of an important crop. The work will be of broad interest to those studying the domestication and dissemination of cultivated plants.

Decision letter

Editor: Detlef Weigel

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting the paper "The climatic constrains of the historical global spread of mungbean" for consideration by eLife.

First we would like to apologize for the length of time it took to reach this decision. We had hoped to get input from another archaeologist who had agreed to review but in the end did not submit a review. We had a rather long discussion of the work, plus several of the people involved were on vacation.

Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individual involved in the review of your submission has agreed to reveal their identity: Jeffrey Ross-Ibarra (Reviewer #1).

Comments to the Authors:

We are sorry to say that, after consultation with the reviewers, we have decided that this work is not acceptable for publication in eLife.

The first two reviewers, both geneticists with experience in plant domestication, were excited about the topic and the interdisciplinary approach taken. But both raised a number of concerns about the methods and approaches used, which would require substantial additional work to address. The third reviewer, an archaeologist, raised real concerns about how well the paper has incorporated existing data into interpretation and analysis.

In the end, given the number of concerns raised on both the genetic and archaeological fronts, I'm sorry to say that we have decided that this work in its current form will not be considered further for publication by eLife. Having said this, there was broad interest in the work and the topic, and we would reconsider, albeit as a new submission, an extensively revised paper, which would likely look very different from the study at hand.

We are sorry about this outcome, but hope that the comments will be helpful for submission elsewhere.

Reviewer #1 (Recommendations for the authors):

The authors provide an ambitious interdisciplinary analysis of the post–domestication spread of mungbean. I believe the authors' general thesis is correct, and the various data they provide are all consistent with their model.

Nonetheless, there are some issues with analysis and interpretation, and in general, the data don't provide definitive evidence that allows claims about drought or the importance of the environment.

All of the evidence and analysis presented are consistent with the authors' proposed SA–SEA–EA–CA model. The f3 results are particularly convincing. I struggle, however, with how the data definitively show environment was more important than human movement. Some text showing, for example, that trading or demic movement between SA and CA was frequent would make clear that mungbean 'could' have moved SA–>CA via humans but didn't. Otherwise, could the SA–SEA–EA–CA model simply reflect historical movements/expansions of people in those directions/times?

On a similar note, the paper claims that drought was the most important factor limiting the mungbean movement. The phenotypic data certainly suggest adaptive differences in traits related to drought and drought differences in phenotype in an experimental setting, but additional evidence would be helpful to convince that drought is the key factor. The authors note the importance of daylength differences, for example – how do we know daylength wasn't the limiting factor (as it seems to have been in the northward spread of maize, for example)?

Methods Recommendations

– It's difficult in some places to know which SNP data set is used where. Are the LD–pruned SNPs used to estimate LD decay? The methods say there are 67K sites (including monomorphic) but there are 41K SNPs. This would imply that 2/3 of the sites are polymorphic. That can't be right so I must be misunderstanding something. Are these monomorphic sites also used for estimating nucleotide diversity (see the issue with VCFtools below)? For fastsimcoal, do you include sites where the derived allele is frequency=0 and where the derived allele is frequency 1?

– It would be good to modify the language of the MaxEnt modeling. While MaxEnt may show that the modern mungbean does not currently grow in conditions similar to those during the Holocene at a certain spot on the map, this is not the same as saying the plant could not have grown there. For example, if we were to model MaxEnt on modern CO2 concentrations, we would come to the conclusion that modern mungbean could not have grown anywhere during the Holocene. I don't think necessarily the methods need to be changed, but it would be good to change the language here to be less definitive.

– Although dominance is less of a concern in inbred species, I believe it is still incorrect to assume that using Vg is equivalent to Va in Qst. The epistatic variance could still be important, for example, and my brief internet search suggests mungbean isn't 100% selfing either. The authors have SNP data for all the individuals phenotyped, so it should be doable to estimate Va using kinship matrices and thus do a correct Qst analysis.

– VCFtools calculates nucleotide diversity assuming every bp in a window has been sequenced. This will lead to incorrect estimates of π and Fst. See https://pubmed.ncbi.nlm.nih.gov/33453139/ for details and one potential fix if you have a vcf with invariant sites, or https://github.com/RILAB/mop for another if you just have bams + variable vcf.

– an NJ tree (Figure 1E) is just a clustering algorithm and shouldn't be interpreted as providing information on the timing or order of evolutionary events. (lines 119–123)

– If the growing season is known and is what is most relevant for the crop (line 190), these should be used for most analyses or a justification for using annual data should be provided

– I can't find the total length of time plants were grown for the drought stress experiment. The text reads as if it were 9 days?

Reviewer #2 (Recommendations for the authors):

The study aims to understand the diffusion of the mungbean population in Asia using genomic and phenotypic data. The authors used phylogeny, analysis of population structure, model–based inference, quantitative genetics, and niche modeling to understand the diffusion of mungbean and proposed a scenario where diffusion is strongly constrained by climate, and partially by geography.

I found the study very interesting, with a mix of different methodologies from genomics to niche modeling and quantitative genetics. Few other studies have used similar approaches, and it is rather unique for a study of diffusion to combine these different approaches and bring up very interesting results. I have several comments on the methods so that the results are better supported for the genetic part. For the phenotypic part, I am not sure the authors will be able to support their claims, as the method they used might inflate variance in their estimations.

Main comments:

Phylogeny. Phylogeny is not really an appropriate method per se for studies of intraspecific diversity. Phylogenetic analysis could be used, but the basic assumptions of the method are often inadequate, so the result should be put in context. A better method for phylogeny–like inference is the TREEMIX approach, which is based on the relationships among populations and integrates drift. I suggest the authors add this complementary analysis.

Model inference. Here, I would have liked a statistical comparison of the different scenarios. FastSimcoal allows the probability of models/scenarios to be estimated, and such a comparison of models would provide an independent validation.

The inference of the model, notably the time of divergence, depends on the mutation rate (unknown). The authors used 10⁻⁸, but this is neither discussed nor justified.

Analysis of structure.

The authors set aside from further study individuals with ancestry lower than 70% in a given group. The analysis of structure could yield such ancestry because of admixture or isolation by distance. One is secondary contact based on recent (or not) gene flow between genetic groups; the other is related to diffusion across the landscape. A large fraction of individuals in the PCA seems to me more likely related to isolation by distance (between SEA/EA, EA/CA). How will this impact the analysis of correlation with geographical distance if such individuals are not considered? The authors should perform the Mantel test with all individuals to assess the impact of their choices.

Quantitative analysis of QST. The authors used the genetic variance and not the additive variance. The reasoning behind this is not explained, nor is the inflation created by this broad–sense QST. Previous authors have concluded that the "extent to which comparisons between FST and broad–sense QST are appropriate remains unknown" (Pujol et al. 2008).

Reviewer #3 (Recommendations for the authors):

The paper aims to look at the domestication, post–domestication spread, and adaptation of mungbean (Vigna radiata) across Asia through the use of genetic data from landraces and accessions in seed and genebanks.

The genetic aspects of this paper are a strength – there is a lot of work that has been done putting together a range of datasets allowing for inter–collection comparison and comparison of collections made by different institutes with their varying goals, sampling strategies, and dates of collecting. Mapping this diversity and thinking about how and why drift has occurred is something that needs to be done, especially in mungbean (and other tropical pulses of Asian/South Asian origin), as they are often overlooked in the literature.

However, the main stated goal of the paper – to look at the domestication, post–domestication, and adaptations to climate change as this crop was moved around – is where it falls short. There is little engagement with the deep archaeological literature on both domestication as a process, post–domestication use and spread of mungbean (and other South Asian crops and those involved in the Silk Routes trade pathways), and the complexity of climate reconstructions and climate change over the stated period of interest/regions of interest. Works by Spengler and d'Alpoim Guedes, for example, are missed with regard to the Silk Routes debates, and literature by Fuller and Murphy is given only short sentences as background for what is a very complex picture of where and when mungbean is thought to have been domesticated. There is little reflection on the context of the two/three possible origins for mung (south, north, and west South Asia), how this interacts with the Southern Indian Neolithic and Indus regions, and how the changing cultural dynamics may have contributed to the processes of domestication, post–domestication change and the spread of different varieties. Without this background, it is hard to then move into discussing modern genetic data with a view to past patterns, for example with thinking about how climate may have affected change, given that the debates around the 4.2k event are extremely complex within these, let alone thinking pan–Asian and trying to link potentially 'drifted' genetic data today to these deep–time events. This comes across in the timescales given to the genetic data, as without the context of the where and when from archaeology, we see dates such as those given in Figure 2B that suggest domestication for South Asia moving to South East Asia at c.6kya (4000BC), which is when we still think they were under domestication within South Asia.
The region is also not pinned down for where in South Asia these specific 'domesticated' mung are coming from to go to South East Asia, and the routes, yet arrows and big circles are added in Figure 2A. This shows the issue of not using that important context from archaeology.

A further issue arises from thinking about the climate data. By conflating vast areas (e.g.: South Asia, Central Asia, etc.), when applying climate modelling there has been an oversimplification, which makes any discussion of mung bean adapting to climate post domestication difficult to sustain. In line 184 for example, there is a suggestion on the role of the 6.2k event in Central Asia (putting aside the above issues of the dating of mung domestication in South Asia before it even reaches Central Asian regions already noted). While there are a few datasets as cited in the paper that show some impacts of wetter climates in some regions of Central Asia for a wetter 6.2k event, this is by no means a universal impact, and regional data points are needed. We can see this when looking to the Indus and the impact of the 4.2k event as another example, again a point that needs refining in order to make such claims about mung domestication, let alone post–domestication adaptations.

Overall the thrust of the paper – domestication, post–domestication, and the spread of agriculture – are overshadowing what is actually a far more interesting point, hidden in lines 87–89: this data could be "used this resource to investigate the global history of mungbean after domestication […looking at the …] phenotypic characteristics for local adaptation to distinct environments." This perhaps is where the paper is most interesting, and reframing it in this context would be truly exciting, looking at the diversity of the crop, how it is now adapted to diverse environments, and what this might mean for long–term sustainability in cropping systems.

This paper sadly is losing some very interesting genetics data in complex and poorly explained mung history.

– The lack of engagement with archaeological data and the misunderstanding of the chronology of mung use in the past makes it very difficult to tally the results with the interpretations and discussions. This MAJOR point has been unpacked in more detail above and must be addressed in order to reduce the oversimplification of the background and remove the concerning premise that no one has done much work on ancient mung use (as stated up to l.77). While it has not had as much work as the cereals, there is still work being done on it, looking at its domestication regions, secondary domestication changes and spread across South Asia and then into different parts of Asia.

– Data seems to have been massaged to make it fit with the climate modelling in various regions (for example Figure 2B has mung arriving in Central Asia around 0.2k yet discussions of 6.2k climate events in l.184), and to also make mung seem to be spreading before the domestication event itself. More engagement with the archaeological discussions on mung domestication is needed and discussions of what domestication is as a process (there is a fundamental misunderstanding of the conscious/unconscious action outlined in Larson et al. in l.47 – the way it is phrased implies deliberate choice to ensure change rather than recognizing the inherent unconscious and indirect action of human behavior and the entanglement of human–plant–environment interactions).

– Terms like cultivar and variety and landrace are used interchangeably. These must be defined in the paper. How they fit in with notions of domestication and post–domestication agricultural behavior must also be unpacked.

– "how the domesticated forms later expanded to a broader geographical area has also been detailed in several species, including maize (Matsuoka et al., 2002), rice (Huang et al., 2012), tomato (Razifard et al., 2020), chickpea (Varshney et al., 2021), and lettuce (Wei et al., 2021)." – these are unusual choices of case study to make a point as many (maize and rice as key examples) are not accepted as well defined and remain highly controversial. These are genetics papers, and demonstrate the lack of familiarity with the archaeological context of domestication. A quick glance at the literature around them will illustrate that they are poor choices of case studies to make this point as they too are highly controversial.

– "It is also unclear whether the expansion of most crops strictly follows the longitudinal axis of the continents (Diamond, 2005) or whether or why some are able to cross different climatic zones." – again poor knowledge of the archaeological context of these debates, and the reliance on Diamond is concerning as he is not an archaeologist. See works by Lister on barley and Lui on wheat as a good starting point.

– The debate on the wild progenitor of mungbean needs to be explored. While sublobata is a likely candidate, it should be explained in the paper that there are other possible options, and then why it was chosen here, with citations.

– The figures are difficult to use. This comes back to the conflation of space and time outlined in the public review. There are big circles on the maps covering the dots which I presume are either archaeological sites or accession points of the sampled beans(?! unclear), and then very odd choices of illustrating change over time. The figures are small and hard to see and require very long text in the figures to make them useable.

– Some aspects of basic geography have been overlooked to make climate the most important variable; l.166–7 "Given that geographic barrier might not be the most important factor". I find it hard to believe that both the Himalayas and major flood basins like the Brahmaputra would not be an issue, as would issues of day length when moving things north–south.

– In dealing with issues of climate change and adaptation, some discussion of tolerance is needed. There must be a discussion (and a table perhaps) of the different watering, salination, temperature, etc. tolerance of the mung bean(s) under consideration to make the claims justified.

– Within the methods, the sampling strategy was hard to follow. This needs a more careful and clear description of decisions made: exactly where did the accessions come from geographically? How did their spread affect the dataset? Is there geographic clustering, and did you compensate for that? How does the sampling potentially bias your data? A map would be useful.

– Fair and open protocols dictate that all methods must be stated: "Genomic DNA was extracted from a single plant per accession using QIAGEN Plant Mini DNA kit according to the manufacturer's instruction with minor modification." If you modified the protocol then you have to outline what you did so it can be reproducible and the data comparable.

– "Climate data for conditions between 1960–1990 were downloaded from the WORLDCLIM 1.4 database" – how was this dataset determined? why 30 years and not more? give citations to explain this decision, and look to other modelling efforts to check comparability.

– "19 bioclimatic variables" – what are these? why were they chosen? a table and explanation are needed.

– "excluding one of the two variables that have a correlation above 0.8 (Supplementary file 4)" – why? explain the reasoning for exclusion.

– Throughout the dataset has relied on "In total, our dataset contains more than one thousand accessions (1092) and covers worldwide diversity of mungbean representing a wide range of variation in seed colour" however at no point is there a discussion of whether these are modern variants of historic landraces and how this was assessed. This has a big impact on any discussion of "ancient" adaptations, and there must be a discussion of how you tested to see if the genetic changes you see are more recent or past changes and how the genetic clock was applied.

As noted in the public review, a far more interesting aspect than trying to tie into domestication/post–domestication and chronological vagaries are the points made in lines 87–89: this data could be "used this resource to investigate the global history of mungbean after domestication […looking at the …] phenotypic characteristics for local adaptation to distinct environments." Thinking about the value of this dataset for the preservation of diversity, and how diversity links to localised adaptations today and to sustainable cropping now is critical, and I suggest this could be the way to reframe things.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Environment as a limiting factor of the historical global spread of mungbean" for further consideration by eLife. Your revised article has been evaluated by Detlef Weigel as Senior and Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined in the comments of Reviewer #1 below.

Reviewer #1:

I found this study to be one of the rare studies to combine genomic data, climate data, and phenotypic data to decipher the diversity of adaptation of plants and to try to build up a scenario of their diffusion. Previous recommendations were mostly addressed, and I am personally satisfied with this new version of the manuscript. At this stage I recommend acceptance of the paper as soon as the following concern is addressed.

Data availability:

I was not able to access the bioproject PRJN809503; is the data already available or not? Nor could I access a biosample that I tried to check.

Nor was I able to access the DRYAD data.

The authors provide a link to fastq data, but I could not find a fastq file link in Noble et al. or Breria et al. How did the authors merge the data? If they merged the data based on Table S1 or Noble et al., why did they find more SNPs than Noble et al. with a 10% missing rate? Since some authors are common between these studies, a clear path to access the whole dataset should be available to the community.

eLife. 2023 May 19;12:e85725. doi: 10.7554/eLife.85725.sa2

Author response


[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]

Reviewer #1 (Recommendations for the authors):

The authors provide an ambitious interdisciplinary analysis of the post–domestication spread of mungbean. I believe the authors' general thesis is correct, and the various data they provide are all consistent with their model.

1. Nonetheless, there are some issues with analysis and interpretation, and in general, the data don't provide definitive evidence that allows claims about drought or the importance of the environment.

All of the evidence and analysis presented are consistent with the authors' proposed SA–SEA–EA–CA model. The f3 results are particularly convincing. I struggle, however, with how the data definitively show environment was more important than human movement. Some text showing, for example, that trading or demic movement between SA and CA was frequent would make clear that mungbean 'could' have moved SA–>CA via humans but didn't. Otherwise, could the SA–SEA–EA–CA model simply reflect historical movements/expansions of people in those directions/times?

Thank you. In the previous version we put most of this information in the Discussion, and we have now elaborated on this point. Specifically, we emphasized studies showing that, as early as about 4 kya, Central and South Asia were connected through a complex exchange network linking the north of Hindu Kush, Iran, and the Indus River Civilization (Doumani Dupuy, 2016; Kohl, 2007; Kohl and Lyonnet, 2008; Lamberg-Karlovsky, 2002; Lombard, 2020; Lyonnet, 2005). There is also archaeological evidence that diverse crops from West, South, and East Asia were later cultivated in northern Pakistan (Spengler et al. 2021), suggesting the prevalence of crop exchange. Human-mediated long-distance dispersal of mungbean seeds also happened, as mungbean seeds dating to the Roman period have been found near the Red Sea coast of Egypt (Van der Veen and Morales, 2015). Despite the prevalence of exchange, we note in the Discussion: “Despite this, in Bronze-age archaeological sites north of Hindu Kush, cereals were frequently observed. Legumes (such as peas and lentils) were observed to a lesser extent, and South Asian crops were not commonly found (Jeong et al. 2019, Spengler et al. 2018). Interestingly, archaeologists suggested legume’s higher water requirement than cereals may be associated with this pattern, and pea and lentil’s role as winter crops in Southwest Asia may be associated with their earlier appearance in northern Central Asia than other legumes (Spengler et al. 2014).” Our study of genetics, climate, and plant traits therefore supports the conclusions from archaeological studies.

Doumani Dupuy PN 2016. Bronze Age Central Asia. The Oxford Handbook of Topics in Archaeology, New York: Oxford University Press.

Jeong et al. 2019. The genetic history of admixture across inner Eurasia. Nature Ecology & Evolution 3:966-976

Kohl PL 2007. Entering a Sown World of Irrigation Agriculture – From the Steppes to Central Asia and Beyond: Processes of Movement, Assimilation, and Transformation into the “Civilized” World East of Sumer. The Making of Bronze Age Eurasia, Cambridge: Cambridge University Press pp. 182-243

Kohl PL and Lyonnet B 2008. By land and by sea: The circulation of materials and peoples, ca. 3500–1800 BC. In: Olijdam E and Spoor RH (eds.), Intercultural Relations between South and Southwest Asia. Studies in Commemoration of E.C.L. During Caspers (1934–1996), Oxford: Archaeopress pp. 29–42

Lamberg-Karlovsky CC 2002. Archaeology and language: The Indo-Iranians. Current Anthropology 43:63–88

Lombard P 2020. The Oxus civilization/BMAC and its interaction with the Arabian Gulf. A review of the evidences. In: Lyonnet B and Dubova NA (eds.), The world of the Oxus civilization, London: Routledge pp. 607-637

Lyonnet B 2005. Another possible interpretation of the Bactro-Margiana Culture (BMAC) of Central Asia: The tin trade. In: Jarrige C and Lefevre V (eds.), South Asia Archaeology 2001, Volume 1: Prehistory, Paris: Editions recherche sur les civilisations pp. 191-200

Spengler et al. 2014. Agriculturalists and pastoralists: Bronze Age economy of the Murghab alluvial fan, southern Central Asia. Vegetation History and Archaeobotany 23:805-820

Spengler et al. 2018. Arboreal crops on the medieval Silk Road: Archaeobotanical studies at Tashbulak. PLOS ONE 13(8):e0201409

Spengler et al. 2021. The southern Central Asian mountains as an ancient agricultural mixing zone: new archaeobotanical data from Barikot in Swat valley of Pakistan. Vegetation History and Archaeobotany 30:463-476

Van der Veen M and Morales J 2015. The Roman and Islamic spice trade: New archaeological evidence. Journal of Ethnopharmacology 167:54-63

2. On a similar note, the paper claims that drought was the most important factor limiting the mungbean movement. The phenotypic data certainly suggest adaptive differences in traits related to drought and drought differences in phenotype in an experimental setting, but additional evidence would be helpful to convince that drought is the key factor. The authors note the importance of daylength differences, for example – how do we know daylength wasn't the limiting factor (as it seems to have been in the northward spread of maize, for example)?

Thank you. The main thesis of this study is that environmental adaptation may be an important factor. We used drought (among many other environmental factors) as a test case and chose root/shoot traits and field phenology as supporting evidence. We wish to note that we do not claim drought to be the most important factor. We believe mungbean was influenced by the joint effects of multiple environmental factors: in the north, there is a limited duration for sowing (due to daylength), and the soon-arriving fall frost (temperature) further limits the growing season. During this period, water availability is also low (drought). Rapid phenology therefore appears to be important for adaptation to the short growing season (caused by daylength and temperature) and declining water availability (drought). We have rephrased this point throughout the manuscript.

Methods Recommendations

3. It's difficult in some places to know which SNP data set is used where. Are the LD–pruned SNPs used to estimate LD decay? The methods say there are 67K sites (including monomorphic) but there are 41K SNPs. This would imply that 2/3 of the sites are polymorphic. That can't be right so I must be misunderstanding something. Are these monomorphic sites also used for estimating nucleotide diversity (see the issue with VCFtools below)? For fastsimcoal, do you include sites where the derived allele is frequency=0 and where the derived allele is frequency 1?

We apologize for the error. After quality filtering, there are 1248K sites in total, of which 34K are SNPs. Nucleotide diversity, pairwise genetic distance, and all related analyses have been redone using estimates that incorporate monomorphic sites. For fastsimcoal, we included monomorphic sites. The previous information (67K sites for fastsimcoal) was a typo. LD decay was calculated using the SNP sites without LD pruning. We have revised the manuscript throughout.

4. It would be good to modify the language of the MaxEnt modeling. While MaxEnt may show that the modern mungbean does not currently grow in conditions similar to those during the Holocene at a certain spot on the map, this is not the same as saying the plant could not have grown there. For example, if we were to model MaxEnt on modern CO2 concentrations, we would come to the conclusion that modern mungbean could not have grown anywhere during the Holocene. I don't think necessarily the methods need to be changed, but it would be good to change the language here to be less definitive.

Thank you. After considering this and the comments of reviewer 3, we have removed the mid-Holocene niche modeling results. In this revision we focus on comparing the potential range overlap in current conditions.

5. Although dominance is less of a concern in inbred species, I believe it is still incorrect to assume that using Vg is equivalent to Va in Qst. The epistatic variance could still be important, for example, and my brief internet search suggests mungbean isn't 100% selfing either. The authors have SNP data for all the individuals phenotyped, so it should be doable to estimate Va using kinship matrices and thus do a correct Qst analysis.

Thank you. Goudet and Büchi (2006) showed that for partially inbred species, the selfed-progeny design (as in our study) is more suitable than the half-sib design. With selfed progeny, Goudet and Büchi (2006) suggested using QST = VB / (VB + VFam), where VB is the among-group variance component and VFam is the family-level variance component within genetic groups. This is the method originally used in our study as well as in many Arabidopsis thaliana (Méndez-Vigo et al. 2013, Stenoien et al. 2005, Wieters et al. 2021) and Medicago truncatula (Bonnin et al. 1996) studies. To accommodate the possibility that mungbean is not completely selfing, we also applied the equation QST = (1 + f) VB / ((1 + f) VB + 2 VAW) from Goudet and Büchi (2006), where f is the inbreeding coefficient (estimated by VCFtools as 0.8425), VB is the among-population variance component, and VAW is the additive genetic variance within populations, estimated from the kinship matrix using TASSEL. The results and conclusions are shown in this revision and are similar to those of our previous version.
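As a minimal sketch of the two QST estimators quoted above, the following Python functions evaluate each formula directly. The function names and the variance components in the example are hypothetical and chosen only for illustration; only the inbreeding coefficient f = 0.8425 comes from the response. Estimating the variance components themselves (for example from a mixed model or a kinship matrix) is outside this sketch.

```python
def qst_selfed(v_between: float, v_family: float) -> float:
    """QST for a selfed-progeny design: VB / (VB + VFam)."""
    return v_between / (v_between + v_family)


def qst_partial_inbreeding(v_between: float, v_additive_within: float, f: float) -> float:
    """QST = (1 + f) VB / ((1 + f) VB + 2 VAW) for inbreeding coefficient f."""
    scaled_between = (1.0 + f) * v_between
    return scaled_between / (scaled_between + 2.0 * v_additive_within)


if __name__ == "__main__":
    # Hypothetical variance components, for illustration only.
    v_b, v_fam, v_aw = 2.0, 1.0, 0.5
    f = 0.8425  # inbreeding coefficient reported in the response
    print(round(qst_selfed(v_b, v_fam), 4))
    print(round(qst_partial_inbreeding(v_b, v_aw, f), 4))
```

With complete inbreeding (f = 1) the second expression reduces to VB / (VB + VAW), which has the same form as the selfed-progeny estimator.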

Bonnin et al. 1996. Genetic markers and quantitative genetic variation in Medicago truncatula (Leguminosae): a comparative analysis of population structure. Genetics 143:1795–1805

Goudet J and Büchi L 2006. The effects of dominance, regular inbreeding and sampling design on QST, an estimator of population differentiation for quantitative traits. Genetics 172(2):1337-1347

Méndez-Vigo et al. 2013. Among- and within-population variation in flowering time of Iberian Arabidopsis thaliana estimated in field and glasshouse conditions. New Phytologist 197(4):1332–1343

Stenoien et al. 2005. Genetic variability in natural populations of Arabidopsis thaliana in northern Europe. Molecular Ecology 14:137–148

Wieters et al. 2021. Polygenic adaptation of rosette growth in Arabidopsis thaliana. PLOS Genetics 17(1):e1008748

6. VCFtools calculates nucleotide diversity assuming every bp in a window has been sequenced. This will lead to incorrect estimates of π and Fst. See https://pubmed.ncbi.nlm.nih.gov/33453139/ for details and one potential fix if you have a vcf with invariant sites, or https://github.com/RILAB/mop for another if you just have bams + variable vcf.

We regenerated the VCF to include invariant sites and excluded sites with a high proportion of missing data. We used pixy (Korunes and Samuk, 2021) to calculate nucleotide diversity (π) and genetic differentiation (FST) using all sites, including invariant ones. The results have been updated.

Korunes KL and Samuk K 2021. PIXY: Unbiased estimation of nucleotide diversity and divergence in the presence of missing data. Molecular Ecology Resources 21(4):1359-1368
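The reviewer's concern about denominators can be illustrated with a small sketch. It is not pixy's code and not the authors' pipeline; it is a simplified illustration that assumes haploid 0/1 allele calls with None marking missing data, and all names and data are hypothetical. The first estimator divides summed pairwise differences by summed pairwise comparisons over all called sites (variant and invariant), while the naive estimator divides by the window length as if every base had been sequenced.

```python
from math import comb


def site_counts(alleles):
    """Return (pairwise differences, pairwise comparisons) for one site.

    alleles: list of 0/1 allele calls, with None for missing data.
    """
    called = [a for a in alleles if a is not None]
    n = len(called)
    n_alt = sum(called)
    return n_alt * (n - n_alt), comb(n, 2)


def pi_with_missing(sites):
    """Pi as total differences over total comparisons across called sites."""
    diffs = comps = 0
    for alleles in sites:
        d, c = site_counts(alleles)
        diffs += d
        comps += c
    return diffs / comps if comps else float("nan")


def pi_naive(variant_sites, window_length):
    """Pi assuming every base in the window was sequenced in every sample."""
    total = 0.0
    for alleles in variant_sites:
        d, c = site_counts(alleles)
        if c:
            total += d / c
    return total / window_length


if __name__ == "__main__":
    # Hypothetical 10 bp window: 1 variant site, 4 invariant sites, 5 uncalled sites.
    variant = [0, 0, 1, 1]
    window = [variant] + [[0, 0, 0, 0]] * 4 + [[None] * 4] * 5
    print(pi_with_missing(window))
    print(pi_naive([variant], window_length=10))
```

In this toy window half of the sites are uncalled, so the naive estimate is half of the estimate that only counts comparisons actually made.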

7. An NJ tree (Figure 1E) is just a clustering algorithm and shouldn't be interpreted as providing information on the timing or order of evolutionary events. (lines 119–123)

Following the suggestion from Reviewer 2 (point 2), we used TREEMIX to infer the population-level relationships. In addition, we used methods such as outgroup f3 tests and f4 tests to investigate the relationships among these genetic groups. All results are consistent with our previous result of (SA,(SEA,(EA,CA))).
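As a rough illustration of the outgroup f3 idea mentioned above, the statistic f3(O; A, B) can be sketched as the average over SNPs of (pO - pA)(pO - pB), where p denotes an allele frequency and O is the outgroup. Larger values indicate more genetic drift shared by A and B since their split from O. The function name, population labels, and allele frequencies below are hypothetical; real analyses add a finite-sample bias correction and block-jackknife standard errors, both omitted here, so this is not a reproduction of the authors' analysis.

```python
def outgroup_f3(p_out, p_a, p_b):
    """Mean of (p_O - p_A) * (p_O - p_B) across SNPs, without bias correction."""
    if not p_out or not (len(p_out) == len(p_a) == len(p_b)):
        raise ValueError("frequency lists must be non-empty and of equal length")
    products = [(o - a) * (o - b) for o, a, b in zip(p_out, p_a, p_b)]
    return sum(products) / len(products)


if __name__ == "__main__":
    # Hypothetical allele frequencies at three SNPs.
    outgroup = [0.0, 0.0, 0.0]
    pop_a = [0.5, 0.2, 0.0]
    pop_b = [0.5, 0.4, 0.0]
    pop_c = [0.1, 0.0, 0.6]
    # A shares more drift with B than with C in this toy example.
    print(outgroup_f3(outgroup, pop_a, pop_b) > outgroup_f3(outgroup, pop_a, pop_c))
```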

8. If the growing season is known and is what is most relevant for the crop (line 190), these should be used for most analyses or a justification for using annual data should be provided

While mungbean can be grown in the summer/early fall in most of Asia, in the southern part of Asia mungbean is occasionally grown in winter/spring. Because the growing season differs between the northern and southern parts of Asia, we performed parallel analyses, one using annual data and the other using temperature and precipitation from May, July, and September as growing season data. Both results are presented in the revised manuscript.

9. I can't find the total length of time plants were grown for the drought stress experiment. The text reads as if it were 9 days?

We revised the Materials and methods section. The seedlings were grown in nutrient solution for 6 days before being subjected to drought stress. For drought treatment, seedlings of mungbean were exposed to polyethylene glycol (PEG)-induced drought stress for 5 days as illustrated in Author response image 1:

Author response image 1.


Reviewer #2 (Recommendations for the authors):

The study aims to understand the diffusion of the mungbean population in Asia using genomic and phenotypic data. The authors used phylogeny, analysis of population structure, model–based inference, quantitative genetics, and niche modelling to understand the diffusion of mungbean and proposed a scenario where diffusion is strongly constrained by climate, and partially by geography.

1. I found the study very interesting with a mix of different methodologies from genomics to niche modeling and quantitative genetics. Few other studies have used similar approaches, but it is rather unique for the study of diffusion to combine these different approaches and bring up very interesting results. I have several comments on the methods so that the results are better supported for the genetic part; for the phenotypic part, I am not sure the authors will be able to support their claims, as the method they used might inflate variance in their estimations.

Thank you. For the phenotype part, in this revision we applied a method that specifically addresses additive genetic variance and the potential issue that mungbean is not fully inbred (Reviewer 1, point 5). The results are qualitatively similar and the conclusions are the same. Please refer to Reviewer 1, point 5 and Reviewer 2, point 5 for details.

Main comments:

2. Phylogeny. Phylogeny is not really an appropriate method per se for intra–diversity study. Phylogenetic analysis could be used, but the basic assumptions of the method are often inadequate, so the result should be put in context. A better method to do phylogeny–like inference is the TREEMIX approach, based on the relationship of populations integrating drift. I suggest the authors add this complementary analysis.

Thank you. We have performed TREEMIX in Figure 2A. The relationship among these genetic groups was also supported by outgroup f3 statistics and f4 statistics.

3. Model inference. Here, I would have liked a statistical comparison of the different scenarios. FastSimcoal allows for estimated probability of models/scenarios and it would be an independent validation to have such a comparison of models.

The inference of the model is dependent on mutation rate (unknown), notably on the time of divergence. The authors used 10⁻⁸ but it is neither discussed nor justified.

Thank you. In addition to our previous results, in this revision we expanded the population structure inference section by incorporating and connecting independent analyses such as TREEMIX, dxy, FST, outgroup f3 statistics, and f4 statistics. All results point to the same pattern: EA and CA are genetically closest, followed by SEA, and SA is the most diverged group. The tree shape (SA,(SEA,(EA,CA))) therefore has strong support from many independent analyses. We mainly use fastsimcoal2 not as a tree-shape-testing method but as a method to estimate population divergence time based on this well-supported tree shape.

We wish to note that the current tree shape is still consistent with multiple hypotheses of cultivar expansion (please refer to the manuscript for the “east hypothesis”, “north hypothesis”, and “northeast hypothesis”), and testing different tree shapes would not necessarily distinguish among these hypotheses. Instead of using fastsimcoal2 to test different tree shapes, we employed the evidence from π, linkage disequilibrium decay, and isolation by distance to test these hypotheses.

Upon checking the common mutation rates used in plant population genetics analyses, we found our original 1e-8 was within the range of mutation rates used in eudicots (the literature has used 1.43e-9 to 2.50e-8). We performed a parallel analysis using 2e-8, and the results are qualitatively similar; we therefore retained the original results from 1e-8. Author response table 1 shows the 75% confidence range of these time estimates (kga = thousand generations ago).

Author response table 1.

Divergence | Mutation rate 1e-8 | Mutation rate 2e-8
SA vs. (SEA,(EA,CA)) | 4.7 to 11.3 kga | 5.6 to 11.3 kga
SEA vs. (EA,CA) | 1.1 to 4.6 kga | 1.3 to 5.4 kga
EA vs. CA | 0.1 to 0.8 kga | 0.1 to 0.5 kga

4. Analysis of structure. The authors set aside individuals with ancestry lower than 70% in a given group from further study. The analysis of structure could lead to such ancestry because of admixture or isolation by distance. One is secondary contact based on recent (or not) gene flow between genetic groups; the other is related to diffusion across the landscape. A large fraction of individuals in the PCA seems to me more likely related to isolation by distance (between SEA/EA, EA/CA). How will this impact the analysis of correlation with geographical distance if such individuals are not considered? The authors should perform the Mantel test with all individuals to assess the impact of their choices.

Thanks for the suggestions. We have re-performed the Mantel test following the suggestion, and the results are consistent with our previous IBD analysis.
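For readers unfamiliar with the Mantel test, a minimal permutation sketch is given here. It is a generic illustration and not the software or settings used in the study; the function name `mantel`, the number of permutations, and the toy distance matrices are all hypothetical.

```python
import numpy as np

def mantel(dist_a, dist_b, n_perm=999, seed=0):
    """One-sided Mantel test (Pearson r) between two square distance matrices."""
    a = np.asarray(dist_a, dtype=float)
    b = np.asarray(dist_b, dtype=float)
    n = a.shape[0]
    iu = np.triu_indices(n, k=1)            # upper triangle, no diagonal
    r_obs = np.corrcoef(a[iu], b[iu])[0, 1]
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        a_perm = a[p][:, p]                 # permute rows and columns together
        r = np.corrcoef(a_perm[iu], b[iu])[0, 1]
        if r >= r_obs:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)     # observed value counts as one draw
    return r_obs, p_value

# Hypothetical example: six accessions along a transect, with genetic
# distance exactly proportional to geographic distance.
coords = np.array([0.0, 1.0, 3.0, 6.0, 10.0, 15.0])
geo = np.abs(coords[:, None] - coords[None, :])
gen = 2.0 * geo
r_obs, p_value = mantel(gen, geo)
```

Rows and columns are permuted together so that each permutation reassigns whole individuals, which preserves the dependence structure within a distance matrix.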

5. Quantitative analysis of QST. The authors used the genetic variance and not the additive variance. The reasoning behind that is not explained, nor is the inflation created by this broad–sense QST. Authors have concluded that the "extent to which comparisons between FST and broad–sense QST are appropriate remains unknown" (Pujol et al. 2008).

Thank you. Upon reading Pujol et al. (2008), it is unclear whether the comment pertains to predominantly selfing species like mungbean. We therefore refer to Goudet and Büchi 2006 (https://pubmed.ncbi.nlm.nih.gov/16322514/), which has a complete treatment of the effect of inbreeding and genetic variance estimates on QST. In addition to the suggestion from Goudet and Büchi 2006 (which is the same as our original approach), we followed the suggestions in Reviewer 1, point 5 and estimated the new QST using additive within-group genetic variance. The results are similar and the conclusions remain the same. Please also refer to Reviewer 1, point 5 for details.
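As a rough illustration of why the inbreeding level matters, the sketch below uses a commonly written form of QST that includes the inbreeding coefficient, QST = (1+f)·Vb / ((1+f)·Vb + 2·Vw). That this is the relevant form is an assumption made here; the exact estimator used in the revised manuscript is not stated in this response, and the function name and variance values are hypothetical.

```python
def qst(var_between, var_within, f_is=0.0):
    """QST from among-group and within-group additive genetic variance.

    f_is: inbreeding coefficient (0 = random mating, 1 = fully inbred).
    Assumed form: (1 + f) * Vb / ((1 + f) * Vb + 2 * Vw).
    """
    scaled_between = (1.0 + f_is) * var_between
    return scaled_between / (scaled_between + 2.0 * var_within)

# Hypothetical variance components, equal among and within groups.
qst_outbred = qst(1.0, 1.0, f_is=0.0)   # reduces to Vb / (Vb + 2 Vw)
qst_inbred = qst(1.0, 1.0, f_is=1.0)    # reduces to Vb / (Vb + Vw)
```

With the same variance components, the fully inbred case gives a higher QST than the outbred case, so the assumed inbreeding level directly affects any comparison with FST.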

Reviewer #3 (Recommendations for the authors):

The paper aims to look at the domestication, post–domestication spread, and adaptation of mungbean (Vigna radiata) across Asia through the use of genetic data from landraces and accessions in seed banks and genebanks.

The genetic aspects of this paper are a strength – there is a lot of work that has been done putting together a range of datasets allowing for inter–collection comparison and comparison of collections made by different institutes with their varying goals, sampling strategies, and dates of collecting. Mapping this diversity, and thinking about how drift has occurred and why, is something that needs to be done, especially in mungbean (and other tropical pulses of Asian/South Asian origin) as they are often overlooked in the literature.

1. However, the main stated goal of the paper – to look at the domestication, post–domestication, and adaptations to climate change as this crop was moved around – is where it falls short. There is little engagement with the deep archaeological literature on both domestication as a process, post–domestication use and spread of mungbean (and other South Asian crops and those involved in the Silk Routes trade pathways), and the complexity of climate reconstructions and climate change over the stated period of interest/regions of interest. Works by Spengler and d'Alpoim Guedes for example are missed with regards to the Silk Routes debates, and literature by Fuller, Murphy are given only short sentences as background for what is a very complex background regarding where and when mungbean is thought to have been domesticated. There is little reflection on the context of the two/three possible origins for mung (south, north, and west South Asia), how this interacts with the Southern Indian Neolithic and Indus regions, and how the changing cultural dynamics may have contributed to the processes of domestication, post–domestication change and the spread of different varieties. Without this background, it is hard to then move into discussing modern genetic data with a view to past patterns, for example with thinking about how climate may have affected change, given that the debates around the 4.2k event are extremely complex within these, let alone thinking pan–Asian and trying to link potentially 'drifted' genetic data today to these deep–time events. This comes across in the timescales given to the genetic data, as without the context of the where and when from archaeology, we see dates such as those given in Figure 2B that suggest domestication for South Asia moving to South East Asia at c.6kya (4000BC), which is when we still think they were under domestication within South Asia.
The region is also not pinned down for where in South Asia these specific 'domesticated' mung are coming from to go to South East Asia, and the routes, yet arrows and big circles are added in Figure 2A. This shows the issue of not using that important context from archaeology.

We appreciate the valuable comments. In this revision we discuss previous archaeological findings in the Introduction and, as suggested, use the archaeological findings as the basis of our genetic investigation. Using genetic data, we reconstructed the evolutionary relationships among these major genetic groups. We showed that CA and EA are genetically closest, followed by SEA, and SA is the most diverged group. As stated in our revised Results section, this observation is still consistent with several hypotheses, and we used further population genetic evidence to distinguish among these hypotheses. While our data suggest the general trend of SA-SEA-EA-CA, our investigation is limited by the number of samples with detailed geo-referenced information.

Indeed, archaeological evidence has shown multiple independent early cultivations of mungbean within South Asia. As the reviewer has pointed out, the present-day samples have ‘drifted’ from the ancient varieties. It is exactly for this reason that we did not attempt to use the present-day samples’ detailed GPS coordinates to address the issue of these independent origins of mungbean cultivation within South Asia. Due to the possibility of long-term seed exchange and replacement within South Asia, the samples collected in a specific location might not necessarily reflect the genetic information of ancient mungbean at the same location. This is supported by our recent results (Lin et al. 2022) that present-day worldwide cultivars, regardless of their geographic origin, have the same haplotype in the promoter region of VrMYB26a, a candidate gene controlling pod shattering in several Vigna species. This is consistent with a single origin of the loss of pod shattering phenotype common to present-day cultivars and suggests that while early mungbean cultivations happened independently, eventually one form with the loss-of-pod-shattering phenotype (and VrMYB26a haplotype) dominated and replaced others. Answering where the "domestication allele" of VrMYB26a originated or deciphering the relationship between present-day germplasms and ancient cultivation origins within India would require information from more geo-referenced ancient DNA, which is outside the scope of the present study. On the other hand, we recognize the climate heterogeneity within South Asia. We performed new climatic analyses separating the SA group into two major regions based on the Köppen climate classification (one more similar to Southeast Asia and the other more similar to Central Asia). The results remain similar.

We completely agree with the reviewer that modern samples are ‘drifted’ and do not necessarily reflect ancient conditions. It is exactly for this reason that we did not attempt to pinpoint the detailed or specific route of the out-of-India event, either. The genetic data simply do not allow us to pinpoint, for example, whether mungbean expanded from South Asia to Southeast Asia / southern China through the land or maritime routes (Fuller et al. 2011; Castillo et al. 2016). In our revision, we emphasized that these routes are both compatible with our hypothesis. We wish to emphasize that we do not put much emphasis on the exact route of mungbean expansion, nor do we think there was only one wave/one route of the out-of-India event. There might have been multiple attempts to bring mungbean out of India (which cannot be resolved without ancient DNA), and our focus in this study is whether or when mungbean could be grown and became part of the local agriculture throughout Asia. As in many other population genetics studies, if the previous waves were replaced by later-arriving germplasms, our data cannot provide information on the previous waves. We agree that big circles and arrows are too rough a depiction of the more complicated history of mungbean spread. We have modified this figure and avoided specifying a particular route without genetic evidence.

Take human genetics as an example: while modern populations have drifted from ancient populations, studies using modern DNA have revealed much of the important demographic history. Using genetic and phenotypic data from modern populations, scientists have been able to decipher unique adaptations in human history (for example, lactose intolerance and many other studies from researchers such as Rasmus Nielsen and Graham Coop). Our work has similar aims and approaches to these studies. Similarly, questions such as “how many independent out-of-Africa events have happened”, “whether each of these originated from different locations within Africa”, or “the specific route of how anatomically modern human reached East Asia” would await studies incorporating ancient DNA from archaeological studies. For mungbean, we very much look forward to the chance of investigating such samples.

Regarding the estimated divergence times among genetic groups, we wish to note that the time estimates have wide confidence intervals (previously we only used the 50% range, and Reviewer 1 suggested reporting wider ranges). Further, the times were estimated in units of generations instead of years (we modified the manuscript to clarify this). The growing season in the northern part of Asia is short, and we reasonably assumed one generation per year. This might not fit the whole of Asia, since the southern parts of Asia have a longer growing season and may allow more than one generation per year. Assuming one generation per year (as we naively did in the previous version) overestimated the divergence time for the southern groups. This is a limitation of current methods in this field, and we have acknowledged this in the manuscript.
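The dependence of the calendar-time estimate on the assumed number of generations per year can be made explicit with a short sketch. The interval below is the SA vs. (SEA,(EA,CA)) range from Author response table 1 (mutation rate 1e-8); the function name and the value of two generations per year are hypothetical and serve only to illustrate the direction of the bias.

```python
def generations_to_years(kga_low, kga_high, generations_per_year=1.0):
    """Convert an interval in thousand generations ago (kga) to years ago."""
    years_per_kga = 1000.0 / generations_per_year
    return kga_low * years_per_kga, kga_high * years_per_kga

# SA vs. (SEA,(EA,CA)), 75% range from Author response table 1: 4.7 to 11.3 kga.
one_per_year = generations_to_years(4.7, 11.3, generations_per_year=1.0)
# Hypothetical: two crop generations per year in a long growing season.
two_per_year = generations_to_years(4.7, 11.3, generations_per_year=2.0)
```

Under this simple scaling, doubling the number of generations per year halves the estimate in years, which is why a one-generation-per-year assumption overestimates divergence times for the southern groups.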

Castillo et al. 2016. Rice, beans and trade crops on the early maritime Silk Route in Southeast Asia. Antiquity 90(353):1255-1269

Fuller et al. 2011. Across the Indian Ocean: The prehistoric movement of plants and animals. Antiquity 85(328):544-558

Lin et al. 2022. Distinct selection signatures during domestication and improvement in crops: a tale of two genes in mungbean. bioRxiv

Stevens et al. 2016. Between China and South Asia: A Middle Asian corridor of crop dispersal and agricultural innovation in the Bronze Age. The Holocene 26(10):1541-1555

2. A further issue arises from thinking about the climate data. By conflating vast areas (e.g.: South Asia, Central Asia, etc.), when applying climate modelling there has been an oversimplification, which makes any discussion of mung bean adapting to climate post domestication difficult to sustain. In line 184 for example, there is a suggestion on the role of the 6.2k event in Central Asia (putting aside the above issues of the dating of mung domestication in South Asia before it even reaches Central Asian regions already noted). While there are a few datasets as cited in the paper that show some impacts of wetter climates in some regions of Central Asia for a wetter 6.2k event, this is by no means a universal impact, and regional data points are needed. We can see this when looking to the Indus and the impact of the 4.2k event as another example, again a point that needs refining in order to make such claims about mung domestication, let alone post–domestication adaptations.

Our study started from population genetic analyses, which separated the samples into four genetic groups. These genetic groups were separated purely based on genetic data without any geographic information, and their names (SA, SEA, EA, CA) were given based on where most of the materials came from. Given this, each genetic group is a relatively homogeneous entity compared to worldwide genetic variation. We fully agree that each specific location has its own unique climatic conditions, but since the purpose of this study is to investigate what factors affect the divergence among these four genetic groups, we used each genetic group as a unit in climate analyses and aimed to identify which factors have larger among-group than within-group variation. We fully agree and recognize that the distribution ranges for some genetic groups have large environmental heterogeneity. For example, the EA group is distributed from the Pacific Coast to Central Asia. This is why we separated the EA group into eastern and western halves to accommodate this issue. Another geographic region potentially with large environmental heterogeneity is South Asia. In this revision, we investigated the distribution of these samples based on the Köppen climate classification. CA samples have a relatively homogeneous distribution in a few Köppen zones with similar climatic characteristics, and so do SEA samples. Based on the patterns of SA and EA samples, we separated them into two groups each, resulting in a total of six zones. We wish to emphasize that the major goal of this study is to investigate factors affecting the differentiation among these four genetic groups. Further subsetting the range into many small regions was not strongly supported by genetic evidence.
Finally, niche modeling requires a reasonable number of samples to estimate the “tolerance environmental range” (Reviewer 3, point 1) of a group, and subsetting samples based on the unique climatic characteristics of each location is not suitable for this analysis.
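The idea of comparing among-group with within-group variation for a climate variable can be sketched as a simple variance partition. This is a generic illustration rather than the analysis performed in the manuscript; the function name, the group labels, and the precipitation values are hypothetical.

```python
import numpy as np

def among_group_fraction(values_by_group):
    """Fraction of the total sum of squares that lies among groups (eta squared)."""
    groups = [np.asarray(v, dtype=float) for v in values_by_group.values()]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_total = ((all_values - grand_mean) ** 2).sum()
    return ss_between / ss_total

# Hypothetical values of one climate variable for accessions of two groups.
toy = {"group_A": [1.0, 2.0, 3.0], "group_B": [11.0, 12.0, 13.0]}
fraction = among_group_fraction(toy)
```

A variable with a fraction close to 1 differs mainly among groups, whereas a value close to 0 indicates that most of its variation occurs within groups.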

Note that in climatic modeling, we obtained the present-day environments from the present-day distribution of these present-day materials to identify their suitable “niche space”. Using this information, we predicted “where would be the suitable environment for these genetic groups under different climatic conditions” given the niche space information of each genetic group. The existence of the projected distribution for the CA group at 6.2kya (in the previous supplement figure) was merely a method to compare the climatic difference between 6.2kya and today, and this figure does not mean we thought the CA group already existed in the Central Asia geographical region at 6.2kya. We did not attempt to make mungbean seem already widespread at that time (in response to Reviewer 3, point 5). The whole purpose of the analysis using 6.2k climate was to investigate whether the Central Asia geographical region may be a suitable habitat for plants from the SA group when the Central Asian climate might be slightly different from today. If so, that would reject our hypothesis. Due to the possibility of confusion, we removed this figure in the revision.

To project the suitable distribution range, this niche modeling method requires GIS layers with data for every 1 km geographical grid cell, which are available for a very limited number of time points. While the mid-Holocene is the only close-enough time point with modeled bioclimatic variables (http://www.worldclim.com/past), we agree this is a poor choice. We also agree that detailed regional data would better reflect the spatial heterogeneity of climatic conditions, but this distribution modeling approach requires climate data for all geographical grid cells. We therefore removed the mid-Holocene niche modeling result.

3. Overall the thrust of the paper – domestication, post–domestication, and the spread of agriculture – are overshadowing what is actually a far more interesting point, hidden in lines 87–89: this data could be "used this resource to investigate the global history of mungbean after domestication […looking at the …] phenotypic characteristics for local adaptation to distinct environments." This perhaps is where the paper is most interesting, and reframing it in this context would be truly exciting, looking at the diversity of the crop, how it is now adapted to diverse environments, and what this might mean for long–term sustainability in cropping systems.

Thank you. We fully agree that understanding the genetics, unique phenotypic characteristics, and local adaptation of these germplasms is important, and we sincerely hope to use our results to contribute to the goal of long-term sustainability and crop improvement. This is the topic of other research projects in our group. On the other hand, we also agree with the two other reviewers that using the genetic data to contribute to the understanding of mungbean cultivation expansion is interesting. As the potential importance of environmental adaptation in determining crop cultivation range has been suggested in several archaeological studies (Spengler et al. 2014 and Spengler et al. 2018), here we provide support from the genetics perspective. We have modified the manuscript to reflect the archaeological context.

Spengler et al. 2014. Late Bronze Age agriculture at Tasbas in the Dzhungar Mountains of eastern Kazakhstan. Quaternary International 348:147-157

Spengler et al. 2018. The breadth of dietary economy in Bronze Age Central Asia: Case study from Adji Kui 1 in the Murghab region of Turkmenistan. Journal of Archaeological Science: Reports 22:372-381

This paper sadly is losing some very interesting genetics data in complex and poorly explained mung history.

4. The lack of engagement with archaeological data and the misunderstanding of the chronology of mung use in the past makes it very difficult to tally the results with the interpretations and discussions. This MAJOR point has been unpacked in more detail above and must be addressed in order to reduce the oversimplification of the background and remove the concerning premise that no one has done much work on ancient mung use (as stated up to l.77). While it has not had as much work as the cereals, there is still work being done on it, looking at its domestication regions, secondary domestication changes and spread across South Asia and then into different parts of Asia.

Thank you. In our revision, we have devoted a paragraph in the Introduction to the archaeological results.

5. Data seems to have been massaged to make it fit with the climate modelling in various regions (for example Figure 2B has mung arriving in Central Asia around 0.2k yet discussions of 6.2k climate events in l.184), and to also make mung seem to be spreading before the domestication event itself. More engagement with the archaeological discussions on mung domestication is needed and discussions of what domestication is as a process (there is a fundamental misunderstanding of the conscious/unconscious action outline in Larson et al. in l.47 – the way it is phrased implies deliberate choice to ensure change rather than recognizing the inherent unconscious and indirect action of human behavior and the entanglement of human–plant–environment interactions).

We wish to emphasize that we did not massage the data in any way. For the 6.2k climate issue, please refer to our response to Reviewer 3, point 2. We have modified the introduction to reflect the reviewer’s point about Larson et al. (2014).

6. Terms like cultivar and variety and landrace are used interchangeably. These must be defined in the paper. How they fit in with notions of domestication and post–domestication agricultural behavior must also be unpacked.

We apologise for not clearly defining cultivar, variety, and landrace. In our work, we define landraces as accessions that were collected from the countries traditionally cultivating them, are locally adapted, and lack formal genetic improvement, i.e., those conserved in the Vavilov Institute, many of which were collected in the early 20th century. To avoid confusion, we revised the relevant sentences throughout the text to clarify these terms.

7. "how the domesticated forms later expanded to a broader geographical area has also been detailed in several species, including maize (Matsuoka et al., 2002), rice (Huang et al., 2012), tomato (Razifard et al., 2020), chickpea (Varshney et al., 2021), and lettuce (Wei et al., 2021)." – these are unusual choices of case study to make a point as many (maize and rice as key examples) are not accepted as well defined and remain highly controversial. These are genetics papers, and demonstrate the lack of familiarity with the archaeological context of domestication. A quick glance at the literature around them will illustrate that they are poor choices of case studies to make this point as they too are highly controversial.

We have removed these citations and completely rewritten this paragraph.

8. "It is also unclear whether the expansion of most crops strictly follows the longitudinal axis of the continents (Diamond, 2005) or whether or why some are able to cross different climatic zones." – again poor knowledge of the archaeological context of these debates, and the reliance on Diamond is concerning as he is not an archaeologist. See works by Lister on barley and Lui on wheat as a good starting point.

Thank you. We have removed the citation to Diamond and re-written this part to reflect the rich archaeological works.

9. Debate on wild progenitor of mung bean needs to be explored. while sublobator is a likely candidate it should be explained in the paper that there are other possible options, and then why it was chosen here, with citations.

A molecular phylogenetic analysis based on internal transcribed spacer (ITS) sequences was carried out with the aim of resolving the taxonomic contradictions in the Vigna group (Goel et al. 2001). The ITS-region-based trees revealed that V. radiata var. sublobata is closest to V. radiata.

Goel et al. 2001. Molecular evolution and phylogenetic implications of internal transcribed spacer sequences of nuclear ribosomal DNA in the Phaseolus-Vigna complex. Molecular Phylogenetics and Evolution 22(1):1-19

10. The figures are difficult to use. This comes back to the conflation of space and time outlined in the public review. There are big circles on the maps covering the dots which I presume are either archaeological sites or accession points of the sampled beans(?! unclear), and then very odd choices of illustrating change over time. The figures are small and hard to see and require very long text in the figures to make them useable.

Thank you. We responded to the space and time issue in Reviewer 3, point 1 and Reviewer 3, point 2. In this revision we removed the circles and arrows and moved the figures to the supplement. We still keep some supplement figures with circles and arrows, but these figures (Figure 2—figure supplement 2) should be treated merely as simple illustrations of potential hybridization scenarios instead of detailed routes.

11. Some aspects of basic geography have been overlooked to make climate the most important variable; l.166–7 "Given that geographic barrier might not be the most important factor". I find it hard to believe that both the Himalayas and major flood basins like the Brahmaputra would not be an issue, as would issues of day length when moving things north–south.

Thank you. Since pods of mungbean cultivars do not shatter naturally, they have lost the natural ability to disperse seeds. We therefore focus on human activities when we discuss the ability to disperse. We are fully aware that it is difficult for humans to cut through the Himalayas directly, and we did not claim the Himalayas are not a barrier. Previously we put our argument mainly in the Discussion, and here we elaborated part of the points in the Results. Specifically, we emphasized that South and Central Asia were connected by a complex exchange network among the region north of the Hindu Kush, Iran, and the Indus Valley. Between South and Southeast Asia, we did not assign a specific route, either. It is equally likely that mungbean spread through land or maritime routes. As emphasized in Reviewer 3, point 1, we focus on the relationship among the genetic groups instead of designating the exact routes. We have modified the manuscript to reflect this.

In this study, “geography” mostly refers to landscapes hindering human activity and movement (such as the Himalayas and the Brahmaputra). We regard daylength (as well as temperature, growing season length, and water availability) as an environmental factor that might limit the successful cultivation of mungbean in a new environment even though humans had successfully transported mungbean there as seeds. We have rephrased this part to reflect the reviewer’s comment.

12. In dealing with issues of climate change and adaptation some discussion of tolerance is needed. there must be a discussion (and a table perhaps) of the different watering, salination, temperature, etc. tolerance of the mung bean(s) under consideration to make the claims justified.

Thanks for your comment. We have now revised the Discussion comprehensively, mentioning that the precipitation in Central Asia is far below the lower limit of the standard optimal conditions for growing mungbean in the south. To the best of our knowledge, this is the first study to systematically compare the phenotypic characteristics of different genetic groups in mungbean.

13. Within the methods, the sampling strategy was hard to follow. This needs a more careful and clear description of decisions made: exactly where did the accessions come from geographically? how did their spread affect the dataset? Is there geographic clustering, did you compensate for that? how does the sampling potentially bias your data? a map would be useful.

Thank you. We have clarified in Materials and methods that while all samples have information on their countries of origin, only a subset of samples have detailed longitude and latitude coordinates. These samples were used in the detailed analyses connecting genetic and location information (the isolation by distance analyses). The genetic groups (SA, SEA, EA, CA) were named after the geographic region where most of their members originated, considering both country and geographic coordinate information. Only samples from Asia were used. In this revision, we also put the accession location maps in supplement figures. Regarding how the distribution of samples might affect our conclusion, we added the following note in the Discussion:

“We recognize that not all samples have available spatial data, and we do not have samples from some parts of Asia. For example, while most samples of the SEA group were collected from Taiwan, Thailand, and Philippines, we do not have many samples from the supposed contact zone between SA and SEA (Bangladesh and Myanmar) or between SEA and EA (southern China). If more samples were available from these contact zones, the modeled niche space between SA and SEA and between SEA and EA would be even more similar than the current estimate, strengthening our hypothesis that niche similarity might facilitate the cultivation expansion. On the other hand, clear niche differentiation between SA and CA was evident despite the dense sampling near their contact zone.”
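The isolation by distance analyses mentioned in this response are reported as Mantel tests (Supplementary file 1d). As a rough illustration only, a Mantel test correlates a pairwise genetic distance matrix with a pairwise geographic distance matrix and assesses significance by permuting the sample labels. The sketch below is not the authors' pipeline: the function names, the use of Pearson's correlation, the 999 permutations, and the toy matrices are all assumptions made for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """One-sided Mantel test (hypothetical helper, not from the paper).

    Rows and columns of the genetic matrix are permuted together, which
    keeps each matrix internally consistent while breaking the link
    between the two.
    """
    rng = random.Random(seed)
    n = len(genetic)
    geo_vec = upper_triangle(geographic)
    observed = pearson(upper_triangle(genetic), geo_vec)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vec) >= observed:
            hits += 1
    # Add one to both counts so the observed value is part of the null set.
    p_value = (hits + 1) / (permutations + 1)
    return observed, p_value


# Toy data (made up): five accessions placed along a line.
positions = [0, 1, 2, 4, 7]
geo = [[abs(a - b) for b in positions] for a in positions]
# Genetic distance grows with geographic distance, plus a small
# deterministic perturbation so the relationship is not perfect.
gen = [[0.05 * geo[i][j] + 0.01 * ((i + j) % 2) for j in range(5)]
       for i in range(5)]

r_obs, p_value = mantel_test(gen, geo)
print(r_obs, p_value)
```

With real data, the genetic matrix would come from the SNP genotypes and the geographic matrix from the accession coordinates; only accessions with coordinates could enter such a test, which is why the subset described above matters.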

14. FAIR and open protocols dictate that all methods must be stated: "Genomic DNA was extracted from a single plant per accession using QIAGEN Plant Mini DNA kit according to the manufacturer's instruction with minor modification." If you modified the protocol then you HAVE to outline what you did so it can be reproducible and the data comparable.

The modifications made during DNA extraction were added.

15. "Climate data for conditions between 1960–1990 were downloaded from the WORLDCLIM 1.4 database" – How was this dataset determined? Why 30 years and not more? Give citations to explain this decision, and look to other modelling efforts to check comparability.

Using the WorldClim database has become common practice in environmental niche modeling. WorldClim used meteorological station data collected during 1960–1990 to interpolate worldwide climatic conditions at a resolution of about 1 km² (30 arc-seconds). We therefore directly used the available climate layers, which are based on climate conditions recorded between 1960 and 1990, to predict the geographic distribution of suitable habitats for cultivated mungbean. We have rephrased this part.
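To make the stated resolution concrete, the short sketch below converts a 30 arc-second grid cell into approximate ground distances. It assumes a spherical Earth with a mean radius of 6371 km; the function name and the example latitude are illustrative and not taken from the manuscript.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def cell_size_km(arcsec, latitude_deg):
    """Approximate north-south and east-west extent of a grid cell.

    The north-south extent depends only on the angular size; the
    east-west extent shrinks with the cosine of the latitude.
    """
    angle_rad = math.radians(arcsec / 3600.0)
    north_south = EARTH_RADIUS_KM * angle_rad
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# At the equator a 30 arc-second cell is roughly 0.93 km on each side.
ns_km, ew_km = cell_size_km(30, 0.0)
# At 40 degrees north (an arbitrary example latitude) the cell is narrower.
ns40_km, ew40_km = cell_size_km(30, 40.0)
print(ns_km, ew_km, ns40_km, ew40_km)
```

This is why the resolution is described as "about" 1 km²: the cell area is close to that value near the equator and becomes smaller toward higher latitudes.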

16. "19 bioclimatic variables" – What are these? Why were they chosen? A table and explanation are needed.

Explanations of the bioclimatic variables used in this study are available on the WorldClim website (https://www.worldclim.org/data/bioclim.html#google_vignette), and we have added a table in supplementary file 5. These variables are commonly used in species distribution modelling and related ecological modelling techniques.

17. "excluding one of the two variables that have a correlation above 0.8 (Supplementary file 4)" – Why? Explain the reasoning for exclusion.

The purpose of removing highly correlated variables was to minimise the effect of multicollinearity, which could result in overfitting of the model. This is standard practice in ecological niche modelling and in studies involving environmental variables (Coulibaly et al. 2022; Gao et al. 2021; Zhao et al. 2019).

Coulibaly et al. 2022. Coupling genetic structure analysis and ecological-niche modeling in Kersting’s groundnut in West Africa. Scientific Reports 12:5590

Gao et al. 2021. Combined genotype and phenotype analyses reveal patterns of genomic adaptation to local environments in the subtropical oak Quercus acutissima. Journal of Systematics and Evolution 59(3):541-556

Zhao et al. 2019. Resequencing 545 ginkgo genomes across the world reveals the evolutionary history of the living fossil. Nature Communications 10:4201
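The variable filtering described in the response to point 17 can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes Pearson's correlation and a simple greedy rule that keeps the first variable of each highly correlated pair, whereas the response does not state which member of a pair was retained. The variable names and values are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def drop_correlated(variables, threshold=0.8):
    """Greedily keep variables whose |r| with every kept variable is <= threshold.

    `variables` maps a variable name to its values across sampling sites.
    The keep-the-first rule is an assumption made for this sketch.
    """
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Illustrative values only (not data from the study).
example = {
    "bio1": [10.0, 12.0, 14.0, 16.0, 18.0],
    "bio5": [20.1, 22.0, 24.2, 25.9, 28.0],   # nearly collinear with bio1
    "bio12": [300.0, 150.0, 400.0, 100.0, 350.0],
}
selected = drop_correlated(example)
print(selected)  # bio5 is dropped; bio1 and bio12 remain
```

In practice the choice of which variable to keep from a correlated pair is often guided by biological interpretability rather than input order, so the greedy rule here should be read as a placeholder.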

18. Throughout the dataset has relied on "In total, our dataset contains more than one thousand accessions (1092) and covers worldwide diversity of mungbean representing a wide range of variation in seed colour" however at no point is there a discussion of whether these are modern variants of historic landraces and how this was assessed. This has a big impact on any discussion of "ancient" adaptations, and there must be a discussion of how you tested to see if the genetic changes you see are more recent or past changes and how the genetic clock was applied.

Thank you. Our materials were obtained directly from worldwide stock centers, which house germplasm mostly collected during the 20th century. The mission of these stock centers is to collect and conserve local landraces rather than to develop or breed novel varieties, and therefore our materials include few “breeder lines” from recent artificial crosses. Since we focused on the comparison among the four genetic groups, recent admixture within the same group would have limited impact on the among-group pattern, and cross-group admixed accessions were excluded from most analyses. Similarly, despite generations of admixture, human geneticists can use contemporary human DNA and phenotypes to infer the divergence time and adaptation among major worldwide populations, as long as recently admixed individuals are excluded. In addition, as we noted in the Introduction, the majority of accessions with detailed collection site information (from the Vavilov Institute) were collected during the early 20th century, before large-scale seed exchanges among worldwide stock centers. As noted by Jones et al. (2008), "The efforts of collectors such as N. I. Vavilov have prevented the extinction of some of these local ecotypes." Our use of the Vavilov Institute collection is therefore important.

We acknowledge, however, that genetic results from materials collected during the 20th century may not reflect specific changes at specific time points in the past. This may be resolved by ancient DNA analyses, which are outside the scope of the current work. Regarding divergence time, the fastsimcoal2 analysis was specifically used to address this issue while allowing potential gene flow among groups. Finally, we acknowledge that it may be difficult to estimate when the adaptive phenotypes originated. This may require specific modelling of whole-genome sequencing data and the identification of candidate genes, which is outside the scope of the current study. Therefore, we used the descriptions of plant phenotypes in ancient Chinese texts as supporting evidence that some of these phenotypic characteristics appeared long ago.

Jones et al. 2008. Approaches and constraints of using existing landrace and extant plant material to understand agricultural spread in prehistory. Plant Genetic Resources 6(2):98-112

19. As noted in the public review, a far more interesting aspect than trying to tie into domestication/post–domestication and chronological vagaries are the points made in lines 87–89: this data could be "used this resource to investigate the global history of mungbean after domestication […looking at the …] phenotypic characteristics for local adaptation to distinct environments." Thinking about the value of this dataset for the preservation of diversity, and how diversity links to localised adaptations today and to sustainable cropping now is critical, and I suggest this could be the way to reframe things.

Thank you. Please see our response to Reviewer 3, point 3.

[Editors’ note: what follows is the authors’ response to the second round of review.]

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined in the comments of Reviewer #1 below.

Reviewer #1:

I found this study to be one of the rare ones to combine genomic data, climate data, and phenotypic data to decipher the diversity of adaptation in plants and to build scenarios of their diffusion. Previous recommendations were mainly addressed, and I am personally satisfied with this new version of the manuscript. At this stage I recommend acceptance of the paper as soon as the concern below is addressed.

Data availability:

I was not able to access the bioproject PRJN809503; is the data already available or not? Nor could I access a biosample that I tried to check.

Nor was I able to access the DRYAD data.

Thank you for your valuable comments and suggestions. Regarding this comment specifically, all the sequences generated in this study have already been deposited in NCBI (BioProject PRJNA809503). The phenotypic measurement data have also been deposited in DRYAD. We can provide reviewer links (as in Materials Design Analysis Reporting, MDAR) as below:

NCBI:

https://dataview.ncbi.nlm.nih.gov/object/PRJNA809503?reviewer=g4jqbn30fhpacs7di7vgj2b3ok

DRYAD:

https://datadryad.org/stash/share/b5TO7fUNguxu0hMp6zx1cQZkBrCKzDmznCPvzf_CqEs

We will ensure the whole dataset is available to the public before publication online.

The authors provide a link to fastq data but I could not find a fastq file link in Noble et al. or Breria et al. How did the authors merge the data? If they merged the data based on Table S1 or Noble et al. why did they find more SNPs than Noble et al. with a 10% missing rate? Since some authors are common between these studies a clear path to access to the whole dataset should be available to the community.

We apologize for not describing this clearly in the manuscript. We re-performed the SNP calling step for all accessions from the Vavilov Institute (VIR), the Australian Diversity Panel (ADP) (Noble et al., 2018), and the World Vegetable Center (WorldVeg) mini-core (Breria et al., 2020). Additionally, we added BioProject information for the published datasets of Noble et al. (2018) and Breria et al. (2020) under Data Availability (Line 643). Detailed information on the whole dataset is listed in Author response table 2:

Author response table 2.

Title | Reference | BioProject | Status
Australian mungbean diversity panel collection – DArTseq | Noble et al., 2018 | PRJNA963182 | Available
World Vegetable Center Mini Core Collection – DartSeq | Breria et al., 2020 | PRJNA645721 | Available
Vavilov Institute (VIR) mungbean collection – DArTseq | | PRJNA809503 | Release immediately following publication

Associated Data

    Data Citations

    1. Ong P, Lin Y, Chen H, Lo C, Noble T, Nair R, Schafleitner R, Vishnyakova M, Bishop-von-Wettberg E, Samsonova M, Nuzhdin S, Ting C, Lee C. 2023. The climatic constrains of the historical global spread of mungbean. Dryad Digital Repository.
    2. Ong P, Lin Y, Chen H, Lo C, Noble T, Nair R, Schafleitner R, Vishnyakova M, Bishop-von-Wettberg E, Samsonova M, Nuzhdin S, Ting C, Lee C. 2023. Vavilov Institute (VIR) mungbean collection - DArTseq. NCBI BioProject. PRJNA809503
    3. Breria CM, Hsieh CH, Yen J-Y, Nair R, Lin C-Y, Huang S-M, Noble TJ, Schafleitner R. 2020. World Vegetable Center Mini Core Collection - DartSeq. NCBI BioProject. PRJNA645721
    4. Noble TJ, Tao Y, Mace ES, Williams B, Jordan DR, Douglas CA, Mundree SG. 2023. Australian mungbean diversity panel collection - DArTseq. NCBI BioProject. PRJNA963182

    Supplementary Materials

    MDAR checklist
    Supplementary file 1. History of mungbean spread: genetic, environment, and traits data.

    (a) Mungbean accessions from Vavilov Institute (VIR) collection. (b) Outgroup f3 statistics among all possible combinations of genetic group pairs. (c) Admixture f3 statistics among all possible population trios. (d) Mantel tests for isolation by distance of inferred genetic group (Q≥0.5). (e) Description of bioclimatic variables used in ecological niche modeling. (f) Pearson’s correlation coefficient between pairs of bioclimatic variables (denoted in lower triangle). (g) Comparison of bioclimatic variables among the four genetic groups analyzed with multivariate ANOVA (MANOVA). (h) Summary of ANOVA for bioclimatic variables. (i) Correlation between eight bioclimatic variables and climatic PC axes 1–4. (j) Comparison of summer growing season data including temperature and precipitation of May, July, and September among the four genetic groups analyzed with MANOVA. (k) ANOVA table for all evaluated field traits (phenology, reproduction, and size) as well as drought-related traits. (l) Mean of eight bioclimatic variables of the genetic groups

    elife-85725-supp1.docx (94.4KB, docx)

    Data Availability Statement

    Sequences generated in this study are available under NCBI BioProject PRJNA809503. Accession names, GPS coordinates, and NCBI accession numbers of the Vavilov Institute accessions are available in Supplementary file 1a. Plant trait data are available at Dryad https://doi.org/10.5061/dryad.d7wm37q3h. Sequences and accession information of the World Vegetable Center mini-core and the Australian Diversity Panel collections were obtained from NCBI BioProjects PRJNA645721 (Breria et al., 2020) and PRJNA963182 (Noble et al., 2018).

    The following datasets were generated:

    Ong P, Lin Y, Chen H, Lo C, Noble T, Nair R, Schafleitner R, Vishnyakova M, Bishop-von-Wettberg E, Samsonova M, Nuzhdin S, Ting C, Lee C. 2023. The climatic constrains of the historical global spread of mungbean. Dryad Digital Repository.

    Ong P, Lin Y, Chen H, Lo C, Noble T, Nair R, Schafleitner R, Vishnyakova M, Bishop-von-Wettberg E, Samsonova M, Nuzhdin S, Ting C, Lee C. 2023. Vavilov Institute (VIR) mungbean collection - DArTseq. NCBI BioProject. PRJNA809503

    The following previously published datasets were used:

    Breria CM, Hsieh CH, Yen J-Y, Nair R, Lin C-Y, Huang S-M, Noble TJ, Schafleitner R. 2020. World Vegetable Center Mini Core Collection - DartSeq. NCBI BioProject. PRJNA645721

    Noble TJ, Tao Y, Mace ES, Williams B, Jordan DR, Douglas CA, Mundree SG. 2023. Australian mungbean diversity panel collection - DArTseq. NCBI BioProject. PRJNA963182


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
