Scientific Reports. 2019 Jul 26;9:10846. doi: 10.1038/s41598-019-47318-x

Evidence for selection events during domestication by extensive mitochondrial genome analysis between japonica and indica in cultivated rice

Lin Cheng 1, Kyu-Won Kim 2, Yong-Jin Park 1,2
PMCID: PMC6659709  PMID: 31350452

Abstract

The history of rice domestication is controversial, as it remains unknown whether domestication occurred once or multiple times. To date, the genetic architecture and phylogeny of rice have been studied extensively using the nuclear genome, but the conclusions differ considerably. Here, we report evidence for different selection events in Oryza sativa based on a comprehensive study of the rice mitochondrial (mt) genome. In detail, 412 rice germplasms were collected from around the world for variant architecture studies. A total of 10,632 variants were detected in the mt genome, including 7,277 SNPs and 3,355 InDels. The selection signal (πw/πc) indicated that the selection sites in Oryza sativa L. ssp. japonica were different from those in Oryza sativa L. ssp. indica. The fixation index (FST) was higher between indica and japonica than between indica and wild rice. Moreover, haplotype and phylogenetic analyses revealed indica clusters and japonica clusters that were well separated from wild rice. Together, these results indicate that indica and japonica experienced different domestication processes. We also found that japonica may have experienced a bottleneck event during domestication.

Subject terms: Evolutionary genetics, Agricultural genetics

Introduction

Rice domestication is the process by which human selection converted the naturally selected traits of wild rice into stable, desired traits. For African rice, it is well established that Oryza glaberrima was independently domesticated from the wild rice Oryza barthii1. For Asian rice (Oryza sativa L.), although Oryza rufipogon is widely considered to be the ancestor, there is still controversy about whether domestication occurred once or multiple times2,3. Asian rice is mainly divided into two major varieties, namely, indica and japonica. In general, the indica type has long, thin grains and is planted in tropical Asia, whereas japonica rice has short, sticky grains and is planted at high altitudes in South Asia. Both indica and japonica are important food crops for nearly half of the global population4. Exploring the genetic information of these diverse varieties can provide deep insights into rice domestication and breeding.

One of the most basic and controversial issues regarding Asian rice is the number of times domestication occurred2. Traditionally, molecular markers (microsatellites) were used to study certain domestication genes and infer domestication history5,6. Because these molecular markers represent only part of the rice genome, whole-genome sequences of rice were later used to improve this situation7. However, nuclear genomic studies of rice have reached very different conclusions because of introgression, bottleneck events or differences in materials8,9. In studies supporting a single domestication, the 'domesticated loci' shared by indica and japonica provide strong evidence for a single domestication of rice10,11. In contrast, the large genomic differences and breeding barriers between indica and japonica directly support multiple independent domestications9,12. To account for gene flow and bottleneck events, further hypotheses have emerged, such as a single domestication with multiple origins or a single origin with multiple introgressions11,13,14. Thus, whether Asian rice stems from a single domestication event or from multiple domestications remains unknown. Exploring the genetic information of Oryza sativa is therefore very important, as it may provide more evidence about the domestication of rice and insights into the breeding of elite varieties for sustainable agriculture.

Mitochondria are important organelles that provide energy for the growth and evolution of plants. In higher plants, the mt genome mostly ranges from ~200 kb to 2 Mb in size, and mitochondria have specific modes of gene expression15. Since the first entire sequence of the rice mitochondrial genome (490,520 bp) was reported in 200216, mt analysis has been a powerful tool for understanding the evolutionary history of rice because of the apparent lack of recombination, maternal inheritance, high copy number, and substitution rate17–20. Although the rice mitochondrial genome has been explored in some detail, for example its variations and its comparison with the nuclear and chloroplast genomes, these findings have rarely been associated with the domestication of rice19–21. Moreover, most of these studies are limited to certain genes or certain locations of the mitochondrial genome and do not provide evidence from a comprehensive analysis. Therefore, we used 412 rice varieties to provide a comprehensive analysis of the mitochondrial genome and to deepen our understanding of the genetic and evolutionary background of rice.

Here, we conducted genetic variant analyses of 412 rice germplasms to investigate the evolutionary history of Oryza sativa. First, 358 Asian cultivated rice and 54 wild rice samples were collected from around the world to detect single nucleotide polymorphisms (SNPs) and insertions and deletions (InDels) against the mitochondrial reference genome of Oryza sativa japonica. Then, we used selective sweep analysis, FST, a haplotype network and a phylogenetic tree to comprehensively mine the genetic background of Asian rice. Our analysis focuses on the genetic architecture of the rice mt genome, which provides more insight into the evolutionary history of Oryza sativa.

Results

Variants in the mitochondrial genome

The accession information and genome sequencing of all samples are summarized in Supplementary Table S1. A total of 412 rice samples were collected from various parts of the world and sequenced with high average coverage (~16X), yielding ~3.42 TB of read data. The entire collection included 253 temperate japonica, 25 tropical japonica, 66 indica, 9 aus, 2 aromatic rice, 54 wild rice, and 3 admixture types. These germplasms were aligned to the reference mt genome of Oryza sativa japonica [NC_011033.1] for variant calling.

A total of 10,632 primary variants were identified from the rice mt genome, including 7,277 SNPs (68.4%) and 3,355 InDels (31.6%) (Table 1). Since the number of accessions differs among subgroups, we also summarized the average number of variants per sample (Supplementary Table S2). Transitions were the most frequent substitutions, accounting for 65.3% of all SNPs, almost twice the number of transversions. The types of variants are also summarized, revealing that G/A and C/T seem to be more likely to appear in the mt genome, followed by A/T and T/C (Supplementary Fig. S1 and Supplementary Table S1). After removing variants with a minor allele frequency (MAF) <0.01 and variants with >20% missing calls, 2,159 high-quality (HQ) variants were obtained for subsequent statistical analysis22,23. For Asian rice, we detected a total of 755 HQ variants, with 75 HQ variants (9.9%) located in open reading frames (ORFs) and 52 HQ variants (6.8%) located in coding regions. Within Oryza sativa, we found 49 common SNPs that appeared in 5 subgroups, which means that these SNPs are almost fixed in rice and may play important roles in the mitochondrial genome (Supplementary Fig. S2a). Furthermore, 48 of these 49 SNPs were also detected in O. rufipogon and O. nivara (Supplementary Fig. S2b). This suggests that most of these common variants (48/49) in Oryza sativa may come from wild rice, whereas one mutation (1/49) appeared and became fixed under selection during domestication. The distribution of variants across all accessions and within each group was also mapped onto the reference genome, revealing that wild rice has the most variants, followed by indica (Fig. 1). Interestingly, these variants showed a clustered distribution in each subgroup, suggesting that certain regions of the mt genome are under strong constraint. This is consistent with a highly conserved mitochondrial genome.
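The transition/transversion classification and the two quality filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names are hypothetical, and genotypes are assumed to be haploid calls coded 0 (reference), 1 (alternate) or None (missing), which is a simplification for the mt genome.

```python
# Illustrative sketch only; names and the haploid 0/1/None coding are assumptions.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def is_transition(ref, alt):
    """True for A<->G or C<->T changes; ref and alt are assumed to differ."""
    return frozenset((ref.upper(), alt.upper())) in TRANSITIONS


def ts_tv_ratio(snps):
    """Ts/Tv for an iterable of (ref, alt) single-base pairs."""
    snps = list(snps)
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts
    return ts / tv if tv else float("inf")


def passes_filters(genotypes, min_maf=0.01, max_missing=0.20):
    """Keep a variant if MAF >= min_maf and the missing-call rate <= max_missing."""
    n = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if not called or (n - len(called)) / n > max_missing:
        return False
    alt_freq = sum(called) / len(called)
    return min(alt_freq, 1 - alt_freq) >= min_maf
```

For example, `ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "T")])` counts two transitions and one transversion. The thresholds default to the values stated in the text (MAF 0.01, 20% missing calls).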

Table 1.

Summary of the total and subgroup variants (SNPs and InDels) detected in 358 cultivated rice along with 54 wild rice samples collected from different countries around the world.

Summary
Type | Mt variants (a) | Mt HQ variants (b)
SNPs | 7,277 | 1,764
InDels | 3,355 | 395
Total | 10,632 | 2,159

By subgroup (the first SNPs/InDels/Total/Ts/Tv block refers to all variants, the second to HQ variants)
Subgroup | No. of accessions | SNPs | InDels | Total | Ts/Tv | HQ SNPs | HQ InDels | HQ Total | HQ Ts/Tv
Cultivated | 358 | 1,437 | 508 | 1,945 | 1.956 | 646 | 109 | 755 | 2.091
Wild | 54 | 6,746 | 3,122 | 9,868 | 1.884 | 1,625 | 348 | 1,973 | 1.559
Indica | 66 | 1,000 | 383 | 1,383 | 1.985 | 545 | 99 | 644 | 2.187
Te_japonica | 253 | 908 | 266 | 1,174 | 1.759 | 430 | 51 | 481 | 2.028
Tr_japonica | 25 | 682 | 214 | 896 | 1.877 | 329 | 42 | 371 | 2.391
Aus | 9 | 549 | 202 | 751 | 1.553 | 351 | 71 | 422 | 1.949
Aromatic | 2 | 189 | 77 | 266 | 0.909 | 63 | 8 | 71 | 1.52
Admixture | 3 | 662 | 177 | 839 | 1.669 | 314 | 37 | 351 | 2.048

Ts/Tv is the ratio of transitions to transversions. Te_japonica: temperate japonica; Tr_japonica: tropical japonica.

(a) Mt variants: all mitochondrial genome variants in our study.

(b) HQ variants: high-quality variants, obtained by removing variants with more than 20% missing calls or a minor allele frequency (MAF) < 0.01. Mt HQ variants are the high-quality variants of the mitochondrial genomes in our study.

Figure 1.

The band distribution of variants (SNPs and InDels) across the mitochondrial genome. The band position is depicted as the distance of the first variant (SNP or InDel) based on the reference genome of Nipponbare. (A–F) Highlights marked on the circle map indicate the SNP and InDel positions. (A) The name of each gene located in the mitochondrial genome, placed according to its position in the reference genome. (B) Total variants detected among the 412 accessions. (C) Variants identified in the indica subgroup. (D) Variants identified in the temperate japonica type. (E) Variants identified in the tropical japonica type. (F) Variants in wild rice. The outside distance unit is kb. The number inside the brackets indicates the number of accessions in each group. For reasons of space, not all genes are shown in the figure.

The evidence of different selection in Oryza sativa

In genetic analysis, different methods can lead to different conclusions because of hybridization or introgression events8,9. Because the mt genome is highly conserved and maternally inherited, almost no genetic recombination occurs through hybridization. Nevertheless, whether gene flow is present between indica and japonica rice has an important impact on our subsequent analysis at the mt genome level. To test for recent gene flow, we analyzed the allele frequency of each group and its distribution along the physical map of the reference genome7. We found that indica-specific sites (allele frequency >95% in indica) were different from japonica sites (allele frequency <5% in japonica), which suggests that there is no gene flow or introgression event in the rice mt genome (Supplementary Table S4). The mt genome therefore provides a reliable basis for our analysis of the genetic history of rice. For rice domestication studies, we first examined the dN/dS ratio (nonsynonymous substitution rate/synonymous substitution rate) of Asian cultivated rice to estimate the evolutionary rate of coding regions (Supplementary Fig. S3). A total of 75 genes were identified from all subgroups, and 23 genes exhibited positive selection. To identify the specific positions that were selected, we performed selective sweep analysis based on the nucleotide diversity of the rice mitochondrial genome. The diversity of the rice mitochondrial genome ranged from 3.7 × 10^−5 to 2.0 × 10^−2 (Fig. 2a) (Supplementary Table S5). Wild rice exhibited higher diversity than Asian cultivated rice (P < 0.01) (Fig. 2c). The diversity of subgroups was also analyzed based on all variations (SNPs and InDels), and japonica had lower diversity than the other subgroups (Fig. 2b) (Supplementary Table S6). Based on the diversity analysis, we used the top 5% of the πwild/πcultivated ratio in each Asian rice group as the cutoff to determine selection sites (Supplementary Table S7). At the 5% cutoff, we detected a total of 8 selection sites: 4 for the indica type and 4 for the japonica type. If indica and japonica had been domesticated only once, their selection sites should be roughly similar. Here, within the 4,000 bp of cutoff regions, only 500 bp (12.5%) was shared between the indica and japonica types. A selective sweep analysis was also conducted with RAiSD, which uses the μ statistic to detect positive selection based on multiple signatures of a sweep (SFS, LD, and diversity) and SNP vectors24 (Supplementary Fig. S4). The results revealed that one region of japonica (100–150 kb) had experienced strong selection compared with indica.
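The ratio-based cutoff can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' script: per-window π values for wild and cultivated rice are assumed to be given as equal-length lists, the function name is hypothetical, and a small `eps` guards against windows with zero diversity.

```python
# Illustrative sketch only; the function name and input layout are assumptions.
def top_ratio_windows(pi_wild, pi_cult, top_fraction=0.05, eps=1e-12):
    """Return the indices of windows whose pi_wild/pi_cult ratio is in the top fraction."""
    if len(pi_wild) != len(pi_cult):
        raise ValueError("both series must cover the same windows")
    # A high ratio means diversity was lost in cultivated rice relative to wild rice.
    ratios = [w / max(c, eps) for w, c in zip(pi_wild, pi_cult)]
    n_top = max(1, int(len(ratios) * top_fraction))
    ranked = sorted(range(len(ratios)), key=lambda i: ratios[i], reverse=True)
    return set(ranked[:n_top])


# Toy data: 20 windows, with one low-diversity window in each cultivated group.
wild = [0.01] * 20
indica = [0.01] * 20
indica[3] = 0.0001
japonica = [0.01] * 20
japonica[7] = 0.0001

indica_sites = top_ratio_windows(wild, indica)
japonica_sites = top_ratio_windows(wild, japonica)
shared_sites = indica_sites & japonica_sites  # windows flagged in both groups
```

Comparing the two index sets is the step that corresponds to asking how much of the candidate region is shared between indica and japonica.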

Figure 2.

Nucleotide diversity and selection analysis of all accessions and subgroups. (a) Nucleotide diversity of all accessions. A 500 bp window size was used in this analysis. (b) Nucleotide diversity of subgroups. The sorted values were plotted in each group. Ind: indica; Niv: O. nivara; Ruf: O. rufipogon; Te_J: temperate japonica; Tr_J: tropical japonica. (c) Nucleotide diversity of cultivated rice and wild rice. (d) The reduction in nucleotide diversity, calculated from the preceding diversity analysis. The threshold of the top 5 percentile is indicated as a red dotted circle for indica and a blue circle for japonica. The regions within the 2.5 percentile are considered candidate regions under selection. The genome position unit is kb.

FST, Tajima’s D test, PCA and MDS of populations

The fixation index (FST) was used to determine the degree of differentiation in Oryza sativa based on the weighted method25. In the mt genome, FST between indica and japonica was higher than that between indica and wild rice (Fig. 3a). This finding is consistent with reproductive barriers between indica and japonica rice, although the fertility of hybrids varies among individuals26. For Tajima's D, temperate japonica and tropical japonica had similar curves, whereas indica showed a different curve from japonica in some parts of the rice mt genome (Fig. 3b). Principal component analysis (PCA) and multidimensional scaling (MDS) discriminated two statistically different groups of Asian cultivated rice (indica and japonica) (Fig. 3c,d). Taken together, these findings indicate that the two main groups of Asian rice, indica and japonica, may be genetically distant and have different genetic backgrounds.
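For readers unfamiliar with Tajima's D, the statistic compares the mean number of pairwise differences with the number of segregating sites scaled by sample size. The following Python sketch implements the standard Tajima (1989) formula for a set of aligned haploid sequences. It is only an illustration: the text does not state which software computed D, and the function name and input format are assumptions.

```python
# Illustrative sketch only; the function name and input format are assumptions.
import math
from collections import Counter


def tajimas_d(sequences):
    """Tajima's D for aligned, equal-length haploid sequences (no missing data)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to equal length")

    seg_sites = 0          # S: number of segregating sites
    pairwise_diffs = 0.0   # differences summed over all sequence pairs
    for column in zip(*sequences):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            pairwise_diffs += (n * n - sum(c * c for c in counts.values())) / 2
    if seg_sites == 0:
        return math.nan  # D is undefined without segregating sites

    k_hat = pairwise_diffs / (n * (n - 1) / 2)  # mean pairwise differences
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k_hat - seg_sites / a1) / math.sqrt(variance)
```

An excess of rare alleles gives a negative D, while an excess of intermediate-frequency alleles gives a positive D, which is why differing curves between subgroups point to different demographic or selective histories.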

Figure 3.

FST, Tajima’s D test, principal component analysis and multidimensional scaling of populations. (a) The FST value between Asian rice and wild rice; the circle size indicates the diversity of each group. The FST value between each pair of groups is indicated by the length of each line. R: O. rufipogon, Tr: tropical japonica, I: indica, Te: temperate japonica, N: O. nivara. (b) Tajima’s D values in subgroups based on the rice mitochondrial genome. (c) Principal component analysis of indica and japonica. (d) Multidimensional scaling plots of indica and japonica.

Haplotype network, population structure, and phylogenetic tree

A total of 85 haplotypes were detected from 412 rice samples by DnaSP v6 (ref. 27) based on high-quality variations. Among these haplotypes, 38 and 47 were found in Asian rice and wild rice, respectively. In Asian rice, indica exhibited 31 haplotypes, whereas japonica exhibited only 4 haplotypes. If indica and japonica had been domesticated once, they would have very similar haplotypes. However, we did not identify any shared haplotypes between these two subgroups at the mt level (Fig. 4a). Moreover, population structure from K = 2 to K = 7 was used to distinguish the individual subgroups among the entire collection. To determine the structure more accurately, K = 5 was estimated by chooseK.py in fastStructure (Fig. 4b). For K = 5, although japonica and wild rice are mixed together, we found a clear separation of indica and tropical japonica. We also found that the same component exists in indica and japonica (purple color), but this component was also found in wild rice and therefore does not provide evidence for an indica-specific or japonica-specific structure. This shared component could have been obtained independently from wild rice during separate domestications28. To assess the domestication relationship more accurately, we used all HQ SNPs to construct a phylogenetic tree using the Bayesian inference method. If Asian rice had been domesticated only once, a tree with these two subpopulations as mixed or sister taxa should be most strongly supported29. However, in our results, the japonica and indica types were clearly separated from wild rice (Fig. 4b). The archaeological evidence of Oryza sativa (>9,000 years) in India and China30,31 is also consistent with independent domestication of Asian rice. Taken together, these results suggest that indica and japonica may have distinct genetic backgrounds, which supports the concept of multiple independent domestications of Asian rice.
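The logic of the haplotype comparison can be shown with a small Python sketch. It is a simplified stand-in for a DnaSP analysis, not a reimplementation: haplotypes are defined here as exactly identical sequence strings, missing data and gaps are ignored, and the function names and toy sequences are hypothetical.

```python
# Illustrative sketch only; names and toy data are assumptions.
from collections import defaultdict


def haplotypes_by_group(samples):
    """samples: iterable of (group, sequence). Returns {group: set of distinct sequences}."""
    groups = defaultdict(set)
    for group, sequence in samples:
        groups[group].add(sequence.upper())
    return dict(groups)


def shared_haplotypes(haplotypes, group_a, group_b):
    """Haplotypes present in both groups (empty set if none are shared)."""
    return haplotypes.get(group_a, set()) & haplotypes.get(group_b, set())


toy_samples = [
    ("indica", "ACGT"), ("indica", "ACGA"), ("indica", "acgt"),
    ("japonica", "TCGT"), ("japonica", "TCGT"),
    ("wild", "ACGT"), ("wild", "TCGA"),
]
toy_haplotypes = haplotypes_by_group(toy_samples)
```

In this toy data set, indica and japonica share no haplotype, which is the pattern reported above for the mt genome, while indica and wild rice share one.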

Figure 4.

The haplotype network, population structure and phylogenetic tree of 412 rice accessions. (a) The haplotype network of 412 rice accessions. Here, different colors represent different populations, and circle size represents the number of samples. (b) Population structure and phylogenetic tree are displayed using a rectangular cladogram. A: indica; B: tropical japonica; C: temperate japonica; D: aromatic, aus and admixture type; E: wild rice.

Discussion

The domestication history of Oryza sativa is complex. Although numerous studies on the origin of Asian rice have been conducted, whether single or multiple domestications occurred is still controversial32–35. Hybridization and gene flow in the natural state are two important factors affecting rice origin studies36–38. Hybridization is the critical step that brings together the high-quality features of the parents, and it thereby disrupts the unique components of different subspecies on which evolutionary studies rely39–41. Gene flow is the movement of genes from the gene pool of one population to that of another13,42. Gene flow affects the genetic differentiation of local populations and plays an important role in genetic studies of specific loci in subgroups28. Civáň et al. (2018) argued that some alleles moved to other populations through introgression events in rice, which has a critical impact on distinguishing and understanding the real history of Oryza sativa28. Wang et al. (2017) demonstrated that the different conclusions from rice genome analyses are due to extensive, continuous gene flow from cultivated rice to wild rice35. Fortunately, the mt genome is maternally inherited, and almost no genetic recombination occurs in the natural state, which makes it a clean and reliable material for phylogenetic studies42,43. We did not detect any introgression signals between indica and japonica at the mt level based on statistical analysis of allele frequency7. Therefore, our mt genome architecture with high-quality variants is useful for resolving contradictions in the domestication of Asian rice.

In the evolutionary history of rice, Huang and Han (2016) argued that japonica experienced a strong bottleneck event and that the cutoff of πw/πc should be accurate for collocated low-diversity genomic region (CLDGR) detection13. Based on this, we used the top 5 percentile of genetic diversity for a better selective sweep investigation. We detected a strong selection signal in japonica rather than in indica. In recent articles, a strong bottleneck effect was also revealed in japonica from the genomic position and magnitude of selective sweeps13,44,45. Our selective sweep results for the rice mt genome are consistent with previous reports based on chloroplast and nuclear genome analysis, which demonstrated that a bottleneck event occurred in japonica during domestication. The low diversity specific to a particular group does not necessarily mean independent domestication, as some selection pressures lead to regions of low diversity that may reflect adaptation to the local environment after the separation of Asian rice. However, if all cultivated rice came from a single domestication, the selective sweep during this event would be expected to generate some of the same curves in the subspecies. In our results, the selective sweep site in japonica was different from that in indica. Principal component analysis and population structure also supported this finding, showing distinct genetic information based on high-quality (HQ) SNPs. Furthermore, the phylogenetic analysis revealed two clusters, indica and japonica, that were separated from wild rice. Taken together, these results suggest that japonica and indica may have been selected differently during domestication.

Methods

Samples and resequencing

A heuristic set of 358 rice accessions comprising 3 types (landraces, weedy, and bred) was selected for whole-genome resequencing47. This set had previously been generated with the program PowerCore46 from worldwide varieties collected from the National GeneBank of the Rural Development Administration (RDA-Genebank, Republic of Korea). In addition, 54 wild rice accessions were obtained from the International Rice Research Institute (IRRI) in 2017.

The 358 Asian rice and 54 wild rice accessions from our database were grown in a soft, well-watered field. After checking the heading date (approximately 13 days), young leaves were sampled from one plant and stored at –80 °C prior to genomic DNA extraction using the DNeasy Plant Mini Kit (Qiagen). Qualified DNA was used for whole-genome resequencing of the collected rice varieties, with an average coverage of approximately 16X, on the Illumina HiSeq 2000 Sequencing Systems Platform.

Variant calling and data management

The workflow included data preparation, filtering, mapping, sorting, and variant calling. First, the reference was indexed with the Burrows-Wheeler Aligner (BWA) v0.7.15 (ref. 48), SAMtools v1.3.1 (ref. 49) and Picard v2.14 (http://broadinstitute.github.io/picard/) before variant calling. Second, raw data were aligned to the Nipponbare mt genome sequence (https://www.ncbi.nlm.nih.gov/nuccore/NC_011033.1) using BWA. A sequence alignment map (SAM) file was created during mapping and converted to a sorted binary SAM (BAM) file. Then, duplicates were removed and read group IDs were added using Picard Tools. Final realignment and identification of variants were performed using GATK v3.7. Statistical analyses were applied to summarize the number and distribution of variants based on the Haplotype Map (HapMap) file generated from the VCF file. Default settings were used for most software and tools.

Statistical analysis and PCA

Statistical analyses of nucleotide diversity (π) and the fixation index (FST) were conducted using VCFtools v0.1.15 (ref. 50) with a 1000-bp sliding window and 500-bp steps for all collections and individuals. The FST value was used to determine the degree of population differentiation. The significance of diversity differences between groups was assessed using t-tests. For introgression event analysis, we followed Zhao’s method51. Briefly, highly differentiated alleles at SNP loci were identified among indica, temperate japonica, and tropical japonica. For example, such SNP loci had an allele frequency greater than 0.95 in temperate japonica and less than 0.05 in indica. At each SNP locus, the allele type (indica-specific, temperate-japonica-specific or tropical-japonica-specific) of each accession was called across the mitochondrial genome. For each accession, the size of the introgression fragment in the genome was determined to estimate the proportion of potential introgression events. The selection effect of the geographic population was evaluated using Bottleneck v1.2.02 (refs 52,53) according to the allele frequency of each site. To ensure reliable detection of population bottleneck effects, variants with minor allele frequencies <0.05 were removed from our data. To evaluate relationships and population structure, PCA and MDS were conducted using TASSEL5 based on high-quality SNPs. Data were displayed by group and color using the R package ggplot2 (https://cran.r-project.org/web/packages/ggplot2/index.html).
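The windowing scheme (1000-bp windows advanced in 500-bp steps) can be made concrete with a short Python sketch of windowed nucleotide diversity. It only illustrates the idea and will not reproduce VCFtools output exactly: the function names are hypothetical, calls are treated as haploid, and missing calls (None) are simply dropped per site.

```python
# Illustrative sketch only; names and the haploid treatment are assumptions.
from collections import Counter


def site_diversity(alleles):
    """Proportion of sample pairs that differ at one site (missing calls dropped)."""
    called = [a for a in alleles if a is not None]
    n = len(called)
    if n < 2:
        return 0.0
    counts = Counter(called)
    differing_pairs = (n * n - sum(c * c for c in counts.values())) / 2
    return differing_pairs / (n * (n - 1) / 2)


def windowed_pi(site_alleles, genome_length, window=1000, step=500):
    """site_alleles: {1-based position: list of alleles}. Returns (start, end, pi) tuples."""
    results = []
    for start in range(1, genome_length + 1, step):
        end = min(start + window - 1, genome_length)
        total = sum(site_diversity(alleles)
                    for pos, alleles in site_alleles.items()
                    if start <= pos <= end)
        results.append((start, end, total / (end - start + 1)))
    return results


# Toy example: a single variable site at position 600 in a 2,000-bp sequence.
toy_windows = windowed_pi({600: ["A", "A", "T", "T"]}, genome_length=2000)
```

Because the step is half the window size, each variable site contributes to two overlapping windows, which smooths the diversity profile along the genome.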

Haplotype network and dN/dS ratios

The TCS (ref. 54) haplotype network was generated using PopART v1.7 (ref. 55). First, we used a Python script to generate FASTA data from the VCF file. Then, FASTA data alignment and conversion to the nex format were performed using MEGA7. DnaSP v6 (ref. 27) was employed for haplotype analysis (Supplementary Table S9). For dN/dS analyses, all orthologous mt genes from 23 species were aligned and written in PAML format using PRANK56. Gblocks v0.91b (ref. 57) was applied to trim the alignments used for the ML tree (MEGA7). The maximum likelihood method of codeml in PAML v4.9h (ref. 58) was used to estimate the ω ratio with F3X4 codon frequencies. In the branch test, the null hypothesis (model = 0, NSsites = 0) assumed a single ω across all branches, and the alternative hypothesis (model = 2, NSsites = 0) allowed ω to differ between foreground and background branches. The likelihood ratio test (LRT) was used to identify accelerated genes in the rice group. Here, indica and japonica were assigned as foreground branches, and other accessions were assigned as background branches. Genes with ω > 5 were removed because they were considered outliers59.
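The likelihood ratio test compares the log-likelihoods of the two nested codeml models. A minimal Python sketch is given below, assuming one degree of freedom, that is, a single additional ω for the foreground branches; if indica and japonica were given separate ω values, two degrees of freedom would apply instead. The function name and the log-likelihood values in the example are hypothetical.

```python
# Illustrative sketch only; df = 1 and the example values are assumptions.
import math


def lrt_one_df(lnl_null, lnl_alt):
    """Return (2*delta lnL, p-value) for a nested comparison with 1 degree of freedom."""
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods for the one-ratio and two-ratio models of one gene.
statistic, p_value = lrt_one_df(lnl_null=-1000.0, lnl_alt=-997.0)
accelerated = p_value < 0.05
```

A gene would be flagged as accelerated on the foreground branches when the alternative model fits significantly better and its foreground ω exceeds the background ω.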

Population structure and evolution research

Briefly, fastStructure v1.0 (ref. 60) was used to investigate population clusters. InDels were removed from all high-quality (HQ) variants to obtain an SNP-only VCF file. K values from 2 to 7 were tested to investigate the ancestry of each individual across subpopulations. Bayesian inference methods were applied to construct a phylogenetic tree for the 412 accessions based on the HQ variants. After removing missing data and gaps from all positions, the phylogenetic tree was constructed with MrBayes v3.2.7 (ref. 61) using the best nucleotide substitution model (TVM + G), which was selected from 88 models with jModelTest v2.1.10 (ref. 62), with 1000 replicates and 6 categories.

Acknowledgements

This work was carried out with the support of the “Cooperative Research Program for Agriculture Science and Technology Development (Project No. PJ013405)” Rural Development Administration, Republic of Korea. This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2017R1A2B3011208).

Author Contributions

Y.P. conceived the project and oversaw writing the manuscript. Y.P., K.K. and L.C. developed the idea and edited and revised the manuscript. L.C. performed the data analysis and wrote the manuscript. All authors have read and approved the final manuscript.

Data Availability

The datasets supporting the conclusions of this article are included within the article and its additional files. In addition, the raw VCF file generated from the 412 rice accessions in this study was deposited in the European Variant Archive under Project ID: PRJEB31784.

Competing Interests

The authors declare no competing interests.

Footnotes

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Kyu-Won Kim, Email: kyuwonkim@kongju.ac.kr.

Yong-Jin Park, Email: yjpark@kongju.ac.kr.

Supplementary information

Supplementary information accompanies this paper at 10.1038/s41598-019-47318-x.

References

  • 1.Wang M, et al. The genome sequence of African rice (Oryza glaberrima) and evidence for independent domestication. Nature genetics. 2014;46:982. doi: 10.1038/ng.3044. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Londo JP, Chiang Y-C, Hung K-H, Chiang T-Y, Schaal BA. Phylogeography of Asian wild rice, Oryza rufipogon, reveals multiple independent domestications of cultivated rice, Oryza sativa. Proceedings of the National Academy of Sciences. 2006;103:9578–9583. doi: 10.1073/pnas.0603152103. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Molina J, et al. Molecular evidence for a single evolutionary origin of domesticated rice. Proceedings of the National Academy of Sciences. 2011;108:8351–8356. doi: 10.1073/pnas.1104686108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Khush GS. What it will take to feed 5.0 billion rice consumers in 2030. Plant molecular biology. 2005;59:1–6. doi: 10.1007/s11103-005-2159-5. [DOI] [PubMed] [Google Scholar]
  • 5.Shomura A, et al. Deletion in a gene associated with grain size increased yields during rice domestication. Nature genetics. 2008;40:1023. doi: 10.1038/ng.169. [DOI] [PubMed] [Google Scholar]
  • 6.Garris AJ, Tai TH, Coburn J, Kresovich S, McCouch S. Genetic structure and diversity in Oryza sativa L. Genetics. 2005;169:1631–1638. doi: 10.1534/genetics.104.035642. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Wang W, et al. Genomic variation in 3,010 diverse accessions of Asian cultivated rice. Nature. 2018;557:43. doi: 10.1038/s41586-018-0063-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Huang X, et al. A map of rice genome variation reveals the origin of cultivated rice. Nature. 2012;490:497. doi: 10.1038/nature11532. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Civáň P, Craig H, Cox CJ, Brown TA. Three geographically separate domestications of Asian rice. Nature plants. 2015;1:15164. doi: 10.1038/nplants.2015.164.
  • 10.Gao L-z, Innan H. Nonindependent domestication of the two rice subspecies, Oryza sativa ssp. indica and ssp. japonica, demonstrated by multilocus microsatellites. Genetics. 2008;179:965–976. doi: 10.1534/genetics.106.068072.
  • 11.Choi JY, Platts AE, Fuller DQ, Wing RA, Purugganan MD. The rice paradox: multiple origins but single domestication in Asian rice. Molecular biology and evolution. 2017;34:969–979. doi: 10.1093/molbev/msx049.
  • 12.Sang T, Ge S. Genetics and phylogenetics of rice domestication. Current opinion in genetics & development. 2007;17:533–538. doi: 10.1016/j.gde.2007.09.005.
  • 13.Huang X, Han B. Rice domestication occurred through single origin and multiple introgressions. Nature plants. 2016;2:15207. doi: 10.1038/nplants.2015.207.
  • 14.Choi JY, Purugganan MD. Multiple origin but single domestication led to Oryza sativa. G3: Genes, Genomes, Genetics. 2018;8:797–803. doi: 10.1534/g3.117.300334.
  • 15.Schuster W, Brennicke A. The plant mitochondrial genome: physical structure, information content, RNA editing, and gene migration to the nucleus. Annual review of plant biology. 1994;45:61–78. doi: 10.1146/annurev.pp.45.060194.000425.
  • 16.Notsu Y, et al. The complete sequence of the rice (Oryza sativa L.) mitochondrial genome: frequent DNA sequence acquisition and loss during the evolution of flowering plants. Molecular Genetics and Genomics. 2002;268:434–445. doi: 10.1007/s00438-002-0767-1.
  • 17.Ingman M, Kaessmann H, Pääbo S, Gyllensten U. Mitochondrial genome variation and the origin of modern humans. Nature. 2000;408:708. doi: 10.1038/35047064.
  • 18.Lang BF, Gray MW, Burger G. Mitochondrial genome evolution and the origin of eukaryotes. Annual review of genetics. 1999;33:351–397. doi: 10.1146/annurev.genet.33.1.351.
  • 19.Tian X, Zheng J, Hu S, Yu J. The rice mitochondrial genomes and their variations. Plant Physiology. 2006;140:401–410. doi: 10.1104/pp.105.070060.
  • 20.Sun C, Wang X, Yoshimura A, Doi K. Genetic differentiation for nuclear, mitochondrial and chloroplast genomes in common wild rice (Oryza rufipogon Griff.) and cultivated rice (Oryza sativa L.). Theoretical and Applied Genetics. 2002;104:1335–1345. doi: 10.1007/s00122-002-0878-4.
  • 21.Mun J, Song Y, Heong K, Roderick G. Genetic variation among Asian populations of rice planthoppers, Nilaparvata lugens and Sogatella furcifera (Hemiptera: Delphacidae): mitochondrial DNA sequences. Bulletin of Entomological Research. 1999;89:245–253. doi: 10.1017/S000748539900036X.
  • 22.Zeggini E, et al. Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes. Nature genetics. 2008;40:638. doi: 10.1038/ng.120.
  • 23.Troyanskaya O, et al. Missing value estimation methods for DNA microarrays. Bioinformatics. 2001;17:520–525. doi: 10.1093/bioinformatics/17.6.520.
  • 24.Alachiotis N, Pavlidis P. RAiSD detects positive selection based on multiple signatures of a selective sweep and SNP vectors. Communications biology. 2018;1:79. doi: 10.1038/s42003-018-0085-8.
  • 25.Weir BS, Cockerham CC. Estimating F-statistics for the analysis of population structure. Evolution. 1984;38:1358–1370. doi: 10.1111/j.1558-5646.1984.tb05657.x.
  • 26.Harushima Y, Nakagahra M, Yano M, Sasaki T, Kurata N. Diverse variation of reproductive barriers in three intraspecific rice crosses. Genetics. 2002;160:313–322. doi: 10.1093/genetics/160.1.313.
  • 27.Rozas J, et al. DnaSP 6: DNA sequence polymorphism analysis of large data sets. Molecular biology and evolution. 2017;34:3299–3302. doi: 10.1093/molbev/msx248.
  • 28.Civáň P, Brown TA. Role of genetic introgression during the evolution of cultivated rice (Oryza sativa L.). BMC evolutionary biology. 2018;18:57. doi: 10.1186/s12862-018-1180-7.
  • 29.Yang C-c, et al. Independent domestication of Asian rice followed by gene flow from japonica to indica. Molecular biology and evolution. 2011;29:1471–1479. doi: 10.1093/molbev/msr315.
  • 30.Liu L, Lee G-A, Jiang L, Zhang J. Evidence for the early beginning (c. 9000 cal. BP) of rice domestication in China: a response. The Holocene. 2007;17:1059–1068. doi: 10.1177/0959683607085121.
  • 31.Fuller DQ, Allaby RG, Stevens C. Domestication as innovation: the entanglement of techniques, technology and chance in the domestication of cereal crops. World archaeology. 2010;42:13–28. doi: 10.1080/00438240903429680.
  • 32.Cheng C, et al. Polyphyletic origin of cultivated rice: based on the interspersion pattern of SINEs. Molecular Biology and Evolution. 2003;20:67–75. doi: 10.1093/molbev/msg004.
  • 33.Lin Z, et al. Origin of seed shattering in rice (Oryza sativa L.). Planta. 2007;226:11–20. doi: 10.1007/s00425-006-0460-4.
  • 34.Zhu Q, Ge S. Phylogenetic relationships among A‐genome species of the genus Oryza revealed by intron sequences of four nuclear genes. New Phytologist. 2005;167:249–265. doi: 10.1111/j.1469-8137.2005.01406.x.
  • 35.Wang H, Vieira FG, Crawford JE, Chu C, Nielsen R. Asian wild rice is a hybrid swarm with extensive gene flow and feralization from domesticated rice. Genome research. 2017;27:1029–1038. doi: 10.1101/gr.204800.116.
  • 36.Wilson HD. Artificial hybridization among species of Chenopodium sect. Chenopodium. Systematic Botany. 1980;5(3):253. doi: 10.2307/2418372.
  • 37.Motley TJ, Carr GD. Artificial hybridization in the Hawaiian endemic genus Labordia (Loganiaceae). American Journal of Botany. 1998;85:654–660. doi: 10.2307/2446534.
  • 38.Rousset F. Genetic differentiation and estimation of gene flow from F-statistics under isolation by distance. Genetics. 1997;145:1219–1228. doi: 10.1093/genetics/145.4.1219.
  • 39.Wilson HD. Artificial hybridization among species of Chenopodium sect. Chenopodium. Systematic Botany. 1980;5:253–263.
  • 40.Motley TJ, Carr GD. Artificial hybridization in the Hawaiian endemic genus Labordia (Loganiaceae). American Journal of Botany. 1998;85:654–660.
  • 41.Harrison RG, Larson EL. Hybridization, introgression, and the nature of species boundaries. Journal of Heredity. 2014;105:795–809. doi: 10.1093/jhered/esu033.
  • 42.Freeman, S. & Herron, J. C. Evolutionary analysis. (Pearson Prentice Hall Upper Saddle River, N. J., 2007).
  • 43.Lonsdale D, Brears T, Hodge T, Melville SE, Rottmann W. The plant mitochondrial genome: homologous recombination as a mechanism for generating heterogeneity. Philosophical Transactions of the Royal Society of London. B, Biological Sciences. 1988;319:149–163. doi: 10.1098/rstb.1988.0039.
  • 44.Civáň P, Craig H, Cox CJ, Brown TA. Multiple domestications of Asian rice. Nature plants. 2016;2:16037. doi: 10.1038/nplants.2016.37.
  • 45.Vigueira CC, et al. Call of the wild rice: Oryza rufipogon shapes weedy rice evolution in Southeast Asia. Evolutionary applications. 2019;12:93–104. doi: 10.1111/eva.12581.
  • 46.Kim K-W, et al. PowerCore: a program applying the advanced M strategy with a heuristic search for establishing core sets. Bioinformatics. 2007;23:2155–2162. doi: 10.1093/bioinformatics/btm313.
  • 47.Kim T-S, et al. Genome-wide resequencing of KRICE_CORE reveals their potential for future breeding, as well as functional and evolutionary studies in the post-genomic era. BMC genomics. 2016;17:408. doi: 10.1186/s12864-016-2734-y.
  • 48.Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
  • 49.Li H, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
  • 50.Danecek P, et al. The variant call format and VCFtools. Bioinformatics. 2011;27:2156–2158. doi: 10.1093/bioinformatics/btr330.
  • 51.Zhao Q, et al. Pan-genome analysis highlights the extent of genomic variation in cultivated and wild rice. Nature genetics. 2018;50:278. doi: 10.1038/s41588-018-0041-z.
  • 52.Cornuet JM, Luikart G. Description and power analysis of two tests for detecting recent population bottlenecks from allele frequency data. Genetics. 1996;144:2001–2014. doi: 10.1093/genetics/144.4.2001.
  • 53.Piry S, Luikart G, Cornuet J. BOTTLENECK: a computer program for detecting recent reductions in the effective population size using allele frequency data. Journal of heredity. 1999;90:502–503. doi: 10.1093/jhered/90.4.502.
  • 54.Clement M, Posada D, Crandall KA. TCS: a computer program to estimate gene genealogies. Molecular ecology. 2000;9:1657–1659. doi: 10.1046/j.1365-294x.2000.01020.x.
  • 55.Leigh JW, Bryant D. popart: full-feature software for haplotype network construction. Methods in Ecology and Evolution. 2015;6:1110–1116. doi: 10.1111/2041-210X.12410.
  • 56.Löytynoja, A. In Multiple sequence alignment methods 155–170 (Springer, 2014).
  • 57.Talavera G, Castresana J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Systematic biology. 2007;56:564–577. doi: 10.1080/10635150701472164.
  • 58.Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Molecular biology and evolution. 2007;24:1586–1591. doi: 10.1093/molbev/msm088.
  • 59.Castillo-Davis CI, Hartl DL, Achaz G. cis-Regulatory and protein evolution in orthologous and duplicate genes. Genome research. 2004;14:1530–1536. doi: 10.1101/gr.2662504.
  • 60.Raj A, Stephens M, Pritchard JK. fastSTRUCTURE: variational inference of population structure in large SNP data sets. Genetics. 2014;197:573–589. doi: 10.1534/genetics.114.164350.
  • 61.Huelsenbeck JP, Ronquist F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics. 2001;17:754–755. doi: 10.1093/bioinformatics/17.8.754.
  • 62.Darriba D, Taboada GL, Doallo R, Posada D. jModelTest 2: more models, new heuristics and parallel computing. Nature methods. 2012;9:772–772. doi: 10.1038/nmeth.2109.

Associated Data

Supplementary Materials

Supplementary Materials (325.3KB, pdf)
Dataset 1 (1.2MB, zip)

Data Availability Statement

The datasets supporting the conclusions of this article are included within the article and its additional files. In addition, the raw VCF file generated from the 412 rice accessions used in this study has been deposited in the European Variant Archive under Project ID PRJEB31784.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group