PLOS Genetics. 2022 Aug 30;18(8):e1010141. doi: 10.1371/journal.pgen.1010141

Diversity and determinants of recombination landscapes in flowering plants

Thomas Brazier 1, Sylvain Glémin 1,2,¤,*
Editor: Ian R Henderson 3
PMCID: PMC9467342  PMID: 36040927

Abstract

During meiosis, crossovers are not randomly distributed along the chromosome and their location may have a strong impact on the functioning and evolution of the genome. To date, the broad diversity of recombination landscapes among plants has rarely been investigated and a formal comparative genomic approach is still needed to characterize and assess the determinants of recombination landscapes among species and chromosomes. We gathered genetic maps and genomes for 57 flowering plant species, corresponding to 665 chromosomes, for which we estimated large-scale recombination landscapes. We found that the number of crossovers per chromosome spans a limited range (between one and five or six) whatever the genome size, and that there is no single relationship across species between genetic map length and chromosome size. Instead, we found a general relationship between the relative size of chromosomes and recombination rate, while the absolute length constrains the basal recombination rate for each species. At the chromosome level, we identified two main patterns (with a few exceptions) and we propose a conceptual model explaining the broad-scale distribution of crossovers where both telomeres and centromeres play a role. These patterns correspond globally to the underlying gene distribution, which affects how efficiently genes are shuffled at meiosis. These results raise new questions not only on the evolution of recombination rates but also on their distribution along chromosomes.

Author summary

Meiotic recombination is a universal feature of sexually reproducing species. During meiosis, crossovers play a fundamental role in the proper segregation of chromosomes and reshuffle alleles among chromosomes. How much variation in recombination is expected within a genome and among different species remains a central question for understanding the evolution of recombination. We characterized and compared recombination landscapes in a large set of plant species with a wide range of genome sizes. We found that the number of crossovers varied little among species, from one mandatory crossover to no more than five or six crossovers per chromosome, whatever the genome size. However, we identified two main patterns of variation along chromosomes (with a few exceptions) that can be explained by a new conceptual model where chromosome length, chromosome structure and gene density play a role. The strong association between gene density and recombination was already known, but our results raise new questions not only about the evolution of recombination rates but also about their distribution along chromosomes.

Introduction

Meiotic recombination is a universal feature of sexually reproducing species. New haplotypes are passed on to offspring by the reciprocal exchange of DNA between maternal and paternal chromosomes, known as crossovers (COs). However, recombination landscapes (the variation in recombination rates along the chromosome) are not homogeneous across the genome and vary among species [1–4]. Meiotic recombination involves chiasmata at pairing sites between homologous chromosomes to ensure the physical tension needed for the proper disjunction of homologs [1,3,5]. Recombination also plays an evolutionary role by breaking linkage disequilibrium between neighbouring sites and creating new genetic combinations transmitted to the next generation, making selection on individual genetic variants more efficient [6–8]. The number and location of crossovers along the chromosome are finely regulated through mechanisms of crossover assurance, interference and homeostasis [9,10]. In most species, crossover assurance is necessary to achieve proper segregation and to avoid deleterious consequences of nondisjunction, though it is not clear whether at least one CO per chromosome or per arm is required. Additional COs are also usually regulated through interference, ensuring that they are not too numerous and not too close to each other [10,11]. In addition to regulation on a large scale [12,13], recombination is also finely tuned on a small scale. In plants studied so far, crossovers are concentrated in very short genomic regions (typically a few kb), i.e. recombination hotspots, which have been found in gene regulatory regions, and mostly in promoters [14–16].

In addition to their function in meiosis, variations in recombination rates affect genome structure, functioning and evolution through direct effects (such as mutagenic effects, biased gene conversion, and ectopic exchanges) and indirect effects by modulating the efficacy of selection [17]. It has become a challenge to integrate recombination rate variation in population genomics in the age of ‘genomic landscapes’ [18,19]. The characterization of recombination landscapes also has practical interest, as variation in meiotic genes could be used to experimentally manipulate CO patterns for purposes such as redirecting recombination towards regions of interest for crop breeding [20].

In plants, recombination rates are believed to be higher in species with smaller genomes because linkage map length is independent of genome size and the number of chromosomes explains more variation than genome size [4]. Several broad-scale determinants have recently been identified, such as chromosome length [21], distance to the telomere or centromere [22] and genomic and epigenetic features [16,23,24], notably the density of transposable elements (TEs), which is usually negatively correlated with recombination rates [25]. Plant genomes also contain large regions with suppressed recombination in various proportions (from a few Mb to hundreds of Mb, 1 to 75% of the genome). However, the diversity of recombination landscapes in plants still remains to be properly quantified.

Recently, a meta-analysis explored large-scale recombination landscapes among eukaryotes and paved the way for identifying general patterns [2]. The authors found that larger chromosomes have low crossover rates in their centre and suggested a simple telomere-led model with a universal bias of COs towards the periphery of the chromosome, positively correlated with chromosome length. They also proposed that chromosome length played the main role in crossover patterning while the position of the centromere had almost no effect (except locally). Alternatively, it has also been proposed that both telomeres and centromeres shape recombination landscapes [26], and a universal pattern among plants has been questioned [13]. As only a limited number of species have been studied and as plant genomes are highly diverse in many ways [27,28], diversity in recombination landscapes may have been overlooked [29]. In addition, previous studies were meta-analyses combining heterogeneous datasets (e.g. a mix of data inferred from graphics, final processed data and only a few raw datasets in [2]) without a standard way to infer recombination maps, which prevented detailed comparisons among species.

To overcome these limitations we gathered what is, to the best of our knowledge, the largest recombination landscape dataset in flowering plants. We started from raw data by combining genetic mapping from pedigree data and chromosome-scale genome assemblies, from which we estimated recombination maps (more precisely, the sex-averaged rate of COs along chromosomes) using the same standardised method in all species, in order to ask the following questions. What is the range of COs per chromosome in plants? Is the distribution of COs shaped by genome structure (i.e. chromosome size, telomeres, centromeres) and if so is there a universal pattern? Since recombination is negatively associated with TEs and recombination hotspots have been found in gene regulatory regions, are recombination landscapes always associated with gene density? What are the consequences of recombination heterogeneity on the extent of genetic shuffling? Overall, we found that recombination landscapes in plants are more diverse and more complex than previously thought. We identified two main patterns that are correlated with, and which may emerge from, the gene density distribution. We show that the positive association between gene density and recombination rates globally improves the genetic shuffling of coding regions, which raises new questions about the evolution of recombination.

Results

Dataset and recombination maps

We retrieved publicly available data for sex-averaged linkage maps and genome assemblies to obtain genetic and physical distances. We selected linkage maps for which the markers had genomic positions on a chromosome-level genome assembly (except for Capsella rubella, which had a high-quality scaffold-level assembly of pseudo-chromosomes). We remapped markers on the reference genome for 14 species for which genomic positions were not known or were mapped to an older assembly. After making a selection based on the number of markers, marker density, and genome coverage, and after filtering out the outlying markers (see Materials and Methods), we produced 665 chromosome-scale Marey maps (plots of genetic vs physical distances, expressed as cM vs Mb) for 57 species (2–26 chromosomes per species, S1 and S2 Tables and S1 and S2 Figs). The number of markers per chromosome ranged from 31 to 49,483, with a mean of 956 markers. Correcting the linkage map lengths (Hall & Willis’s method) barely changed the total linkage map lengths (mean difference = 1.19 cM, max difference = 5.62 cM), giving confidence in the coverage of the linkage maps [30]. We verified that neither the number of markers, marker density nor the number of progenies had a significant effect on the analyses (S3 Fig). We also retrieved gene annotations for 41 genomes. The angiosperm phylogeny was well represented in our sampling (S4 Fig), with a basal angiosperm species (Nelumbo nucifera), 15 monocot species and 41 eudicots. From the literature, we also obtained data on the centromeric index for 37 species, defined as the short arm length divided by the total chromosome length (S3 Table).

From the Marey maps, we estimated local recombination rates along the chromosomes in non-overlapping 100 kb windows, and their 95% confidence intervals (1,000 bootstraps). Estimates at a scale of 1 Mb yielded very similar results (Spearman rank correlation between the two estimates: Rho = 0.99, p < 0.001, S4 Table), therefore only the 100 kb landscapes were used in the subsequent analyses.
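To illustrate how a Marey map translates into local rates, the minimal Python sketch below linearly interpolates the genetic position at window boundaries and divides the genetic length of each window by its physical length. It is a deliberate simplification and not the estimation procedure used here (the Fig 3 legend describes loess regression with bootstraps); the function names `interpolate_cm` and `window_rates` and the toy markers in the usage note are hypothetical.

```python
from bisect import bisect_right


def interpolate_cm(phys_mb, gen_cm, x):
    """Linearly interpolate the genetic position (cM) at physical position x (Mb).

    phys_mb must be sorted in ascending order; gen_cm holds the genetic
    positions of the same markers. Positions outside the marker range are
    clamped to the first or last marker (an assumption of this sketch).
    """
    if x <= phys_mb[0]:
        return gen_cm[0]
    if x >= phys_mb[-1]:
        return gen_cm[-1]
    j = bisect_right(phys_mb, x)  # index of the first marker strictly right of x
    x0, x1 = phys_mb[j - 1], phys_mb[j]
    y0, y1 = gen_cm[j - 1], gen_cm[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def window_rates(phys_mb, gen_cm, chrom_size_mb, window_mb=0.1):
    """Recombination rate (cM/Mb) in consecutive non-overlapping windows."""
    n_windows = int(round(chrom_size_mb / window_mb))
    rates = []
    for w in range(n_windows):
        start, end = w * window_mb, (w + 1) * window_mb
        d_cm = interpolate_cm(phys_mb, gen_cm, end) - interpolate_cm(phys_mb, gen_cm, start)
        rates.append(d_cm / window_mb)
    return rates
```

For example, three toy markers at 0, 1 and 2 Mb with genetic positions 0, 5 and 6 cM give rates of 5 and 1 cM/Mb in two 1 Mb windows.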

Smaller chromosomes have higher recombination rates than larger ones

Mean genome-wide recombination rates spanned two orders of magnitude, ranging from 0.2 cM/Mb in bread wheat (Triticum aestivum) to 16.9 cM/Mb in squash (Cucurbita maxima). In agreement with previous studies [2,4], we found a significant negative correlation between chromosome size (Mb) and the mean chromosomal recombination rate (Spearman rank correlation coefficient Rho = -0.84, p < 0.001; log-log Linear Model, adjusted R2 = 0.83, p < 0.001). Most species had, on average, between one and four COs per chromosome, although genome sizes span almost two orders of magnitude. Fewer than 2% of chromosomes had less than one CO on average (n = 11). 234 chromosomes had between one and two COs on average, suggesting that a single CO per chromosome is sufficient, though 419 chromosomes had more than two COs. However, as genetic maps are based on the average of several meioses, they do not give access to the distribution of COs per meiosis. Thus, it is worth noting that a chromosome genetic map longer than 50 cM does not imply that the chromosome always exhibits at least one CO.
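The conversion between map length and average CO number used above relies on one CO corresponding to 50 cM. A minimal sketch of that arithmetic follows; the function names are hypothetical and the second function simply gives the mean rate expected for a fixed number of COs, as drawn by the thin lines in Fig 1A.

```python
def mean_crossovers(map_length_cm):
    """Average number of COs per meiosis implied by a genetic map (1 CO = 50 cM)."""
    return map_length_cm / 50.0


def expected_rate(n_crossovers, chrom_size_mb):
    """Mean recombination rate (cM/Mb) expected for a given average number of COs."""
    return 50.0 * n_crossovers / chrom_size_mb
```

Under these definitions a 100 cM map corresponds to two COs on average, and a single CO on a 50 Mb chromosome gives a mean rate of 1 cM/Mb.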

Using a Linear Mixed Model we found a significant species random effect that explained 82% of the variance (log10(recombination rate) ~ log10(chromosome size) + (1 | species), marginal R2 = 0.17, conditional R2 = 0.96, p < 0.001). Adding phylogenetic covariance did not improve the mixed model, so we did not retain a phylogenetic effect (S5 Table). Interestingly, the (log-log) relationship between the recombination rate and chromosome size was not the same within and between species, suggesting that absolute chromosome size does not have the same effect in all species (Fig 1B). Similarly, the relationship between linkage map length (cM) and chromosome size (Mb) was highly species-specific (linkage map length ~ log10(chromosome size) + (1 | species), marginal R2 = 0.49, conditional R2 = 0.99, p < 0.001) (Fig 2A), with species slopes decreasing with the mean chromosome size in a log-log relationship. This indicates that species slopes are roughly proportional to the inverse of the mean chromosome size (S5 Fig). Consequently, the excess of COs on a chromosome (i.e. the linkage map length minus 50 cM) was correlated with its relative size (i.e. chromosome size divided by the mean chromosome size of the species; Fig 2B), not with its absolute size. Moreover, in contrast to the relationship between recombination rate and absolute size, we did not observe any difference between the linear model and the fixed regression of the mixed linear model, suggesting that this relationship is similar across species (Fig 2B). In other words, any two chromosomes with the same ratio of sizes will have the same ratio of excess COs, whatever the species and the genome size.
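The two quantities compared in Fig 2B follow directly from the definitions given in the text. The sketch below computes them for the chromosomes of one species; the function names and the toy values in the usage note are hypothetical.

```python
def relative_sizes(chrom_sizes_mb):
    """Chromosome size divided by the mean chromosome size of the species."""
    mean_size = sum(chrom_sizes_mb) / len(chrom_sizes_mb)
    return [size / mean_size for size in chrom_sizes_mb]


def co_excess(map_lengths_cm, obligate_cm=50.0):
    """Linkage map length minus the 50 cM attributed to the obligate CO."""
    return [length - obligate_cm for length in map_lengths_cm]
```

For a toy species with chromosomes of 10, 20 and 30 Mb, the relative sizes are 0.5, 1.0 and 1.5; a 120 cM map has an excess of 70 cM.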

Fig 1. Mean recombination rates per chromosome (cM/Mb, log scale) are negatively correlated with chromosome physical size (Mb, log scale).

Fig 1

Each point represents a chromosome (n = 665). Species are presented in different colours (57 species). (A) The bold solid line represents the linear regression line fitted to the data. The thin lines correspond to the expectation of one, two, three or four COs per chromosome. (B) Correlations between recombination rates and chromosome size within each species with at least 5 chromosomes (coloured lines, 55 species) and the overall between-species correlation controlled for a species effect (black dashed line, n = 57 species). Solid bold line as in (A).

Fig 2. Linkage map length (cM) is positively correlated with genomic chromosome size (Mb).

Fig 2

(A) Correlation between chromosome genomic size (Mb) and linkage map length (cM). Each point represents a chromosome (n = 665). Species are presented in different colours (57 species). The black solid line represents the simple linear regression (linkage map length ~ log10(chromosome size), adjusted R2 = 0.036, p < 0.001) and the black dashed line the fixed effect of the mixed model (linkage map length ~ log10(chromosome size) + (1 | species), marginal R2 = 0.49, conditional R2 = 0.99, p < 0.001). Species random slopes are shown in colours. Isolines of recombination rates are plotted for different values (indicated cM/Mb) as dotted red lines to represent regions with equal recombination. (B) The excess of COs (linkage map length minus 50 cM for the obligate CO) is positively correlated with the relative chromosome size (size / average size of the species). The black solid line is the linear regression across species (excess of CO ~ relative chromosome size, adjusted R2 = 0.13, p < 0.001) and the black dashed line the fixed effect of the mixed model (excess of CO ~ relative chromosome size + (1 | species), marginal R2 = 0.14, conditional R2 = 0.86, p < 0.001). Coloured solid lines represent individual regression lines for species with at least 5 chromosomes (55 species).

Diversity of CO patterns among flowering plants

Recombination landscapes along chromosomes appeared to be qualitatively very similar within species but strongly varied between species (Figs 3 and S2). In the text below, to avoid confusion with the molecular composition and specific position defining telomeric and centromeric regions stricto sensu, we use instead the terms distal regions for the extremities of the chromosomes and proximal regions for the central part of the chromosomes around the centromere. Note that none of the species we surveyed had acrocentric chromosomes. Representing relative recombination rates in ten bins of equal physical length (see Materials and Methods for details), some landscapes appeared rather homogeneous along chromosomes whereas others were extremely structured with recombination concentrated in the short distal parts of the genome, with wide variation between these two extremes (Fig 4). The Gini index is a measure of heterogeneity bounded between 0 (perfect homogeneity) and 1 (maximal heterogeneity). The Gini index estimated on recombination landscapes ranged between 0.1 and 0.9 (S1 Table). The bias towards the periphery was not ubiquitous across species (Fig 4), whereas Haenel et al. [2] suggested that the distal bias could be universal for chromosomes larger than 30 Mb. Only a subset of species, especially those with very large chromosomes (> 100 Mb), exhibited a clear bias (Fig 4). Despite large chromosome sizes (mean chromosome sizes = 98 Mb and 222 Mb, respectively), Nelumbo nucifera and Camellia sinensis are notable exceptions to this pattern, with the highest recombination rates found in the middle of the chromosomes (Nelumbo nucifera illustrated in Fig 3E, other species in S2 Fig). For small to medium-sized chromosomes, the pattern is less clear. Most species did not show any clear structure along the chromosome but a few of them (e.g. Capsella rubella, Dioscorea alata, Mangifera indica, Manihot esculenta) showed a drop in recombination rates in the distal regions and high recombination rates in the proximal regions (Capsella rubella illustrated in Fig 3A).
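As a reminder of how a Gini index behaves on a recombination landscape, the sketch below uses the standard mean-absolute-difference formula on a list of per-window recombination rates. This is only an illustration under the assumption of equal-sized windows; the exact estimator used for S1 Table is not restated here, and the function name `gini` is hypothetical.

```python
def gini(values):
    """Gini index of non-negative values from the mean absolute difference.

    G = sum_i sum_j |x_i - x_j| / (2 * n * sum(x)).
    Returns 0.0 for an empty or all-zero input (a convention of this sketch).
    """
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    abs_diff = sum(abs(a - b) for a in values for b in values)
    return abs_diff / (2.0 * n * total)
```

With this formula a perfectly flat landscape gives 0, and a landscape where all recombination falls in one of n windows gives (n - 1) / n, which approaches 1 as the number of windows grows.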

Fig 3. Diversity of recombination landscapes exemplified by six different species.

Fig 3

Recombination landscapes are similar within species (the dashed line is the average landscape for pooled chromosomes, all recombination landscapes of the species are contained within the colour ribbon). Genomic distances (Mb) were scaled between 0 and 1 to compare chromosomes with different sizes. Estimates of the recombination rates were obtained by 1,000 bootstraps over loci in windows of 100 kb with loess regression and automatic span calibration. One chromosome per species is represented in a solid line, with the genomic position of the centromere demarcated by a dot. The six species are ordered by ascending mean chromosome size (Mb).

Fig 4. Patterns of recombination within chromosomes (n = 665).

Fig 4

Relative recombination rates along the chromosome were estimated in ten bins of equal physical length, as the ratio of the observed genetic length of the bin divided by its expected genetic length (one tenth of the total), log-transformed. Values below (above) zero are recombination rates that are lower (higher) than expected under a random distribution. The 57 species are ordered by ascending genome size. Each horizontal bar plot represents one chromosome. When available, the centromere position is mapped as a black and white diamond.

Following Haenel et al. [2], we calculated the periphery-bias ratio as the recombination rate in the 10% at each extremity of each chromosome divided by the mean recombination rate of the chromosome. A ratio higher than 1 indicates a higher recombination rate in the tips than in the whole chromosome. By pooling chromosomes within species, the periphery-bias ratio ranged from 0.31 to 5.76. Most species exhibited a ratio higher than one, but the ratio was lower than one for nine species and just above one (<1.2) for six species (see S5 Table). We detected a significant positive effect of chromosome length on the periphery-bias ratio across species (Linear Model, adjusted R2 = 0.44, p < 0.001; Fig 5A) with some exceptions (see Capsella rubella and Nelumbo nucifera on Fig 3). Analysing chromosomes independently across all species, the mean periphery-bias ratio was significantly higher than 1 (95% bootstrapped confidence interval of the mean = [2.06;2.32]) and the distribution was skewed towards values higher than 1, but the correlation with chromosome length within each species was not clear (Fig 5B, 5C and S6 Table).
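The periphery-bias ratio defined above can be sketched as follows for one chromosome, assuming recombination rates are given in equal-sized windows ordered along the chromosome. Pooling both tips into a single mean and rounding the tip size to a whole number of windows are choices of this sketch, and the function name is hypothetical.

```python
def periphery_bias_ratio(rates, tip_fraction=0.1):
    """Mean rate in the two distal tips divided by the chromosome-wide mean rate.

    rates: recombination rates in equal-sized windows along the chromosome.
    tip_fraction: share of the chromosome counted as a tip at each extremity.
    """
    n = len(rates)
    k = max(1, round(n * tip_fraction))  # number of windows in each tip
    tips = rates[:k] + rates[-k:]
    tip_mean = sum(tips) / len(tips)
    overall_mean = sum(rates) / n
    return tip_mean / overall_mean
```

A toy chromosome with ten windows, a rate of 4 in each terminal window and 1 elsewhere, has a ratio of 4 / 1.6 = 2.5; a flat landscape has a ratio of 1.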

Fig 5. The periphery-bias ratio is positively correlated with chromosome genomic size.

Fig 5

(A) Linear regression between the species mean periphery-bias ratio and the mean chromosome size (log scale) across species (n = 57 species; adjusted R2 = 0.44, p < 0.001). Points are coloured according to the classification of the CO patterns described below (orange = distal, blue = sub-distal, black = unclassified). (B) Distribution of periphery-bias ratios (n = 665 chromosomes). The mean periphery-bias ratio and its 95% confidence interval (black solid and dashed lines) were estimated by 1,000 bootstrap replicates. The red vertical line corresponds to a ratio of one. (C) Distribution of Spearman’s correlation coefficients between the periphery-bias ratio and chromosome genomic size (Mb) within species (n = 57 species).

Joint effect of telomeres and centromeres on crossover distribution along chromosomes

Globally, recombination rates were negatively correlated with the distance to the nearest telomere (S6 Fig and S7 and S8 Tables). However, two qualitative patterns emerged (Figs 6 and S7, S8 Table). In 34 species, recombination decreased from the telomere and reached a plateau at approximately 20% of the whole chromosome length (the distal pattern, Fig 6A), in agreement with the model suggested by Haenel et al. [2]. Sixteen species exhibited a sharp decrease in the most distal regions and a peak of recombination in the sub-distal regions (relative genomic distance between 0.1 and 0.2) followed by a slow decrease towards the centre of the chromosome (the sub-distal pattern, Fig 6B). There were a few exceptions to these two patterns (six species), e.g. Capsella rubella consistently showed higher recombination rates in the middle of the chromosome (Fig 3A). Interestingly, species classified as having a distal pattern had significantly larger chromosomes than species classified as having a sub-distal pattern (Wilcoxon rank sum test, p < 0.001, Fig 6C). Furthermore, the negative correlation between recombination and the distance to the nearest telomere was significantly stronger for species with larger chromosomes (Spearman rank correlation coefficient Rho = -0.51, p < 0.001; S6 Fig).

Fig 6. Distribution of crossover: main patterns.

Fig 6

(A and B) Standardized recombination rates for species (chromosomes pooled per species, n = 57 species) are expressed as a function of the relative genomic distance from the telomere in 20 bins representing the two main patterns (orange = distal, blue = sub-distal). The seven unclassified species are shown in supplementary (S7 Fig). Chromosomes were split in half and 0.5 corresponds to the centre of the chromosome. In each plot, the solid line represents the mean recombination rate estimated in a bin (20 bins) and each dot per bin represents the average of a species. Upper and lower boundaries of the ribbon represent the maximum and minimum values. (C) Distribution of chromosome genomic sizes (Mb) for each pattern (D: distal, SD: sub-distal, E: exceptions).

When the centromere position was known, we confirmed that the centromeres had an almost universal local suppressor effect (Figs 3 and 4). In small and medium-sized chromosomes, recombination was often suppressed only in short restricted centromeric regions (several Mb, 1–5% of the map) displaying drastic drops in the recombination rates. In larger chromosomes, the suppression of recombination extended to large regions upstream and downstream of the physical centre of the chromosome (approximately 80–90% of the chromosome; Fig 4). Ninety percent of chromosomes (388 chromosomes) had significantly less recombination than the chromosome average at the centromeric index (n = 425, resampling test, 1,000 bootstraps, 95% confidence interval). In 81 chromosomes (19%) the estimated recombination rate was null in the centromere (although it may be non-zero but lower than the detection threshold given the number of individuals used for genetic maps). However, the transposition of centromere position from cytological data to genomic data may be imprecise or wrongly oriented for some chromosomes. After orienting chromosomes to map the centromeric index, 16% of chromosomes (70 of 425) had a recombination rate slightly higher in the inferred centromere position than on the opposite side, suggesting a centromere potentially mapped on the wrong side.

To understand the observed patterns further, we compared three models (Fig 7). Under the strict distal model proposed by Haenel et al. [2] (M1), the centromere plays no role beyond its local suppressor effect, which predicts an equal distribution of crossovers on both sides of the centre of chromosomes, independently of centromere position: d(1/2) / d(1) = 0.5, where d(1/2) is the genetic distance (cM) to the physical middle of the chromosome and d(1) is the total genetic distance (cM). We also tested two alternative models with a centromere effect, in which the relative position of the centromere, c, affects the distribution of crossovers along the chromosome, d(c) being the genetic distance (cM) to the centromere. Models M2 (‘telomere + centromere + one CO per arm’) and M3 (‘telomere + centromere + one CO per chromosome’) both assume that the relative genetic distance of a chromosome arm is proportional to its relative genomic size. However, they differ in the number and distribution of mandatory COs. In M2, at least one CO in each chromosome arm (50 cM) is mandatory, whereas in M3 only one CO is mandatory for the entire chromosome, even if it has two arms (which is always the case in our dataset). For species whose centromere position was known (37 species, 425 chromosomes) we regressed the observed values against the theoretical predictions of the three models and compared them using goodness-of-fit criteria (adjusted R2, AIC, BIC). M1 was not supported by any species and M2 was generally rejected since 22% of chromosomes had genetic maps shorter than 50 cM in at least one arm, even though it was supported in a handful of species (Table 1). M3 was the best supported model (30 out of 37 species), with good predictive power (Spearman rank correlation between predicted and observed values: Rho = 0.72, p < 0.001; Tables 1, S9 and S10). Given that chromosome arm genetic maps shorter than 50 cM are incompatible with one mandatory CO per arm in model M2, we also compared the three models on a subset of chromosomes with at least 50 cM on each chromosome arm (n = 36 species, 333 chromosomes), which confirmed that model M3 was the best model. Similarly, we reran the model comparison using only chromosomes whose centromere positions were known with certainty (n = 37 species, 355 chromosomes) and found the same results.
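The expectations of the three models, as written in the 'Expected' column of Table 1, can be rearranged to predict a genetic distance from the total map length d(1) and the relative centromere position c. The sketch below does only that rearrangement; the function names are hypothetical and the regression and model-selection steps are not reproduced.

```python
def predict_m1(total_cm):
    """M1 (telomere only): d(1/2) / d(1) = 0.5, so d(1/2) = 0.5 * d(1)."""
    return 0.5 * total_cm


def predict_m2(total_cm, c):
    """M2 (one CO per arm): (d(c) - 50) / (d(1) - 100) = c.

    Each arm receives 50 cM and the remaining map length is shared in
    proportion to the relative position c of the centromere (0 to 1).
    """
    return 50.0 + c * (total_cm - 100.0)


def predict_m3(total_cm, c):
    """M3 (one CO per chromosome): d(c) / d(1) = c, so d(c) = c * d(1)."""
    return c * total_cm
```

For a 150 cM chromosome with the centromere at one third of its length (the example drawn in Fig 7), M3 predicts 50 cM to the centromere and M2 about 66.7 cM, while M1 predicts 75 cM to the physical midpoint whatever the centromere position.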

Fig 7. Possible models of crossover patterns.

Fig 7

Schematic representation of the three competing models for the two main patterns, with an example of a centromere position at 1/3 of the chromosome. Model 3 is the best model (box).

Table 1. Model selection for the telomere/centromere effect (n = 37 species with a centromere position, 425 chromosomes).

Three competing models were compared based on the adjusted R2, p-value and AIC-BIC criteria among chromosomes (the best supported model is in bold characters). The number of species supporting each model was calculated based on the adjusted R2 within species, for all species with at least five chromosomes. (1) ‘telomere’ model. (2) ‘telomere + centromere + one CO per arm’ model. (3) ‘telomere + centromere + one CO per chromosome’ model. d(c) is the genetic distance to the centromere. d(1) is the total genetic distance. A second model selection was done on a subset of chromosomes with at least 50 cM on each chromosome arm (n = 36 species, 333 chromosomes).

# | Model | Expected | Adjusted R2 | p | AIC | BIC | Species
Full dataset (37 species, 425 chromosomes)
1 | Telomere | d(1/2) / d(1) = 0.5 | 0.22 | < 0.001 | -477.8 | -465.7 | 0
2 | Tel. + Cent. + CO per arm | (d(c) - 50) / (d(1) - 100) = c | - | 0.72 | 3098.2 | 3110.4 | 7
3 | Tel. + Cent. + CO per chr. | d(c) / d(1) = c | 0.51 | < 0.001 | -476.6 | -464.5 | 30
Subset (36 species, 333 chromosomes)
1 | Telomere | d(1/2) / d(1) = 0.5 | 0.18 | < 0.001 | -407.5 | -396.1 | 0
2 | Tel. + Cent. + CO per arm | (d(c) - 50) / (d(1) - 100) = c | -0.001 | 0.42 | 1939.1 | 1950.5 | 10
3 | Tel. + Cent. + CO per chr. | d(c) / d(1) = c | 0.50 | < 0.001 | -396 | -384.6 | 26

Recombination rates are positively correlated with gene density

It has been shown in a few species that COs preferentially occur in gene promoters. The scale of 100 kb used here is too large to test whether this pattern is shared among angiosperms. Instead, as in Haenel et al. [2], we assessed whether recombination increased with gene density. This pattern is also predicted if there is a negative association between TEs and recombination. Forty-one genomes were annotated with gene positions. Across chromosomes, the distribution of chromosomal correlations between gene count and recombination rate was clearly skewed towards positive values, independently of the previously described CO patterns (mean Spearman’s rank correlation = 0.46 [0.43; 0.49]; Fig 8A). Ninety-one percent of 483 chromosomes (41 species) showed a significant correlation between the number of genes and recombination rate at a 100 kb scale. The strength of the relationship varied greatly across species and did not correlate with chromosome length or the genome-wide recombination rate (Fig 8B). Overall, standardized recombination rates (obtained by subtracting the mean and dividing by the standard deviation to allow comparison among species) consistently increased with the number of genes in most species (quadratic regression, adjusted R2 = 0.62, p < 0.001; Fig 8C).
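The two preprocessing steps mentioned above, counting gene start positions per 100 kb window and standardizing values within a species, can be sketched as below. The function names are hypothetical, gene start coordinates are assumed to be 0-based base-pair positions, and the use of the sample standard deviation (n - 1) is an assumption of this sketch.

```python
def gene_counts_per_window(gene_starts_bp, chrom_size_bp, window_bp=100_000):
    """Number of gene start positions falling in each non-overlapping window."""
    n_windows = -(-chrom_size_bp // window_bp)  # ceiling division
    counts = [0] * n_windows
    for start in gene_starts_bp:
        if 0 <= start < chrom_size_bp:  # ignore coordinates outside the chromosome
            counts[start // window_bp] += 1
    return counts


def standardize(values):
    """Centre and reduce values: subtract the mean, divide by the standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    if sd == 0:
        raise ValueError("cannot standardize constant values")
    return [(v - mean) / sd for v in values]
```

For instance, gene starts at 10, 150,000, 160,000 and 250,000 bp on a 300 kb chromosome give counts of 1, 2 and 1 in the three windows.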

Fig 8. Recombination rates are positively correlated with gene density (n = 483 chromosomes, 41 species).

Fig 8

(A) Distribution of chromosome Spearman’s rank correlations between the number of genes and the recombination rate in 100 kb windows. The black vertical line is the mean correlation with a 95% confidence interval (dashed lines) estimated by 1,000 bootstrap replicates. Colours correspond to CO patterns (orange = distal, blue = sub-distal, black = exception). (B) Slopes of the species linear regression between gene count and recombination rates are independent of the species averaged recombination rate (Linear Model, adjusted R2 = -0.02, p = 0.83). (C) Standardized recombination rates for each number of genes in a 100 kb window (centred-reduced, chromosomes pooled per species) estimated by 1,000 bootstraps and standardized within species. The gene count was estimated by counting the number of gene starting positions within each 100 kb window. The black line with a grey ribbon is the quadratic regression estimated by linear regression with a 95% parametric confidence interval (Linear Model, adjusted R2 = 0.62, p < 0.001).

As for recombination patterns, we classified patterns of gene density along chromosomes in three categories: distal, sub-distal and exceptions (S8 Fig). Most species (30 out of 41) were classified in the same gene density and recombination pattern (S11 Table). Moreover, we observed the same qualitative pattern for gene density and recombination for species with either major recombination pattern (Fig 9).

Fig 9. Gene counts patterns along the chromosome are correlated with CO patterns (n = 41 species).

Fig 9

Standardized gene count (centred-reduced) as a function of the relative distance from the tip to the middle of the chromosome (genomic distances distributed in 20 bins). We used the same groups as identified for the CO pattern in Fig 6; (A) distal pattern vs (B) sub-distal pattern. Same legend as Fig 6.

Quantification of genetic shuffling

We confirmed that crossovers are unevenly distributed in genomes, which should affect how genetic variation is recombined between parental homologous chromosomes. Recently, Veller et al. [31] proposed a measure to quantify the amount of genetic shuffling within and among chromosomes. To quantify how much it depends on the distribution of COs, we estimated its intrachromosomal component, r̄_intra, as described in equation 10 in Veller et al. [31]. For a chromosome, r̄_intra measures the probability that a random pair of loci is recombined by a crossover. As expected, it was positively and significantly correlated with linkage map length (r̄_intra ~ linkage map length + (1 | species), marginal R2 = 0.43, conditional R2 = 0.88, p < 0.001, S9 Fig). A pattern in which COs are physically clustered in distal chromosome regions is thought to generate less shuffling than one with COs evenly distributed across the chromosome [31]. At the chromosomal level, the periphery-bias ratio had a weak but significant effect on the genetic shuffling measure, consistent among species (r̄_intra ~ periphery-bias ratio + (1 | species), marginal R2 = 0.05, conditional R2 = 0.68, p < 0.001, S10 Fig). The more COs are clustered in the tips of the chromosome, the lower the chromosomal genetic shuffling. These results agree with the analytical predictions of Veller et al. [31], although the strength of the effect remains weak.
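The intuition that clustered COs shuffle less can be illustrated with a small sketch that averages the recombinant fraction over all pairs of loci placed at regular physical intervals along one chromosome. This is not equation 10 of Veller et al. [31]: converting map distance to a recombinant fraction with Haldane's mapping function (no interference) is an assumption made here for illustration, and the function names are hypothetical.

```python
import math
from itertools import combinations


def haldane_r(d_cm):
    """Recombinant fraction for a map distance in cM (Haldane, no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def mean_pairwise_shuffling(gen_cm):
    """Average recombinant fraction over all pairs of loci.

    gen_cm: genetic positions (cM) of loci evenly spaced in physical distance.
    """
    pairs = list(combinations(gen_cm, 2))
    if not pairs:
        return 0.0
    return sum(haldane_r(abs(a - b)) for a, b in pairs) / len(pairs)


# Two toy 100 cM chromosomes sampled at five evenly spaced physical positions.
even = [0, 25, 50, 75, 100]       # COs spread along the chromosome
clustered = [0, 50, 50, 50, 100]  # COs confined to the two tips
```

With these toy maps the evenly spread landscape yields a higher average than the tip-clustered one, even though both have the same total genetic length.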

However, the distributions of COs and genes are both non-random and often correlated (Figs 8 and S11). Genomic distances measured in base pairs may not be the most appropriate measure of genetic shuffling among functional genomic components. Thus, we also measured genomic distances in gene distances (i.e. the cumulative number of genes along the chromosome) instead of base pairs. Marey maps most often appeared more homogeneous when scaled on gene distances instead of base pair distances, with 70% (316 out of 450) of Marey maps showing a smaller departure from a random distribution (Figs 10 and S12, S11 Table). This is not an automatic effect of changing scale, as we compared the two maps on relative scales. Overall, 30 species had more homogeneous Marey maps with gene distances, whereas 11 others were quantitatively more heterogeneous (notably Capsella rubella and Arabidopsis thaliana), although this could be due to low-quality annotations making it difficult to precisely estimate the gene distances for some of them (e.g. Sesamum indicum). In most cases, genetic shuffling measures were slightly higher when gene distances were used instead of base pairs (Fig 11; mean = 0.22 for base pairs; mean = 0.26 for gene distances; Wilcoxon rank sum test with continuity correction, p < 0.001), implying more recombination among coding regions than among regions randomly sampled in the genome. Interestingly, the increase in genetic shuffling calculated in gene distances compared to genomic distances was more pronounced for longer chromosomes (which are often the most heterogeneous ones, characterized by a distal pattern), whereas we saw little effect on smaller chromosomes characterized by a sub-distal pattern (difference in r̄intra ~ log10(chromosome size) + (1 | species), marginal R2 = 0.21, conditional R2 = 0.87, p < 0.001, Fig 11).

Fig 10. Marey maps of six chromosomes with the relative physical distance expressed in genomic distances (black dots, position in the genome in Mb) or in gene distances (grey dots, position measured as the cumulative number of genes along the chromosome).

Marey maps are ordered by ascending chromosome size (Mb). The diagonal dashed line represents a theoretical random distribution of COs along the chromosome.

Fig 11. Differences in genetic shuffling between estimates based on genomic distances (Mb) and gene distances (cumulative number of genes).

The difference is the genetic shuffling in gene distances minus the genetic shuffling in genomic distances. Colours correspond to CO patterns (orange = distal, blue = sub-distal, black = exception). (A) Distribution of the chromosome differences in the genetic shuffling (n = 444 chromosomes). (B) Distributions of the species difference in the genetic shuffling (n = 41 species, chromosomes pooled). (C) Species differences in the genetic shuffling are positively correlated with the averaged chromosome size (Linear Model, adjusted R2 = 0.20, p = 0.002, n = 41, 95% parametric confidence interval).

Discussion

Based on a large and curated dataset, we provided a broad survey of recombination landscapes among flowering plants. In addition to confirming that both the chromosome-wide recombination rate and the heterogeneity of recombination landscapes vary according to chromosome length, we identified two distinct CO patterns and we proposed a new model that extended the strict telomere model recently proposed by Haenel et al. [2]. Moreover, the consistent correlation between recombination and gene density may have implications for the evolution of recombination landscapes and whether the distribution of COs is optimal for the efficacy of genetic shuffling.

Chromosome size and recombination rate

We showed that, for most species, the smallest chromosome had roughly one or two COs, independently of chromosome size. This is in agreement with the idea that CO assurance is a ubiquitous regulation process among angiosperms [10]. Moreover, this constraint imposes a kind of basal recombination rate for each species, on the order of 50/Sc cM/Mb, where Sc is the size of the smallest chromosome in Mb. Regardless of the genome size (which spans three orders of magnitude or more), the number of COs remains relatively stable amongst species, most probably under the joint influence of CO assurance, interference and homeostasis [4,9,11]. As a result, averaged recombination rates are negatively correlated with chromosome lengths, as already known in plants [2,21].
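As a worked illustration of this basal rate: a single mandatory CO per bivalent corresponds to 50 cM of genetic map length, so the chromosome-wide rate of the smallest chromosome is roughly 50 cM divided by its size. The two chromosome sizes below are hypothetical round numbers chosen for the arithmetic, not values from our dataset.

```latex
\[
r_{\text{basal}} \approx \frac{50\ \text{cM}}{S_c}, \qquad
S_c = 25\ \text{Mb} \;\Rightarrow\; r_{\text{basal}} \approx 2\ \text{cM/Mb}, \qquad
S_c = 250\ \text{Mb} \;\Rightarrow\; r_{\text{basal}} \approx 0.2\ \text{cM/Mb}.
\]
```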

However, there is no universal relationship between the absolute size of a chromosome and its mean recombination rate. Although the average recombination rate of a species is well predicted by its average chromosome size, the recombination rates of each chromosome separately are not well predicted by their absolute chromosome size. Instead, variation within species is much better explained by the relative chromosome size, and surprisingly, this relationship seems to be roughly the same among species (see Figs 1 and 2). This suggests that CO interference is proportional to the relative size of the chromosome, as it has been empirically observed in some plants [32]. Although it is not clear yet which interference distance unit is the most relevant, genomic distances (in Mb) are excluded in most models of interference in favour of genetic distances (cM) [33] or, more likely, the length of the synaptonemal complex in micrometres [5,34–36]. Both scales match our observation of a relative size effect. Within species, genetic maps increase with chromosome size, but among species they are uncorrelated and far less variable than genome sizes, which makes the relative chromosome size the main determinant of recombination rate variations among species. Similarly, physical sizes (in micrometres) at meiosis do not seem to scale with genome size, as chromosomal organization (nucleosomes, chromatin loops) strongly reduces the variation that could be expected given the genome size [9].

Recombination patterns along chromosomes

We observed a global trend towards higher recombination rates in sub-distal regions [2,29]. The distal bias increased with chromosome length, in agreement with the conclusions of Haenel et al. [2], although our methods differ in resolution. We analysed species and chromosomes separately whereas Haenel et al. [2] used averages over the different patterns, thereby masking chromosome- and species-specific particularities. For example, they detected neither the sub-distal pattern nor the unclassified exceptions, although both seem common among species (16 and 7 species respectively). So far, little is known about the mechanisms that could explain the link between the distal bias and chromosome length. Even if models of CO interference yield similar patterns [37,38], the conceptual model of Haenel et al. [2] is still the only one to explicitly consider chromosome length. The telomere effect is thought to act at a broad chromosome scale over long genomic distance. The decision of double strand breaks (DSBs) to engage in the CO pathway is made early on during meiosis and the early chromosome pairing beginning in telomeres is thought to favour distal COs [39–41]. In barley, when the relative timing of the first stages of the meiotic program was shortened, COs were redistributed towards proximal regions [40], as later observed in wheat [42].

Haenel et al. [2] proposed that distance to the telomere is driving CO positioning, and therefore it should produce a symmetrical U-shaped pattern along chromosomes. However, a formal test showed that this model was too simple and that centromeres also played a role in the distribution of COs between chromosome arms. The best model (M3: ‘telomere + centromere + one CO per chromosome’) that we have proposed suggests that centromeres do not only have a local effect but also influence the symmetry of recombination landscapes over long distance, though a large proportion of our sample is metacentric, which might limit the detection of an effect. The local suppression of COs in centromeric regions is well known and largely conserved among species and seems a strong constitutive feature restricted to a short centromeric region, basically the kinetochore [43,44]. But the extent of the pericentromeric region varies drastically, most probably under the influence of DNA methylation, chromatin accessibility or RNA interference [14,43,45,46]. However, how centromeres (especially non-metacentric ones) may affect CO distribution at larger scales still needs to be determined.

Diversity of patterns among species

In addition to the role of centromeres, we also observed that the distal model is not found in all plants. Instead, we observed at least two different crossover patterns among plant species (34 with the distal model and 16 with the sub-distal model), while seven species remain unclassified, which is at the limit of our visual classification. Globally, the distal pattern seems to occur more often in larger chromosomes, but our data lack species with giant genomes, which are not rare in plants [27]. Astonishingly, a low-density genetic map in Allium showed higher recombination rates in the proximal regions, which is opposite to the major trend we found [47]. Genera with giant genomes such as Lilium or Allium would have been valuable assets in our dataset, but the actual genomic and linkage data are relatively incomplete [48,49].

The occurrence of various recombination patterns is in agreement with what is known of the timing of meiosis and heterochiasmy (the fact that male and female meiosis have different CO patterns). Despite the strong conservation of the main meiotic mechanism in plants, differences in the balance between key components may produce distinct CO patterns [1,13,20,40]. For example, the ZYP1 and ASY1 proteins have antagonistic effects on the formation of the synaptonemal complex in plants [50]. In barley and wheat, linearization of the chromosome axis triggered by ZYP1 is gradual along the chromosome and initiated in distal regions, forming the telomere bouquet where early DSBs form [40,42]. In contrast, chromosome axes are formed at a similar time in Arabidopsis thaliana and chromosomes are gradually enriched in ASY1 from the telomeres to the centromeres; a gene-dosage component favours synapsis and ultimately COs towards the proximal regions [50]. It appears that the timing of the meiotic programme is important for the distal bias, as it involves changes in the relative contribution of each meiotic component that could explain the re-localization of COs [40,50]. Therefore, the different patterns we observed may be explained by the different balance and timing of the expression of shared key regulators of CO patterning such as ZYP1 and ASY1 [20]. It is interesting to note that this is also true for mechanistic models of interference. Zhang et al. [38] assessed that the ‘beam-film’ model is able to fit both CO patterns, regardless of whether the tips of the chromosomes have an effect on interference or not, i.e. clamping. If clamping is assumed, the model predicts that mechanical stress culminates in the extremities of the chromosome leading to high CO rates at the periphery where it is released first. In contrast, when clamping is limited, mechanical stress is released in the tips of the chromosome and COs occur further from the tips, until a threshold of mechanical stress is reached. The observed sub-distal pattern fits these predictions.

The two patterns of recombination we described here can also be observed in opposite sexes within the same plant species [34,51,52]. Marked heterochiasmy variations between species, a feature shared among plants and animals, could influence the resulting sex-averaged recombination landscape [52]. The sex-averaged telomere effect can be thought of as the product of two independent sex-specific landscapes although it is not clear how sex-specific maps ultimately contribute to the sex-averaged one [53,54]. Recombination is usually biased towards the tips of the chromosome in male recombination maps, but is more evenly distributed in female maps in the few plant species with available data [52]. In Arabidopsis thaliana, male meiosis has higher CO rates within the tips of the chromosome, as it has been observed in other species with large chromosomes, whereas female meiosis is more homogeneously distributed, with the lowest rates found in the distal regions [34]. Shorter chromosome axes in A. thaliana female meiosis could induce fewer DSBs and class II non-interfering COs [36]. Conversely, in maize, the distal bias is similar in both sexes, despite higher CO rates for females [55]. Heterochiasmy is not universal in plants [56], and we suggest that the variation in recombination landscapes could also result from variation in heterochiasmy among species, as it has been suggested for broad-scale differences in recombination landscapes between A. thaliana and its relative A. arenosa [51]. This hypothesis should be tested further as more sex-specific genetic maps become available.

Recombination landscapes, gene density and genetic shuffling

We observed a strong convergence between CO patterns and gene density patterns. Interestingly, we found the same correlation in species with atypical chromosomes. For example, Camellia sinensis and Nelumbo nucifera have large genomes with homogenous recombination landscapes, and a recent annotation of the Nelumbo nucifera genome showed that genes are also evenly distributed along chromosomes at a broad scale [57], similar to Camellia sinensis [58]. In wheat and rye, the analysis of the effect of chromosome rearrangement on recombination also suggests that CO localization is more locus-specific than location-specific: after inversions of distal and interstitial segments, COs were relocated to the new position on the distal segment [59,60]. Overall, the parallel between gene density and recombination landscapes, confirmed by these two exceptions, is in agreement with the preferential occurrence of COs in gene regulatory sequences [14–16], and suggests that this may be a general pattern shared among angiosperms. Thus, gene distribution along chromosomes could be a main driver of recombination landscapes simply by determining where COs may preferentially occur. It should be noted that since the gene number is usually positively correlated with chromosome size within a species but is roughly independent of genome size among species, this hypothesis also matches with the relative-size effect discussed above.

However, gene density and recombination rates are both correlated with many other genomic features, such as transposable elements (TEs) [25,61]. The accumulation of TEs in low recombining regions would progressively decrease gene density in the region, and would eventually result in a positive correlation between gene density and recombination. However, the correlation of recombination rates with TEs is not always clear and different TE families have opposite correlations [25,62]. On the one hand, gene density could directly determine recombination landscapes, leading to the accumulation of TEs in gene-poor regions, which would be amplified by a positive feedback loop. On the other hand, recombination could be targeted to gene-rich regions to avoid the deleterious effects of ectopic recombination between TEs [25]. Recombination, gene density and TEs could thus co-evolve, and the causal mechanisms of these multiple interactions still need to be clarified [25]. The use of fine-scale recombination maps (using very large mapping populations or LD maps) should help to identify the respective roles of genic regions (especially the role of promoters) and transposable elements (or other genomic features).

Irrespective of the underlying mechanism, our finding implies that the CO distribution ultimately scales with the gene distribution. Therefore, in most species, COs have a more even distribution between genes than between random genomic locations (Fig 10), which may have important evolutionary implications such as homogenizing the probability of two random genes to recombine, especially for large genomes that exhibit the strongest difference in genetic shuffling between genes and between genomic locations (Fig 11). Therefore, CO patterning (and not only the global CO rate) could be under selection not only for its direct effect on the functioning of meiosis but also for its indirect effects on selection efficacy [9]. Recombination decreases linkage disequilibrium and negative interferences between adjacent loci (e.g. Hill-Robertson Interference), and thus locally increases the efficacy of selection. Functional sites are targets for selection [63] and we found higher recombination rates in functional regions, meaning that only a few genes are ultimately excluded from the benefits of recombination, even under the most pronounced distal bias.

Higher recombination rates in gene-rich regions could provide a satisfying explanation as to why the distal bias is maintained among species despite its theoretical lack of efficacy for genetic shuffling [31]. The association between CO hotspots and gene regulatory sequences is mechanistically driven by chromatin accessibility, but it does not exclude the evolution of the mechanism itself towards the benefits of recombining more in gene-rich regions [54]. However, slight variations in genetic shuffling caused by the non-random distribution of COs are less likely to be under strong selection compared to stabilizing selection on molecular constraints for chromosome pairing and segregation [64], although interference is sometimes likely to evolve towards relaxed physical constraints [9]. In addition, the intra-chromosomal component of the genetic shuffling is a small contributor to the genome-wide shuffling rate, as a major part is due to independent assortment among chromosomes [31]. Our estimates for the chromosomal genetic shuffling do not reach the theoretical optimal value of 0.5. The pattern is not absolute, and a fraction of genes remains in low recombining regions. In grass species, up to 30% of genes are found in recombination deserts and are not subject to efficient selection (e.g. [65]). Finally, it is still an open question as to whether this global distribution of COs in gene regulatory sequences is advantageous for the genetic diversity and adaptive potential of a species [66].

Conclusion

Our comparative study only demonstrates correlations, and not mechanisms, but helps to understand the diversity and determinants of recombination landscapes in flowering plants. Our results partly confirm previous studies based on fewer species [2,4,21] while bringing new insights that alter previous conclusions thanks to a detailed analysis at the species and chromosome levels. Two main and distinct CO patterns emerge across a large set of flowering plant species; it seems likely that chromosome structure (length, centromere) and gene densities are the major drivers of these patterns, and the interactions between them raise questions about the evolution of complex genomic patterns at the chromosome scale [29,67]. The new large and curated dataset we provide in the present work should be useful for addressing such questions and testing future evolutionary hypotheses regarding the role of recombination in genome architecture, and we hope that many new species will be added, especially thanks to the increasing number of fully sequenced plant genomes. We also encourage experimentalists to quantify male and female recombination landscapes separately, as potential sex differences could bring important insight into the evolution of recombination patterns [52]. Comparing recombination maps among closely related species, or even among populations of the same species, would also help to understand how recombination landscapes evolve.

Materials and methods

Data preparation

To build recombination maps, we combined genetic and genomic maps in angiosperms that had already been published in the literature. We conducted a literature search to collect sex-averaged genetic maps estimated on pedigree data, with marker positions in centiMorgans (cM). The keywords used were ‘genetic map’, ‘linkage map’, ‘genome assembly’, ‘plants’ and ‘angiosperms’, combined with ‘high-density’ or ‘saturated’ in order to target genetic maps with a large number of markers and progenies. Additionally, we carried out searches within public genomic databases to find publicly available genetic maps. Only species with a reference genome assembly at a chromosome level were included in our study (a complete list of genetic maps with the associated metadata is given in S1 and S2 Tables). As much as possible, genomic positions along the chromosome (Mb) were estimated by blasting marker sequences on the most recent genome assembly (otherwise genomic positions were those of the original publication). Genome assemblies with annotation files at a chromosome-scale were downloaded from NCBI (https://www.ncbi.nlm.nih.gov/) or public databases. Marker sequences were blasted with ‘blastn’ and a 90% identity cutoff. Markers were anchored to the genomic position of the best hit. When the sequence was a pair of primers, the mapped genomic position was the best hit between pairs of positions showing a short distance between the forward and reverse primer (< 200 bp). In a few exceptions (see S1 Table), genomic positions were mapped on a close congeneric species genome and the genomic map was kept if there was good collinearity between the genetic and genomic positions. Chromosomes were numbered as per the reference genome assembly. When marker sequences were not available, we kept the genomic positions published with the genetic map. The total genomic length was estimated by the length of the chromosome sequence in the genome assembly. The total genetic length was corrected using Hall and Willis’s method [30], which accounts for undetected events of recombination in distal regions by adding 2s to the length of each linkage group (where s is the average marker spacing in the group).
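The length correction described above can be sketched in a few lines of Python. This is an illustrative sketch, not the scripts used in the study; the function name is hypothetical, and it assumes that s is the mean spacing between adjacent markers, i.e. the observed length divided by the number of intervals.

```python
def corrected_map_length(marker_positions_cm):
    """Return the linkage-group length (cM) plus 2s.

    Assumption: s is the mean spacing between adjacent markers,
    i.e. observed length / (number of markers - 1).
    """
    positions = sorted(marker_positions_cm)
    if len(positions) < 2:
        raise ValueError("need at least two markers")
    observed_length = positions[-1] - positions[0]
    mean_spacing = observed_length / (len(positions) - 1)
    return observed_length + 2 * mean_spacing


# Five markers spaced 10 cM apart: 40 cM observed, s = 10 cM.
print(corrected_map_length([0.0, 10.0, 20.0, 30.0, 40.0]))
```
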

We selected genetic and genomic maps after stringent filtering and corrections, using custom scripts available in a public GitHub repository (https://github.com/ThomasBrazier/diversity-determinants-recombination-landscapes-flowering-plants.git). We assumed that markers must follow a monotone increasing function when plotting genetic distances as a function of genomic distances in a chromosome (i.e. the Marey map), and collinearity between the genetic map and the reference genome was required to keep a Marey map. If necessary, genetic maps were reoriented so that the Marey map function is increasing (i.e. genetic distances read in the opposite direction). In a first step, Marey maps with fewer than 50 markers per chromosome were removed, although a few exceptions were visually validated (maps with ~30 markers). Marey maps with more than 10% of the total genomic map length missing at one end of the chromosome were removed. Marey maps with obvious artefacts and assembly mismatches (e.g. lack of collinearity, large inversions, large gaps) were removed. Markers clearly outside the global trend of the Marey map (e.g. large genetic/genomic distance from the global cloud of markers or from the interpolated Marey function, no other marker in a close neighbourhood) were visually filtered out, and multiple iterations of filtering/interpolation helped to refine outlier removal. The Marey map approach is a graphical method, so figures were systematically produced at each step as a way to evaluate the results of the filtering and corrections. Finally, when multiple datasets were available for the same species, we selected the dataset with the highest marker density, in addition to visual validation, to maintain a balanced sampling and avoid pseudo-replicates of the same chromosome.
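The two numeric filters above (marker count and coverage of the chromosome ends) can be expressed as a simple predicate. The sketch below is hypothetical and covers only these two thresholds; the visual checks for collinearity, artefacts and outliers are not reproduced, and the function and parameter names are illustrative.

```python
def passes_basic_filters(physical_mb, chromosome_length_mb,
                         min_markers=50, max_missing_end=0.10):
    """Apply the marker-count and end-coverage thresholds to one Marey map."""
    if len(physical_mb) < min_markers:
        return False
    # Fraction of the chromosome without markers at each end.
    missing_start = min(physical_mb) / chromosome_length_mb
    missing_end = (chromosome_length_mb - max(physical_mb)) / chromosome_length_mb
    return max(missing_start, missing_end) <= max_missing_end


# 51 markers covering a hypothetical 100 Mb chromosome end to end.
full_map = [i * 2.0 for i in range(51)]
print(passes_basic_filters(full_map, 100.0))
```
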

Estimates of local recombination rates

Local recombination rates along the chromosome were estimated with custom scripts following the Marey map approach, as described in the MareyMap R package [68]. The mathematical function of the Marey map was interpolated with a second-degree polynomial loess regression. Each span smoothing parameter was calibrated by 1,000 iterations of hold-out partitioning (random sampling of markers between two subsets; 2/3 for training and 1/3 for testing) with the Mean Squared Error of the loess regression as a goodness-of-fit criterion. The possible span ranged from 0.2 to 0.5 and was visually adjusted for certain maps. This validation procedure adjusted the smoothing automatically and guarded against overfitting and underfitting. The local recombination rate was the derivative of the interpolated smoothed function in fixed 100 kb and 1 Mb non-overlapping windows. Because the Marey function was assumed to be monotonically increasing, negative recombination rate estimates were set to zero. The 95% confidence intervals of the recombination rates were estimated by 1,000 bootstrap replicates of the markers to evaluate the sensitivity of our estimates to outliers and noisy data. Recombination landscapes with large confidence intervals were discarded. The quality of the estimates was checked using the correlation between the 100 kb and 1 Mb windows. In addition, to check for a bias in our estimates (e.g. inflating the chromosome recombination rate), we assessed the differences between the genome-wide recombination rate (obtained by dividing the genetic map length by the genome length) and the average estimate per chromosome (the mean of recombination rates in windows of 100 kb). Both values were highly correlated (Spearman’s Rho = 0.99, p < 0.001 and slope = 1).
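The principle of the windowed estimate (the slope of the Marey function within each window, with negative values set to zero) can be illustrated as follows. This sketch deliberately swaps the loess regression for a plain piecewise-linear interpolation between markers, so it is not the procedure used in the study; function names are hypothetical, and marker positions are assumed to be sorted and strictly increasing.

```python
from bisect import bisect_right


def interpolate_cm(physical_mb, genetic_cm, x):
    """Linearly interpolate the genetic position (cM) at physical position x (Mb)."""
    if x <= physical_mb[0]:
        return genetic_cm[0]
    if x >= physical_mb[-1]:
        return genetic_cm[-1]
    j = bisect_right(physical_mb, x)
    x0, x1 = physical_mb[j - 1], physical_mb[j]
    y0, y1 = genetic_cm[j - 1], genetic_cm[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def window_rates(physical_mb, genetic_cm, window_mb=1.0):
    """Recombination rate (cM/Mb) in non-overlapping windows; negative slopes become zero."""
    rates = []
    start = physical_mb[0]
    while start < physical_mb[-1]:
        end = min(start + window_mb, physical_mb[-1])
        gain = (interpolate_cm(physical_mb, genetic_cm, end)
                - interpolate_cm(physical_mb, genetic_cm, start))
        rates.append(max(0.0, gain / (end - start)))
        start = end
    return rates


# Toy Marey map: recombination concentrated near the first tip.
print(window_rates([0.0, 1.0, 2.0, 4.0], [0.0, 5.0, 6.0, 6.0]))
```
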

The distribution of CO along chromosomes

The spatial structure of recombination landscapes across species and chromosomes is a major feature of recombination landscapes. We divided the Marey map into k segments of equal genomic size (Mb) and then calculated the relative genetic size (cM) of each segment. Under the null model (i.e. random recombination), one expects k segments of equal relative genetic size 1/k. The relative recombination rate in segment i was estimated by the log-ratio of the observed genetic size (i.e. genetic size of segment i) divided by the expected genetic size (i.e. fixed to total genetic size / k by the model), as in the following equation.

$$\text{relative recombination rate}_i = \log_{10}\left(\frac{\text{genetic}_i}{\text{genetic}_{\text{total}}/k}\right)$$

Given the observation that most recombination landscapes are broken down into at least three segments [69], we arbitrarily chose a number of segments k = 10 to reach a good resolution (a larger k did not show any qualitative differences).
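A minimal sketch of this log-ratio is given below, taking as input the genetic sizes (cM) of k segments of equal genomic size. The function name is hypothetical, and returning negative infinity for a segment with no recombination is an assumption of this sketch rather than a documented choice of the study.

```python
import math


def relative_recombination_rates(segment_genetic_cm):
    """log10(observed / expected) per segment, with expected = total genetic size / k."""
    k = len(segment_genetic_cm)
    expected = sum(segment_genetic_cm) / k
    # Assumption: a segment without recombination is reported as -inf.
    return [math.log10(g / expected) if g > 0 else float("-inf")
            for g in segment_genetic_cm]


# Four segments, 40 cM in total: the first holds twice its expected share.
print(relative_recombination_rates([20.0, 5.0, 5.0, 10.0]))
```

Positive values mark segments that recombine more than expected under the null model, and negative values mark segments that recombine less.
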

Crossover patterns and the periphery-bias ratio

We investigated the spatial bias towards distal regions of the chromosome in the distribution of recombination by estimating recombination rates as a function of relative distances to the telomere (i.e. distance to the nearest chromosome end). Chromosomes were split by their midpoint and only one side was randomly sampled for each chromosome to avoid pseudo-replicates and the averaging of two potentially contrasting patterns on opposite arms. The relative distance to the telomere was the distance to the telomere divided by total chromosome size, then divided into 20 bins of equal relative distances. A periphery-bias ratio metric similar to the one presented in Haenel et al. [2] was estimated to measure the strength of the distal bias. We divided the recombination rates in the tip of the chromosome (10% on each side of the chromosome, and one randomly sampled tip) by the mean recombination rate of the whole chromosome. We investigated the sensitivity of this periphery-bias ratio to the sampling scale by calculating the ratio for many distal region sizes (S13 Fig).
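The periphery-bias ratio can be sketched from a list of windowed recombination rates ordered along the chromosome. The function and parameter names are hypothetical; the sketch assumes windows of equal size, so that a plain mean over windows equals the chromosome-wide mean rate.

```python
import random


def periphery_bias_ratio(rates, tip_fraction=0.1, rng=random):
    """Mean rate in one randomly chosen tip divided by the chromosome-wide mean rate."""
    n_tip = max(1, round(len(rates) * tip_fraction))
    # Sample a single tip per chromosome to avoid pseudo-replication.
    tip = rates[:n_tip] if rng.random() < 0.5 else rates[-n_tip:]
    chromosome_mean = sum(rates) / len(rates)
    return (sum(tip) / n_tip) / chromosome_mean


# Symmetric toy landscape with elevated rates at both tips.
toy_rates = [4.0, 2.0, 1.0, 1.0, 0.5, 0.5, 1.0, 1.0, 2.0, 4.0]
print(periphery_bias_ratio(toy_rates))
```

A ratio above 1 indicates that the sampled tip recombines more than the chromosome average, and a ratio of 1 corresponds to a homogeneous landscape.
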

Testing centromere or telomere effects

We searched the literature for centromeric indices (ratio of the short arm length divided by the total chromosome length) established by cytological measures. When we had no information about the correct orientation of the chromosome (short arm/long arm), the centromeric index was oriented to match the region with the lowest recombination rate of the whole chromosome (i.e. putative centromere). To determine if telomeres and centromeres play a significant role in CO patterning, we fitted empirical CO distributions to three theoretical models of CO distribution. In the following equations, d(x) is the relative genetic distance at the relative genomic position x, and a is a coefficient corresponding to the excess of COs per genomic distance. Under the strict ‘telomere’ model (1), we assumed that only telomeres played a role in CO distribution, i.e. an equal distribution of COs on both sides of the chromosome (i.e. d(1/2) = d(1) − d(1/2), such that d(1/2)/d(1) = 0.5). The ‘telomere + centromere + one mandatory CO per arm’ model (2) assumed at least one CO per chromosome arm and a relative genetic distance of each chromosome arm proportional to its relative genomic size, corresponding to the role of centromere position, denoted d(c). We have d(c) = 50 + a×c and d(1) − d(c) = 50 + a×(1 − c), such that (d(c) − 50)/(d(1) − 100) = c. Lastly, the ‘telomere + centromere + one CO per chromosome’ model (3) assumed at least one CO per chromosome and a relative genetic distance within the chromosome proportional to its relative genomic distance. We have d(c) = c×50 + a×c and d(1) − d(c) = (1 − c)×50 + a×(1 − c), such that d(c)/d(1) = c. The three competing models were compared with a linear regression between empirical and theoretical values, based on the adjusted R2 and AIC-BIC criteria among chromosomes. The number of species supporting each model was calculated based on the adjusted R2 within species, for all species with at least five chromosomes.
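The theoretical values predicted by the three models follow directly from the equations above. In the hypothetical sketch below, a is obtained from the total genetic length d(1): summing the two arms gives d(1) = 100 + a under model (2) and d(1) = 50 + a under model (3). Function names are illustrative, and the example values are not taken from the dataset.

```python
def telomere_model_midpoint(total_cm):
    """Model (1): half of the genetic map lies on each side of the midpoint."""
    return 0.5 * total_cm


def one_co_per_arm_model(total_cm, c):
    """Model (2): d(c) = 50 + a*c, with a = d(1) - 100."""
    a = total_cm - 100.0
    return 50.0 + a * c


def one_co_per_chromosome_model(total_cm, c):
    """Model (3): d(c) = c*50 + a*c, with a = d(1) - 50, i.e. d(c) = c * d(1)."""
    a = total_cm - 50.0
    return c * 50.0 + a * c


# Hypothetical chromosome: 150 cM in total, centromeric index c = 0.4.
print(telomere_model_midpoint(150.0))
print(one_co_per_arm_model(150.0, 0.4))
print(one_co_per_chromosome_model(150.0, 0.4))
```
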

Gene density

We retrieved genome annotations (‘gff’ files) for genes, coding sequences and exon positions, preferentially from NCBI and otherwise from public databases (41 species). We estimated gene counts in 100 kb windows for recombination maps by counting the number of genes with a starting position falling inside the window. For each gene count, we estimated the species mean recombination rate and its 95% confidence interval by 1,000 bootstrap replicates (chromosomes pooled per species). Most species rarely had more than 20 genes over a 100 kb span, and variance increased dramatically in the upper range of the gene counts; we therefore pruned gene counts over 20 for graphical representation and statistical analyses.
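Counting genes by their start position in fixed windows can be sketched as follows. The function name is hypothetical; the sketch assumes 1-based start coordinates, as used in GFF files, and takes the gene starts as a plain list rather than parsing an annotation file.

```python
from collections import Counter


def gene_counts_per_window(gene_starts_bp, chromosome_length_bp, window_bp=100_000):
    """Number of genes whose start position falls inside each non-overlapping window."""
    n_windows = -(-chromosome_length_bp // window_bp)  # ceiling division
    # GFF coordinates are 1-based, so position 100000 belongs to the first window.
    counts = Counter((start - 1) // window_bp for start in gene_starts_bp)
    return [counts.get(i, 0) for i in range(n_windows)]


# Five hypothetical genes on a 300 kb chromosome.
print(gene_counts_per_window([1, 50_000, 100_000, 100_001, 250_000], 300_000))
```
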

Genetic shuffling

To assess the efficiency of the recombination between chromosomes and species, we calculated the measure of intra-chromosomal genetic shuffling described by Veller et al. [31]. To have even sampling along the chromosome, genetic positions (cM) of 1,000 pseudo-markers evenly distributed along genomic distances (Mb) were interpolated using a loess regression on each Marey map, following the same smoothing and interpolation procedure as for the estimation of the recombination rates. The chromosomal genetic shuffling r̄intra was calculated as per the intra-chromosomal component of equation 10 presented in Veller et al. [31]. For a single chromosome,

$$\bar{r}_{\text{intra}} = \sum_{i<j} \frac{r_{ij}}{\binom{\Lambda}{2}}$$

where Λ is the total number of loci, $\binom{\Lambda}{2} = \Lambda(\Lambda - 1)/2$ and $r_{ij}$ is the rate of shuffling for the locus pair (i, j). For the intra-chromosomal component r̄intra, the pairwise shuffling rate was only calculated for linked sites, i.e. loci on the same chromosome. This pairwise shuffling rate was estimated by the recombination fraction between loci i and j. Recombination fractions were directly calculated from Haldane or Kosambi genetic distances between loci by applying a reverse Haldane function (1) or reverse Kosambi function (2), depending on the mapping function originally used for the given genetic map.

$$r_{ij} = \frac{1}{2}\left(1 - e^{-2 d_{ij}/100}\right) \quad (1)$$
$$r_{ij} = \frac{1}{2}\tanh\left(2 d_{ij}/100\right) \quad (2)$$
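The computation of the chromosomal shuffling measure from a set of genetic positions can be sketched as below, with d in cM as in equations (1) and (2). Function names are hypothetical, and the loess interpolation of the 1,000 pseudo-markers is not reproduced: the sketch starts from genetic positions that are already evenly sampled along the physical (or gene) scale.

```python
import math
from itertools import combinations


def haldane_fraction(d_cm):
    """Reverse Haldane function: recombination fraction from a distance in cM."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def kosambi_fraction(d_cm):
    """Reverse Kosambi function: recombination fraction from a distance in cM."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


def r_intra(genetic_positions_cm, to_fraction=haldane_fraction):
    """Mean recombination fraction over all pairs of loci on one chromosome."""
    pairs = list(combinations(genetic_positions_cm, 2))
    return sum(to_fraction(abs(b - a)) for a, b in pairs) / len(pairs)


# Same 100 cM map, with COs spread evenly or clustered at one tip.
even = [0.0, 25.0, 50.0, 75.0, 100.0]
clustered = [0.0, 0.0, 0.0, 0.0, 100.0]
print(r_intra(even), r_intra(clustered))
```

In this toy comparison the clustered map gives the lower value, in line with the expectation that distal clustering of COs reduces intra-chromosomal shuffling.
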

We also estimated marker positions in gene distances instead of genomic distances (Mb) to investigate the influence of the non-random distribution of genes on the recombination landscape. Gene distances were the cumulative number of genes along the chromosome at a given marker’s position. Splicing variants and overlapping genes were counted as a single gene. The genetic shuffling was re-estimated with gene distances instead of genomic distances to consider a genetic shuffling based on the gene distribution, as suggested by Veller et al. [31]. To compare the departure from a random distribution along the chromosome among both types of distances (i.e. genomic and genes), we calculated the Root Mean Square Error (RMSE) of each Marey map and for both distances. To assess if the distribution of genes influenced the heterogeneity of recombination landscapes, the type of distance with the lower RMSE was considered as the more homogeneous landscape. However, this measure for gene distances is sensitive to annotation errors and artefacts. False negatives are therefore expected (when Marey maps were assessed as more homogeneous in genomic distances while the inverse is true) and this classification remains conservative.
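The comparison of the two distance scales can be sketched as an RMSE between relative genetic positions and relative physical positions, the diagonal being the random expectation. This is a hypothetical sketch: the min-max rescaling to relative positions is an assumption about how the two scales were made comparable, and the function name is illustrative.

```python
import math


def marey_rmse(physical_positions, genetic_positions):
    """RMSE between relative genetic and relative physical positions (the diagonal)."""
    x_min, x_max = min(physical_positions), max(physical_positions)
    y_min, y_max = min(genetic_positions), max(genetic_positions)
    squared_errors = [
        ((y - y_min) / (y_max - y_min) - (x - x_min) / (x_max - x_min)) ** 2
        for x, y in zip(physical_positions, genetic_positions)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Same three markers positioned in Mb or in cumulative gene counts.
genetic_cm = [0.0, 1.0, 10.0]
print(marey_rmse([0.0, 5.0, 10.0], genetic_cm))    # genomic distances
print(marey_rmse([0.0, 10.0, 100.0], genetic_cm))  # gene distances
```

The scale with the lower RMSE is the one on which the Marey map is closer to the diagonal, i.e. the more homogeneous landscape.
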

Statistical analyses

All statistical analyses were performed with R version 4.0.4 [70]. We assessed statistical relationships with the non-parametric Spearman’s rank correlation and regression models. Linear Models were used for regressions with species data since we did not detect a phylogenetic effect. The structure in the chromosome dataset was accounted for by Linear Mixed Models (LMER) implemented in the ‘lme4’ R package [71] and the phylogenetic structure was tested by fitting the Phylogenetic Generalized Linear Mixed Model (PGLMM) of the ‘phyr’ R package [72]. The phylogenetic time-calibrated supertree used for the covariance matrix was retrieved from the publicly available phylogeny constructed by Smith and Brown [73]. Marginal and conditional R2 values for LMER were estimated with the ‘MuMIn’ R package [74]. Significance of the model parameters was tested with the ‘lmerTest’ R package [75]. We selected the model based on AIC/BIC criteria and diagnostic plots. Reliability and stability of the various models were assessed by checking quantile-quantile plots for the normality of residuals and residuals plotted as a function of fitted values for homoscedasticity. Model quality was checked by the comparison of predicted and observed values. Given the skewed nature of some distributions, we used logarithm (base 10) transformations when appropriate. For comparison between species, statistics were standardized (i.e. by subtracting the mean and dividing by standard deviation). Mean statistics and 95% confidence intervals were estimated by 1,000 bootstrap replicates.
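The last two steps, standardization and bootstrapped confidence intervals, can be illustrated with a short Python sketch. It is an assumption-laden example rather than the authors' R code: the text does not state which bootstrap interval was used, so a percentile interval is assumed here, and the standard deviation is taken to be the sample (n - 1) estimate. Function names and data are hypothetical.

```python
import random
import statistics


def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def bootstrap_mean_ci(values, n_boot=1000, level=0.95, seed=0):
    """Mean with a percentile bootstrap confidence interval (assumed method)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    boot_means = sorted(
        statistics.mean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    alpha = (1.0 - level) / 2.0
    lower = boot_means[int(alpha * n_boot)]
    upper = boot_means[int((1.0 - alpha) * n_boot) - 1]
    return statistics.mean(values), lower, upper


# Hypothetical recombination rates (cM/Mb) for the chromosomes of one species.
rates = [1.2, 0.8, 2.5, 1.9, 3.1, 0.6, 1.4, 2.2, 1.1, 1.7]
print(standardize(rates))
print(bootstrap_mean_ci(rates))
```

Resampling is done with replacement over the observed values, which matches the usual nonparametric bootstrap; other interval types (for example bias-corrected intervals) would need a different final step.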

Supporting information

S1 Fig. Marker positions in genetic distance (cM) as a function of genomic distance (Mb), namely Marey maps, for each chromosome included in the dataset (n = 665 chromosomes).

The black vertical line is the centromere position estimated by cytological measures, when available in the literature.

(PDF)

S2 Fig. Recombination landscapes for each chromosome included in the dataset (n = 665 chromosomes).

Recombination rate (cM/Mb) estimated in windows of 100kb along genomic distances (Mb). Confidence interval at 95% (grey ribbon) estimated by 1,000 bootstraps of loci. The black vertical line is the centromere position estimated by cytological measures, when available in the literature.

(PDF)

S3 Fig. Dataset quality for the 57 species.

The averaged linkage map length (total linkage map length divided by the number of chromosomes, cM) is not correlated with (A) the number of markers (linkage map length ~ log10(number of markers), adjusted R2 = 0.04, p = 0.11), (B) marker density (linkage map length ~ marker density, adjusted R2 = -0.018, p = 0.90) and (C) the progeny size (linkage map length ~ progeny size, adjusted R2 = 0.022, p < 0.32). Regression lines with 95% parametric confidence interval estimated with ggplot2.

(TIF)

S4 Fig. Phylogenetic tree of species in our dataset (n = 57), annotated with mean recombination rate (cM/Mb) and mean chromosome size (Mb).

The supertree was retrieved from the publicly available phylogeny constructed by Smith and Brown [73].

(TIF)

S5 Fig. Slopes of the linear regression within species (linkage map length ~ chromosome size) as a function of the species mean genomic chromosome size (Mb).

(TIF)

S6 Fig. The negative correlation (Spearman’s Rho coefficient) between recombination rates (cM/Mb) and the distance to the nearest telomere is stronger for species with a larger chromosome size (n = 57).

The linear regression line and its parametric 95% confidence interval were estimated in ggplot2. The inset presents the distribution of Spearman’s Rho coefficients for chromosomes (n = 665 chromosomes). The mean correlation and its 95% confidence interval (black solid and dashed lines) were estimated by 1,000 bootstraps. The red vertical line is for a null correlation.

(TIF)

S7 Fig. Standardized recombination rate (cM/Mb) as a function of the relative distance (Mb) from the telomere along the chromosome (physical distances expressed in 20 bins).

Chromosomes were split in halves, a relative distance of 0.5 being the centre of the chromosome, and only one side was randomly sampled to avoid averaging patterns. Then, chromosomes were pooled per species. Each colour is a species. A loess regression was estimated for each species. Species presented in four plots for clarity.

(TIF)

S8 Fig. Standardized gene count as a function of the relative distance (Mb) from the telomere along the chromosome (physical distances expressed in 20 bins).

Chromosomes were split in halves, a relative distance of 0.5 being the centre of the chromosome, and only one side was randomly sampled to avoid averaging patterns. Then, chromosomes were pooled per species. Each colour is a species. A loess regression was estimated for each species. Species presented in four plots for clarity.

(TIF)

S9 Fig. The genetic shuffling $\bar{r}_{intra}$ increases with the size of the genetic map (cM).

Linear mixed regression with a species random effect and its 95% confidence interval estimated by ggplot2 (black line and grey ribbon). Each colour is a species. A linear regression was estimated for each species.

(TIF)

S10 Fig. The genetic shuffling $\bar{r}_{intra}$ decreases with the periphery-bias ratio.

Linear mixed regression with a species random effect and its 95% confidence interval estimated by ggplot2 (black line and grey ribbon). Each colour is a species. A linear regression was estimated for each species.

(TIF)

S11 Fig. Gene count in windows of 100kb along genomic distances (Mb) for each chromosome with gene annotations (n = 480 chromosomes).

Recombination rate (cM/Mb) estimated in windows of 100kb. Loess regression of gene count along the chromosome in blue line with parametric confidence interval at 95% in grey.

(PDF)

S12 Fig. Marey maps with genomic distances (black points) and gene distances (gray points).

Marker positions in genetic distance (cM) as a function of the relative physical distance (either Mb or cumulative number of genes) for each chromosome with gene annotations (n = 480 chromosomes). The black dashed line is a theoretical uniform distribution of markers. The black vertical line is the centromere position estimated by cytological measures, when available in the literature.

(PDF)

S13 Fig. Sensitivity of the periphery-bias ratio to the size of the sampled distal region (i.e. number of bins sampled at the tips).

The periphery-bias ratio was estimated for different numbers of bins sampled and always divided by the mean chromosomal recombination rate. Linear regression (black line) shows a decrease of the periphery-bias ratio as the number of bins increases, towards a ratio value of 1 (dashed line).

(TIF)

S1 Table. Metadata for 665 recombination landscapes, with name of the dataset collected and literal name of the chromosome used in our study, chromosome name in annotation (gff), size of the genetic map (cM, raw and corrected by methods of Chakravarti et al. (1991) or Hall & Willis (2005)), size of the genomic sequence in genome assembly (Mb), number of markers, density of markers in cM and bp, progeny size, mean interval between markers in cM and bp, Gini index, span parameter of the loess function, type of mapping function (Haldane, Kosambi or none), accession of the reference genome used for markers genomic positions, link to data repository and doi reference of the study in which the genetic map was published.

(XLSX)

S2 Table. Flowering plant species included in the study, with authors, year and doi reference of the genetic map publication, and accession of the reference genome.

(XLSX)

S3 Table. Centromeric indexes estimated in cytological studies, with unit of measurement, mean and standard error of long and short chromosome arms, centromeric index (ratio of short arm length divided by total chromosome length), and doi reference to the original study.

(XLSX)

S4 Table. Correlation between recombination landscapes estimated at two different genomic scales (1Mb and 100kb).

Spearman’s Rho coefficient was estimated for each chromosome between recombination rates estimated directly in windows of 1Mb and the mean recombination rate of 100kb windows pooled together in 1Mb windows. Mean of the Spearman’s Rho coefficient among chromosomes and proportion of significant p-values given for each species.

(XLSX)

S5 Table. Selection of the regression model between LM, LMER and PGLMM which explains best the relationship between the mean recombination rate (cM/Mb) and the chromosome size (Mb), based on AIC and BIC criteria.

(XLSX)

S6 Table. Species averaged correlation between the averaged chromosome size (Mb) and the averaged periphery-bias ratio.

Mean of the Spearman’s Rho coefficient among correlations at chromosome scale and proportion of significant p-values given for each species.

(XLSX)

S7 Table. Chromosome correlation between the recombination rate (cM/Mb) and the relative distance to the telomere, with Spearman’s Rho coefficient and p-value of the test per chromosome.

(XLSX)

S8 Table. Species averaged correlation between the recombination rate (cM/Mb) and the relative distance to the telomere.

Mean of the Spearman’s Rho coefficient among correlations at chromosome scale and proportion of significant p-values given for each species.

(XLSX)

S9 Table. Selection of the best model of crossover distribution for each species, based on Adjusted R-Squared between observed values and theoretical values predicted by the model.

The best model selected for each species is the one maximizing the Adjusted R-Squared.

(XLSX)

S10 Table. Selection of the best model of crossover distribution for each species in a subset of chromosomes with at least 50cM on each chromosome arm, based on Adjusted R-Squared between observed values and theoretical values predicted by model.

The best model selected for each species is the one maximizing the Adjusted R-Squared.

(XLSX)

S11 Table. Convergence between crossover patterns and gene patterns at a species scale.

For each species is given the type of crossover pattern, the type of gene count pattern, the difference RMSE(gene pattern)—RMSE(crossover pattern) which indicates how gene patterns are more/less homogeneous than crossover patterns, the homogenization effect of gene patterns (more/less), the difference genetic shuffling(gene pattern)—genetic shuffling(crossover pattern) and the averaged chromosome size (Mb).

(XLSX)

S1 Data. References for linkage map data included in this study.

(PDF)

Acknowledgments

We thank Eric Jenczewski, Laurent Duret, Anne-Marie Chèvre, Eric Petit, Armel Salmon and Bruno Raquillet for valuable comments on the results and manuscript. We thank all the people who provided us with genetic data.

Data Availability

All data needed to reproduce the results presented in the paper are available in a public data repository (https://doi.org/10.17605/OSF.IO/NUXD7).

Funding Statement

SG received funding from the Agence Nationale de la Recherche (ANR HotRec ANR-19-CE12-0019-04). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. de Massy B. Initiation of Meiotic Recombination: How and Where? Conservation and Specificities Among Eukaryotes. Annu Rev Genet. 2013;47: 563–599. doi: 10.1146/annurev-genet-110711-155423
  • 2. Haenel Q, Laurentino TG, Roesti M, Berner D. Meta-analysis of chromosome-scale crossover rate variation in eukaryotes and its significance to evolutionary genomics. Mol Ecol. 2018;27: 2477–2497. doi: 10.1111/mec.14699
  • 3. Mézard C, Tagliaro Jahns M, Grelon M. Where to cross? New insights into the location of meiotic crossovers. Trends Genet. 2015;31: 393–401. doi: 10.1016/j.tig.2015.03.008
  • 4. Stapley J, Feulner PGD, Johnston SE, Santure AW, Smadja CM. Variation in recombination frequency and distribution across eukaryotes: patterns and processes. Philos Trans R Soc B Biol Sci. 2017;372: 20160455. doi: 10.1098/rstb.2016.0455
  • 5. Zickler D, Kleckner N. Recombination, Pairing, and Synapsis of Homologs during Meiosis. Cold Spring Harb Perspect Biol. 2015;7: a016626. doi: 10.1101/cshperspect.a016626
  • 6. Barton NH. A general model for the evolution of recombination. Genet Res. 1995;65: 123–144. doi: 10.1017/s0016672300033140
  • 7. Charlesworth B, Jensen JD. Effects of Selection at Linked Sites on Patterns of Genetic Variability. Annu Rev Ecol Evol Syst. 2021;52: 177–197. doi: 10.1146/annurev-ecolsys-010621-044528
  • 8. Otto SP. The Evolutionary Enigma of Sex. Am Nat. 2009;174: S1–S14. doi: 10.1086/599084
  • 9. Otto SP, Payseur BA. Crossover Interference: Shedding Light on the Evolution of Recombination. Annu Rev Genet. 2019;53: 19–44. doi: 10.1146/annurev-genet-040119-093957
  • 10. Pazhayam NM, Turcotte CA, Sekelsky J. Meiotic Crossover Patterning. Front Cell Dev Biol. 2021;9: 681123. doi: 10.3389/fcell.2021.681123
  • 11. Wang S, Zickler D, Kleckner N, Zhang L. Meiotic crossover patterns: Obligatory crossover, interference and homeostasis in a single process. Cell Cycle. 2015;14: 305–314. doi: 10.4161/15384101.2014.991185
  • 12. Cooper TJ, Garcia V, Neale MJ. Meiotic DSB patterning: A multifaceted process. Cell Cycle. 2016;15: 13–21. doi: 10.1080/15384101.2015.1093709
  • 13. Zelkowski M, Olson MA, Wang M, Pawlowski W. Diversity and Determinants of Meiotic Recombination Landscapes. Trends Genet. 2019;35: 359–370. doi: 10.1016/j.tig.2019.02.002
  • 14. Choi K, Zhao X, Tock AJ, Lambing C, Underwood CJ, Hardcastle TJ, et al. Nucleosomes and DNA methylation shape meiotic DSB frequency in Arabidopsis thaliana transposons and gene regulatory regions. Genome Res. 2018;28: 532–546. doi: 10.1101/gr.225599.117
  • 15. He Y, Wang M, Dukowic-Schulze S, Zhou A, Tiang C-L, Shilo S, et al. Genomic features shaping the landscape of meiotic double-strand-break hotspots in maize. Proc Natl Acad Sci. 2017;114: 12231–12236. doi: 10.1073/pnas.1713225114
  • 16. Marand AP, Zhao H, Zhang W, Zeng Z, Fang C, Jiang J. Historical Meiotic Crossover Hotspots Fueled Patterns of Evolutionary Divergence in Rice. Plant Cell. 2019;31: 645–662. doi: 10.1105/tpc.18.00750
  • 17. Webster MT, Hurst LD. Direct and indirect consequences of meiotic recombination: implications for genome evolution. Trends Genet. 2012;28: 101–109. doi: 10.1016/j.tig.2011.11.002
  • 18. Booker TR, Ness RW, Keightley PD. The Recombination Landscape in Wild House Mice Inferred Using Population Genomic Data. Genetics. 2017;207: 297–309. doi: 10.1534/genetics.117.300063
  • 19. Comeron JM. Background selection as null hypothesis in population genomics: insights and challenges from Drosophila studies. Philos Trans R Soc B Biol Sci. 2017;372: 20160471. doi: 10.1098/rstb.2016.0471
  • 20. Kuo P, Da Ines O, Lambing C. Rewiring Meiosis for Crop Improvement. Front Plant Sci. 2021;12: 708948. doi: 10.3389/fpls.2021.708948
  • 21. Tiley GP, Burleigh JG. The relationship of recombination rate, genome structure, and patterns of molecular evolution across angiosperms. BMC Evol Biol. 2015;15: 194. doi: 10.1186/s12862-015-0473-3
  • 22. Blitzblau HG, Bell GW, Rodriguez J, Bell SP, Hochwagen A. Mapping of Meiotic Single-Stranded DNA Reveals Double-Strand-Break Hotspots near Centromeres and Telomeres. Curr Biol. 2007;17: 2003–2012. doi: 10.1016/j.cub.2007.10.066
  • 23. Apuli R-P, Bernhardsson C, Schiffthaler B, Robinson KM, Jansson S, Street NR, et al. Inferring the Genomic Landscape of Recombination Rate Variation in European Aspen (Populus tremula). G3 GenesGenomesGenetics. 2020;10: 299–309. doi: 10.1534/g3.119.400504
  • 24. Yelina NE, Choi K, Chelysheva L, Macaulay M, de Snoo B, Wijnker E, et al. Epigenetic Remodeling of Meiotic Crossover Frequency in Arabidopsis thaliana DNA Methyltransferase Mutants. Barsh GS, editor. PLoS Genet. 2012;8: e1002844. doi: 10.1371/journal.pgen.1002844
  • 25. Kent TV, Uzunović J, Wright SI. Coevolution between transposable elements and recombination. Philos Trans R Soc B Biol Sci. 2017;372: 20160458. doi: 10.1098/rstb.2016.0458
  • 26. Wang Y, Copenhaver GP. Meiotic Recombination: Mixing It Up in Plants. Annu Rev Plant Biol. 2018;69: 577–609. doi: 10.1146/annurev-arplant-042817-040431
  • 27. Pellicer J, Hidalgo O, Dodsworth S, Leitch I. Genome Size Diversity and Its Impact on the Evolution of Land Plants. Genes. 2018;9: 88. doi: 10.3390/genes9020088
  • 28. Soltis PS, Marchant DB, Van de Peer Y, Soltis DE. Polyploidy and genome evolution in plants. Curr Opin Genet Dev. 2015;35: 119–125. doi: 10.1016/j.gde.2015.11.003
  • 29. Gaut BS, Wright SI, Rizzon C, Dvorak J, Anderson LK. Recombination: an underappreciated factor in the evolution of plant genomes. Nat Rev Genet. 2007;8: 77–84. doi: 10.1038/nrg1970
  • 30. Hall MC, Willis JH. Transmission Ratio Distortion in Intraspecific Hybrids of Mimulus guttatus: Implications for Genomic Divergence. Genetics. 2005;170: 375–386. doi: 10.1534/genetics.104.038653
  • 31. Veller C, Kleckner N, Nowak MA. A rigorous measure of genome-wide genetic shuffling that takes into account crossover positions and Mendel’s second law. Proc Natl Acad Sci. 2019;116: 1659–1668. doi: 10.1073/pnas.1817482116
  • 32. Ferreira MTM, Glombik M, Perničková K, Duchoslav M, Scholten O, Karafiátová M, et al. Direct evidence for crossover and chromatid interference in meiosis of two plant hybrids (Lolium multiflorum×Festuca pratensis and Allium cepa×A. roylei). Wilson Z, editor. J Exp Bot. 2021;72: 254–267. doi: 10.1093/jxb/eraa455
  • 33. Foss E, Lande R, Stahl F, Steinberg C. Chiasma interference as a function of genetic distance. Genetics. 1993;133: 681–691. doi: 10.1093/genetics/133.3.681
  • 34. Capilla-Pérez L, Durand S, Hurel A, Lian Q, Chambon A, Taochy C, et al. The synaptonemal complex imposes crossover interference and heterochiasmy in Arabidopsis. Proc Natl Acad Sci. 2021;118: e2023613118. doi: 10.1073/pnas.2023613118
  • 35. Kleckner N, Zickler D, Jones GH, Dekker J, Padmore R, Henle J, et al. A mechanical basis for chromosome function. Proc Natl Acad Sci. 2004;101: 12592–12597. doi: 10.1073/pnas.0402724101
  • 36. Lloyd A, Jenczewski E. Modelling Sex-Specific Crossover Patterning in Arabidopsis. Genetics. 2019;211: 847–859. doi: 10.1534/genetics.118.301838
  • 37. Falque M, Mercier R, Mézard C, de Vienne D, Martin OC. Patterns of Recombination and MLH1 Foci Density Along Mouse Chromosomes: Modeling Effects of Interference and Obligate Chiasma. Genetics. 2007;176: 1453–1467. doi: 10.1534/genetics.106.070235
  • 38. Zhang L, Liang Z, Hutchinson J, Kleckner N. Crossover Patterning by the Beam-Film Model: Analysis and Implications. Hawley RS, editor. PLoS Genet. 2014;10: e1004042. doi: 10.1371/journal.pgen.1004042
  • 39. Bishop DK, Zickler D. Early decision; meiotic crossover interference prior to stable strand exchange and synapsis. Cell. 2004;117: 9–15. doi: 10.1016/s0092-8674(04)00297-1
  • 40. Higgins JD, Perry RM, Barakate A, Ramsay L, Waugh R, Halpin C, et al. Spatiotemporal Asymmetry of the Meiotic Program Underlies the Predominantly Distal Distribution of Meiotic Crossovers in Barley. Plant Cell. 2012;24: 4096–4109. doi: 10.1105/tpc.112.102483
  • 41. Hinch AG, Zhang G, Becker PW, Moralli D, Hinch R, Davies B, et al. Factors influencing meiotic recombination revealed by whole-genome sequencing of single sperm. Science. 2019;363: eaau8861. doi: 10.1126/science.aau8861
  • 42. Osman K, Algopishi U, Higgins JD, Henderson IR, Edwards KJ, Franklin FCH, et al. Distal Bias of Meiotic Crossovers in Hexaploid Bread Wheat Reflects Spatio-Temporal Asymmetry of the Meiotic Program. Front Plant Sci. 2021;12: 631323. doi: 10.3389/fpls.2021.631323
  • 43. Ellermeier C, Higuchi EC, Phadnis N, Holm L, Geelhood JL, Thon G, et al. RNAi and heterochromatin repress centromeric meiotic recombination. Proc Natl Acad Sci. 2010;107: 8701–8705. doi: 10.1073/pnas.0914160107
  • 44. Fernandes JB, Wlodzimierz P, Henderson IR. Meiotic recombination within plant centromeres. Curr Opin Plant Biol. 2019;48: 26–35. doi: 10.1016/j.pbi.2019.02.008
  • 45. Hartmann M, Umbanhowar J, Sekelsky J. Centromere-Proximal Meiotic Crossovers in Drosophila melanogaster Are Suppressed by Both Highly Repetitive Heterochromatin and Proximity to the Centromere. Genetics. 2019;213: 113–125. doi: 10.1534/genetics.119.302509
  • 46. Pan J, Sasaki M, Kniewel R, Murakami H, Blitzblau HG, Tischfield SE, et al. A Hierarchical Combination of Factors Shapes the Genome-wide Topography of Yeast Meiotic Recombination Initiation. Cell. 2011;144: 719–731. doi: 10.1016/j.cell.2011.02.009
  • 47. Khrustaleva LI, de Melo PE, van Heusden AW, Kik C. The Integration of Recombination and Physical Maps in a Large-Genome Monocot Using Haploid Genome Analysis in a Trihybrid Allium Population. Genetics. 2005;169: 1673–1685. doi: 10.1534/genetics.104.038687
  • 48. Jo J, Purushotham PM, Han K, Lee H-R, Nah G, Kang B-C. Development of a Genetic Map for Onion (Allium cepa L.) Using Reference-Free Genotyping-by-Sequencing and SNP Assays. Front Plant Sci. 2017;8: 1606. doi: 10.3389/fpls.2017.01606
  • 49. Shahin A, Arens P, Van Heusden AW, Van Der Linden G, Van Kaauwen M, Khan N, et al. Genetic mapping in Lilium: mapping of major genes and quantitative trait loci for several ornamental traits and disease resistances. Plant Breed. 2011;130: 372–382. doi: 10.1111/j.1439-0523.2010.01812.x
  • 50. Lambing C, Kuo PC, Tock AJ, Topp SD, Henderson IR. ASY1 acts as a dosage-dependent antagonist of telomere-led recombination and mediates crossover interference in Arabidopsis. Proc Natl Acad Sci. 2020;117: 13647–13658. doi: 10.1073/pnas.1921055117
  • 51. Dukić M, Bomblies K. Male and female recombination landscapes of diploid Arabidopsis arenosa. Birchler J, editor. Genetics. 2022;220: iyab236. doi: 10.1093/genetics/iyab236
  • 52. Sardell JM, Kirkpatrick M. Sex Differences in the Recombination Landscape. Am Nat. 2020;195: 361–379. doi: 10.1086/704943
  • 53. Johnston SE, Bérénos C, Slate J, Pemberton JM. Conserved Genetic Architecture Underlying Individual Recombination Rate Variation in a Wild Population of Soay Sheep (Ovis aries). Genetics. 2016;203: 583–598. doi: 10.1534/genetics.115.185553
  • 54. Lenormand T, Engelstädter J, Johnston SE, Wijnker E, Haag CR. Evolutionary mysteries in meiosis. Philos Trans R Soc B Biol Sci. 2016;371: 20160001. doi: 10.1098/rstb.2016.0001
  • 55. Kianian PMA, Wang M, Simons K, Ghavami F, He Y, Dukowic-Schulze S, et al. High-resolution crossover mapping reveals similarities and differences of male and female recombination in maize. Nat Commun. 2018;9: 2370. doi: 10.1038/s41467-018-04562-5
  • 56. Melamed-Bessudo C, Shilo S, Levy AA. Meiotic recombination and genome evolution in plants. Curr Opin Plant Biol. 2016;30: 82–87. doi: 10.1016/j.pbi.2016.02.003
  • 57. Shi T, Rahmani RS, Gugger PF, Wang M, Li H, Zhang Y, et al. Distinct Expression and Methylation Patterns for Genes with Different Fates following a Single Whole-Genome Duplication in Flowering Plants. Wright S, editor. Mol Biol Evol. 2020;37: 2394–2413. doi: 10.1093/molbev/msaa105
  • 58. Wei C, Yang H, Wang S, Zhao J, Liu C, Gao L, et al. Draft genome sequence of Camellia sinensis var. sinensis provides insights into the evolution of the tea genome and tea quality. Proc Natl Acad Sci. 2018;115: E4151–E4158. doi: 10.1073/pnas.1719622115
  • 59. Lukaszewski AJ. Unexpected behavior of an inverted rye chromosome arm in wheat. Chromosoma. 2008;117: 569–578. doi: 10.1007/s00412-008-0174-4
  • 60. Lukaszewski AJ, Kopecky D, Linc G. Inversions of chromosome arms 4AL and 2BS in wheat invert the patterns of chiasma distribution. Chromosoma. 2012;121: 201–208. doi: 10.1007/s00412-011-0354-5
  • 61. Charlesworth B, Sniegowski P, Stephan W. The evolutionary dynamics of repetitive DNA in eukaryotes. Nature. 1994;371: 215–220. doi: 10.1038/371215a0
  • 62. Underwood CJ, Choi K. Heterogeneous transposable elements as silencers, enhancers and targets of meiotic recombination. Chromosoma. 2019;128: 279–296. doi: 10.1007/s00412-019-00718-4
  • 63. Nachman MW, Payseur BA. Recombination rate variation and speciation: theoretical predictions and empirical results from rabbits and mice. Philos Trans R Soc B Biol Sci. 2012;367: 409–421. doi: 10.1098/rstb.2011.0249
  • 64. Ritz KR, Noor MAF, Singh ND. Variation in Recombination Rate: Adaptive or Not? Trends Genet. 2017;33: 364–374. doi: 10.1016/j.tig.2017.03.003
  • 65. Mayer KFX, Martis M, Hedley PE, Šimková H, Liu H, Morris JA, et al. Unlocking the Barley Genome by Chromosomal and Comparative Genomics. Plant Cell. 2011;23: 1249–1263. doi: 10.1105/tpc.110.082537
  • 66. Pan Q, Li L, Yang X, Tong H, Xu S, Li Z, et al. Genome-wide recombination dynamics are associated with phenotypic variation in maize. New Phytol. 2016;210: 1083–1094. doi: 10.1111/nph.13810
  • 67. Nam K, Ellegren H. Recombination Drives Vertebrate Genome Contraction. Petrov DA, editor. PLoS Genet. 2012;8: e1002680. doi: 10.1371/journal.pgen.1002680
  • 68. Rezvoy C, Charif D, Gueguen L, Marais GAB. MareyMap: an R-based tool with graphical interface for estimating recombination rates. Bioinformatics. 2007;23: 2188–2189. doi: 10.1093/bioinformatics/btm315
  • 69. White IMS, Hill WG. Effect of heterogeneity in recombination rate on variation in realised relationship. Heredity. 2020;124: 28–36. doi: 10.1038/s41437-019-0241-z
  • 70. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2019. Available from: https://www.R-project.org/
  • 71. Bates D, Mächler M, Bolker B, Walker S. Fitting Linear Mixed-Effects Models Using lme4. J Stat Softw. 2015;67. doi: 10.18637/jss.v067.i01
  • 72. Ives A, Dinnage R, Nell LA, Helmus M, Li D. phyr: Model based phylogenetic analysis. 2019. Available from: https://CRAN.R-project.org/package=phyr
  • 73. Smith SA, Brown JW. Constructing a broadly inclusive seed plant phylogeny. Am J Bot. 2018;105: 302–314. doi: 10.1002/ajb2.1019
  • 74. Bartoń K. MuMIn: Multi-model inference. 2020. Available from: https://CRAN.R-project.org/package=MuMIn
  • 75. Kuznetsova A, Brockhoff PB, Christensen RHB. lmerTest package: Tests in linear mixed effects models. J Stat Softw. 2017;82: 1–26. doi: 10.18637/jss.v082.i13

Decision Letter 0

Kirsten Bomblies, Ian R Henderson

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

26 Apr 2022

Dear Dr Brazier,

Thank you very much for submitting your Research Article entitled 'Diversity and determinants of recombination landscapes in flowering plants' to PLOS Genetics.

The manuscript was fully evaluated at the editorial level and by three independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the current manuscript. Based on the reviews, we will not be able to accept this version of the manuscript, but we would be willing to review a much-revised version. We cannot, of course, promise publication at that time.

Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by each reviewer. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments are included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, use the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.

Yours sincerely,

Ian Henderson

Associate Editor

PLOS Genetics

Kirsten Bomblies

Section Editor: Evolution

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: It is now possible to analyse genetic and physical maps of organisms, as the literature now contains suitable data from multiple species. This manuscript analyses such data from larger numbers of plant species than previous studies (55 species, 5-26 chromosomes per species), and describes results for the broad-scale recombination landscapes. However, it does not actually use the analyses to ask interesting questions, which I had hoped to see. Some questions might include the following:

Do chromosome arms have an obligate crossover?

How often do chromosome arms have multiple crossovers, versus a single one (as I believe is the case in C. elegans)?

Do related species differ? This is an important question, as it bears on whether genetic recombination is sometimes selectively favoured, leading to higher crossover numbers than required for correct segregation, and for repair mechanisms to occur. This question is discussed near the very end of the text, but is not mentioned as a question earlier, making it appear that the ms is entirely descriptive. The ms does not seem to mention that some of the species studied are close relatives, and that this can be helpful in studying such questions.

How large are pericentromeric regions with low recombination rates in plants, and how much do they differ between related species?

Do selfers have higher recombination rates per physical length of chromosome than closely related outcrossers?

Are recombination rates the same in male versus female meiosis? This is finally mentioned in line 574, but it is not made clear until then that the data analysed are sex-averaged rates.

Instead, the ms presents rather dull statistical analyses. The results have value, but they appear mainly to confirm findings that were already well established, and the ms does not make very clear what new findings now emerge, or show what we can now understand from the results that was not already known. More than once in the text “new insights” are claimed, but it is difficult to find them, partly because of the length of the text, which is also long-winded and repetitive in several places. These problems could be ameliorated by outlining in the Introduction what questions the authors set out to study. As written, this section gives the impression that their aims were purely descriptive, which is not an encouragement to read the text. The ms also tries to interest the reader by making claims to novelty, rather than describing some interesting questions. For example, I feel that it is too strong to say that “the broad diversity of recombination landscapes among plants has rarely been investigated… and the diversity of the resulting landscapes among species and chromosomes still need[s] to be assessed”, although a formal comparative genomic approach may be new and valuable. A further value from analysing more species is that exceptions to accepted generalisations may be detected, and this study did produce a few examples of such exceptions. Overall, I doubt that readers need a lengthy Introduction to tell them that recombination patterns are interesting in relation to evolution, including evolution of patterns in genomes, such as regions with different repetitive sequence density, and consequently gene density, and with differences in GC content. A shorter Introduction could give a better idea of what is new from this study.

At least several of the conclusions are just confirmations of what was already known. The following examples illustrate this problem, and my comments also include some other issues for some of them. (A recurring problem throughout the text is poor writing, including long-winded writing that makes the meaning hard to understand. I provide some examples in my ‘Minor comments’ below, but these are still important comments that require revisions of the text, including a suggestion that some species may have too little information to be used.) It would be helpful to show the numbers of markers mapped in Figure 4. In addition, if the numbers are small, presumably the total genetic map lengths are unreliable, and it is not explained prominently whether any attempt was made to check for this problem.

1. “We observed that the bias towards the periphery was not ubiquitous across species” and “Only a subset of species, especially those with larger chromosomes, exhibited a clear bias”. These conclusions are quite similar to that of Haenel et al. (2018) that a distal bias is “universal for chromosomes larger than 30 Mb” (note the incorrect English “concluded to a distal bias”). The main advance seems to be that this study finds that Nelumbo nucifera and Camellia sinensis are exceptions to this pattern, with the highest recombination rates found in the middle of their chromosomes.

The result is described in a rather unhelpful manner, without taking chromosome morphology into account. The text states that, for larger chromosomes, crossovers tend to occur (not “accumulate”) at the ends of chromosomes, while the central regions have fewer. However, this would be correct only for metacentrics, and the centres of chromosomes presumably means centromeric and pericentromeric regions, but this is not made clear. It is also not made clear that these are completely recombination-free regions.

The extent of a larger pericentromeric region (meaning, the extent of the wider region surrounding or adjacent to the centromere) is known to vary greatly between species, but it is not well described in the ms, and only examples are shown, with rather subjective criteria to define the different regions. It would, in principle, be possible to define them less subjectively, though this might not be easy. At least, it would be good to mention whether this was attempted. A further problem is that regions are shown in figures, rather than tables giving estimates of genome region sizes and recombination rates, and as relative sizes are often used, it is difficult to understand what sizes of pericentromeric regions (for example) are found in plants.

It is also not a new discovery that low recombination regions tend to have low gene density. The Discussion acknowledges this, but it is strange to first describe this as if it is a new result, only to later mention that it is not. If the Introduction had laid out some questions, this could be avoided. Problems like this also make the text longer than necessary.

2. Recombination is unevenly distributed in genomes. Therefore one should not write that “We showed that” this is the case. One can write “We confirmed that” (or something similar). This text also uses vague terminology “how genetic variation is shuffled during meiosis”, but the word recombination already exists, so it would be better to be precise. If at some point the meaning is gene conversion, this should be used. However, I think that the text mentions conversion only in passing, and it is not considered seriously.

In line 538, I am not sure why the word “prediction” is used (In addition to the role of centromeres, we also observed a departure from the prediction that recombination rates should decrease with the distance to the tip of the chromosome, showing that the distal model is not generally found among plants). Is this really a prediction, or are you trying to say that you did not confirm the view that this pattern is shared by all plants? If so, references are needed to assertions that all plants share this pattern.

The Discussion section need not repeat so much of the results. It might also mention that recombination rates vary between individuals of the same species, including from the effects of rearrangements, especially inversions, so it would be good to mention that the data are currently often from just a single maternal and paternal parental individual of each species (for selfers, perhaps just a single parental individual). Hotspots should also be mentioned, if only to make clear that this study did not attempt to detect them.

Line 474: It is proposed that in angiosperms crossovers may be initiated in gene regulatory sequences, and it is suggested that this “sheds new light on the evolution of recombination landscapes”, but without saying what new light is shed, other than this suggestion. The suggestion is not evaluated further, and I did not understand if it is a speculation, based on the correlation between recombination and gene density mentioned in this paragraph (or on some other observations). However, based on later text (line 613), I suspect that the intended meaning is that the results are consistent with such a proposal that was already published by others.

However, as the correlation must be strongly affected by the lower gene densities in genome regions with low recombination rates, which lead to accumulation of transposable elements and other repetitive sequences, it would seem difficult to disentangle this from the suggested mechanism. Line 628 states that “The positive association of COs and gene regulatory sequences, including fine-scale correlations, appears more robust”, which is too vague. It seems unlikely that the effect is stronger than the very marked and consistent effect of low recombination rates on repetitive sequence density (although of course different elements are involved in different cases).

Regions with high recombination rates may, however, allow patterns in crossover localisation to be detectable, and I believe that this has been studied, for example in maize (e.g. papers by Dooner and colleagues) and also in Mimulus guttatus (see the paper by Hellsten et al. cited above). Line 621 finally mentions the problem of other correlated factors. I think that the authors should revise their text so that it does not first set up an untestable idea and then mention that it is untestable. Instead, it will be preferable to set up some interesting questions early in the text, tell readers what is currently known, and then describe analyses that help understand things better than before.

Dooner, H., & He, L. (2008). Maize genome structure variation: Interplay between retrotransposon polymorphisms and genic recombination. Plant Cell, 20(2), 249-258. doi:10.1105/tpc.107.057596

Fengler, K., Allen, S. M., Li, B., & Rafalski, A. (2007). Distribution of genes, recombination, and repetitive elements in the maize genome. Crop Science, 47(Supplement), S-83-S-95.

Yao, H., Zhou, Q., Li, J., Smith, H., Yandeau, M., Nikolau, B. J., & Schnable, P. S. (2002). Molecular characterization of meiotic recombination across the 140-kb multigenic a1-sh2 interval of maize. Proceedings of the National Academy of Sciences of the USA, 99, 6157-6162.

Tenaillon, M. I., Sawkins, M. C., Anderson, L. K., Stack, S. M., Doebley, J. F., & Gaut, B. S. (2002). Patterns of diversity and recombination along chromosome 1 of maize (Zea mays ssp. mays L.). Genetics, 162, 1401-1413.

Another comment that applies throughout the text is that recent papers are cited for concepts and understanding that are not new. In such cases, the text should make clear that the citation is to a review paper. For example, the text gives the impression that Marand et al. (2019) discovered that gene density and recombination rates are both correlated with transposable elements (meaning densities of transposable elements). This has been known for a long time, and was reviewed in 1994 by Charlesworth et al. (Nature, 371, 215-220. doi:10.1038/371215a0).

In first mentioning heterochiasmy, it seems strange not to mention whether the papers cited refer to plants or just to studies in animal species. It is explained later that Melamed-Bessudo et al. (2016) showed that it is not universal in plants, but the text does not explain what the term might mean in plants, and that hermaphrodites may have different crossover patterns in male and female meiosis, so readers may be puzzled.

3. Similarly, I was surprised to read that “We were intrigued to notice that [within species] the chromosome-wide recombination rate is proportional to the relative size of the chromosome”. I was under the impression that this was already known.

It is illustrated in Figure 2D, which shows the new results, which are potentially interesting, as they relate to the question of how often arms have multiple crossovers. This figure analyses the excess of crossovers (defined as the linkage map length minus the 50 cM expected if one crossover per arm is obligate), and shows that it correlates positively with the chromosomes’ physical sizes divided by the average chromosome size for the species, which they term the “relative chromosome size”. Such an effect is not a new result.
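The two quantities compared in Figure 2D, as described in this comment, can be sketched as follows. This is only a minimal illustration, assuming linkage map lengths in cM and physical sizes in Mb; the function names and the toy values are hypothetical and are not taken from the manuscript.

```python
from statistics import mean

# 50 cM is the map length corresponding to a single crossover.
OBLIGATE_CM = 50.0


def crossover_excess(map_length_cm):
    """Linkage map length (cM) beyond the 50 cM baseline."""
    return map_length_cm - OBLIGATE_CM


def relative_sizes(sizes_mb):
    """Each chromosome's physical size divided by the species' mean chromosome size."""
    avg = mean(sizes_mb)
    return [s / avg for s in sizes_mb]


# Hypothetical species with three chromosomes (illustrative values only).
sizes_mb = [20.0, 30.0, 40.0]
maps_cm = [80.0, 100.0, 120.0]

rel = relative_sizes(sizes_mb)                   # mean relative size is 1 by construction
excess = [crossover_excess(m) for m in maps_cm]  # excess map length per chromosome, in cM
```

A per-arm version of the same calculation would additionally need centromere positions, which this sketch does not assume.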

However, as I understand it, an obligate crossover is expected on each arm. If so, the number of excess crossovers, in addition to this one, should be analysed per arm. Even if my recollection about this is incorrect, the text should make clear what is known from previous studies, and why the present study uses chromosome, not arm, lengths. Line 136 mentions that the centromeric index was known for the chromosomes of 37 species, but then it remains unclear how these data were used, and also whether results can be used from the species where no such data were available. Line 285 mentions that recombination rates were negatively correlated with the distance to the nearest telomere, which seems to suggest that metacentrics may have been analysed as such, but I could not see this clearly explained.

Line 300 states that (in my wording) the centromere regions almost universally showed low recombination rates, but this is not completely clear in Figure 4, where large low recombination rate regions in several species, for example Vigna unguiculata, appear not to overlap the centromeres. If this is a real biological observation, the statement seems incorrect.

Given these possible problems with the data, I was not convinced of the value of the formal modelling analysis of the effect of the centromere in suppressing recombination, and the comparison with less simple models that suggest that telomeres may also affect patterns. Such effects are plausible, but I feel that some of these plant data do not add valuable and solid support.

Another weakness is the lack of any mention of differences between male and female meiosis, and another is the lack of any mention of outcrossing rates.

I wondered why these papers were not cited, or other papers about Arabidopsis lyrata or A. halleri, which may have genetic map information.

Hellsten, U., Wright, K. M., Jenkins, J., Shu, S., Yuan, Y., Wessler, S. R., . . . Rokhsar, D. S. (2013). Fine-scale variation in meiotic recombination in Mimulus inferred from population shotgun sequencing. Proceedings of the National Academy of Sciences of the United States of America, 110(48), 19478–19482. doi:10.1073/pnas.1319032110

Kawabe, A., Hansson, B., Forrest, A., Hagenblad, J., & Charlesworth, D. (2006). Comparative gene mapping in Arabidopsis lyrata chromosomes 6 and 7 and A. thaliana chromosome IV: evolutionary history, rearrangements and local recombination rates. Genetical Research, 88, 45-56.

Hansson, B., Kawabe, A., Preuss, S., Kuittinen, H., & Charlesworth, D. (2006). Comparative gene mapping in Arabidopsis lyrata chromosomes 1 and 2 and the corresponding A. thaliana chromosome 1: recombination rates, rearrangements and centromere location. Genetical Research, 87(2), 75-85. doi:10.1017/S0016672306008287

Minor problems with the English, or vague wording or unclear statements

1. In English, it should be “correlated with” (not “to”).

2. The word ‘drive’ should be avoided, as it is very vague. For example, the meaning is not clear in the phrase “Chromosome length drives the basal recombination rate for each species”

3. In line 182, it should read “regression lines for species with at least 5 chromosomes mapped (5-26 chromosomes per species, 55 species)”.

4. Line 232 Genomic distances (Mb) were scaled between 0 and 1 (divided by chromosome size) to compare chromosomes with different sizes.

5. It is difficult to make out the meaning of the text starting in line 247. I think it means the following: “Each chromosome was divided in (it should read “into”) ten bins, each one 10th of the chromosome’s total physical size.” The relative recombination rate is the log-transformed ratio of the expected relative genetic length (one tenth, presumably of the total genetic length) divided by the observed relative genetic length of the bin (presumably meaning the proportion of the total genetic length represented by the physical region in question). Values below zero correspond to recombination rates lower than expected under a random distribution of crossovers across the physical chromosome. Also difficult to understand is “Chromosome sizes (Mb) on the left correspond to each broken stick chromosome”; maybe it means “each chromosome”. The same applies (in line 244) to “Relative recombination rates along the chromosome were estimated in ten bins using the broken stick model.”

6. In English, one needs to say “divided into” (not “in”). Also “pooled into” (although this reads awkwardly in English), and line 140 might be better as “the Spearman rank correlation coefficient between the values for 1 Mb windows and those for the 100 kb windows within them was ….”.

7. The word “linkage” in genetics means that the variants are linked. It should be distinguished from “linkage disequilibrium” (LD), which refers to associations between two or more linked variants. Line 57 should be corrected, as the text refers to the latter, but uses the former (“Recombination…. breaking the linkage between neighbouring sites and creating new genetic combinations”). The sites remain linked, but not in LD. The sentence is also confusing by adding “upon which selection can act”, because selection acts on single variants, and the authors are trying to say that new genetic combinations might be more (or less) favoured by selection than the non-recombinant combinations (in other words, the different variants may interact in their effect on fitness).

8. It is a sweeping statement to say that “Plant genomes contain large regions with suppressed recombination”, depending strongly on how many plants have good data on physical and genetic maps, so line 92 ought to mention the number on which this is based, and give readers at least a rough idea of what is meant by “large”. There is no need to add the obvious remark that this impacts genomic averages (in addition, “impact” is the wrong word, as the meaning is that it affects the average; of course the average depends on the values in all genome regions that are included in the data, so it is not worth saying explicitly).

9. Phrases that are unnecessary (such as “it seems that” in line 93) should be pruned out, so that the text is easier to read. There are quite a few such instances, and I do not comment on all of them. The beginning of the Results section, for example, could be written more briefly and clearly.

We retrieved publicly available data for linkage maps and genome assemblies, to obtain genetic map distances and physical distances. We used linkage maps with marker positions in chromosome-level genome assemblies (except for Capsella rubella, which had a high-quality scaffold-level assembly of pseudo-chromosomes). After filtering based on the marker numbers, densities, and genome coverage, and after filtering out the outlying markers (maybe meaning outlier markers by a criterion that needs to be explained), we produced 665 Marey maps (reference needed) for 57 species (2-26 chromosomes per species); marker numbers per chromosome (or perhaps the authors mean per species, in which case perhaps some species have too little information to be used) ranged from 31 to 49,483.
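The binned relative recombination rate discussed in minor comment 5 can be illustrated with a short sketch. It rests on several assumptions that are not confirmed by the manuscript: markers are sorted with strictly increasing physical positions, the genetic position at each bin boundary is obtained by linear interpolation, the logarithm is base 10, and the ratio is taken as observed over expected so that values below zero mean less recombination than expected, as the comment states. All names and values are hypothetical.

```python
import math


def interpolate_cm(pos_mb, marker_mb, marker_cm):
    """Linearly interpolate the genetic position (cM) at a physical position (Mb)."""
    if pos_mb <= marker_mb[0]:
        return marker_cm[0]
    if pos_mb >= marker_mb[-1]:
        return marker_cm[-1]
    for i in range(1, len(marker_mb)):
        if pos_mb <= marker_mb[i]:
            x0, x1 = marker_mb[i - 1], marker_mb[i]
            y0, y1 = marker_cm[i - 1], marker_cm[i]
            return y0 + (y1 - y0) * (pos_mb - x0) / (x1 - x0)


def relative_rates(marker_mb, marker_cm, chrom_size_mb, n_bins=10):
    """log10(observed / expected) share of the genetic map in equal physical bins."""
    total_cm = marker_cm[-1] - marker_cm[0]
    expected = 1.0 / n_bins  # each bin holds 1/n_bins of the map if crossovers are uniform
    rates = []
    for b in range(n_bins):
        start = chrom_size_mb * b / n_bins
        end = chrom_size_mb * (b + 1) / n_bins
        bin_cm = (interpolate_cm(end, marker_mb, marker_cm)
                  - interpolate_cm(start, marker_mb, marker_cm))
        observed = bin_cm / total_cm
        # A bin with no recombination has no finite log ratio.
        rates.append(math.log10(observed / expected) if observed > 0 else float("-inf"))
    return rates


# Hypothetical 100 Mb chromosome: 10 cM in the first half, 90 cM in the second half.
skewed = relative_rates([0.0, 50.0, 100.0], [0.0, 10.0, 100.0], 100.0)
# A uniform map for comparison: every bin should sit at zero.
uniform = relative_rates([0.0, 100.0], [0.0, 100.0], 100.0)
```

In the skewed example the first five bins come out negative and the last five positive, which is the reading of “values below zero” given in the comment.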


Reviewer #2: In this paper, the authors seek to decipher genomic patterns of recombination across a large (57 species) dataset of sequenced plant genomes coupled with genetic maps. Their meta-analyses lead to several novel observations.

I thoroughly enjoyed reading this manuscript, and I congratulate the authors on a really fine paper. It will be, in my view, a very welcome addition to the literature. In the surest sign of flattery, I’m jealous that I did not think of doing such a neat analysis.

Accordingly, I have only minor comments that the authors may wish to address in revision. Most of the comments are very minor, indeed. They are offered both as an attempt to clarify the few areas of the text that I found difficult to digest and probably out of an abundance of enthusiasm for this work. I leave it to the authors to decide if my suggestions offer improvements or are better ignored…

Minor Comments:

- Line 48 – Unlike most of the rest of the paper, I found this sentence hard to read and digest. Reword, rework or shorten? Btw I’d use “in” instead of “to” (“in the production”)

- Line 79 – This last sentence of the paragraph is really indirect and therefore pretty tough to read. I’m not really sure what manipulations are being considered here… Rewrite?

- Line 100 – as a reader, I found that a better link between the two sentences on this line could have been helpful. Maybe something as simple as “Haenel et al. considered chromosome length, found blah blah blah and suggested a simpler telomere-led model. That model included a universal bias…”

- Line 118 – I’d use “about” instead of “on”

- Line 125 – If this is reasonable, I’d love to see the filter characteristics hinted at here, even though there is a good description in the methods. That is something like “… marker density (at least 50 per chromosome), genome coverage (blah blah)”

- Line 701 – I’m a bit confused by what was done when marker sequences were not available and also how many species fell into this category. I’m not concerned at all – this is a careful study – but it’d be nice to understand better.

- Figure 2 – It might be helpful to have X-axis say “Mean chromosome size” where appropriate (e.g., Figure 2C and B). The legend is very clear, though.

- I love Figure 3. It blows my mind how consistent the patterns are between the dashed lines (genome wide) and an individual chromosome. It is bizarre and neat and thought provoking. It might be nice to report mean chromosome size (in the legend or in the figure), given that the species are ordered in that manner. It just makes me curious…

- Figure 4: since patterns seem to be correlated with mean chromosome size, would it be worth adding that value after each species name? As a reader, it would help me to see the pattern and better digest the text from ~lines 213 to 227 and figure 5a, etc.

- Figure 5A – this may make the graph too crowded, but it’d be nice to be able to compare dots in 5A to figure 4. So, it’d be nice to have the dots labelled. If that is too much, the authors might want to consider labelling a few species (e.g., the six in figure 3 or some of the species mentioned in lines 220 to 227). Personally, I’d love to know what the outliers are in this graph!

- Lines 284 and following. It’d be nice to cite Figure 6A and Figure 6B separately after the word descriptions of the patterns.

- Line 296. I could not figure out what the “species correlation” referred to. Sorry if I missed this, but it’s worth another look to be sure it is clear.

- I’m kind of shocked that M3 is favored over M2, as isn’t one CO per arm necessary for mechanism? Hence, I’d a priori predict M2 > M3. I don’t think this contrast is explicitly discussed in the Discussion (e.g., lines 523 to 536), but I think it should be.

- Figure 8: It’d be nice if the legend clearly stated which graph is which. I think Figure 8b is the distal recombination pattern, but I’m not 100% sure. It’d be great to have sample sizes on the graph too (n = 34 or 16 species, I think).

- It’s pretty clear in the M&M, but on line 419, it might be nice to mention that rintra is a single value per chromosome. On my first reading, I was thinking it was some sort of transformation of cM between genes...

- The analysis of gene distances is very thought provoking!

- Line 488 – What the heck is going on with fungi and animals! It’s certainly not necessary, but can the authors provide a quick description or explanation? They have piqued my curiosity.

Again, I do not consider any of my comments to be critical for publication, and I want to again congratulate the authors on a thorough and interesting study.

Reviewer #3: This manuscript by Brazier and Glémin uses a comparative approach to investigate variation in recombination landscapes in flowering plants. Their study used genetic map data from 665 chromosomes in 57 species of angiosperms. At the whole chromosomal level, they found a negative correlation between chromosome size and recombination rate (cM/Mb) with a strong species-specific effect. They also found that CO excess on chromosomes was more correlated with their relative size to other chromosomes in the genome rather than their absolute size, and that this effect was consistent across species. When investigating crossover landscapes, they found that landscapes were similar within species but strongly varied between species. CO rates were not uniform across chromosomes and were often more likely to occur at the distal ends of the chromosomes, with larger chromosomes tending to have a higher “periphery bias” of COs. However (as with most things in nature), this general pattern did have a number of exceptions. The authors then investigated the joint effect of telomeres and centromeres on CO distribution, finding the strongest support for a model that incorporated the effects of the telomere, centromere and one CO per chromosome. The authors found that recombination rate increased with gene density. Finally, the authors showed that genetic shuffling was positively correlated with linkage map length, and that there was a small negative effect of the periphery-bias ratio. These effects were slightly higher when modelling genetic shuffling in terms of gene distances. Whilst the investigations here are largely correlative rather than revealing mechanisms, this study provides a useful foundation for further investigation of broad drivers of recombination rate and landscape variation across a wide range of taxa.

This paper is the most comprehensive and well analysed that I have read on this topic, and generally it is well-written and well structured, particularly the introduction and discussion. I’m impressed by the sheer breadth of analyses. Nevertheless, there are parts of the methods & results that lack clarity, which in turn leads to issues with reproducibility. In particular, a lot of the statistical models are not well described – model structures should be made explicit in the methods and/or results, rather than providing a general text for statistical analyses at the end of the methods. I would emphasise that providing code and data (where possible) would improve these issues.

I had many comments and suggestions - those marked ** should be addressed by the authors in a revised version.

ABSTRACT/INTRODUCTION

Lines 32-33: The authors should be clearer what they mean by “relative size” here (i.e. relative to the rest of the genome) and why this result is interesting.

Lines 48-52: In the first sentence, I would add the term “crossing-over” or “crossover” here to set up the rest of the introduction. In the second sentence, I would briefly define landscape (i.e. variation in recombination rate along the chromosomes)

Lines 61 – 63: Indicate that you are defining assurance in this sentence.

Line 73: Can the authors briefly mention how recombination landscapes shape the distribution of TEs?

**Lines 98 – 100: I found this statement confusing, as I can’t understand how independence between linkage map length and genome size means that recombination rates will be higher in smaller genomes. I also can’t make the link between this statement and the Stapley paper – I think they found that linkage map lengths were smaller in smaller genomes, but also that chromosome number explained more variation (i.e. increased chromosome number leads to longer maps due to a higher minimum bound of recombination due to crossover assurance). Perhaps I am wrong, but regardless, it might be worth double-checking this statement and explaining it more clearly.

Line 100: on recombination rate, or landscape? Or both?

**Line 102: Based on your argument here, it is not clear how chromosome length links to biases of CO towards the peripheries – please clarify.

Line 117: Briefly define genetic shuffling and why it’s interesting – could even be mentioned earlier e.g. around lines 56 – 58.

RESULTS

**Lines 130 – 132: I don’t think this is described in the methods. Is there information on the number of progeny? I was curious about this but couldn’t find the information in the supplementary tables.

Line 143: This header could be interpreted that smaller chromosomes have more crossover events rather than more crossovers per unit length. Perhaps “Smaller chromosomes have higher recombination rates than larger ones”?

**Lines 153 – 169: Where are the methods for this LMER and what is the model structure? Is this what is being described in lines 831 – 850 of the methods? Throughout the paper, it needs to be clearer what models were run and what their fixed & random effect structures were in order to better interpret them.

Line 155: Does this mean that there is no/low phylogenetic signal of recombination rate?

Figure 1: This figure is busy. A suggestion for panel A: perhaps the dashed lines could be fit from axis to axis, to visually demarcate the 1 – 4CO expectations a bit better? For panel B, since this is the same data plotted twice, perhaps only the regression lines need to be visualised here rather than all of the points.

**Figure 2: I found this figure confusing. Some suggested edits:

Panel A could be wider to allow discerning of the slopes. I also struggled to understand what the isolines on the graph are showing even after reading several times. When using isolines, perhaps there is a need to define their values (as in Figure 1) – or perhaps they can be removed if making things too busy.

Panel B: I cannot interpret what this is showing – are there really lines in panel A that have intercepts of less than zero?

Panels B & C: I am very curious to see the error on these estimates.

Panels B & C: maybe these panels might be better suited in the supplementary?

Panel D: Again, very busy. Perhaps use of transparency of points or lines could make things clearer.

**Figure 4: Accessibility issue for colour-blindness - the red dots may not be visible on the green background. The visual scale for the chromosome size is unclear, particularly as it appears to be log – could there be line traces instead of colours here? Also – perhaps I have misunderstood – but if the chromosomes were split into ten bins, then why does the resolution of recombination rate estimation look to be much higher than 1/10th of the chromosome on the horizontal lines?

**Line 282 – 298: It seems that chromosome size was a strong correlate of recombination pattern, but I was curious if the authors tested other factors to rule out potential artefacts (e.g. differences in marker density) or to identify other biological correlates, such as ploidy? Was there a phylogenetic signal of this distal vs subdistal pattern?

Figure 6: There is a lot of text to wade through in the legend - it would help the reader to put annotations, sample sizes, key on the figures to allow for faster interpretation. For example, putting A: Distal pattern, N = XX, B: Subdistal pattern, N = XX on the panels make it easier to interpret. The dashed lines for the unclassified patterns are very distracting – why not include this as another panel, or put it in the supplementary material? Panel C is tiny and needs a key, or at least x-axis labels. I like the schematics of the crossover distributions but it’s so tiny – perhaps include this as its own figure as it explains the model really well.

Figure 7: A & B. There needs to be a higher contrast between the colours as it’s difficult to see the differences between blue and black. C. What do the colours represent here? Adjusting the point transparency and slight x jitter may improve the visualisation here.

Figure 8: Same as Figure 6 – annotating the panels would be helpful.

**Line 414 – I think it’s important for the authors to briefly define what genetic shuffling is and why it’s interesting to look at from an evolutionary perspective.

Line 419: On the same chromosome

Line 422: less efficient = resulted in less genomic shuffling?

Figure 9: see comments on Figure 7.

DISCUSSION

Lines 477 – 488: I’m a little puzzled by some of the statements here, so perhaps clarification is needed. I think it could be mentioned that crossover assurance will give a basal rate per chromosome of 50cM regardless of size, and then the authors can expand how the findings outlined here add to this established fact. Furthermore, I believe that in animals, larger chromosomes do have lower recombination rates within species… if I have misinterpreted this, perhaps the authors need to clarify their point better.

Line 519: clarify what “association” means here… does chromosome pairing begin at the telomeres?

Line 570: put “beam-film” in inverted commas and indicate that you are about to describe it.

Line 582 – 583: It depends on the number of gametes measured and how many were male and female, which is easily done in dioecious species. I think the authors should specify “in angiosperms” here and reiterate, for the less plant-literate reader, why heterochiasmy is difficult to investigate.

METHODS

**Lines 700 – 704: Indicate that this was from cytogenetic data. How is this information orientated correctly to the linkage map/genome sequence?

**Lines 709 – 723: I think this paragraph requires a few improvements in reproducibility. Was this all done with the MareyMap package described in the next paragraph? What was a ballpark criterion/example for anything that was outside the global trend?

**Lines 724 – 737: Related to the previous comment, when looking at the plotted Marey maps (Figure S1), are the methods/results affected in any way by the “jitter” of Mb vs cM distances? I imagine that if the markers were not in the correct order in the linkage map (if the local linkage order is ABC, but the real genomic order is ACB), then the cM length of the chromosome may be overestimated, meaning that recombination rates would be consistently inflated. For example, in Camellia sinensis, the maps seem to be messier and therefore may accumulate local overestimations in recombination rate that will lead to a longer cM map than the true one, compared to Arabidopsis where the orders appear to be highly conserved between the genome and the linkage map. The potential impact of this should be discussed.

**Lines 738 – 749: Please clarify here how the relative recombination rate is calculated – is this done for each segment? i.e. if the chromosome is 100cM and 80Mb, and the first segment is e.g. 10cM & 20Mb, then how would the value be calculated? The verbal argument is unclear.
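To make the example in this comment concrete, here is a minimal sketch of one possible reading, in which a segment's relative rate is its local rate (cM/Mb) divided by the chromosome-wide average rate. This definition and the function name `relative_rate` are assumptions for illustration only; the calculation actually used in the manuscript is what the comment asks the authors to spell out.

```python
def relative_rate(seg_cm, seg_mb, chrom_cm, chrom_mb):
    """Segment rate (cM/Mb) divided by the chromosome-wide mean rate (cM/Mb)."""
    segment_rate = seg_cm / seg_mb          # local rate of the segment
    chromosome_rate = chrom_cm / chrom_mb   # average rate over the whole chromosome
    return segment_rate / chromosome_rate

# Reviewer's example: a 100 cM, 80 Mb chromosome whose first segment spans 10 cM and 20 Mb.
print(relative_rate(10, 20, 100, 80))  # 0.4
```

Under this reading, the segment recombines at 0.4 times the chromosome average (0.5 cM/Mb against 1.25 cM/Mb).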

**Lines 831 – 850: The way this is written, it isn’t connected to any specific models. It is important to describe what was modelled here and to be explicit about the model structures to ensure reproducibility.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Decision Letter 1

Kirsten Bomblies, Ian R Henderson

20 Jul 2022

Dear Dr Brazier,

Thank you very much for submitting your Research Article entitled 'Diversity and determinants of recombination landscapes in flowering plants' to PLOS Genetics.

The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the current manuscript. Based on the reviews, we will not be able to accept this version of the manuscript, but we would be willing to review a much-revised version. We cannot, of course, promise publication at that time.

As you can see, two of your reviewers found your revised version acceptable for publication. However, reviewer 1 raises a number of concerns about the work and does not yet support acceptance, and they suggest multiple points for continued improvement of the manuscript.

Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by reviewer 1. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments are included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool.  PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, use the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.

Yours sincerely,

Ian R. Henderson

Associate Editor

PLOS Genetics

Kirsten Bomblies

Section Editor: Evolution

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The revised manuscript still requires extensive revision of the English, which is frequently hard to understand (I attach an annotated pdf file as well as comments below). The Discussion also seems unduly long (7 pages) and could be shorter and clearer if unnecessary repetition of results were removed.

Another general comment is that the text frequently ignores the state of the art in the field. When a concept is already well-established, I feel that, if one cites a recent paper, the citation should make clear that this is a review of the topic. As written, without doing this, or citing the early literature, the manuscript misleadingly gives the impression that these things are new discoveries. My comments on Lines 56 and 127 below are in a category that I consider “not so minor”.

The chief question remains whether the present study represents an advance sufficient for acceptance by PLoS Genetics, as opposed to being better suited for a journal like G3 or Heredity. In other words, what new discovery does it provide?

It is correct that the size of the dataset is an improvement on previous analyses, and I consider it an improvement in quality to describe recombination patterns, not just total map lengths, and to describe centromere locations (the value of such information was stressed in this recent paper, which should be cited: Yoshida, K., and J. Kitano, 2021 Tempo and mode in karyotype evolution revealed by a probabilistic model incorporating both chromosome number and morphology. PLoS Genetics 17: e1009502. doi: 10.1371/journal.pgen.1009502). Clearly, these points should be emphasised. My criticisms did not dispute the value of such improvements, or of re-analysing the data to make results comparable between species.

The authors’ responses make clear that several of the most interesting questions could not be studied, and it would be helpful to readers to mention these explicitly, perhaps in a brief “Future directions” section at the end. This would help readers understand how difficult it is to reach conclusions about these questions, and that the right kind of data are now obtainable, albeit with considerable effort.

The issue of family sizes appears not to be mentioned, but surely the ability to detect recombinants will depend on the family size, and, in small families, none may be detected in regions with infrequent crossing over even if crossovers can occasionally occur. At present, the genetic maps are taken as facts, not estimates. This problem extends beyond the issue of family sizes, and I believe that it also affects the analysis of the models. As crossovers happen in the 4-strand stage, the genetic map distances don’t directly tell us the number of crossovers per bivalent, and so it is difficult to infer anything clear about the number that is required for correct segregation. It might be nice to know whether the evidence supports a requirement for a crossover per chromosome, or whether this requirement is enforced for each chromosome arm. However, it is well-established that, despite some such requirement, crossover numbers show a distribution that includes zero events, yet correct segregation is possible.
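Two pieces of arithmetic behind this paragraph can be sketched as follows. Because each crossover involves only two of the four chromatids, one crossover per bivalent corresponds to 50 cM of map length, so a map length divided by 50 gives a mean number of crossovers per bivalent, not a count for any single meiosis. Separately, if a region has recombination fraction r, a family of n independent meioses shows no recombinant with probability (1 - r)^n. The function names and the example values below are illustrative assumptions, not taken from the manuscript.

```python
def mean_crossovers_per_bivalent(map_length_cm):
    """Mean crossovers per bivalent implied by a map length (50 cM per crossover)."""
    return map_length_cm / 50.0

def prob_no_recombinant(r, n_meioses):
    """Probability that none of n independent meioses is recombinant
    in a region with recombination fraction r."""
    return (1.0 - r) ** n_meioses

# A hypothetical 75 cM chromosome: an average, compatible with a mix of 1-CO and 2-CO meioses.
print(mean_crossovers_per_bivalent(75))          # 1.5

# A hypothetical 1 cM region scored in a family of 100 meioses.
print(round(prob_no_recombinant(0.01, 100), 3))  # 0.366
```

With these hypothetical values, roughly a third of families of that size would show zero recombinants in the region even though crossovers do occur there.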

However, my major criticism of the manuscript is still that the main new conclusion (line 434 onwards) does not seem justified. In my wording, this is that recombination between coding regions occurs more (“was more efficient”) than among regions randomly sampled from the genome, especially for longer chromosomes, which tend to have distal crossover localisation, and consequently the most heterogeneous recombination rates. The reasoning used to reach the conclusion that this is an important evolutionary observation seems to overlook an “elephant in the room”: the accumulation of repetitive sequences in low recombination regions of genomes. A direct outcome of this tendency is that regions with low recombination rates tend also to have low gene densities. I don’t think that this has (in my edited wording) “implications for the evolution of crossover landscapes and whether the distribution of COs is optimal for the efficacy of genetic recombination”.

Following Haenel et al. (2018), the authors show that recombination rates correlate positively with gene density. This is unsurprising, as gene density goes down as repetitive content goes up, and the latter increases in genome regions with low crossover rates. This has been established in many species, including plants. Line 373 onwards says “the strength of the relationship greatly varied across species and did not correlate with usual predictors such as the chromosome length or the genome-wide recombination rate”. But surely the usual predictors of gene density are not chromosome length or the genome-wide recombination rate, but local recombination rates, with clear effects of proximity to centromeres. This is because these effects are pronounced only in genome regions with pretty low recombination rates, and cannot be detected from correlations using minor or local recombination rate differences.

By writing that recombination rates “consistently increased with the number of genes” in 100 kb windows, the text suggests in the reader’s mind that the causative factor is the number of genes (even though clearly a correlation does not imply such a causation, and of course I don’t think that the authors intended to create this misleading impression).

Marey maps most often appeared more homogeneous when the x axis measures distances in terms of cumulative number of genes along the chromosome instead of base pair distances. When genes are sparse, a small value of this “gene distance” corresponds to a large physical distance, so the inhomogeneity of the maps is diminished, as a direct arithmetical consequence (provided that the annotation of the genes is accurate). This has no special biological significance.
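The arithmetic point made here can be illustrated with a toy sketch. The three windows below are invented numbers in which gene count happens to scale with genetic length; they are assumptions chosen only to show how a change of x axis can remove apparent heterogeneity, not data from the manuscript.

```python
# Toy windows of equal physical size: (physical length in Mb, genetic length in cM, gene count).
windows = [(10, 10.0, 500), (10, 1.0, 50), (10, 10.0, 500)]

# Recombination rate measured against physical distance and against "gene distance".
rate_per_mb = [cm / mb for mb, cm, genes in windows]
rate_per_gene = [cm / genes for mb, cm, genes in windows]

def fold_range(values):
    """Ratio of the largest to the smallest value, a crude heterogeneity measure."""
    return max(values) / min(values)

print(fold_range(rate_per_mb))    # 10.0: tenfold variation along the physical axis
print(fold_range(rate_per_gene))  # 1.0: no variation along the gene-count axis
```

In this constructed case the gene-poor middle window is compressed on the gene-count axis, so the map looks homogeneous without any change in the underlying crossover positions.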

If I have misunderstood the authors’ reasoning, I am happy to be corrected. Certainly, the writing needs to make the meaning clearer if what I have understood them to be saying is not their meaning.

Not so minor

Lines 56 and 127: The text should not give the impression that an association between gene density and recombination is a new observation. Line 127 says “are recombination landscapes generally associated with gene density?” (meaning “are recombination RATEs generally associated with gene density?”), but such a relationship is already extremely well established, because it is known that repetitive sequences accumulate in low recombination regions of genomes. If this is to be mentioned in the Introduction, it needs to be changed to make clear that this well-known pattern is confirmed by these new analyses. The fact that recombination hotspots have been found in gene regulatory sequences is a minor contributor to such a pattern.

Line 66: It is strange to cite de Massy, 2013 and later papers for the discovery that recombination rates are not homogeneous across the genome and vary among species, as this has been known for a very long time. Centromeric and pericentromeric regions with low rates were known for plants at least since work on maize in the 1930s, and work on tomatoes shortly afterwards, showing the heterochromatic regions physically surrounding centromeres are recombinationally inert.

Similarly, in line 294, it reads strangely that “we observed that the centromeres had an almost universal local suppressor effect”, as this is such a well-known phenomenon (since Beadle’s work in Drosophila in 1932), and is known to act in addition to the effect of heterochromatin. It would be better to write that the well-known phenomenon of centromeric low crossover rates is confirmed. In this paragraph, a caveat should be added to the statement about “completely recombination-free in the centromere” (and also about the short genetic maps of some arms in line 331), as surely the ability to detect recombinants will depend on the family size, and none may be detected in small families, even if crossovers occasionally occur.

Line 192: Figure 1 shows that most of the 57 plant species analysed have estimated cM/Mb rates >1. This is a valuable result, but the text doesn’t mention it. I would have liked to see the value of 1 indicated by a horizontal line, as it is not shown on the y axis. From the figure, it appears that 4 species consistently have values < 0.5. Their names should be mentioned, as such low values are unusual, and these results should be examined to find out whether they are trustworthy, or if terminal markers may be lacking. Figure 2A seems to show that these same species may have very long chromosomes. This too seems worthy of explicit remark, as many readers will be interested to learn that plant chromosomes can be larger than 500 Mb (again, the species should be named, as perhaps these are species already known to have highly repetitive, and physically large, genomes). As presented, the reader does not even know whether these are related species (looking at Figure 4, I wonder if these are the 4 grass species at the bottom of that Figure?).

This comment exemplifies my previous concern that only relationships between quantities are studied, when in fact some of the values are of interest in themselves. Omitting to even mention them makes it hard to understand the findings, or to see clearly what the complicated relationships might mean, as it is easy to “miss the wood for the trees”.

Line 222: the text is still not clear (“used the terms proximal and distal regions, respectively, to avoid confusion with the molecular composition and specific position defining telomeric and centromeric regions stricto sensu”). Do “proximal and distal regions” mean centromere-proximal versus distal positions (on both arms for metacentrics), respectively? (and I think that the correct phrase is “sensu stricto”). And does “middle” or “centre” refer to the middle of an arm in a metacentric, or to the centromeric region and its neighbourhood (whereas, for an acrocentric, it would refer to a region distant from the centromere)? Clarification is still needed.

I also did not understand which landscapes were homogeneous along chromosomes — it would be helpful to name explicitly the species in Figure 3 that fall into the different categories, and mention the test for homogeneity that was used. The differences are not obvious to me.

Line 265 states that the confidence interval for the periphery bias is 2.06 - 2.32, indicating that recombination rates are highest in the tips of chromosomes, though the differences are not usually extreme. Confusingly, the text earlier in this section strongly emphasised that a bias towards the periphery was not ubiquitous across species. These things are not necessarily contradictory, but they give the impression of being so. In fact, if I have understood correctly, it appears from Figure 5 that Haenel et al. were correct in thinking that this is a common pattern. If so, this should be clearly stated in the text, as it is an important conclusion, and readers should not be left with the impression that they were wrong (even if there are a few exceptions to this pattern; in fact, the exceptions could be of interest).

Minor comments

A minor, but general, issue is the failure to distinguish between physical positions in the chromosome assemblies and genetic map positions. Although the text makes use of both these measures, it is sometimes unclear which one is meant, so that the meaning is not clear (e.g. lines 313 to 315).

It is simplistic to write that recombination “increases genetic diversity and the adaptive potential of a species”. Most readers will know that genetic recombination is interesting and worthy of study.

Line 51: what genomic characteristics? It would be better to say “genome size and ….” specifying the other characteristics.

Line 83: It is too sweeping to say that recombination hotspots are general. Some organisms don’t have them, so please write more cautiously. In addition, please make clear what definitions were used for the plant hotspots cited, as many authors use this term extremely loosely, to mean any region where they detect higher recombination than other regions, even if the regions are large, unlike the original definition of hotspots.

Line 87: it is too vague to just say that recombination affects genome structure, functioning and evolution. Please be explicit. For example, recombination events can affect genome structure by causing ectopic recombination between similar sequences, including repeats, in different locations. As written, readers will not understand what you have in mind.

Line 107: the meaning is unclear of “positively driven by chromosome length”

Line 119: the phrase “to our knowledge” is in the wrong place in the sentence.

Line 128: the question “What are the consequences of recombination heterogeneity on the extent of genetic shuffling?” is hard to understand, as “genetic shuffling” is usually just “baby language” for recombination (in my opinion, it should not be used in a scientific paper). Is this an attempt to mention that recombination can involve both crossing over and gene conversion? If so, it is certainly not clear enough. I would advise a different term for the sub-heading in line 405.

Line 169: Are the crossover numbers means? It is difficult to understand what is meant by “234 chromosomes had between one and two COs”, as a chromosome has a definite number. This, and the preceding sentence should be revised to make the meaning clear.

Line 173: The meaning of a “species random effect” is not clear enough.

Line 263: what does “ex on Fig 3A and 3E” mean?

Line 289: the meaning is unclear. The text reads “chromosomes from species classified as having a distal pattern were significantly larger than chromosomes with a sub-distal pattern”. Does this mean “species classified as having the distal pattern had significantly larger chromosomes than those of species assigned to the set with the sub-distal pattern”?

Line 313: The meaning is unclear for “equal distribution of crossovers on both sides of the chromosome”.

Line 321: This reads “At least one CO in each chromosome arm (50 cM) is mandatory in M2 whereas only one CO is mandatory for the entire chromosome in M3”, which is difficult to understand. Does it mean “In M2, at least one CO in each chromosome arm (50 cM) is mandatory, whereas in M3 only a single CO is mandatory for the entire chromosome, even if it has two arms”?

Reviewer #2: I want to congratulate the authors again on an excellent paper. All my (minor) suggestions were met very aptly, and overall the paper is improved with the revisions.

Reviewer #3: Thank you for addressing our suggested revisions. In my opinion, this paper is much improved from the previous draft. It is so much easier to follow the different analyses and findings this time around. It made me really appreciate the discussion as an excellent synthesis of all the findings of this paper in a broader context. The figures look much better and do justice to all the work that is being presented.

A few small corrections:

• Throughout the Abstract/Summary/Intro, I would use “crossover” rather than “crossing-over” when referring to the process as a noun e.g. number of crossovers vs. during crossing-over (verb). (I realise I may not have been clear enough in my original review)

• Line 24: characterise

• Line 35: correspond globally

• Lines 63 – 64: might make more sense to say something like “New haplotypes are passed on to offspring by the reciprocal exchange of DNA between maternal and paternal chromosomes, known as crossovers (COs)” and then remove “(COs)” on line 74

• Line 68: specific pairing sites implies that recombination always takes place in the same spot

• Lines 84-85: naïve question for you to check – is it in regulatory regions rather than sequence? Promoter regions?

• Lines 172 – 174: can you get at how much of the variance species explains?

• Figure 6: Define pattern “E” in the legend

• For my comment about the local linkage order potentially inflating map length, I thought your response was good – could it be included in the methods in case another reader had the same concern?

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Attachment

Submitted filename: PGENETICS-D-22-00286_R1_reviewer.pdf

Decision Letter 2

Kirsten Bomblies, Ian R Henderson

5 Aug 2022

Dear Dr Brazier,

We are pleased to inform you that your manuscript entitled "Diversity and determinants of recombination landscapes in flowering plants" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional acceptance, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field.  This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about making your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Ian R. Henderson

Academic Editor

PLOS Genetics

Kirsten Bomblies

Section Editor

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-22-00286R2

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Acceptance letter

Kirsten Bomblies, Ian R Henderson

24 Aug 2022

PGENETICS-D-22-00286R2

Diversity and determinants of recombination landscapes in flowering plants

Dear Dr Glemin,

We are pleased to inform you that your manuscript entitled "Diversity and determinants of recombination landscapes in flowering plants" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Zsofi Zombor

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Marker positions in genetic distance (cM) as a function of genomic distance (Mb), namely Marey maps, for each chromosome included in the dataset (n = 665 chromosomes).

    The black vertical line is the centromere position estimated by cytological measures, when available in the literature.

    (PDF)

    S2 Fig. Recombination landscapes for each chromosome included in the dataset (n = 665 chromosomes).

    Recombination rate (cM/Mb) estimated in windows of 100kb along genomic distances (Mb). Confidence interval at 95% (grey ribbon) estimated by 1,000 bootstraps of loci. The black vertical line is the centromere position estimated by cytological measures, when available in the literature.

    (PDF)

    S3 Fig. Dataset quality for the 57 species.

    The averaged linkage map length (total linkage map length divided by the number of chromosomes, cM) is not correlated with (A) the number of markers (linkage map length ~ log10(number of markers), adjusted R2 = 0.04, p = 0.11), (B) marker density (linkage map length ~ marker density, adjusted R2 = -0.018, p = 0.90) and (C) the progeny size (linkage map length ~ progeny size, adjusted R2 = 0.022, p < 0.32). Regression lines with 95% parametric confidence interval estimated with ggplot2.

    (TIF)

    S4 Fig. Phylogenetic tree of species in our dataset (n = 57), annotated with mean recombination rate (cM/Mb) and mean chromosome size (Mb).

    The supertree was retrieved from the publicly available phylogeny constructed by Smith and Brown (Smith & Brown, 2018).

    (TIF)

    S5 Fig. Slopes of the linear regression within species (linkage map length ~ chromosome size) as a function of the species mean genomic chromosome size (Mb).

    (TIF)

    S6 Fig. The negative correlation (Spearman’s Rho coefficient) between recombination rates (cM/Mb) and the distance to the nearest telomere is stronger for species with a larger chromosome size (n = 57).

    The linear regression line and its parametric 95% confidence interval were estimated in ggplot2. The inset presents the distribution of Spearman’s Rho coefficients for chromosomes (n = 665 chromosomes). The mean correlation and its 95% confidence interval (black solid and dashed lines) were estimated by 1,000 bootstraps. The red vertical line is for a null correlation.

    (TIF)
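The bootstrap summary described for the S6 Fig inset (a mean correlation with its 95% confidence interval from 1,000 resamples) can be illustrated with a minimal sketch. It assumes a simple percentile bootstrap over per-chromosome Spearman's Rho values; the function name `bootstrap_mean_ci` and the toy `rhos` list are hypothetical and are not taken from the authors' code or data.

```python
import random
from statistics import mean

def bootstrap_mean_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap of the mean (assumed procedure, for illustration)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample the values with replacement and record the mean of each resample.
    boot_means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the bootstrap means.
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return mean(values), lower, upper

# Hypothetical per-chromosome Spearman's Rho values (not real data).
rhos = [-0.62, -0.48, -0.35, -0.71, -0.10, 0.05, -0.55, -0.40]
m, lo, hi = bootstrap_mean_ci(rhos)
print(f"mean rho = {m:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

An interval that excludes zero would correspond, in the figure, to the dashed lines falling on one side of the red vertical line.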

    S7 Fig. Standardized recombination rate (cM/Mb) as a function of the relative distance (Mb) from the telomere along the chromosome (physical distances expressed in 20 bins).

    Chromosomes were split in halves, a relative distance of 0.5 being the centre of the chromosome, and only one side was randomly sampled to avoid averaging patterns. Then, chromosomes were pooled per species. Each colour is a species. A loess regression was estimated for each species. Species presented in four plots for clarity.

    (TIF)

    S8 Fig. Standardized gene count as a function of the relative distance (Mb) from the telomere along the chromosome (physical distances expressed in 20 bins).

    Chromosomes were split in halves, a relative distance of 0.5 being the centre of the chromosome, and only one side was randomly sampled to avoid averaging patterns. Then, chromosomes were pooled per species. Each colour is a species. A loess regression was estimated for each species. Species presented in four plots for clarity.

    (TIF)

    S9 Fig. The genetic shuffling r̄_intra increases with the size of the genetic map (cM).

    Linear mixed regression with a species random effect and its 95% confidence interval estimated by ggplot2 (black line and grey ribbon). Each colour is a species. A linear regression was estimated for each species.

    (TIF)

    S10 Fig. The genetic shuffling r̄_intra decreases with the periphery-bias ratio.

    Linear mixed regression with a species random effect and its 95% confidence interval estimated by ggplot2 (black line and grey ribbon). Each colour is a species. A linear regression was estimated for each species.

    (TIF)

    S11 Fig. Gene count in windows of 100kb along genomic distances (Mb) for each chromosome with gene annotations (n = 480 chromosomes).

    Recombination rate (cM/Mb) estimated in windows of 100kb. Loess regression of gene count along the chromosome in blue line with parametric confidence interval at 95% in grey.

    (PDF)

    S12 Fig. Marey maps with genomic distances (black points) and gene distances (gray points). Marker positions in genetic distance (cM) as a function of the relative physical distance (either Mb or cumulative number of genes) for each chromosome with gene annotations (n = 480 chromosomes).

    The black dashed line is a theoretical uniform distribution of markers. The black vertical line is the centromere position estimated by cytological measures, when available in the literature.

    (PDF)

    S13 Fig. Sensitivity of the periphery-bias ratio to the size of the sampled distal region (i.e. number of bins sampled at the tips).

    The periphery-bias ratio was estimated for different numbers of bins sampled and always divided by the mean chromosomal recombination rate. Linear regression (black line) shows a decrease of the periphery-bias ratio as the number of bins increases, towards a ratio value of 1 (dashed line).

    (TIF)
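The sensitivity analysis in S13 Fig can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the ratio is taken here as the mean recombination rate in the distal bins (both tips pooled) divided by the mean rate over all bins of the chromosome, and the function name `periphery_bias_ratio` and the toy landscape are hypothetical.

```python
from statistics import mean

def periphery_bias_ratio(rates, n_tip_bins):
    """Mean rate in the n distal bins at each tip, divided by the chromosome mean.

    `rates` holds recombination rates (cM/Mb) in equal-sized bins ordered
    along the chromosome. Pooling both tips is an assumption of this sketch.
    """
    if not 0 < n_tip_bins <= len(rates) // 2:
        raise ValueError("n_tip_bins must be between 1 and half the number of bins")
    tips = list(rates[:n_tip_bins]) + list(rates[-n_tip_bins:])
    return mean(tips) / mean(rates)

# Toy landscape with recombination concentrated at the tips (not real data).
toy = [4.0, 3.0, 2.0, 1.0, 0.5, 0.5, 1.0, 2.0, 3.0, 4.0]
ratios = {n: periphery_bias_ratio(toy, n) for n in range(1, 6)}
for n, ratio in ratios.items():
    print(n, round(ratio, 3))
```

On this toy landscape the ratio shrinks as more bins are included and reaches 1 once the sampled region covers the whole chromosome, which is the behaviour the caption describes.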

    S1 Table. Metadata for 665 recombination landscapes, with name of the dataset collected and literal name of the chromosome used in our study, chromosome name in annotation (gff), size of the genetic map (cM, raw and corrected by methods of Chakravarti et al. (1991) or Hall & Willis (2005)), size of the genomic sequence in genome assembly (Mb), number of markers, density of markers in cM and bp, progeny size, mean interval between markers in cM and bp, Gini index, span parameter of the loess function, type of mapping function (Haldane, Kosambi or none), accession of the reference genome used for markers genomic positions, link to data repository and doi reference of the study in which the genetic map was published.

    (XLSX)

    S2 Table. Flowering plant species included in the study, with authors, year and doi reference of the genetic map publication, and accession of the reference genome.

    (XLSX)

    S3 Table. Centromeric indexes estimated in cytological studies, with unit of measurement, mean and standard error of long and short chromosome arms, centromeric index (short arm length divided by total chromosome length), and doi reference to the original study.

    (XLSX)

    S4 Table. Correlation between recombination landscapes estimated at two different genomic scales (1Mb and 100kb).

    Spearman’s Rho coefficient was estimated for each chromosome between recombination rates estimated directly in windows of 1Mb and the mean recombination rate of 100kb windows pooled together in 1Mb windows. Mean of the Spearman’s Rho coefficient among chromosomes and proportion of significant p-values given for each species.

    (XLSX)

    S5 Table. Selection of the regression model, among LM, LMER and PGLMM, that best explains the relationship between the mean recombination rate (cM/Mb) and the chromosome size (Mb), based on AIC and BIC criteria.

    (XLSX)

    S6 Table. Species averaged correlation between the averaged chromosome size (Mb) and the averaged periphery-bias ratio.

    Mean of the Spearman’s Rho coefficient among correlations at chromosome scale and proportion of significant p-values given for each species.

    (XLSX)

    S7 Table. Chromosome correlation between the recombination rate (cM/Mb) and the relative distance to the telomere, with Spearman’s Rho coefficient and p-value of the test per chromosome.

    (XLSX)

    S8 Table. Species averaged correlation between the recombination rate (cM/Mb) and the relative distance to the telomere.

    Mean of the Spearman’s Rho coefficient among correlations at chromosome scale and proportion of significant p-values given for each species.

    (XLSX)

    S9 Table. Selection of the best model of crossover distribution for each species, based on Adjusted R-Squared between observed values and theoretical values predicted by the model.

    The best model selected for each species is the one maximizing the Adjusted R-Squared.

    (XLSX)

    S10 Table. Selection of the best model of crossover distribution for each species in a subset of chromosomes with at least 50cM on each chromosome arm, based on Adjusted R-Squared between observed values and theoretical values predicted by the model.

    The best model selected for each species is the one maximizing the Adjusted R-Squared.

    (XLSX)

    S11 Table. Convergence between crossover patterns and gene patterns at a species scale.

    For each species, the table gives the type of crossover pattern, the type of gene count pattern, the difference RMSE(gene pattern) − RMSE(crossover pattern), which indicates whether gene patterns are more or less homogeneous than crossover patterns, the homogenization effect of gene patterns (more/less), the difference genetic shuffling(gene pattern) − genetic shuffling(crossover pattern) and the averaged chromosome size (Mb).

    (XLSX)

    S1 Data. References for linkage map data included in this study.

    (PDF)

    Attachment

    Submitted filename: Response to reviewers.pdf

    Attachment

    Submitted filename: PGENETICS-D-22-00286_R1_reviewer.pdf

    Attachment

    Submitted filename: AnswerToReviewers_V2.docx

    Data Availability Statement

    All data needed to reproduce the results presented in the paper are available in a public data repository (https://doi.org/10.17605/OSF.IO/NUXD7).


    Articles from PLoS Genetics are provided here courtesy of PLOS
