The analysis of the distribution of γ along chromosomes at the 100-kb scale reveals a more uniform distribution than that of CO (c) rates, with no reduction near telomeres or centromeres (Figure 5). More than 80% of 100-kb windows show γ within a 2-fold range, a percentage that contrasts with the distribution of CO, where only 26.3% of 100-kb windows along chromosomes show c within a 2-fold range of the chromosome average. To test specifically whether the distribution of CO events is more variable across the genome than either GC or the combination of GC and CO events (i.e., the number of DSBs), we estimated the coefficient of variation (CV) along chromosomes for each of the three parameters for different window sizes and chromosome arms. In all cases (window size and chromosome arm), the CV for CO is much greater (more than 2-fold) than that for either GC or DSBs (CO+GC), while the CV for DSBs is only marginally greater than that for GC: for 100-kb windows, the average CV per chromosome arm for CO, GC and DSBs is 0.90, 0.37 and 0.38, respectively. Nevertheless, we can also rule out the possibility that the distribution of GC events or DSBs is completely random, with significant heterogeneity along each chromosome (P<0.0001 at all physical scales analyzed, from 100 kb to 10 Mb; see Materials and Methods for details). Not surprisingly, given the excess of GC over CO events, GC is a much better predictor of the total number of DSBs (total recombination events) across the genome than CO, with semi-partial correlations of 0.96 for GC and 0.38 for CO in explaining the overall variance in DSBs (not taking into account the fourth chromosome).
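The two summary statistics used in this paragraph, the CV across windows and the fraction of windows within a 2-fold range of the average, can be illustrated with a minimal Python sketch. It is not the pipeline used for the analyses reported here. It assumes that the CV is the population standard deviation divided by the mean, and that "within a 2-fold range" means between half and twice the mean; both are assumptions, as are the function names.

```python
import statistics


def coefficient_of_variation(rates):
    """Return the CV of per-window rates.

    Assumption: CV = population standard deviation / mean.
    """
    mean = statistics.fmean(rates)
    if mean == 0:
        raise ValueError("CV is undefined when the mean rate is zero")
    return statistics.pstdev(rates) / mean


def fraction_within_fold(rates, fold=2.0):
    """Return the fraction of windows whose rate lies in [mean/fold, mean*fold].

    Assumption: this interval is what "within a 2-fold range" denotes.
    """
    mean = statistics.fmean(rates)
    lower, upper = mean / fold, mean * fold
    return sum(lower <= r <= upper for r in rates) / len(rates)


# Hypothetical per-window rates for one chromosome arm (illustration only).
co_rates = [0.0, 0.5, 4.0, 1.0, 0.1, 3.4]
gc_rates = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0]

print(coefficient_of_variation(co_rates), fraction_within_fold(co_rates))
print(coefficient_of_variation(gc_rates), fraction_within_fold(gc_rates))
```

With these made-up values the CO series gives the larger CV and the smaller within-range fraction, which is the qualitative pattern described above.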
DSB resolution involves the formation of heteroduplex sequences (for either CO or GC events; Figure S1). These heteroduplex sequences can contain A(T):C(G) mismatches that are repaired randomly or favoring specific nucleotides. In Drosophila, there is no direct experimental evidence supporting G+C biased gene conversion repair, and evolutionary analyses have provided contradictory results when using CO rates as a proxy for heteroduplex formation. Note, however, that GC events are more frequent than CO events in Drosophila as well as in other organisms, and that GC (γ) rates should be more relevant than CO (c) rates when investigating the possible consequences of heteroduplex repair.
In some species, gene conversion mismatch repair has been proposed to be biased, favoring G and C nucleotides, which predicts a positive relationship between recombination rates (sensu frequency of heteroduplex formation) and the G+C content of noncoding DNA.
Our data show no association of γ with G+C nucleotide composition in intergenic sequences (R = +0.036, P>0.20) or introns (R = −0.041, P>0.16). The same lack of association is observed when G+C nucleotide composition is compared to c (P>0.25 for both intergenic sequences and introns). We therefore find no evidence of gene conversion bias favoring G and C nucleotides in D. melanogaster based on nucleotide composition. The reasons for previous results that inferred gene conversion bias toward G and C nucleotides in Drosophila may be multiple and include the use of sparse CO maps as well as incomplete genome annotation. Because gene density in D. melanogaster is higher in regions with non-reduced CO, the many recently annotated transcribed regions and G+C rich exons might have been previously analyzed as neutral sequences, particularly in these genomic regions with non-reduced CO.
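As an illustration of how an association between a recombination rate and G+C content could be tested, the following Python sketch computes a Pearson correlation and a two-sided permutation P value. The text above reports R and P without naming the test, so the choice of Pearson's R and of a permutation test is an assumption, and the function names and input values are hypothetical.

```python
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided P value for R, obtained by shuffling y relative to x."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add one to numerator and denominator so that P is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical per-region values (illustration only, not data from this study).
gamma = [1.1, 0.9, 1.0, 1.2, 0.8, 1.05, 0.95, 1.0]
gc_content = [0.41, 0.43, 0.39, 0.40, 0.42, 0.44, 0.38, 0.41]

print(pearson_r(gamma, gc_content), permutation_p_value(gamma, gc_content))
```

A rank-based coefficient could be substituted for `pearson_r` without changing the permutation logic.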
The motifs of recombination in Drosophila
To discover DNA motifs associated with recombination events (CO or GC), we focused on 1,909 CO and 3,701 GC events delimited by 500 bp or less (CO500 and GC500, respectively). Our D. melanogaster data reveal many motifs significantly enriched in sequences surrounding recombination events (18 and 10 motifs for CO and GC, respectively) (Figure 6 and Figure 7). Individually, the motifs surrounding CO events (MCO) are present in 6.8 to 43.2% of CO500 sequences, while motifs surrounding GC events (MGC) are present in 7.8 to 27.6% of GC500 sequences. Note that 97.7% of all CO500 sequences contain at least one MCO motif and 85.0% of GC500 sequences contain at least one MGC motif (Figure S4).
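The per-motif and at-least-one-motif percentages quoted above are simple presence counts over a set of sequences. The Python sketch below shows one way such counts could be tallied. It assumes motifs can be written as regular expressions and searches the given strand only; motif discovery itself is not shown, and the motif patterns, sequences and function name are hypothetical.

```python
import re


def motif_presence(sequences, motifs):
    """Return (per-motif fraction, fraction with at least one motif).

    `motifs` maps a motif name to a regular expression; only the strand
    given in `sequences` is searched (no reverse complement).
    """
    compiled = {name: re.compile(pattern) for name, pattern in motifs.items()}
    n = len(sequences)
    per_motif = {
        name: sum(1 for seq in sequences if rx.search(seq)) / n
        for name, rx in compiled.items()
    }
    any_motif = sum(
        1 for seq in sequences if any(rx.search(seq) for rx in compiled.values())
    ) / n
    return per_motif, any_motif


# Hypothetical sequences and motifs (illustration only).
example_sequences = ["ACGTACGT", "TTTTAAAA", "GGGCCC", "ACACAC"]
example_motifs = {"m1": "ACG", "m2": "A{4}"}

per_motif, any_motif = motif_presence(example_sequences, example_motifs)
print(per_motif, any_motif)
```

Because one sequence can carry several motifs, the at-least-one fraction is generally smaller than the sum of the per-motif fractions.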