. 2021 Sep 20;10:e67991. doi: 10.7554/eLife.67991

Author response table 1. Analysis of circRNA databases.

Evolutionary analyses are sensitive and prone to the amplification of noise. We thus used a comprehensive dataset including samples from wild-type tissues and the same sex. In addition, as pointed out by Reviewer 2 “without a mock control, [it] is impossible […] to determine the actual resistance to the degradation by the exonuclease and hence determine which potential backsplicing junctions are really coming from circular molecules.” As most of the samples reported in the public database are not RNase R-treated, none of them could provide sufficiently trustworthy circRNA annotations for our study.

	Species	Sample types, circRNA calling	Expression
circAtlas Wu et al. Genome Biol 2020 PMID: 32345360	Human, mouse, rhesus, rat, pig, chicken	1,070 RNA-seq samples collected from 19 normal tissues Identification with CIRI2, DCC, find_circ, CIRCexplorer2 No RNase R-treated samples, but circRNAs need to be detected (1) by at least two of the tools and (2) junctions be covered with a minimum of two independent reads (-> identification on a per sample basis) CircRNA numbers shown in the database for each tissue are pooled, meaning there are for example 225’000 circRNAs in 39 human brain samples -> no separation by region, or minimum overlap (e.g in half of the samples analysed)	No comprehensive expression data provided (i.e. only numerical values for the top 30 circRNAs). For cases where biological samples are pooled (same tissue), the maximal expression value across all samples is used to define high expressors. However, due to strong heterogeneity in expression levels, mean values are frequently magnitudes lower.
Comment: The circRNAs reported are not consistently expressed in the tissue. For the brain, it is for example sufficient for a circRNA to be detected in one out of the 39 brain samples, leading to the high numbers reported. This is reflected in the expression variation observed within the same tissue samples. The authors report the maximal observed value, but the tissue mean seems to be at least a magnitude lower. In addition, no RNase R-treated samples were used for validation. We considered this database inappropriate for our study due to the lack of RNase R-treated expression data, which we consider a more important indicator for a circRNA than the detection by two independent identification methods. Note also statement from publication: “Using this method, we found that the vast majority of circRNAs (an average of 61.7%) could be detected only in one species, with only 797 circRNAs shared by all species, in agreement with previous reports on the highly species-specific expression of circRNAs”
circBase Glažar et al. RNA 2014 PMID: 25234927	Human, mouse, worm, fly	Mostly cell lines. Mouse cerebellum is the only tissue comparable to our study. The authors report 2,407 circRNAs for this tissue (but: only one cerebellum sample was used in study).	Ranking of circRNAs according to “Scores” that are difficult to transform to expression levels.
Comment: Due to the high number of cell lines samples, we did not consider this database. In addition, only human and mouse samples are present, yet we were looking for an evolutionary dataset that included multiple mammalian species.
CIRCpedia Zhang et al. Genome Res 2016 PMID: 27365365	Human, mouse, rat, zebra fish, fly, worm	180 RNA-seq datasets (at first sight, mostly cell lines). One detection tool used. Only 20 out of the 180 samples are RNase Rtreated samples. 4 out of the 20 RNase Rtreated samples are from a mammalian species.
Comment: This database was not used because it mainly contains cell line samples and only a very low number of RNase R-treated samples.