Extended Data Fig. 4 | Performance of RiboNN on mouse cell types.

a, These panels mirror those shown in Supplementary Fig. 4, except that they show the performance of our multitask RiboNN model on mouse cell types, using r² measured on ten held-out CV folds (n = 10,242 total genes among the ten folds). The center line of each box corresponds to the median (the 50th percentile). The lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles). The upper whisker extends from the hinge to the largest value no further than 1.5× IQR (interquartile range, or distance between the first and third quartiles) from the hinge. The lower whisker extends from the hinge to the smallest value no further than 1.5× IQR from the hinge. Data beyond the ends of the whiskers are plotted individually. b,c, Scatter plots showing the relationship between our mouse RiboNN predictions and the observed mean TEs for human mRNAs (b), as well as the relationship between our human RiboNN predictions and the observed mean TEs for mouse mRNAs (c). Pearson (r) and Spearman (ρ) correlation coefficients are also shown. d,e, Scatter plots showing the relationship between sequence homology, considering the interspecies pair of mRNAs with the maximum homology, and the residual prediction error between the observed TE from one species and the TE predicted by the model from the other species. This is shown for human (d) and mouse (e) mean TE data. ‘Max homology %’ was computed as follows: (1) all human–mouse mRNA pairs were locally aligned using the ‘pairwiseAlignment’ function from the Biostrings (version 2.70.2) R package¹²⁹ (‘match: 1, mismatch: −3, gap open: −2 and gap extend: −1’) and (2) for each mRNA, the final value was computed from the highest-scoring alignment to an mRNA from the other species, as the maximum homology score divided by the mRNA length.
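
The box-plot convention described for panel a can be sketched as follows. This is an illustration only, not the code used to draw the figure: the function name `box_stats` and its input are hypothetical, and the quartiles are assumed to use linear interpolation (NumPy's default), which may differ slightly from the plotting software's quantile definition.

```python
import numpy as np


def box_stats(values):
    """Summarize values using the box-plot convention in the caption.

    Hypothetical helper: `values` could be, for example, the ten per-fold
    r^2 values for one cell type. Quartiles use NumPy's default linear
    interpolation (an assumption, not stated in the caption).
    """
    x = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    iqr = q3 - q1  # interquartile range: distance between the hinges

    # Whiskers may reach at most 1.5 x IQR beyond each hinge.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = x[(x >= lower_fence) & (x <= upper_fence)]

    return {
        "median": float(median),
        "q1": float(q1),
        "q3": float(q3),
        # Each whisker ends at the most extreme data point within its fence.
        "lower_whisker": float(inside.min()),
        "upper_whisker": float(inside.max()),
        # Points beyond the whiskers are plotted individually.
        "outliers": sorted(x[(x < lower_fence) | (x > upper_fence)].tolist()),
    }
```

For instance, `box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])` gives hinges at 3.25 and 7.75, whiskers ending at 1 and 9, and the value 100 reported as an individually plotted point.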