Nucleic Acids Res. 2019 Mar 4;47(7):3344–3352. doi: 10.1093/nar/gkz151

© The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Figure 3. Model performance evaluation. (A) Accuracy scores of the random-forest classifier on real labels and random labels (see legend). = 121, = 92. (B) Prediction accuracies on the red alga Cyanidioschyzon merolae when it was discarded from the initial dataset, and on the cyanobacterium Synechocystis sp. PCC6803, with labels derived from (12). The bars represent coding-sequence gene pairs (‘CDS group’), a mixture of tRNA, rRNA or coding-sequence gene pairs (‘mixed group’) and the weighted mean of the two groups (‘overall’). Synechocystis sp. PCC6803 encompasses only CDS gene pairs. The bars show mean ± STD. (C) Type I error (false-positive) test. (D) Type II error (false-negative) test. A total of 19 errors were introduced into the labels, and the prediction pipeline was repeated for each error rate. The same analysis was carried out on random labels. All accuracy scores were calculated as the average accuracy over ten bootstrap-trained samples.
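The label-error tests in panels (C) and (D) and the bootstrap averaging can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names, the binary 0/1 label encoding, the out-of-bag evaluation and the caller-supplied `fit_predict` stand-in for the random-forest classifier are all assumptions made for illustration.

```python
import random
from statistics import mean


def inject_label_errors(labels, n_errors, error_type, rng):
    """Return a copy of binary labels with n_errors labels flipped.

    Assumption: error_type "I" turns negatives (0) into positives (1),
    i.e. false-positive labels; "II" turns positives into negatives.
    """
    source = 0 if error_type == "I" else 1
    candidates = [i for i, y in enumerate(labels) if y == source]
    if n_errors > len(candidates):
        raise ValueError("not enough labels of the required class to flip")
    noisy = list(labels)
    for i in rng.sample(candidates, n_errors):
        noisy[i] = 1 - source
    return noisy


def bootstrap_mean_accuracy(features, labels, fit_predict, n_boot, rng):
    """Average out-of-bag accuracy over n_boot bootstrap resamples.

    fit_predict(train_x, train_y) is a hypothetical hook that trains any
    classifier and returns a function mapping one feature to a label.
    """
    n = len(labels)
    scores = []
    for _ in range(n_boot):
        train = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        chosen = set(train)
        test = [i for i in range(n) if i not in chosen]  # out-of-bag items
        if not test:
            continue
        predict = fit_predict([features[i] for i in train],
                              [labels[i] for i in train])
        hits = sum(predict(features[i]) == labels[i] for i in test)
        scores.append(hits / len(test))
    return mean(scores)
```

Under these assumptions, one would call `inject_label_errors` with an increasing `n_errors` (up to 19 in the figure), retrain on the noisy labels each time, and plot the resulting `bootstrap_mean_accuracy` with `n_boot=10` against the error rate.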