Figure - PMC


2002 Oct;12(10):1466–1482. doi: 10.1101/gr.331902


Copyright © 2002, Cold Spring Harbor Laboratory Press


(A) Flow chart of the procedure for searching for RP pseudogenes in the human genome. RP and ΨG denote “ribosomal protein” and “pseudogene”, respectively; S-W denotes “Smith-Waterman”. The steps are as follows: (1) Six-frame BLAST run searching for RP homologies in the human genome. (2) Merging and extension. BLAST hits were merged and extended on both sides to match the length of the RP peptide sequence. (3) Smith-Waterman realignment. Extended homologies were realigned with the RP sequence. (4) Comparison with Ensembl annotation. Five RPL41 pseudogenes from Ensembl were added to the set. A total of 2536 RP genes or pseudogenes were identified. (5) Checking for long gaps. Homology sequences that contained no long gaps (>60 bp) were labeled “intact processed pseudogenes” if they were longer than 70% of the full-length RP sequence; otherwise they were labeled “pseudogenic fragments”. (6) Comparison with GenBank and cytogenetic mappings. For those RP homologies that contained long gaps (>60 bp), the sequences were compared with the RP exon structure from GenBank, and their chromosomal locations were checked against cytogenetic mapping. These homology sequences were classified as functional RP genes, duplicated RP genes, or “disrupted processed pseudogenes”. The last category comprises processed pseudogenes whose sequences were interrupted by retrotransposons. (B) Schematic graph describing the considerations in merging two adjacent RP matches, M1 and M2. (c₁₁, c₁₂) and (c₂₁, c₂₂) are the chromosomal coordinates of M1 and M2; (q₁₁, q₁₂) and (q₂₁, q₂₂) are the corresponding regions on the query RP protein that they match.
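
The decision rule in step (5) of panel A can be illustrated with a short sketch. This is not the authors' code: the `Homology` record, its field names, and the `classify` function are hypothetical, and only the two thresholds (60 bp and 70%) come from the caption. The caption does not say how a gap of exactly 60 bp is handled, so the sketch assumes it does not count as long. It also assumes that matched length and full length are measured in the same units.

```python
from dataclasses import dataclass

LONG_GAP_BP = 60      # gap threshold stated in the caption
MIN_COVERAGE = 0.70   # fraction of the full-length RP sequence stated in the caption


@dataclass
class Homology:
    """Hypothetical record for one realigned RP homology."""
    longest_gap_bp: int   # longest gap in the alignment, in base pairs
    matched_length: int   # length of the homology (assumed same units as full_length)
    full_length: int      # length of the full-length RP sequence


def classify(h: Homology) -> str:
    # Homologies with a long gap go on to step (6), where they are compared
    # with GenBank exon structure and cytogenetic mapping.
    # Assumption: a gap of exactly 60 bp is not treated as long.
    if h.longest_gap_bp > LONG_GAP_BP:
        return "long-gap candidate (step 6)"
    # Without long gaps, the label depends on coverage of the RP sequence.
    coverage = h.matched_length / h.full_length
    if coverage > MIN_COVERAGE:
        return "intact processed pseudogene"
    return "pseudogenic fragment"
```

For example, under these assumptions `classify(Homology(longest_gap_bp=12, matched_length=180, full_length=200))` yields the "intact processed pseudogene" label, because the longest gap is below 60 bp and the homology covers 90% of the full-length sequence.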
