Bioinformation. 2012 Dec 19;8(25):1265–1270. doi: 10.6026/97320630081265

Phylogenetic analysis of uroporphyrinogen III synthase (UROS) gene

Abjal Pasha Shaik 1,$,*, Abbas H Alsaeed 1,$, Asma Sultana 2,$
PMCID: PMC3532012  PMID: 23275732

Abstract

The uroporphyrinogen III synthase (UROS) enzyme (also known as hydroxymethylbilane hydro-lyase) catalyzes the cyclization of hydroxymethylbilane to uroporphyrinogen III during heme biosynthesis. A deficiency of this enzyme is associated with the very rare Gunther's disease, or congenital erythropoietic porphyria, an autosomal recessive inborn error of metabolism. The current study investigated the possible role of UROS (Homo sapiens [EC: 4.2.1.75; 265 aa; 1371 bp mRNA; Entrez Pubmed ref NP_000366.1, NM_000375.2]) in evolution by studying the phylogenetic relationship and divergence of this gene using computational methods. UROS protein sequences from various taxa were retrieved from the GenBank database and compared using Clustal-W (multiple sequence alignment) with default settings, and a first-pass phylogenetic tree was built using the neighbor-joining method as implemented in DELTA BLAST (version 2.2.27+). A total of 163 BLAST hits were found for the uroporphyrinogen III synthase query sequence, and these hits showed a putative conserved domain of the HemD superfamily (as on 14th Nov 2012). We then narrowed the set by manually removing proteins that were not UROS sequences and sequences belonging to phyla other than Chordata. A repeat phylogenetic analysis of 39 taxa was performed using PhyML and TreeDyn software to confirm that UROS is a highly conserved protein, with approximately 85% conserved sequences in almost all chordate taxa, emphasizing its importance in heme synthesis.

Keywords: Uroporphyrinogen III synthase, Protein sequences, Phylogeny, Sequence alignment

Background

Congenital erythropoietic porphyria (CEP), or Gunther's disease, is a rare autosomal recessive disorder caused by a deficiency of uroporphyrinogen III synthase (UROS), the fourth enzyme in heme biosynthesis. Only about 150 cases of CEP have been reported to date [14]. CEP symptoms are heterogeneous, ranging from severe hemolytic anemia in utero to milder, later-onset forms with skin lesions due to cutaneous photosensitivity in adult life [4]. The deficiency of functional UROS causes porphyrins to build up to toxic levels in red blood cells. The excess porphyrins can then accumulate in the skin, causing oversensitivity to sunlight [4, 5]. UROS is central to the synthesis of heme, a cyclic tetrapyrrole that contains a centrally chelated Fe atom and functions in oxygen transport. The human UROS gene is located on chromosome 10 (10q26.2; Cytogenetic Location: 10q25.2-q26.3; EC 4.2.1.75; 1371 bp mRNA; Entrez Pubmed ref NP_000366.1, NM_000375.2) and encodes a 265 amino acid enzyme that catalyzes the cyclization of the linear tetrapyrrole hydroxymethylbilane to the macrocyclic uroporphyrinogen III, which is eventually converted to heme. Most organisms can synthesize their own heme through this largely conserved metabolic pathway [4, 6, 7]. The UROS enzyme is localized in the cytosol and plays a critical part in the production of heme, an essential prosthetic group in many cellular reactions in prokaryotes and eukaryotes.

The presence of this pathway across Bacteria, Archaea, and Eukarya suggests that heme has performed a central function in the evolution of life. The acquisition of heme by eukaryotes is one of the most important events in cellular evolution, and any interference with its synthesis can have dire consequences for the survival of these organisms. Because of the inherent role of UROS in the synthesis of heme, and considering its role in many other metabolic pathways in the cell, we aimed to determine whether this gene varies among species over the course of evolution by performing a phylogenetic analysis of published UROS protein sequences.

Methodology

Data Set, Sequence Alignment and Construction of Phylogenetic Tree:

The GenBank database [8] was queried to retrieve all available sequences of the UROS protein, which were saved in FASTA format. These sequences were then aligned with the Clustal W [9] algorithm using default parameters. The first-pass phylogenetic tree was constructed with the Neighbour Joining [10] method (maximum sequence difference of 0.85) from the pairwise alignments between the query and the database sequences found by the Domain Enhanced Lookup Time Accelerated Basic Local Alignment Search Tool [DELTA BLAST] [11]. The evolutionary distance between two retrieved sequences, modeled as the expected fraction of amino acid substitutions per site given the fraction of mismatched amino acids in the aligned region, was computed by the software using the Grishin method [12]. After creating the first-pass tree from the DELTA BLAST results, we carried out a purpose-built phylogenetic analysis using the Phylogeny.fr software. Sequences were aligned with MUSCLE (v3.7) [13, 14] configured for highest accuracy (MUSCLE with default settings). After alignment, ambiguous regions (i.e. those containing gaps and/or poorly aligned positions) were removed with Gblocks (v0.91b) using the following parameters: minimum length of a block after gap cleaning: 10; no gap positions allowed in the final alignment; all segments with more than 4 contiguous nonconserved positions rejected; and 85% minimum number of sequences for a flank position [15–17]. The phylogenetic tree was reconstructed using the maximum likelihood method implemented in the PhyML program (v3.0 aLRT) [13, 18]. The Jones-Taylor-Thornton (JTT) substitution model was selected, assuming an estimated proportion of invariant sites (0.197) and 4 gamma-distributed rate categories to account for rate heterogeneity across sites. The gamma shape parameter was estimated directly from the data (gamma=1.199). The reliability of internal branches was assessed using the aLRT test (SH-Like). The graphical representation and editing of the phylogenetic tree were performed with TreeDyn (v198.3) [19].
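As an illustration of the distance step described above, the following minimal Python sketch converts the fraction of mismatched residues between two aligned sequences into an estimated number of substitutions per site. It assumes one commonly cited form of Grishin's relation, q = ln(1 + 2d) / (2d), where q is the fraction of identical residues and d the distance; the text does not state which variant the BLAST software applied, so this choice is an assumption. The function names p_distance and grishin_distance and the short peptide strings are hypothetical and are not taken from the study.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of mismatched residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def grishin_distance(p, iterations=200):
    """Solve ln(1 + 2d) / (2d) = 1 - p for d by bisection.

    ASSUMPTION: this form of Grishin's relation is used for illustration only;
    the source text does not specify the exact variant.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    if p == 0.0:
        return 0.0
    q = 1.0 - p

    def identity_fraction(d):
        # Expected fraction of identical residues at distance d (decreasing in d).
        return math.log1p(2.0 * d) / (2.0 * d)

    lo, hi = 0.0, 1.0
    while identity_fraction(hi) > q:  # widen the bracket until it contains the root
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if identity_fraction(mid) > q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical toy peptides, not real UROS sequences.
p = p_distance("MKVLA", "MKILA")
print(p, grishin_distance(p))
```

The corrected distance is always larger than the raw mismatch fraction, because it accounts for multiple substitutions at the same site.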

Results

From the NCBI GenBank database, 163 sequences of UROS covering the putative conserved HemD domain were used to construct a first-pass phylogenetic tree. This tree, however, contained some repetitive and unrelated sequences, which were deleted. The short-listed sequences belonged to Porifera, Arthropoda, Echinodermata, Tunicates, Cephalochordates, Bacteria, Cnidaria, Fishes, Amphibia, Reptiles, Aves and Mammalia. The accession information for these sequences is available in Table 1 (see supplementary material). Analysis of the sequences revealed a high degree of similarity in the UROS enzyme among many of the sequences selected for the phylogeny reconstruction. Putative conserved domains were observed in many taxa at the HemD region (Figure 1) [20]. The actual alignment was detected with superfamily member pfam02602 (E-value: 3.02e-41). BLAST produced 163 hits (Figure 2); these sequences were screened manually, and only those related to the sequence in question (UROS) from different taxa were retained for further analyses. This produced a total of 39 sequences from 31 taxa. The multiple sequence alignment of these short-listed sequences is presented in Figure 3. Using the PhyML program, a tree was constructed for these sequences; the results are presented in Figure 4.
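One way to quantify the degree of conservation in an alignment such as the one described above is to score each column by the share of sequences that carry its most common residue. The minimal Python sketch below does this and reports the fraction of columns meeting a threshold. It is not the procedure used in the study; the 0.85 default only echoes the 85% figure quoted in the text, and the function names and the toy alignment are hypothetical.

```python
from collections import Counter


def column_conservation(alignment, gap="-"):
    """Per-column share of sequences carrying the most common residue.

    Gaps never count as the shared residue, so gapped columns score lower.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    n = len(alignment)
    scores = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / n)
    return scores


def fraction_conserved(alignment, threshold=0.85):
    """Fraction of columns whose conservation score meets the threshold."""
    scores = column_conservation(alignment)
    return sum(score >= threshold for score in scores) / len(scores)


# Hypothetical toy alignment, not real UROS sequences.
toy = ["MKVLA", "MKVLA", "MKILA", "MK-LA"]
print(column_conservation(toy))
print(fraction_conserved(toy))
```

Applied to a real MUSCLE alignment, the same two functions would take the aligned sequences as a list of equal-length strings.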

Figure 1.

Putative sequence of UROS in superfamilies; UROS (HemD) catalyzes the production of uroporphyrinogen-III, the fourth step in the biosynthesis of heme. This ubiquitous enzyme is present in eukaryotes, bacteria and archaea. Cd Length: 239; Bit Score: 171.72; E-value: 5.15e-52; http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=B6X0K5J901R&mode=all

Figure 2.

First-pass phylogenetic tree constructed by multiple alignment using BLAST pairwise alignments. Results are presented using taxonomic names [163 hits].

Figure 3.

MUSCLE (3.7) multiple sequence alignment [CLUSTAL format] of short-listed sequences [39 taxa]. Similar residues are colored as the most conserved one (according to BLOSUM62).

Figure 4.

Phylogenetic tree using short-listed sequences [cladogram] built using the PhyML software [39 taxa].

Discussion

The evolutionary relationship of the UROS enzyme in various species, taxa and phyla was evaluated using computational phylogenetics to identify similar genes in the short-listed organisms. A mix of algorithms and programs was used to construct the phylogenetic tree [21, 22]; the neighbor-joining method was used to calculate genetic distance and ClustalW to create trees based on multiple sequences. The JTT matrix method was used to generate mutation data matrices from protein sequences, and the sequence sets were clustered at the 85% identity level. In addition, the sequences were aligned, and the observed amino acid exchanges were tallied [23]. Following this, the final phylogenetic tree was constructed [18]. We used the MUSCLE (multiple sequence comparison by log-expectation) method to achieve the highest alignment scores; four benchmarks showed that MUSCLE achieved the highest ranking of any method available at the time of its publication. Using Gblocks, poorly aligned positions and divergent regions of the sequences could be eliminated; these positions may not be homologous or may have been saturated by multiple substitutions [24, 25]. This program also processes alignments very quickly and reduced the need to edit multiple alignments manually. The PhyML software was used because it has been shown, using simulated data, to be at least as accurate as, and slightly faster than, other existing phylogeny programs. The DELTA BLAST algorithm, which uses a heuristic method to identify homologous sequences, helped to produce high-scoring alignments between the query and database sequences. We used BLAST as a first-pass sequence alignment tool to narrow down the target to the most relevant sequences.
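The neighbor-joining step mentioned above can be sketched in a few lines of Python. This is a minimal textbook version of the algorithm, shown only to make the clustering logic concrete; it is not the implementation used by the BLAST tree tool, and the function name neighbor_joining, the Newick formatting, and the five-taxon distance matrix are hypothetical illustrations rather than data from the study.

```python
def neighbor_joining(names, dist):
    """Build an unrooted tree from a symmetric distance matrix.

    names: list of unique leaf labels; dist: list of lists of distances.
    Returns a Newick string with a trifurcation at the top level.
    """
    if len(names) < 3:
        raise ValueError("neighbor joining needs at least three taxa")
    nodes = list(names)
    # Distances keyed by node label; internal nodes are labelled by their subtree.
    d = {(a, b): dist[i][j]
         for i, a in enumerate(names) for j, b in enumerate(names)}
    while len(nodes) > 2:
        n = len(nodes)
        total = {a: sum(d[a, b] for b in nodes) for a in nodes}
        # Choose the pair that minimises the Q criterion.
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a, b] - total[a] - total[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to their new parent node.
        len_a = 0.5 * d[a, b] + (total[a] - total[b]) / (2 * (n - 2))
        len_b = d[a, b] - len_a
        new = f"({a}:{len_a:.4f},{b}:{len_b:.4f})"
        # Distances from the new node to every remaining node.
        for c in nodes:
            if c not in (a, b):
                d[new, c] = d[c, new] = 0.5 * (d[a, c] + d[b, c] - d[a, b])
        d[new, new] = 0.0
        nodes = [c for c in nodes if c not in (a, b)] + [new]
    rest, last = nodes  # 'last' is always the most recently created internal node
    return f"{last[:-1]},{rest}:{d[rest, last]:.4f});"


# Hypothetical five-taxon distance matrix, not derived from UROS data.
taxa = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(neighbor_joining(taxa, matrix))
```

In practice the input matrix would hold corrected pairwise distances between the aligned protein sequences, and the resulting Newick string could be passed to a tree viewer.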

The phylogenetic tree indicated that the UROS protein is conserved and plays an important role in organismal evolution (Figure 3 & Figure 4). It is interesting to note that the conserved regions in Homo sapiens are similar to those found in other organisms that carry this gene. The presence of UROS in the major groups of organisms indicates that it is crucial for cellular physiology, and its high conservation at certain domains indicates that its function is preserved. In conclusion, the evolutionary relationship of the UROS gene was established based on sequence alignment, conserved sequences and phylogenetic trees. The published protein sequence data showed that the sequences are highly conserved, especially at certain domains. Human sequences consistently clustered with their mammalian orthologs, which clearly indicates the importance of these genes in evolution [26]. The phylogenetic reconstruction of the metabolic pathways of many organisms is one of the major goals of genomics. Reconstructions made through comparative genomics, together with results from experiments on model systems, help in understanding the biochemical diversity of life. Thus, analysis of phyletic patterns re-emphasizes the importance of certain metabolic enzymes in evolution.

Supplementary material

Data 1
97320630081265S1.pdf (14.4KB, pdf)

Acknowledgments

The authors would like to extend their appreciation to the Research Centre, College of Applied Medical Sciences and Deanship of Scientific Research at King Saud University for funding this research.

Footnotes

Citation: Shaik et al, Bioinformation 8(25): 1265-1270 (2012)

Articles from Bioinformation are provided here courtesy of Biomedical Informatics Publishing Group
