Abstract
The four members of the recently identified suppressor of cytokine signaling family (SOCS-1, SOCS-2, SOCS-3, and CIS, where CIS is cytokine-inducible SH2-containing protein) appear, by various means, to negatively regulate cytokine signal transduction. Structurally, the SOCS proteins are composed of an N-terminal region of variable length and amino acid composition, a central SH2 domain, and a previously unrecognized C-terminal motif that we have called the SOCS box. By using the SOCS box amino acid sequence consensus, we have searched DNA databases and have identified a further 16 proteins that contain this motif. These proteins fall into five classes based on the protein motifs found N-terminal of the SOCS box. In addition to four new SOCS proteins (SOCS-4 to SOCS-7) containing an SH2 domain and a SOCS box, we describe three new families of proteins that contain either WD-40 repeats (WSB-1 and -2), SPRY domains (SSB-1 to -3), or ankyrin repeats (ASB-1 to -3) N-terminal of the SOCS box. In addition, we show that a class of small GTPases also contains a SOCS box. The expression of representative members of each class of proteins differs markedly, as does the regulation of expression by cytokines. The function of the WSB, SSB, and ASB protein families remains to be determined.
Cytokines act by binding to and inducing dimerization of members of the hemopoietin receptor family expressed on the surface of responsive cells (1). Although the cytoplasmic proteins that then transduce the signal are relatively well-defined and include the Janus kinase (JAK) family of kinases and signal transducers and activators of transcription (STAT) transcription factors (2, 3), the proteins involved in limiting signal transduction are not well characterized.
The four known members of the suppressor of cytokine signaling (SOCS) family (CIS, SOCS-1/SSI-1/JAB-1, SOCS-2, and SOCS-3, where CIS is cytokine-inducible SH2-containing protein) represent a family of negative regulators of cytokine signal transduction (4–9). The SOCS proteins appear to form part of a classical negative feedback loop that regulates cytokine signal transduction. Transcription of each of the SOCS genes occurs rapidly in vitro and in vivo in response to cytokines, and once produced, the various members of the SOCS family appear to inhibit signaling in different ways. For SOCS-1, inhibition of signal transduction appears to occur by binding to and inhibiting the catalytic activity of members of the JAK family of cytoplasmic kinases (4–6), while CIS appears to act by competing with signaling molecules such as the STATs for binding to phosphorylated receptor cytoplasmic domains (7, 9).
The SOCS proteins share structural similarities. Each has an N-terminal region of variable length and highly variable amino acid sequence, a central SH2 domain, and a striking region of C-terminal homology that we designated the SOCS box (4). Given the sequence similarity evident in the SOCS box of the four SOCS proteins and its conserved position at the C terminus of each protein, it seems likely that this domain has a conserved and important function. To date, however, the role of each part of the protein in inhibiting signal transduction is far from clear, although regions in addition to the SH2 domain appear to be required (5).
We sought to determine whether any other proteins contain a C-terminal SOCS box. To identify such proteins, we have searched various DNA databases with a SOCS box consensus sequence and have found expressed sequence tags encoding 20 proteins that contain this motif, including the four previously described SOCS proteins. Although one class, now composed of eight proteins (CIS and SOCS-1 to SOCS-7), has a central SH2 domain, others contain WD-40 repeats, a SPRY domain, ankyrin repeats, or a GTPase domain. We have named these new protein families WD-40-repeat-containing proteins with a SOCS box (WSB-1 and WSB-2), SPRY domain-containing proteins with a SOCS box (SSB-1 to SSB-3), or ankyrin-repeat-containing proteins with a SOCS box (ASB-1 to ASB-3).
MATERIALS AND METHODS
Database Searches.
The National Center for Biotechnology Information genetic sequence database (GenBank), which encompasses the major database of expressed sequence tags (ESTs) and The Institute of Genetic Research database of human ESTs were searched for sequences with similarity to a consensus SOCS box sequence by using the tfasta and motif/pattern algorithms (10, 11). By using the software package SRS (12), ESTs that exhibited similarity to the SOCS box (and their partners derived from sequencing the other end of cDNAs) were retrieved and assembled into contigs by using autoassembler (Applied Biosystems). Consensus nucleotide sequences derived from overlapping ESTs were then used to search the various databases with blastn (13). Again, positive ESTs were retrieved and added to the contig. This process was repeated until no additional ESTs could be recovered. Final consensus nucleotide sequences were then translated by using sequence navigator (Applied Biosystems). A summary of the ESTs encoding SOCS-4 to SOCS-7, WSB-1 and WSB-2, SSB-1 to SSB-3, and ASB-1 to ASB-3 can be found at http://www.wehi.edu.au./SOCS.
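The iterative retrieval described above is in effect a fixed-point search: the contig is used as a query, any newly matching ESTs are added, and the cycle stops when a search returns nothing new. The following is a minimal Python sketch of that loop only. The `find_overlapping` callable is a hypothetical stand-in for a blastn search against an EST database, and the identifiers in `TOY_OVERLAPS` are invented for illustration. It does not reproduce the assembly or translation steps performed with the commercial software.

```python
from typing import Callable, Dict, Iterable, Set


def extend_contig(seed_ests: Iterable[str],
                  find_overlapping: Callable[[Set[str]], Set[str]]) -> Set[str]:
    """Grow a set of EST identifiers until a search recovers nothing new."""
    contig = set(seed_ests)
    while True:
        # One search round: query the database with the current contig.
        hits = find_overlapping(contig)
        new_ests = hits - contig
        if not new_ests:
            # No additional ESTs recovered, so the process has converged.
            return contig
        contig |= new_ests


# Hypothetical overlap table standing in for a real EST database search.
TOY_OVERLAPS: Dict[str, Set[str]] = {
    "est_a": {"est_b"},
    "est_b": {"est_a", "est_c"},
    "est_c": {"est_b"},
    "est_x": {"est_y"},
    "est_y": {"est_x"},
}


def toy_search(contig: Set[str]) -> Set[str]:
    """Return every EST that overlaps any member of the current contig."""
    hits: Set[str] = set()
    for est in contig:
        hits |= TOY_OVERLAPS.get(est, set())
    return hits


if __name__ == "__main__":
    print(sorted(extend_contig({"est_a"}, toy_search)))
```

Starting from a single seed, the loop in this toy case gathers the ESTs that are connected to it through overlaps and leaves unrelated ESTs out of the contig.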
cDNA Cloning.
On the basis of the consensus sequences derived from overlapping ESTs, oligonucleotides were designed that were specific for various members of the SOCS family. As described, oligonucleotides were labeled and used to screen commercially available genomic and cDNA libraries cloned into λ bacteriophage (4). Genomic and/or cDNA clones covering the entire coding region of mouse SOCS-5 (GenBank accession no. AF033187), mouse WSB-1 (GenBank accession no. AF033186), and mouse WSB-2 (GenBank accession no. AF033188) have been isolated. The entire gene for SSB-2 is present on the human 12p13 bacterial artificial chromosome (ref. 14; GenBank accession no. HSU47924) and the mouse chromosome 6 bacterial artificial chromosome (GenBank accession no. AC002393). Partial cDNAs for mouse SOCS-4, SOCS-6, SOCS-7, SSB-1, ASB-1, and ASB-2 have also been isolated.
Northern Blots.
Mice were injected with interleukin 6 (IL-6) and their livers were removed as described (4). Northern blots were performed as described (4). The hybridization probes used were (i) the entire coding region of the mouse SOCS-1 cDNA, (ii) a 1,058-bp PCR product derived from the coding region of SOCS-5 upstream of the SH2 domain, (iii) the entire coding region of the mouse WSB-2 cDNA, (iv) a 790-bp PCR product derived from the coding region of a partial ASB-1 cDNA, and (v) a 1,200-bp PstI fragment of the chicken glyceraldehyde 3-phosphate dehydrogenase (GAPDH) cDNA.
RESULTS
Four members of the SOCS protein family have been previously identified. These are SOCS-1, SOCS-2, SOCS-3, and CIS (4–9). Each contains a poorly conserved N-terminal region, a central SH2 domain, and a previously unrecognized protein motif at the C terminus that we have named the SOCS box (4). To isolate further members of this protein family, we searched various DNA databases with the amino acid sequence corresponding to conserved residues of the SOCS box. This search proved to be particularly fruitful, because it revealed the presence of human and mouse ESTs encoding 16 additional proteins, each with a SOCS box at the C terminus (Fig. 1 A and H). The SOCS box found in these proteins occurred close to the C-terminal end of each protein and was composed of two blocks of well-conserved residues separated by between 2 and 10 nonconserved residues.
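The arrangement just described (two conserved blocks separated by 2 to 10 nonconserved residues, lying close to the C terminus) can be expressed as a pattern search. The Python sketch below is illustrative only: the block patterns `WWWW` and `HHHH` are artificial placeholders and not the SOCS box consensus, and the `max_tail` limit of 10 residues is an assumed reading of "close to the C-terminal end". It is not the tfasta or motif/pattern search used in this study.

```python
import re


def build_cterm_motif(block1: str, block2: str,
                      min_gap: int = 2, max_gap: int = 10,
                      max_tail: int = 10) -> re.Pattern:
    """Compile a pattern: block1, a short variable gap, block2, near the end."""
    gap = rf".{{{min_gap},{max_gap}}}"   # nonconserved spacer residues
    tail = rf".{{0,{max_tail}}}$"        # at most max_tail residues to the C terminus
    return re.compile(
        f"(?P<block1>{block1})(?P<gap>{gap})(?P<block2>{block2}){tail}"
    )


# Placeholder block patterns, chosen to be obviously artificial.
TOY_PATTERN = build_cterm_motif("WWWW", "HHHH")

if __name__ == "__main__":
    toy_protein = "MSTQ" + "WWWW" + "GSGSG" + "HHHH" + "KE"
    match = TOY_PATTERN.search(toy_protein)
    if match:
        print("gap residues:", match.group("gap"))
```

Anchoring the pattern to the end of the sequence reflects the observation that the motif was found only at or near the C terminus. A real search would substitute residue classes derived from an alignment for the placeholder blocks.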
By using the sequence information derived from ESTs, cDNAs encoding all or part of these new proteins were isolated (Fig. 1 A–H). Further analysis of contigs derived from ESTs and cDNAs revealed that these proteins could be placed into five structural groups. The first group comprises the seven SOCS proteins and CIS, which have an SH2 domain (15) N-terminal of the SOCS box. Three groups are new classes of proteins: those with WD-40 repeats (16) N-terminal of the SOCS box (WSB proteins), those with SPRY domains (17) N-terminal of the SOCS box (SSB proteins), and those with ankyrin repeats (18) N-terminal of the SOCS box (ASB proteins; Fig. 1 A–H). The fifth group consists of two small GTPases, rar and ras-like GTPase, which also contain a SOCS box (19). Two further ESTs are of unknown structural class.
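The counts given in the text can be cross-checked with a simple tally. The Python sketch below records each SOCS box-containing protein named in this paper against the domain found N-terminal of its SOCS box. The two entries labeled "unclassified EST" are placeholder names for the two ESTs of unknown structural class, which are not named in the text.

```python
from collections import Counter

# Protein name -> motif found N-terminal of the SOCS box, as stated in the text.
SOCS_BOX_PROTEINS = {
    "CIS": "SH2", "SOCS-1": "SH2", "SOCS-2": "SH2", "SOCS-3": "SH2",
    "SOCS-4": "SH2", "SOCS-5": "SH2", "SOCS-6": "SH2", "SOCS-7": "SH2",
    "WSB-1": "WD-40", "WSB-2": "WD-40",
    "SSB-1": "SPRY", "SSB-2": "SPRY", "SSB-3": "SPRY",
    "ASB-1": "ankyrin", "ASB-2": "ankyrin", "ASB-3": "ankyrin",
    "rar": "GTPase", "ras-like GTPase": "GTPase",
    # Placeholder names for the two ESTs of unknown structural class.
    "unclassified EST 1": "unknown", "unclassified EST 2": "unknown",
}

# The four family members described before this study.
PREVIOUSLY_KNOWN = {"CIS", "SOCS-1", "SOCS-2", "SOCS-3"}

class_counts = Counter(SOCS_BOX_PROTEINS.values())
new_proteins = set(SOCS_BOX_PROTEINS) - PREVIOUSLY_KNOWN
structural_classes = set(class_counts) - {"unknown"}

if __name__ == "__main__":
    print("total proteins:", len(SOCS_BOX_PROTEINS))
    print("newly identified:", len(new_proteins))
    print("structural classes:", len(structural_classes))
    print(dict(class_counts))
```

The tally agrees with the figures quoted elsewhere in the paper: 20 proteins in total, 16 of them new, falling into five structural classes once the two unclassified ESTs are set aside.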
In total, eight SOCS proteins have now been identified. These are the previously described CIS, SOCS-1 (SOCS-1/SSI-1/JAB-1), SOCS-2 (SOCS-2/SSI-2), and SOCS-3 (SOCS-3/SSI-3), in addition to four novel members of this family, SOCS-4 to SOCS-7 (Fig. 1 A–C, G, and H and refs. 4–9). To date we have isolated full-length cDNAs of mouse SOCS-5 and have partial clones encoding mouse SOCS-4, SOCS-6, and SOCS-7. The human homologue of SOCS-6 was also recently identified by Yoshimura and colleagues and was given the name CIS4 (GenBank accession no. AB006968). Interestingly, a partial cDNA clone of human SOCS-7 was also recently described on the basis of its ability to interact with the adaptor molecules Nck and Ash and with phospholipase C-γ (ref. 20 and GenBank accession no. AB005216).
Analysis of primary amino acid sequence and genomic structure suggested that pairs of these proteins (CIS and SOCS-2, SOCS-1 and SOCS-3, SOCS-4 and SOCS-5, and SOCS-6 and SOCS-7) are more closely related to each other than to other SOCS proteins (Fig. 1 B, C, F, and G). Indeed, the SH2 domains of SOCS-4 and SOCS-5 are almost identical (Fig. 1C), and compared with the first four SOCS proteins described, SOCS-4, SOCS-5, SOCS-6, and SOCS-7 have an extensive, though less well conserved, N-terminal region preceding their SH2 domains (Fig. 1B). Analysis of the sequence of N-terminal regions of these proteins did not reveal any recognizable motifs.
Two proteins with WD-40 repeats N-terminal of a SOCS box were identified and termed WSB-1 and WSB-2. As with the SOCS proteins, pairs of the WSB proteins appear to be closely related. Full-length cDNAs of mouse WSB-1 and WSB-2 were isolated and shown to encode proteins containing eight WD-40 repeats (Fig. 1 A, B, D, G, and H). WSB-1 and WSB-2 share 65% amino acid similarity.
Three proteins have been found with a SPRY domain N-terminal of a SOCS box and termed SSB-1 to SSB-3. Although only partial sequence of SSB-1 and SSB-3 could be obtained from ESTs, the entire SSB-2 gene was recognized as an ORF during sequencing of bacterial artificial chromosomes from human chromosome 12p13 and the syntenic region of mouse chromosome 6 (14). SSB-2 is encoded by a gene that lies within a few hundred base pairs of the 3′ end of the triose phosphate isomerase (TPI) gene but that is encoded on the opposite strand to TPI (14). SSB-2 appears most closely related to SSB-1, a family member for which we have isolated only partial cDNAs (Fig. 1 A, B, D, F, and H).
Three mammalian proteins with multiple ankyrin repeats N-terminal of a SOCS box were identified and named ASB-1 to ASB-3 (Fig. 1 A and E–H). Although full-length cDNAs encoding these proteins have yet to be isolated, analysis of the sequence of gene M60-7 of the Caenorhabditis elegans cosmid M60 (GenBank accession no. CELM60) revealed that it encodes a full-length ASB gene (see also ref. 19).
To determine the expression patterns of SOCS box-containing proteins, we examined mRNA expression for representative members of the SOCS, WSB, and ASB protein families. As shown (ref. 4 and Fig. 2), Northern blot analysis revealed a single abundant 1.4-kb species of SOCS-1 mRNA in the thymus, with lower levels in several other adult tissues. In contrast, SOCS-5 expression was more widespread and evident as two mRNA species of 3.5 and 4.5 kb, the relative abundances of which varied between tissues (Fig. 2). Comparatively high levels of a 2.0-kb WSB-2 transcript were observed in all tissues examined, but ASB-1 mRNA was less abundant, with a 5-kb species restricted primarily to the brain, spleen, bone shaft, and bone marrow (Fig. 2). During embryonic development, expression of SOCS-1, SOCS-5, and WSB-2 was clearly evident in Northern blots of RNA extracted from mid- to late-gestation embryos and yolk sac (Fig. 2). At similar autoradiographic exposure times, ASB-1 expression was not detected in embryonic tissues (Fig. 2), although extended exposures did suggest some low-level expression of ASB-1 mRNA during this embryonic period.
Because transcription of the SOCS-1 gene is induced by IL-6 (4), we sought to determine whether levels of mRNA encoding other SOCS box-containing proteins also increased upon cytokine stimulation. In the livers of mice injected with IL-6, SOCS-1 mRNA was detectable after 15 min and decreased to background levels within 4 hr (ref. 4 and Fig. 3). A similar induction has been observed with SOCS-2, SOCS-3, and CIS mRNA (4). SOCS-5 mRNA expression also appeared to be induced in the liver after IL-6 injection; however, the kinetics of appearance differed from that observed for mRNA for the other SOCS proteins, only being detectable 8–12 hr after IL-6 injection (Fig. 3). Consistent with Fig. 2, WSB-2 mRNA appeared to be present constitutively in the liver, with little evidence of regulation by IL-6 (Fig. 3). ASB-1 mRNA was not detected in the liver either before or after injection of IL-6 (Fig. 3).
DISCUSSION
Negative feedback is likely to be important in regulating signal transduction elicited by a wide variety of extracellular signals. The negative regulator of signal transduction SOCS-1 was cloned independently by three groups in three very different ways: functionally by its ability to inhibit IL-6 signal transduction (4), in a yeast two-hybrid screen to find proteins that interacted with JAK2 (5), and on the basis of antigenic similarity to STAT3 (6). SOCS-2, SOCS-3, and CIS, close relatives of SOCS-1, also appear to inhibit signal transduction (refs. 7 and 9 and R.S. and D.M., unpublished observations). To identify other members of this protein family, we searched the DNA databases for cDNAs encoding proteins containing a C-terminal SOCS box. With this strategy, we discovered four additional members of the SOCS protein family, SOCS-4 to SOCS-7. Furthermore, we describe four other classes of proteins that also contain a SOCS box. Unlike the eight SOCS proteins that contain an SH2 domain, these classes of proteins contain either WD-40 motifs (WSB proteins), a SPRY domain (SSB proteins), ankyrin repeats (ASB proteins), or a GTPase domain N-terminal of the SOCS box (Fig. 1). In every case, the SOCS box is located close to or at the C-terminal end of the protein, with no proteins in the database found with an N-terminal or central SOCS box.
WD-40 repeats were originally recognized in the β subunit of G proteins and have since been described in a wide variety of cytoplasmic proteins, many of which are involved in signal transduction (16). Although the function of WD-40 repeats is not clear, they appear to form a β-propeller-like structure and are speculated to be involved in protein–protein interactions. The WD-40 repeats of the β subunit of certain G proteins have, for example, been shown to interact with pleckstrin homology domains of kinases, such as β-adrenergic receptor kinase (21). Ankyrin repeats and SPRY domains have also been found in many proteins and, like WD-40 repeats, have been implicated in protein–protein interactions (17, 18).
SOCS-1 seems to play an important role in regulating signal transduction. Upon binding, cytokines stimulate dimerization of their cognate cell surface receptors. Receptor dimerization brings JAKs, which are bound to the cytoplasmic tails of cytokine receptors, into close proximity, leading to their activation through cross-phosphorylation. In turn, JAKs phosphorylate STATs, leading to their dimerization and migration from the cytoplasm to the nucleus, where they act to increase transcription of target genes. Use of dominant negative STAT3 mutants suggests that this transcription factor is, at least in part, responsible for the rapid increase in mRNA levels for SOCS-1 after cytokine stimulation (6). Once produced, SOCS-1 is thought to bind, via its SH2 domain, to activated JAK molecules and to inhibit the catalytic activity of the kinase (4–6). This represents a classical negative feedback loop. Inactivation of kinases does not appear to represent the only mechanism by which SOCS family members can inhibit signal transduction. CIS, for example, is thought to block STAT5 activation by competing for binding to phosphorylated tyrosine residues within receptor cytoplasmic domains (7, 9).
On the basis of structural considerations, it would not be unexpected if SOCS-4 to SOCS-7, like SOCS-1, SOCS-2, SOCS-3, and CIS, acted to negatively regulate signal transduction. The functions of the WSB, SSB, and ASB proteins are more difficult to define, primarily because the function of the SOCS box is not known. The conservation of the SOCS box at the amino acid sequence level and its C-terminal position within the SOCS, WSB, SSB, and ASB proteins suggest that it will perform a broadly similar function in each protein. Several possibilities for the function of the SOCS box suggest themselves. First, the SOCS box might play a role in suppressing signal transduction, perhaps by inhibiting kinase activity. In this model, the regions N-terminal to the SOCS box (SH2 domains in SOCS proteins, WD-40 motifs in the WSB proteins, SPRY domain in the SSB proteins, and ankyrin repeats in the ASB proteins) might dictate with which kinase each protein interacts, with the SOCS box, in each case, inhibiting kinase activity. Equally, it is possible the SOCS box may act as an adaptor module or, alternatively, may regulate aspects of protein behavior such as intracellular location or stability. If any of the latter possibilities prove correct, the five classes of proteins that contain a SOCS box (SOCSs, WSBs, SSBs, ASBs, and GTPases) might be serving very different functions. Studies to determine the relationship between the structure of the SOCS proteins and their function may shed light on the role of the SOCS box in each class of proteins.
Acknowledgments
We gratefully acknowledge Richard Simpson and Robert Moritz for recombinant mouse IL-6. Tony Kyne, Keith Satterly, and Janice Coventry are thanked for their help with database searching and sequence alignments. This work was supported by the Anti-Cancer Council of Victoria, Melbourne, Australia; Australian Medical Research and Development Corp., Melbourne, Australia; The National Health and Medical Research Council, Canberra, Australia; The J.D. and L. Harris Trust; The National Institutes of Health, Bethesda, Maryland (Grant CA-22556), and the Australian Federal Government Cooperative Research Centres Program.
ABBREVIATIONS
- ASB: ankyrin-repeat-containing protein with a SOCS box
- CIS: cytokine-inducible SH2-containing protein
- EST: expressed sequence tag
- JAK: Janus kinase
- STAT: signal transducer and activator of transcription
- SOCS: suppressor of cytokine signaling
- WSB: WD-40-repeat-containing protein with a SOCS box
- SSB: SPRY domain-containing protein with a SOCS box
Footnotes
Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. AF033186–8).
References
- 1. Nicola N A, editor. Guidebook to Cytokines and Their Receptors. Oxford: Oxford Univ. Press; 1994.
- 2. Darnell J E, Jr, Kerr I M, Stark G R. Science. 1994;264:1415–1421. doi: 10.1126/science.8197455.
- 3. Ihle J N, Kerr I M. Trends Genet. 1995;11:69–74. doi: 10.1016/s0168-9525(00)89000-9.
- 4. Starr R, Willson T A, Viney E M, Murray L J, Rayner J R, Jenkins B J, Gonda T J, Alexander W S, Metcalf D, Nicola N A, Hilton D J. Nature (London) 1997;387:917–921. doi: 10.1038/43206.
- 5. Endo T A, Masuhara M, Yokouchi M, Suzuki R, Mitsui K, Sakamoto H, Ohtsubo M, Misawa H, Kanekura Y, Yoshimura A. Nature (London) 1997;387:921–924. doi: 10.1038/43213.
- 6. Naka T, Narazaki M, Hirata M, Matsumoto T, Minamoto S, Aono A, Nishimoto N, Kajita T, Taga T, Yoshizaki K, Akira S, Kishimoto T. Nature (London) 1997;387:924–929. doi: 10.1038/43219.
- 7. Yoshimura A, Ohkubo T, Kiguchi T, Jenkins N A, Gilbert D J, Copeland N G, Hara T, Miyajima A. EMBO J. 1995;14:2816–2826. doi: 10.1002/j.1460-2075.1995.tb07281.x.
- 8. Minamoto S, Ikegame K, Ueno K, Narazaki M, Naka T, Yamamoto H, Matsumoto T, Saito M, Hosoe S, Kishimoto T. Biochem Biophys Res Commun. 1997;237:79–83. doi: 10.1006/bbrc.1997.7080.
- 9. Matsumoto A, Masuhara M, Mitsui K, Yokouchi M, Ohtsubo M, Misawa H, Miyajima A, Yoshimura A. Blood. 1997;89:3148–3154.
- 10. Pearson W R. Methods Enzymol. 1990;183:63–98. doi: 10.1016/0076-6879(90)83007-v.
- 11. Cockwell K Y, Giles I G. Comp Appl Biosci. 1989;5:227–232. doi: 10.1093/bioinformatics/5.3.227.
- 12. Etzold T, Ulyanov A, Argos P. Methods Enzymol. 1996;266:114–128. doi: 10.1016/s0076-6879(96)66010-8.
- 13. Altschul S F, Gish W, Miller W, Myers E W, Lipman D J. J Mol Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
- 14. Ansari-Lari M A, Shen Y, Munzy D M, Lee W, Gibbs R A. Genome Res. 1997;7:268–280. doi: 10.1101/gr.7.3.268.
- 15. Pawson T. Adv Cancer Res. 1994;64:87–110. doi: 10.1016/s0065-230x(08)60835-0.
- 16. Neer E J, Schmidt C J, Nambudripad R, Smith T F. Nature (London) 1994;371:297–300. doi: 10.1038/371297a0.
- 17. Bork P. Proteins: Struct Funct Genet. 1993;17:363–374. doi: 10.1002/prot.340170405.
- 18. Ponting C, Schultz J, Bork P. Trends Biochem Sci. 1997;22:193–194. doi: 10.1016/s0968-0004(97)01049-9.
- 19. Masuhara M, Sakamoto H, Matsumoto A, Suzuki R, Yasukawa H, Mitsui K, Wakioka T, Tanimura S, Sasaki A, Misawa H, Yokouchi M, Ohtsuba M, Yoshimura A. Biochem Biophys Res Commun. 1997;239:439–446. doi: 10.1006/bbrc.1997.7484.
- 20. Matuoka K, Miki H, Takahashi K, Takenawa T. Biochem Biophys Res Commun. 1997;239:488–492. doi: 10.1006/bbrc.1997.7492.
- 21. Wang D S, Shaw R, Winkelmann J C, Shaw G. Biochem Biophys Res Commun. 1994;203:29–35. doi: 10.1006/bbrc.1994.2144.