Benarafa and Remold-O’Donnell. 10.1073/pnas.0502934102.

Supporting Information

Files in this Data Supplement:

Supporting Table 1
Supporting Table 2
Supporting Figure 9
Supporting Figure 10




Supporting Figure 9

Fig. 9.

Amino acid alignment of human and chicken clade B serpins. Residues are colored according to type: polar uncharged (green), acidic (red), basic (blue), and nonpolar (yellow). Secondary structure elements are indicated at the top and are based on the crystal structure of ovalbumin (1OVA) (3). The 60 conserved positions derived from previous studies (7, 8) that were analyzed in all ov-serpins are indicated at the bottom by circles. By convention, numbering refers to that of α1-antitrypsin (7). The interacting residues identified in the atypical human SERPINB5 crystal structure are K196, T226, K283, E349, Q357, and the 5'-stem H358, K359, D360, and E361 (32). Blue arrowheads indicate the positions of the invariant exon–intron boundaries common to all seven- and eight-exon genes (see Fig. 3C). The orange arrowhead represents the position of the additional boundary found only in eight-exon ov-serpins [the exact position is shown for ovalbumin; it varies by a few residues within the CD loop in other eight-exon genes (data not shown)].





Supporting Figure 10

Fig. 10.

Amino acid alignment of human and zebrafish clade B serpins. Residues are colored according to type: polar uncharged (green), acidic (red), basic (blue), and nonpolar (yellow). Secondary structure elements are indicated at the top and are based on the crystal structure of ovalbumin (1OVA) (3). The 60 conserved positions derived from previous studies (7, 8) that were analyzed in all ov-serpins are indicated at the bottom by circles. By convention, numbering refers to that of α1-antitrypsin (7). The interacting residues identified in the atypical human SERPINB5 crystal structure are K196, T226, K283, E349, Q357, and the 5'-stem H358, K359, D360, and E361 (32).
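
The residue coloring used in Figs. 9 and 10 can be illustrated with a small lookup. This is only a sketch: the four classes and their colors come from the captions, but the assignment of individual amino acids to classes (in particular G, C, Y, and H) is an assumption based on one common textbook grouping and may differ from the grouping used to draw the figures. The names `RESIDUE_CLASSES`, `CLASS_COLORS`, and `residue_color` are hypothetical.

```python
# Class membership is an ASSUMPTION (a common textbook grouping);
# the figures may assign borderline residues (G, C, Y, H) differently.
RESIDUE_CLASSES = {
    "polar uncharged": set("STNQCY"),
    "acidic": set("DE"),
    "basic": set("KRH"),
    "nonpolar": set("AVLIMFWPG"),
}

# Colors as stated in the figure captions.
CLASS_COLORS = {
    "polar uncharged": "green",
    "acidic": "red",
    "basic": "blue",
    "nonpolar": "yellow",
}


def residue_color(residue):
    """Return the caption color for a one-letter residue code.

    Returns None for gap characters or unrecognized symbols.
    """
    code = residue.upper()
    for class_name, members in RESIDUE_CLASSES.items():
        if code in members:
            return CLASS_COLORS[class_name]
    return None


# Residue letters taken from the SERPINB5 positions listed in the captions.
for label in ["K196", "T226", "E349", "Q357", "D360"]:
    print(label, residue_color(label[0]))
```

Under this assumed grouping, the lysines are blue, the glutamate and aspartate red, and the threonine and glutamine green.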





Table 1. Accession numbers of chicken clade B serpins

| Build 1.1 locus name | Common name | Proposed official serpin nomenclature | Accession nos. | Current annotation |
|---|---|---|---|---|
| LOC420894 | | serpinb1 | XP_418980 | Similar to serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 1 |
| LOC420895 | | serpinb6 | XP_418981 | Similar to MGC64421 protein |
| LOC428483 | | serpinb10 | XP_426040* | Similar to heterochromatin-associated protein MENT |
| MENT-1 | MENT | serpinb10b | NP_990228 | Heterochromatin-associated protein MENT |
| LOC420896 | | serpinb2 | XP_418982 | Similar to plasminogen activator inhibitor |
| LOC396058 | Ovalbumin | serpinb14 | NP_990483 | Ovalbumin |
| LOC420897 | Gene Y | serpinb14b | XP_418983 | Similar to ovalbumin-related Y protein, chicken |
| LOC420898 | Gene X | serpinb14c | XP_418984 | Similar to ovalbumin-related Y protein, chicken |
| LOC420899 | | serpinb12 | XP_418985 | Similar to serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 12 |
| LOC420900 | | serpinb5 | XP_418986 | Similar to maspin |

The proposed nomenclature for the chicken clade B serpins follows the rules of the Serpin Gene Nomenclature Committee (2).

*XP_426040 is a predicted sequence, not found in EST databases, that lacks one exon. Using the corresponding sequence of MENT, the missing exon was found in intronic sequence. The exon was also found by using genscan (score of 0.08); it begins at position 1362 and ends at position 1448 in the genomic sequence and encodes 29 amino acids, with phasing compatible with the adjacent exons.

The first 108 amino acids of predicted sequence XP_418984 were excluded from our study. The position of Met109 is identical to that of Met1 of ovalbumin and gene Y and is at the conserved position for translation start site in exon 2.

The first 283 amino acids of predicted XP_418986 were excluded from our study. The position of Met284 is identical to Met1 of mammalian SERPINB5 (maspin) and the ATG encoding it is in the conserved position in exon 2.
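
The exon coordinates given for XP_426040 can be checked against the stated 29 amino acids with simple arithmetic. The sketch below assumes the positions 1362 and 1448 are 1-based and inclusive, which the footnote does not state explicitly; the function names are hypothetical.

```python
def exon_length_bp(start, end):
    """Length in bp of a span with 1-based, inclusive coordinates (assumed)."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def whole_codons(length_bp):
    """Return (number of complete codons, leftover bases) for a span length."""
    return divmod(length_bp, 3)


# Coordinates of the missing exon as given in the footnote to Table 1.
missing_exon_bp = exon_length_bp(1362, 1448)
codons, leftover = whole_codons(missing_exon_bp)
print(missing_exon_bp, codons, leftover)  # 87 29 0
```

With inclusive coordinates the exon is 87 bp long, which is exactly 29 codons with no leftover bases, consistent with the footnote.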





Table 2. Ov-serpin genes in zebrafish (Danio rerio)

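| Zebrafish | Human B1 | Human B5 | Human B6 | Human B2 | Human B10 | Human B12 | Chicken b1 | Chicken b5 | Chicken b6 | Chicken b2 | Chicken b10 | Chicken b12 | Zebrafish ZGC:91981 | Zebrafish CAI20745 | Zebrafish ZGC:64178 | Zebrafish AAQ97848 | Zebrafish ZGC:77645 | Zebrafish ZGC:76926 | Gene structure | P1–P1' | Protein accession no. | Nucleotide accession no. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ZGC:91981 | 57 | 36 | 48 | 43 | 43 | 40 | 56 | 35 | 53 | 45 | 43 | 43 | x | 75 | 47 | 51 | 46 | 36 | Seven-exon | Cys–Met | ZGC:91981 | BC076524 |
| CAI20745 | 51 | 35 | 45 | 41 | 44 | 39 | 52 | 36 | 52 | 46 | 42 | 41 | 75 | x | 46 | 51 | 45 | 34 | Seven-exon | Ser–Ala | CAI20745 | BX248515 |
| ZGC:64178 | 49 | 36 | 51 | 39 | 40 | 41 | 50 | 37 | 52 | 42 | 42 | 40 | 47 | 46 | x | 55 | 48 | 34 | Seven-exon | Arg–Cys | ZGC:64178 | BC053300 |
| AAQ97848 | 58 | 39 | 50 | 44 | 47 | 46 | 56 | 39 | 55 | 47 | 49 | 44 | 51 | 51 | 55 | x | 75 | 38 | Seven-exon | Leu–Cys | AAQ97848 | AY398415 |
| ZGC:77645* | 47 | 34 | 46 | 44 | 45 | 43 | 46 | 35 | 48 | 47 | 46 | 44 | 46 | 45 | 48 | 75 | x | 33 | Eight-exon | Arg–Thr | ZGC:77645 | BC064292 |
| ZGC:76926 | 34 | 28 | 34 | 32 | 30 | 29 | 33 | 30 | 35 | 32 | 32 | 31 | 36 | 34 | 34 | 38 | 33 | x | n/a | Leu–Cys | ZGC:76926 | BC066740 |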

The zebrafish ov-serpins whose names start with ZGC were characterized from full-length cDNAs isolated as part of the zebrafish gene collection (ZGC). The remaining ov-serpins are designated by their protein accession nos. CAI20745 is a predicted sequence from genomic scaffold BX248515. This locus also includes ZGC:91981 (under protein accession no. CAI20749) and a serpinb1-like pseudogene, CAI20747, not shown in the table (when the premature stop codon is omitted, the predicted protein sequence of CAI20747 is 76% and 80% identical to ZGC:91981 and CAI20745, respectively). Inter- and intraspecies percentages of protein sequence identity are shown (x indicates the intersection of a zebrafish protein with itself, i.e., 100% identity). Gene-structure information was not available (n/a) for ZGC:76926.

*The predicted coding sequence of ZGC:77645, currently in the database, is based on the mRNA transcript. A more likely start codon was identified by genscan 120 bp upstream, adding 40 amino acids at the N terminus, and that start codon was used in this study.
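
As an illustration of how pairwise values such as those in Table 2 can be obtained, the following minimal sketch computes percentage identity from two already-aligned protein sequences. The supplement does not state which alignment program or which denominator was used, so the convention here (identical residues divided by the number of columns in which neither sequence has a gap) is an assumption, and the toy sequences are invented for the example rather than taken from any serpin.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    The choice of denominator is an ASSUMPTION; other conventions
    (e.g., dividing by the shorter sequence length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical, for illustration only).
print(percent_identity("MKT-AL", "MRTQAL"))  # 80.0
# A protein compared with itself is 100% identical (the "x" cells in Table 2).
print(percent_identity("MKTAL", "MKTAL"))    # 100.0
```

Because identity computed this way is symmetric, the zebrafish-versus-zebrafish block of Table 2 is mirrored across its diagonal.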