cDNA sequence and deduced amino acid sequence of the longest cDNA clone (clone 18; 2.8 kb) encoding the human A33 antigen. The longest open reading frame, encompassing nucleotides 345–1301, contains the known N-terminal sequence of the native A33 antigen and predicts a protein of 319 amino acids. The stop codon at 1302–1304 is boxed, and the amino acid sequences of the internal peptides identified by digestion of the native molecule are shown (shaded areas). A putative signal sequence (bold underline), three potential N-linked glycosylation sites (overline), and a transmembrane domain (second bold underline) are indicated. Four consecutive cysteine residues lie adjacent to the transmembrane domain. The spans of the two putative Ig-like domains are enclosed by square brackets, with the specific residues conserved in Ig superfamily members shown in circles. Other features of the DNA sequence include a tandem repeat of 25 bp in the 5′ untranslated region (bold overline) and a polyadenylylation signal (AATAAA) 11 bp upstream from the poly(A) tail. The asterisk above the C at position 294 indicates that a C was found at this position in 2 of the 5 independent clones sequenced (including clone 18) and an A in the other 3 clones.
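The coordinates quoted in the legend can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the nucleotide positions are 1-based and inclusive (the usual convention for sequence figures, but not stated explicitly here); the function name `orf_codon_count` is hypothetical and introduced only for illustration.

```python
def orf_codon_count(start: int, end: int) -> int:
    """Return the number of complete codons in an open reading frame.

    Assumes 1-based, inclusive nucleotide coordinates (an assumption,
    not something the legend states outright).
    """
    length = end - start + 1  # inclusive span in nucleotides
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"span {start}-{end} is not a whole number of codons")
    return length // 3


# Coordinates taken from the legend: ORF at nucleotides 345-1301.
codons = orf_codon_count(345, 1301)

# The stop codon should occupy the three nucleotides right after the ORF.
stop_start = 1301 + 1
stop_end = stop_start + 2

print(codons, stop_start, stop_end)  # 319 1302 1304
```

Under that assumption, the span 345–1301 covers 957 nucleotides, that is 319 codons, and the next triplet falls at 1302–1304, which agrees with the predicted protein length and the boxed stop codon.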