Table 2.
Widespread L1 ASP-driven transcription of human genes revealed from ESTs/mRNAs.
| EST¹ | Source² | Similarity to L1 5′UTR opposite strand³ | Similarity to known mRNA⁴ | Location in the genome⁵ | Orientation⁶ |
| --- | --- | --- | --- | --- | --- |
| Type I splicing (1 EST) | | | | | |
| BU943355++ (4 ex) | Pool of 40 cell line polyA+ | 4−59 ≡ 592−647 (96%) 60−289 ≡ 762−990 (96%) L1PA3 AC007780 | Arylsulfatase G, NM_014960 331−649 ≡ 1342−1660 (99%) | NT_010641 (chr 17) 10/11 | Sense |
| Type II splicing (2 ESTs) | | | | | |
| CD642260 (4 ex) | Embryonic stem cell line WA01/H1 | 12−117 ≡ 542−647 (97%) 118−230 ≡ 878−990 (96%) L1PA2 AC022762 | Olfactory receptor, family 56, subfamily B, member 4, NM_001005181 373−728 ≡ 802−443 (98%) | NT_009237 (chr 11) 3′/1 | Antisense |
| NM_017794 (46 ex) | RA-induced NT2 neuronal precursor cells | 4−150 ≡ 501−647 (93%) 151−262 ≡ 878−990 (93%) L1P AL354879 | Hypothetical protein KIAA1797, AL711955* 331−834 ≡ 60−563 (99%) | NT_008413 (chr 9) 5′/45 | Sense |
| Type III splicing (22 ESTs) | | | | | |
| BM910612 (6 ex) | Brain, astrocytoma grade IV cell line | 1−134 ≡ 514−647 (98%) L1Ta (Hs) AC011597 | Fibronectin type III domain containing 6 (cytokine receptor) NM_144717 268−915 ≡ 336−982 (98%) | NT_086641 (chr 3) 1/7 | Sense |
| BF676152 (3 ex) | Prostate | 4−126 ≡ 520−647 (91%) L1PA2 AC097061 | Hypothetical protein BC014608, NM_138796 127−713 ≡ 422−1005 (91%) | NT_021877 (chr 1) 5/11 | Sense |
| AU123136++ (7 ex) | Uninduced NT2 cell line | 1−125 ≡ 523−647 (96%) L1PA2 AC079005 | Breast carcinoma amplified sequence 3, NM_017679 126−623 ≡ 710−1208 (99%) | NT_010783 (chr 17) 9/24 | Sense |
| AA226814+ (3 ex) | Ntera-2 neuroepithelial cells | 1−111 ≡ 538−649 (93%) L1PA2 AC018470 | Secernin 3 (dipeptidase), NM_024583 112−347 ≡ 843−1075 (96%) | NT_005403 (chr 2) 5/8 | Sense |
| BU959632 (5 ex) | Pool of 40 cell line polyA+ | 4−45 606−647 L1Ta (Hs) AC008496 | Cardiomyopathy associated 5, NM_153610 46−559 ≡ 3235−3748 (97%) | NT_006713 (chr 5) 8/12 | Sense |
| BF208095+ (6 ex) | Bladder carcinoma cell line | 2−57 ≡ 592−647 (94%) L1PA2 AC002080 | Hepatocyte growth factor receptor (MET proto-oncogene), NM_000245 132−456 ≡ 1387−1714 (99%) 462−663 ≡ 1805−2013 (92%) | NT_007927 (chr 7) 2/21 | Sense |
| AA220950+ (3 ex) | Ntera-2 neuroepithelial cells | 1−39 ≡ 609−647 (89%) L1PA3 AC022261 | Dynein, cytoplasmic, intermediate polypeptide 1, NM_004411 40−247 ≡ 613−818 (95%) | NT_007910 (chr 7) 5/17 | Sense |
| BM557937 (7 ex) | Brain astrocytoma grade IV cell line | 1−110 ≡ 538−647 (93%) L1PA3 AC022748 | Cholinergic receptor, nicotinic, beta polypeptide 4, NM_000750 410−713 ≡ 168−471 (99%) | NT_024654 (chr 15) 5′/6 | Sense |
| BG335812 (>6 ex) | Placenta choriocarcinoma cell line | 2−105 ≡ 544−647 (93%) L1PA2 AC009949 | Nuclear antigen Sp100, NM_003113 106−522 ≡ 139−556 (90%) | NT_005403 (chr 2) 2/25 | Sense |
| BE865812+ (4 ex) | Bladder carcinoma cell line | 1−43 ≡ 605−647 (97%) L1Ta (Hs) AL049838 | Chromosome 14 open reading frame 37, NM_001001872 44−343 ≡ 933−1228 (96%) | NT_025892 (chr 14) 5/7 | Sense |
| BE866323+ (4 ex) | Bladder carcinoma cell line | 1−92 ≡ 556−647 (96%) L1PA2 AC073058 L1PA2 AC020550 | Bol, boule-like (Drosophila), NM_033030 145−204 ≡ 790−731 (98%) | NT_005246 (chr 2) 3′/11 | Antisense |
| BP352155 (5 ex) | Well-differentiated squamous cell carcinoma cell line TE13 | 1−113 ≡ 535−647 (96%) L1PA2 AC004519 | Hypothetical protein FLJ31340, BX346336* 114−490 ≡ 500−876 (98%) | NT_086723 (chr 7) 1/>5 | Sense |
| BP351387 (5 ex) | Well-differentiated squamous cell carcinoma cell line TE13 | 1−67 = 581−647 L1Ta (Hs) AL663118 | Chloride channel 5, NM_000084 213−583 = 243−613 | NT_086939 (chr X) 5′/12 | Sense |
| BP351082 (>4 ex) | Well-differentiated squamous cell carcinoma cell line TE13 | 1−71 ≡ 576−647 (95%) L1PA3 AC114734 | Hypothetical protein MGC16169 (protein kinase) NM_033115 72−593 ≡ 1913−2433 (99%) | NT_086651 (chr 4) 17/24 | Sense |
| BP369881 (6 ex) | Testis | 1−65 ≡ 581−647 (92%) L1PA3 AL136525 | WD repeat and FYVE domain containing 2, NM_052950 66−570 ≡ 460−963 (99%) | NT_086801 (chr 13) 3/12 | Sense |
| AA226765 (3 ex) | Brain Ntera-2 neuroepithelial cells | 1−67 ≡ 581−647 (92%) L1PA3 AC025170 | Hypothetical protein FLJ35779, NM_152408 68−356 ≡ 480−767 (97%) | NT_086677 (chr 5) 4/11 | Sense |
| CF593264 (>5 ex) | Placenta | 29−95 ≡ 581−647 (95%) L1PA3 AL050323 | Phospholipase C, beta 1, NM_182734 174−769 ≡ 103−692 (98%) | NT_011387 (chr 20) 5′/33 | Sense |
| BP873102 (5 ex) | Embryonal kidney cell line=“293” | 1−67 ≡ 581−647 (95%) L1PA2 AL022400 | RAB GTPase activating protein 1-like, NM_014857 68−583 ≡ 731−1244 (95%) | NT_086598 (chr 1) 4/21 | Sense |
| CD110319 (2 ex) | Placenta “preeclamptic placenta” | 25−92 ≡ 580−647 (97%) L1PA2 AC004452 | FLJ16237 protein, NM_001004320 93−568 ≡ 428−900 (97%) | NT_086703 (chr 7) 2/13 | Sense |
| BX476029 (5 ex) | Pooled from different tissues | 2−77 ≡ 572−647 (93%) L1PA3 AL121946 | Polycystic kidney and hepatic disease 1, NM_138694 78−567 ≡ 7273−7762 (99%) | NT_007592 (chr 6) 43/67 | Sense |
| CB960713 (4 ex) | Placenta | 30−107 ≡ 570−647 (96%) L1PA3 AC005922 | ATP-binding cassette, subfamily A, NM_172386 108−208 = 3283−3183 | NT_010641 (chr 17) 25/38 | Antisense |
| CD644604 (3 ex) | Embryonic stem cells, cell line=“WA01” | 14−115 ≡ 547−647 (94%) L1PA3 AC022029 | Catenin (cadherin-associated protein), alpha 3, NM_013266 116−736 ≡ 755−1375 (98%) | NT_086771 (chr 10) 5/19 | Sense |
| Type IV splicing (1 EST) | | | | | |
| CF594290 (9 ex) | Placenta | 29−230 ≡ 531−732 (94%) 231−340 ≡ 878−988 (95%) L1PA2 AC022306 | Hypothetical protein FLJ32800, NM_152647 354−451 = 1305−1402 452−780 ≡ 1642−1964 (97%) | NT_010194 (chr 15) 5/16 | Sense |
| Type V splicing (19 ESTs) | | | | | |
| BE787024++ (3 ex) | Lung large cell carcinoma cell line | 17−215 ≡ 533−732 (98%) L1Ta (Hs) AC079750 | Activin A receptor, type IC, NM_145259 216−752 ≡ 548−1086 (95%) | NT_005403 (chr 2) 2/9 | Sense |
| BE568884+ (4 ex) | Bladder carcinoma cell line | 1−178 ≡ 554−732 (97%) | CD96 antigen, NM_005816 179−627 ≡ 659−1113 (97%) | NT_086640 (chr 3) 2/15 | Sense |
| BE617461++ (6 ex) | Colon adenocarcinoma cell line | 8−185 ≡ 553−732 (98%) L1PA2 AC092916 | RAB3A interacting protein, NM_175625 186−738 ≡ 998−1556 (98%) | NT_086796 (chr 12) 3/10 | Sense |
| BE568818+ (3 ex) | Bladder carcinoma cell line | 1−163 ≡ 570−732 (93%) L1PA2 AC010585 | Secretory carrier membrane protein 1, NM_052822 164−516 ≡ 717−1063 (97%) | NT_006713 (chr 5) 6/8 | Sense |
| BU858570 (2 ex) | Pool of 40 cell line polyA+ RNAs | 4−166 ≡ 571−732 (93%) L1PA2 AL691464 | Guanylate binding protein 1, NM_002053 167−402 ≡ 259−494 (95%) | NT_004686 (chr 1) 2/11 | Sense |
| BF028725 (3 ex) | Bladder carcinoma cell line | 2−123 ≡ 612−732 (91%) L1PA2 AC004800 | Hypothetical protein FLJ36166, NM_182634 124−264 ≡ 3282−3424 (95%) | NT_086704 (chr 7) 2/21 | Sense |
| AA224229+ (4 ex) | 6 week, differentiated, post-mitotic hNT, neurons | 1−94 ≡ 640−732 (98%) L1Ta (Hs) AL365308 | Chromosome 6 open reading frame 170, NM_152730 95−430 ≡ 2622−2957 (99%) | NT_086697 (chr 6) 22/30 | Sense |
| BG542212++ (>3 ex) | Lung | 2−187 ≡ 547−732 (97%) L1Ta (Hs) AC096569 | Zinc finger protein 638, NM_014497 188−638 ≡ 3576−4013 (92%) | NT_022184 (chr 2) 18/28 | Sense |
| AV693621 (2 ex) | Hepatocellular carcinoma | 1−172 ≡ 559−732 (93%) L1PA2 AL627203 | Collagen, type XI, alpha 1, variant A, NM_001854 187−279 = 3433−3341 | NT_004623 (chr 1) 46/67 | Antisense |
| BE735854+ (6 ex) | Pancreas adenocarcinoma cell line | 1−95 ≡ 638−732 (93%) L1PA2 AC092903 | Similar to beta-1, 4-mannosyltransferase, CD708577* 95−387 ≡ 174−466 (99%) | NT_005588 (chr 3) 1/>5 | Sense |
| R64632 (4 ex) | Soares placenta Nb2HP | 1−52 = 681−732 L1PA2 AL713859 | Hypothetical protein FLJ10986, NM_018291 53−406 ≡ 1319−1671 (98%) | NT_029223 (chr 1) 11/14 | Sense |
| BP352672 (4 ex) | Well-differentiated squamous cell carcinoma cell line TE13 | 1−126 ≡ 608−732 (94%) L1PA2 AL354711 | Chromosome 9 open reading frame 39, NM_017738 127−603 = 152−631 | NT_008413 (chr 9) 2/23 | Sense |
| BP358215 (7 ex) | Mammary gland tumor cell line T47D | 1−147 ≡ 586−732 (92%) L1PA2 AL391749 | Regulator of G-protein signalling 6, NM_004296 148−581 ≡ 188−621 (99%) | NT_026437 (chr 14) 5′/17 | Sense |
| H72033 (4 ex) | Soares breast 2NbHBst | 1−107 ≡ 626−732 (97%) L1PA2 AC079005 | Breast carcinoma amplified sequence 3, NM_017679 108−370 ≡ 710−967 (95%) | NT_010783 (chr 17) 9/24 | Sense |
| CA488981 (3 ex) | Cell_line=ZR-75-1, MCF7, SK-BR-3, MDA-MB-231, hTERT-HME1, LNCaP | 1−159 ≡ 574−732 (91%) L1PA2 AC034215 | Monogenic, audiogenic seizure susceptibility 1 homolog, NM_032119 160−736 ≡ 17956−18532 (99%) | NT_086677 (chr 5) 83/98 | Sense |
| BX955947 (3 ex) | Pooled from different tissues | 1−116 ≡ 617−732 (89%) L1PA2 AC006559 | Solute carrier organic anion transporter family, member 1A2, NM_021094 240−342 = 186−288 | NT_009714 (chr 12) 5′/14 | Sense |
| BX477512++ (3 ex) | Pooled from different tissues | 2−129 ≡ 605−732 (93%) L1PA2 AC024061 | Hypothetical protein FLJ38736, NM_182758 130−551 = 3191−3216 | NT_086827 (chr 15) 18/20 | Sense |
| CN412489++ (2 ex) | Embryonic stem cells, embryoid bodies from H1, 7 and H9 cell lines | 1−151 ≡ 582−732 (98%) L1PA2 AL133299 | FLJ46156 protein, NM_198499 152−348 = 1087−1283 | NT_086806 (chr 14) 8/37 | Sense |
| CN408255 (4 ex) | Embryonic stem cells, DMSO-treated H9 cell line | 1−180 ≡ 553−732 (95%) L1PA2 AP00942 | Baculoviral IAP repeat-containing 2, NM_001166 181−514 = 2766−3099 | NT_033899 (chr 11) 6/9 | Sense |
| Type VI splicing (4 ESTs) | | | | | |
| CD643062 (8 ex) | Embryonic stem cell line WA01/H1 | 10−220 ≡ 780−990 (97%) L1PA2 AC018741 | Hypothetical LOC388927, XM_371478 237−744 ≡ 1−509 (99%) | NT_015926 (chr 2) ND | Sense |
| BU176833 (6 ex) | Eye retinoblastoma cell line | 1−227 ≡ 763−989 (96%) L1PA3 AC105054 | Rho GTPase activating protein 25, NM_014882 536−878 ≡ 419−757 (97%) | NT_022184 (chr 2) 5′/10 | Sense |
| BE568192 (3 ex) | Bladder carcinoma cell line | 1−60 ≡ 931−990 (98%) L1PA2 AP005264 | Similar to hypothetical protein LOC375127, XM_496265 95−367 ≡ 213−490 (95%) | NT_010859 (chr 18) 3/5 | Sense |
| BP245205 (3 ex) | Embryonal kidney cell line 293 | 6−135 ≡ 861−990 (95%) L1PA2 AC099512 | Monogenic, audiogenic seizure susceptibility 1 homolog, NM_032119 138−574 ≡ 17953−18384 (98%) | NT_086677 (chr 5) 91/98 | Sense |
¹ EST/mRNA GenBank accession number and number of exons (ex) determined by SPIDEY [1]. ESTs are grouped according to six different splicing schemes [2]. Sixteen identical or similar ESTs described earlier by Nigumann et al. [2] and Wheelan et al. [44] are marked with + and ++, respectively.
² Source of the EST as annotated in the EST division of GenBank.
³ EST similarity (≡) or identity (=) to a representative L1 genomic clone #11A [3]. The L1 subfamily [4] and GenBank accession number were determined with the genome browser [5]. For some ESTs the 5′ nucleotides (< 28 nt) were either derived from vector/adaptor or represented low-quality sequence.
⁴ Similarity/identity to known mRNA as determined by the BLASTN [6] and BLAST2 sequences [7] programs. The mRNA description is based on the RefSeq database [8]. If the mRNA has not been described, an EST (marked by an asterisk) is shown. This EST contains a putative first exon transcribed from the non-L1 (native) promoter.
⁵ Genomic contig (accession no.), chromosome (chr), and position of the L1 ASP (in the intron, upstream (5′), or downstream (3′))/total number of exons, as determined with the MegaBLAST and SPIDEY programs. ND stands for not determined.
⁶ Orientation with respect to the gene's transcription.
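
The coordinate notation used in the two similarity columns (EST range, ≡ or =, target range, optional percent identity) can be read mechanically. The following is a minimal sketch in Python, assuming the cell text is available as a plain string; the names `Alignment`, `parse_alignments`, and `is_reverse` are hypothetical helpers introduced here for illustration and are not part of the original analysis, which used BLASTN, BLAST2 sequences, MegaBLAST, and SPIDEY.

```python
import re
from typing import NamedTuple, Optional


class Alignment(NamedTuple):
    """One 'a−b ≡ c−d (p%)' entry from a table cell (hypothetical helper)."""
    query_start: int
    query_end: int
    relation: str            # "≡" (similarity) or "=" (identity)
    subject_start: int
    subject_end: int
    percent: Optional[int]   # None when the cell gives no percentage


# Accept the Unicode minus sign used in the table, an ASCII hyphen, or an en dash.
_DASH = r"[−\-–]"
_ALIGN_RE = re.compile(
    rf"(\d+)\s*{_DASH}\s*(\d+)\s*([≡=])\s*(\d+)\s*{_DASH}\s*(\d+)"
    rf"(?:\s*\(?(\d+)%\)?)?"
)


def parse_alignments(cell: str) -> list[Alignment]:
    """Return every coordinate pair found in a table cell, in order."""
    return [
        Alignment(int(a), int(b), rel, int(c), int(d), int(p) if p else None)
        for a, b, rel, c, d, p in _ALIGN_RE.findall(cell)
    ]


def is_reverse(aln: Alignment) -> bool:
    """True when the target coordinates run in descending order."""
    return aln.subject_start > aln.subject_end
```

For example, `parse_alignments("373−728 ≡ 802−443 (98%)")` yields a single entry whose target coordinates descend, which matches the rows labelled Antisense in the table. Surrounding text such as subfamily names and accession numbers is ignored by the pattern because it does not contain a number range followed by ≡ or =.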