Discovering protein similarity using natural language processing

Indra N Sarkar; Thomas C Rindflesch

2002:677–681.

PMCID: PMC2244512 PMID: 12463910

Abstract

Extracting protein interaction relationships from textual repositories, such as MEDLINE, may prove useful in generating novel biological hypotheses. Using abstracts relevant to two known functionally related proteins, we modified an existing natural language processing tool to extract protein interaction terms. We were able to obtain functional information about two proteins, Amyloid Precursor Protein and Prion Protein, that have been implicated in the etiology of Alzheimer's Disease and Creutzfeldt-Jakob Disease, respectively.
