Abstract
Computerized indexing and retrieval of medical records is increasingly important, but the use of natural language versus coded languages (SNOP, SNOMED) for this purpose remains controversial. In an effort to develop search strategies for natural language text, the authors used a computer to examine the anatomic diagnosis reports of 7000 consecutive autopsy subjects spanning a 13-year period at The Johns Hopkins Hospital. There were 923,657 words, 11,642 of them distinct. The authors observed an average of 1052 keystrokes, 28 lines, and 131 words per autopsy report, with an average of 4.6 words per line and 7.0 letters per word. The entire text file represented 921 hours of secretarial effort. Words ranged in frequency from 33,959 occurrences of "and" to one occurrence for each of 3398 different words. Searches for rare diseases with unique names or for representative examples of common diseases were most readily performed with the use of computer-printed key word in context (KWIC) books. For uncommon diseases designated by commonly used terms (such as "cystic fibrosis"), needs were best served by a computerized search for logical combinations of key words. In an unbalanced word distribution, each conjunction (logical "and") search should be performed in ascending order of word frequency, whereas each alternation (logical inclusive "or") search should be performed in descending order of word frequency. Natural language text searches will assume a larger role in medical records analysis as the labor-intensive procedure of translation into a coded language becomes more costly compared with the computer-intensive procedure of text searching.
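The ordering rule for conjunction and alternation searches can be illustrated with a minimal sketch. This is not the authors' program: the function names, the whitespace tokenization, and the sample reports are assumptions made purely for illustration. The idea shown is that a conjunction is rejected soonest when the rarest word is tested first, while an alternation is accepted soonest when the most common word is tested first.

```python
from collections import Counter

def word_frequencies(reports):
    """Count how often each lowercase word occurs across all reports."""
    counts = Counter()
    for text in reports:
        counts.update(text.lower().split())
    return counts

def match_all(reports, words, freq):
    """Conjunction (logical AND): test words in ascending order of frequency.

    The rarest word is the most likely to be absent, so testing it first
    lets all() stop early on most reports.
    """
    ordered = sorted(words, key=lambda w: freq[w])
    hits = []
    for i, text in enumerate(reports):
        tokens = set(text.lower().split())
        if all(w in tokens for w in ordered):
            hits.append(i)
    return hits

def match_any(reports, words, freq):
    """Alternation (logical inclusive OR): test words in descending order.

    The most common word is the most likely to be present, so testing it
    first lets any() stop early on most reports.
    """
    ordered = sorted(words, key=lambda w: freq[w], reverse=True)
    hits = []
    for i, text in enumerate(reports):
        tokens = set(text.lower().split())
        if any(w in tokens for w in ordered):
            hits.append(i)
    return hits

# Hypothetical sample text, not taken from the autopsy file in the study.
reports = [
    "cystic fibrosis of pancreas and bronchiectasis",
    "pulmonary fibrosis and emphysema",
    "cystic disease of kidney and liver",
    "myocardial infarction and pulmonary edema",
]
freq = word_frequencies(reports)
both = match_all(reports, ["cystic", "fibrosis"], freq)
either = match_any(reports, ["cystic", "fibrosis"], freq)
```

In this toy version each word test is a cheap set lookup, so the ordering changes only how many tests are made, not the result. The ordering would matter more where each test is a costly pass over stored text.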