Abstract
In clinical practice guidelines (CPGs), medical information is stored in narrative form. A large part of this information occurs in negated form. Detecting negation in CPGs is an important task because it helps medical personnel to identify symptoms and diseases that are absent as well as treatment actions that should not be performed. We developed algorithms capable of Negation Detection in this kind of medical document. Based on our results, we are convinced that syntactical methods can improve Negation Detection, not only in medical writing but also in arbitrary narrative texts.
Keywords: Negation detection, clinical practice guidelines, natural language processing
Introduction
Negation is an important part of inter-human communication. It can be used to invert concepts and to show refusal of opinions. Negation is a universal concept in all languages and is very important in the medical field. Detecting negations in natural language is a difficult task, but in the medical scope it is easier: medical language is much more restricted than narrative speech [1]; a physician will not make extensive use of stylistic elements such as double negation when writing reports or patient histories.
In the medical scope, Negation Detection is currently applied only to very simple texts (e.g., radiology reports). In our work, we primarily focus on the more complex text type of clinical practice guidelines (CPGs). These are “systematically developed statements to assist practitioner and patient decisions about appropriate health care for specific clinical circumstances” [2]. In CPGs negation is crucial not only for facts that do not apply (e.g., patient has no pain), but also for actions that should not be performed (e.g., do not take this drug). Unlike simpler texts, CPGs require algorithms that deal with the syntax of the English language (e.g., tenses in active and passive voice, parts-of-speech).
In the following section we give an overview of existing methods of Negation Detection. In the main part of the work we describe and evaluate an approach to Negation Detection that uses syntactical methods tailored to the special characteristics of CPGs.
1. Related Work
Besides general work on negation in natural language (e.g., [3]) we will discuss relevant work addressing Negation Detection in the scope of medical language.
NegEx [4] is a simple algorithm for detecting negated findings and diseases in discharge summaries. Negation triggers are classified into triggers with preceding negated concepts and those with succeeding concepts. After the concepts have been replaced by UMLS terms, the negated ones are detected.
In the work of Mutalik et al. [5] UMLS concepts are also identified in a first step. Then, a lexical scanner using regular expressions is applied to detect triggers and classify them into preceding and succeeding triggers. With this information a parser attaches the negation information to the original concepts, which yields the output of the NegFinder algorithm.
Elkin et al. [6] use an algorithm with a rule base to decide which medical concepts are negated in clinical documents. Here, stop words (e.g., “other than”) determine the scope of a negation trigger.
Patrick et al. [7] use SNOMED CT to identify negated concepts. They identify both pre-coordinated phrases (e.g., “no headache”, SNOMED CT concept id 162298006) and concepts explicitly asserted as negative by negation phrases. To identify the latter, rule-based algorithms similar to [4] and [6] were implemented.
Huang and Lowe [8] recently developed a hybrid approach combining regular expression matching with grammatical parsing to detect negations in clinical radiology reports.
Aronow and Feng [9] have developed a method for Negation Detection to be applied to document classification. They determine the scope of negation triggers by conjunctive phrases: all phrases connected by such conjunctions are regarded as negated phrases.
CPGs differ from the medical reports and discharge summaries addressed by the algorithms presented above. In CPGs the language is not as restricted as it is in these other documents. They are more like prose writings, which complicates the development of simple algorithms. Still, they are not as complicated as free text, since sophisticated stylistic elements such as double negation are not used. In the following section we explain our approach, called NegHunter.
2. NegHunter – A Method to Detect Negated Concepts
Our strategy in developing NegHunter was to classify negations in CPGs according to identified negation types. The reason for this is that negations within CPGs vary strongly from each other. Because the number of negation triggers cannot easily be kept within manageable limits, we used a syntactical approach. This means that grammatical elements of the English language are used to decide whether a phrase is negated or not. For this purpose, the tenses in both active and passive voice as well as parts-of-speech are used.
The starting point of the NegHunter algorithm is the detection of negation triggers. Whereas other algorithms use a relatively high number of different negation triggers, NegHunter manages with a rather small number. The reason for this is the way NegHunter handles the different negation types. We have selected a number of universal triggers and classified their behaviour in narrative texts.
2.1. Negation Classes
We have come up with five negation classes according to our study of CPGs in the literature: (1) adverbial negation, (2) intra-phrase triggered negation, (3) prepositional negation, (4) adjective negation, and (5) verb negation. In the following, we will discuss these five classes in detail.
- Adverbial Negation
- This is the most frequent type of negation. Triggered by “not” and “never”, negated concepts appear in combination with a verb. Via the tense of the verb we decide whether the sentence is written in active or passive voice. With this information, we interpret the three preceding or succeeding noun phrases as negated. We use three noun phrases because this number gave the best results. The following two sentences show examples in active and passive voice, both with triggers and their negated phrases (see Footnotes): “Guideline developers do not recommend .” (active voice); “ is not recommended.” (passive voice)
- Intra-Phrase Triggered Negation
- These are negations in which the trigger is included in the noun phrase. “No” and “without” act as triggers. The following sentence shows an example: “Evidence obtained from at least one well-designed study without .”
- Prepositional Negation
- In this negation type triggers are followed by prepositional phrases, often introduced by the prepositions “of” or “from”. The phrases following the preposition are considered negated. Here, we have the three noun triggers “lack”, “absence”, and “freedom” as well as the adjective trigger “free”. The result of the detection process could be as follows: “Patients with good performance status, … and the absence .”
- Adjective Negation
- This type of negation uses adjectives as negation triggers. We have identified, for example, the term “ineffective” and interpret the first noun phrase before the trigger as the relevant phrase. For example: “Recommendation indicates at least fair evidence that is ineffective or that harm outweighs benefit.”
- Verb Negation
- Some verbs are negation triggers themselves. We identified the verbs “deny”, “decline”, and “lack”. The following sentence shows the effect of such a trigger: “ on final patient outcomes was also lacking.”
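The mapping from triggers to negation classes, and the scope rule for the adverbial negation, can be illustrated with a minimal Python sketch. It is not the NegHunter implementation (which is a Java library); the part-of-speech labels, the function names, and the pre-split lists of noun phrases are assumptions made for this example, and the sample phrases are invented.

```python
from typing import List, Optional

# Triggers named in Section 2.1, keyed by (lemma, part of speech).
# The part-of-speech labels are hypothetical; "lack" shows why they are
# needed, since it is a prepositional trigger as a noun and a verb trigger
# as a verb.
TRIGGER_CLASS = {
    ("not", "adv"): "adverbial",
    ("never", "adv"): "adverbial",
    ("no", "det"): "intra-phrase",
    ("without", "prep"): "intra-phrase",
    ("lack", "noun"): "prepositional",
    ("absence", "noun"): "prepositional",
    ("freedom", "noun"): "prepositional",
    ("free", "adj"): "prepositional",
    ("ineffective", "adj"): "adjective",
    ("deny", "verb"): "verb",
    ("decline", "verb"): "verb",
    ("lack", "verb"): "verb",
}


def negation_class(lemma: str, pos: str) -> Optional[str]:
    """Return the negation class of a trigger, or None if it is not a trigger."""
    return TRIGGER_CLASS.get((lemma.lower(), pos))


def adverbial_scope(before: List[str], after: List[str],
                    passive: bool, window: int = 3) -> List[str]:
    """Select the noun phrases negated by an adverbial trigger.

    Passive voice: up to `window` noun phrases preceding the trigger.
    Active voice: up to `window` noun phrases succeeding the trigger.
    """
    if passive:
        return before[max(0, len(before) - window):]
    return after[:window]


# Invented example phrases, not taken from a guideline.
print(negation_class("lack", "noun"))
print(adverbial_scope(["guideline developers"], ["routine screening"],
                      passive=False))
```

Keying the table by part of speech is a design choice of this sketch; it reflects the fact that the same word can belong to different negation classes depending on its syntactic role.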
2.2. Assigning prepositional phrases
In some cases, the algorithms described above do not tag the entire negated information. For instance, prepositional phrases (which are not themselves negated) that appear after a negated phrase need to be handled separately. We address this problem by tagging all prepositional phrases that follow a negated phrase. This ensures that no information concerning the negation is lost. The following sentence shows an output result with two prepositional phrases following an intra-phrase triggered negation (see Footnotes):
“Requires availability of well conducted clinical studies but no .”
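The attachment step can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not the authors' code: the chunk labels `NP`, `PP`, and `VP`, the function name, and the example chunks are hypothetical, and the sketch assumes that only the uninterrupted run of prepositional phrases directly after the negated phrase is tagged.

```python
from typing import List, Tuple

Chunk = Tuple[str, str]  # (kind, text); the kinds are hypothetical labels


def attach_prepositional_phrases(chunks: List[Chunk],
                                 negated_index: int) -> List[int]:
    """Return the indices of the prepositional phrases that directly
    follow the negated chunk at `negated_index`."""
    attached = []
    i = negated_index + 1
    # Stop at the first chunk that is not a prepositional phrase.
    while i < len(chunks) and chunks[i][0] == "PP":
        attached.append(i)
        i += 1
    return attached


# Invented example, not taken from a guideline.
chunks = [
    ("NP", "no benefit"),          # negated phrase
    ("PP", "of treatment"),
    ("PP", "in older patients"),
    ("VP", "was reported"),
    ("PP", "in the trial"),
]
print(attach_prepositional_phrases(chunks, 0))
```

In this example the two phrases after "no benefit" are attached, while "in the trial" is not, because a verb phrase interrupts the run.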
3. Evaluation
For evaluation purposes we used a Java implementation of our algorithms. To obtain the syntactical information our algorithms need from the guideline documents, we used the MetaMap Transfer (MMTx) program. MetaMap is “a program […] to map biomedical text to the [UMLS] Metathesaurus or, equivalently, to discover concepts referred to in text” [10]. MMTx makes this program available to researchers in an adaptable form. Besides the concept assignment it also provides syntactical information such as part-of-speech. We implemented our algorithm as a Java library that can also be incorporated into other programs and applications. In the following we describe the evaluation process.
3.1. Training and Test Sets
We used a set of 18 CPGs from the medical specialty of oncology for our development. Out of these 18 practice guidelines, we used four as a training set for analysing the negations that occur. On the basis of these documents we classified the negations and developed our algorithms. We used the remaining 14 CPGs for the evaluation.
3.2. Generation of Gold Standard Documents
We manually rated the sentences of all oncological CPGs to establish a “gold standard” against which the computerized algorithms could be compared. We processed 558 sentences containing 615 negated concepts and tagged both negation triggers and negated concepts. At first glance, it may be surprising that the number of negated phrases is only slightly higher than the number of sentences containing negations. This is because many sentences contain a trigger that does not target a phrase in the same sentence (e.g., “None available.”). We do not detect negation across sentence boundaries because the result is unpredictable; such cases are not addressed by our methods.
3.3. Evaluation Techniques and Measures
For our evaluation, we processed the 14 guidelines with NegHunter. Afterwards, the output was checked by hand to detect errors. We classified the taggings as true positives (TP), false positives (FP), false negatives (FN), and partially correct (PC), where a partially correct tagging was scored at only 50 %.
To quantify the performance we used the statistical measures recall and precision. Recall is the ratio of the number of correctly found phrases to the number of all relevant phrases according to the gold standard. Precision is the ratio of the number of correctly detected phrases to the number of all phrases found by the system. Table 1 shows a detailed listing of the performance of our implementation.
Table 1.
Negation Type | Triggers (%) | TP (%) | FP (%) | FN (%) | PC (%) | Recall (%) | Precision (%) |
---|---|---|---|---|---|---|---|
Adverbial Negation | 66.12 | 65.25 | 94.26 | 91 | 71.43 | 76.41 | 58.97 |
Intra-Phrase Tr. Negation | 23.45 | 26.85 | 0 | 7 | 16.33 | 95.1 | 100 |
Prepositional Negation | 7.66 | 3.36 | 0.82 | 0 | 4.08 | 100 | 89.47 |
Adjective Negation | 0.65 | 1.18 | 0 | 0 | 0 | 100 | 100 |
Verb Negation | 2.12 | 3.36 | 4.92 | 2 | 8.16 | 89.47 | 58.62 |
Overall | 100 | 100 | 100 | 100 | 100 | 83.51 | 67.49 |
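The measures defined in Section 3.3 can be sketched in Python as follows. The text states only that a partially correct tagging scores 50 %; how such taggings enter the formulas is not specified, so this sketch assumes that each one counts as half a correct detection. The function names and the counts in the usage example are hypothetical and are not the counts underlying Table 1.

```python
def weighted_correct(tp: int, pc: int, pc_weight: float = 0.5) -> float:
    """Correct detections, with partially correct taggings weighted
    (assumption: each partially correct tagging counts as half)."""
    return tp + pc_weight * pc


def recall(tp: int, fn: int, pc: int, pc_weight: float = 0.5) -> float:
    """Correctly found phrases relative to all relevant phrases."""
    correct = weighted_correct(tp, pc, pc_weight)
    relevant = correct + fn
    return correct / relevant if relevant else 0.0


def precision(tp: int, fp: int, pc: int, pc_weight: float = 0.5) -> float:
    """Correctly found phrases relative to all phrases found by the system."""
    correct = weighted_correct(tp, pc, pc_weight)
    found = correct + fp
    return correct / found if found else 0.0


# Hypothetical counts for illustration only.
tp, fp, fn, pc = 8, 5, 10, 4
print(f"recall    = {recall(tp, fn, pc):.4f}")
print(f"precision = {precision(tp, fp, pc):.4f}")
```

Other weightings are conceivable, for example counting a partially correct tagging fully in the denominator; the `pc_weight` parameter only covers the numerator-and-denominator variant assumed here.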
3.4. Analysis of Evaluation Results
NegHunter shows its strength in handling the intra-phrase triggered negation, the prepositional negation, and the adjective negation. This is due to the simple structure of these negations. In the case of the intra-phrase triggered and the prepositional negation, the negated phrase usually follows immediately after the trigger, so it is nearly impossible to miss it.
The behaviour of the adverbial negation as well as the verb negation is much more complex. Here, a phrase related to a trigger may occur at the opposite end of the sentence. In such a case, it is very difficult to identify this phrase, as NegHunter considers only the three preceding or succeeding phrases when detecting the negated concepts.
Another problem is caused by MMTx itself. In some cases the part-of-speech is incorrectly assigned, which consequently causes errors. For example, in the sentence
“… and interpreting studies that were not otherwise covered in existing syntheses or guidelines.”
MMTx recognises the noun phrase “studies” as a verb phrase, whereas “interpreting”, a verb phrase, is recognised as a noun phrase. This leads to an incorrect tagging and produces both an FP and an FN.
4. Conclusion
With the algorithms presented here, negated information occurring in CPGs can be detected on a syntactical level using grammatical information of the English language such as tenses and parts-of-speech. This forms a basis for subsequent processing on a semantic level. Such processing will be absolutely necessary because, for instance, a negation trigger combined with a concept representing a symptom or disease does not necessarily imply the absence of this symptom or disease. Compare the example from [4]:
“We did not treat the infection.”
“We did not detect an infection.”
where the first sentence does not indicate the absence of an infection, but the absence of its treatment. Nevertheless, NegHunter can support an automated structuring of the information in order to, for instance, decide which therapies or drug regimens are best applied in patients with certain diseases and which are not recommended. This helps to sort out the treatment options and supports medical personnel as well as patients in their decision-making.
Additionally, NegHunter's negation classification allows users to augment the trigger set themselves. To do so, each new trigger needs to be assigned a negation class; NegHunter then applies its rule base to the new trigger. This makes NegHunter portable to other document types as well as extensible and maintainable.
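Assigning a new trigger to an existing negation class might look like the following Python sketch. The `TriggerSet` class, its method names, and the part-of-speech labels are hypothetical and do not describe the interface of the NegHunter Java library; the trigger “unnecessary” is an invented example of a user-added trigger.

```python
from typing import Dict, Optional, Tuple

# The five negation classes of Section 2.1.
NEGATION_CLASSES = ("adverbial", "intra-phrase", "prepositional",
                    "adjective", "verb")


class TriggerSet:
    """Hypothetical registry mapping (lemma, part of speech) to a class."""

    def __init__(self) -> None:
        self._classes: Dict[Tuple[str, str], str] = {}

    def add(self, lemma: str, pos: str, negation_class: str) -> None:
        """Register a trigger; it must belong to one of the five classes
        so that the existing rules for that class can be applied to it."""
        if negation_class not in NEGATION_CLASSES:
            raise ValueError(f"unknown negation class: {negation_class!r}")
        self._classes[(lemma.lower(), pos)] = negation_class

    def class_of(self, lemma: str, pos: str) -> Optional[str]:
        """Return the class of a registered trigger, or None."""
        return self._classes.get((lemma.lower(), pos))


triggers = TriggerSet()
triggers.add("ineffective", "adj", "adjective")  # trigger named in Section 2.1
triggers.add("unnecessary", "adj", "adjective")  # invented user-added trigger
print(triggers.class_of("unnecessary", "adj"))
```

Because the scope rules are attached to the class rather than to the individual trigger, adding a trigger in this sketch requires no new rule.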
Acknowledgement
This work is supported by “Fonds zur Förderung der wissenschaftlichen Forschung FWF” (Austrian Science Fund), grant L290-N04.
Footnotes
Negation triggers are underlined; signalize negated phrases.
signalize prepositional information.
References
- 1. Sager Naomi, Friedman Carol, Lyman Margaret S. Medical Language Processing: Computer Management of Narrative Data. Addison-Wesley Longman Publishing; 1987.
- 2. Field Marilyn J., Lohr Kathleen N. Clinical Practice Guidelines: Directions for a New Program. National Academies Press, Institute of Medicine; Washington DC: 1990.
- 3. Horn Laurence R. A Natural History of Negation. University of Chicago Press; Chicago, Illinois: 1989.
- 4. Chapman Wendy W., Bridewell Will, Hanbury Paul, Cooper Gregory F., Buchanan Bruce G. A Simple Algorithm for Identifying Negated Findings and Diseases in Discharge Summaries. Journal of Biomedical Informatics. 2001;34:301–310. doi: 10.1006/jbin.2001.1029.
- 5. Mutalik Pradeep G., Deshpande Aniruddha, Nadkarni Prakash M. Use of General-purpose Negation Detection to Augment Concept Indexing of Medical Documents: A Quantitative Study using the UMLS. Journal of the American Medical Informatics Association (JAMIA). 2001;8(6):598–609. doi: 10.1136/jamia.2001.0080598.
- 6. Elkin Peter L., Brown Steven H., Bauer Brent A., Husser Casey S., Carruth William, Bergstrom Larry R., Wahner-Roedler Dietlind L. A controlled trial of automated classification of negation from clinical notes. BMC Medical Informatics and Decision Making. 2005;5(13). doi: 10.1186/1472-6947-5-13.
- 7. Patrick Jon, Wang Yefeng, Budd Peter. Automatic Mapping Clinical Notes to Medical Terminologies. Proceedings of the 2006 Australasian Language Technology Workshop (ALTW 2006); 2006. pp. 75–82.
- 8. Huang Yang, Lowe Henry J. A novel hybrid approach to automated negation detection in clinical radiology reports. J Am Med Inform Assoc. 2007;14:304–311. doi: 10.1197/jamia.M2284.
- 9. Aronow David B., Feng Fangfang. Ad-Hoc Classification of Electronic Clinical Documents. D-Lib Magazine. 1997 Jan.
- 10. Aronson Alan R. Effective Mapping of Biomedical Text to the UMLS Metathesaurus: The MetaMap Program. Proc. of the AMIA Symposium. 2001:17–21.