AMIA Annual Symposium Proceedings. 2021 Jan 25;2020:452–461.

SpiNet - A FrameNet-like Schema for Automatic Information Extraction about Spine from Scientific Papers

Vanessa C Ferreira 1, Vládia Pinheiro 2
PMCID: PMC8075448  PMID: 33936418

Abstract

New medical research concerning the spine and its diseases is incrementally made available through biomedical literature repositories. Several Natural Language Processing (NLP) tasks, such as Semantic Role Labelling (SRL) and Information Extraction (IE), can support the automatic extraction of relevant information about the spine from scientific papers. This paper presents a domain-specific FrameNet, called SpiNet, for automatic extraction of spine-related concepts and their semantic types. To this end, we use frame semantics and the MeSH ontology to extract relevant information about diseases, treatments, medications, and signs or symptoms related to the spine medical domain. The distinguishing feature of this work is the enrichment of SpiNet’s base with the MeSH ontology, whose terms, concepts, descriptors and semantic types enable automatic semantic annotation. We used the SpiNet framework to annotate more than one hundred scientific papers, and the F1-score, computed between the system’s classification of relevant sentences and that of human physiotherapists, was 0.83.

Introduction

New medical research concerning the spine and its diseases is frequently published and made available through biomedical literature repositories such as PEDro1 and Cochrane2. Elkins et al.3 observed that the number of records in the PEDro database has doubled every 3.5 years, which means it is growing fast. The information available in this literature is important for clinical decisions, for training students and experts, for improving and innovating treatments, and for other applications. However, the analysis of this content is performed manually by experts and is extremely time-consuming and expensive.

Several Natural Language Processing (NLP) tasks, such as Semantic Role Labelling (SRL) and Information Extraction (IE), can support the automatic retrieval and extraction of relevant information about the spine, its diseases, treatments and symptoms from scientific papers. Roberts et al.4 affirm that the challenge is to develop general-purpose extraction algorithms for clinical text, especially when it has not been defined, a priori, which information should be extracted. In this sense, they developed a pilot project named Cancer FrameNet, a resource for cancer-related information extraction in clinical notes with three semantic frames (targeting the high-level tasks of cancer diagnoses, cancer therapeutic procedures, and tumor descriptions), created and annotated on a clinical text corpus. This FrameNet-like resource is an important instrument for IE applications in oncology.

The Berkeley FrameNet5 is the best-known frame semantic base, in which a word or phrase evokes a frame of semantic knowledge that describes the characteristic attributes (or semantic roles) associated with a concept or event. For example, for the concept Radiculopathy, the frame would contain elements that describe the symptoms, which body part is affected by those symptoms, and which group of patients suffers from them. The set of frame elements can either be defined prior to annotation by an expert or added, interactively, based on an annotated corpus.

This paper describes a domain-specific FrameNet, called SpiNet, for automatic extraction of spine-related concepts and their semantic types from scientific papers. To this end, we use a framework based on frame semantics5 and the MeSH ontology6 to extract relevant information about diseases, treatments, medications, signs or symptoms, and other entities related to the spine medical domain. The distinguishing feature of SpiNet is the enrichment of its base with the MeSH ontology, whose terms, concepts, descriptors and semantic types enable automatic semantic annotation of scientific texts.

The SpiNet framework was used to automatically annotate 139 scientific papers, 8 of which were evaluated by experts. In classifying relevant sentences, the system achieved a precision of 0.72 and a recall (true positive rate) of 0.99. For frame annotation specifically, the average precision was 0.62, and for frame element annotation it was 0.81.

Related Work

Kokkinakis7 presents a general-purpose extractor for the biomedical domain in Swedish. One important aspect of that work is that, unlike the original FrameNet, the Swedish FrameNet (SweFN++) also contains domain information, specifically for the medical domain. Using pattern recognition combined with manual annotation, he obtained good results, but did not involve experts in the area. Roberts et al.4 developed another FrameNet-like schema for the specific case of extracting cancer information from clinical narratives, with manual annotation. Their work was also based on the original FrameNet, but focused on generating specific frames such as Cancer Diagnosis, Cancer Therapeutic Procedure and Tumor Description, by observing the existing literature and using specific terms such as adenocarcinoma and mastectomy as lexical units. The evaluation method was to calculate the agreement between the two annotators. They conclude that one of the reasons for the high agreement was the specificity of the words used, since the presence of a word such as adenocarcinoma almost certainly identifies a diagnosis. Both projects used the brat annotation tool.

Background Knowledge

Two main resources form the foundation of this work: the Berkeley FrameNet project5, which provides a framework for defining semantic frames (frame elements, lexical units, and annotated sentences); and the Medical Subject Headings (MeSH) thesaurus, a controlled and hierarchically organized ontology produced by the U.S. National Library of Medicine. It is used for indexing, cataloging, and searching biomedical and health-related information. MeSH includes the subject headings appearing in MEDLINE/PubMed, the NLM Catalog, and other NLM databases6. Below, we detail each of these resources.

FrameNet

Frame semantics is a concept introduced in linguistics but widely used in NLP tasks such as Semantic Role Labeling (SRL) and Information Extraction (IE). The best-known frame database is the Berkeley FrameNet, a general-purpose frame database that includes information related to events (or concepts). Its main structure consists of a Frame, which is a structured representation of an event (or concept), together with its Frame Elements (FE) and Lexical Units (LU): an FE is the part of a sentence that fills a semantic role in that event, and an LU is a word that can identify or evoke the event.

In Figure 1, we can see an example of a frame with its components. For example, the sentence “she wrinkled her nose in disapproval” is annotated with the FRAME Body movement and the FEs Agent, Body part and Internal cause, and the frame is evoked by the LU wrinkled.

Figure 1: Example of a frame from FrameNet (Bauer et al.8).
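The structure in Figure 1 can be expressed as a small data sketch; the class and field names below are illustrative, not FrameNet's own schema or API.

```python
# A minimal sketch of the Frame / Frame Element / Lexical Unit structure shown
# in Figure 1. The class and field names are illustrative, not FrameNet's own schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Frame:
    name: str
    frame_elements: List[str] = field(default_factory=list)  # semantic roles, e.g. Agent
    lexical_units: List[str] = field(default_factory=list)   # words that can evoke the frame

body_movement = Frame(
    name="Body_movement",
    frame_elements=["Agent", "Body_part", "Internal_cause"],
    lexical_units=["wrinkle.v"],
)

# "She wrinkled her nose in disapproval": wrinkled evokes Body_movement;
# "she" fills Agent, "her nose" fills Body_part, "in disapproval" fills Internal_cause.
```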

In Table 1, there is information on the current status of the project.

Table 1:

Quantity of entities on FrameNet.

Entity Quantity
Frames 1,224
Frame Elements 10,535
Lexical Units 13,675

Medical Subject Headings (MeSH) Ontology

The MeSH database was developed to support a biomedical catalog; it defines terms that can be used to retrieve relevant information from medical texts. The database is organized as a hierarchy of three entities: (1) Descriptor, a broader representation of a concept; (2) Concept, an exact representation of a concept; and (3) Term, the different ways a concept can be referred to.

Figure 2 shows the hierarchy for the descriptor Spine, which contains the concepts Spine and Vertebra. Below each concept are its terms. These terms are synonyms of each other, but only if they fall under the same concept and descriptor.

Figure 2: Example of a MeSH hierarchy for the descriptor Spine.

Another feature of MeSH is the Semantic Type, which characterizes a descriptor according to its meaning, specifying whether it denotes a disease, a treatment, a medication, a sign or symptom, or something else. There are 127 semantic types, and a descriptor can be classified into at most 4 different semantic types. Figure 3 shows 3 descriptors. The first two, Pelvic Pain and Low Back Pain, are defined by the semantic type Sign or Symptom, while the third, Muscle Weakness, is defined both as Sign or Symptom and as Pathologic Function, since it is essentially both.

Figure 3: Example of connections between semantic types and descriptors.

MeSH is a well-developed ontology that is constantly updated. In Table 2, we specify the number of each entity in the 2019 distribution of MeSH that was used.

Table 2:

Quantity of entities on MeSH.

Entity Quantity
Descriptor 29,351
Concept 57,246
Term 238,535
Semantic Type 127

Preparing an Initial Corpus of Scientific Texts

The first challenge in this work was to prepare an initial set of scientific papers and to identify which semantic frames from the Berkeley FrameNet base would be related to the spine domain. A set of 8 scientific papers was freely evaluated by four physiotherapists, specialists in diseases of the spine. Their evaluation aimed to identify relevant chunks of text and to annotate the topic of each chunk.

Figure 4 presents one of the papers annotated by those physiotherapists, with the chunks of text highlighted as relevant and the explanation of each annotation. The first highlighted passage reads “Lumbar radiculopathy refers to pain in the back or buttocks that radiates down the leg in a dermatomal distribution.”, and the annotation made by the expert identifies its relevance with the words “diagnosis and symptom”.

Figure 4: Example of a scientific paper and its relevant chunks of text, annotated by experts of the spine domain. There were 151 chunks of text that were considered relevant, containing 284 sentences from a total of 1,280 sentences.

The main reasons for considering an annotation relevant were Symptom, Treatment, Observation, Diagnosis, Analysis, or Evidence. Since 46% of those attributions were Symptom and Treatment, these were the ones used to define the semantic frames. Patient was added to this group since the context is medical.

We developed an algorithm for retrieving the frames from FrameNet that had Symptom, Treatment or Patient as a Frame Element (FE); a minimal sketch of this retrieval step is shown after the list below. This procedure initially returned 36 frames and, of these, only 4 were evoked by words (LUs) present in the relevant sentences. These evoked frames were the basis for the construction of SpiNet - the base of semantic frames for the spine domain:

  1. Condition Symptom Relation - a patient has a medical condition that can be understood by its Symptoms.

  2. Medical Intervention - procedural or medicine based interventions are used on a patient to attempt to alleviate a medical condition.

  3. Medical Conditions - words in this frame name medical conditions or diseases that a patient suffers from, is being treated for, may be cured of, or die of.

  4. Cure - this frame deals with a healer treating and curing an affliction (the injuries, disease, or pain) of the patient.
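The frame-retrieval step described above can be reproduced, approximately, with NLTK's bundled FrameNet data; the following is a minimal sketch assuming the framenet_v17 corpus has been downloaded, not the authors' original code.

```python
# A minimal sketch of the retrieval step described above, using NLTK's bundled
# FrameNet data (requires nltk.download('framenet_v17')). Illustrative, not the
# authors' original code.
from nltk.corpus import framenet as fn

TARGET_FES = {"Patient", "Symptom", "Treatment"}

def frames_with_target_fes(target_fes=TARGET_FES):
    """Return the FrameNet frames that declare at least one of the target Frame Elements."""
    return [frame for frame in fn.frames()
            if target_fes & set(frame.FE.keys())]

if __name__ == "__main__":
    candidates = frames_with_target_fes()
    print(len(candidates), "candidate frames")
    for frame in candidates:
        # print the frame name and a few of its lexical units (e.g. 'treat.v')
        print(frame.name, sorted(frame.lexUnit.keys())[:5])
```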

The SpiNet Schema

The main goal in creating SpiNet was to develop a FrameNet-inspired schema that is enhanced by the MeSH ontology. The intuition behind this idea is that MeSH contains a large number of terms, which means many words in sentences can be identified with it. In addition, MeSH provides a hierarchy in which every Term leads to a Concept, which leads to a Descriptor, which in turn leads to a Semantic Type. For that reason, if we associate a frame element such as Symptom with a semantic type such as Sign or Symptom, we obtain a list of 3,257 MeSH terms that can be identified as the Symptom of a frame.
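As a rough illustration of how the Term -> Concept -> Descriptor -> Semantic Type chain can expand a Frame Element into a list of terms, the toy structures below use the Spine example from Figure 2 and two of the mappings detailed in Table 3; real MeSH data would be loaded from the 2019 distribution.

```python
# Hypothetical sketch of the Term -> Concept -> Descriptor -> Semantic Type chain.
# The dictionaries below are tiny illustrative stand-ins; real MeSH data would be
# loaded from the 2019 XML/ASCII distribution.
TERM_TO_CONCEPT = {"Spinal Column": "Spine", "Vertebral Column": "Spine"}
CONCEPT_TO_DESCRIPTOR = {"Spine": "Spine"}
DESCRIPTOR_TO_SEMANTIC_TYPES = {"Spine": {"Body Part Organ or Organ Component"}}

# Frame Element -> MeSH Semantic Types (two of the links detailed in Table 3)
FE_TO_SEMANTIC_TYPES = {
    "Symptom": {"Sign or Symptom", "Injury or Poisoning"},
    "Location": {"Body Location or Region", "Body Part Organ or Organ Component"},
}

def terms_for_frame_element(fe_name):
    """Collect every MeSH term whose descriptor carries a semantic type mapped to the FE."""
    wanted = FE_TO_SEMANTIC_TYPES.get(fe_name, set())
    terms = set()
    for term, concept in TERM_TO_CONCEPT.items():
        descriptor = CONCEPT_TO_DESCRIPTOR[concept]
        if DESCRIPTOR_TO_SEMANTIC_TYPES.get(descriptor, set()) & wanted:
            terms.add(term)
    return terms

print(terms_for_frame_element("Location"))  # {'Spinal Column', 'Vertebral Column'}
```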

Since the MeSH ontology has 127 semantic types, a manageable number, we manually evaluated which of them could be linked to the Frame Elements of the four frames. We also added Frame Elements that were not in the original FrameNet but were considered relevant by the annotators (e.g. Object of Study, Concept). Table 3 details the relations established.

Table 3:

The relation between frame elements and MeSH Semantic Types

Frame Element MeSH Semantic Type
Patient Patient or Disabled Group, Age Group and Population Group
Symptom Sign or Symptom and Injury or Poisoning
Disease Disease or Syndrome
Treatment Therapeutic or Preventive Procedure and Antibiotic
Location Body Location or Region and Body Part Organ or Organ Component
Concept Quantitative Concept
Object of Study Mammal, Bird, Fish
Organism Function Organism Function

After defining the relation of FEs to MeSH Semantic Types, we needed to define the LUs. Most of the LUs were the same as in the original FrameNet, but we added 8 other verbs that were frequently used in the relevant parts of the papers: conclude.v, correlate.v, demonstrate.v, favor.v, highlight.v, improve.v, predict.v and resolve.v. We also removed original LUs that were not found in the evaluated set of papers (e.g. asthma.n as an LU of the Medical intervention frame in FrameNet). Table 4 gives the definition of each frame with its FEs and LUs, including the ones added.

Table 4:

SpiNet Schema definition

Frame | Frame Elements | Lexical Units
Condition symptom relation | Patient, Symptom, Disease, Location, Treatment, Concept, Object of Study, Organism Function | manifest.v, cause.v, produce.v, link.v, relate.v, induce.v, provoke.v, indicate.v, suggest.v, occur.v, observe.v, demonstrate.v, highlight.v, resolve.v, favor.v, improve.v, conclude.v, correlate.v, predict.v
Medical intervention | Patient, Disease, Treatment | treat.v, develop.v, indicate.v, consist.v, attempt.v, prevent.v
Cure | Patient, Disease, Location, Treatment | rehabilitate.v, treat.v
Medical conditions | Patient, Disease, Treatment | develop.v

Annotation

In order to create the SpiNet corpus of annotated sentences, three types of papers were selected and retrieved from the PEDro and Cochrane repositories: Systematic Review, Practice Guideline, and Clinical Study.

Two different approaches were used, based on input from the specialists. Since systematic reviews and practice guidelines are supposed to give a broad perspective on an issue, rather than focus on specific cases as a clinical study would, we extracted all sentences from systematic reviews and practice guidelines, while from clinical studies we used only the discussion section of the paper. The retrieved papers were processed by GROBID9, which annotates the text in XML, so we were able to extract the paragraphs (and their headers) and select only the desired sections of a paper.
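A minimal sketch of this step, assuming a GROBID server running locally on its default port; the endpoint and the TEI XPath choices follow GROBID's documented REST API, but the code is illustrative rather than the authors' exact pipeline.

```python
# A minimal sketch of PDF-to-TEI conversion with a locally running GROBID server,
# followed by extraction of section headers and paragraphs. The server URL is
# GROBID's default; the code is illustrative, not the authors' exact pipeline.
import requests
from lxml import etree

GROBID_URL = "http://localhost:8070/api/processFulltextDocument"
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

def extract_sections(pdf_path):
    """Return a list of (section_header, paragraph_text) pairs from one PDF."""
    with open(pdf_path, "rb") as pdf:
        response = requests.post(GROBID_URL, files={"input": pdf})
    response.raise_for_status()
    tei = etree.fromstring(response.content)

    sections = []
    for div in tei.findall(".//tei:body//tei:div", TEI_NS):
        head = div.find("tei:head", TEI_NS)
        header = (head.text or "") if head is not None else ""
        for p in div.findall("tei:p", TEI_NS):
            sections.append((header, "".join(p.itertext())))
    return sections
```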

For text segmentation and text cleaning, the NLTK toolkit10 was used. Since the documents of choice are scientific papers, it had to be taken into consideration that, as is common in this type of document, many terms are abbreviated after their first occurrence. The first time Low Back Pain appears in a text, it is written in full; afterwards it will probably be written as LBP. To handle this, we used the Schwartz-Hearst algorithm11, which identifies these abbreviations. A cleaning step was applied to most papers, keeping only the content after the abstract and before the references, since information in the abstract would most likely be repeated in the body, and the references would not be relevant.
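The cleaning and segmentation step could look roughly like the sketch below, assuming NLTK's punkt models are available; resolve_abbreviations() stands in for a Schwartz-Hearst implementation and is a hypothetical helper.

```python
# A rough sketch of the cleaning and segmentation step, assuming NLTK's punkt
# models (nltk.download('punkt')). resolve_abbreviations() stands in for a
# Schwartz-Hearst implementation and is a hypothetical helper.
import re
from nltk.tokenize import sent_tokenize

def clean_body(text):
    """Keep only the content after the abstract and before the references."""
    start = re.search(r"\babstract\b", text, flags=re.IGNORECASE)
    end = re.search(r"\breferences\b", text, flags=re.IGNORECASE)
    return text[start.end() if start else 0 : end.start() if end else len(text)]

def resolve_abbreviations(text, pairs):
    """Replace short forms (e.g. 'LBP') with their long forms ('Low Back Pain')."""
    for short, long_form in pairs.items():
        text = re.sub(rf"\b{re.escape(short)}\b", long_form, text)
    return text

if __name__ == "__main__":
    raw = "Abstract Low Back Pain (LBP) is common. LBP affects many adults. References ..."
    body = resolve_abbreviations(clean_body(raw), {"LBP": "Low Back Pain"})
    print(sent_tokenize(body))
```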

In order to automatically annotate the text, the following procedure was followed (a minimal sketch of it appears after the list):

  1. Identify relevant sentences:
    • find an LU in the sentence
    • associate the sentence with the Frame that contains the LU
  2. Identify semantic types in relevant sentences:
    • find semantic types in the text, based on the terms it contains. For example, if the term Spinal Column or Vertebral Column occurs in the sentence, it is identified as the concept and descriptor Spine, which in turn is identified as being of the semantic type Body Part Organ or Organ Component
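The two-step procedure can be summarized in the following simplified sketch; LU_TO_FRAME and TERM_TO_SEMANTIC_TYPE are small illustrative stand-ins for the SpiNet lexicon and the MeSH-derived term index.

```python
# A simplified sketch of the two-step annotation procedure above. LU_TO_FRAME and
# TERM_TO_SEMANTIC_TYPE are small illustrative stand-ins for the SpiNet lexicon
# and the MeSH-derived term index.
LU_TO_FRAME = {"relate": "Condition symptom relation", "treat": "Cure"}
TERM_TO_SEMANTIC_TYPE = {
    "disk degeneration": "Disease or Syndrome",
    "young adults": "Age Group",
    "asymptomatic individuals": "Patient or Disabled Group",
}

def annotate(sentence):
    """Return (frames, term annotations) for one sentence; empty frames means not relevant."""
    lowered = sentence.lower()
    frames = {frame for lu, frame in LU_TO_FRAME.items() if lu in lowered}
    if not frames:
        return set(), []  # step 1 failed: the sentence is not considered relevant
    terms = [(term, stype) for term, stype in TERM_TO_SEMANTIC_TYPE.items() if term in lowered]
    return frames, terms

print(annotate("Disk degeneration is related to age in asymptomatic individuals."))
```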

Figure 5 shows an annotation produced in the experiment, in which the LU relate is found in a sentence as related, identifying the frame Condition symptom relation. After the frame is identified, we annotate the terms found: Asymptomatic Individuals, Disk Degeneration and Young Adults, which are then associated with their semantic types: Patient or Disabled Group, Disease or Syndrome and Age Group, respectively.

Figure 5: Example of an automatically annotated sentence.

Since the annotation is performed automatically, we calculated precision and recall between the machine annotation and the specialist validation (human physiotherapists). For this validation process, we developed a website (see Figure 6) in which the specialists could evaluate sentences, with embedded visualizations from the BRAT annotation tool12. The papers were evaluated by sentence and not by individual FE, so a validated sentence has all of its FEs considered validated.

Figure 6: Website developed for annotation.

Annotation Statistics

Table 5 presents the statistics of the SpiNet corpus and the results of the human evaluation: 139 scientific papers have been processed so far, in the order in which they were retrieved, and 8 of them were evaluated by four human experts. Every sentence was presented for validation, but most of the sentences evaluated by the specialists were the ones SpiNet considered relevant. The evaluated papers contained a total of 1,213 sentences, of which 218 were considered relevant by the SpiNet framework, and 154 of these were evaluated by the human experts (as relevant or not). Only 7 of the sentences considered not relevant by SpiNet were evaluated by the human experts, making a total of 161 sentences evaluated by the experts. Table 6 presents the confusion matrix with the numbers of sentences predicted by the system and evaluated by the experts, as relevant or not.

Table 5:

SpiNet Corpus and Evaluation Statistics.

Statistic Value
Papers processed 139
Papers evaluated 8
Sentences from evaluated papers 1,213
Sentences considered relevant by SpiNet 218
Sentences evaluated by human experts 161
Relevant sentences (by SpiNet) evaluated by experts 154
Not relevant sentences (by SpiNet) evaluated by experts 7

Table 6:

Confusion matrix for validation.

True \ Pred. | Relevant | Not Relevant | Total
Relevant | 111 | 1 | 112
Not Relevant | 43 | 6 | 49
Total | 154 | 7 | 161

Descriptive statistics of the annotated corpus are provided in Table 7. A total of 1,213 sentences were annotated with 28 frame-evoking LUs, and a total of 287 frame occurrences were identified in the 218 relevant sentences, which means some sentences contained LUs from more than one frame. Since not every relevant sentence from those papers was evaluated (only 154 of 218), Table 7 also shows the occurrences of evaluated frames.

Table 7:

Annotated and Evaluated Frame frequency.

Frame Frequency Evaluated
Condition Symptom Relation 205 151
occur 30 19
cause 27 20
suggest 21 12
indicate 18 10
demonstrate 18 15
produce 16 13
relate 15 12
observe 8 6
improve 8 5
induce 8 8
predict 6 6
highlight 5 5
conclude 5 3
manifest 5 4
resolve 5 4
correlate 4 4
favor 2 1
provoke 2 2
link 2 2
Medical Intervention 54 29
indicate 18 10
treat 17 10
develop 10 3
consist 4 1
attempt 4 4
prevent 1 1
Cure 18 11
treat 17 10
rehabilitate 1 1
Medical Conditions 10 3
develop 10 3
Total 287 194

Regarding frame elements, the top-4 most common were Symptom, Location, Disease, and Treatment, which are also the ones that can be identified through most frames. On the other hand, Patient, which is linked to all frames, is not the one with the most occurrences. Table 8 shows the frequency of each frame element, again with the number of them that were evaluated.

Table 8:

Annotated and Evaluated Frame Element frequency.

Frame Element Frequency Evaluated
Symptom 151 111
Location 95 64
Disease 70 30
Treatment 66 55
Concept 32 28
Organism Function 19 16
Patient 15 13
Object of Study 5 5
Total 453 322

Annotation Evaluation

In order to evaluate the classification of sentences as relevant using the SpiNet framework, we calculated precision and recall. Given the confusion matrix in Table 6, 111 of the 154 sentences predicted as relevant by SpiNet were evaluated as relevant by the experts, resulting in a precision of 0.72. In terms of recall (true positive rate), SpiNet correctly classified as relevant 111 sentences out of a universe of 112 true positives according to the experts, resulting in a recall of 0.99. The F1-score was therefore 0.83.
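These figures can be checked directly from the confusion matrix in Table 6 (a worked sketch, not the authors' evaluation script):

```python
# Worked check of the figures above, directly from the Table 6 confusion matrix
# (a sketch, not the authors' evaluation script).
tp, fp, fn = 111, 154 - 111, 112 - 111              # true/false positives, false negatives
precision = tp / (tp + fp)                          # 111 / 154 ~ 0.72
recall = tp / (tp + fn)                             # 111 / 112 ~ 0.99
f1 = 2 * precision * recall / (precision + recall)  # ~ 0.83
print(round(precision, 2), round(recall, 2), round(f1, 2))
```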

Table 9 shows the precision for each Frame, together with the number of true positives (Support). Although the precision rates varied between 0.45 and 0.77, the Condition Symptom Relation frame was the most frequently annotated frame and the one with the best precision.

Table 9:

Frame evaluation metrics.

Frame Precision Support
Condition Symptom Relation 0.77 117
Medical Intervention 0.59 17
Cure 0.45 5
Medical Conditions 0.67 2
Total 0.62 141

Regarding frame elements (FEs), the precision varied from 0.60 to 1.00, as seen in Table 10. It can be observed that the FEs that appear less often (e.g. Object of Study and Patient) are the ones whose performance varies more; we attribute this to the specificity of these FEs. On the other hand, the top-4 frame elements by frequency are also very close in terms of precision (0.77 to 0.89).

Table 10:

Frame Element evaluation metrics.

Frame Element Precision Support
Symptom 0.79 88
Location 0.83 53
Disease 0.77 23
Treatment 0.89 49
Concept 0.64 18
Organism Function 1.00 16
Patient 1.00 13
Object Of Study 0.60 3
Total 0.81 263

Conclusion

In this paper, we presented SpiNet, a project that aims to automatically extract relevant sentences and passages from scientific papers. We provide a FrameNet-like schema consisting of four frames (Condition Symptom Relation, Cure, Medical Conditions and Medical Intervention) with definitions of Frame Elements (FE) and Lexical Units (LU). In addition, we propose an enhancement of the annotation by associating FEs from SpiNet with Semantic Types from MeSH, making it possible to use MeSH Terms to annotate FEs. In the future, with more validated sentences, we intend to use learning algorithms to improve our annotated corpus.

One of the limitations found was that validation was performed at the sentence level rather than at the Frame Element level, which means we had to assume that if a sentence was validated, the frame and FEs recognized in that sentence were also validated. Another limitation was the PDF extraction module, which in most cases produced an acceptable outcome, but when it was not able to identify references or tables correctly, it interfered with sentence segmentation. We also did not include any strategy to deal with negation and uncertainty, as we are still studying and evaluating how to do so. A potential disadvantage of SpiNet is also that most of its Frames have many Frame Elements in common, despite having quite distinct purposes and definitions. However, we observed in the annotated sentences that the FEs were indeed repeated even when the frames differed.

Despite these limitations, the experimental evaluation described in this work indicates that the SpiNet framework can be used to automate the extraction of information about diseases, treatments and symptoms in the medical domain, specifically in the area of physiotherapy.

Acknowledgements

The annotation and validation of the SpiNet dataset was assisted by physiotherapists from ITC Vertebral, a Brazilian institute for treatment of the spine.

References

  • 1.Physiotherapy evidence database https://www.pedro.org.au/ . (Accessed on 01/10/2020)
  • 2.Cochrane library https://www.cochranelibrary.com/ . (Accessed on 01/10/2020)
  • 3.Elkins Mark R, Moseley Anne M, Sherrington Catherine, Herbert Robert D, Maher Christopher G. Growth in the physiotherapy evidence database (pedro) and use of the pedro scale. 2013. [DOI] [PubMed]
  • 4.Roberts Kirk, Si Yuqi, Gandhi Anshul, Bernstam Elmer. A framenet for cancer information in clinical narratives: schema and annotation. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) 2018.
  • 5.Baker Collin F., Fillmore Charles J., Lowe John B. The Berkeley FrameNet project; 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1; Montreal, Quebec, Canada, August 1998. Association for Computational Linguistics; pp. 86–90. [Google Scholar]
  • 6.Medical subject headings https://www.nlm.nih.gov/mesh/meshhome.html . (Accessed on 01/10/2020)
  • 7.Kokkinakis Dimitrios. Initial experiments of medication event extraction using frame semantics. 2012.
  • 8.Bauer Daniel, Fürstenau Hagen, Rambow Owen C. The dependency-parsed FrameNet corpus. 2012.
  • 9.Grobid https://github.com/kermitt2/grobid , 2008 — 2020.
  • 10.Loper Edward, Bird Steven. Nltk: the natural language toolkit. arXiv preprint cs/0205028. 2002.
  • 11.Schwartz Ariel, Hearst Marti. A simple algorithm for identifying abbreviation definitions in biomedical text. Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing. 02 2003;4:451–62. [PubMed] [Google Scholar]
  • 12.Stenetorp Pontus, Topić Goran, Pyysalo Sampo, Ohta Tomoko, Kim Jin-Dong, Tsujii Jun’ichi. BioNLP shared task 2011: Supporting resources. Proceedings of BioNLP Shared Task 2011 Workshop; Portland, Oregon, USA, June 2011: Association for Computational Linguistics; pp. 112–120. [Google Scholar]
