AMIA Annual Symposium Proceedings. 2021 Jan 25;2020:452–461.

SpiNet - A FrameNet-like Schema for Automatic Information Extraction about Spine from Scientific Papers

Vanessa C Ferreira 1, Vládia Pinheiro 2
PMCID: PMC8075448  PMID: 33936418

Abstract

New medical research concerning the spine and its diseases is incrementally made available through biomedical literature repositories. Several Natural Language Processing (NLP) tasks, such as Semantic Role Labelling (SRL) and Information Extraction (IE), can support the automatic extraction of relevant information about the spine from scientific papers. This paper presents a domain-specific FrameNet, called SpiNet, for automatic extraction of spine-related concepts and their semantic types. To this end, we use frame semantics and the MeSH ontology to extract relevant information about diseases, treatments, medications, and signs or symptoms related to the spine medical domain. The distinguishing feature of this work is the enrichment of SpiNet’s base with the MeSH ontology, whose terms, concepts, descriptors and semantic types enable automatic semantic annotation. We used the SpiNet framework to annotate more than one hundred scientific papers, and the F1-score, computed between the system’s classification of relevant sentences and that of human physiotherapists, was 0.83.

Introduction

New medical research concerning the spine and its diseases is frequently published and made available through biomedical literature repositories such as PEDro1 and Cochrane2. Elkins et al.3 observed that the number of records in the PEDro database has doubled every 3.5 years, which means it is growing fast. The information available in this literature is important for clinical decisions, for training students and experts, for improving and innovating treatments, and for other applications. However, the analysis of this content is performed manually by experts and is extremely time-consuming and expensive.

Several Natural Language Processing (NLP) tasks, such as Semantic Role Labelling (SRL) and Information Extraction (IE), can support the automatic retrieval and extraction of relevant information about the spine, its diseases, treatments and symptoms from scientific papers. Roberts et al.4 affirm that the challenge is to develop general-purpose extraction algorithms for clinical text, especially when it has not been defined, a priori, which information should be extracted. In this sense, they developed a pilot project named Cancer FrameNet, a resource for cancer-related information extraction in clinical notes with three semantic frames (targeting the high-level tasks of cancer diagnoses, cancer therapeutic procedures, and tumor descriptions), created and annotated on a clinical text corpus. This FrameNet-like resource is an important instrument for IE applications in oncology.

The Berkeley FrameNet5 is the best-known frame semantic base, in which a word or phrase evokes a frame of semantic knowledge that describes the characteristic attributes (or semantic roles) associated with a concept or event. For example, for the concept Radiculopathy, the frame would contain elements that describe the symptoms, which body part is affected by those symptoms, and which group of patients suffers from them. The set of frame elements can either be defined prior to annotation by an expert or added, interactively, based on an annotated corpus.

This paper describes a domain-specific FrameNet, called SpiNet, for automatic extraction of spine-related concepts and their semantic types from scientific papers. To this end, we use a framework based on frame semantics5 and the MeSH ontology6 to extract relevant information about diseases, treatments, medications, signs or symptoms, and other entities related to the spine medical domain. The distinguishing feature of SpiNet is the enrichment of its base with the MeSH ontology, whose terms, concepts, descriptors and semantic types enable automatic semantic annotation of scientific texts.

The SpiNet framework was used to automatically annotate 139 scientific papers, 8 of which were evaluated by experts. In classifying relevant sentences, the system achieved a precision of 0.72 and a recall (true positive rate) of 0.99. For frame annotation specifically, the average precision was 0.62, and for frame element annotation it was 0.81.

Related Work

Kokkinakis7 presents a general-purpose extractor for the biomedical domain in Swedish. One important aspect of that work is that, unlike the original FrameNet, the Swedish FrameNet (SweFN++) also contains domain information, specifically for the medical domain. Using pattern recognition combined with manual annotation, he obtained good results, but did not involve experts in the area. Roberts et al.4 developed another FrameNet-like schema for the specific case of extracting cancer information from clinical narratives, with manual annotation. Their work was also based on the original FrameNet, but focused on generating specific frames such as Cancer Diagnosis, Cancer Therapeutic Procedure and Tumor Description, by observing the existing literature and using specific terms such as adenocarcinoma and mastectomy as lexical units. The evaluation method was to calculate the agreement between the two annotators. They conclude that one of the reasons for the high agreement was the specificity of the words used, since the presence of a word such as adenocarcinoma almost certainly identifies a diagnosis. Both projects used the brat annotation tool.

Background Knowledge

Two main resources form the foundation of this work: the Berkeley FrameNet project5, which provides a framework for defining semantic frames (frame elements, lexical units, and annotated sentences); and the Medical Subject Headings (MeSH) thesaurus, a controlled and hierarchically organized ontology produced by the U.S. National Library of Medicine. It is used for indexing, cataloging, and searching biomedical and health-related information. MeSH includes the subject headings appearing in MEDLINE/PubMed, the NLM Catalog, and other NLM databases6. Below, we detail each of these resources.

FrameNet

Frame semantics is a concept introduced in linguistics but widely used in NLP tasks such as Semantic Role Labeling (SRL) and Information Extraction (IE). The best-known frame database is the Berkeley FrameNet, a general-purpose frame database that includes information related to events (or concepts). Its main structure consists of a Frame, which is a structured representation of an event (or concept), together with its Frame Elements (FE) and Lexical Units (LU): an FE is the part of a sentence that fills a semantic role in that event, and an LU is a word that can identify or evoke the event.

In Figure 1, we can see an example of a frame with its components. For example, the sentence “she wrinkled her nose in disapproval” is annotated with the FRAME Body movement and the FEs Agent, Body part and Internal cause, and the frame is evoked by the LU wrinkled.

Figure 1: Example of a frame from FrameNet (Bauer et al.8).
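The structure in Figure 1 can be expressed as a small data sketch; the class and field names below are illustrative, not FrameNet's own schema or API.

```python
# A minimal sketch of the Frame / Frame Element / Lexical Unit structure shown
# in Figure 1. The class and field names are illustrative, not FrameNet's own schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Frame:
    name: str
    frame_elements: List[str] = field(default_factory=list)  # semantic roles, e.g. Agent
    lexical_units: List[str] = field(default_factory=list)   # words that can evoke the frame

body_movement = Frame(
    name="Body_movement",
    frame_elements=["Agent", "Body_part", "Internal_cause"],
    lexical_units=["wrinkle.v"],
)

# "She wrinkled her nose in disapproval": wrinkled evokes Body_movement;
# "she" fills Agent, "her nose" fills Body_part, "in disapproval" fills Internal_cause.
```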

In Table 1, there is information on the current status of the project.

Table 1:

Quantity of entities on FrameNet.

Entity Quantity
Frames 1,224
Frame Elements 10,535
Lexical Units 13,675

Medical Subject Headings (MeSH) Ontology

The MeSH database was developed to support a biomedical catalog; it defines terms that can be used to retrieve relevant information from medical texts. The database is organized as a hierarchy of three entities: (1) Descriptor, a broader representation of a concept; (2) Concept, an exact representation of a concept; and (3) Term, the different ways a concept can be referred to.

Figure 2 shows the hierarchy for the descriptor Spine, which contains the concepts Spine and Vertebra. Below each concept are its terms. These terms are synonyms of each other, but only if they fall under the same concept and descriptor.

Figure 2: Example of a MeSH hierarchy for the descriptor Spine.

Another feature of MeSH is the Semantic Type, which characterizes a descriptor according to its meaning, specifying whether it denotes a disease, a treatment, a medication, a sign or symptom, or something else. There are 127 semantic types, and a descriptor can be classified into at most 4 different semantic types. Figure 3 shows 3 descriptors. The first two, Pelvic Pain and Low Back Pain, are defined by the semantic type Sign or Symptom, while the third, Muscle Weakness, is defined both as Sign or Symptom and as Pathologic Function, since it is essentially both.

Figure 3: Example of connections between semantic types and descriptors.

MeSH is a well-developed ontology that is constantly updated. In Table 2, we specify the number of each entity in the 2019 distribution of MeSH that was used.

Table 2:

Quantity of entities on MeSH.

Entity Quantity
Descriptor 29,351
Concept 57,246
Term 238,535
Semantic Type 127

Preparing an Initial Corpus of Scientific Texts

The first challenge in this work was to prepare an initial set of scientific papers and to identify which semantic frames from the Berkeley FrameNet base would be related to the spine domain. A set of 8 scientific papers was freely evaluated by four physiotherapists, specialists in diseases of the spine. Their evaluation aimed to identify relevant chunks of text and to annotate the topic of each chunk.

Figure 4 presents one of the papers annotated by those physiotherapists, with the chunks of text highlighted as relevant and the explanation of each annotation. The first highlighted passage reads “Lumbar radiculopathy refers to pain in the back or buttocks that radiates down the leg in a dermatomal distribution.”, and the annotation made by the expert identifies its relevance with the words “diagnosis and symptom”.

Figure 4: Example of a scientific paper and its relevant chunks of text, annotated by experts of the spine domain. There were 151 chunks of text that were considered relevant, containing 284 sentences from a total of 1,280 sentences.

The main reasons for considering an annotation relevant were Symptom, Treatment, Observation, Diagnosis, Analysis, or Evidence. Since 46% of those attributions were Symptom and Treatment, these were the ones used to define the semantic frames. Patient was added to this group since the context is medical.

We developed an algorithm for retrieving the frames from FrameNet that had Symptom, Treatment or Patient as a Frame Element (FE); a minimal sketch of this retrieval step is shown after the list below. This procedure initially returned 36 frames and, of these, only 4 were evoked by words (LUs) present in the relevant sentences. These evoked frames were the basis for the construction of SpiNet - the base of semantic frames for the spine domain:

  1. Condition Symptom Relation - a patient has a medical condition that can be understood by its Symptoms.

  2. Medical Intervention - procedural or medicine based interventions are used on a patient to attempt to alleviate a medical condition.

  3. Medical Conditions - words in this frame name medical conditions or diseases that a patient suffers from, is being treated for, may be cured of, or die of.

  4. Cure - this frame deals with a healer treating and curing an affliction (the injuries, disease, or pain) of the patient.
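The frame-retrieval step described above can be reproduced, approximately, with NLTK's bundled FrameNet data; the following is a minimal sketch assuming the framenet_v17 corpus has been downloaded, not the authors' original code.

```python
# A minimal sketch of the retrieval step described above, using NLTK's bundled
# FrameNet data (requires nltk.download('framenet_v17')). Illustrative, not the
# authors' original code.
from nltk.corpus import framenet as fn

TARGET_FES = {"Patient", "Symptom", "Treatment"}

def frames_with_target_fes(target_fes=TARGET_FES):
    """Return the FrameNet frames that declare at least one of the target Frame Elements."""
    return [frame for frame in fn.frames()
            if target_fes & set(frame.FE.keys())]

if __name__ == "__main__":
    candidates = frames_with_target_fes()
    print(len(candidates), "candidate frames")
    for frame in candidates:
        # print the frame name and a few of its lexical units (e.g. 'treat.v')
        print(frame.name, sorted(frame.lexUnit.keys())[:5])
```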

The SpiNet Schema

The main goal in creating SpiNet was to develop a FrameNet-inspired schema that is enhanced by the MeSH ontology. The intuition behind this idea is that MeSH contains a large number of terms, which means many words in sentences can be identified with it. In addition, MeSH provides a hierarchy in which every Term leads to a Concept, which leads to a Descriptor, which in turn leads to a Semantic Type. For that reason, if we associate a frame element such as Symptom with a semantic type such as Sign or Symptom, we obtain a list of 3,257 MeSH terms that can be identified as the Symptom of a frame.
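As a rough illustration of how the Term -> Concept -> Descriptor -> Semantic Type chain can expand a Frame Element into a list of terms, the toy structures below use the Spine example from Figure 2 and two of the mappings detailed in Table 3; real MeSH data would be loaded from the 2019 distribution.

```python
# Hypothetical sketch of the Term -> Concept -> Descriptor -> Semantic Type chain.
# The dictionaries below are tiny illustrative stand-ins; real MeSH data would be
# loaded from the 2019 XML/ASCII distribution.
TERM_TO_CONCEPT = {"Spinal Column": "Spine", "Vertebral Column": "Spine"}
CONCEPT_TO_DESCRIPTOR = {"Spine": "Spine"}
DESCRIPTOR_TO_SEMANTIC_TYPES = {"Spine": {"Body Part Organ or Organ Component"}}

# Frame Element -> MeSH Semantic Types (two of the links detailed in Table 3)
FE_TO_SEMANTIC_TYPES = {
    "Symptom": {"Sign or Symptom", "Injury or Poisoning"},
    "Location": {"Body Location or Region", "Body Part Organ or Organ Component"},
}

def terms_for_frame_element(fe_name):
    """Collect every MeSH term whose descriptor carries a semantic type mapped to the FE."""
    wanted = FE_TO_SEMANTIC_TYPES.get(fe_name, set())
    terms = set()
    for term, concept in TERM_TO_CONCEPT.items():
        descriptor = CONCEPT_TO_DESCRIPTOR[concept]
        if DESCRIPTOR_TO_SEMANTIC_TYPES.get(descriptor, set()) & wanted:
            terms.add(term)
    return terms

print(terms_for_frame_element("Location"))  # {'Spinal Column', 'Vertebral Column'}
```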

Since the MeSH ontology has 127 semantic types, a manageable number, we manually evaluated which of them could be linked to the Frame Elements of the four frames. We also added Frame Elements that were not in the original FrameNet but were considered relevant by the annotators (e.g. Object of Study, Concept). Table 3 details the relations established.

Table 3:

The relation between frame elements and MeSH Semantic Types

Frame Element MeSH Semantic Type
Patient Patient or Disabled Group, Age Group and Population Group
Symptom Sign or Symptom and Injury or Poisoning
Disease Disease or Syndrome
Treatment Therapeutic or Preventive Procedure and Antibiotic
Location Body Location or Region and Body Part Organ or Organ Component
Concept Quantitative Concept
Object of Study Mammal, Bird, Fish
Organism Function Organism Function

After defining the relation of FEs to MeSH Semantic Types, we needed to define the LUs. Most of the LUs were the same as in the original FrameNet, but we added 8 other verbs that were frequently used in the relevant parts of the papers: conclude.v, correlate.v, demonstrate.v, favor.v, highlight.v, improve.v, predict.v and resolve.v. We also removed original LUs that were not found in the evaluated set of papers (e.g. asthma.n as an LU of the Medical intervention frame in FrameNet). Table 4 gives the definition of each frame with its FEs and LUs, including the ones added.

Table 4:

SpiNet Schema definition

Frame | Frame Elements | Lexical Units
Condition symptom relation | Patient, Symptom, Disease, Location, Treatment, Concept, Object of Study, Organism Function | manifest.v, cause.v, produce.v, link.v, relate.v, induce.v, provoke.v, indicate.v, suggest.v, occur.v, observe.v, demonstrate.v, highlight.v, resolve.v, favor.v, improve.v, conclude.v, correlate.v, predict.v
Medical intervention | Patient, Disease, Treatment | treat.v, develop.v, indicate.v, consist.v, attempt.v, prevent.v
Cure | Patient, Disease, Location, Treatment | rehabilitate.v, treat.v
Medical conditions | Patient, Disease, Treatment | develop.v

Annotation

In order to create the SpiNet corpus of annotated sentences, three types of papers were selected and retrieved from the PEDro and Cochrane repositories: Systematic Review, Practice Guideline, and Clinical Study.

Two different approaches were used, based on input from the specialists. Since systematic reviews and practice guidelines are supposed to give a broad perspective on an issue, rather than focus on specific cases as a clinical study would, we extracted all sentences from systematic reviews and practice guidelines, while from clinical studies we used only the discussion section of the paper. The retrieved papers were processed by GROBID9, which annotates the text in XML, so we were able to extract the paragraphs (and their headers) and select only the desired sections of a paper.
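A minimal sketch of this step, assuming a GROBID server running locally on its default port; the endpoint and the TEI XPath choices follow GROBID's documented REST API, but the code is illustrative rather than the authors' exact pipeline.

```python
# A minimal sketch of PDF-to-TEI conversion with a locally running GROBID server,
# followed by extraction of section headers and paragraphs. The server URL is
# GROBID's default; the code is illustrative, not the authors' exact pipeline.
import requests
from lxml import etree

GROBID_URL = "http://localhost:8070/api/processFulltextDocument"
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

def extract_sections(pdf_path):
    """Return a list of (section_header, paragraph_text) pairs from one PDF."""
    with open(pdf_path, "rb") as pdf:
        response = requests.post(GROBID_URL, files={"input": pdf})
    response.raise_for_status()
    tei = etree.fromstring(response.content)

    sections = []
    for div in tei.findall(".//tei:body//tei:div", TEI_NS):
        head = div.find("tei:head", TEI_NS)
        header = (head.text or "") if head is not None else ""
        for p in div.findall("tei:p", TEI_NS):
            sections.append((header, "".join(p.itertext())))
    return sections
```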

For text segmentation and text cleaning, the NLTK toolkit10 was used. Since the documents of choice are scientific papers, it had to be taken into consideration that, as is common in this type of document, many terms are abbreviated after their first occurrence. The first time Low Back Pain appears in a text, it is written in full; afterwards it will probably be written as LBP. To handle this, we used the Schwartz-Hearst algorithm11, which identifies these abbreviations. A cleaning step was applied to most papers, keeping only the content after the abstract and before the references, since information in the abstract would most likely be repeated in the body, and the references would not be relevant.
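The cleaning and segmentation step could look roughly like the sketch below, assuming NLTK's punkt models are available; resolve_abbreviations() stands in for a Schwartz-Hearst implementation and is a hypothetical helper.

```python
# A rough sketch of the cleaning and segmentation step, assuming NLTK's punkt
# models (nltk.download('punkt')). resolve_abbreviations() stands in for a
# Schwartz-Hearst implementation and is a hypothetical helper.
import re
from nltk.tokenize import sent_tokenize

def clean_body(text):
    """Keep only the content after the abstract and before the references."""
    start = re.search(r"\babstract\b", text, flags=re.IGNORECASE)
    end = re.search(r"\breferences\b", text, flags=re.IGNORECASE)
    return text[start.end() if start else 0 : end.start() if end else len(text)]

def resolve_abbreviations(text, pairs):
    """Replace short forms (e.g. 'LBP') with their long forms ('Low Back Pain')."""
    for short, long_form in pairs.items():
        text = re.sub(rf"\b{re.escape(short)}\b", long_form, text)
    return text

if __name__ == "__main__":
    raw = "Abstract Low Back Pain (LBP) is common. LBP affects many adults. References ..."
    body = resolve_abbreviations(clean_body(raw), {"LBP": "Low Back Pain"})
    print(sent_tokenize(body))
```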

In order to automatically annotate the text, the following procedure was followed (a minimal sketch of it appears after the list):

  1. Identify relevant sentences:
    • find an LU in the sentence
    • associate the sentence with the Frame that contains the LU
  2. Identify semantic types in relevant sentences:
    • find semantic types in the text, based on the terms it contains. For example, if the term Spinal Column or Vertebral Column occurs in the sentence, it is identified as the concept and descriptor Spine, which in turn is identified as being of the semantic type Body Part Organ or Organ Component
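The two-step procedure can be summarized in the following simplified sketch; LU_TO_FRAME and TERM_TO_SEMANTIC_TYPE are small illustrative stand-ins for the SpiNet lexicon and the MeSH-derived term index.

```python
# A simplified sketch of the two-step annotation procedure above. LU_TO_FRAME and
# TERM_TO_SEMANTIC_TYPE are small illustrative stand-ins for the SpiNet lexicon
# and the MeSH-derived term index.
LU_TO_FRAME = {"relate": "Condition symptom relation", "treat": "Cure"}
TERM_TO_SEMANTIC_TYPE = {
    "disk degeneration": "Disease or Syndrome",
    "young adults": "Age Group",
    "asymptomatic individuals": "Patient or Disabled Group",
}

def annotate(sentence):
    """Return (frames, term annotations) for one sentence; empty frames means not relevant."""
    lowered = sentence.lower()
    frames = {frame for lu, frame in LU_TO_FRAME.items() if lu in lowered}
    if not frames:
        return set(), []  # step 1 failed: the sentence is not considered relevant
    terms = [(term, stype) for term, stype in TERM_TO_SEMANTIC_TYPE.items() if term in lowered]
    return frames, terms

print(annotate("Disk degeneration is related to age in asymptomatic individuals."))
```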

Figure 5 shows an annotation produced in the experiment, in which the LU relate is found in a sentence as related, identifying the frame Condition symptom relation. After the frame is identified, we annotate the terms found: Asymptomatic Individuals, Disk Degeneration and Young Adults, which are then associated with their semantic types: Patient or Disabled Group, Disease or Syndrome and Age Group, respectively.

Figure 5: Example of an automatically annotated sentence.

Since the annotation is performed automatically, we calculated precision and recall between the machine annotation and the specialist validation (human physiotherapists). For this validation process, we developed a website (see Figure 6) in which the specialists could evaluate sentences, with embedded visualizations from the BRAT annotation tool12. The papers were evaluated by sentence and not by individual FE, so a validated sentence has all of its FEs considered validated.

Figure 6: Website developed for annotation.

Annotation Statistics

Table 5 presents the statistics of the SpiNet corpus and the results of the human evaluation: 139 scientific papers have been processed so far, in the order in which they were retrieved, and 8 of them were evaluated by four human experts. Every sentence was presented for validation, but most of the sentences evaluated by the specialists were the ones SpiNet considered relevant. The evaluated papers contained a total of 1,213 sentences, of which 218 were considered relevant by the SpiNet framework, and 154 of these were evaluated by the human experts (as relevant or not). Only 7 of the sentences considered not relevant by SpiNet were evaluated by the human experts, making a total of 161 sentences evaluated by the experts. Table 6 presents the confusion matrix with the numbers of sentences predicted by the system and evaluated by the experts, as relevant or not.

Table 5:

SpiNet Corpus and Evaluation Statistics.

Statistic Value
Papers processed 139
Papers evaluated 8
Sentences from evaluated papers 1,213
Sentences considered relevant by SpiNet 218
Sentences evaluated by human experts 161
Relevant sentences (by SpiNet) evaluated by experts 154
Not relevant sentences (by SpiNet) evaluated by experts 7

Table 6:

Confusion matrix for validation.

True \ Pred. | Relevant | Not Relevant | Total
Relevant | 111 | 1 | 112
Not Relevant | 43 | 6 | 49
Total | 154 | 7 | 161

Descriptive statistics of the annotated corpus are provided in Table 7. A total of 1,213 sentences were annotated with 28 frame-evoking LUs, and a total of 287 frame occurrences were identified in the 218 relevant sentences, which means some sentences contained LUs from more than one frame. Since not every relevant sentence from those papers was evaluated (only 154 of 218), Table 7 also shows the occurrences of evaluated frames.

Table 7:

Annotated and Evaluated Frame frequency.

Frame Frequency Evaluated
Condition Symptom Relation 205 151
occur 30 19
cause 27 20
suggest 21 12
indicate 18 10
demonstrate 18 15
produce 16 13
relate 15 12
observe 8 6
improve 8 5
induce 8 8
predict 6 6
highlight 5 5
conclude 5 3
manifest 5 4
resolve 5 4
correlate 4 4
favor 2 1
provoke 2 2
link 2 2
Medical Intervention 54 29
indicate 18 10
treat 17 10
develop 10 3
consist 4 1
attempt 4 4
prevent 1 1
Cure 18 11
treat 17 10
rehabilitate 1 1
Medical Conditions 10 3
develop 10 3
Total 287 194

Regarding frame elements, the top-4 most common were Symptom, Location, Disease, and Treatment, which are also the ones that can be identified through most frames. On the other hand, Patient, which is linked to all frames, is not the one with the most occurrences. Table 8 shows the frequency of each frame element, again with the number of them that were evaluated.

Table 8:

Annotated and Evaluated Frame Element frequency.

Frame Element Frequency Evaluated
Symptom 151 111
Location 95 64
Disease 70 30
Treatment 66 55
Concept 32 28
Organism Function 19 16
Patient 15 13
Object of Study 5 5
Total 453 322

Annotation Evaluation

In order to evaluate the classification of sentences as relevant using the SpiNet framework, we calculated precision and recall. Given the confusion matrix in Table 6, 111 of the 154 sentences predicted as relevant by SpiNet were evaluated as relevant by the experts, resulting in a precision of 0.72. In terms of recall (true positive rate), SpiNet correctly classified as relevant 111 sentences out of a universe of 112 true positives according to the experts, resulting in a recall of 0.99. The F1-score was therefore 0.83.
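These figures can be checked directly from the confusion matrix in Table 6 (a worked sketch, not the authors' evaluation script):

```python
# Worked check of the figures above, directly from the Table 6 confusion matrix
# (a sketch, not the authors' evaluation script).
tp, fp, fn = 111, 154 - 111, 112 - 111              # true/false positives, false negatives
precision = tp / (tp + fp)                          # 111 / 154 ~ 0.72
recall = tp / (tp + fn)                             # 111 / 112 ~ 0.99
f1 = 2 * precision * recall / (precision + recall)  # ~ 0.83
print(round(precision, 2), round(recall, 2), round(f1, 2))
```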

Table 9 shows the precision for each Frame, together with the number of true positives (Support). Although the precision rates varied between 0.45 and 0.77, the Condition Symptom Relation frame was the most frequently annotated frame and the one with the best precision.

Table 9:

Frame evaluation metrics.

Frame Precision Support
Condition Symptom Relation 0.77 117
Medical Intervention 0.59 17
Cure 0.45 5
Medical Conditions 0.67 2
Total 0.62 141

Regarding frame elements (FEs), the precision varied from 0.60 to 1.00, as seen in Table 10. It can be observed that the FEs that appear less often (e.g. Object of Study and Patient) are the ones whose performance varies more; we attribute this to the specificity of these FEs. On the other hand, the top-4 frame elements by frequency are also very close in terms of precision (0.77 to 0.89).

Table 10:

Frame Element evaluation metrics.

Frame Element Precision Support
Symptom 0.79 88
Location 0.83 53
Disease 0.77 23
Treatment 0.89 49
Concept 0.64 18
Organism Function 1.00 16
Patient 1.00 13
Object Of Study 0.60 3
Total 0.81 263

Conclusion

In this paper, we presented SpiNet, a project that aims to automatically extract relevant sentences and passages from scientific papers. We provide a FrameNet-like schema consisting of four frames (Condition Symptom Relation, Cure, Medical Conditions and Medical Intervention) with definitions of Frame Elements (FE) and Lexical Units (LU). In addition, we propose an enhancement of the annotation by associating FEs from SpiNet with Semantic Types from MeSH, making it possible to use MeSH Terms to annotate FEs. In the future, with more validated sentences, we intend to use learning algorithms to improve our annotated corpus.

One of the limitations found was that validation was performed at the sentence level rather than at the Frame Element level, which means we had to assume that if a sentence was validated, the frame and FEs recognized in that sentence were also validated. Another limitation was the PDF extraction module, which in most cases produced an acceptable outcome, but when it was not able to identify references or tables correctly, it interfered with sentence segmentation. We also did not include any strategy to deal with negation and uncertainty, as we are still studying and evaluating how to do so. A potential disadvantage of SpiNet is also that most of its Frames have many Frame Elements in common, despite having quite distinct purposes and definitions. However, we observed in the annotated sentences that the FEs were indeed repeated even when the frames differed.

Despite these limitations, the experimental evaluation described in this work indicates that the SpiNet framework can be used to automate the extraction of information about diseases, treatments and symptoms in the medical domain, specifically in the area of physiotherapy.

Acknowledgements

The annotation and validation of the SpiNet dataset was assisted by physiotherapists from ITC Vertebral, a Brazilian institute for treatment of the spine.

References

  • 1.Physiotherapy evidence database https://www.pedro.org.au/ . (Accessed on 01/10/2020)
  • 2.Cochrane library https://www.cochranelibrary.com/ . (Accessed on 01/10/2020)
  • 3.Elkins Mark R, Moseley Anne M, Sherrington Catherine, Herbert Robert D, Maher Christopher G. Growth in the physiotherapy evidence database (pedro) and use of the pedro scale. 2013. [DOI] [PubMed]
  • 4.Roberts Kirk, Si Yuqi, Gandhi Anshul, Bernstam Elmer. A framenet for cancer information in clinical narratives: schema and annotation. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) 2018.
  • 5.Baker Collin F., Fillmore Charles J., Lowe John B. The Berkeley FrameNet project; 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1; Montreal, Quebec, Canada, August 1998. Association for Computational Linguistics; pp. 86–90. [Google Scholar]
  • 6.Medical subject headings https://www.nlm.nih.gov/mesh/meshhome.html . (Accessed on 01/10/2020)
  • 7.Kokkinakis Dimitrios. Initial experiments of medication event extraction using frame semantics. 2012.
  • 8.Bauer Daniel, Fürstenau Hagen, Rambow Owen C. The dependency-parsed FrameNet corpus. 2012.
  • 9.Grobid https://github.com/kermitt2/grobid , 2008 — 2020.
  • 10.Loper Edward, Bird Steven. Nltk: the natural language toolkit. arXiv preprint cs/0205028. 2002.
  • 11.Schwartz Ariel, Hearst Marti. A simple algorithm for identifying abbreviation definitions in biomedical text. Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing. 02 2003;4:451–62. [PubMed] [Google Scholar]
  • 12.Stenetorp Pontus, Topić Goran, Pyysalo Sampo, Ohta Tomoko, Kim Jin-Dong, Tsujii Jun’ichi. BioNLP shared task 2011: Supporting resources. Proceedings of BioNLP Shared Task 2011 Workshop; Portland, Oregon, USA, June 2011: Association for Computational Linguistics; pp. 112–120. [Google Scholar]
