PLOS One. 2021 Jan 6;16(1):e0244604. doi: 10.1371/journal.pone.0244604

Use of a modular ontology and a semantic annotation tool to describe the care pathway of patients with amyotrophic lateral sclerosis in a coordination network

Sonia Cardoso 1, Pierre Meneton 1, Xavier Aimé 1,2, Vincent Meininger 3, David Grabli 4, Gilles Guezennec 1, Jean Charlet 1,5,*
Editor: Robert Hoehndorf 6
PMCID: PMC7787442  PMID: 33406098

Abstract

The objective of this study was to describe the care pathway of patients with amyotrophic lateral sclerosis (ALS) based on real-life textual data from a regional coordination network, the Ile-de-France ALS network. This coordination network provides care for 92% of patients diagnosed with ALS living in Ile-de-France. We developed a modular ontology (OntoPaRON) for the automatic processing of these unstructured textual data. OntoPaRON comprises five modules: the core, medical, socio-environmental, coordination, and consolidation modules. Our approach was unique in its creation of fully defined concepts at different levels of the modular ontology to address specific topics relating to healthcare trajectories. We also created a semantic annotation tool adapted to the French language and to the specific features of our corpus, the Ontology-Based Semantic Annotation Module (OnBaSAM), using the OntoPaRON ontology as a reference. We used these tools to annotate the records of 928 patients automatically. The semantic (qualitative) annotations of the concepts were transformed into quantitative data. By using these pipelines, we were able to transform unstructured textual data into structured quantitative data. Based on data processing, semantic annotations, sociodemographic data for the patient and clinical variables, we found that the need and demand for human and technical assistance depend on the initial form of the disease, the motor state, and the patient’s age. The presence of exhaustion in care management is related to the patient’s motor and cognitive state.

Introduction

Neurodegenerative diseases affect many people in France, and throughout the world. For example, Parkinson’s disease affects about 160,000 people in France [1] and nearly one million in the United States (https://www.parkinson.org/Understanding-Parkinsons/Statistics). Alzheimer’s disease and related conditions affect more than one million people in France, and it was estimated that 5 million Americans were living with this disease in 2014 (https://www.cdc.gov/aging/aginginfo/alzheimers.htm). In Europe, the incidence of amyotrophic lateral sclerosis (ALS) has been estimated at 2.2 per 100,000 person-years (py) for the general population [2]. In 2015, 16,583 people were identified as having ALS in the USA [3]. In France, the annual incidence of ALS has been estimated at 1,500 new cases [4]. These diseases have many features in common: they cause a number of disabilities and handicaps that cannot be cured, although the symptoms can be treated with pharmacological and non-pharmacological approaches. Neurodegenerative diseases cause polymorphic damage (motor impairment, respiratory impairment, cognitive impairment, etc.), requiring the intervention of multiple structures and professionals. In France, the professionals and structures involved in the care pathway come from three specific sectors (medical, social, and medico-social). The care pathway includes support provided at home, but also passages through hospitals or care structures.

In France, care pathway optimization and the maintenance of care continuity have become key health policy issues in recent years. Care pathway optimization should improve patient management and have a positive economic impact on the healthcare system by limiting hospitalizations (particularly those that are avoidable) and unnecessary medical procedures. However, our knowledge of care pathways and their components, including their interruptions and the difficulties encountered by patients and their families, remains partial, because information is split between various actors and is often untraceable. The identification of interruptions of patient care requires a prior knowledge of the elements of the care pathway. We addressed these issues, using a real-life database for patients with ALS managed by the regional coordination network of Ile-de-France.

ALS is a neurodegenerative disease that affects the motor neurons, causing progressive weakening of the voluntary muscles. This disease has two forms: a familial form, accounting for 10% of cases, and a sporadic form accounting for the other 90%. Median survival after diagnosis is generally three years [5]. Progressive paralysis of the muscles generates functional limitations, causing disability (loss of the ability to walk, difficulty speaking, loss of dexterity), and ultimately resulting in death. The disease has an impact not only on the patient, but also on the patient’s family and carers [6]. Over and above management of the medical aspects of the disease, the patients and their families need compensation and assistance. This assistance may take the form of (a) human help with activities of daily living (eating, washing, getting dressed, etc.) or (b) technical assistance for mobility or communication, such as a powered wheelchair or a speech synthesizer [7, 8]. This human and technical assistance is costly, and patients and their families require social assistance to complete the administrative procedures necessary to obtain funding for it [9].

In France, ALS patients are managed partly by expert centers. A specific grouping of such centers has been established: Rare Diseases, Amyotrophic Lateral Sclerosis, and Motor Neuron Diseases (https://portail-sla.fr). In Paris, a regional coordination network was created in 2005: the Ile-de-France ALS network (ALS-IDF network). The objective of this network is to coordinate actions and to help patients, families, and professionals through holistic management encompassing medical, social and medico-social aspects during the various stages of disease progression. A study published in 2015 [10] reported that 92% of patients with ALS in Ile-de-France were managed by the ALS-IDF network. This study revealed an impact of this coordination on the number of hospital admissions and an improvement in survival. The ALS-IDF network established a database to make it easier to track the requests, needs and coordination actions implemented to support patients. This database contains two types of patient data: real-life and sociodemographic data.

We hypothesized that an analysis of these textual data would make it possible to identify the difficulties and needs of patients and their families at home, to understand the coordination actions implemented and to identify situations or types of patients confronted with multiple difficulties. The identification of such situations should improve patient support and the early detection of risk situations. We processed these textual data by semantic annotation using an ontology corresponding to this domain as a lexical resource. An ontology is defined as the formalization of a shared conceptualization [11]. Ontologies, as conceptual models, provide the necessary framework for semantic representation of textual information. Several ontologies have been developed in neurology, for Alzheimer’s disease [12], Parkinson’s disease [13], and neurological diseases in general [14]. In this context, the work done on the Alzheimer’s disease ontology [12] inspired us, because its authors developed an ontology with French and English terms to annotate an information portal. However, none of the existing ontologies simultaneously models all of the knowledge relating to ALS, care pathway coordination and the specific features of existing social and medical structures in France.

We have therefore developed our own ontology, OntoPaRON, including these different dimensions in a modular structure. A modular ontology corresponds to a set of modules, where each module is a stand-alone component that maintains relationships with other ontology modules [15]. A modular ontology seemed to be the most appropriate model for taking all these aspects into account. As a means of focusing our research on specific themes and identifying the difficulties encountered by patients during their care pathways, we decided to create fully defined concepts [16], each encompassing several classes relating to the same theme but from different modules.

We developed a semantic annotation tool based on the General Architecture for Text Engineering (GATE; https://gate.ac.uk) open framework, which provides basic building blocks for the annotation of textual data. GATE’s resources have been adapted to our needs and to the French language. Our annotator, the Ontology-Based Semantic Annotation Module (OnBaSAM), uses the OntoPaRON ontology. We used the resulting annotations to create a specific module with annotation frequency as output. One of the advantages of this work is that it makes it possible to convert unstructured textual data into structured quantitative data. These quantitative data can be used for population-based statistical approaches in which the annotation data obtained are used to describe the elements of patient care trajectories. The work includes information extraction using the ontology, and specifically its fully defined concepts, and data mining tasks that look for relationships between the information extracted by semantic annotation and the sociodemographic data from the ALS-IDF network database [17].

Materials and methods

Computer tools are essential for exploitation of the textual data of the ALS-IDF network, which include too great a volume of information for manual processing. We decided to use knowledge engineering and automatic natural language processing (NLP) tools. We present here: (a) the use of an ontology to model domain knowledge; (b) the creation of fully defined concepts concerning specific themes relating to the care pathway; (c) the use of a semantic annotation tool that we developed, the OnBaSAM module. All statistical analyses were performed with JMP 14 Pro Statistical Discovery software (SAS, Cary NC).

Materials

The ALS-IDF network database has two parts: a structured static part containing sociodemographic data for the patients (sex, date of diagnosis, date of inclusion in the network, living conditions, date of birth, place of residence, etc.), and a dynamic part consisting of real-life data in a textual format. Textual data for “events” (unstructured part of the database) are entered as free text by the coordinators and may be of different types (e.g. transmission, hospitalization reports, minutes of care team meetings, medical transcriptions, or the transcription of oral exchanges with patients or their families). Table 1 presents two examples of events present in the database. The processing of textual event data required initial spelling correction and pseudo-anonymization to comply with the General Data Protection Regulation (GDPR) rules. On August 26, 2019, the database of the ALS-IDF network contained 2,684 patient files, including more than 80,000 textual entries for events.

Table 1. Examples of events recorded in the ALS-IDF network database.

| In French | In English |
|---|---|
| Information du ssiad de VILLE, PROFESSIONEL: a débuté la PEC ce jour mais le logement n’est pas du tout adapté. Nous ne pourrons pas lui faire la douche car la SDB n’est pas du tout adaptée. Son épouse prend beaucoup de risques dans les transferts le patient pèse 103 kg il y a un risque de chute non négligeable. | Information from the care at home service of the patient’s town of residence, message from a PROFESSIONAL: management initiated today but the accommodation is entirely unsuitable. We will not be able to help him to shower because the bathroom is completely unsuitable. The patient’s wife takes a lot of risks when transferring the patient: the patient weighs 103 kg and there is a non-negligible risk of falls. |
| Appel de Patient qui a sollicité COORDINATEUR SLA, pour obtenir un certificat médical du NEUROLOGUE pour son dossier Mdph. Demande si COORDINATEUR SLA peut le rappeler pour son pb de FRE. | Call from a patient who has requested an ALS COORDINATOR, to obtain a medical certificate from the NEUROLOGIST for his local disability services file. Asked if ALS COORDINATOR can call him back for his electric wheelchair problem. |

These examples highlight the frequent use of abbreviations by coordinators (for example, PEC for patient management, SDB for bathroom, FRE for electric wheelchair). Pseudo-anonymization led to the replacement of names with functions, facilitating the annotation. A personal name such as Dr Brain is not defined in the ontology, whereas the agent ‘Neurologist’ is. Replacing the name with the function therefore turns a nominal datum into a reference to an ontology concept, which makes it possible, from a conceptual point of view, to identify the interactions between agents and actions.
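The replacement of names with functions can be illustrated with a minimal sketch. It assumes a simple lookup table from names to role labels; the article does not describe how the pseudo-anonymization was implemented, so the `NAME_TO_ROLE` table, the second name in it, and the `pseudonymize` helper are hypothetical.

```python
import re

# Hypothetical lookup from personal names to role labels. "Dr Brain" is the
# example used in the text; "Mrs Martin" is invented for illustration.
NAME_TO_ROLE = {
    "Dr Brain": "NEUROLOGIST",
    "Mrs Martin": "ALS COORDINATOR",
}


def pseudonymize(text: str, name_to_role: dict) -> str:
    """Replace each known name with the role label it stands for."""
    # Process longer names first so a short name never clips a longer one.
    for name in sorted(name_to_role, key=len, reverse=True):
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, name_to_role[name], text)
    return text


event = "Call from Dr Brain about the patient's file."
print(pseudonymize(event, NAME_TO_ROLE))
```

A lookup table only covers names known in advance; detecting unlisted names would require a named-entity recognition step, which this sketch does not attempt.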

Construction of the OntoPaRON ontology

We use ‘single quotes’ to denote OntoPaRON classes and italic font to denote the relationships in our ontology. The specific features of our project relate to (a) the decision to model knowledge through a modular ontology and (b) the use of fully defined concepts specifically created as themes of interest in statistical analyses, to improve clinical understanding of the care trajectory. Our ontology was developed with a methodology combining a top-down approach involving the use of a top-level ontology with a bottom-up approach involving searches for candidate terms in the corpus of the text [18].

The first step was the extraction of corpus terms with NLP tools in BIOTEX software [19]. The corpus used consisted of 60,130 events extracted from the database of the ALS-IDF network, covering a ten-year period of network activity (2005 to 2015). We selected the candidate terms for this analysis in collaboration with experts in the field. This method provides access to terms representing the concepts used. The OntoPaRON ontology was constructed with Protégé (https://protege.stanford.edu) version 5.2 [20]. Each concept in our ontology is denoted by a preferred term in English and in French, together with alternative terms (synonyms, acronyms, and abbreviations taking into account different spellings linked to the coordination context). Indeed, the coordinators of the ALS-IDF network, who come from different paramedical professions (nurse, occupational therapist, psychologist), have no agreed set of abbreviations. This diversity of usage required the collection of all the terms and abbreviations used for the same concept. For example, the concept ‘general practitioner’ may be denoted by eight alternative terms in French: ‘med tt’, ‘mt’, ‘family doctor’, ‘med ttt’, ‘méd t’, ‘méd tt’, ‘general practitioner’, ‘mdt’.
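As an illustration of why all variants must be collected, the sketch below maps the alternative terms quoted above to their concept after a light normalization (lower-casing and accent stripping). The normalization rules and the `build_index` and `lookup` helpers are assumptions made for this example; they are not taken from OnBaSAM.

```python
import unicodedata


def normalize(term: str) -> str:
    """Lower-case, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", term.lower())
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.split())


# Alternative terms quoted in the text for a single concept.
LABELS = {
    "general practitioner": [
        "med tt", "mt", "family doctor", "med ttt",
        "méd t", "méd tt", "general practitioner", "mdt",
    ],
}


def build_index(labels: dict) -> dict:
    """Map every normalized surface form to its concept."""
    index = {}
    for concept, terms in labels.items():
        for term in terms:
            index[normalize(term)] = concept
    return index


INDEX = build_index(LABELS)


def lookup(term: str):
    """Return the concept denoted by a term, or None if it is unknown."""
    return INDEX.get(normalize(term))


print(lookup("Méd TT"))
```

Accent stripping merges variants such as ‘méd tt’ and ‘med tt’ into one key, which is convenient here but could cause collisions between unrelated terms in a larger vocabulary.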

The second step was the alignment of concepts and enrichment of the ontology. For this step, we used the Health Terminology Ontology Portal (HeTOP; https://www.hetop.eu/hetop/) [21] to align the concepts of our ontological modules with other reference terminologies, using Unified Medical Language System (UMLS) codes.

The analysis of candidate terms led to the definition of four principal dimensions:

  • a) a generic dimension, corresponding to the set of concepts common to all themes present in the care pathway;

  • b) a medical dimension associated with the disease and medical management processes;

  • c) a socio-environmental dimension linked to the social situation and environment of the patient;

  • d) a coordination dimension linked to the actions of ALS-IDF network coordinators to help patients with administrative formalities, such as finding appropriate healthcare professionals and evaluating needs.

These four dimensions led us to create four ontological modules, one for each dimension.

Modularity of OntoPaRON

We chose to construct a modular ontology, consisting of four ontological modules (one core module and three domain modules) and a consolidation module. The modules are autonomous but have a defined association with other ontology modules, including the original ontology [22]. Modularity has several advantages, including module reuse and the facilitation of management, by module [23, 24]. We chose to create a modular ontology for several reasons: the possible secondary use of some of the modules for care trajectory analysis for other neurodegenerative diseases (e.g. Alzheimer’s disease, Parkinson’s disease), facilitation of the updating and handling of knowledge by taking the evolution of systems into account (social assistance system, medical advances), and the promotion of exchanges with experts in the field (doctors, coordinators). The OntoPaRON ontology is composed of five modules (one core module, three domain modules, one consolidation module):

  1. The core module, which contains all the high-level concepts common to the three ontologies, such as ‘ideal objects’, ‘agents’, ‘processes’, and ‘modes’. Some of the core ontology and high-level concepts are shown in Fig 1. This ontology was inspired by the Menelas top-level ontology [25] available from: https://bioportal.bioontology.org/ontologies/TOP-MENELAS. This module also contains all object properties used, and the fully defined high-level concepts.

  2. The medical module is the largest in terms of the number of classes. It is specific to the medical domain and includes: ‘medical agents’ (‘doctor’, ‘neurologist’, ‘physiotherapist’, etc.), ‘medical processes’ (‘consultation’, ‘hospitalization’, etc.), and ‘medical objects’ (‘drugs’, ‘prescriptions’, etc.). This module contains concepts relating to anatomical structures, signs, and symptoms. These concepts are directly related to the disease and its medical management.

  3. The socio-environmental module contains concepts relating to the life of the patients, in their family and social environments [26]. This module includes ‘agents’ (‘family’, ‘social workers’, etc.), ‘social actions’ (‘requests for benefits’, ‘requests for human assistance’, etc.), ‘physical objects’ from the social domain (‘wheelchair’, ‘medical insurance card’, ‘accommodation’, etc.), concepts linked to the various benefits the patient may receive and concepts relating to legal protection (‘guardianship’, ‘ward of court’). The socio-environmental module has the second largest number of classes.

  4. The coordination module consists mostly of specific coordination missions (‘coordination actions’). There are several types of coordination action: ‘communication actions’, ‘assessment of needs actions’, and ‘resource search actions’. We based our model partly on a previous study [27] describing the creation of the Nursing Care Coordination Ontology (https://bioportal.bioontology.org/ontologies/NCCO). This is the module with the smallest number of concepts.

  5. The last module is a consolidation module. This module does not model any concepts, but all the concepts of the ontology are imported into it. Following importation of the various modules, the HermiT reasoner (1.3.8.413) [28] from Protégé is used to infer classes under the fully defined concept. A reasoned version of the ontology, with the concepts inferred under the fully defined concept, is exported and used by the two tools we developed: the semantic annotator Ontology-Based Semantic Annotation Module (OnBaSAM), and the annotation evaluation tool Pronto.

Fig 1. Core module of the OntoPaRON ontology.

High-level concepts, such as ‘abstract objects’ and ‘ideal objects’, are presented in this module, together with the fully defined high-level concepts.

The importation links between the various modules of OntoPaRON are illustrated in Fig 2, and the metrics for each module of the ontology are shown in Table 2. OntoPaRON is available from https://bioportal.bioontology.org/ontologies/ONTOPARON. The ontology carries a set of metadata that follows the recommendations of [29]. To see the classification after reasoning, download the ontology from BioPortal, load it in Protégé and start the HermiT reasoner. The final reclassification is visible in the “class hierarchy (inferred)” view.

Fig 2. Overview of the OntoPaRON concept inheritance diagram.

Overview of the inheritance diagram of the OntoPaRON concepts and the corresponding URIs. The arrows point towards the modules that perform the import. Thus, the domain ontologies (ontoparonmed: medical ontology; ontoparonsoc: socio-environmental ontology; ontoparoncoord: coordination ontology) import the core ontology. In the same way, the OntoPaRON ontology imports each of the domain ontologies and, through them, the core ontology.
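The import structure described in this caption can be sketched as a small directed graph in which the consolidation ontology reaches the core ontology indirectly. The identifiers `ontoparoncore` and `ontoparon` are assumed names for the core and consolidation modules (only the three domain identifiers appear in the caption), and `import_closure` is a hypothetical helper.

```python
# Direct imports, as described for Fig 2. "ontoparoncore" and "ontoparon"
# are assumed identifiers; the three domain names come from the caption.
IMPORTS = {
    "ontoparoncore": set(),
    "ontoparonmed": {"ontoparoncore"},
    "ontoparonsoc": {"ontoparoncore"},
    "ontoparoncoord": {"ontoparoncore"},
    "ontoparon": {"ontoparonmed", "ontoparonsoc", "ontoparoncoord"},
}


def import_closure(module: str, imports: dict) -> set:
    """Return every module reachable from `module` through imports."""
    seen = set()
    stack = [module]
    while stack:
        current = stack.pop()
        for dependency in imports.get(current, ()):
            if dependency not in seen:
                seen.add(dependency)
                stack.append(dependency)
    return seen


print(sorted(import_closure("ontoparon", IMPORTS)))
```

Because the closure is computed transitively, the consolidation module sees the core concepts without importing the core module directly.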

Table 2. Metrics of the OntoPaRON ontology modules.

| | Core | Medical | Socio-environmental | Coordination | Consolidation | OntoPaRON |
|---|---|---|---|---|---|---|
| Number of classes | 378 | 1,041 | 740 | 303 | 0 | 2,462 |
| Number of relationships | 32 | 0 | 0 | 0 | 0 | 32 |
| Fully defined concepts | 7 | 17 | 10 | 9 | 0 | 43 |

The number of classes, relationships and fully defined concepts present in each module of the OntoPaRON ontology. As shown in the table, the consolidation module does not model any concepts; its main function is to aggregate the four modules.

Construction of fully defined concepts

In this use case, we wished to define the elements of the care trajectory of the patient and to determine whether these elements were expressed in a similar manner for all patients. We hypothesized that semantic annotation of the event database of the ALS-IDF network with our annotation tool, OnBaSAM, using the OntoPaRON ontology as a reference, would help us to understand the care trajectories of patients. We wanted to know whether certain themes, such as ‘exhaustion’, ‘obtaining human assistance’, or ‘seeking help to find a healthcare professional’, were frequent in patient care trajectories or whether they concerned all patients. Exhaustion during patient management may be expressed in several different ways: the spouse may explicitly report being exhausted, respite care may be used (requested by the patients or their families, or proposed by a healthcare professional), or the teams caring for the patient at home may be exhausted. We created fully defined concepts to bring together all the concepts relating to the same theme.

The organization of knowledge into an ontology made it possible to construct fully defined concepts with necessary and sufficient conditions. Fully defined concepts make it possible to group together, in the same class, all concepts linked to the same theme by the same object property. These concepts may be found at different levels in the ontology or in different ontology modules. The HermiT reasoner infers that every concept carrying a given relationship belongs to the fully defined concept defined with that same relationship. The fully defined high-level concepts found in the core module are illustrated in Fig 3. The ‘Exhaustion Domain’ thus includes the exhaustion of a carer from the family or of a professional carer during patient management. The concepts relating to this theme are found in two ontology modules: the concepts ‘request for respite care’ and ‘respite care proposal’ are found in the coordination module, whereas the concepts ‘need for respite care’ and ‘patient carer exhaustion’ are present in the socio-environmental module. These concepts have the object property aPourThématique ThématiqueEpuisement and are therefore inferred under the fully defined concept ‘Exhaustion Domain’.
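The grouping performed by the reasoner can be mimicked with a minimal sketch: every concept whose thematic property points to the exhaustion theme is collected, whatever its module. This is a plain-Python stand-in rather than OWL reasoning with HermiT; the `Concept` record and the `infer_members` helper are hypothetical, while the concept names, modules and property value come from the paragraph above (‘wheelchair’ is added as a counter-example without the theme).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str
    module: str
    themes: frozenset  # values of the aPourThématique object property


EXHAUSTION = "ThématiqueEpuisement"

# Hypothetical data layout for a handful of OntoPaRON concepts.
CONCEPTS = [
    Concept("request for respite care", "coordination", frozenset({EXHAUSTION})),
    Concept("respite care proposal", "coordination", frozenset({EXHAUSTION})),
    Concept("need for respite care", "socio-environmental", frozenset({EXHAUSTION})),
    Concept("patient carer exhaustion", "socio-environmental", frozenset({EXHAUSTION})),
    Concept("wheelchair", "socio-environmental", frozenset()),
]


def infer_members(theme: str, concepts: list) -> list:
    """Names of the concepts satisfying 'aPourThématique value theme'."""
    return sorted(c.name for c in concepts if theme in c.themes)


exhaustion_domain = infer_members(EXHAUSTION, CONCEPTS)
print(exhaustion_domain)
```

A real reasoner also takes the class hierarchy and other axioms into account; the sketch only reproduces the cross-module grouping by a shared property value.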

Fig 3. Fully defined concepts in OntoPaRON.

Partial screenshot of the high-level concepts of OntoPaRON present in the core module. All the concepts, in any module, that are linked to the exhaustion theme by the thematic relationship are inferred under the fully defined concept ‘Exhaustion Domain’.

Fully defined concepts are created as variables of interest for the clinical analysis of patient care trajectory, and their frequency is an indicator of the problems encountered by patients. After annotation of the corpus by the OnBaSAM system, all the quantitative data linked to the occurrence of exhaustion in caring for a patient can be extracted. Exhaustion of a carer or healthcare professional is generally assessed with a methodology based on individual interviews or rating scales [30–32]. Our approach makes it possible to determine the frequency of this phenomenon in the management of ALS patients from textual data entered by third parties.

Fig 3 illustrates the fully defined concepts presented in the core module, but others are present in each of the modules. The fully defined concept ‘Coordination Action’, which groups together all the actions performed by the coordinators, is present in the coordination module. It encompasses all the ‘coordination actions’, including ‘communication actions’ and ‘finding coordination resources’. As another example, in the socio-environmental module, we created the fully defined concept ‘Social Process’, which brings together all requests and actions in the social domain, including ‘request for social benefits’ and ‘bathroom adaptation’. In the medical module, we created the fully defined concept ‘Cognitive State’, which brings together all concepts referring to clinical signs, symptoms, and diagnoses linked to changes in the cognitive state of the patient, and ‘Motor State’, which brings together clinical signs relating to a deterioration of the patient’s motor skills, such as falls or losses of motricity. In total, the ontology includes 43 fully defined concepts; some of these concepts and their formal definitions are presented in S1 Table.

Level of alignment with other terminological and ontological resources

Our studies of existing terminological/ontological resources (TORs) identified none that covered the dimensions of the patient care trajectory. This finding justifies our ontological approach. Once the complete ontology is obtained, it can be interesting to revalidate the approach a posteriori and to check whether any known terminological/ontological resources cover the domain modeled. We therefore assessed the level of coverage of each module of the OntoPaRON ontology by the ontologies present in HeTOP (https://www.hetop.eu/hetop/), using the terms in French. The HeTOP terminology server provides access to 85 TORs. Automatic unsupervised alignment identified 9,906 alignments for 51 TORs. The results are summarized in Table 3.

Table 3. OntoPaRON alignments with terminological/ontological resources present in HeTOP.

| Terminology | Number of concepts | Alignments found | Socio-environmental alignments | Medical alignments | Coordination alignments |
|---|---|---|---|---|---|
| Medical Subject Headings (MeSH) | 277,575 | 789 | 181 (27.01%) | 501 (51.6%) | 14 (3.47%) |
| Systematized Nomenclature Of Medicine Clinical Terms (SNOMED-CT) | 350,976 | 763 | 163 (24.32%) | 432 (44.5%) | 14 (3.47%) |
| National Cancer Institute Thesaurus (NCIt) | 79,870 | 698 | 127 (18.95%) | 370 (38.10%) | 15 (3.72%) |
| International Classification of Diseases (ICD-11) | 55,267 | 310 | 39 (5.82%) | 243 (25.02%) | 3 (0.74%) |

Results of alignments, by ontological module of OntoPaRON, with the TORs present in HeTOP. For each module of the OntoPaRON ontology, the table shows the number of concepts aligned with the reference terminology and the percentage of terms aligned in that module. For the four TORs considered, the alignments for the medical module were of better quality than those for the coordination module (p < 0.0001).

For the three modules, the best alignment rates were those obtained with three terminologies: MeSH, SNOMED-CT, and NCIt. These results are readily explained by the specific modeling choices made during the construction of the ontology. Indeed, the medical module contains all the medical professionals, drugs, signs, and symptoms that are present as generic elements in many resources and are not specific to ALS. Thus, certain symptoms, such as headache, or the presence of motor disturbances (e.g. falls or muscle amyotrophy), may also be found in other diseases. The results for the socio-environmental module are readily explained by the identification of generic classes, such as family and carer. However, it is difficult to identify benefits or structures specific to the French domain in the terminologies. For example: a) the prestation sociale (social benefits) class was found in only two of the terminologies (MeSH and SNOMED-CT), but concepts at a finer level of granularity, such as Allocation Adulte Handicapé (disabled adult allowance), were not present in any of the terminologies; and b) no alignments were found for medico-social structures specific to France. Thus, specific structures, such as the Maison départementale des personnes handicapées (local disability services) or the SAVS (Service d’Accompagnement à la Vie Sociale; social assistance), were not present in any of the terminologies. The coordination module was the module with the lowest percentage alignment. This is partially explained by the level of granularity defined for the coordination module. Thus, the Establishment of Coordination Resources class was aligned with the MeSH, SNOMED-CT, and NCIt terminologies. However, no alignments were found in any of the TORs for subclasses such as the establishment of a respite care stay or the establishment of disability benefits.
In this use case, it was important to know the specific arrangements made by the coordinators, which could: a) respond to a request clearly expressed by patients or their families or b) result from the analysis of a need not identified by the patient but highlighted by the coordinator. This analysis clearly shows that no other TOR comes close to covering the whole of our domain.

Annotation and assessment tools: The Ontology-Based Semantic Annotation Module and Pronto annotator

Within the Medical Informatics and eHealth Knowledge Engineering Laboratory (LIMICS), we decided to develop a semantic annotator using the OntoPaRON ontology as a semantic reference. We created this annotator, OnBaSAM, with resources available from GATE, a text analysis platform providing open source resources. These resources are used in various biomedical research projects [33]. GATE makes it possible to build a semantic annotation chain that uses an ontology, but it is used and developed in English. We therefore built a specific processing chain adapted to our use case, which takes into account the French language and negation, and handles the export of annotations. At the end of an annotation chain, the textual documents of the corpus are enriched with metadata annotations represented by XML tags included in the annotated document. We created various pipelines (chains of processing resources), allowing several levels of corpus processing:

  1. Pre-processing by normalization and tokenization, splitting into sentences, application of lemmatization (TreeTagger), and Part Of Speech (POS) tagging, to make use of grammatical categories. This pipeline can be used to correct spelling in the content, thereby favoring the identification of concepts during annotation.

  2. The second pipeline can be used to annotate named entities by identifying ontological concepts from the prefLabel and altLabel of each concept.

  3. We chose the option of exporting the created annotations to a spreadsheet. For each patient, the number of occurrences identified by OnBaSAM is determined for each ontology class. The export process can be global, taking into account all ontological concepts (n = 2,462), or specific, restricted to defined concepts only (n = 43).
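The pipelines above can be approximated by a minimal sketch that matches ontology labels in event texts and counts occurrences per patient and per class, which is how qualitative annotations become quantitative data. It is a simplified stand-in written in Python, not the GATE-based OnBaSAM chain: there is no lemmatization, POS tagging, or negation handling, and the label index, patient identifiers, and event texts are invented for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical label index: surface form (prefLabel or altLabel) -> class.
LABEL_TO_CLASS = {
    "respite care": "need for respite care",
    "wheelchair": "wheelchair",
    "neurologist": "neurologist",
}


def annotate(text: str, label_to_class: dict) -> list:
    """Return one ontology class per label occurrence found in the text."""
    found = []
    lowered = text.lower()
    for label, ontology_class in label_to_class.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        found.extend(ontology_class for _ in re.finditer(pattern, lowered))
    return found


def count_by_patient(events, label_to_class: dict) -> dict:
    """events: iterable of (patient_id, text). Returns {patient_id: Counter}."""
    counts = defaultdict(Counter)
    for patient_id, text in events:
        counts[patient_id].update(annotate(text, label_to_class))
    return counts


# Invented events for two invented patients.
EVENTS = [
    ("P001", "Family asks about respite care; wheelchair delivered."),
    ("P001", "Wheelchair problem reported to the neurologist."),
    ("P002", "Appointment with the neurologist confirmed."),
]

table = count_by_patient(EVENTS, LABEL_TO_CLASS)
for patient_id in sorted(table):
    print(patient_id, dict(table[patient_id]))
```

Each row of the resulting table corresponds to one patient and each column to one ontology class, which is the shape needed to merge the counts with sociodemographic variables.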

We created Pronto, a tool for the evaluation of annotations, to assess the quality of the automatic annotations made by the OnBaSAM system. This tool can be used to group the annotated text and the corresponding concepts together on the same interface, as shown in Fig 4. This tool is designed for use by evaluators who are experts in the field, directly involved in the coordination of patient care trajectories. Internal and external coordinators from the ALS-IDF network assisted with the assessment of annotation quality. During the evaluation process, we asked the coordinators both to evaluate the concepts annotated by the system and to indicate which concepts might be missing, by creating a manual annotation. The detection of missing concepts by the experts allowed us to enrich the ontology. We used three widely used standard measurements [34] to evaluate annotation performance and the relevance of the concepts: precision, recall, and the F-measure. Five experts evaluated 410 events. The resulting scores yielded a precision of 0.91, a recall of 0.9, and an F-measure of 0.91. These results should be interpreted with caution, given certain biases in the evaluation [34]. Indeed, the presence of annotations on the corpus influences the evaluators, who tend to focus on the existing annotations rather than on the non-annotated data. Based on an analysis of these results, we modified OntoPaRON and used the new version to annotate a corpus of 928 patient files. We used JMP software for statistical data processing. Data were processed by merging the sociodemographic data for the patients and the semantic annotation data into a single table.
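For reference, the three measurements can be computed from counts of correct annotations (true positives), spurious annotations (false positives) and missed concepts (false negatives). The sketch below assumes the balanced F-measure (F1), since the article does not state a weighting, and uses invented counts rather than the study data.

```python
def precision_recall_f(tp: int, fp: int, fn: int) -> tuple:
    """Return (precision, recall, balanced F-measure) from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Invented counts: 8 correct annotations, 2 spurious, 2 missed.
precision, recall, f_measure = precision_recall_f(tp=8, fp=2, fn=2)
print(precision, recall, f_measure)
```

In an expert review such as the one described here, false negatives come from the manual annotations added for missing concepts, so recall depends on how thoroughly evaluators inspect the non-annotated text.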

Fig 4. Pronto assessment tool.

Partial screenshot of the assessment tool used by experts in the field to evaluate each of the annotations made by OnBaSAM.

Results

We annotated 928 patient files from the ALS-IDF network, covering the period from January 2, 2013 to December 31, 2017. These years were chosen on the basis of the start and end dates of this project (2014 to May 2019). The annotation of these 928 files represented 31,260 events, or more than 1,000,000 words. The sociodemographic data for the patients are summarized in Table 4. The population consisted of 52% men and 48% women (sex ratio of 1.09). The patients had a spinal form of the disease in 65% of cases and a bulbar form in 34%. The bulbar form was more frequent in women and the spinal form was more frequent in men [35]. Mean age at diagnosis was 65.2 years, and mean age at inclusion in the ALS-IDF network was 66 years. The mean duration of support from the ALS-IDF network was 550 days. Most of the patients lived in families (78%) and were married (64%), and all lived in the Ile-de-France region. Support from the ALS-IDF network resulted in a mean of 33.6 reported events per patient, but there was considerable heterogeneity, with a minimum of one event for four patients, more than 200 events for others, and a maximum of 322 events entered for one patient.

Table 4. Socio-demographic data of the study population.

| Characteristic | Statistic / category | Women (n = 442) | Men (n = 486) | Total (n = 928) |
|---|---|---|---|---|
| Age at diagnosis (years) | Mean | 67.15 | 63.4 | 65.2 |
| | SD | 11.77 | 11.88 | 11.97 |
| | Min-Max | 22-90 | 20-89 | 20-90 |
| Inclusion age (years) | Mean | 67.76 | 64.16 | 65.9 |
| | SD | 11.76 | 11.96 | 12 |
| | Min-Max | 22-91 | 20-91 | 20-91 |
| Form of the pathology | Spinal form | 247 (60%) | 319 (71%) | 566 (65%) |
| | Bulbar form | 164 (40%) | 130 (29%) | 294 (34%) |
| Follow-up (days) | Mean | 548 | 552.6 | 550.3 |
| | SD | 446.8 | 427 | 436.6 |
| | Min-Max | 6-2302 | 6-2207 | 6-2302 |
| Number of events | Mean | 34.6 | 32.6 | 33.6 |
| | SD | 31.4 | 32.7 | 32.13 |
| | Min-Max | 1-276 | 1-322 | 1-322 |
| Social status | Single | 63 (16%) | 63 (13%) | 126 (14%) |
| | Married | 215 (54%) | 337 (73%) | 552 (64%) |
| | Divorced | 62 (16%) | 46 (10%) | 108 (13%) |
| | Widowed | 58 (15%) | 15 (3%) | 73 (8%) |
| Lifestyle | Lives alone | 107 (27%) | 59 (13%) | 166 (20%) |
| | Lives with family | 282 (70%) | 384 (86%) | 666 (78%) |
| | Lives in an institution | 12 (3%) | 5 (1%) | 17 (2%) |

All the socio-demographic characteristics of the study population.

Contribution of fully defined concepts to the identification of patient needs

We illustrated the contribution of fully defined concepts to the analysis of care trajectories by analyzing some of these concepts. The percentage of patients affected by these themes during their care pathway is shown in Table 5. The themes were not all expressed to the same extent. For example, the fully defined concept ‘field of technical aids’ concerned almost the entire population (93%), whereas the fully defined concept ‘field of exhaustion’ was identified in just over half (55%) of the patients included in the ALS-IDF network.

Table 5. Semantic annotation of fully defined concepts.

| Fully defined concept | Mean / SD | Presence n (%) | Absence n (%) | p |
|---|---|---|---|---|
| ‘field of technical aids’ | 29 / 42.23 | 859 (93%) | 69 (7%) | <0.0001 |
| ‘field of exhaustion’ | 3.67 / 6.56 | 507 (55%) | 421 (45%) | |
| ‘field of human help’ | 3.57 / 4.27 | 707 (76%) | 221 (24%) | <0.0001 |
| ‘area of need and request for social help’ | 3.73 / 4.53 | 722 (78%) | 206 (22%) | <0.0001 |
| ‘concept of motor state’ | 4.39 / 4.40 | 782 (84%) | 146 (16%) | <0.0001 |
| ‘concept of cognitive state’ | 0.49 / 1.41 | 174 (19%) | 754 (81%) | <0.0001 |

Quantitative and qualitative data for the semantic annotation of certain fully defined concepts.

We sought to determine how these fully defined concepts were expressed according to the specific characteristics of the patients, in terms of sociodemographic elements (age, social status, lifestyle, etc.) and clinical variables (form of the pathology, motor state, cognitive state). We used several statistical methods: linear regression models, Student’s t test for comparisons of means, and Pearson’s test for correlations.

The fully defined concept ‘field of technical aids’, which concerned 93% of the population, was expressed mainly in the youngest patients (p < 0.0001) and in patients with a spinal form rather than a bulbar form (p = 0.020; Student’s t test showed least squares means of 31.07 for the spinal form and 25.00 for the bulbar form). Demands for the implementation of technical aids increased with the time spent in the ALS-IDF network and with the presence of an altered motor state (p < 0.0001), but not with cognitive impairment (p = 0.67). Table 6 ranks the clinical variables by their importance in the expression of the fully defined concepts studied.

Table 6. Association between fully defined concepts and clinical variables, assessed by a linear regression model.

| Fully defined concept | Clinical variable | LogWorth | p-value |
|---|---|---|---|
| ‘field of technical aids’ | Motor state | 48.830 | <0.0001 |
| | Age | 8.718 | <0.0001 |
| | Form of the pathology (bulbar) | 1.680 | 0.020 |
| | Cognitive state | 0.171 | 0.67 |
| ‘field of exhaustion’ | Motor state | 17.621 | <0.0001 |
| | Cognitive state | 6.310 | <0.0001 |
| | Age | 0.488 | 0.32 |
| ‘field of human help’ | Motor state | 33.81 | <0.0001 |
| | Age | 7.56 | <0.0001 |
| | Form of the pathology | 2.529 | 0.002 |
| | Cognitive state | 0.996 | 0.101 |
| ‘area of need and request for social help’ | Motor state | 43.259 | <0.0001 |
| | Age | 12.760 | <0.001 |
| | Cognitive state | 2.823 | 0.001 |
| | Form of the pathology | 1.11 | 0.07 |

Relative effects for each fully defined concept, as estimated by a linear regression model. The false discovery rate (FDR) p-value is given for each effect, using the Benjamini-Hochberg technique to adjust for multiple tests. The FDR LogWorth, which is the best statistic for plotting and assessing significance, is defined as -log10(FDR p-value).
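The two quantities named in this legend can be sketched as follows. This is a generic implementation of the Benjamini-Hochberg adjustment and of the LogWorth transformation, written for illustration; it is not the JMP implementation, and the raw p-values are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted (FDR) p-values, in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def logworth(fdr_p_value):
    """LogWorth as defined in the legend: -log10(FDR p-value)."""
    return -math.log10(fdr_p_value)


# Hypothetical raw p-values for four effects in one model.
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr_p = benjamini_hochberg(raw_p)
worth = [logworth(p) for p in fdr_p]
```

On this scale, an FDR p-value of 0.01 corresponds to a LogWorth of 2, so larger values indicate stronger evidence for an effect.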

The occurrence of exhaustion during care was linked to the presence of signs indicating motor degradation (declining motor state) (p < 0.0001) or cognitive impairment (p < 0.0001). Marital and social status appeared to be involved in exhaustion, which was more frequent among divorcees than among married patients (Student’s t test showed least squares means of 4.86 for divorcees and 3.01 for married patients). The living conditions of the patient also seemed to affect the likelihood of exhaustion, which was more frequent for patients living in families (mean of 4.45) than for those living alone (mean of 2.75). The occurrence of exhaustion during patient care was correlated with the recording of events by the coordinators of the ALS-IDF network (r = 0.24; p < 0.0001) and with the need for a coordination action (r = 0.5; p < 0.0001), in particular the search for an appropriate structure (r = 0.17; p < 0.0001) and the provision of a human helper (r = 0.23; p < 0.0001).
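The correlation coefficients reported above are Pearson coefficients computed between per-patient annotation counts. A minimal sketch of that computation is given below; the two count vectors are hypothetical and the function name is ours.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


# Hypothetical per-patient counts: annotations of 'field of exhaustion'
# and annotations of coordination actions for the same five patients.
exhaustion_counts = [0, 1, 3, 5, 2]
coordination_counts = [4, 6, 9, 15, 8]
r = pearson_r(exhaustion_counts, coordination_counts)
```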

The fully defined concept ‘field of human help’ concerned 76% of patients during their care trajectories. The use of human help was related to the motor state (p < 0.0001), the person’s age (p < 0.0001), and the initial form of the disease (p = 0.0021), with people with the spinal form having a greater need for human assistance than those with the bulbar form. Living conditions also influenced the need for human help (p = 0.0011). Student’s t test showed that people living alone had a greater need for human aid than those living in a family (mean value of 4.48 for people living alone and 3.4 for those living in families).

The fully defined concept ‘area of need and request for social help’ was linked to the motor state of the patient (p < 0.0001) and to the age of the person at inclusion in the ALS-IDF network (p < 0.0001). Requests and needs were more numerous for younger patients and decreased with age. They also increased with disease progression and time spent in the ALS-IDF network (p < 0.0059), but were not linked to the initial form of the disease (p = 0.07).

Care pathway coordination

Within the framework of coordination, we focused particularly on two fully defined concepts, investigating the difference between ‘Coordination requests received’ and ‘Coordination actions to match resources to needs’. The allocation of necessary resources based on need can arise in two situations: a) a clearly stated request for resources or b) detection of the need during evaluation by the coordinator. The values for the annotation of these two fully defined concepts are presented in Table 7. A comparison of the two defined concepts ‘Coordination requests received’ and ‘Coordination actions to match resources to needs’ revealed a significant difference (p < 0.0001). This suggests that some requests are expressed explicitly, but that coordinators carry out assessments and propose solutions without the request being clearly expressed. We validated this hypothesis by exploring this issue at a finer level of granularity of the OntoPaRON ontology through the extraction of annotations for the concept of ‘Request to find a healthcare professional’. This concept is defined as the explicit formulation of a request for coordinators to find a healthcare professional (e.g. physiotherapist, speech therapist, GP). We investigated whether the occurrence of this request was related to the resulting coordinating action, ‘Search for a healthcare professional’ (r = 0.28; p < 0.001).

Table 7. Semantic annotation of fully defined concepts in the coordination ontology.

| Fully defined concept | Mean / SD |
|---|---|
| Coordination action | 74.17 / 77.70 |
| Match resources to needs | 10.41 / 11.02 |
| Coordination requests received | 5.42 / 6.55 |

Semantic annotation of fully defined concepts for specific coordination actions. In coordination actions, the number of actions matching resources to patient needs (Match resources to needs) exceeded the number of requests received by coordinators (coordination requests received) (p < 0.0001).

For this analysis, we transformed the quantitative data into dichotomous data (0 = no request or no search for a healthcare professional; 1 = request made or search carried out), to determine the total number of patients concerned by these two concepts. A ‘Request to find a healthcare professional’ occurred in 64 cases (7%), whereas a ‘Search for a healthcare professional’ occurred for 209 patients (22% of the patients). Thus, in 145 cases, the coordinators searched for a healthcare professional in the absence of a direct request from the patient.
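The dichotomization described above can be sketched as follows, using hypothetical per-patient counts for the two concepts. The variable names are ours and the data are illustrative only.

```python
def dichotomize(counts):
    """Map annotation counts to 0 (concept absent) or 1 (concept present)."""
    return [1 if count > 0 else 0 for count in counts]


# Hypothetical per-patient annotation counts for five patients.
request_counts = [0, 2, 0, 1, 0]  # 'Request to find a healthcare professional'
search_counts = [1, 3, 0, 1, 2]   # 'Search for a healthcare professional'

has_request = dichotomize(request_counts)
has_search = dichotomize(search_counts)

n_requests = sum(has_request)
n_searches = sum(has_search)

# Patients for whom a search was carried out without any explicit request.
searched_without_request = sum(
    1 for req, sea in zip(has_request, has_search) if sea == 1 and req == 0
)
```

Note that the difference `n_searches - n_requests` equals `searched_without_request` only if every patient with a request also had a search; counting the cross-tabulated cell directly, as in this sketch, does not rely on that assumption.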

OntoPaRON can specifically identify the type of healthcare professional sought. The healthcare professionals most sought by coordinators are shown in Table 8. For certain patients, the search for a healthcare professional related to more than one type of professional (e.g. physiotherapist and nurse; physiotherapist and doctor). Physiotherapists were the most sought category of paramedical professionals for the management of patients with ALS.

Table 8. Requests and searches for healthcare professionals.

| Concept | Number of patients |
|---|---|
| Request to find a healthcare professional | 64 |
| Search for a healthcare professional | 209 |
| Searches for a physiotherapist | 162 |
| Searches for a doctor | 16 |
| Searches for a speech therapist | 18 |
| Searches for a nurse | 14 |
| Searches for a psychologist | 9 |

The requests to search for a healthcare professional correspond to all the requests received by the coordinators to find a healthcare professional. The number of such requests was smaller than the number of searches for healthcare professionals actually carried out (p < 0.001).

Discussion

Improving the efficiency of patient care pathways first requires a description of these trajectories, so that they can be analyzed and ways of improving them identified. A top-down approach can be envisaged, based on the Information Systems Medicalization Program (PMSI) for the medical dimension [36, 37], but the social, medico-social, and coordination dimensions are not included in this program. A bottom-up approach could also be envisaged, starting from real-life patient data. We chose to follow this second approach, using textual data from a coordination network for people living with ALS in the Ile-de-France region.

We created a modular ontology for the processing of such data, taking all aspects of the patient care pathway into account: the medical, socio-environmental, and coordination dimensions. The choice of a modular system and the creation of defined concepts made it possible to group together concepts dealing with the same theme from different ontology modules under a defined concept. The themes for the defined concepts were chosen on the basis of published data for ALS. Like Grau [23], we observed the positive aspects of modularity. However, modularity requires constant attention to the positioning of defined concepts and the management and attribution of relationships between concepts.

The annotation results revealed that needs and requests, particularly for human and social aid, were expressed differently by the patients of the ALS-IDF network. Patients’ needs vary according to clinical variables such as motor status (i.e. the motor progression of the disease), the initial form of the disease, and the presence of cognitive impairment, and according to age and living conditions. Our quantitative approach revealed large differences between patients in the number of events recorded. It is important to take this first criterion into account in analyses of patient care trajectories, as it indicates that some patients make more requests and have a greater need for the coordination of their care (care management) than others [38]. The evolution of motor impairments is the most important factor in the care pathway. It is involved simultaneously in the setting up of technical assistance, the appearance of caregiver exhaustion, the setting up of human assistance, and social demands. The use of technical and human aid was more frequent for the spinal than for the bulbar form, consistent with the natural course of disease for this form. These results are understandable, given the characteristics of spinal involvement, which mostly affects the limbs, limiting the patient’s ability to perform activities of daily living and necessitating technical or human aid to compensate for the disability. These needs change over time and concern all areas of daily life [39].

The age of the patient also affects the type of aid required (human and technical), needs, and social requests. In France, age is an important criterion in social and medico-social policy. It directs and defines the type of social benefits that can be claimed, according to the person’s situation and disability. Such aid requires funding, with out-of-pocket expenses greater for those over 60 years of age, which may lead some to renounce such assistance [40]. Our results also linked the occurrence of exhaustion with cognitive alterations, consistent with published findings [41, 42].

The structure of the ontology made it possible to quantify the requests made to coordinators and the actions implemented. The number of requests made to coordinators was smaller than the number of actions implemented. Coordinators probably took preventive action, by analyzing and assessing certain needs that were not identified by patients or their families. The level of granularity of OntoPaRON made it possible to identify specifically the requests from patients to find healthcare professionals who would come to their homes, and to quantify the searches for such professionals made by coordinators. We wanted to know whether such requests to search for healthcare professionals were related to healthcare availability (density of healthcare professionals) in the local area (department, about the size of a county). A comparison of these results with data from the Regional Health Agency of Ile-de-France (https://santegraphie.fr/accueil/accueil) for the density of healthcare professionals showed that searches for physiotherapists were not always linked to areas in which such professionals were in short supply. Such requests are probably made when patients begin to encounter increasing difficulty leaving their home to visit the physiotherapist’s office. The continuation of care therefore requires the healthcare professional to come to the patient’s home. We hypothesize that such professionals may be present in a given area but that they do not provide, or refuse to provide, home care (probably due to the sum they are reimbursed for such visits). An absence or interruption of care can have a major impact on the care pathway of the patient, who may require hospitalization due to a deterioration of their medical condition or the exhaustion of their carers.

We found that alignment with existing ontologies was better in the medical domain than, for example, in the socio-environmental domain. The granularity of our ontology, which stems from our original objective of annotating specific textual data and from the specificities of the French social and medical sector, explains why these classes were not found in many ontologies and fully justifies the creation of the various modules of the OntoPaRON ontology.

Our study had several limitations. First, only a limited number of patients were included. The ALS-IDF network provides support for a large proportion of patients with ALS in Ile-de-France, but these data may not be generalizable to all patients with ALS in France, as the monitoring and care services available differ between the regions of France. Second, we did not account for the timing or chronology of the appearance of various difficulties. The time between requests for care or assistance and their implementation can have an impact, worsening the difficulties encountered. The identification of interruptions of care requires the identification of criteria for avoidable and necessary hospitalizations (for respiratory decompensation, for example). It was not possible to perform such an analysis within the time constraints of our project, and further studies are required to determine whether there are predictive factors or indicators. The semantic annotation generated numerous quantitative results, but not all the data were processed, due to the short duration of this project. The exploitation and analysis of all the results should provide us with a more precise vision of the elements of care trajectories. Our results cover all the patients included, revealing specific features for some. A more detailed cluster analysis would make it possible to observe the trajectories and determine whether certain sequences (of events or difficulties) are identifiable. We have also used OntoPaRON to annotate coordination corpora for Parkinson’s disease. The initial results showed that two of the modules (coordination and socio-environmental) were suitable for report annotation, but the corpus volume was too small to validate the results. In addition, we have started tests to use our approach in psychiatry. The first results show that the semantic annotator works in another domain, but this remains to be investigated further.

As described in the section “Modularity of OntoPaRON”, we organize our ontologies around a core ontology developed over many years in our laboratory, TopMenelas. This allows us (i) to have a foundational ontology under which our concepts can be subsumed, and (ii) to ensure consistency between the different ontologies developed. However, we are aware that there are, among others, two federative initiatives: DOLCE, with the core BTL2 ontology (http://biotopontology.github.io), and BFO, with the core OGMS ontology (https://bioportal.bioontology.org/ontologies/OGMS). We have begun aligning TopMenelas with these two ontologies, in order to take advantage of the ecosystems of these two initiatives.

In conclusion, we present an approach, based on the reasoning capacity and possible inferences in ontologies, for creating defined concepts with not only a semantic aspect but also a real dimension and clinical expression. The modularization of ontologies and their association with automatic NLP tools, as developed here, will make it possible to annotate French corpora and to extract knowledge. Further studies of annotations will be required to identify the causes of interruptions in care. This initial work highlights the variability of needs and demands in the care pathway of individuals not only on the basis of medical criteria, such as the initial form of the disease (spinal or bulbar), but also on the basis of intrinsic criteria, such as patient age or living conditions. We are currently using and continuing to develop the OnBaSAM annotation tool in the framework of the university hospital health research project PSY-CARE.

Supporting information

S1 Table. List of some of the fully defined concepts in OntoPaRON ontology.

List of some of the defined concepts and their formal definitions, present in each module of the OntoPaRON modular ontology. We recall that the HermiT reasoner infers the membership of all concepts sharing a relationship to a fully defined concept defined with this same relationship.

(PDF)

Data Availability

All relevant data are within the manuscript and its Supporting information files.

Funding Statement

Xavier Aimé works in the company Cogsonomy that he created: This company’s sole role is to finance training and other project management assistance for Xavier Aimé. Cogsonomy did not play a role in the study design, data collection and analysis, decision to publish, preparation of the manuscript and only provided financial support in the form of authors’ salaries and/or research materials. The funders (universities, hospitals) provided support in the form of salaries for authors [SC, PM, VM, DG, GG, JC], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.

References

  • 1. Moisan F, Kab S, Moutengou E, Boussac-Zerebska M, Carcaillon-Bentata L, Elbaz A. Fréquence de la maladie de Parkinson en France. Données nationales et régionales 2010-2015. p. 69.
  • 2. Logroscino G, Piccininni M. Amyotrophic Lateral Sclerosis Descriptive Epidemiology: The Origin of Geographic Difference. Neuroepidemiology. 2019; p. 93–103. doi: 10.1159/000493386
  • 3. Mehta P, Kaye W, Raymond J, Punjani R, Larson T, Cohen J, et al. Prevalence of Amyotrophic Lateral Sclerosis - United States, 2015. MMWR Morb Mortal Wkly Rep. 2018;67(46):1285–1289. doi: 10.15585/mmwr.mm6746a1
  • 4. Couratier P. Sclérose latérale amyotrophique. La Revue du Praticien. 2016;66:555–571.
  • 5. Riancho J, Gil-Bea F, Santurtun A, López de Munaín A. Amyotrophic lateral sclerosis: a complex syndrome that needs an integrated research approach. Neural Regeneration Research. 2019;14(2):193. doi: 10.4103/1673-5374.244783
  • 6. Oh J, Kim JA. Supportive care needs of patients with amyotrophic lateral sclerosis/motor neuron disease and their caregivers: A scoping review. Journal of Clinical Nursing. 2017;26(23-24):4129–4152. doi: 10.1111/jocn.13945
  • 7. Ward AL, Sanjak M, Duffy K, Bravver E, Williams N, Nichols M, et al. Power Wheelchair Prescription, Utilization, Satisfaction, and Cost for Patients With Amyotrophic Lateral Sclerosis: Preliminary Data for Evidence-Based Guidelines. Archives of Physical Medicine and Rehabilitation. 2010;91(2):268–272. doi: 10.1016/j.apmr.2009.10.023
  • 8. Elliott MA, Malvar H, Maassel LL, Campbell J, Kulkarni H, Spiridonova I, et al. Eye controlled, power wheelchair performs well for ALS patients. Muscle & Nerve. 2019;60(5):513–519. doi: 10.1002/mus.26655
  • 9. Soriani MH, Desnuelle C. Care management in amyotrophic lateral sclerosis. Revue Neurologique. 2017;173(5):288–299. doi: 10.1016/j.neurol.2017.03.031
  • 10. Cordesse V, Sidorok F, Schimmel P, Holstein J, Meininger V. Coordinated care affects hospitalization and prognosis in amyotrophic lateral sclerosis: a cohort study. BMC Health Services Research. 2015;15(1). doi: 10.1186/s12913-015-0810-7
  • 11. Gruber TR. Toward principles for the design of ontologies used for knowledge sharing? International Journal of Human-Computer Studies. 1995;43(5):907–928. doi: 10.1006/ijhc.1995.1081
  • 12. Dramé K, Diallo G, Delva F, Dartigues JF, Mouillet E, Salamon R, et al. Reuse of termino-ontological resources and text corpora for building a multilingual domain ontology: An application to Alzheimer’s disease. Journal of Biomedical Informatics. 2014;48:171–182. doi: 10.1016/j.jbi.2013.12.013
  • 13. Younesi E, Malhotra A, Gündel M, Scordis P, Kodamullil AT, Page M, et al. PDON: Parkinson’s disease ontology for representation and modeling of the Parkinson’s disease knowledge domain. Theoretical Biology & Medical Modelling. 2015;12:20. doi: 10.1186/s12976-015-0017-y
  • 14. Jensen M, Cox AP, Chaudhry N, Ng M, Sule D, Duncan W, et al. The neurological disease ontology. J Biomed Semantics. 2013;4:42. doi: 10.1186/2041-1480-4-42
  • 15. Ben Abbès S, Scheuermann A, Meilender T, D’Aquin M. Characterizing Modular Ontologies. In: International Conference on Formal Ontologies in Information Systems (FOIS). Graz, Austria; 2012. p. 13–25.
  • 16. Jiang G, Chute CG. Auditing the Semantic Completeness of SNOMED CT Using Formal Concept Analysis. Journal of the American Medical Informatics Association. 2009;16(1):89–102. doi: 10.1197/jamia.M2541
  • 17. Spasic I, Ananiadou S, McNaught J, Kumar A. Text mining and ontologies in biomedicine: making sense of raw text. Brief Bioinformatics. 2005;6(3):239–251. doi: 10.1093/bib/6.3.239
  • 18. Charlet J, Bachimont B, Jaulent MC. Building medical ontologies by terminology extraction from texts: An experiment for the intensive care units. Computers in Biology and Medicine. 2006;36(7-8):857–870. doi: 10.1016/j.compbiomed.2005.04.012
  • 19. Lossio-Ventura JA, Jonquet C, Roche M, Teisseire M. Biomedical term extraction: overview and a new methodology. Information Retrieval Journal. 2016;19(1-2):59–99. doi: 10.1007/s10791-015-9262-2
  • 20. W3C. OWL 2 Web Ontology Language Document Overview (Second Edition). 2012; p. 7.
  • 21. Grosjean J, Soualmia LF, Bouarech K, Jonquet C, Darmoni SJ. An approach to compare bio-ontologies portals. Stud Health Technol Inform. 2014;205:1008–1012.
  • 22. Pathak J, Johnson TM, Chute CG. Survey of modular ontology techniques and their applications in the biomedical domain. Integrated Computer-Aided Engineering. 2009;16(3):225–242. doi: 10.3233/ICA-2009-0315
  • 23. Grau BC, Parsia B, Sirin E, Kalyanpur A. Modularity and Web Ontologies. In: Proceedings of the Tenth International Conference on Principles of Knowledge Representation and Reasoning KR-2006. AAAI Press. Lake District of the United Kingdom; 2006. p. 198–209.
  • 24. Bao J, Honavar V. Adapt OWL as a modular ontology language (A Position Paper). CEUR Workshop Proceedings. 2006;216.
  • 25. Charlet J, Bachimont B, Mazuel L, Dhombres F, Jaulent MC, Bouaud J. OntoMénélas: Motivations et retours d’expérience sur l’élaboration d’une ontologie noyau de la médecine. Techniques et sciences informatiques. 2012;31(1):125–147. doi: 10.3166/tsi.31.125-147
  • 26. Gharebaghi A, Mostafavi MA, Edwards G, Fougeyrollas P, Gamache S, Grenier Y. Integration of the social environment in a mobility ontology for people with motor disabilities. Disability and Rehabilitation: Assistive Technology. 2018;13(6):540–551.
  • 27. Popejoy LL, Khalilia MA, Popescu M, Galambos C, Lyons V, Rantz M, et al. Quantifying care coordination using natural language processing and domain-specific ontology. J Am Med Inform Assoc. 2015;22(e1):e93–e103. doi: 10.1136/amiajnl-2014-002702
  • 28. Shearer R, Motik B, Horrocks I. HermiT: A Highly-Efficient OWL Reasoner. In: 5th International Workshop on OWL: Experiences and Directions (OWLED 2008). Karlsruhe, Germany; 2008. p. 10.
  • 29. Matentzoglu N, Malone J, Mungall C, Stevens R. MIRO: guidelines for minimum information for the reporting of an ontology. Journal of Biomedical Semantics. 2018;9(1). doi: 10.1186/s13326-017-0172-7
  • 30. Goldstein LH, Atkins L, Landau S, Brown R, Leigh PN. Predictors of psychological distress in carers of people with amyotrophic lateral sclerosis: a longitudinal study. Psychological Medicine. 2006;36(6):865–875. doi: 10.1017/S0033291706007124
  • 31. Gauthier A, Vignola A, Calvo A, Cavallo E, Moglia C, Sellitti L, et al. A longitudinal study on quality of life and depression in ALS patient–caregiver couples. Neurology. 2007;68(12):923–926. doi: 10.1212/01.wnl.0000257093.53430.a8
  • 32. Burke T, Hardiman O, Pinto-Grau M, Lonergan K, Heverin M, Tobin K, et al. Longitudinal predictors of caregiver burden in amyotrophic lateral sclerosis: a population-based cohort of patient–caregiver dyads. Journal of Neurology. 2018;265(4):793–808. doi: 10.1007/s00415-018-8770-6
  • 33. Cunningham H, Tablan V, Roberts A, Bontcheva K. Getting More Out of Biomedical Documents with GATE’s Full Lifecycle Open Source Text Analytics. PLoS Computational Biology. 2013;9(2):e1002854. doi: 10.1371/journal.pcbi.1002854
  • 34. Fort K, Ehrmann M, Nazarenko A. Vers une méthodologie d’annotation des entités nommées en corpus? In: Traitement Automatique des Langues Naturelles 2009. Actes de la 16ème Conférence sur le Traitement Automatique des Langues Naturelles. Senlis, France; 2009.
  • 35. Pradat PF, Corcia P, Meininger V. Sclérose latérale amyotrophique. http://www.em-premiumcom/data/traites/ne/17-45800/. 2016;13(2):15.
  • 36. Blein C, Chamoux C, Reynaud D, Lepage V. Diversité des prises en charge des patients atteints de sclérose en plaques entre régions françaises. Revue d’Épidémiologie et de Santé Publique. 2018;66(6):385–394. doi: 10.1016/j.respe.2018.08.006
  • 37. Charles-Nelson A, Lazzati A, Katsahian S. Analysis of Trajectories of Care After Bariatric Surgery Using Data Mining Method and Health Administrative Information Systems. Obesity Surgery. 2020;30(6):2206–2216. doi: 10.1007/s11695-020-04430-6
  • 38. Bakker M, Creemers H, Schipper K, Beelen A, Grupstra H, Nollet F, et al. Need and value of case management in multidisciplinary ALS care: A qualitative study on the perspectives of patients, spousal caregivers and professionals. Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration. 2015;16(3-4):180–186. doi: 10.3109/21678421.2014.971811
  • 39. Hobson EV, McDermott CJ. Supportive and symptomatic management of amyotrophic lateral sclerosis. Nat Rev Neurol. 2016;12(9):526–538. doi: 10.1038/nrneurol.2016.111
  • 40. Penneau A, Pichetti S, Espagnacq M. Dépenses et restes à charge sanitaires des personnes en situation de handicap avant et après 60 ans. Paris: IDRES; 2019. 571.
  • 41. Pagnini. Clinical psychology and amyotrophic lateral sclerosis. Frontiers in Psychology. 2010. doi: 10.3389/fpsyg.2010.00033
  • 42. Sandstedt P, Littorin S, Widsell GC, Johansson S, Gottberg K, Ytterberg C, et al. Caregiver experience, health-related quality of life and life satisfaction among informal caregivers to patients with amyotrophic lateral sclerosis: A cross-sectional study. Journal of Clinical Nursing. 2018;27(23-24):4321–4330. doi: 10.1111/jocn.14593

Decision Letter 0

Robert Hoehndorf

17 Sep 2020

PONE-D-20-20582

Use of a modular ontology and a semantic annotation tool to describe the care pathway of patients with amyotrophic lateral sclerosis in a coordination network.

PLOS ONE

Dear Dr. CARDOSO,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Nov 01 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Robert Hoehndorf, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating the following in the Competing Interests section:

"The authors have declared that no competing interests exist"

We note that one or more of the authors are employed by a commercial company: Cogsonomy.

2.1. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

Please also include the following statement within your amended Funding Statement.

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.

2.2. Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.  

Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: “This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

3. Please ensure that you refer to all your Figures in your text as, if accepted, production will need this reference to link the reader to the figures.

4. We note you have included tables to which you do not refer in the text of your manuscript. Please ensure that you refer to Tables in your text; if accepted, production will need this reference to link the reader to the Table.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this manuscript, titled ‘Use of a modular ontology and a semantic annotation tool to describe the care pathway of patients with amyotrophic lateral sclerosis in a coordination network’, the authors developed two tools, OntoPaRON and OnBaSAM, to describe the care pathway of patients with ALS based on real-life textual data from the Ile-de-France ALS network. By analyzing the textual data, the authors hope to identify the difficulties and needs of patients and their families at home, to understand the coordination actions implemented, and to identify situations or types of patients confronted with multiple difficulties.

Since the references, figures and tables were not cited correctly, it is not easy to understand the manuscript or to follow up on the tools that are described in the references and used in this study. Although the model and algorithms used in this study are simple and not novel, the authors applied them well.

I therefore recommend that this manuscript be published neither in its current version nor after minor corrections. I believe that the authors need to reorganize the study and make a serious effort to improve the writing. The specific requirements for future publication of this study are explained in detail below.

1) In the ‘Modularity of OntoPaRON’ section, it is mentioned that OntoPaRON has four modules and Table 2 showed four modules, while five modules were defined. Also, there is no reference or explanation about how and why these modules were chosen.

2) The concepts in each module should be listed. In addition, Fig. 2 shows the OntoPaRON inheritance diagram but there is no explanation of how the connecting arrows were drawn in this figure.

3) It is mentioned that the ontology includes 43 fully defined concepts. It is recommended that the authors include the list of all fully defined concepts with their concepts as supplementary data.

4) The authors used a linear regression model to investigate whether the identified themes specifically concerned patients with particular characteristics. First of all, the common notation in regression is ‘X’ for the independent variables and ‘Y’ for the dependent variable, so it would be better to change the terms to prevent misunderstanding. It would also be interesting to see the feature importance extracted from the regression model, in order to interpret the importance of each independent variable in explaining the fully defined concept.

Minor points

• All the references, figures and tables should be cited well in the entire manuscript.

• I did not focus on typos, punctuation and grammar mistakes, but there are several, and the authors should have the English proofread to improve the writing. The authors also need to organize their use of acronyms systematically. Some are never used again after their first occurrence, such as knowledge engineering (KE), and some are defined twice, such as natural language processing (NLP).

• What do the parentheses mean in Table 3?

• It is recommended that numbers with more than 3 digits be separated by ‘,’ rather than a space, and that ‘.’ be used for decimals.

Reviewer #2: This paper describes the creation of an ontology and associated tool for characterisation and management of patients with ALS, using French textual data. Overall the study is interesting, and has produced several potentially interesting outcomes that consist in an ontology, surrounding analysis tools, and disease insights that could contribute to improved patient management.

My overall comment is that certain aspects of the methodology and results are somewhat unclear, and should be improved before publication. I am, therefore, making the suggestion of major changes, not because the content of the paper is bad (this is not the case), but because some of the argumentation and explanation needs to be reformulated and extended, and there are significant formatting problems that inhibit understanding of the paper.

--

> Our approach was unique in its creation of fully defined concepts at different levels of the modular ontology to address specific topics relating to healthcare trajectories.

I am not entirely sure what this means, there are many ontologies and associated tools that approach characterisation of different facets of disease management (for a recent example, see the COVID-19 ontology http://www.aber-owl.net/ontology/COVID-19/ ). I would suggest that the unique approach here consists more in:

- The application of ontology technology to French language clinical text

- The creation of a new ontology-based semantic annotation / analysis tool

- The use of the ontology to gain insight into the disease (this is mentioned in the latter part of the abstract)

The later methods and results discuss the idea of 'fully defined concepts' in more detail, but it's unclear to me how they differ from the ontology modules, or what exactly makes them fully defined, as opposed to other concepts or modules in the ontology.

--

The introduction discusses ALS deeply, and provides some other examples of disease-specific ontologies. However, little time is spent on applications of ontology to the area of text mining. I would suggest this review as a start: http://cobweb.cs.uga.edu/~kochut/teaching/8350/Papers/Ontologies/TextMining-RawText.pdf

--

> A modular ontology seemed be the most appropriate model for taking all these aspects into account.

It is unclear what a 'modular ontology' is in this context. Also the choice to use a modular design should be better explained. In fact, I note that this term is defined later in the paper, and the choice explained better (although the definition could perhaps be cited). Perhaps this implementation detail can be omitted entirely from the introduction, and left to the later, better explanation?

--

There is a helpful guideline on minimum reporting for ontologies. I do not suggest that any of these , but the authors could add a note https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-017-0172-7

--

I think the explanation of the OnBaSAM tool, starting on page 8, should be further developed. What is unclear to me, is how these tools implement or replace functionality provided by the GATE framework itself. The GATE framework natively supports the use of ontologies as a resource for text mining, as evidenced by the documentation: https://gate.ac.uk/sale/tao/splitch14.html , as well as the pre-processing steps mentioned.

The discussion could also include some mention of how generalisable the tools for text mining and analysis are. Could they be used with other domain ontologies? Are they available on the web?

--

The methods of the validation should be more explicit: what exactly were the expert evaluators asked to do? It's unclear whether they only verified the machine-derived labels, or also created an annotation themselves. In the former case, recall is not an informative measure (or at least, it is misleading, as it does not describe the proportion of the actually existent concepts found).

Optionally, the authors could also consider measuring inter-annotator agreement for the validation stage of their annotations. If not, they should mention potential limitations arising from treating non-perfect/human operators as a gold standard for evaluation. I can't further comment on the evaluation without more information about the methods.
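The point about recall can be illustrated with a minimal Python sketch. It is not taken from the manuscript: the concept labels and both expert scenarios are hypothetical toy data, and `precision_recall` is an illustrative helper, not part of OnBaSAM. When the experts only accept or reject machine-derived labels, every gold label is by construction a machine label, so recall is trivially 1.0; it only becomes informative when the experts annotate independently.

```python
def precision_recall(predicted, gold):
    """Compute precision and recall of a set of predicted labels against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical machine annotations for one patient record (toy labels).
machine = {"wheelchair", "home help", "exhaustion", "speech therapy"}

# Scenario A (assumed): experts only accept or reject the machine labels,
# so the "gold" set can never contain a concept the machine missed.
verified_only = {"wheelchair", "home help", "exhaustion"}

# Scenario B (assumed): experts annotate the text independently,
# so concepts missed by the machine appear in the gold set.
independent = {"wheelchair", "home help", "exhaustion", "ventilation", "respite care"}

print("verification only:", precision_recall(machine, verified_only))
print("independent gold: ", precision_recall(machine, independent))
```

In this toy case the precision is the same in both scenarios, and only the independent annotation reveals the concepts that were missed.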

--

The ontology itself is interesting, using its own defined upper-level stratification of terms, with entity, abstract object, and ideal object. I don't think this is necessarily a bad thing, however many biomedical ontologies use the Basic Formal Ontology (BFO), to the extent they make upper level metaphysical distinctions at all. This can, in some cases, help with integration of concepts between different ontologies. It may be worth adding a small discussion of why the authors chose this different method.

--

Citation numbers are missing in the document text, although they are listed in the references. I would suggest that 'pdflatex' should be run once more, to fill in the citation numbers in the document :-)

In addition, the links given as citation for people living with ALS in France and America in the introduction lead to 404 errors. Several other links included as citations are also broken, which is possibly another issue caused by the above problem.

--

The 'Construction of the OntoPaRON ontology' section mentions a standard format of single quotes for OntoPaRON and italic for relationships in the ontology. It would be helpful if the mention of italic font was italicised here to provide an example. Furthermore, it would aid understanding to highlight such terms in table 1

--

Some previous work that could be interesting for French language ontology is WHOFRE ( http://www.aber-owl.net/ontology/WHOFRE ). Could this work potentially be integrated?

--

In table 1, the language names English and French should be capitalised.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Luke T Slater

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Jan 6;16(1):e0244604. doi: 10.1371/journal.pone.0244604.r002

Author response to Decision Letter 0


19 Nov 2020

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

RESPONSE: We have checked and attest that all formatting and style requirements have been met.

2. Thank you for stating the following in the Competing Interests section:

"The authors have declared that no competing interests exist"

We note that one or more of the authors are employed by a commercial company: Cogsonomy.

2.1. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

Please also include the following statement within your amended Funding Statement.

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.

RESPONSE: Updated Funding Statement:

Xavier Aimé works for Cogsonomy, a company that he created. This company's sole role is to finance training and other project management assistance for Xavier Aimé. Cogsonomy did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript, and only provided financial support in the form of authors' salaries and/or research materials.

The funders (universities, hospitals) provided support in the form of salaries for authors [SC, PM, VM, DG, GG, JC], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

The specific roles of these authors are articulated in the ‘author contributions’ section.

2.2. Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.

Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

RESPONSE: Updated Competing Interests Statement:

Xavier Aimé works for Cogsonomy, a company that he created. This commercial affiliation does not alter the adherence of any of the authors to PLOS ONE policies on sharing data and materials.

3. Please ensure that you refer to all your Figures in your text as, if accepted, production will need this reference to link the reader to the figures.

RESPONSE: We have reformatted the manuscript according to the above style guidelines.

4. We note you have included tables to which you do not refer in the text of your manuscript. Please ensure that you refer to Tables in your text; if accepted, production will need this reference to link the reader to the Table.

RESPONSE: We have reformatted the manuscript according to the above style guidelines.


Reviewers' comments:

Reviewer #1: In this manuscript, titled ‘Use of a modular ontology and a semantic annotation tool to describe the care pathway of patients with amyotrophic lateral sclerosis in a coordination network’, the authors developed two tools, OntoPaRON and OnBaSAM, to describe the care pathway of patients with ALS based on real-life textual data from the Ile-de-France ALS network. By analyzing the textual data, the authors hope to identify the difficulties and needs of patients and their families at home, to understand the coordination actions implemented, and to identify situations or types of patients confronted with multiple difficulties.

Since the references, figures and tables were not cited correctly, it is not easy to understand the manuscript or to follow up on the tools that are described in the references and used in this study. Although the model and algorithms used in this study are simple and not novel, the authors applied them well.

I therefore recommend that this manuscript be published neither in its current version nor after minor corrections. I believe that the authors need to reorganize the study and make a serious effort to improve the writing. The specific requirements for future publication of this study are explained in detail below.

RESPONSE: We are grateful for the comments and assessment provided by the reviewer concerning the previous version of our manuscript. Please find below a detailed explanation of how we have attempted to address his/her comments.

1) In the ‘Modularity of OntoPaRON’ section, it is mentioned that OntoPaRON has four modules and Table 2 showed four modules, while five modules were defined. Also, there is no reference or explanation about how and why these modules were chosen.

RESPONSE: Thank you for your comment. We have added an extra column to Table 2 to include the consolidation module, and have described the specificity of this module in the legend. The table gives the number of fully defined classes, relationships and concepts present in each module of the OntoPaRON ontology. As shown in the table, the consolidation module does not model any concepts; its main function is to aggregate the four other modules.

We chose a modular design for our ontology because of its reusability, easier management of complexity, customization and extensibility, as described by different authors. The choice of modules results from the analysis of the initial corpus. The analysis of the candidate terms highlighted three fields: medical, socio-environmental and coordination. On this basis, we decided to create a modular ontology with one module for each of these domains. For the final ontology to be as effective as possible, and to be reusable, a core module was needed, as well as a consolidation module to aggregate all the modules. The final OntoPaRON ontology results from the aggregation of the different modules and the use of the reasoner in Protégé. We have modified part of the paragraph ‘Construction of the OntoPaRON ontology’ to make our choice of modularization easier to understand.

These four dimensions oriented us towards the creation of four ontological modules for each of these domains.

2) The concepts in each module should be listed. In addition, Fig. 2 shows the OntoPaRON inheritance diagram but there is no explanation of how the connecting arrows were drawn in this figure.

RESPONSE: Thank you for your comment. It is not possible to list all of the concepts present in the ontology modules, as they represent a total of 2,462 concepts. Table 2 shows the number of classes for each ontology module. All the concepts of the different modules are available on BioPortal: https://bioportal.bioontology.org/ontologies/ONTOPARON.

To clarify how the modules are imported in Figure 2, we have explained the meaning of the arrows in the legend of the figure: “The arrows point in the direction of the modules that perform the import. Thus, the ontologies of the domain (ontoparonmed: medical ontology, ontoparonsoc: socio-environmental ontology; ontoparoncoord: coordination ontology) import the core ontology. In the same way OntoPaRON ontology imports each of the ontologies of the domain and by inference the core ontology.”
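The import structure described in this legend can be sketched as a small directed graph. This is only an illustration under stated assumptions: the module names `ontoparonmed`, `ontoparonsoc` and `ontoparoncoord` come from the legend, whereas `ontoparoncore`, `ontoparon` and the helper `imports_closure` are hypothetical names introduced here. The sketch shows why the consolidated ontology reaches the core module indirectly, through the domain modules.

```python
# Direct imports as described in the Figure 2 legend.
# "ontoparoncore" and "ontoparon" are assumed identifiers for the core module
# and the consolidated ontology; they are not given in the legend.
DIRECT_IMPORTS = {
    "ontoparoncore": set(),
    "ontoparonmed": {"ontoparoncore"},
    "ontoparonsoc": {"ontoparoncore"},
    "ontoparoncoord": {"ontoparoncore"},
    "ontoparon": {"ontoparonmed", "ontoparonsoc", "ontoparoncoord"},
}


def imports_closure(ontology, direct=DIRECT_IMPORTS):
    """Return every module reachable from `ontology` by following imports."""
    seen = set()
    stack = list(direct[ontology])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(direct[current])
    return seen


# The consolidated ontology never imports the core module directly,
# but the core module is still part of its import closure.
print(sorted(imports_closure("ontoparon")))
```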

3) It is mentioned that the ontology includes 43 fully defined concepts. It is recommended that the authors include the list of all fully defined concepts with their concepts as supplementary data.

RESPONSE: As you proposed, we have added to the supporting information section a table (S1 Table) listing most of the fully defined concepts of our ontology and their formal definitions.

S1 Table. List of some of the fully defined concepts in OntoPaRON Ontology. List of some of the defined concepts and their formal definitions, present in each module of the OntoPaRON modular ontology. We recall that the HermiT reasoner infers the membership of all concepts sharing a relationship to a fully defined concept defined with this same relationship.
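The inference mentioned in this legend can be illustrated with a deliberately simplified Python sketch. It does not call HermiT or any OWL library, and it covers only a tiny part of what a description logic reasoner does. All concept and relationship names below are hypothetical and are not taken from OntoPaRON. The idea shown is that a fully defined concept carries a necessary and sufficient condition, so any concept asserting the same relationship is classified under it.

```python
# Hypothetical fully defined concept: its definition is a set of
# (relationship, filler) pairs that is both necessary and sufficient.
DEFINED = {
    "TechnicalAssistanceNeed": {("concerns", "TechnicalAid")},
}

# Hypothetical primitive concepts with their asserted relationships.
ASSERTED = {
    "WheelchairRequest": {("concerns", "TechnicalAid"), ("expressedBy", "Patient")},
    "HomeHelpRequest": {("concerns", "HumanAid")},
}


def infer_members(defined, asserted):
    """Classify each asserted concept under every defined concept whose
    conditions are all present among its asserted relationships."""
    return {
        name: sorted(
            concept
            for concept, relations in asserted.items()
            if conditions <= relations
        )
        for name, conditions in defined.items()
    }


print(infer_members(DEFINED, ASSERTED))
```

A real reasoner would also take the class hierarchy of the fillers and other axiom types into account; the subset test above is only a stand-in for that machinery.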

4) The authors used a linear regression model to investigate whether the identified themes specifically concerned patients with particular characteristics. First of all, the common notation in regression is ‘X’ for the independent variables and ‘Y’ for the dependent variable, so it would be better to change the terms to prevent misunderstanding. It would also be interesting to see the feature importance extracted from the regression model, in order to interpret the importance of each independent variable in explaining the fully defined concept.

RESPONSE: We thank the reviewer for pointing this out. We have revised the manuscript by adding Table 6 ‘Association between fully defined concepts and clinical variables assessed by linear regression model’ which shows the strength and ranking of the associations. These results underline the importance of the patient's motor state and age in the formulation of requests and needs. To bring more clarity to the reader we have reworded parts of the results paragraph to emphasize the role of these clinical variables.
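One common way to express the strength and ranking of associations in a linear regression is to compare standardized slopes. The sketch below is an assumption about how such a ranking could be computed, not the authors' actual analysis: it fits one simple regression per variable, uses x for the independent variable and y for the dependent variable as the reviewer suggests, and relies on made-up numbers.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def standardized_slope(x, y):
    """Slope of the least-squares line of y on x after z-scoring both.

    For a single predictor this equals the Pearson correlation coefficient.
    """
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_associations(predictors, outcome):
    """Sort predictor names by decreasing absolute standardized slope."""
    slopes = {name: standardized_slope(x, outcome) for name, x in predictors.items()}
    return sorted(slopes.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up numbers for illustration; not data from the study.
outcome = [1, 2, 2, 4, 5, 6]  # y: e.g. annotation count for one defined concept
predictors = {                # x: hypothetical clinical variables
    "motor_score": [40, 36, 35, 28, 22, 18],
    "age": [50, 62, 55, 70, 66, 75],
}
ranking = rank_associations(predictors, outcome)
```

Standardizing puts variables measured on different scales, such as a motor score and an age in years, on a common footing before they are ranked. A multivariable model would instead estimate all coefficients jointly.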

Minor points

• All the references, figures and tables should be properly cited throughout the manuscript.

RESPONSE: We apologize for our mistake. This revised version contains all the references.

• I did not focus on typos, punctuation and grammar mistakes, but there are several, and the authors should arrange for English proofreading to improve the writing. Also, the authors need to organize their use of acronyms systematically. Some are never used again in the manuscript after their first occurrence, such as knowledge engineering (KE), and some are defined twice, such as natural language processing (NLP).

RESPONSE: We are grateful for the comments. We have corrected the typos and the acronyms, and we have had the article proofread to improve the English.

• What do the parentheses mean in Table 3?

RESPONSE: Thank you for your comment. The values in parentheses in Table 3 correspond to the percentage of classes shared between the OntoPaRON ontology and the reference terminology. To aid understanding, we have added a sentence to the table legend: ‘The table shows for each module of the ontology OntoPaRON the number of concepts aligned with the reference terminology, and the percentage of terms aligned in each module’.

• It is recommended that numbers with more than 3 digits use ‘,’ rather than a space as the thousands separator, and that ‘.’ be used for decimals.

RESPONSE: We thank the reviewer for pointing this out. We have reformatted all the numeric data in the revised manuscript accordingly.
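A conversion of this kind can be sketched in a few lines. The example assumes that the original notation used a space as the thousands separator and a comma as the decimal mark, which the reviewer's comment suggests but does not state outright. The function name to_english_number is hypothetical.

```python
def to_english_number(text):
    """Convert '2 462,5' style notation to '2,462.5' style notation."""
    # Remove regular and non-breaking spaces used as thousands separators.
    digits = text.replace("\u00a0", "").replace(" ", "")
    # Split on the decimal comma, if there is one.
    integer_part, _, decimal_part = digits.partition(",")
    formatted = f"{int(integer_part):,}"
    return f"{formatted}.{decimal_part}" if decimal_part else formatted


concept_count = to_english_number("2 462")
```

The sketch handles non-negative numbers only and leaves the decimal digits untouched.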

Reviewer #2: This paper describes the creation of an ontology and associated tool for characterisation and management of patients with ALS, using French textual data. Overall the study is interesting, and has produced several potentially interesting outcomes that consist in an ontology, surrounding analysis tools, and disease insights that could contribute to improved patient management.

My overall comment is that certain aspects of the methodology and results are somewhat unclear, and should be improved before publication. I am, therefore, making the suggestion of major changes, not because the content of the paper is bad (this is not the case), but because some of the argumentation and explanation needs to be reformulated and extended, and there are significant formatting problems that inhibit understanding of the paper.

RESPONSE: We are grateful for the comments and assessment provided by the reviewer concerning the previous version of our manuscript. Please find below a detailed explanation on how we have attempted to address his/her comments.

> Our approach was unique in its creation of fully defined concepts at different levels of the modular ontology to address specific topics relating to healthcare trajectories.

I am not entirely sure what this means, there are many ontologies and associated tools that approach characterisation of different facets of disease management (for a recent example, see the COVID-19 ontology http://www.aber-owl.net/ontology/COVID-19/ ). I would suggest that the unique approach here consists more in:

- The application of ontology technology to French language clinical text

- The creation of a new ontology-based semantic annotation / analysis tool

- The use of the ontology to gain insight into the disease (this is mentioned in the latter part of the abstract)

The later methods and results discuss the idea of 'fully defined concepts' in more detail, but it's unclear to me how they differ from the ontology modules, or what exactly makes them fully defined, as opposed to other concepts or modules in the ontology.

RESPONSE: We now note in the introduction and conclusion that our work specifically addresses the processing of French. We have clarified that the annotation tool is based on an ontology. The section on fully defined concepts has been expanded, in particular with supporting information and new descriptions.

The introduction discusses ALS deeply, and provides some other examples of disease-specific ontologies. However, little time is spent on applications of ontology to the area of text mining. I would suggest this review as a start: http://cobweb.cs.uga.edu/~kochut/teaching/8350/Papers/Ontologies/TextMining-RawText.pdf

RESPONSE: We appreciate your suggestion. We have included the tasks performed in our work, namely information and data mining, as reported in the proposed reference.

> A modular ontology seemed to be the most appropriate model for taking all these aspects into account.

It is unclear what a 'modular ontology' is in this context. Also the choice to use a modular design should be better explained. In fact, I note that this term is defined later in the paper, and the choice explained better (although the definition could perhaps be cited). Perhaps this implementation detail can be omitted entirely from the introduction, and left to the later, better explanation?

RESPONSE: This observation is correct. We have changed the introduction to include a definition of a modular ontology. The revised text reads as follows: We have therefore developed our own ontology, OntoPaRON, including these different dimensions in a modular structure. A modular ontology corresponds to a set of modules, where each module is a stand-alone component that maintains relationships with other ontology modules [16].

There is a helpful guideline on minimum reporting for ontologies. I do not suggest that any of these , but the authors could add a note https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-017-0172-7

RESPONSE: Good idea. The metadata has been enriched following the [Miro] principles of ontology to give more accurate information to users.

--

I think the explanation of the OnBaSAM tool, starting on page 8, should be further developed. What is unclear to me, is how these tools implement or replace functionality provided by the GATE framework itself. The GATE framework natively supports the use of ontologies as a resource for text mining, as evidenced by the documentation: https://gate.ac.uk/sale/tao/splitch14.html , as well as the pre-processing steps mentioned.

RESPONSE: We thank the reviewer for pointing this out. The reviewer is correct: GATE allows the use of ontologies as a resource for text mining. However, to be able to use it in our work we had to adapt the GATE modules, because GATE is designed for English text data. Processing the French language requires adaptations of GATE, especially in the handling of negation, but also in the handling of spelling. The revised text reads as follows: Based on GATE, which makes it possible to build a semantic annotation chain using an ontology, we have built a specific processing chain that we have adapted to our use case. Indeed, GATE is used and developed in English. Based on GATE, we have built a specific processing chain for our problem, which allows us to take into account the French language, negation, and the export of annotations.

The discussion could also include some mention of how generalisable the tools for text mining and analysis are. Could they be used with other domain ontologies? Are they available on the web?

RESPONSE: Thank you for your comment. We agree and we have developed this in the antepenultimate paragraph of the discussion.

We created a modular ontology for the processing of such data, taking all aspects of the patient care pathway into account: the medical, socio-environmental, and coordination dimensions. The choice of a modular system and the creation of defined concepts made it possible to group together concepts dealing with the same theme from different ontology modules under a defined concept. The themes for the defined concepts were chosen on the basis of published data for ALS. Like Grau [23], we observed the positive aspects of modularity. However, modularity requires constant attention to the positioning of defined concepts and the management and attribution of relationships between concepts.

--

The methods of the validation should be more explicit: what exactly were the expert evaluators asked to do? It's unclear whether they only verified the machine-derived labels, or also created an annotation themselves. In the former case, recall is not an informative measure (or at least, it is misleading, as it does not describe the proportion of the actually existent concepts found).

Optionally, the authors could also consider measuring inter-annotator agreement for the validation stage of their annotations. If not, they should mention potential limitations arising from treating non-perfect/human operators as a gold standard for evaluation. I can't further comment on the evaluation without more information about the methods.

RESPONSE: As suggested by the reviewer, we have clarified the evaluation phase of the annotations, which was carried out by experts in the field. The experts were asked both to evaluate the annotations made by the annotator and to create manual annotations when concepts had not been annotated. The revised text reads as follows: During the evaluation process we asked the coordinators both to evaluate the concepts annotated by the system, and to indicate which concepts might be missing, by creating a manual annotation. The detection of missing concepts by the experts allowed us to enrich the ontology.
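Because the experts both judged the system's annotations and added the concepts it missed, precision and recall can both be computed from their review. The sketch below shows the standard definitions under that protocol. The counts are made up for illustration and are not results from the study, and the function name precision_recall is an assumption.

```python
def precision_recall(correct, incorrect, missed):
    """Compute precision and recall from expert review counts.

    correct:   system annotations the experts judged right (true positives)
    incorrect: system annotations the experts judged wrong (false positives)
    missed:    concepts the experts added by hand (false negatives)
    """
    precision = correct / (correct + incorrect) if correct + incorrect else 0.0
    recall = correct / (correct + missed) if correct + missed else 0.0
    return precision, recall


# Made-up counts for illustration; not results from the study.
p, r = precision_recall(correct=80, incorrect=20, missed=20)
```

If the experts had only verified the system's output, the missed count would be unknown and recall could not be estimated, which is the concern the reviewer raises.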

--

The ontology itself is interesting, using its own defined upper-level stratification of terms, with entity, abstract object, and ideal object. I don't think this is necessarily a bad thing, however many biomedical ontologies use the Basic Formal Ontology (BFO), to the extent they make upper level metaphysical distinctions at all. This can, in some cases, help with integration of concepts between different ontologies. It may be worth adding a small discussion of why the authors chose this different method.

RESPONSE: We agree and we have developed this in the penultimate paragraph of the discussion.

--

Citation numbers are missing in the document text, although they are listed in the references. I would suggest that 'pdflatex' should be run once more, to fill in the citation numbers in the document :-)

RESPONSE: Thank you for pointing this out. We apologize for our mistake. This revised version contains all the references.

In addition, the links given as citation for people living with ALS in France and America in the introduction lead to 404 errors. Several other links included as citations are also broken, which is possibly another issue caused by the above problem.

RESPONSE: Thank you for pointing this out. We apologize for our mistake. We have changed the links and made the choice to integrate a reference in a bibliography, which will be more reliable. Moisan F, Kab S, Moutengou E, Boussac-Zerebska M, Carcaillon-Bentata L, Elbaz A. Fréquence de la maladie de Parkinson en France. Données nationales et régionales 2010-2015.; p. 69.

--

The 'Construction of the OntoPaRON ontology' section mentions a standard format of single quotes for OntoPaRON and italic for relationships in the ontology. It would be helpful if the mention of italic font was italicised here to provide an example. Furthermore, it would aid understanding to highlight such terms in table 1

RESPONSE: We thank the reviewer for pointing this out. Table 1 illustrates the pseudo-anonymization process and not the semantic annotation process, which is illustrated in Figure 4. The purpose of table 1 is to illustrate with 2 examples the type of textual data we have processed during our work. The textual data contained many personal data (patient's name, professional name, place of residence) but also many abbreviations. In order to be able to process them, we set up a processing chain to transform the nominal data into a concept, which could be replayed during the annotation. Thus, it is possible to identify the interactions between agents.

We have made changes to the legend to clarify these points.

--

Some previous work that could be interesting for French language ontology is WHOFRE ( http://www.aber-owl.net/ontology/WHOFRE ). Could this work potentially be integrated?

RESPONSE: Thank you for your comment. In our context, WHOFRE is an interesting work, but it is not a development that highlights the treatment of French. It is more important for us to highlight the Alzheimer's ontology of Dramé, Diallo and Co. This is what we now do in the introduction (reference 13).

In table 1, the language names English and French should be capitalised.

RESPONSE: We thank the reviewer for pointing this out. We have revised the table accordingly.

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Luke T Slater

Attachment

Submitted filename: Response to Reviewers.pdf

Decision Letter 1

Robert Hoehndorf

14 Dec 2020

Use of a modular ontology and a semantic annotation tool to describe the care pathway of patients with amyotrophic lateral sclerosis in a coordination network.

PONE-D-20-20582R1

Dear Dr. CARDOSO,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Robert Hoehndorf, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: (No Response)

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: (No Response)

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: (No Response)

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: (No Response)

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Acceptance letter

Robert Hoehndorf

21 Dec 2020

PONE-D-20-20582R1

Use of a modular ontology and a semantic annotation tool to describe the care pathway of patients with amyotrophic lateral sclerosis in a coordination network.

Dear Dr. Cardoso:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Robert Hoehndorf

Academic Editor

PLOS ONE

Associated Data


    Supplementary Materials

    S1 Table. List of some of the fully defined concepts in OntoPaRON ontology.

    List of some of the defined concepts and their formal definitions, present in each module of the OntoPaRON modular ontology. We recall that the HermiT reasoner infers the membership of all concepts sharing a relationship to a fully defined concept defined with this same relationship.

    (PDF)

    Attachment

    Submitted filename: Response to Reviewers.pdf

    Data Availability Statement

    All relevant data are within the manuscript and its Supporting information files.


    Articles from PLoS ONE are provided here courtesy of PLOS
