ESMO Real World Data and Digital Oncology. 2024 Jun 3;4:100044. doi: 10.1016/j.esmorw.2024.100044

The 1 million words pathology report or the challenge of a reproducible and meaningful message

C Eloy 1,2, P Seegers 3, E Bazyleva 4, F Fraggetta 5
PMCID: PMC12836701  PMID: 41647787

Abstract

Many years have passed since the pathology report was all about a single-sentence diagnosis based on morphology. The pathology report is an invaluable source of data that needs to evolve from narrative reporting to a synoptic reporting system, standardizing data elements to ensure consistency and adopting structured formats that improve completeness, interoperability, and scalability across different health care systems. The convergence of technology, structured data, and artificial intelligence propels the field of pathology toward a future where the synthesis of information benefits not only health care professionals and patients but also serves as a wellspring of knowledge for machines, paving the way for unprecedented strides in data mining and health care innovation.

Key words: digital pathology, digital transformation, pathology report, computational pathology, synoptic reporting

Highlights

  • Pathology reports increasingly condense information from multiple sources.

  • The amount of information may turn a pathology report into a confusing document.

  • Technology is supporting the organization of data into synoptic reports.

  • Artificial intelligence contributes to organizing data for the benefit of patient care.


Many years have passed since the pathology report was all about a single-sentence diagnosis based on morphology. The growing body of knowledge on each disease has turned pattern recognition without an integrated clinical and immunohistochemical/molecular context into a sterile diagnostic exercise, far from the precision medicine pursued today. The complex armamentarium for disease characterization now available to the modern pathologist also provides powerful inputs for evaluating prognostic and predictive markers, which computational tools may further augment. The results emanating from all these sources must be organized, integrated, and rapidly managed in a readable, reproducible, and meaningful report. Reporting under these conditions requires training, knowledge of pitfalls and analytic issues, and a degree of resilience to stress.

The pathology report is an invaluable source of data that needs to evolve from narrative reporting to a synoptic reporting system, standardizing data elements to ensure consistency and adopting structured formats that improve completeness, interoperability, and scalability across different health care systems.1 The value of synoptic reporting is readily apparent during multidisciplinary oncology meetings, where it makes the analysis of the pathology report more efficient and accurate, contributing to better-informed treatment decisions and, consequently, better patient outcomes.2

Today, pathologists face the challenge of translating a report with 1 million words into a reproducible and meaningful message, refining reporting methods and improving data-sharing standards, for the benefit of pathology peers, clinicians, registrars, researchers, data scientists, quality control managers, and ultimately, patients.3,4 Although there have been some attempts to standardize reporting, particularly in relation to cancer diagnosis, there is still considerable latitude in the reporting of cytology and of benign conditions. Information overload can become unmanageable in molecular pathology reporting because of the increasing volume of genomic data. Beyond standards for the interpretation and reporting of sequence variants in next-generation sequencing and whole genome sequencing, metadata such as the quality and type of the specimen, combined with technical aspects, also add to the cumulative data to be reported.5,6 Whether to report molecular data of unproven clinical relevance, and how to report conflicting pathologist–computer evaluations, remain open questions that, for now, generate additional data to include in the report. Studies suggest that formatting options such as column formats and shorter sentences contribute to better understanding and faster information retrieval, and help avoid overwhelming users.7 In general, there is a need to compartmentalize information into subjects, separating the report about the sample from technical specifications, explanations, and disclaimers.
It is also of paramount importance to include in the report institutional data (name of the laboratory, management and responsibilities, quality control policy), technical specifications relevant to the exploitation of results or to medico-legal purposes (description of the study, technical procedures, results of validation and performance tests, and others), and patient data in sufficient detail to avoid mislabelling while preserving confidentiality (Figure 1).

Figure 1. Information management at the pathology report level.
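The compartmentalization described above can be illustrated with a minimal sketch in Python. All class names, field names, and values below are hypothetical and chosen only for illustration; they do not come from any reporting standard or laboratory information system. The sketch keeps the findings about the sample, the technical specifications, the institutional data, and the disclaimers in separate compartments, and renders them in a column-like layout with one short item per line.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class SampleFindings:
    """What the report says about the sample itself."""
    diagnosis: str
    elements: dict = field(default_factory=dict)


@dataclass
class TechnicalSpecs:
    """Specimen metadata and methods, kept apart from the findings."""
    specimen_type: str
    methods: list = field(default_factory=list)


@dataclass
class InstitutionalData:
    """Laboratory identification and responsibilities."""
    laboratory: str
    responsible_pathologist: str


@dataclass
class PathologyReport:
    """Hypothetical container with one attribute per compartment."""
    findings: SampleFindings
    technical: TechnicalSpecs
    institution: InstitutionalData
    disclaimers: list = field(default_factory=list)


def render(report: PathologyReport) -> str:
    """Render each compartment under its own heading, one item per line."""
    lines = ["FINDINGS", f"  Diagnosis: {report.findings.diagnosis}"]
    lines += [f"  {name}: {value}"
              for name, value in report.findings.elements.items()]
    lines += ["TECHNICAL SPECIFICATIONS",
              f"  Specimen type: {report.technical.specimen_type}"]
    lines += [f"  Method: {method}" for method in report.technical.methods]
    lines += ["INSTITUTION",
              f"  Laboratory: {report.institution.laboratory}",
              f"  Responsible pathologist: "
              f"{report.institution.responsible_pathologist}"]
    lines += ["DISCLAIMERS"] + [f"  {text}" for text in report.disclaimers]
    return "\n".join(lines)


# Illustrative values only, not taken from any real case.
example = PathologyReport(
    findings=SampleFindings(
        diagnosis="Adenocarcinoma (illustrative)",
        elements={"Histological grade": "G2", "Margins": "Free"},
    ),
    technical=TechnicalSpecs(
        specimen_type="Resection",
        methods=["Haematoxylin and eosin", "Immunohistochemistry"],
    ),
    institution=InstitutionalData(
        laboratory="Example Pathology Laboratory",
        responsible_pathologist="Dr Example",
    ),
    disclaimers=["Illustrative disclaimer text."],
)

print(render(example))
# The same object serializes to JSON without mixing the compartments.
print(json.dumps(asdict(example), indent=2))
```

Keeping the compartments as separate objects means that a reader-facing layout and a machine-facing export can be produced from the same source without disclaimers or technical details leaking into the diagnostic section.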

Recently, technology and the digital transformation of pathology laboratories8 have helped drive the organization of data and their reporting. Practical problems related to the availability of specialized laboratory information systems (LISs) designed to facilitate standardization, with a dedicated lexicon and speech recognition, are starting to be solved. Further, the convergence of informatics standards for producing reports in formats such as JavaScript Object Notation (JSON) or eXtensible Markup Language (XML) is contributing to interoperability and intraoperability.7 The integration of artificial intelligence (AI) tools, including natural language processing and large language models, into LISs is becoming increasingly common.9,10 These AI technologies can efficiently organize and translate information from unstructured pathology reports into structured formats, and the resulting structured data are then easily retrievable for statistical analysis. However, a significant challenge with these models lies in their dependency on the input content: information that is not included in the reports cannot be structured.11 The use of comprehensive synoptic reports is therefore crucial. Such reports ensure that all necessary data are captured, preventing issues such as incomplete information, outdated classifications, or the omission of data critical for patient management. Ultimately, the goal is to follow the patient's movement from institution to institution and from country to country, producing a reporting framework that is shared at the national or even the international level. Generating synoptic reports that can be used at a national level is a complex process that spans regulations and laws, organizational policies, care processes, information requirements, applications, and informatics infrastructure.12
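The point that missing information cannot be structured afterwards can be made concrete with a small completeness check on a JSON report. The sketch below is written in Python under stated assumptions: the list of core elements and the field names are hypothetical, and a real template would take them from ICCR, CAP, or national datasets.

```python
import json

# Hypothetical core elements for illustration; a real template would take
# them from ICCR, CAP, or national datasets.
CORE_ELEMENTS = ("tumour_type", "histological_grade", "margin_status", "pT", "pN")


def missing_core_elements(report: dict) -> list:
    """Return the core elements that are absent or empty in a structured report."""
    return [name for name in CORE_ELEMENTS if report.get(name) in (None, "")]


# A structured report as it might arrive in JSON (illustrative values only).
raw = """
{
  "tumour_type": "Adenocarcinoma",
  "histological_grade": "G2",
  "margin_status": "",
  "pT": "pT2"
}
"""

report = json.loads(raw)
gaps = missing_core_elements(report)
print("Missing core elements:", gaps)
```

A check of this kind only flags the gap. The missing value still has to be supplied by the pathologist at the time of reporting, which is the argument for capturing core elements synoptically in the first place.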

For each national activity regarding synoptic reporting, the jurisdictional framework under which the organization operates must be defined according to the regulations issued by governmental and supervisory authorities, including those at a European level such as the General Data Protection Regulation (GDPR) and the European Health Data Space (EHDS). This framework allows the nation, as a general umbrella, or smaller organizational units to establish management policies that support the implementation of synoptic reporting. The maintenance of implemented synoptic reports, including core and noncore datasets, requires a centralized organization, which depends on the chosen informatics application and on the information requirements incorporated into the synoptic report itself. These information requirements consist of minimal datasets (core elements) inspired by international organizations such as the International Collaboration on Cancer Reporting (ICCR) or the College of American Pathologists (CAP), along with national guidelines. In oncology, it is also essential to include the World Health Organization Classification of Tumours, the tumour–node–metastasis classifications, and international code systems such as SNOMED Clinical Terms (SNOMED CT) or Logical Observation Identifiers Names and Codes (LOINC).13, 14, 15, 16, 17, 18, 19 The informatics application selected to manage this information will store, structure, process, analyse, and communicate it, all of which are crucial to the successful adoption of synoptic report datasets. The level of implementation is influenced significantly by whether the application is centralized or locally implemented in an LIS. Ellis and Srigley20 describe six levels of pathology reporting, ranging from basic (level 1) to advanced and standardized structured reports (level 6).
Applications (standalone) or software (integrated into LISs) used for diagnostic or therapeutic decision making fall under class IIa according to the European Medical Device Regulations.21 At this national level, interoperability and intraoperability requirements demand the use of advanced standards such as Fast Healthcare Interoperability Resources (FHIR) and open Electronic Health Record (openEHR).22,23
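As a rough illustration of what exchanging a synoptic report through FHIR might look like, the Python sketch below maps a flat set of synoptic items to a FHIR-style DiagnosticReport with contained Observation resources. It is a sketch under several assumptions: the function name and input layout are hypothetical, the output has not been validated against the FHIR specification or any national profile, and the items carry only free-text labels where a real implementation would attach SNOMED CT or LOINC codes.

```python
import json


def to_diagnostic_report(synoptic: dict, patient_id: str) -> dict:
    """Map label/value pairs to a FHIR-style DiagnosticReport (hypothetical helper)."""
    observations = []
    for index, (label, value) in enumerate(synoptic.items()):
        observations.append({
            "resourceType": "Observation",
            "id": f"obs-{index}",
            "status": "final",
            # Free-text label only; coded concepts are deliberately omitted here.
            "code": {"text": label},
            "valueString": str(value),
        })
    return {
        "resourceType": "DiagnosticReport",
        "status": "final",
        "code": {"text": "Pathology synoptic report"},
        "subject": {"reference": f"Patient/{patient_id}"},
        # Observations travel inside the report and are referenced locally.
        "contained": observations,
        "result": [{"reference": f"#{obs['id']}"} for obs in observations],
    }


# Illustrative values only, not taken from any real case.
synoptic_items = {
    "Tumour type": "Adenocarcinoma",
    "Histological grade": "G2",
    "pT": "pT2",
}

diagnostic_report = to_diagnostic_report(synoptic_items, patient_id="example-1")
print(json.dumps(diagnostic_report, indent=2))
```

Because each synoptic item becomes its own Observation, a receiving system can query individual elements instead of parsing the narrative text of the report.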

Unfortunately, both at the individual level and at the national level, synoptic reporting is still not adopted by all pathologists. Pathology and radiology reports parallel each other to some extent.24 In radiology, structured reporting is increasingly being used, specifically in the setting of multiparametric magnetic resonance imaging reports, and clinical disciplines are following,25 but the global level of implementation, as in pathology, is also far from being universal.26

Pathology has been identified as a strong candidate for AI development, particularly in cancer diagnosis (enabling early-stage diagnosis, refining disease classification, shortening turnaround time) and tissue biomarker analytics (quantification for precision and prediction of outcomes).27,28 Routine clinical practice in pathology generates the data, either images or report contents, that are needed to train AI algorithms and to deploy AI tools. Synoptic reports play a crucial role by creating databases with standardized templates, facilitating the acquisition of structured data of consistent quality that are ready to train AI algorithms. This structured approach helps ensure that algorithms receive the cleanest data possible, thereby considerably accelerating the aforementioned processes. In essence, synoptic reports may act as a catalyst, streamlining the flow of information to AI systems and optimizing their performance.

As we navigate the intersection between AI and pathology, it becomes evident that the meaningful message conveyed through synoptic reports is no longer confined to the understanding of pathologists and oncologists or the direct benefit to patients. The convergence of technology, structured data, and AI propels the field of pathology toward a future where the synthesis of information benefits not only health care professionals and patients but also serves as a wellspring of knowledge for machines, paving the way for unprecedented strides in data mining and health care innovation, thereby creating a quality loop.

Acknowledgments

Funding

None declared.

Disclosure

CE consulted for MSD, Leica, and Mindpeak. All other authors have declared no conflicts of interest.

Data sharing

No data were generated.

Ethics approval and consent to participate

Formal ethical approval was not required for this study, as it did not involve any patient data collection or impact on patient care.

Declaration of generative AI and AI-assisted technologies in the writing process

Generative AI was not used.

References

  • 1. Marini N., Marchesin S., Otalora S., et al. Unleashing the potential of digital pathology data by training computer-aided diagnosis models without human annotations. NPJ Digit Med. 2022;5(1):102. doi: 10.1038/s41746-022-00635-4.
  • 2. Schaad N., Berezowska S., Perren A., Hewer E. Impact of template-based synoptic reporting on completeness of surgical pathology reports. Virchows Arch. 2024;484(1):31–36. doi: 10.1007/s00428-023-03533-6.
  • 3. Baranov N.S., Nagtegaal I.D., van Grieken N.C.T., et al. Synoptic reporting increases quality of upper gastrointestinal cancer pathology reports. Virchows Arch. 2019;475(2):255–259. doi: 10.1007/s00428-019-02586-w.
  • 4. de Neree Tot Babberich M.P.M., Ledeboer M., van Leerdam M.E., et al. Dutch gastrointestinal endoscopy audit: automated extraction of colonoscopy data for quality assessment and improvement. Gastrointest Endosc. 2020;92(1):154–162.e1. doi: 10.1016/j.gie.2020.01.052.
  • 5. Li M.M., Datto M., Duncavage E.J., et al. Standards and guidelines for the interpretation and reporting of sequence variants in cancer: a joint consensus recommendation of the Association for Molecular Pathology, American Society of Clinical Oncology, and College of American Pathologists. J Mol Diagn. 2017;19(1):4–23. doi: 10.1016/j.jmoldx.2016.10.002.
  • 6. Campbell J.R., Talmon G., Cushman-Vokoun A., Karlsson D., Scott Campbell W. An extended SNOMED CT concept model for observations in molecular genetics. AMIA Annu Symp Proc. 2016;2016:352–360.
  • 7. Malapelle U., Donne A.D., Pagni F., et al. Standardized and simplified reporting of next-generation sequencing results in advanced non-small-cell lung cancer: practical indications from an Italian multidisciplinary group. Crit Rev Oncol Hematol. 2024;193. doi: 10.1016/j.critrevonc.2023.104217.
  • 8. Eloy C. Postponing evolution: why are we choosing to ignore the need for a digital transformation in pathology? Virchows Arch. 2023. doi: 10.1007/s00428-023-03714-3.
  • 9. Cazzaniga G., Eccher A., Munari E., et al. Natural language processing to extract SNOMED-CT codes from pathological reports. Pathologica. 2023;115(6):318–324. doi: 10.32074/1591-951X-952.
  • 10. Truhn D., Loeffler C.M., Muller-Franzes G., et al. Extracting structured information from unstructured histopathology reports using Generative Pre-trained Transformer 4 (GPT-4). J Pathol. 2024;262(3):310–319. doi: 10.1002/path.6232.
  • 11. Sluijter C.E., van Lonkhuijzen L.R.C.W.3, van Slooten H.J., Nagtegaal I.D., Overbeek L.I.H. The effects of implementing synoptic pathology reporting in cancer diagnosis: a systematic review. Virchows Arch. 2016;468(6):639–649. doi: 10.1007/s00428-016-1935-8.
  • 12. Sprenger M. Electronic information for health and care services. Available at https://nictiz.nl/app/uploads/2022/02/Electronic-information-for-health-and-care-services-Nictiz-2021.pdf
  • 13. Published Datasets - ICCR Datasets for a consistent, evidence-based approach for reporting of cancer. Available at https://www.iccr-cancer.org/datasets/published-datasets/
  • 14. College of American Pathologists. Cancer protocol templates. Available at https://www.cap.org/protocols-and-guidelines/cancer-reporting-tools/cancer-protocol-templates
  • 15. The World Health Organization. The World Health Organization Classification of Tumours. Geneva, Switzerland: The World Health Organization.
  • 16. Union for International Cancer Control (UICC). TNM Classification of Malignant Tumours. Geneva, Switzerland: Union for International Cancer Control.
  • 17. American Joint Committee on Cancer (AJCC). Cancer staging systems. Available at https://www.facs.org/quality-programs/cancer-programs/american-joint-committee-on-cancer/cancer-staging-systems/. Accessed April 20, 2024.
  • 18. SNOMED International. Available at https://www.snomed.org/use-snomed-ct
  • 19. LOINC. Introducing the LOINC ontology: a LOINC and SNOMED CT interoperability solution. Available at https://loinc.org/
  • 20. Ellis D.W., Srigley J. Does standardised structured reporting contribute to quality in diagnostic pathology? The importance of evidence-based datasets. Virchows Arch. 2016;468(1):51–59. doi: 10.1007/s00428-015-1834-4.
  • 21. Medical Device Regulations. Guidance on qualification and classification of software in regulation (EU) 2017/745 – MDR and regulation (EU) 2017/746 – IVDR. Available at https://health.ec.europa.eu/system/files/2020-09/md_mdcg_2019_11_guidance_qualification_classification_software_en_0.pdf
  • 22. Monteiro S.C., Cruz Correia R.J. FHIR based interoperability of medical devices. Stud Health Technol Inform. 2022;290:37–41. doi: 10.3233/SHTI220027.
  • 23. Mascia C., Uva P., Leo S., Zanetti G. OpenEHR modeling for genomics in clinical practice. Int J Med Inform. 2018;120:147–156. doi: 10.1016/j.ijmedinf.2018.10.007.
  • 24. Hartung M.P., Bickle I.C., Gaillard F., Kanne J.P. How to create a great radiology report. Radiographics. 2020;40(6):1658–1670. doi: 10.1148/rg.2020200020.
  • 25. Bedel A., Blache G., Jauffret C., et al. A computer synoptic operative report versus a report dictated by a surgeon in advanced ovarian cancer. Int J Gynecol Cancer. 2024;34(4):581–585. doi: 10.1136/ijgc-2023-004947.
  • 26. Nobel J.M., van Geel K., Robben S.G.F. Structured reporting in radiology: a systematic review to explore its potential. Eur Radiol. 2022;32(4):2837–2854. doi: 10.1007/s00330-021-08327-5.
  • 27. Zhang T., Chen J., Lu Y., Yang X., Ouyang Z. Identification of technology frontiers of artificial intelligence-assisted pathology based on patent citation network. PLoS One. 2022;17(8). doi: 10.1371/journal.pone.0273355.
  • 28. Cui M., Zhang D.Y. Artificial intelligence and computational pathology. Lab Invest. 2021;101(4):412–422. doi: 10.1038/s41374-020-00514-0.

Articles from ESMO Real World Data and Digital Oncology are provided here courtesy of Elsevier
