BMJ Health & Care Informatics. 2026 Apr 2;33(1):e101821. doi: 10.1136/bmjhci-2025-101821

Robotic process automation for identifying missing codes on insurance claims

Jiyun Lee 1, Jun Hwan Cho 1, Won Joo Lee 1, Dong Hoon Kim 1, Ye Lim Gong 1, Won Tae Kim 1, Chan Woong Kim 1
PMCID: PMC13052645  PMID: 41927105

Abstract

Objectives

This study aimed to develop and implement robotic process automation (RPA) for identifying missing codes during insurance claim post-review at a tertiary hospital and to evaluate its feasibility and effectiveness.

Methods

As a single-centre, operational implementation, an RPA system integrated with optical character recognition (OCR) and electronic medical record (EMR) platforms was developed using Blue Prism. The system compared 532 surgical procedure codes with 21 cutting device codes, automatically flagging discrepancies. Accuracy and efficiency were compared with manual review.

Results

Between 1 and 31 May 2025, the RPA system analysed 61 claim statements and performed 199 OCR processes. The Google Cloud Vision API (application programming interface) achieved 100% detection accuracy without false positives, while Tesseract yielded lower accuracy. The RPA reduced average processing time from 120 person-minutes (manual review) to 54 min, representing a 55% efficiency gain.

Discussion

RPA reliably automated repetitive, rule-based administrative tasks, improving accuracy and standardisation of insurance claim audits. Secure system architecture ensured compliance with healthcare data protection standards. User-centred development and integration with EMR demonstrated feasibility in complex healthcare workflows.

Conclusion

Implementing RPA for insurance claim post-review significantly enhanced efficiency and accuracy, reduced administrative workload and provided a scalable model for digital transformation in healthcare administration.

Keywords: Electronic Health Records, Medical Informatics Applications


WHAT IS ALREADY KNOWN ON THIS TOPIC

  • Robotic process automation (RPA) has been widely used in industries such as finance and manufacturing to improve efficiency and reduce human errors.

  • However, its application in healthcare has been limited due to complex workflows, unstructured electronic medical record (EMR) data and strict regulatory requirements.

WHAT THIS STUDY ADDS

  • This study demonstrates the successful development and implementation of an RPA system integrated with optical character recognition and EMR platforms for identifying missing codes in insurance claim post-review.

  • This system reduced processing time by 55% compared with manual review and achieved high accuracy, establishing a secure and standardised automation model for healthcare administration.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

  • The findings provide evidence that RPA can improve operational efficiency, accuracy and standardisation in healthcare administration.

  • This approach may encourage broader adoption of RPA in clinical and administrative workflows, support hospital revenue optimisation and enhance understanding of the potential role of automation in digital transformation of healthcare systems.

Introduction

South Korea’s National Health Insurance System (NHIS) is a mandatory single-payer social insurance scheme that provides universal coverage, funded mainly by insurance premiums and supplemented by government subsidies.1

Health insurance claims are jointly administered by the NHIS, which oversees premium collection, funds management and reimbursements, and the Health Insurance Review and Assessment Service (HIRA), which reviews claims and assesses the appropriateness of care. Reimbursement is predominantly fee-for-service under a resource-based relative value framework. Healthcare providers submit claims to HIRA for consultations, diagnostic tests, procedures, surgeries and anaesthesia. HIRA verifies claims to prevent unnecessary or inappropriate services and ensure proper reimbursement; reviewed results are then transferred to NHIS for payment processing. Although this framework supports efficient resource utilisation, it also creates administrative complexity for hospitals. Detecting omissions and errors in claims requires precise manual review of large datasets, which is time-consuming and labour-intensive. Therefore, improving both the accuracy and efficiency of the claims process is essential for optimising hospital revenue while maintaining sustainable healthcare expenditures.

Robotic process automation (RPA) uses software robots to automate repetitive, rule-based computer tasks such as data entry, form filling, file transfer and simple decision-making.2 RPA has improved efficiency, accuracy, processing speed and labour utilisation across multiple industries.3–9 However, healthcare settings remain challenging because of complex workflows, privacy and security requirements, and the need for integration across heterogeneous systems. This study aimed to evaluate the use of RPA tools to automatically identify and optimise missing insurance claim codes at Chung-Ang University Gwangmyeong Hospital, thereby improving operational efficiency.

Methods

Definition of the task

Most hospitals in Korea regularly submit claims to HIRA twice a month. Some of them are returned or reduced due to non-compliance with official notification standards or insufficient medical records. To prevent such losses, prior to submitting the statements, the hospitals conduct an internal ‘post-review process,’ which is a final audit to verify whether the claim statements comply with the National Health Insurance benefit criteria, relevant official notifications and review guidelines. This audit is critical not only for ensuring the accuracy and appropriateness of insurance claims but also for maintaining the quality of healthcare services. The key components of the ‘post-review process’ are as follows.

  • Identification of missing claim codes: Screening for treatment, procedures, medications and medical materials that were actually performed or provided, but were not reflected in the claim statement.

  • Review of items not meeting reimbursement criteria: Verifying that the essential requirements for claims, such as diagnosis codes or test results, are met.

  • Analysis of potential deduction risks: Proactively identifying and reviewing items with a high likelihood of being denied reimbursement because of past deduction patterns or discrepancies between clinical records and claims content.

These elements are screened by cross-referencing the hospital’s electronic medical record (EMR) system. When discrepancies or errors are identified, appropriate actions are taken, such as correcting claims, supplementing medical records or providing feedback to the clinical department. The standardised post-review process is illustrated in figure 1. As this study did not involve human participants, ethical approval from the Institutional Review Board was deemed unnecessary.

Figure 1. The standardised post-review process for insurance claims. EMR, electronic medical record.


Evaluation for automation feasibility

To assess the feasibility of automation, we evaluated candidate tasks using four predefined criteria—structure, repetitiveness, clarity and stability—with detailed definitions and operational examples provided in online supplemental table 1.

Based on these criteria, the screening of the mapping between surgical procedure codes and cutting device codes (eg, burrs and saws) in the post-review process was identified as highly suitable for RPA-based automation. The task follows a clearly defined workflow based on standardised notification tables and does not involve unstructured data processing, variable process sequencing or clinical judgements, all of which commonly hinder automation in healthcare systems. This ensured technical and operational stability. The task was also prioritised because omission of corresponding codes can result in complete denial of reimbursement for high-cost medical procedures.

The process involved rule-based comparison of 532 surgical procedure codes and 21 cutting-device codes. As it is performed during the post-review stage—after prescriptions are finalised and immediately before claim submission—data variability is minimal, improving automation reliability. Compared with the pre-discharge review stage, where prescriptions are frequently changed, the post-review phase provides a more favourable environment for RPA application.

In our institution, the EMR operates on a legacy architecture, in which surgical and cutting-device prescription data are displayed mainly as rendered on-screen images rather than structured text fields. Therefore, conventional RPA methods (eg, direct extraction of text-field values or user interface control-based access) alone were insufficiently reliable, and optical character recognition (OCR) was required to extract machine-readable text from visual information. Additionally, it is necessary to embed a reference table within the system that maps the surgical fee codes and cutting device items, as defined by the Ministry of Health and Welfare. A rule-based comparison logic is needed to handle a total of 532 surgical procedure codes and 21 cutting devices. Finally, the system must automatically determine whether the code extracted through OCR matches the reference table and flag any discrepancies. It is particularly important to classify and notify the discrepancies based on their causes, such as omissions or code errors, using a dedicated guidance system.
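The comparison and cause-based classification described above can be sketched as follows. This is a minimal illustration, assuming the reference table can be represented as a mapping from each surgical procedure code to the cutting-device codes accepted with it. The codes (`S0001`, `D001` and so on), the names `REFERENCE_TABLE`, `Discrepancy` and `classify_claim`, and the two cause labels are hypothetical placeholders; they are neither the Ministry of Health and Welfare table nor the authors' Blue Prism logic.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical reference table: surgical procedure code -> cutting-device codes
# accepted with it. The real table (532 procedures, 21 devices) is not reproduced.
REFERENCE_TABLE: dict[str, set[str]] = {
    "S0001": {"D001", "D002"},
    "S0002": {"D003"},
}


@dataclass(frozen=True)
class Discrepancy:
    surgical_code: str
    cause: str                  # "omission" or "code_error" (illustrative labels)
    found: tuple[str, ...]      # device codes extracted by OCR
    expected: tuple[str, ...]   # device codes accepted by the reference table


def classify_claim(surgical_code: str, device_codes: set[str]) -> Optional[Discrepancy]:
    """Return a Discrepancy if the extracted device codes do not satisfy the table."""
    expected = REFERENCE_TABLE.get(surgical_code)
    if expected is None:
        return None  # procedure is outside the screened set
    if device_codes & expected:
        return None  # at least one accepted device code is present
    # No device code at all is treated as an omission; a device code that is
    # present but not accepted for this procedure is treated as a code error.
    cause = "omission" if not device_codes else "code_error"
    return Discrepancy(
        surgical_code, cause, tuple(sorted(device_codes)), tuple(sorted(expected))
    )
```

Under these assumptions, `classify_claim("S0001", set())` is reported as an omission, whereas `classify_claim("S0001", {"D003"})` is reported as a code error, so the two causes can be routed to different notifications.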

Development environment

The automated system for screening missing codes during the post-review process was implemented using Blue Prism (Blue Prism Group plc., Warrington, UK). Blue Prism provides a user-friendly development environment that allows frontline staff to participate directly in development at relatively low cost, and it reliably supports screen recognition-based automation, making it well suited to institutional environments that rely on legacy programmes. It was integrated with the hospital’s EMR system to validate its applicability to real-world clinical settings. The main development and execution environments are as follows.

  • Operating system: Windows 10 64 bit.

  • RPA solution: Blue Prism V.7.1.

  • OCR engine: Tesseract, Google Cloud Vision API.

  • EMR: CAUH-NURI (data extracted from prescription screens using image-based capturing).

  • Data processing: Image-based OCR extraction and reference table mapping algorithm applied.

Security

To implement RPA in healthcare institutions, it is essential to ensure the confidentiality, integrity and availability of patient information in accordance with relevant regulations, such as the Health Insurance Portability and Accountability Act (HIPAA), Personal Information Protection Act, Medical Service Act and International Organisation for Standardisation (ISO) 27799. To meet these requirements, we applied a validated enterprise-grade RPA architecture equipped with ISO 27001 and System and Organisation Control 2 certifications.

The RPA session logs and system events are aggregated on a centralised server, enabling real-time monitoring and auditing. The centralised governance module provides data location control, making it particularly suitable for healthcare environments with strong on-premise infrastructure requirements.

To mitigate Single Point of Failure risks, we implemented a redundant log collection pipeline and adopted Write-Once-Read-Many storage media. In addition, protected health information masking rules were applied to prevent logs from becoming a new source of sensitive data.
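One way to apply masking rules before log lines are written is a logging filter. The sketch below is illustrative only: the two patterns (an 8-digit patient identifier and a resident registration number written as 6 digits, a hyphen and 7 digits) and the names `PHI_PATTERNS`, `mask_phi` and `PhiMaskingFilter` are assumptions, not the rules used in the study.

```python
import logging
import re

# Hypothetical masking rules; real rules would follow the institution's
# data-protection policy. The longer pattern is applied first.
PHI_PATTERNS = [
    (re.compile(r"\b\d{6}-\d{7}\b"), "[RRN]"),
    (re.compile(r"\b\d{8}\b"), "[PATIENT_ID]"),
]


def mask_phi(text: str) -> str:
    """Replace every match of a PHI pattern with its placeholder token."""
    for pattern, token in PHI_PATTERNS:
        text = pattern.sub(token, text)
    return text


class PhiMaskingFilter(logging.Filter):
    """Mask PHI in the fully formatted message before handlers see it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = mask_phi(record.getMessage())
        record.args = None  # message is already formatted
        return True
```

Attaching the filter to a handler (for example with `handler.addFilter(PhiMaskingFilter())`) masks every record passing through that handler, so the centralised log store never receives the raw identifiers.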

All credentials were encrypted using the Advanced Encryption Standard (AES)-256, and all data, both at rest and in transit, were secured with AES-256/Transport Layer Security (TLS) 1.2 or higher. In cases where legacy medical equipment supports only TLS 1.0, re-encryption is performed at the application programming interface (API) gateway, thereby maintaining consistent encryption standards across the entire architecture to protect patient information. RPA security and governance are shown in online supplemental figure 1.
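The "TLS 1.2 or higher" requirement can be expressed on the client side as a minimum protocol version. The following is a minimal sketch using Python's standard `ssl` module; the function name `make_client_context` is hypothetical, and the sketch does not model the re-encryption performed at the API gateway for legacy equipment.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

A connection attempted with this context to a peer that offers only TLS 1.0 fails during the handshake, which is why such peers have to be fronted by a gateway that terminates and re-encrypts the traffic.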

Architecture

The complete system architecture is presented in table 1. The automated system, designed using RPA technology, integrates with the hospital’s EMR system, performs automated validation based on official notification criteria, incorporates OCR processing and includes a notification system.

Table 1. System architecture diagram.

User layer
  • Components: EMR
  • Functions: View RPA results
  • Data I/O: I: Alerts; O: Result

Process layer
  • Components: RPA main process; Object modules; Work queue; Exception handler
  • Functions: Receive OCR results; Match against EDI code; Handle exceptions
  • Data I/O: I: OCR text, EDI tables; O: Processed results, classified items, exception logs

Integration layer
  • Components: API; Notification system
  • Functions: Extract text from images; Retrieve reference data; Send report
  • Data I/O: I: EMR images, EDI tables; O: OCR text, match results, result report

Data layer
  • Components: Image storage; OCR result storage; EDI code storage; Log storage
  • Functions: Store source images; Save data; Maintain logs
  • Data I/O: SAVE: EMR image, OCR outputs, EDI tables

Infrastructure layer
  • Components: RPA server; API gateway; Security infrastructure
  • Functions: Provide runtime environment; Ensure secure communication; Operate on-premise
  • Data I/O: System resources: Server, network, authentication modules

API, application programming interface; EDI, electronic data interchange; EMR, electronic medical record; I/O, input/output; OCR, optical character recognition; RPA, robotic process automation.

RPA workflow

Figure 2 illustrates the RPA system workflow. The system is designed to sequentially screen post-review targets based on each electronic data interchange (EDI) code using a predefined reference table that maps surgical procedure codes with cutting devices. The system extracts information from prescription screens and determines whether the prescription matches the criteria defined in the reference table. The process begins by loading the reference table during the initialisation phase. For each surgical EDI code, the system queries relevant patients, captures their prescription screens and uses OCR technology to extract code information. The extracted data are then compared against the reference table to ensure that the treatment material codes match. In the case of a mismatch or omission, the case would be flagged and stored as an exception. Once all patients and EDI codes have been reviewed, the system automatically generates a report and sends it to the reviewer, completing the process. This design aims to reduce omissions, errors and inefficiencies commonly encountered in manual post-review processes while also enhancing the accuracy of claim audits. Our implementation incorporates transferable patterns that can be applied across institutions and jurisdictions, while also including several highly context-specific elements. Further details are provided in online supplemental table 2.
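The loop described above can be sketched in a few lines. This is an illustrative outline, not the Blue Prism process: the names `run_post_review` and `format_report` are hypothetical, and patient lookup and screen capture with OCR are passed in as callables (`find_patients`, `extract_device_codes`) because their real implementations depend on the EMR.

```python
from typing import Callable, Iterable


def run_post_review(
    reference_table: dict[str, set[str]],
    find_patients: Callable[[str], Iterable[str]],
    extract_device_codes: Callable[[str, str], set[str]],
) -> list[dict]:
    """Screen every surgical EDI code and collect the flagged cases."""
    exceptions = []
    for edi_code, accepted in reference_table.items():       # each surgical EDI code
        for patient in find_patients(edi_code):               # patients with that code
            found = extract_device_codes(patient, edi_code)   # screen capture + OCR
            if not (found & accepted):                         # mismatch or omission
                exceptions.append({
                    "patient": patient,
                    "edi_code": edi_code,
                    "found": sorted(found),
                    "expected": sorted(accepted),
                })
    return exceptions


def format_report(exceptions: list[dict]) -> str:
    """Render the flagged cases as plain text for the reviewer."""
    lines = [f"Flagged cases: {len(exceptions)}"]
    for e in exceptions:
        found = ", ".join(e["found"]) or "none"
        expected = ", ".join(e["expected"])
        lines.append(f"{e['patient']}\t{e['edi_code']}\tfound: {found}\texpected: {expected}")
    return "\n".join(lines)
```

Keeping the EMR-facing steps behind callables means the comparison loop can be exercised with stub functions before it is connected to live screens.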

Figure 2. RPA flowchart. EDI, electronic data interchange; OCR, optical character recognition.


Results

RPA implementation

The developed RPA system automatically performed screening based on the presence of surgical procedure codes related to cutting devices. All 532 surgical EDI codes were entered into the system, which systematically searched for each code. It identified patients with corresponding prescription records and analysed the detailed information through OCR processing. The configuration of RPA objects and their processes is listed in online supplemental table 3. The RPA is composed of 1 main process and 10 functional objects, with each object designed to handle a specific functional task during automation. This modular design enhances code reusability, allowing for minimal modifications in the event of future process changes or OCR engine replacements, thereby improving maintenance efficiency.

Online supplemental table 4 summarises the key results of the post-review process for missing code screening conducted from 1 May to 31 May 2025. During this period, 61 claim statements were analysed. OCR processing was performed 199 times on the surgical and cutting device prescription screens. The extracted codes were compared with the reference table to identify missing codes. It took an average of 54 min to complete the full process and deliver the report to the designated reviewer via email. The report included the RPA execution date, the EDI codes recognised through OCR (of which 16 were classified as missing after mapping against the reference table) and prescription details that allowed the prescriptions associated with the missing items to be identified.

Accuracy evaluation

The accuracy of the RPA system was assessed by comparing its results with the missing codes identified by the reviewers, including both the consistency of the RPA-detected codes with the reference set and the number of false positives. Additionally, the differences in accuracy were analysed based on the OCR engine used. These findings are summarised in table 2.

Table 2. Comparison of OCR engine accuracy.

Case      Actual missing cases   RPA result (cases)*                 Accuracy (%)
                                 Tesseract   Google Cloud Vision     Tesseract   Google Cloud Vision
1         10                     13          10                      76.92       100
2         10                     10          10                      100         100
3         10                     10          10                      100         100
4         10                     13          10                      76.92       100
5         10                     10          10                      100         100
6         10                     10          10                      100         100
7         10                     10          10                      100         100
8         10                     0           10                      0           100
9         10                     13          10                      76.92       100
10        10                     13          10                      76.92       100
Average                                                              80.77       100

*RPA result (cases) indicates the number of missing cases detected by each OCR engine.

OCR, optical character recognition; RPA, robotic process automation.
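The article does not state the formula behind the accuracy column of table 2. One reading that reproduces every row is the ratio of the smaller to the larger of the two counts (actual and detected), so that 13 detections against 10 actual cases gives 10/13 and 0 detections gives 0. The sketch below works through the table under that assumption; the function name `accuracy_pct` is hypothetical.

```python
def accuracy_pct(actual: int, detected: int) -> float:
    """Smaller count divided by larger count, as a percentage (assumed formula)."""
    larger = max(actual, detected)
    if larger == 0:
        return 100.0  # nothing to find and nothing reported
    return 100.0 * min(actual, detected) / larger


ACTUAL = 10  # actual missing cases in each of the 10 test cases (table 2)
TESSERACT = [13, 10, 10, 13, 10, 10, 10, 0, 13, 13]  # detected cases, table 2
VISION = [10] * 10                                   # detected cases, table 2

tesseract_acc = [accuracy_pct(ACTUAL, d) for d in TESSERACT]
vision_acc = [accuracy_pct(ACTUAL, d) for d in VISION]

# Column averages, rounded to two decimals as in the table.
tesseract_mean = round(sum(tesseract_acc) / len(tesseract_acc), 2)
vision_mean = round(sum(vision_acc) / len(vision_acc), 2)
```

With these inputs the Tesseract average comes to 80.77 and the Google Cloud Vision average to 100, matching the last row of the table.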

Efficiency evaluation

To evaluate efficiency, we compared the RPA-based screening with the traditional manual post-review method in terms of workflow structure and time requirements (table 3). The conventional approach required six reviewers to screen 532 surgical codes and 21 cutting device codes, with each reviewer spending approximately 20 min, for a total of 120 person-minutes. In contrast, RPA completed the same task in approximately 54 min without human intervention. Therefore, the efficiency gain of the RPA-based approach is primarily attributable to reduced labour input, while also offering the advantage of simplifying the review workflow.

Table 3. Comparison between manual process and RPA process.

Work method      Total personnel   Time per person   Total process time   Human involvement
Manual process   6 people          20 min            120 person-minutes   Yes
RPA process      0 people          -                 54 min               No

RPA, robotic process automation.

Hence, applying RPA to repetitive rule-based post-review processes was highly efficient. Furthermore, full automation eliminated the need for direct reviewer involvement, thereby reducing workload and enabling personnel to focus on high-value tasks, such as audit analysis and decision-making.
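The 55% figure quoted for the efficiency gain follows directly from the values in table 3. As a worked check (the function name `reduction_pct` is hypothetical):

```python
def reduction_pct(baseline_minutes: float, automated_minutes: float) -> float:
    """Percentage reduction relative to the baseline."""
    return 100.0 * (baseline_minutes - automated_minutes) / baseline_minutes


manual_person_minutes = 6 * 20   # six reviewers at about 20 min each (table 3)
rpa_minutes = 54                 # unattended RPA run time (table 3)

saving = reduction_pct(manual_person_minutes, rpa_minutes)
print(f"{saving:.0f}% reduction")
```

The calculation compares 120 person-minutes of reviewer labour with 54 min of unattended run time, so it expresses the reduction in time attributed to the task rather than a like-for-like comparison of elapsed time.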

Discussion

We improved the operational efficiency of processing insurance claims by using RPA to automate the identification and optimisation of missing codes. By replacing the manual tasks with RPA, the repetitive and detailed administrative tasks were simplified and standardised, reducing the overall processing time by 55%. In addition, task roles and environments improved, leading to increased task satisfaction. The automation of healthcare administrative tasks provides valuable insights into the future digital transformation of healthcare.

Benefits of RPA

One of the biggest benefits of RPA is the improvement in efficiency and productivity. With RPA, more tasks can be processed with less labour and in less time, reducing the wait time.5–8 11 Another advantage is the improvement in accuracy. It effectively reduces the organisational costs by minimising human error associated with manual operations and enhancing the quality and accuracy of the data, ultimately contributing to improved productivity.9 12 13 In addition, RPA improves business processes due to increased transparency and standardisation. Because the processes performed by the RPA strictly follow the rules set by the organisation, there is little room for arbitrary human judgement, enhancing procedural fairness and transparency.8 13–15 Lastly, as RPA reduces the time spent on simple and repetitive tasks, it allows humans to focus more on unstructured tasks and those requiring creativity. At the organisational level, human resources can be allocated to high-value tasks and new business development to promote innovation and creativity.9 14 16

RPA in healthcare

Although RPA has been widely adopted in various industries, such as finance and human resources, its application in the healthcare sector remains relatively limited. This slow uptake can be attributed to the inherent complexity and sensitivity of medical data, which make the initial setup and integration of RPA particularly challenging. In addition, the healthcare domain is subject to stringent regulatory requirements, necessitating robust security measures when implementing automated solutions. To address these concerns, we adopted an enterprise-grade RPA architecture equipped with certified security features. A centralised governance module was developed, enabling real-time monitoring and auditability via a server-based aggregation system. This module incorporates data encryption mechanisms to ensure the protection of sensitive patient information.

Another key challenge lies in integrating the RPA with core clinical systems, such as EMR. Most EMR systems lack standardisation, and medical data are predominantly unstructured. Furthermore, the healthcare environment is characterised by a diverse workforce, including physicians, nurses and administrative staff, each operating under institution-specific workflows. Automation requires a high degree of customisation, and tailoring RPA to individual or institutional needs remains a complex task. To overcome this problem, we developed an OCR-based RPA system capable of extracting and processing text data from image-based EMR screens, thereby enabling effective system integration. EMR systems vary widely across institutions, and many hospitals continue to retain legacy architectures because core clinical and administrative workflows are already tightly and complexly interwoven. Our institution’s EMR is also operated on a legacy architecture; surgical and cutting-device prescription data are displayed as rendered on-screen images rather than as structured text fields. Therefore, OCR was necessary to obtain data in a format recognisable by the RPA.

The high OCR accuracy observed in this study can be attributed to the standardised format and stable rendering characteristics of the EMR prescribing screen, as well as to the fact that the recognition targets consisted of limited code-based and number-based strings, resulting in relatively low variability in both screen layout and text content. In addition, because UI updates are infrequent and the same screen templates are used consistently in our EMR environment, OCR-based RPA can operate in a stable and robust manner. However, the maintenance burden may increase when screen layouts change. To minimise the impact of such changes, it is necessary to standardise key screens and recognition regions rather than attempting comprehensive full-screen recognition. Moreover, as changes in rendering methods or the introduction of new pop-up windows may affect OCR performance, complementary safeguards are required, such as header-based or label-based region detection, code-format rule validation (pattern mismatch detection), and exception-handling strategies (eg, retry logic or prompts for manual review when exceptions occur).
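Two of the safeguards named above, code-format validation and retry with fallback to manual review, can be combined in a small routine. The sketch below is illustrative: the code format (one or two capital letters followed by four digits) is an assumed placeholder rather than the actual EDI code format, and `CODE_PATTERN`, `is_valid_code` and `read_code_with_retry` are hypothetical names.

```python
import re
from typing import Callable, Optional

# Hypothetical code format: one or two capital letters followed by four digits.
CODE_PATTERN = re.compile(r"[A-Z]{1,2}\d{4}")


def is_valid_code(text: str) -> bool:
    """True if the OCR output matches the expected code format exactly."""
    return CODE_PATTERN.fullmatch(text.strip()) is not None


def read_code_with_retry(run_ocr: Callable[[], str], max_attempts: int = 3) -> Optional[str]:
    """Return a well-formed code, or None to signal that manual review is needed."""
    for _ in range(max_attempts):
        candidate = run_ocr().strip()
        if is_valid_code(candidate):
            return candidate
    return None
```

A pattern mismatch such as the letter `O` read in place of the digit `0` is rejected by the format check, which triggers another capture; if every attempt fails, the `None` result can be routed to a reviewer instead of being compared against the reference table.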

From a long-term maintenance perspective, the automation logic remains stably fixed, while EDI mappings are separated and managed independently in reference tables. This enables rapid response to policy updates or code additions/deletions by updating only the tables, without modifying the core logic.
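The separation of logic from data can be illustrated by loading the mapping from a table file at start-up. This is a minimal sketch assuming a two-column CSV layout; the column names `surgical_code` and `device_code`, the sample rows and the function name `load_reference_table` are hypothetical.

```python
import csv
import io
from collections import defaultdict


def load_reference_table(csv_text: str) -> dict[str, set[str]]:
    """Build surgical code -> accepted device codes from CSV rows."""
    table: dict[str, set[str]] = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        table[row["surgical_code"].strip()].add(row["device_code"].strip())
    return dict(table)


# Hypothetical sample; a policy update would change only rows like these.
SAMPLE = """surgical_code,device_code
S0001,D001
S0001,D002
S0002,D003
"""

reference_table = load_reference_table(SAMPLE)
```

Because the comparison code only ever sees the loaded mapping, adding or deleting a code is a change to the table rows and leaves the comparison logic untouched.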

This process, which combines rule-based reconciliation with OCR recognition patterns, can be reused and scaled not only for other claim items but also for tasks in departments beyond insurance review. A bottom-up approach to automation driven by the direct participation of frontline staff is a critical success factor, as it enables end users to rapidly automate recurring review tasks while allowing maintenance to be led by the task owner. In the absence of a thorough process analysis by end users, the effectiveness of RPA is significantly reduced; conversely, user-driven automated solutions tend to align more closely with real-world operational needs and are positively associated with improved employee well-being and resource utilisation.

During the development process, we systematically analysed the workflows of different staff members performing the same task and established standardised procedures. This allowed us to define the automation targets and understand the process-specific characteristics. Continuous communication and collaboration between field personnel and the RPA development team were vital in ensuring successful implementation.

Our findings suggest transferable design patterns for healthcare RPA: feasibility-based task selection, logic–data separation via reference tables, screen-level OCR integration and robustness measures (region-of-interest or anchor-based extraction, format validation, rule-based reconciliation and monitoring for drift). However, key context-specific dependencies include the EMR screen layout, local coding and mapping rules, and the post-review workflow timing, which may vary across institutions. Accordingly, generalisability should be interpreted in the context of local insurance rules and EMR interface characteristics.

In conclusion, a user-centred approach that begins with process standardisation and emphasises empathy for the roles and perspectives of the personnel involved is a key determinant of successful RPA integration in healthcare settings.

Supplementary material

online supplemental table 1
bmjhci-33-1-s001.docx (20.6KB, docx)
DOI: 10.1136/bmjhci-2025-101821
online supplemental figure 1
bmjhci-33-1-s002.docx (112.9KB, docx)
DOI: 10.1136/bmjhci-2025-101821

Footnotes

Funding: The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Provenance and peer review: Not commissioned; externally peer reviewed.

Patient consent for publication: Not applicable.

Ethics approval: Not applicable.

Data availability statement

Data are available upon reasonable request.

References

  • 1. World Health Organization. Republic of Korea Health System Review. Vol. 5. Manila: WHO Regional Office for the Western Pacific; 2015.
  • 2. Syed R, Suriadi S, Adams M, et al. Robotic Process Automation: Contemporary themes and challenges. Computers in Industry. 2020;115:103162. doi: 10.1016/j.compind.2019.103162.
  • 3. Aguirre S, Rodriguez A. Applied Computer Sciences in Engineering: 4th Workshop on Engineering Applications, WEA 2017. Cartagena, Colombia: Springer; 2017. Automation of a business process using robotic process automation (RPA): a case study; pp. 65–71.
  • 4. Fernandez D, Aman A. Impacts of Robotic Process Automation on Global Accounting Services. AJAG. 2018;9:123–32. doi: 10.17576/AJAG-2018-09-11.
  • 5. Ortiz FCM, Costa CJ. RPA in finance: supporting portfolio management: applying a software robot in a portfolio optimization problem. 2020 15th Iberian Conference on Information Systems and Technologies (CISTI); Sevilla, Spain. 2020. pp. 1–6.
  • 6. Zhang C, Issa H, Rozario A, et al. Robotic Process Automation (RPA) Implementation Case Studies in Accounting: A Beginning to End Perspective. Accounting Horizons. 2023;37:193–217. doi: 10.2308/HORIZONS-2021-084.
  • 7. Huang F, Vasarhelyi MA. Applying robotic process automation (RPA) in auditing: A framework. International Journal of Accounting Information Systems. 2019;35:100433. doi: 10.1016/j.accinf.2019.100433.
  • 8. Meironke A, Kuehnel S. How to measure RPA’s benefits? A review on metrics, indicators, and evaluation methods of RPA benefit assessment; 2022. pp. 245–57.
  • 9. Wewerka J, Reichert M. Towards quantifying the effects of robotic process automation. 2020 IEEE 24th International Enterprise Distributed Object Computing Workshop (EDOCW); Eindhoven, Netherlands. 2020. pp. 29–34.
  • 10. William W, William L. Improving corporate secretary productivity using robotic process automation. 2019 International Conference on Technologies and Applications of Artificial Intelligence (TAAI); Kaohsiung, Taiwan. 2019. pp. 1–5.
  • 11. Šimek D, Šperka R. How Robot/human Orchestration Can Help in an HR Department: A Case Study From a Pilot Implementation. Organizacija. 2019;52:204–17. doi: 10.2478/orga-2019-0013.
  • 12. Bruno J, Johnson S, Hesley J. Robotic disruption and the new healthcare revenue cycle. Healthc Financ Manage. 2017;71:1–4.
  • 13. Schmitz M, Stummer C, Gerke M. Future telco: successful positioning of network operators in the digital age. Cham: Springer; 2019. Smart automation as enabler of digitalization? A review of RPA/AI potential and barriers to its realization; pp. 349–58.
  • 14. Lacity M, Willcocks L, Craig A. Robotizing global financial shared services at Royal DSM. The Outsourcing Unit Working Research Paper Series. 2016;26:1–27.
  • 15. Dey S, Das A. Robotic process automation: assessment of the technology for transformation of business processes. IJBPIM. 2019;9:220. doi: 10.1504/IJBPIM.2019.100927.
  • 16. Madakam S, Holmukhe RM, Jaiswal DK. The Future Digital Work Force: Robotic Process Automation (RPA). JISTEM USP. 2019;16:e201916001. doi: 10.4301/S1807-1775201916001.
