
2022 May 27;3(6):100495. doi: 10.1016/j.patter.2022.100495


© 2022 The Authors

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).


Azure OCR pipeline

Generalized architecture of the automated Azure cloud computing pipeline, hosted by Microsoft, that was used for the WISE OCR and transcription experiment. Handwritten meteorological tables in Portable Document Format (PDF) were transferred to Microsoft and loaded into Azure Data Lake Storage (ADLSv2), where Function App code forwarded them for text extraction. The Read API of Azure Cognitive Services was used to extract handwritten digits from each PDF, in conjunction with custom machine learning models deployed on the Azure Kubernetes Service via the Azure Container Registry. The custom models removed noise from the digital surrogates and located the cells that contained digits. The components extracted from each page were processed further, and the final OCR output was stored in the Azure SQL database (Result Store), where it was accessed, analyzed, and visualized using Power BI. In addition, capabilities for inter-service communication were held securely in Key Vault.
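The data flow in this caption can be illustrated with a small local sketch. It does not call any Azure service and is not the authors' implementation: a dictionary stands in for the data lake, an in-memory SQLite database stands in for the Result Store, and the function names (`denoise_and_locate`, `read_digits`, `run_pipeline`), the grid-of-strings page representation, and the sample values are all hypothetical.

```python
import sqlite3
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a scanned page: a grid of raw cell strings.
Page = List[List[str]]


def denoise_and_locate(page: Page) -> List[Tuple[int, int, str]]:
    """Stand-in for the custom models: strip noise, keep cells with digits."""
    located = []
    for r, row in enumerate(page):
        for c, raw in enumerate(row):
            cleaned = "".join(ch for ch in raw if ch.isdigit() or ch == ".")
            if any(ch.isdigit() for ch in cleaned):
                located.append((r, c, cleaned))
    return located


def read_digits(cell_text: str) -> Optional[float]:
    """Stand-in for the text-extraction step; returns None if unreadable."""
    try:
        return float(cell_text)
    except ValueError:
        return None


def run_pipeline(lake: Dict[str, Page], store: sqlite3.Connection) -> None:
    """Forward every page in the 'lake' through both steps and save results."""
    store.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(source TEXT, row INTEGER, col INTEGER, value REAL)"
    )
    for name, page in lake.items():
        for r, c, text in denoise_and_locate(page):
            value = read_digits(text)
            if value is not None:
                store.execute(
                    "INSERT INTO results VALUES (?, ?, ?, ?)",
                    (name, r, c, value),
                )
    store.commit()


# Made-up sample page with noise marks around some values.
lake = {"page1.pdf": [["12.5~", " ", "#7"], ["", "x", "30"]]}
store = sqlite3.connect(":memory:")
run_pipeline(lake, store)
rows = store.execute(
    "SELECT source, row, col, value FROM results ORDER BY row, col"
).fetchall()
```

In this sketch, `rows` plays the role of the data a reporting tool would later query from the result store. Keeping each stage as a separate function mirrors the separation of services in the figure, so any one stand-in could be swapped for a real service call without changing the others.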