2020 Apr 28;1(2):100019. doi: 10.1016/j.patter.2020.100019

© 2020 The Authors

This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Analysis of Labeling Function Types and Time-versus-Performance Tradeoffs

(A and B) Labeling function (LF) types (A); labeling time for datasets describing chest (CXR) and extremity (EXR) radiographs, head CT (HCT), and electroencephalography (EEG) (B). Labeling times are presented for the small development set (Dev) of several hundred examples, the Large fully supervised dataset (i.e., physician-years of labeling time), and the Medium fully supervised dataset (i.e., physician-months of labeling time). See Table 1 for additional details on dataset sizes. Hand-labeling times were estimated using median read times of 1 min 38 s per CXR, 1 min 26 s per EXR, 6 min 38 s per HCT, and 12 min 30 s per EEG drawn from reported values in the literature.³⁹,⁴⁰ These estimates are conservative because they assume that only a single clinician contributed to reading each case.

(C) Labeling time versus performance in the context of dataset size, the task, and the type of supervision. Cross-modal data programming (DP) often yields models similar in performance to those trained on Large hand-labeled datasets (FS) but using a fraction of the labeling time.
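
The hand-labeling estimates in panel B amount to multiplying a median read time by a number of examples. The following is a minimal sketch of that arithmetic, not the authors' code. The read times come from the caption; the function name `hand_labeling_hours` and the example count `HYPOTHETICAL_N` are illustrative assumptions, since the actual dataset sizes are given in Table 1 and not reproduced here.

```python
# Median read times per study in seconds, taken from the caption.
MEDIAN_READ_SECONDS = {
    "CXR": 1 * 60 + 38,   # chest radiograph: 1 min 38 s
    "EXR": 1 * 60 + 26,   # extremity radiograph: 1 min 26 s
    "HCT": 6 * 60 + 38,   # head CT: 6 min 38 s
    "EEG": 12 * 60 + 30,  # electroencephalography: 12 min 30 s
}


def hand_labeling_hours(modality: str, n_examples: int) -> float:
    """Estimate hand-labeling time in hours, assuming one reader per case.

    Hypothetical helper: read time per example times number of examples.
    """
    if n_examples < 0:
        raise ValueError("n_examples must be non-negative")
    return MEDIAN_READ_SECONDS[modality] * n_examples / 3600.0


# Hypothetical example count, for illustration only (see Table 1 of the
# article for the real dataset sizes).
HYPOTHETICAL_N = 1000

for modality in MEDIAN_READ_SECONDS:
    hours = hand_labeling_hours(modality, HYPOTHETICAL_N)
    print(f"{modality}: {hours:.1f} h for {HYPOTHETICAL_N} examples")
```

Because the estimate assumes a single reader per case, any protocol in which more than one clinician reads each case would scale the result up accordingly, which is why the caption describes the figures as conservative.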