2020 Nov 27;8(11):e22508. doi: 10.2196/22508

Table 2.

Data sets used in multi-task learning.

| Data set | Task | Domain | Size | Example |
| --- | --- | --- | --- | --- |
| STS-B^a | Sentence pair similarity | General | 8600 | Sentence 1: “A young child is riding a horse”; Sentence 2: “A child is riding a horse”; Similarity: 4.75 |
| RQE^b | Sentence pair classification | Biomedical | 8900 | Sentence 1: “Doctor X thinks he is probably just a normal 18 month old but would like to know if there are a certain number of respiratory infections that are considered normal for that age”; Sentence 2: “Probably a normal 18 month old but how many respiratory infections are normal”; Ground truth: entailment |
| MedNLI^c | Sentence pair classification | Clinical | 14,000 | Sentence 1: “Labs were notable for Cr 1.7 (baseline 0.5 per old records) and lactate 2.4”; Sentence 2: “Patient has normal Cr”; Ground truth: contradiction |
| QQP^d | Sentence pair classification | General | 400,000 | Sentence 1: “Why do rockets look white?”; Sentence 2: “Why are rockets and boosters painted white?”; Ground truth: 1 |
| Topic | Sentence classification | Clinical | 1,300,000 | Sentence: “Negative for difficulty urinating, pain with urination, and frequent urination”; Ground truth: SIGNORSYMPTOM |
| MedNER^e | Token-wise classification | Clinical | 15,000 | Sentence: “he developed respiratory distress on the AM^f of admission, cough day PTA^g, CXR^h with B/L^i LL^j PNA^k, started ciprofloxacin and levofloxacin”; Ground truth: ciprofloxacin [DRUG] levofloxacin [DRUG] |

^a STS-B: semantic textual similarity benchmark.

^b RQE: Recognizing Question Entailment.

^c MedNLI: natural language inference data set for the clinical domain.

^d QQP: Quora Question Pairs.

^e MedNER: medication named entity recognition.

^f AM: morning.

^g PTA: prior to admission.

^h CXR: chest x-ray.

^i B/L: bilateral.

^j LL: left lower.

^k PNA: pneumonia.
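
The rows of Table 2 can also be read as a small task registry. The following Python sketch is illustrative only: the names `TaskSpec`, `TASKS`, `group_by_task`, and `total_size_by_domain` are hypothetical and do not come from the article, and the sizes are the rounded values listed in the table. It shows how the data sets group by task type and how much data each domain contributes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    """One row of Table 2 (hypothetical container, not from the article)."""
    name: str
    task: str
    domain: str
    size: int  # rounded size as listed in Table 2


# Values copied from Table 2; sizes are the rounded figures given there.
TASKS = [
    TaskSpec("STS-B", "sentence pair similarity", "general", 8_600),
    TaskSpec("RQE", "sentence pair classification", "biomedical", 8_900),
    TaskSpec("MedNLI", "sentence pair classification", "clinical", 14_000),
    TaskSpec("QQP", "sentence pair classification", "general", 400_000),
    TaskSpec("Topic", "sentence classification", "clinical", 1_300_000),
    TaskSpec("MedNER", "token-wise classification", "clinical", 15_000),
]


def group_by_task(specs):
    """Map each task type to the data set names that use it, in table order."""
    groups = defaultdict(list)
    for spec in specs:
        groups[spec.task].append(spec.name)
    return dict(groups)


def total_size_by_domain(specs):
    """Sum the listed data set sizes per domain."""
    totals = defaultdict(int)
    for spec in specs:
        totals[spec.domain] += spec.size
    return dict(totals)


if __name__ == "__main__":
    for task, names in group_by_task(TASKS).items():
        print(f"{task}: {', '.join(names)}")
    for domain, total in total_size_by_domain(TASKS).items():
        print(f"{domain}: {total}")
```

Grouping by task type makes it visible that three of the six data sets share the sentence pair classification format, while the per-domain totals show that the clinical data sets, dominated by Topic, account for most of the listed examples.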