2017 Jul 31;7:6918. doi: 10.1038/s41598-017-07111-0

Table 2.

Informative features associated with HNSCC by HPV status as discovered by TEPAPA.

Log (OR) P N Text feature Type EMR Source Interpretation Crossref.
Informative features associated with HPV-related HNSCC
3.50 3.0 × 10⁻⁶ 25 “HPV (studies|genotypes|status):? P16 immunohistochemistry:? Positive” R Pathology HPV status (Self-referent) (S3c.1)
3.89 6.2 × 10⁻⁶ 20 “HPV (positive|genotypes: Positive|associated squamous cell carcinoma|related).” R Pathology HPV status (Self-referent) (S3c.2)
3.29 2.0 × 10⁻⁵ 23 “No FDG avid? pulmonary (nodules|nodule) or pleural” R PET (Lack of) metastasis to the lung (S4c.1)
3.14 5.6 × 10⁻⁵ 21 “HPV related” N Pathology HPV status (Self-referent) (S3b.6)
2.06 0.00094 24 “irradiation (and|with) (or without|concurrent) chemotherapy” R MDT Management (S2c.7)
2.76 0.0093 9 “oropharyngectomy:” N Pathology Management, site of primary tumor (S3a.22)
3.23 0.0011 13 “SCC of the right (tonsil|base of tongue|glossotonsillar sulcus) -” R MDT Site of primary tumor (S2d.4)
2.68 0.0015 16 “SCC of the (right|left)? base of tongue” R MDT Site of primary tumor (S2d.5)
3.02 0.0023 11 “M0” N MDT Stage (S2a.3)
2.89 0.0047 10 “non-keratinising” N Pathology Pathology feature (S3a.16)
2.77 0.0092 9 “p16? positive,? HPV? positive” R MDT HPV status (Self-referent) (S2c.17)
Informative features associated with HPV-unrelated HNSCC
−3.54 0.00035 8 “for decalcification” N Pathology Pathology feature (S3e.2)
−2.91 0.00089 10 “a (locally|locoregionally)? (p16 negative|advanced) SCC” R MDT HPV status and pathology feature (S2h.3)
−3.17 0.0031 6 “SCC of the supraglottic? (lower lip|larynx).” R MDT Site of primary tumor (S2g.7)
−2.96 0.0086 5 “likely to? require adjuvant radiation therapy” R MDT Management (S2g.10)
−3.35 0.0011 7 “supportive care” N MDT Management (S2f.3)
−2.59 0.0058 8 “differentiated, keratinising squamous cell carcinoma” N Pathology Pathology feature (S3e.23)
−2.59 0.0058 8 “well differentiated” N Pathology Pathology feature (S3e.26)

Note: The Type column indicates the kind of text feature (N: n-gram fragment; R: regular expression). The N column gives the number of documents containing the text feature. Abbreviations: Log (OR): log odds ratio; MDT: multidisciplinary team meeting.
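
To illustrate how the Type, N and Log (OR) columns relate to one another, the following is a minimal Python sketch and not the TEPAPA implementation. The toy reports, the names `LABELLED_DOCS`, `contingency` and `log_odds_ratio`, the hand translation of one regular-expression feature into Python syntax, the use of the natural logarithm, and the 0.5 continuity correction are all assumptions made for illustration; none of them is specified in this table.

```python
import math
import re

# Hypothetical toy corpus: (report text, HPV-related?) pairs, invented for illustration.
LABELLED_DOCS = [
    ("SCC of the left base of tongue, HPV related", True),
    ("SCC of the base of tongue, p16 positive", True),
    ("SCC of the right tonsil", True),
    ("SCC of the larynx, well differentiated", False),
    ("SCC of the lower lip, for supportive care", False),
]

# Type N: an n-gram fragment is matched literally.
NGRAM_FEATURE = re.escape("HPV related")
# Type R: hand translation of "SCC of the (right|left)? base of tongue",
# assuming the optional group stands for one whitespace-delimited token.
REGEX_FEATURE = r"SCC of the (?:(?:right|left) )?base of tongue"


def contingency(pattern, labelled_docs):
    """Return (a, b, c, d): feature present/absent crossed with HPV-related yes/no."""
    rx = re.compile(pattern)
    a = b = c = d = 0
    for text, hpv_related in labelled_docs:
        present = rx.search(text) is not None
        if present and hpv_related:
            a += 1  # feature present, HPV-related
        elif present:
            b += 1  # feature present, HPV-unrelated
        elif hpv_related:
            c += 1  # feature absent, HPV-related
        else:
            d += 1  # feature absent, HPV-unrelated
    return a, b, c, d


def log_odds_ratio(a, b, c, d, correction=0.5):
    """Natural-log odds ratio; `correction` is added to every cell (assumed, avoids zero cells)."""
    a, b, c, d = (x + correction for x in (a, b, c, d))
    return math.log((a * d) / (b * c))


a, b, c, d = contingency(REGEX_FEATURE, LABELLED_DOCS)
n_docs = a + b  # analogous to the "N" column: documents containing the feature
print(n_docs, round(log_odds_ratio(a, b, c, d), 2))
```

Under this sign convention, a positive value means the feature occurs more often in HPV-related cases and a negative value means it occurs more often in HPV-unrelated cases, which matches the two halves of the table. The P values would come from a separate significance test that is not sketched here.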