. 2015 Nov 5;2015:953–962.

Table 2.

Example of methods used for the extraction of cancer notifications data.

| Method | Notifications Data | Example |
| --- | --- | --- |
| Queensland Cancer Registry coding rules (including special cases) | Histological Type | Select the highest morphology if more than one morphology is stated. |
| Queensland Cancer Registry coding rules (including special cases) | Histological Grade | Assign the highest grade or differentiation code. |
| Queensland Cancer Registry coding rules (including special cases) | Primary Site | Code all leukaemia except myeloid sarcoma (M-99303) to C42.1 (bone marrow). |
| Domain knowledge | Primary Site | List of ICD-O morphology-to-site mappings that are strictly one-to-one. |
| SNOMED CT property access | Histological Type | Restrict SNOMED CT concepts to those with a ‘morphologic abnormality’ semantic category and those that have alternate terms matching the regular expression “M-[0-9]{5}”. |
| SNOMED CT property access | Primary Site | Restrict SNOMED CT concepts to those with a ‘body structure’ semantic category. |
| SNOMED CT to ICD-O topography cross-maps | Primary Site | Map SNOMED CT ‘body structure’ concepts to ICD-O topography codes. |
| SNOMED CT subsumption querying | Histological Type | Candidate ‘leukaemia’ concepts are found by testing subsumption by the ‘128931003 \| Leukemia – category’ concept. |
| SNOMED CT concept relationship querying | Primary Site | The values of a concept’s “Procedure site – Direct” and “Finding site” relationships are used as candidate sites. |
| SNOMED CT querying using ad-hoc term expansion | Histological Type | The preferred terms of the histological type and grade are combined to search for a more specific concept. For example, the query “Follicular Lymphoma” + “Grade 3” returns the histological type M-96983, which is “Follicular Lymphoma, Grade 3”. |
| Relation extraction | Basis of Diagnosis | Identify multiple concepts or terms within a search scope, such as ‘metastasis’ and ‘lymph nodes’ within a sentence. |
| Keyword/phrase spotting | Histological Grade | Detect keywords or phrases that could not be mapped, or were mapped unreliably, to SNOMED CT, for example “poorly to moderately differentiated”. |
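
Three of the rules in Table 2 can be illustrated with a minimal Python sketch: the morphology-code regular expression, the "highest morphology" rule, and the leukaemia primary-site rule. This is an illustration under stated assumptions, not the system's actual implementation. The function names, the `is_leukaemia` flag (assumed to come from an upstream subsumption test) and the reading of "highest" as the numerically largest code are all hypothetical.

```python
import re

# Pattern from Table 2: an ICD-O morphology code written as "M-" plus five digits.
MORPHOLOGY_CODE = re.compile(r"M-[0-9]{5}")

MYELOID_SARCOMA = "M-99303"  # the exception named in the coding rule
BONE_MARROW = "C42.1"        # ICD-O topography code for bone marrow


def morphology_codes(alternate_terms):
    """Collect every morphology code found in a concept's alternate terms."""
    codes = []
    for term in alternate_terms:
        codes.extend(MORPHOLOGY_CODE.findall(term))
    return codes


def highest_morphology(codes):
    """Return the numerically largest code (assumed meaning of 'highest')."""
    if not codes:
        return None
    return max(codes, key=lambda code: int(code[2:]))


def primary_site(morphology, is_leukaemia, candidate_site):
    """Code leukaemia to bone marrow unless it is myeloid sarcoma.

    `is_leukaemia` is a hypothetical flag supplied by the caller.
    """
    if is_leukaemia and morphology != MYELOID_SARCOMA:
        return BONE_MARROW
    return candidate_site


if __name__ == "__main__":
    # Illustrative alternate terms, not taken from real notification data.
    terms = ["Follicular lymphoma, grade 3 (M-96983)",
             "Malignant lymphoma, NOS (M-95903)"]
    codes = morphology_codes(terms)
    print(codes)
    print(highest_morphology(codes))
    print(primary_site("M-98003", True, "C77.9"))
    print(primary_site(MYELOID_SARCOMA, True, "C49.9"))
```

Keeping each rule in its own small function reflects the table's layout, in which every method contributes an independent candidate value for one notification item.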