2020 May 7;20:108. doi: 10.1186/s12874-020-00977-1

Table 1.

Two sets of variables from SEER’s research dataset

Feature set    Variables
small-set      AGE_DX, BEHO3V, DX_CONF, GRADE, LATERAL, PRIMSITE, SEQ_NUM, SEX
large-set      AGE_DX, BEHO3V, CS1SITE, CS2SITE, CS3SITE, CS4SITE, CS5SITE, CS6SITE, CS7SITE, CS15SITE, CSEXTEN, CSLYMPHN, CSMETSDX, CSMETSDXBR_PUB, CSMETSDXB_PUB, CSMETSDXLIV_PUB, CSMETSDXLUNG_PUB, CSMTEVAL, CSRGEVAL, CSTSEVAL, CSVCURRENT, CSVFIRST, DX_CONF, GRADE, HISTO3V, LATERAL, MAR_STAT, NHIADE, NO_SURG, PRIMSITE, RACE1V, REC_NO, REG, REPT_SRC, SEQ_NUM, SEX, SURGSITF, TYPE_FU, YEAR_DX, YR_BRTH

The small-set contains variables with few levels, while the large-set contains a large number of variables, including a few with a large number of levels. In the small-set, the variable PRIMSITE is only considered for the BREAST dataset, as it has a large number of levels for LYMYLEUK and RESPIR. Some variables are not present in the LYMYLEUK dataset, and some are not present in the RESPIR dataset.
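As an illustration of how the small-set rule in the note could be applied per dataset, here is a minimal sketch. The function name `small_set_for` and the optional `available_columns` argument are hypothetical and not taken from the paper. The sketch only assumes that PRIMSITE is kept for BREAST and dropped otherwise, and that variables missing from a given dataset are filtered out by checking the columns actually present.

```python
# Small-set variable names as listed in Table 1.
SMALL_SET = [
    "AGE_DX", "BEHO3V", "DX_CONF", "GRADE",
    "LATERAL", "PRIMSITE", "SEQ_NUM", "SEX",
]


def small_set_for(dataset, available_columns=None):
    """Return the small-set variables to use for one dataset.

    Hypothetical helper: `dataset` is one of "BREAST", "LYMYLEUK" or
    "RESPIR". PRIMSITE is kept only for BREAST, following the table note.
    If `available_columns` is given, variables absent from it are dropped,
    which covers variables that a dataset does not contain without
    hard-coding which ones those are.
    """
    variables = [
        v for v in SMALL_SET
        if v != "PRIMSITE" or dataset == "BREAST"
    ]
    if available_columns is not None:
        present = set(available_columns)
        variables = [v for v in variables if v in present]
    return variables
```

The same presence check could be reused for the large-set, where the table marks some variables as missing from LYMYLEUK or RESPIR.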