2020 Jun 8;8(6):e17364. doi: 10.2196/17364

Table 1.

Descriptions and details of the variables included in the machine learning algorithms.

Variable | Type of variable | Description
Main residence | Categorical variable^a | Urban area (0), rural area (1)
Menopausal status | Categorical variable | Premenopause (0), postmenopause (1)
Age in years | Discrete variable | Age at breast cancer diagnosis or screening
BMI (kg/m²) | Continuous variable | BMI at breast cancer diagnosis or screening
Age of menarche | Discrete variable | Age at first menstruation
Duration of reproductive life span | Discrete variable | Premenopausal women: current age – age of menarche; postmenopausal women: menopause age – age of menarche
Pregnancy history | Categorical variable | No (0), yes (1)
Number of live births | Discrete variable | A live birth is defined as the birth of a child who showed any sign of life
Age at first birth | Discrete variable | Age of the woman at the birth of her first child (for women with no live birth, this value equals 99)
Family history of breast cancer | Categorical variable | First-degree or second-degree female relatives had breast cancer: no (0), yes (1)
Case-control status (outcome variable) | Categorical variable | Control (0), case (1)

^a Categorical variables were converted into one-hot encoding before being provided to machine learning algorithms.
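
The preprocessing that Table 1 describes can be illustrated with a minimal Python sketch. It is not the authors' implementation. The field names (`main_residence`, `menopause_age`, and so on), the feature order, and the use of age at diagnosis or screening as the "current age" are assumptions made for illustration only. The sketch derives the duration of reproductive life span, applies the value 99 for women with no live birth, and one-hot encodes the binary categorical predictors.

```python
from typing import Dict, List, Optional

# Hypothetical field names: Table 1 lists the variables, not the column names used.
CATEGORICAL_FIELDS = ("main_residence", "menopausal_status",
                      "pregnancy_history", "family_history")
NO_LIVE_BIRTH_AGE = 99  # value Table 1 assigns to women with no live birth


def reproductive_span(age: int, age_menarche: int,
                      menopause_age: Optional[int] = None) -> int:
    """Duration of reproductive life span as defined in Table 1.

    Premenopausal (menopause_age is None): age - age of menarche.
    Postmenopausal: menopause age - age of menarche.
    """
    end_age = age if menopause_age is None else menopause_age
    return end_age - age_menarche


def one_hot(code: int, n_levels: int = 2) -> List[int]:
    """Return a vector of length n_levels with a single 1 at index `code`."""
    if not 0 <= code < n_levels:
        raise ValueError(f"code {code} outside 0..{n_levels - 1}")
    return [int(i == code) for i in range(n_levels)]


def encode(record: Dict) -> List[float]:
    """Build one feature row; the outcome variable is deliberately left out."""
    age_first_birth = record.get("age_first_birth")
    if record["live_births"] == 0 or age_first_birth is None:
        age_first_birth = NO_LIVE_BIRTH_AGE

    # Discrete and continuous variables are passed through unchanged.
    features: List[float] = [
        record["age"],
        record["bmi"],
        record["age_menarche"],
        reproductive_span(record["age"], record["age_menarche"],
                          record.get("menopause_age")),
        record["live_births"],
        age_first_birth,
    ]
    # Each 0/1 categorical variable becomes a two-element one-hot vector.
    for field in CATEGORICAL_FIELDS:
        features.extend(one_hot(record[field]))
    return features


# Made-up record for illustration; not data from the study.
example = {
    "main_residence": 1, "menopausal_status": 0, "age": 45, "bmi": 23.4,
    "age_menarche": 13, "menopause_age": None, "pregnancy_history": 1,
    "live_births": 0, "age_first_birth": None, "family_history": 0,
}
print(encode(example))
```

Because every categorical variable in Table 1 has only two levels, one-hot encoding yields two complementary columns per variable. A single 0/1 column would carry the same information, so the choice mainly matters for algorithms that expect the expanded form.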