Table 2.
Columns provided in the metadata table ptbxl_database.csv.
| Section | Variable | Data Type | Description |
|---|---|---|---|
| Identifiers | ecg_id | integer | unique ECG identifier |
| | patient_id | integer | unique patient identifier |
| | filename_lr | string | path to waveform data (100 Hz) |
| | filename_hr | string | path to waveform data (500 Hz) |
| General Metadata | age | integer | age at recording in years (see Fig. 3 left) |
| | sex | categorical | sex (male 0, female 1) |
| | height | integer | height in centimeters (see Fig. 3 right) |
| | weight | integer | weight in kilograms (see Fig. 3 middle) |
| | nurse | categorical | involved nurse (pseudonymized) |
| | site | categorical | recording site (pseudonymized) |
| | device | categorical | recording device |
| | recording_date | datetime | ECG recording date and time |
| ECG Statements | report | string | ECG report from diagnosing cardiologist |
| | scp_codes | dictionary | SCP ECG statements (see Tables 6, 7 and 8) |
| | heart_axis | categorical | heart’s electrical axis (see Table 10) |
| | infarction_stadium1 | categorical | infarction stadium (see Table 11) |
| | infarction_stadium2 | categorical | second infarction stadium (see Table 11) |
| | validated_by | categorical | validating cardiologist (pseudonymized) |
| | second_opinion | boolean | flag for second (deviating) opinion |
| | initial_autogenerated_report | boolean | initial autogenerated report by ECG device |
| | validated_by_human | boolean | validated by human |
| Signal Metadata | baseline_drift | string | baseline drift or jump present |
| | static_noise | string | electric hum/static noise present |
| | burst_noise | string | burst noise |
| | electrodes_problems | string | electrodes problems |
| | extra_beats | string | extra beats |
| | pacemaker | string | pacemaker |
| Cross-validation Folds | strat_fold | integer | suggested stratified folds |
Each ECG is identified by a unique ID (ecg_id) and comes with a number of ECG statements (scp_codes). These statements can be used to train a multi-label classifier, which can then be evaluated using the proposed fold assignments (strat_fold).
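The workflow described above can be illustrated with a minimal Python sketch. It is not the authors' reference code. It assumes that the scp_codes dictionary is serialized in the CSV as a Python literal string, that every statement listed for a record counts as a positive label regardless of its associated value, and that one fold (here fold 10, chosen arbitrarily) is held out for testing. The inline sample rows and the helper names `load_metadata`, `multi_hot` and `split_by_fold` are hypothetical and only mimic the layout of ptbxl_database.csv.

```python
import ast
import csv
import io

# Made-up rows in the column layout of ptbxl_database.csv (illustrative only).
SAMPLE_CSV = """ecg_id,patient_id,scp_codes,strat_fold
1,101,"{'NORM': 100.0, 'SR': 0.0}",3
2,102,"{'IMI': 50.0}",10
3,103,"{'NORM': 80.0}",9
"""


def load_metadata(handle):
    """Read metadata rows keyed by ecg_id, decoding scp_codes and strat_fold."""
    records = {}
    for row in csv.DictReader(handle):
        # Assumption: scp_codes is stored as a Python dict literal.
        row["scp_codes"] = ast.literal_eval(row["scp_codes"])
        row["strat_fold"] = int(row["strat_fold"])
        records[int(row["ecg_id"])] = row
    return records


def multi_hot(records):
    """Build one binary target vector per ECG over all observed statements."""
    labels = sorted({code for r in records.values() for code in r["scp_codes"]})
    targets = {
        ecg_id: [int(code in r["scp_codes"]) for code in labels]
        for ecg_id, r in records.items()
    }
    return labels, targets


def split_by_fold(records, test_fold):
    """Return (train_ids, test_ids) according to the strat_fold column."""
    train_ids = [i for i, r in records.items() if r["strat_fold"] != test_fold]
    test_ids = [i for i, r in records.items() if r["strat_fold"] == test_fold]
    return train_ids, test_ids


records = load_metadata(io.StringIO(SAMPLE_CSV))
labels, targets = multi_hot(records)
train_ids, test_ids = split_by_fold(records, test_fold=10)
```

With the real file, the `io.StringIO` sample would be replaced by an open file handle on ptbxl_database.csv. The multi-hot targets can then be paired with the waveforms referenced by filename_lr or filename_hr.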