1: procedure Sidification(data, δ = 1)
2:   Translate each continuous feature so that it is positive and all features share the same maximum value (the minimum value can differ across variables)
3:   Order the variables by range, with the largest range first. This applies only to continuous variables (factors are placed randomly at the end)
4:   Convert any categorical variable with more than two categories to a set of zero-one dummy variables, one for each category
5:   Add δ to the first variable
6:   for each input variable, excluding the first do
7:     Add δ plus the maximum of the previous input variable to the current variable
8:   end for
9:   The input variables have now been staggered to form the main SID features
10:  for all pairs of main SID features (from Line 9) do
11:    if a pair consists of two dummy variables then
12:      Interaction is a four-level factor, with one level for each combination of the two dummy variables
13:    else
14:      Create the interaction variable by multiplying the two features
15:    end if
16:  end for
17:  This yields the SID interaction features
18: end procedure
19: return the sidified data
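The procedure above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the reference implementation. The function name `sidify`, the column-dictionary input, and the `seed` argument are hypothetical. Two details are not fixed by the pseudocode and are assumed here: the common maximum is taken to be the largest range plus 1 (which keeps every translated value positive), and a two-level factor is coded as a single 0/1 indicator.

```python
import itertools
import random


def sidify(columns, delta=1.0, seed=0):
    """Return (main SID features, SID interaction features) as dicts of lists."""
    # Columns whose values are all numeric are treated as continuous.
    cont = {k: [float(x) for x in v] for k, v in columns.items()
            if all(isinstance(x, (int, float)) for x in v)}
    cat = {k: v for k, v in columns.items() if k not in cont}

    # Line 2: shift each continuous feature so all share one maximum.
    # Assumption: common maximum = largest range + 1, so values are >= 1.
    ranges = {k: max(v) - min(v) for k, v in cont.items()}
    common_max = max(ranges.values(), default=0.0) + 1.0
    shifted = {k: [x - max(v) + common_max for x in v] for k, v in cont.items()}

    # Line 3: continuous variables by decreasing range, factors shuffled last.
    order = sorted(shifted, key=lambda k: -ranges[k])
    cat_names = sorted(cat)
    random.Random(seed).shuffle(cat_names)

    # Line 4: one 0/1 dummy per category for factors with > 2 categories.
    # Assumption: a two-level factor becomes a single 0/1 indicator.
    dummies = {}
    for name in cat_names:
        levels = sorted(set(cat[name]))
        keep = levels if len(levels) > 2 else levels[1:]
        for lev in keep:
            dummies[f"{name}={lev}"] = [1.0 if x == lev else 0.0
                                        for x in cat[name]]

    # Lines 5-8: stagger, so each variable sits above the previous maximum.
    main, offset = {}, delta
    for name in order + list(dummies):
        src = shifted[name] if name in shifted else dummies[name]
        main[name] = [x + offset for x in src]
        offset = delta + max(main[name])

    # Lines 10-16: pairwise interactions.
    inter = {}
    for a, b in itertools.combinations(main, 2):
        if a in dummies and b in dummies:
            # Factor labelled by the combination of the two staggered values.
            inter[f"{a}:{b}"] = [f"{x:g}_{y:g}"
                                 for x, y in zip(main[a], main[b])]
        else:
            inter[f"{a}:{b}"] = [x * y for x, y in zip(main[a], main[b])]
    return main, inter


main, inter = sidify({"age": [20, 30, 50],
                      "dose": [1.0, 2.0, 3.0],
                      "sex": ["f", "m", "f"]})
print(main["age"])   # [2.0, 12.0, 32.0]
print(main["dose"])  # [62.0, 63.0, 64.0]
```

In this toy input, `age` has the largest range and is placed first; after staggering, the value ranges of `age`, `dose`, and the `sex=m` indicator no longer overlap, which is the property the interaction step relies on.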