Table 1. The characteristics of the data sets analysed.
The number of views, the number of clusters, the largest number of features among the data views, and the number of samples are presented for both the real and synthetic data sets analysed. Real data are taken as heterogeneous, whereas the synthetic data are regarded as homogeneous. High-dimensional data contain more features than samples.
Group | Data set | Views (M) | Clusters | Features | Samples (N) | Heterogeneous | High-dimensional
---|---|---|---|---|---|---|---
Real | Cancer types | 3 | 3 | 22,503 | 253 | ✓ | ✓
Real | Caltech7 | 6 | 7 | 1,984 | 1,474 | ✓ | ✓
Real | Handwritten digits | 6 | 10 | 240 | 2,000 | ✓ | ✗
Synthetic | MMDS | 3 | 3 | 300 | 300 | ✗ | ✗
Synthetic | NDS | 4 | 3 | 400 | 300 | ✗ | ✓
Synthetic | MCS | 3 | 5 | 300 | 500 | ✗ | ✗
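
The high-dimensional column of Table 1 can be checked against the caption's rule of more features than samples. The following is a minimal sketch, not the authors' code: the helper name `is_high_dimensional` is hypothetical, and it assumes the comparison uses the largest per-view feature count reported in the table and a strict inequality (which matches MMDS, with 300 features and 300 samples, being marked ✗).

```python
# Values copied from Table 1:
# (largest number of features in any view, number of samples).
DATASETS = {
    "Cancer types": (22503, 253),
    "Caltech7": (1984, 1474),
    "Handwritten digits": (240, 2000),
    "MMDS": (300, 300),
    "NDS": (400, 300),
    "MCS": (300, 500),
}


def is_high_dimensional(n_features: int, n_samples: int) -> bool:
    """Hypothetical helper: True when features strictly exceed samples."""
    return n_features > n_samples


# Recompute the high-dimensional flag for every data set in the table.
flags = {
    name: is_high_dimensional(n_features, n_samples)
    for name, (n_features, n_samples) in DATASETS.items()
}

for name, flag in flags.items():
    label = "high-dimensional" if flag else "not high-dimensional"
    print(f"{name}: {label}")
```

Under these assumptions the recomputed flags agree with the ticks and crosses in the table for all six data sets.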