2021 May 13;24(3):1249–1274. doi: 10.1007/s10044-021-00985-x

Table 16.

Description of four high-dimensional datasets [56]

Dataset | Brief description
Gastrointestinal lesions | Features extracted from a database of colonoscopic videos showing gastrointestinal lesions. There are feature vectors for 76 lesions of 3 types: hyperplastic, adenoma, and serrated adenoma
DBWorld e-mails | 64 e-mails from the DBWorld newsletter, used to train different algorithms to classify between "announces of conferences" and "everything else"
Arcene | Obtained by merging three mass spectrometry datasets. The original features give the abundance of proteins with a given mass value in human sera. Based on these features, cancer patients are to be separated from healthy patients
Amazon reviews | Derived from reviews on the Amazon Commerce Website for authorship identification. It covers 50 of the most active users, with 30 reviews collected for each author
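The sample and class counts given in the descriptions above can be collected in a small sketch. This is only an illustration: the dictionary layout and the helper `mean_samples_per_class` are hypothetical names, not part of the original work. The table gives no sample count for Arcene, so that value is left as `None` rather than filled in. The Amazon reviews total follows from 50 authors with 30 reviews each.

```python
# Counts taken from the dataset descriptions in Table 16.
# None marks a value the table does not state (Arcene's sample count).
datasets = {
    "Gastrointestinal lesions": {"samples": 76, "classes": 3},
    "DBWorld e-mails": {"samples": 64, "classes": 2},
    "Arcene": {"samples": None, "classes": 2},
    "Amazon reviews": {"samples": 50 * 30, "classes": 50},  # 50 authors x 30 reviews
}


def mean_samples_per_class(info):
    """Average number of samples per class, or None if the count is unknown.

    This is a plain average; it says nothing about how balanced the classes are.
    """
    if info["samples"] is None:
        return None
    return info["samples"] / info["classes"]


summary = {name: mean_samples_per_class(info) for name, info in datasets.items()}

for name, value in summary.items():
    print(name, value)
```

For three of the four datasets the stated sample counts are small (76, 64, and 1500), which is consistent with the table's "high-dimensional" framing, where the number of features is large relative to the number of samples.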