In a short and wide data matrix (A), the number of observations is much smaller than the number of variables (N << M). When hand-crafted features (e.g., radiomics features) are extracted from different regions of interest (ROIs) on the whole image of a given patient, the result may be a short and wide data matrix, because the number of features extracted for each ROI per image is typically larger than the number of samples or patients (C). To make a tall and thin data matrix from the image data (B), an image can be divided into many (overlapping) ROIs or patches, each with a small number of pixels (D). When the pixel values of each patch are used as features, the number of features per patch may be much smaller than the total number of patients.
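
The tall and thin construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper `extract_patches`, the 4 x 4 patch size, the stride of 2, and the synthetic 32 x 32 images are all assumptions chosen only to show how overlapping patches turn a few images into many rows with few columns.

```python
import numpy as np


def extract_patches(image, patch_size=4, stride=2):
    """Return a (n_patches, patch_size**2) matrix of flattened patches.

    Hypothetical helper: a stride smaller than patch_size gives
    overlapping patches; each patch becomes one row (one observation).
    """
    rows, cols = image.shape
    patches = [
        image[r:r + patch_size, c:c + patch_size].ravel()
        for r in range(0, rows - patch_size + 1, stride)
        for c in range(0, cols - patch_size + 1, stride)
    ]
    return np.asarray(patches)


# Synthetic stand-in data (assumption): 5 patients, one 32 x 32 image each.
rng = np.random.default_rng(0)
n_patients = 5
images = rng.random((n_patients, 32, 32))

# Stack the patches of all patients: rows are patches, columns are pixels.
data_matrix = np.vstack([extract_patches(img) for img in images])
n_samples, n_features = data_matrix.shape
print(n_samples, n_features)  # 1125 16
```

With these assumed settings, each image yields 15 x 15 = 225 patches of 16 pixels, so five patients give a 1125 x 16 matrix: far more observations than variables.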