Fig. 5.
The English corpus represented using the NMF model. Distances between paragraphs are Euclidean. Because the NMF dimensions are not constrained to be orthogonal, one would not expect the correlation dimension to give interpretable results; nevertheless, a weave-like two-scale dimensional structure is again evident. The randomized corpus is again space-filling to the limit of the dataset, suggesting that the observed dimensional structure is a property of the word choice in the paragraphs and not of paragraph length or word frequency in the corpus.
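
For readers who want to reproduce this kind of measurement, the following is a minimal sketch of a correlation-dimension estimate from Euclidean distances. It assumes the standard Grassberger-Procaccia approach (the slope of log C(r) against log r, where C(r) is the fraction of point pairs within distance r); the caption does not state the exact estimator used, so this is an illustration and not the authors' procedure. The function names, the radius range, and the synthetic test data (a 2-D sheet embedded in 5 dimensions, standing in for NMF paragraph vectors) are all hypothetical.

```python
import numpy as np

def pairwise_euclidean(X):
    """Return the Euclidean distances for all unordered pairs of rows in X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)          # clip tiny negatives from round-off
    iu = np.triu_indices(X.shape[0], k=1)
    return np.sqrt(d2[iu])

def correlation_integral(dists, radii):
    """C(r): fraction of pairs whose distance is at most r, for each r."""
    sorted_d = np.sort(dists)
    counts = np.searchsorted(sorted_d, radii, side="right")
    return counts / sorted_d.size

def correlation_dimension(dists, radii):
    """Least-squares slope of log C(r) versus log r over the given radii."""
    C = correlation_integral(dists, radii)
    mask = C > 0                          # log is undefined where no pairs fall
    slope, _ = np.polyfit(np.log(radii[mask]), np.log(C[mask]), 1)
    return slope

# Synthetic stand-in for paragraph vectors (an assumption, not corpus data):
# points on a 2-D sheet, embedded isometrically in 5 dimensions.
rng = np.random.default_rng(0)
sheet = rng.random((1500, 2))
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))   # 5x2, orthonormal columns
X = sheet @ Q.T

dists = pairwise_euclidean(X)
radii = np.logspace(-1.5, -0.7, 10)
estimated_dim = correlation_dimension(dists, radii)
print(f"estimated correlation dimension: {estimated_dim:.2f}")
```

On this synthetic sheet the estimate should come out near 2, the intrinsic dimension of the data, regardless of the 5-dimensional embedding. A two-scale structure like the one described in the caption would show up as different slopes when the fit is restricted to small radii and to large radii separately, so in practice one would inspect the log-log curve before choosing fitting ranges.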