Proceedings of the National Academy of Sciences of the United States of America. 2018 Aug 27;115(37):E8585–E8594. doi: 10.1073/pnas.1800083115

History of art paintings through the lens of entropy and complexity

Higor Y D Sigaki a, Matjaž Perc b,c,d,1, Haroldo V Ribeiro a,1
PMCID: PMC6140488  PMID: 30150384

Significance

The critical inquiry of paintings is essentially comparative. This limits the number of artworks that can be investigated by an art expert in reasonable time. The recent availability of large digitized art collections enables a shift in the scale of such analysis through the use of computational methods. Our research shows that simple physics-inspired metrics that are estimated from local spatial ordering patterns in paintings encode crucial information about the artwork. We present numerical scales that map well to canonical concepts in art history and reveal a historical and measurable evolutionary trend in visual arts. They also allow us to distinguish different artistic styles and artworks based on the degree of local order in the paintings.

Keywords: art history, paintings, complexity, entropy, spatial patterns

Abstract

Art is the ultimate expression of human creativity that is deeply influenced by the philosophy and culture of the corresponding historical epoch. The quantitative analysis of art is therefore essential for better understanding human cultural evolution. Here, we present a large-scale quantitative analysis of almost 140,000 paintings, spanning nearly a millennium of art history. Based on the local spatial patterns in the images of these paintings, we estimate the permutation entropy and the statistical complexity of each painting. These measures map the degree of visual order of artworks into a scale of order–disorder and simplicity–complexity that locally reflects qualitative categories proposed by art historians. The dynamical behavior of these measures reveals a clear temporal evolution of art, marked by transitions that agree with the main historical periods of art. Our research shows that different artistic styles have a distinct average degree of entropy and complexity, thus allowing a hierarchical organization and clustering of styles according to these metrics. We have further verified that the identified groups correspond well with the textual content used to qualitatively describe the styles and the applied complexity–entropy measures can be used for an effective classification of artworks.


Physics-inspired approaches have been successfully applied to a wide range of disciplines, including economic and social systems (1–3). Such studies usually share the goal of finding fundamental principles and universalities that govern the dynamics of these systems (4). The impact and popularity of this research has been growing steadily in recent years, in large part due to the unprecedented amount of digital information that is available about the most diverse subjects at an impressive degree of detail. This digital data deluge enables researchers to bring quantitative methods to the study of human culture (5–7), mobility (8, 9), and communication (10–12), as well as literature (13), science production, and peer review (14–18), at a scale that would have been unimaginable even a decade ago. A large-scale quantitative characterization of visual arts would be among such unimaginable research goals, not only because of data shortage but also because the study of art is often considered to be intrinsically qualitative. Quantitative approaches aimed at the characterization of visual arts can contribute to a better understanding of human cultural evolution, as well as to more practical matters, such as image characterization and classification.

While the scale of some current studies has changed dramatically, the use of quantitative techniques in the study of art has some precedent. Efforts can be traced back to the 1933 book Aesthetic Measure by the American mathematician Birkhoff (19), where a quantitative aesthetic measure is defined as the ratio between order (number of regularities found in an image) and complexity (number of elements in an image). However, the application of such quantitative techniques to the characterization of artworks is much more recent. Among the seminal works, we have the article by Taylor et al. (20), where Pollock’s paintings are characterized by an increasing fractal dimension over the course of his artistic career. This research article can be considered a landmark for the quantitative study of visual arts, inspiring many further applications of fractal analysis and related methods to determine the authenticity of paintings (21–25) and to study the evolution of specific artists (26, 27), the statistical properties of particular paintings (28) and artists (29–31), art movements (32), and many other visual expressions (33–35). The most recent advances of this emerging and rapidly growing field of research are comprehensively documented in several conference proceedings and special issues of scientific journals (36–38), where contributions have been focusing also on artwork restoration tools, authentication problems, and stylometry assessment procedures.

To date, relatively few research efforts have been dedicated to studying paintings from a large-scale art historical perspective. In 2014, Kim et al. (39) analyzed 29,000 images, finding that the color-use distribution is remarkably different among historical periods of western paintings, and, moreover, that the roughness exponent associated with the grayscale representation of these paintings displays an increasing trend over the years. In a more recent work, Lee et al. (40) have analyzed almost 180,000 paintings, focusing on the evolution of the color contrast. Among other findings, they have observed a sudden increase in the diversity of color contrast after 1850, and showed also that the same quantity can be used to capture information about artistic styles. Notably, there is also innovative research done by Manovich and coworkers (41–43) concerning the analysis of large-scale datasets of paintings and other visual art expressions by means of the estimation of their average brightness and saturation.

However, except for the introduction of the roughness exponent, preceding research along similar lines has been predominantly focused on the evolution of color profiles, while the spatial patterns associated with the pixels in visual arts remain poorly understood. Here, we present a large-scale investigation of local order patterns over almost 140,000 visual artwork images that span several hundred years of art history. By calculating two complexity measures associated with the local order of the pixel arrangement in these artworks, we observe a clear and robust temporal evolution. This evolution is characterized by transitions that agree with different art periods. Moreover, the observed evolution shows that these periods are marked by distinct degrees of order and regularity in the pixel arrangements of the corresponding artworks. We further show that these complexity measures partially encode fundamental concepts of art history that are frequently used by experts for a qualitative description of artworks. In particular, the complexity measures distinguish different artistic styles according to their average order in the pixel arrangements, enable a hierarchical organization of styles, and are also capable of automatically classifying artworks into artistic styles.

Results

Our results are based on a dataset comprising 137,364 visual artwork images (mainly paintings), obtained from the online visual arts encyclopedia WikiArt (https://www.wikiart.org). This webpage is among the most significant freely available sources for visual arts. It contains artworks from over 2,000 different artists, covering more than a hundred styles, and spanning a period on the order of a millennium. Each one of these image files has been converted into a matrix representation whose dimensions correspond to the image width and height, and whose elements are the average values of the shades of red, green, and blue (RGB) of the pixels in the RGB color space. For further details, we refer to Materials and Methods.

From this matrix representation of the artwork images, we calculate two complexity measures: the normalized permutation entropy H (44) and the statistical complexity C (45). As described in Materials and Methods, both measures are evaluated from the ordinal probability distribution P, which quantifies the occurrence of the ordinal patterns among the image pixels at a local scale. Here, we have estimated this distribution by considering sliding partitions of size d_x = 2 by d_y = 2 pixels (the embedding dimensions), leading to (d_x d_y)! = 24 possible ordinal patterns. The value of H quantifies the degree of disorder in the pixel arrangement of an image: Values close to 1 indicate that pixels appear at random, while values close to zero indicate that pixels appear almost always in the same order. More-regular images (such as those produced by Minimalism) are expected to have small entropy values, while images exhibiting less regularity (such as Pollock’s drip paintings) are characterized by large values of entropy. The statistical complexity C, in turn, measures the “structural” complexity present in an image (46, 47): It is zero for both extremes of order and disorder in the pixel arrangement, and it is positive when an image presents more-complex spatial patterns. The joint use of the values of C and H as a discriminating tool gives rise to a “complexity–entropy plane” (45, 48), which is a technique that has proven to be useful in several applications. Herein the complexity–entropy plane is our approach of choice for quantifying the characteristics of different visual artworks.
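The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: Materials and Methods is not reproduced here, so the formula for C below is assumed to be the standard Jensen-Shannon form used in the permutation-entropy literature, ties between equal pixel values are broken by pixel position (also an assumption), and the function names are hypothetical.

```python
import math
from itertools import permutations

import numpy as np


def ordinal_distribution(img, dx=2, dy=2):
    """Relative frequency of each ordinal pattern over all sliding dx-by-dy windows."""
    img = np.asarray(img, dtype=float)
    if img.ndim == 3:
        # RGB input: use the average of the three channels, as in the text.
        img = img[..., :3].mean(axis=2)
    rows, cols = img.shape
    counts = {p: 0 for p in permutations(range(dx * dy))}
    for i in range(rows - dy + 1):
        for j in range(cols - dx + 1):
            window = img[i:i + dy, j:j + dx].ravel()
            # The ordinal pattern is the permutation that sorts the window.
            # Assumption: ties keep their original order (stable sort).
            pattern = tuple(int(k) for k in np.argsort(window, kind="stable"))
            counts[pattern] += 1
    total = sum(counts.values())
    return np.array([c / total for c in counts.values()])


def shannon(p):
    """Shannon entropy (natural log) of a probability vector, skipping zeros."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())


def complexity_entropy(img, dx=2, dy=2):
    """Return (H, C): normalized permutation entropy and statistical complexity."""
    p = ordinal_distribution(img, dx, dy)
    n = p.size                      # (dx * dy)! patterns, 24 for 2-by-2 windows
    u = np.full(n, 1.0 / n)         # uniform reference distribution
    h = shannon(p) / math.log(n)
    # Jensen-Shannon divergence between P and the uniform distribution.
    js = shannon((p + u) / 2) - shannon(p) / 2 - shannon(u) / 2
    # Largest possible divergence (one pattern only), used for normalization.
    js_max = -0.5 * ((n + 1) / n * math.log(n + 1) + math.log(n) - 2 * math.log(2 * n))
    return h, js * h / js_max
```

Under these assumptions, a smooth intensity ramp (every window shows the same pattern) sits at H = 0 and C = 0, while an array of independent random pixel values lands near H = 1 with C close to zero, which matches the two extremes described in the text. The explicit double loop is slow for megapixel images and is kept only for readability.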

Evolution of Art

A careful comparison of different artworks is one of the main methods used by art historians to understand whether and how art has evolved over the years. Works by Heinrich Wölfflin (49) and Alois Riegl (50), for example, can be considered fundamental in this regard. They have proposed to distinguish artworks from different periods through a few visual categories and qualitative descriptors. Visual comparison is undoubtedly a useful tool for evaluating artistic style. However, it is impractical to apply at scale. This is when computational methods show their greatest advantage. Nevertheless, to be useful, it is important that derived metrics are still easily interpreted in terms of familiar and disciplinary-relevant categories.

We note that the complexity–entropy plane partially (and locally) reflects Wölfflin’s dual concepts of linear versus painterly and Riegl’s dichotomy of haptic versus optic artworks. According to Wölfflin, “linear artworks” are composed of clear and outlined shapes, while, in “painterly artworks,” the contours are subtle and smudged for merging image parts and passing the idea of fluidity. Similarly, Riegl considers that “haptic artworks” depict objects as tangible discrete entities, isolated and circumscribed, whereas “optic artworks” represent objects as interrelated in deep space by exploiting light, color, and shadow effects to create the idea of an open spatial continuum. The notions of order/simplicity versus disorder/complexity in the pixel arrangements of images captured by the complexity–entropy plane partially encode these concepts. Images formed by distinct and outlined parts yield many repetitions of a few ordinal patterns, and, consequently, linear/haptic artworks are described by small values of H and large values of C. On the other hand, images composed of interrelated parts delimited by smudged edges produce more random patterns, and, accordingly, painterly/optic artworks are expected to yield larger values of H and smaller values of C. It is also worth mentioning that Wölfflin’s and Riegl’s dual concepts are limiting forms of representation that demarcate the scale of all possibilities (51). In this regard, the continuum of H and C values may help art historians to grade this scale.

In this context, we ask whether the scale defined by H and C values is capable of unveiling any dynamical properties of art. To answer this question, we estimate the average values of H and C after grouping the images by date. Because the artworks are not uniformly distributed over time (see Materials and Methods), we have chosen time intervals containing nearly the same number of images in each time window. Fig. 1 shows the joint evolution of the average values of C and H over the years (i.e., the changes in the complexity–entropy plane), where a clear and robust (SI Appendix, Fig. S1) trend is observed. This trajectory of H and C values shows that the artworks produced between the ninth and the 17th centuries are, on average, more regular/ordered than those created between the 19th and the mid-20th century. Also, the artworks produced after 1950 are even more regular/ordered than those from the two earlier periods. We observe further that the pace of changes in the complexity–entropy plane intensifies after the 19th century, a period that coincides with the emergence of several artistic styles (such as Neoclassicism and Impressionism), and also with the increase in the diversity of color contrast observed by Lee et al. (40).
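The windowing step can be illustrated with a short sketch. It assumes one date and one (H, C) pair per image, sorts the images by date, and splits them into windows holding nearly the same number of images; the number of windows and the function name are hypothetical choices, not values taken from the paper.

```python
import numpy as np


def windowed_averages(years, h_values, c_values, n_windows):
    """Average H and C (with SEM) over date-ordered windows of nearly equal size."""
    years = np.asarray(years, dtype=float)
    h_values = np.asarray(h_values, dtype=float)
    c_values = np.asarray(c_values, dtype=float)
    order = np.argsort(years, kind="stable")      # oldest artworks first
    summary = []
    for idx in np.array_split(order, n_windows):  # window sizes differ by at most 1
        n = idx.size
        summary.append({
            "first_year": float(years[idx].min()),
            "last_year": float(years[idx].max()),
            "count": int(n),
            "mean_h": float(h_values[idx].mean()),
            "sem_h": float(h_values[idx].std(ddof=1) / np.sqrt(n)),
            "mean_c": float(c_values[idx].mean()),
            "sem_c": float(c_values[idx].std(ddof=1) / np.sqrt(n)),
        })
    return summary
```

Because the windows are defined by image count rather than by calendar length, sparsely populated centuries end up in wide windows and densely populated decades in narrow ones, so each average is based on a comparable sample size.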

Fig. 1.

Quantifying the evolution of artworks through the history of art. Shown is the temporal evolution of the average values of permutation entropy H and statistical complexity C (complexity–entropy plane). Each dot corresponds to the average values of H and C for a given time interval (shown in the plot). Error bars represent the SEM. The highlighted regions show different art periods (black: Renaissance, Neoclassicism, and Romanticism; red: Modern Art; green: Contemporary/Postmodern Art). We observe that the complexity–entropy plane correctly identifies different art periods and the transitions among them.

The three regions in Fig. 1 defined by the values of H and C correspond well with the main divisions of art history. The first period (black rectangle) corresponds to Medieval Art, the Renaissance, Neoclassicism, and Romanticism, which developed until the 1850s (52). The second period (red rectangle) corresponds to Modern Art, marked by the birth of Impressionism in the 1870s, and by the development of several avant-garde artistic styles (such as Cubism, Expressionism, and Surrealism) during the first decades of the 20th century. Finally, the latest period corresponds to the transition between Modern Art and Contemporary/Postmodern Art. The specific date marking the beginning of the Postmodern period is still an object of fierce debate among art experts (52). Nevertheless, there is some consensus that Postmodern Art begins with the development of Pop Art in the 1960s (52).

By carrying the analogy between the complexity–entropy plane and the concepts of Wölfflin and Riegl forward, the transition between the art produced before Modernism and Modern Art represents a change from linear/haptic to painterly/optic in the representation modes. This thus agrees with the idea that artworks from the Renaissance, Neoclassicism, and Romanticism usually represent objects rigidly distinguished from each other and separated by flat surfaces (49, 53, 54), while modern styles such as Impressionism, Fauvism, Pointillism, and Expressionism are marked by the use of looser and smudged brushstrokes to avoid the creation of pronounced edges (49, 53, 54). Intriguingly, the transition between Modern and Postmodern Art is marked by an even more intense and rapid change from painterly/optic to linear/haptic representation modes. This fact appears to agree with the Postmodern idea of art as being instantly recognizable, made of ordinary objects, and marked by the use of large and well-defined edges [such as in Hard Edge Painting and Op Art artworks (53, 54)].

The conceptions of art history proposed by Wölfflin and Riegl consider that art develops through a change from the linear/haptic to the painterly/optic mode of representation, which agrees with the first transition observed in Fig. 1. However, for Riegl, this development occurs through a single and continuous process (55), while Wölfflin has a cyclical conception of this transition that seems more consistent with the overall dynamical behavior of H and C. On the other hand, this cyclical conception is not compatible with the local persistent behavior of the changes in the complexity–entropy plane. Indeed, recent studies of art historians, such as the work of Gaiger (51), argue that neither of these conceptions holds when analyzing the entire development of art history. For Gaiger, the dual categories of Wölfflin and Riegl should be treated as purely descriptive concepts and not linked to a particular change over time.

Another possibility for understanding the underlying mechanisms of the dynamical behavior unveiled by the complexity–entropy plane is evolutionary theories of art (56, 57). These recently proposed theories consider art from different perspectives, such as adaptation, a by-product of the brain’s complexity, or sexual and natural selection aimed at sharing attention, and suggest that art’s evolutionary contribution was to foster social cohesion and creativity. According to these theories, art history is driven by the interplay between audience preference and the artist’s desire to engage attention and expand these preferences. This feedback mechanism among artists and the public would be responsible for propelling art toward its unprecedented degree of specialization, innovation, and diversity, and could also explain what has driven artists and artistic movements to follow the historical path depicted in Fig. 1.

Distinguishing Among Artistic Styles

We now ask whether the complexity–entropy plane is capable of discriminating among different artistic styles in our dataset. To do so, we calculate the average values of H and C after grouping the images by style. We also limit this analysis to the 92 styles having more than 100 images each (corresponding to 90% of data; see SI Appendix, Fig. S5 for name and number of images of each one) to obtain reliable values for the averages. Fig. 2 shows that the artistic styles are spread over the complexity–entropy plane, and the average values of H and C are significantly different for the majority of the pairwise comparisons (92%; see SI Appendix, Fig. S7). However, we also observe styles with statistically indistinguishable average values.

Fig. 2.

Distinguishing among different artistic styles with the complexity–entropy plane. The colored dots represent the average values of H and C for each one of the 92 styles with more than 100 images in our dataset. Error bars represent the SEM. For better visualization, only 41 artistic styles with more than 500 images each are labeled (see SI Appendix, Fig. S6 for all styles).

We note further that the arrangement of styles is in agreement with the general trend in the average values of H and C over time in which most Postmodern styles are localized in a region of smaller entropy and larger complexity values than are modern styles (such as Expressionism, Impressionism, and Fauvism). This arrangement maps the different styles into a continuum scale whose extreme values partially reflect the dichotomy of linear/haptic versus painterly/optic modes of representation. Among the styles displaying the highest values of C and the smallest values of H, we find Minimalism, Hard Edge Painting, and Color Field Painting, which are all marked by the use of simple design elements that are well-delimited by abrupt transitions of colors (53, 54). Styles displaying the smallest values of C and the highest values of H (such as Impressionism, Pointillism, and Fauvism) are characterized by the use of smudged and diffuse brushstrokes, and also by blending colors to avoid the creation of sharp edges (53, 54).

Hierarchical Structure of Artistic Styles

The values of H and C capture the degree of similarity among artistic styles regarding the local ordering of image pixels. This fact enables us to test for a possible hierarchical organization of styles with respect to this local ordering. To do so, we have considered the Euclidean distance between a pair of styles in the complexity–entropy plane as a dissimilarity measure between them. Thus, the smaller the distance between two artistic styles, the greater the similarity between them, whereas pairs of styles separated by large distances are considered more dissimilar from each other. Fig. 3A shows the matrix plot of these distances, where we qualitatively observe the formation of style groups.

Fig. 3.

Hierarchical organization of artistic styles. (A) Matrix plot of the Euclidean distance in the complexity–entropy plane between every pair of styles. (B) Dendrogram representation of the distance matrix obtained by applying the minimum variance method proposed by Ward (58). The 14 groups of styles indicated by the colored branches are obtained by cutting the dendrogram at the threshold distance 0.03. This value maximizes the silhouette coefficient (59) (see Materials and Methods and SI Appendix, Fig. S8), and it is thus a “natural” number for defining the number of clusters in our dataset. The order of rows and columns in the matrix plot is the same as used in the dendrogram.

To investigate the clustering between artistic styles systematically, we use the minimum variance method proposed by Ward (58) to construct a dendrogram representation of the distance matrix. This method is a hierarchical clustering procedure that uses the within-cluster variance as the criterion for merging pairs of clusters. Fig. 3B depicts this dendrogram, unveiling an intricate relationship among the artistic styles in our dataset. By maximizing the silhouette coefficient (59) (as described in Materials and Methods and SI Appendix, Fig. S8), we find that 0.03 is the optimal threshold distance that maximizes the cohesion and separation among the clusters of styles. This threshold distance yields 14 groups of styles indicated by the different colors in Fig. 3.
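The clustering procedure can be sketched as follows, assuming each style is a point (average H, average C). The sketch uses SciPy's `linkage` with `method="ward"` and `fcluster` with `criterion="distance"`, and computes the silhouette coefficient by hand to stay self-contained. The grid of candidate thresholds and the function names are hypothetical, and SciPy's Ward merge heights are not guaranteed to be on the same scale as the threshold of 0.03 reported here.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def silhouette(points, labels):
    """Mean silhouette coefficient with Euclidean distances (singletons score 0)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    groups = np.unique(labels)
    scores = []
    for i, own in enumerate(labels):
        same = labels == own
        if same.sum() == 1:
            scores.append(0.0)
            continue
        a = dist[i, same].sum() / (same.sum() - 1)   # cohesion: own cluster
        b = min(dist[i, labels == g].mean() for g in groups if g != own)  # separation
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return float(np.mean(scores))


def best_ward_cut(points, thresholds):
    """Cut a Ward dendrogram at the threshold with the highest silhouette score.

    Returns (score, threshold, labels), or None if no cut gives a valid clustering.
    """
    points = np.asarray(points, dtype=float)
    tree = linkage(points, method="ward")
    best = None
    for t in thresholds:
        labels = fcluster(tree, t=t, criterion="distance")
        k = np.unique(labels).size
        if k < 2 or k >= len(points):
            continue  # the silhouette needs between 2 and n - 1 clusters
        score = silhouette(points, labels)
        if best is None or score > best[0]:
            best = (score, float(t), labels)
    return best
```

Scanning thresholds and keeping the one with the highest silhouette score is one straightforward reading of "maximizing the silhouette coefficient"; the number of groups then follows from the cut rather than being fixed in advance.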

These groups partially reflect the temporal localization of different artistic styles and their evolution reported in Fig. 1. In particular, several styles that emerged together or close in time are similar regarding the local arrangement of pixels and thus belong to the same group. For instance, the first five groups of Fig. 3B contain mainly Postmodern styles. On the other hand, these groups and their hierarchical structure organize the styles regarding their mode of representation in the scale delimited by the dichotomy of linear/haptic versus painterly/optic. This fact is more evident when examining groups in both extremes of order and regularity in the complexity–entropy plane. The right-most group of Fig. 3B, for example, contains styles that use relatively small brush strokes and avoid the creation of sharp edges. This fact is particularly evident in artworks of Impressionism, Pointillism, and Divisionism, but it is also evident in Neo-Baroque and Neo-Romanticism, and in the works of muralists (such as David Siqueiros and José Orozco), as well as in the abstract paintings of P&D (Pattern and Decoration). While devoted to patterning paintings (such as printed fabrics), P&D is considered a “reaction” to Minimalism and Conceptual Art (which are located in the other extreme of the complexity–entropy plane) that avoids restrained compositions by means of a subtle modulation of colors as in the works of Robert Zakanitch, who is considered one of the founders of P&D (60). As we move to groups characterized by high complexity and low entropy, we observe the clustering of styles marked by the presence of sharp edges and very contrasting patterns, usually formed by distinct parts isolated or combined with unrelated materials. That is the case for the group containing Op Art, Pop Art, and Constructivism, but also for the group formed by Kinetic Art, Hard Edge Painting, and Concretism (53, 54).

We can also verify the meaningfulness of these groups by comparing the clustering of Fig. 3B with an approach based on the similarities among the textual content of the Wikipedia pages of each artistic style. To do so, we have obtained the textual content of these webpages and extracted the top 100 keywords of each one by applying the term frequency–inverse document frequency approach (61). We take the inverse of 1 plus the number n of shared keywords between two styles, that is, 1/(1 + n), as a measure of dissimilarity between them. Thus, styles having no common keywords are at the maximum “distance” of 1, while styles sharing several keywords are at a smaller distance.
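As a small illustration of this keyword-based measure, the sketch below computes 1/(1 + n) for pairs of keyword lists, where n is the number of shared keywords. The keyword extraction itself is not shown, and the function names and the toy keyword lists in the usage note are invented for illustration.

```python
def keyword_distance(keywords_a, keywords_b):
    """Inverse of 1 plus the number of keywords shared by two styles."""
    shared = len(set(keywords_a) & set(keywords_b))
    return 1.0 / (1.0 + shared)


def keyword_distance_matrix(keywords_by_style):
    """Pairwise distances for a dict mapping style name -> list of keywords."""
    names = sorted(keywords_by_style)
    return names, [
        [0.0 if a == b else keyword_distance(keywords_by_style[a], keywords_by_style[b])
         for b in names]
        for a in names
    ]
```

For example, two invented lists such as `["light", "color", "brushstroke"]` and `["color", "light", "dots"]` share two keywords and are therefore at distance 1/3, whereas lists with nothing in common are at the maximum distance of 1.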

By using a similar hierarchical clustering procedure to the one used in Fig. 3, we obtain 24 clusters of artistic styles from the Wikipedia text analysis (SI Appendix, Fig. S9). This number of clusters is much larger than the 14 clusters obtained from the complexity–entropy plane. However, both clustering approaches share similarities, which can be quantified by using the clustering evaluation metrics homogeneity h, completeness c, and v measure (62). Perfect homogeneity (h = 1) implies that all clusters obtained from the Wikipedia texts contain only styles belonging to the same clusters obtained from the complexity–entropy plane. On the other hand, perfect completeness (c = 1) implies that all styles belonging to the same cluster obtained from the complexity–entropy plane are grouped in the same cluster obtained from the Wikipedia texts. The v measure is the harmonic mean between h and c, that is, v = 2hc/(h + c). Our results yield h = 0.49, c = 0.40, and v = 0.44, which are values significantly larger than those obtained from a null model where the number of shared keywords is randomly chosen from a uniform distribution between 0 and 100 (h_rand = 0.42 ± 0.02, c_rand = 0.35 ± 0.01, and v_rand = 0.38 ± 0.01; average values over 100 realizations). Therefore, the similarities between the two clustering approaches cannot be explained by chance. This result indicates that, despite the very local character of our complexity measures, the values of H and C reflect the meaning of some keywords used for describing artistic styles.
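The three scores can be computed directly from two label lists, as in the sketch below. It assumes the usual conditional-entropy definitions of homogeneity and completeness, with the complexity–entropy groups as the reference labels and the Wikipedia-based groups as the clusters being evaluated; the function names are hypothetical.

```python
import math
from collections import Counter


def _entropy(labels):
    """Shannon entropy of a label sequence."""
    n = len(labels)
    return -sum(k / n * math.log(k / n) for k in Counter(labels).values())


def _conditional_entropy(target, given):
    """H(target | given) for two label sequences of equal length."""
    n = len(target)
    joint = Counter(zip(given, target))
    marginal = Counter(given)
    return -sum(k / n * math.log(k / marginal[g]) for (g, _), k in joint.items())


def v_measure(reference, clusters):
    """Return (homogeneity, completeness, v) of `clusters` against `reference`."""
    h_ref, h_clu = _entropy(reference), _entropy(clusters)
    # Homogeneity: each cluster holds members of a single reference group.
    h = 1.0 if h_ref == 0 else 1.0 - _conditional_entropy(reference, clusters) / h_ref
    # Completeness: each reference group ends up inside a single cluster.
    c = 1.0 if h_clu == 0 else 1.0 - _conditional_entropy(clusters, reference) / h_clu
    v = 0.0 if h + c == 0 else 2 * h * c / (h + c)
    return h, c, v
```

As an arithmetic check on the reported values, h = 0.49 and c = 0.40 give v = 2(0.49)(0.40)/(0.49 + 0.40) = 0.392/0.89 ≈ 0.44, and the null-model averages give 2(0.42)(0.35)/0.77 ≈ 0.38, both in agreement with the numbers quoted above.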

Predicting Artistic Styles

Another way of quantifying the information encoded by the values of H and C is to try to predict the style of an image based only on these two values. To do so, we have implemented four well-known machine learning algorithms (63, 64) (nearest neighbors, random forest, support vector machine, and neural network; see Materials and Methods for details) for the classification task of predicting the style of images for all 20 styles that contain more than 1,500 artworks each. For each method, we estimate the validation curves for a range of values of the main parameters of the algorithms with a stratified n-fold cross-validation (63) strategy with n=10. Fig. 4A shows the validation curves for the k-nearest neighbors as a function of the number of neighbors. We note that this method underfits the data if the number of neighbors is smaller than 250. Conversely, the cross-validation score saturates at 0.18 once the number of neighbors reaches 300 or more, and there is no overfitting up to 500 neighbors. Another relevant issue for statistical learning is related to the amount of data necessary to properly train the model. To investigate this, we again use a stratified n-fold cross-validation strategy with n=10 for estimating the learning curves. Fig. 4B shows the training and cross-validation scores for the k-nearest neighbors, where we observe that both scores increase with training size. However, this enhancement is very small when more than 50% of the data are used for training the model. SI Appendix, Fig. S10 shows results analogous to those presented in Fig. 4 as obtained with the other three machine learning algorithms.

Fig. 4.

Predicting artistic styles with statistical learning algorithms. (A) Training and cross-validation scores of the nearest neighbors algorithm as a function of the number of neighbors. We note that the algorithm underfits the data for values of the number of neighbors smaller than 250, but there is no significant accuracy improvement for larger values and no overfitting up to 500 neighbors. (B) Learning curve, that is, the training and cross-validation scores, as a function of the training size (fraction of the whole data) for the nearest neighbors algorithm with the number of neighbors equal to 400. We observe no significant improvement in the cross-validation scores when more than 50% of the data are used to train the model. In both plots, the shaded regions are 95% confidence intervals obtained with a 10-fold cross-validation splitting strategy. (C) Comparison between four different statistical learning algorithms (nearest neighbors, random forest, support vector machine, and neural network; see SI Appendix, Fig. S10 for details of the parameters for each algorithm) as well as the null accuracy obtained from two “dummy” classifiers (stratified: generates random predictions respecting the style distributions; uniform: predictions are uniformly random). Error bars represent the SEM. The four classifiers have similar accuracy (18%), and they all significantly outperform the dummy classifiers. These results are based on the 20 styles with more than 1,500 images each, although similar results are obtained when including other styles as well (SI Appendix, Fig. S11).

By combining the previous analysis with a grid search algorithm, we determine the best combination of parameters enhancing the performance of each statistical learning method. Fig. 4C shows that the four algorithms display similar performances, all exhibiting accuracies close to 18%. We have further compared these accuracies with those obtained from two dummy classifiers. In the stratified classifier, style predictions are generated by chance but respecting the distribution of styles, while predictions are drawn uniformly at random when using the uniform classifier. The results in Fig. 4C show that all machine learning algorithms have a significantly larger accuracy than is obtained by chance. This result thus confirms that the values of H and C encode important information about the style of each artwork. Nevertheless, the achieved accuracy is quite modest for practical applications. Indeed, there are other approaches that are more accurate. For instance, Zujovic et al. (65) achieved accuracies of 70% in a classification task with 353 paintings from five styles, and Agarwal et al. (66) reported an accuracy of 60% in a classification task with 3,000 paintings from 10 styles. However, our results cannot be directly compared with those works, since they use a much smaller dataset with fewer styles and several image features, while our predictions are based only on two features. Our approach represents a severe dimensionality reduction, since images with roughly 1 million pixels are represented by two numbers related to the local ordering of the image pixels. In this context, an accuracy of 18% in a classification with 20 styles and more than 100,000 artworks is not negligible. Moreover, the local nature of H and C makes these complexity measures very fast, easy to parallelize, and scalable from the computational point of view. Thus, in addition to showing that the complexity–entropy plane encodes important information about the artistic styles, we believe that the values of H and C, combined with other image features, are likely to provide better classification scores.
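To put the 18% figure in context, the expected accuracy of the two dummy classifiers can be written in closed form from the style frequencies alone. The sketch below assumes predictions are drawn independently of the true label; the function name is hypothetical, and the style counts of the dataset are not reproduced here.

```python
from collections import Counter


def dummy_accuracies(style_labels):
    """Expected accuracy of the uniform and stratified dummy classifiers.

    uniform:    every style is guessed with probability 1/K, so the expected
                accuracy is 1/K whatever the style distribution.
    stratified: style i is guessed with its own share p_i, so the expected
                accuracy is the sum of p_i squared.
    """
    n = len(style_labels)
    shares = [k / n for k in Counter(style_labels).values()]
    uniform = 1.0 / len(shares)
    stratified = sum(p * p for p in shares)
    return uniform, stratified
```

With 20 styles, the uniform classifier is expected to be right 1 time in 20, that is, 5% of the time. The stratified value depends on how unevenly the artworks are spread over styles and is never below the uniform one.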

Discussion and Conclusions

We have presented a large-scale characterization of a dataset composed of almost 140,000 artwork images that span the last millennium of art history. Our analysis is based on two relatively simple complexity measures (permutation entropy H and statistical complexity C) that are directly related to the ordinal patterns in the pixels of these images. These measures map the local degree of order of these artworks into a scale of order–disorder and simplicity–complexity that locally reflects the qualitative description of artworks proposed by Wölfflin and Riegl. The limits of this scale correspond to two extreme modes of representation proposed by these art historians, namely, to the dichotomy between linear/haptic (H ≈ 0 and C ≈ 0) and painterly/optic (H ≈ 1 and C ≈ 0).

By investigating the dynamical behavior of the average values of the complexity measures used, we have found a clear and robust trajectory of art over the years in the complexity–entropy plane. This trajectory is characterized by transitions that agree with the main periods of art history. These transitions can be classified as linear/haptic to painterly/optic (before and after Modern Art) and painterly/optic to linear/haptic (the transition between Modern and Postmodern Art), showing that each of these historical periods has a distinct degree of entropy and complexity. While Wölfflin’s conception of art history in terms of a cyclical transition between linear and painterly does not withstand the local time persistence in the values of H and C nor the critical scrutiny of Gaiger (51) and other contemporary art historians, it is quite consistent with the global evolution depicted in the complexity–entropy plane. For Wölfflin, the transition from linear to painterly is governed by a “natural law in the same way as physical growth,” and “to determine this law would be a central problem, the central problem of history of art” (ref. 49, p. 17). However, the return to the linear “lies certainly in outward circumstances” (ref. 49, p. 233), and, in the context of Fig. 1, it is not difficult to envisage that the transition from Modern to Postmodern was driven by the end of World War II, the event that usually marks the beginning of Postmodernism in history books.

In addition to unveiling this dynamical aspect of art, the values of H and C are capable of distinguishing between different artistic styles according to the average degree of entropy–complexity in the corresponding artworks. We emphasize that the location of each style in the complexity–entropy plane partially reflects the duality linear/haptic versus painterly/optic, and thus can be considered as a ruler for quantifying the use of these opposing modes of representation. Also, the distances between pairs of styles in the complexity–entropy plane represent a similarity measure regarding these art history concepts. By using these distances, we find that different styles can be hierarchically organized and grouped according to their position on the plane. We have verified that these groups reflect well the textual content of Wikipedia pages used for describing each style, and they also reflect some similarities among them, in particular regarding the presence of soft/smudged/diffuse or well-defined/sharp/abrupt transitions. We have further quantified the amount of information encoded in these complexity measures by means of a classification task in which the style of an image is predicted based solely on the values of H and C. The obtained success rate of approximately 18% outperforms dummy classifiers, in turn showing that these two measures carry meaningful information about artwork style.

Since our two complexity measures are based entirely on the local scale of an artwork, they, of course, cannot capture all of the uniqueness and complexity of art. However, our results nevertheless demonstrate that simple physics-inspired metrics can be connected to concepts proposed by art historians and, more importantly, that these measures do carry relevant information about artworks, their style, and their evolution. In the context of Wölfflin’s metaphor about the evolution of art: “A closer inspection certainly soon shows that art even here did not return to the point at which it once stood, but that only a spiral movement would meet the facts.” (ref. 49, p. 234), we may consider the complexity–entropy plane as one of the possible projections of Wölfflin’s spiral.

Materials and Methods

Data.

The digital images used in this study were obtained from the visual arts encyclopedia WikiArt (https://www.wikiart.org/), which is one of the largest freely available online datasets of visual artworks to date. By crawling the web pages of WikiArt in August of 2016, we downloaded 137,364 digitized images and metadata related to each artwork, such as painter (there are 2,391 different artists), date, and artistic style (e.g., Impressionism, Surrealism, and Baroque). The style labels provided by WikiArt are generated and collaboratively maintained by the users of that website. For the analysis of the temporal evolution, we have excluded all images whose composition dates were not specified (33,724 files). Fig. 5A depicts the number of images per year in our dataset, where we observe that these artworks were created between the years 1031 and 2016. Fig. 5B shows that the cumulative fraction of artworks in our dataset is well approximated by an exponential growth with the characteristic time equal to τ = 111 ± 1 y. Consequently, the cumulative number of artworks is doubling every 77 y. Also, more than 50% of these artworks were produced after the first decade of the 20th century, a period that is marked by the development of a large variety of different art movements.
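As a brief check of the doubling time quoted above, assuming the cumulative fraction grows as exp(t/τ) and writing T_2 (a symbol introduced here only for illustration) for the doubling time:

```latex
f(t + T_2) = 2 f(t)
\;\Rightarrow\; e^{T_2/\tau} = 2
\;\Rightarrow\; T_2 = \tau \ln 2 \approx 111 \times 0.693 \approx 77~\text{y}
```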

Fig. 5.

Distribution of artwork images over the years. (A) The number of images per year in our dataset. (B) Cumulative fraction of artworks over the years (blue curve) on a log-linear scale. This fraction [f(t)] is approximated by an exponential growth [f(t) ∼ exp(t/τ)] with characteristic time equal to τ = 111 ± 1 y (dashed line). We observe that most artworks were produced after the onset of the 20th century; in particular, more than 50% were painted after the year 1912.

Matrix Representation of Image Files.

All image files are in JPEG format with 24 bits per pixel (8 bits each for red, green, and blue colors in the RGB “color space”), meaning that each pixel of the image is characterized by 256 shades of red, green, and blue, which, in total, allows $256^3 = 16{,}777{,}216$ color variations. For practical purposes, an image file can be thought of as a three-layer matrix of dimensions $n_x$ (the image width) by $n_y$ (the image height), where the layers correspond to the color channels, and the elements (ranging from 0 to 255) represent the color intensity. For our analysis, we have calculated the average value of the three color shades of each pixel, yielding a simple matrix for each image file. This approach is similar to the usual grayscale transformation of images, except that the average over the three color channels is usually weighted by different values. One of the most common weighting choices defines the grayscale reflectance or luminance (67) and corresponds to calculating $0.2125R + 0.7154G + 0.0721B$, where R, G, and B stand for the shade intensities of red, green, and blue, respectively. These weighting values are often chosen to mimic the color sensitivity of the human eye, but our results are remarkably robust against different weighting choices. For example, the Pearson linear correlation between the values obtained with the usual grayscale transformation and those obtained with the simple average is 0.989 for H and 0.992 for C (SI Appendix, Fig. S2). We have therefore resorted to using the simple average value.
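The two pixel transformations described above can be sketched as follows. This is a minimal standard-library illustration, not the code used for the study: the function names and the 2×2 toy image are hypothetical, and an array library would normally replace the nested lists.

```python
def simple_average(pixel):
    """Unweighted mean of the (R, G, B) intensities."""
    r, g, b = pixel
    return (r + g + b) / 3.0

def luminance(pixel):
    """Weighted grayscale using the luminance weights quoted in the text."""
    r, g, b = pixel
    return 0.2125 * r + 0.7154 * g + 0.0721 * b

def to_matrix(image, transform):
    """Map a nested list of (R, G, B) tuples to a single-layer matrix."""
    return [[transform(pixel) for pixel in row] for row in image]

# Hypothetical 2x2 image; values are 8-bit intensities in 0..255.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (128, 128, 128)]]

avg = to_matrix(image, simple_average)
lum = to_matrix(image, luminance)
```

The two matrices differ most for saturated colors (a pure green pixel maps to 85 under the simple average but to about 182 under the luminance weights) and coincide for gray pixels, since the weights sum to 1.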

The Complexity–Entropy Plane.

By using the matrix representation of all images, we calculate the normalized permutation entropy H and statistical complexity C for each one. This technique was originally proposed for characterizing time series (44, 48), and, only recently, it has been generalized for use with higher-dimensional data such as images (46, 47). Here, we shall present this technique through a simple example (for a more formal description, we refer to the original articles). Let the matrix

$$A = \begin{pmatrix} 6 & 0 & 2 \\ 4 & 5 & 2 \\ 6 & 7 & 4 \end{pmatrix}$$

represent a hypothetical image of size 3×3. The first step is to define sliding submatrices of size $d_x$ by $d_y$, where these values are called embedding dimensions (the only parameters of the method). By choosing $d_x = d_y = 2$, we have the following four partitions:

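$$A_1 = \begin{pmatrix} 6 & 0 \\ 4 & 5 \end{pmatrix},\quad A_2 = \begin{pmatrix} 0 & 2 \\ 5 & 2 \end{pmatrix},\quad A_3 = \begin{pmatrix} 4 & 5 \\ 6 & 7 \end{pmatrix},\quad \text{and}\quad A_4 = \begin{pmatrix} 5 & 2 \\ 7 & 4 \end{pmatrix}.$$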

The next step is to investigate the ordinal patterns of occurrence of the elements of these submatrices. By letting $a_0$, $a_1$, $a_2$, and $a_3$ represent the elements of these matrices line by line, we have that $A_1$ is described by the ordinal pattern $\Pi_1 = (1,2,3,0)$, since $a_1 < a_2 < a_3 < a_0$, where $\Pi_1$ represents the permutation that sorts the elements of $A_1$ in ascending order. Similarly, $A_3$ is described by $\Pi_3 = (0,1,2,3)$ (since $a_0 < a_1 < a_2 < a_3$), and $A_4$ is described by $\Pi_4 = (1,3,0,2)$ (since $a_1 < a_3 < a_0 < a_2$). In case of ties, we keep the occurrence order of the elements $a_0$, $a_1$, $a_2$, and $a_3$. Thus, $A_2$ is described by $\Pi_2 = (0,1,3,2)$, since $a_0 < a_1 \leq a_3 < a_2$ and because $a_1$ precedes $a_3$. From all ordinal patterns associated with $A$ for a given $d_x$ and $d_y$, we estimate the probability distribution of finding each permutation from its relative frequency of occurrence. In our example, from all 24 possible permutations [that is, $(d_x d_y)!$], four have appeared just once, and thus their probabilities are 1/4, while all other permutations have zero probability. Therefore, the probability distribution of the ordinal patterns associated with $A$ is $P = \{1/4, 1/4, 1/4, 1/4, 0, \dots, 0\}$. We have omitted the specification of each permutation in $P$ because the order of its elements is irrelevant for the described procedure.
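The worked example can be reproduced with a short sketch. The function name `ordinal_patterns` is hypothetical, and the code is only an illustration of the procedure described above, not the implementation used for the study. It relies on the fact that Python's `sorted` is stable, so tied elements keep their order of occurrence.

```python
from collections import Counter

def ordinal_patterns(matrix, dx=2, dy=2):
    """Return the ordinal pattern of every dx-by-dy sliding submatrix.

    dx is the submatrix width (columns) and dy its height (rows).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    patterns = []
    for i in range(n_rows - dy + 1):
        for j in range(n_cols - dx + 1):
            # Flatten the submatrix line by line: a0, a1, a2, ...
            flat = [matrix[i + r][j + c] for r in range(dy) for c in range(dx)]
            # Indices that sort the elements in ascending order;
            # sorted() is stable, so ties keep their order of occurrence.
            perm = tuple(sorted(range(len(flat)), key=flat.__getitem__))
            patterns.append(perm)
    return patterns

A = [[6, 0, 2],
     [4, 5, 2],
     [6, 7, 4]]

patterns = ordinal_patterns(A)
counts = Counter(patterns)
probabilities = {p: c / len(patterns) for p, c in counts.items()}
print(patterns)
# [(1, 2, 3, 0), (0, 1, 3, 2), (0, 1, 2, 3), (1, 3, 0, 2)]
```

The four patterns are listed in the order of the submatrices A1 to A4, and each one receives probability 1/4.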

Having the probability distribution $P = \{p_i; i = 1, \dots, n\}$, we calculate the normalized Shannon entropy

$$H(P) = \frac{1}{\ln(n)} \sum_{i=1}^{n} p_i \ln(1/p_i),$$ [1]

where $n = (d_x d_y)!$ is the number of possible permutations and $\ln(n)$ corresponds to the maximum value of the Shannon entropy $S(P) = \sum_{i=1}^{n} p_i \ln(1/p_i)$, that is, when all permutations are equally likely to occur ($p_i = 1/n$). The value of H quantifies the degree of “disorder” in the occurrence of the pixels of an image represented by the matrix $A$. We have $H \approx 1$ if the pixels appear in random order, and $H \approx 0$ if they always appear in the same order.
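As a minimal sketch of the normalized entropy, the snippet below evaluates it for the worked 3×3 example, where four of the 24 possible patterns each occur with probability 1/4. The function name is hypothetical, and terms with zero probability are skipped under the usual convention that they contribute nothing to the sum.

```python
import math

def permutation_entropy(probabilities, n):
    """Normalized Shannon entropy of ordinal-pattern probabilities.

    probabilities: iterable of p_i (zero entries may be omitted).
    n: number of possible permutations, (dx * dy)!.
    """
    shannon = sum(p * math.log(1.0 / p) for p in probabilities if p > 0)
    return shannon / math.log(n)

n = math.factorial(2 * 2)        # 24 possible patterns for dx = dy = 2
P = [0.25, 0.25, 0.25, 0.25]     # worked 3x3 example; zeros omitted
H = permutation_entropy(P, n)    # equals ln(4) / ln(24), about 0.44
```

The two limits behave as stated in the text: a single pattern with probability 1 gives H = 0, and 24 equally likely patterns give H = 1.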

Despite the value of H being a good measure of randomness, it cannot adequately capture the degree of structural complexity present in A (47). Because of that, we further calculate the so-called statistical complexity (45, 68, 69)

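$$C(P) = \frac{D(P,U)\,H(P)}{D^*},$$ [2]

where $D(P,U)$ is a relative entropic measure (the Jensen–Shannon divergence) between $P = \{p_i; i = 1, \dots, n\}$ and the uniform distribution $U = \{u_i = 1/n; i = 1, \dots, n\}$ defined as

$$D(P,U) = S\!\left(\frac{P+U}{2}\right) - \frac{S(P)}{2} - \frac{S(U)}{2},$$ [3]

where $\frac{P+U}{2} = \left\{\frac{p_i + 1/n}{2}, i = 1, \dots, n\right\}$ and

$$D^* = \max_{P} D(P,U) = -\frac{1}{2}\left[\frac{n+1}{n}\ln(n+1) + \ln(n) - 2\ln(2n)\right],$$ [4]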

is a normalization constant (obtained by calculating D(P,U) when just one component of P is equal to 1 and all others are zero). The quantity D(P,U) is zero when all permutations are equally likely to happen, and it is larger than zero if there are privileged permutations. Thus, C(P) is zero in both extremes of order ($P = \{p_i = \delta_{1,i}; i = 1, \dots, n\}$) and disorder ($P = \{p_i = 1/n; i = 1, \dots, n\}$). However, C(P) is not a trivial function of H(P). Namely, for a given H(P), there exists a range of possible values for C(P) (47, 48, 69), from which C(P) quantifies the existence of structural complexity and provides additional information that is not carried by the value of H(P). Mainly because of this fact, Rosso et al. (48) proposed to use a diagram of C(P) versus H(P), a representation space that is called the complexity–entropy plane (45).
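The statistical complexity can be sketched in a few lines, under the assumption that the distribution is passed as a full list of n probabilities (zeros included). The function names are hypothetical; the comments point to the equation each line is meant to mirror.

```python
import math

def shannon(p):
    """Shannon entropy S(P), skipping zero-probability terms."""
    return sum(x * math.log(1.0 / x) for x in p if x > 0)

def statistical_complexity(p):
    """C(P) = D(P, U) * H(P) / D*, with p listing all n probabilities."""
    n = len(p)
    u = [1.0 / n] * n
    mix = [(pi + ui) / 2 for pi, ui in zip(p, u)]
    d = shannon(mix) - shannon(p) / 2 - shannon(u) / 2      # Eq. 3
    d_star = -0.5 * ((n + 1) / n * math.log(n + 1)
                     + math.log(n) - 2 * math.log(2 * n))   # Eq. 4
    h = shannon(p) / math.log(n)                            # Eq. 1
    return d * h / d_star                                   # Eq. 2

n = 24
ordered = [1.0] + [0.0] * (n - 1)        # one pattern always occurs
uniform = [1.0 / n] * n                  # all patterns equally likely
example = [0.25] * 4 + [0.0] * (n - 4)   # worked 3x3 example

c_ordered = statistical_complexity(ordered)
c_uniform = statistical_complexity(uniform)
c_example = statistical_complexity(example)
```

Both extremes give zero (the ordered case because H = 0, the uniform case because D = 0), while the worked example gives a value strictly between 0 and 1.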

We estimate the values of H(P) and C(P) for all 137,364 images by considering the embedding dimensions $d_x = d_y = 2$. As we previously mentioned, these values are the only “tuning” parameters of the permutation complexity–entropy plane. However, the choice for the embedding dimensions is not completely arbitrary, as the condition $(d_x d_y)! \ll n_x n_y$ must hold to obtain a reliable estimate of the ordinal probability distribution P (46). In our dataset, the average values of image width $n_x$ and height $n_y$ are both close to 900 pixels (SI Appendix, Fig. S3), thus practically limiting our choice to $d_x = d_y = 2$.
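To give a feel for this condition, the sketch below compares the number of possible patterns with the number of sliding submatrices for a 900×900 image. The function name is hypothetical, and the code only tabulates the two counts; how large their ratio must be for a reliable estimate is not quantified in the text.

```python
import math

def pattern_budget(nx, ny, dx, dy):
    """Return (number of possible patterns, number of sliding submatrices)."""
    n_patterns = math.factorial(dx * dy)
    n_windows = (nx - dx + 1) * (ny - dy + 1)
    return n_patterns, n_windows

# Typical image size from the text: about 900 x 900 pixels.
for dx, dy in [(2, 2), (2, 3), (3, 3)]:
    n_patterns, n_windows = pattern_budget(900, 900, dx, dy)
    print(dx, dy, n_patterns, n_windows, round(n_windows / n_patterns, 1))
```

For $d_x = d_y = 2$ there are 24 patterns against roughly 800,000 submatrices, whereas for $d_x = d_y = 3$ there are 362,880 patterns, leaving only about two submatrices per possible pattern.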

Independence of H and C in Relation to Image Dimensions.

The image files obtained from WikiArt do not have the same dimensions. SI Appendix, Fig. S3 shows that image width and height have a similar distribution, with average values equal to 895 pixels for width and 913 pixels for height. Also, 95% of the images have width between 313 and 2,491 pixels, and height between 323 and 2,702 pixels. Because of these different dimensions, we have tested whether the values of H(P) and C(P) display any bias as a result. This is an important issue, since we expect the values of H(P) and C(P) to reflect the characteristics of an image, not its dimensions. SI Appendix, Fig. S4 shows scatter plots of the values of H(P) and C(P) versus the square root of image areas (that is, $\sqrt{n_x n_y}$) at several different scales, where no visual relationship is observed. The Pearson linear correlation is very low (0.05) for both relationships. We also estimate the maximal information coefficient (MIC) (70), a nonparametric coefficient that measures the association between two variables, even if they are correlated in a nonlinear fashion. The MIC value is very small (0.07) for both relationships. Thus, we conclude that the values of H(P) and C(P) are not affected or biased by image dimensions.

Finding the Number of Clusters with the Silhouette Coefficient.

The hierarchical organization of artistic styles presented in Fig. 3B enables the determination of clusters of styles. To do so, we must choose a threshold distance for which styles belong to different clusters. The number of clusters naturally depends on this choice. A way of determining an optimal threshold distance is by calculating the silhouette coefficient (59). This coefficient evaluates both the cohesion and the separation of data grouped into clusters. The silhouette coefficient is defined by the average value of

$$s_i = \frac{b_i - a_i}{\max(a_i, b_i)},$$ [5]

where $a_i$ is the average intracluster distance and $b_i$ is the average nearest-cluster distance for the $i$th datum. By definition, the silhouette coefficient ranges from −1 to 1, and the higher its value the better the clustering configuration. SI Appendix, Fig. S8 shows the silhouette coefficient as a function of the threshold distance used for determining the clusters in Fig. 3B. We observe that the coefficient displays a maximum (of 0.57) when the threshold distance is 0.03. Thus, this threshold distance is the one that maximizes the cohesion and separation among the artistic styles. By using this value, we find 14 different groups of artistic styles shown in Fig. 3. Also, a similar approach yields the 24 different clusters that are associated with the similarities among the Wikipedia pages of the styles reported in SI Appendix, Fig. S9.
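The silhouette coefficient itself can be sketched with the standard library alone. The function name, the four toy points, and the two labelings below are hypothetical; the sketch assumes Euclidean distances, at least two clusters, and the common convention that a datum in a singleton cluster scores 0.

```python
import math

def silhouette_coefficient(points, labels):
    """Mean of s_i = (b_i - a_i) / max(a_i, b_i) over all points."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Common convention: s_i = 0 for a singleton cluster.
            scores.append(0.0)
            continue
        # a_i: average distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b_i: smallest average distance to the members of another cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_coefficient(points, [0, 0, 1, 1])  # pairs kept together
bad = silhouette_coefficient(points, [0, 1, 0, 1])   # pairs split apart
```

Scanning a range of threshold distances, cutting the dendrogram at each one, and keeping the threshold with the largest coefficient would follow the selection procedure described above; here `good` is close to 1 and `bad` is negative.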

Implementation of Machine Learning Algorithms.

All machine learning algorithms used for predicting the artistic styles from an image are implemented by using functions of the Python scikit-learn library (71). For instance, the function sklearn.neighbors.KNeighborsClassifier implements the k-nearest neighbors. In statistical learning, a classification task involves inferring the category of an object by using a set of explanatory variables or features associated with this object, and the knowledge of other observations (the training set) in which the categories of the objects are known. For the results presented in the main text, we have included 20 different styles that each have more than 1,500 images in our analysis (see SI Appendix, Fig. S5 for names). However, similar results are obtained by considering a larger number of styles. For example, the overall accuracy of different learning algorithms is approximately 13% if we consider all of the styles with more than 100 images each (SI Appendix, Fig. S11) (compared with ∼18% if only 20 styles are considered).

Thus, our classification task involves identifying the artistic style (the categories) of an image from its entropy H and complexity C (the set of features). To perform the classification, data are randomly partitioned into $n$ equally sized samples that preserve the total fraction of occurrences in each category. One of the samples is used for validating the algorithm, and the remaining $n - 1$ samples are used for training the algorithm. The accuracy (that is, the fraction of correct identifications) obtained from the training set is the training score, and the one obtained from the validation set is the cross-validation score. This process is repeated $n$ times, producing an ensemble from which the average values of the scores and their confidence intervals are calculated. This approach is known as the n-fold cross-validation strategy (63).
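The stratified partitioning described above can be illustrated with a small standard-library sketch. The function names and the toy labels are hypothetical. scikit-learn offers `sklearn.model_selection.StratifiedKFold` for this kind of split, but the text does not state which routine was actually used.

```python
import random
from collections import defaultdict

def stratified_folds(labels, n_folds, seed=0):
    """Split sample indices into n_folds parts with similar class fractions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across folds.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds

def cross_validation_splits(labels, n_folds, seed=0):
    """Yield (train_indices, validation_indices), one pair per fold."""
    folds = stratified_folds(labels, n_folds, seed)
    for k in range(n_folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, folds[k]

# Hypothetical labels: 60 artworks of style "A" and 40 of style "B".
labels = ["A"] * 60 + ["B"] * 40
splits = list(cross_validation_splits(labels, n_folds=5))
```

Each of the five validation sets then holds 12 items of style "A" and 8 of style "B", so the 60/40 proportion is preserved in every fold, and a classifier would be trained on the other 80 items and scored on the held-out 20.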

We estimate the training and the cross-validation scores for each machine learning algorithm as a function of their main parameters (the validation curves). This is common practice for estimating the best trade-off between bias and variance errors. Bias errors occur when the learning methods are not properly taking into account all of the relevant information about the explanatory variables that describe the data (underfitting). Variance errors, on the other hand, usually happen when the complexity of the learning model is too high, that is, high enough even for modeling the noise in the training set (overfitting). SI Appendix, Fig. S10 shows the validation curves for the four learning methods that we use in our study. The parameters that we have studied are the number of neighbors in the case of the k-nearest neighbors algorithm, the number and the maximum depth of trees in the case of the random forest method, the parameter associated with the width of the radial basis function kernel and the penalty parameter for the support vector machine classification, and the so-called L2 penalty for the neural network model. We use these results as a guide for applying a more exhaustive grid search algorithm (71), from which the best tuning parameters of each learning method are obtained. These values are reported in SI Appendix, Fig. S10 and used for obtaining the results shown in Fig. 4C.

In addition to the validation curves, we have also estimated the learning curves, that is, the dependence of the training and the cross-validation scores on the size of the training set. This practice is also common when dealing with statistical learning algorithms, since very small training sets are usually not enough for fitting the model, while adding unnecessary data may introduce noise to the model. The results presented in SI Appendix, Fig. S10 show that the cross-validation score increases with the training size for all algorithms. However, this growth is practically not significant if the training set exceeds 50% of the data.

Acknowledgments

This research was supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior Grants 440650/2014-3 and 303642/2014-9 and Slovenian Research Agency Grants J1-7009 and P5-0027.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1800083115/-/DCSupplemental.

References

  • 1. Mantegna RN, Stanley HE. Introduction to Econophysics: Correlations and Complexity in Finance. Cambridge Univ Press; Cambridge, UK: 1999.
  • 2. Wang Z, et al. Statistical physics of vaccination. Phys Rep. 2016;664:1–113.
  • 3. Perc M, et al. Statistical physics of human cooperation. Phys Rep. 2017;687:1–51.
  • 4. Stanley HE. Phase Transitions and Critical Phenomena. Clarendon Press; Oxford: 1971.
  • 5. Michel JB, et al. Quantitative analysis of culture using millions of digitized books. Science. 2011;331:176–182. doi: 10.1126/science.1199644.
  • 6. Dodds PS, et al. Human language reveals a universal positivity bias. Proc Natl Acad Sci USA. 2015;112:2389–2394. doi: 10.1073/pnas.1411678112.
  • 7. Schich M, et al. A network framework of cultural history. Science. 2014;345:558–562. doi: 10.1126/science.1240064.
  • 8. Gonzalez MC, Hidalgo CA, Barabasi AL. Understanding individual human mobility patterns. Nature. 2008;453:779–782. doi: 10.1038/nature06958.
  • 9. Deville P, et al. Scaling identity connects human mobility and social interactions. Proc Natl Acad Sci USA. 2016;113:7047–7052. doi: 10.1073/pnas.1525443113.
  • 10. Onnela JP, et al. Structure and tie strengths in mobile communication networks. Proc Natl Acad Sci USA. 2007;104:7332–7336. doi: 10.1073/pnas.0610245104.
  • 11. Jiang ZQ, et al. Calling patterns in human communication dynamics. Proc Natl Acad Sci USA. 2013;110:1600–1605. doi: 10.1073/pnas.1220433110.
  • 12. Saramäki J, et al. Persistence of social signatures in human communication. Proc Natl Acad Sci USA. 2014;111:942–947. doi: 10.1073/pnas.1308540110.
  • 13. Hughes JM, Foti NJ, Krakauer DC, Rockmore DN. Quantitative patterns of stylistic influence in the evolution of literature. Proc Natl Acad Sci USA. 2012;109:7682–7686. doi: 10.1073/pnas.1115407109.
  • 14. Kuhn T, Perc M, Helbing D. Inheritance patterns in citation networks reveal scientific memes. Phys Rev X. 2014;4:041036.
  • 15. Perc M. Self-organization of progress across the century of physics. Sci Rep. 2013;3:1720.
  • 16. Sinatra R, Deville P, Szell M, Wang D, Barabási AL. A century of physics. Nat Phys. 2015;11:791–796.
  • 17. Sinatra R, Wang D, Deville P, Song C, Barabási AL. Quantifying the evolution of individual scientific impact. Science. 2016;354:aaf5239. doi: 10.1126/science.aaf5239.
  • 18. Balietti S, Goldstone RL, Helbing D. Peer review and competition in the art exhibition game. Proc Natl Acad Sci USA. 2016;113:8414–8419. doi: 10.1073/pnas.1603723113.
  • 19. Birkhoff GD. Aesthetic Measure. Vol 38. Harvard Univ Press; Cambridge, MA: 1933.
  • 20. Taylor RP, Micolich AP, Jonas D. Fractal analysis of Pollock’s drip paintings. Nature. 1999;399:422.
  • 21. Jones-Smith K, Mathur H. Fractal analysis: Revisiting Pollock’s drip paintings. Nature. 2006;444:E9–E10. doi: 10.1038/nature05398.
  • 22. Taylor RP, Micolich AP, Jonas D. Fractal analysis: Revisiting Pollock’s drip paintings (reply). Nature. 2006;444:E10–E11. doi: 10.1038/nature05398.
  • 23. Taylor RP, et al. Authenticating Pollock paintings using fractal geometry. Pattern Recognit Lett. 2007;28:695–702.
  • 24. Jones-Smith K, Mathur H, Krauss LM. Drip paintings and fractal analysis. Phys Rev E. 2009;79:046111. doi: 10.1103/PhysRevE.79.046111.
  • 25. De la Calleja EM, Cervantes F, De la Calleja J. Order-fractal transitions in abstract paintings. Ann Phys. 2016;371:313–322.
  • 26. Boon JP, Casti J, Taylor RP. Artistic forms and complexity. Nonlinear Dyn-Psychol Life Sci. 2011;15:265.
  • 27. Alvarez-Ramirez J, Ibarra-Valdez C, Rodriguez E. Fractal analysis of Jackson Pollock’s painting evolution. Chaos Solitons Fractals. 2016;83:97–104.
  • 28. Pedram P, Jafari GR. Mona Lisa: The stochastic view and fractality in color space. Int J Mod Phys C. 2008;19:855–866.
  • 29. Taylor R. Pollock, Mondrian and the nature: Recent scientific investigations. Chaos Complexity Lett. 2004;1:29.
  • 30. Hughes JM, Graham DJ, Rockmore DN. Quantification of artistic style through sparse coding analysis in the drawings of Pieter Bruegel the Elder. Proc Natl Acad Sci USA. 2010;107:1279–1283. doi: 10.1073/pnas.0910530107.
  • 31. Shamir L. Computer analysis reveals similarities between the artistic styles of Van Gogh and Pollock. Leonardo. 2012;45:149–154.
  • 32. Elsa M, Zenit R. Topological invariants can be used to quantify complexity in abstract paintings. Knowl-Based Syst. 2017;126:48–55.
  • 33. Castrejon-Pita JR, Castrejón-Pita AA, Sarmiento-Galán A, Castrejón-García R. Nasca lines: A mystery wrapped in an enigma. Chaos Interdiscip J Nonlinear Sci. 2003;13:836–838. doi: 10.1063/1.1587031.
  • 34. Koch M, Denzler J, Redies C. 1/f² characteristics and isotropy in the Fourier power spectra of visual art, cartoons, comics, mangas, and different categories of photographs. PLoS One. 2010;5:e12268. doi: 10.1371/journal.pone.0012268.
  • 35. Montagner C, Linhares JMM, Vilarigues M, Nascimento SMC. Statistics of colors in paintings and natural scenes. JOSA A. 2016;33:A170–A177. doi: 10.1364/JOSAA.33.00A170.
  • 36. Stork DG, Coddington J, editors. 2008. Computer Image Analysis in the Study of Art, Proceedings of SPIE (Int Soc Opt Photonics, Bellingham, WA), Vol 6810.
  • 37. Stork DG, Coddington J, Bentkowska-Kafel A, editors. 2010. Computer Vision and Image Analysis of Art, Proceedings of SPIE (Int Soc Opt Photonics, Bellingham, WA), Vol 7531.
  • 38. Stork DG, Coddington J, Bentkowska-Kafel A, editors. 2011. Computer Vision and Image Analysis of Art II, Proceedings of SPIE (Int Soc Opt Photonics, Bellingham, WA), Vol 7869.
  • 39. Kim D, Son SW, Jeong H. Large-scale quantitative analysis of painting arts. Sci Rep. 2014;4:7370. doi: 10.1038/srep07370.
  • 40. Lee B, Kim D, Jeong H, Sun S, Park J. 2017. Understanding the historic emergence of diversity in painting via color contrast. arXiv:1701.07164.
  • 41. Ushizima D, Manovich L, Margolis T, Douglass J. The Proceedings of the Sixth International AAAI Conference on Weblogs and Social Media. AAAI Press; Palo Alto, CA: 2012. Cultural analytics of large datasets from Flickr; pp. 30–34.
  • 42. Manovich L. Data science and digital art history. Int J Digital Art Hist. 2015;1:21631.
  • 43. Yazdani M, Chow J, Manovich L. Quantifying the development of user-generated art during 2001–2010. PLoS One. 2017;12:e0175350. doi: 10.1371/journal.pone.0175350.
  • 44. Bandt C, Pompe B. Permutation entropy: A natural complexity measure for time series. Phys Rev Lett. 2002;88:174102. doi: 10.1103/PhysRevLett.88.174102.
  • 45. Lopez-Ruiz R, Mancini HL, Calbet X. A statistical measure of complexity. Phys Lett A. 1995;209:321–326.
  • 46. Ribeiro HV, Zunino L, Lenzi EK, Santoro PA, Mendes RS. Complexity-entropy causality plane as a complexity measure for two-dimensional patterns. PLoS One. 2012;7:e40689. doi: 10.1371/journal.pone.0040689.
  • 47. Zunino L, Ribeiro HV. Discriminating image textures with the multiscale two-dimensional complexity-entropy causality plane. Chaos Solitons Fractals. 2016;91:679–688.
  • 48. Rosso OA, Larrondo HA, Martin MT, Plastino A, Fuentes MA. Distinguishing noise from chaos. Phys Rev Lett. 2007;99:154102. doi: 10.1103/PhysRevLett.99.154102.
  • 49. Wölfflin H. Principles of Art History: The Problem of the Development of Style in Later Art. Dover; Mineola, NY: 1950.
  • 50. Riegl A. Historical Grammar of the Visual Arts. Zone Books; New York: 2004.
  • 51. Gaiger J. The analysis of pictorial style. Br J Aesthetics. 2002;42:20–36.
  • 52. Danto AC, Goehr L. After the End of Art: Contemporary Art and the Pale of History. Vol 197. Princeton Univ Press; Princeton: 1997.
  • 53. Kleiner FS. Gardner’s Art through the Ages: The Western Perspective. 14th Ed. Vol II. Wadsworth; Boston: 2013.
  • 54. Hodge AN. A History of Art: Painting from Giotto to the Present Day. Arcturus; London: 2013.
  • 55. Blatt SJ, Blatt ES. Continuity and Change in Art: The Development of Modes of Representation. Routledge; New York: 1984.
  • 56. Boyd B. Evolutionary theories of art. In: Gottschall J, Wilson DS, editors. The Literary Animal. Northwestern Univ Press; Evanston, IL: 2005. pp. 147–176.
  • 57. Nadal M, Gómez-Puerto G. Evolutionary approaches to art and aesthetics. In: Tinio PPL, Smith JK, editors. The Cambridge Handbook of the Psychology of Aesthetics and the Arts. Cambridge Univ Press; Cambridge, UK: 2014. pp. 167–194.
  • 58. Ward JH Jr. Hierarchical grouping to optimize an objective function. J Am Stat Assoc. 1963;58:236–244.
  • 59. Rousseeuw PJ. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math. 1987;20:53–65.
  • 60. Swartz A. Pattern and Decoration: An Ideal Vision in American Art, 1975–1985. Hudson River Museum; New York: 2007.
  • 61. Chowdhury GG. Introduction to Modern Information Retrieval. Facet; London: 2010.
  • 62. Rosenberg A, Hirschberg J. Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Vol 7. Assoc Comput Linguistics; Stroudsburg, PA: 2007. V-measure: A conditional entropy-based external cluster evaluation measure; pp. 410–420.
  • 63. Hastie T, Tibshirani R, Friedman J. 2013. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer Series in Statistics (Springer, New York).
  • 64. Müller A, Guido S. Introduction to Machine Learning with Python: A Guide for Data Scientists. O’Reilly Media; Sebastopol, CA: 2016.
  • 65. Zujovic J, Gandy L, Friedman S, Pardo B, Pappas TN. IEEE Workshop on Multimedia Signal Processing, 2009. Inst Electr Electron Eng; New York: 2009. Classifying paintings by artistic genre: An analysis of features & classifiers; pp. 1–5.
  • 66. Agarwal S, Karnick H, Pant N, Patel U. 2015. Genre and style based painting classification. Available at https://ieeexplore.ieee.org/document/7045938/. Accessed August 1, 2016.
  • 67. van der Walt S, et al. scikit-image: Image processing in Python. PeerJ. 2014;2:e453. doi: 10.7717/peerj.453.
  • 68. Lamberti PW, Martin MT, Plastino A, Rosso OA. Intensive entropic non-triviality measure. Physica A Stat Mech its Appl. 2004;334:119–131.
  • 69. Martin M, Plastino A, Rosso OA. Generalized statistical complexity measures: Geometrical and analytical properties. Physica A. 2006;369:439–462.
  • 70. Reshef DN, et al. Detecting novel associations in large data sets. Science. 2011;334:1518–1524. doi: 10.1126/science.1205438.
  • 71. Pedregosa F, et al. Scikit-learn: Machine learning in Python. J Machine Learn Res. 2011;12:2825–2830.
