CineScale: A dataset of cinematic shot scale in movies

Mattia Savardi; András Bálint Kovács; Alberto Signoroni; Sergio Benini

doi:10.1016/j.dib.2021.107002

2021 Apr 20;36:107002. doi: 10.1016/j.dib.2021.107002

PMCID: PMC8090997 PMID: 33997191

Abstract

We provide a database containing shot scale annotations (i.e., the apparent distance of the camera from the subject of a filmed scene) for more than 792,000 image frames. Frames belong to 124 full movies from the entire filmographies of 6 important directors: Martin Scorsese, Jean-Luc Godard, Béla Tarr, Federico Fellini, Michelangelo Antonioni, and Ingmar Bergman. Each frame, extracted from videos at 1 frame per second, is annotated with one of the following scale categories: Extreme Close Up (ECU), Close Up (CU), Medium Close Up (MCU), Medium Shot (MS), Medium Long Shot (MLS), Long Shot (LS), Extreme Long Shot (ELS), Foreground Shot (FS), and Insert Shots (IS). Two independent coders annotated all frames from the 124 movies, whilst a third one checked their coding and made decisions in cases of disagreement. The CineScale database enables AI-driven interpretation of shot scale data and opens the way to a large set of research activities related to the automatic visual analysis of cinematic material, such as the automatic recognition of a director’s style, or the study of the relationship between shot scale and the viewers’ emotional experience. For these purposes, we also provide the model and the code for building a Convolutional Neural Network (CNN) architecture for automated shot scale recognition. All this material is provided through the project website, where video frames can also be requested from the authors for research purposes under fair use.

Keywords: Shot scale, Frames, Long shot, Close up, Medium shot, Video analysis, Movies, Film studies, CNN

Specifications Table

Subject	Arts (General), Film studies, Video content analysis
Specific subject area	Quantitative analysis of shot scale in films
Type of data	Table (.CSV format)
How data were acquired	Shot scale annotations were provided by three human coders (two coders plus a third who decided in cases of disagreement) in tabular form.
Data format	Raw
Parameters for data collection	Annotated frames were extracted at 1 frame per second (fps)
Description of data collection	The database contains the shot scale annotations for about 792,000 frames from 124 full movies from the entire filmographies of 6 authors: Martin Scorsese, Jean-Luc Godard, Béla Tarr, Federico Fellini, Michelangelo Antonioni, and Ingmar Bergman.
Data source location	Institution: University of Brescia; City/Town/Region: Brescia; Country: Italy; Latitude: 45.564664; Longitude: 10.231660
Data accessibility	Repository name: Mendeley Data; Data identification number: doi:10.17632/th46h4vdwd.1; Direct URL to data: https://doi.org/10.17632/th46h4vdwd.1; Project website: https://cinescale.github.io

Value of the Data

• The data can be used by both computer scientists and film scholars to perform quantitative analysis of shot scale in movies.
• The data are useful for developing models and classification strategies that automatically predict the shot scale in movies.
• With about 792,000 annotated frames, this dataset is of unprecedented size: more than 15 times larger than those published so far.
• There is evidence that the statistical distribution and the temporal pattern of some frequent cinematic features, such as shot scale, might act as a stylistic fingerprint of a specific director.
• By evaluating the statistics of this cinematographic feature, it is possible to investigate how shot scale influences the film viewing experience, for example through empathy-related processes.

1. Data Description

The data consist of shot scale annotations for about 792,000 frames, each assigned to one of the following 9 shot scale classes: Extreme Close Up (ECU), Close Up (CU), Medium Close Up (MCU), Medium Shot (MS), Medium Long Shot (MLS), Long Shot (LS), Extreme Long Shot (ELS), Foreground Shot (FS), and Insert Shots (IS).

1.1. Shot scale

Shot scale measures the relative distance of the camera from the main filmed object, and as such, it is tied to the size of the human figures on the screen and the relative ratio of the foreground to the background [1].

Shot scale is therefore one of the most important cinematic features of any film. Among the cinematic techniques relevant to viewers’ responses [2], it affects both low- and high-complexity responses, with a strong potential to influence how fictional narratives are processed and experienced [3], [4], [5]. Recent findings indicate that shot scale impacts viewers’ responses related to character engagement, such as theory of mind [6], [7], [8], emotion recognition [9], and empathic care [10]. Shot scale has also been systematically related to viewers’ ratings of film mood and narrative engagement [11]. Furthermore, especially in art cinema, preferences for specific shot scales can be important indicators of a particular style: previous work [12] indicates that statistical analysis of the overall shot scale distribution and of shot scale transitions in films may reveal consistent and recurrent patterns in the works of a specific author. As shown in [13], this can even support automatic attribution of a movie’s authorship from the statistical analysis of shot scale, which can itself be computed automatically as in [14], [15].

Very few databases containing shot scale information exist. Several pioneering computational works rely on the Cinemetrics.lv database [16] to scrutinize the history of cinematic phenomena such as cut frequency in editing and shot scale (see for example [17]), examining how they vary across genres, historical patterns, etc. However, since Cinemetrics.lv provides various annotations for the same content without the possibility of viewing the related video shots, it is often difficult to relate and synchronize annotations with the visual content, especially across different distributions of the same movie. Very recently the authors of MovieNet, the massive project described in [18], annotated a total of 47K shots from movies and trailers, each with one tag for view scale and one tag for camera movement. Although remarkable for its variety, that dataset is still very limited in size compared to the CineScale dataset.

1.2. Categories

In Fig. 1 we show some visual examples which illustrate how shot scale is mapped into the nine categories (see Footnotes).

Seven of the classes shown in Fig. 1 can be considered the main categories: Extreme Close Up (ECU), Close Up (CU), Medium Close Up (MCU), Medium Shot (MS), Medium Long Shot (MLS), Long Shot (LS), and Extreme Long Shot (ELS). In this work we also include two additional scales: Foreground Shot (FS), in which two different shot scales, at least two categories away from each other (e.g., ECU and MCU or longer, or CU and MS or longer), are found in the same image (see also examples in Fig. 2); and Insert Shots (IS), which include frames with textual credits, producer logos, or a totally black background (see also examples in Fig. 3).
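The distance rule for Foreground Shots can be made concrete with a minimal sketch. It assumes that the seven main categories are ordered from closest to farthest as listed above; the function name and the idea of passing in the two scales present in a frame are illustrative only, since in the dataset this judgement was made visually by the human coders.

```python
# Seven main categories, ordered from closest to farthest (order taken from the text).
MAIN_SCALES = ["ECU", "CU", "MCU", "MS", "MLS", "LS", "ELS"]


def is_foreground_shot(scale_a: str, scale_b: str) -> bool:
    """Return True if two scales seen in one frame are at least two categories apart.

    Hypothetical helper, not part of the CineScale tooling.
    """
    distance = abs(MAIN_SCALES.index(scale_a) - MAIN_SCALES.index(scale_b))
    return distance >= 2
```

For instance, ECU together with MCU satisfies the rule, while CU together with MCU (adjacent categories) does not.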

Fig. 2 — Examples of Foreground Shots, adapted using the method in [19].

Fig. 3 — Examples of Insert Shots, adapted using the method in [19].

1.3. Data features

The full list of movies divided by director, with the year of production and original title, is provided in Fig. 5. Movies are approximately half in black-and-white (b&w) and half in color, since they cover more than 70 years of movie history (see Fig. 4).

Fig. 4 — Distribution of the movies vs. the publication year. The orange dots represent black & white movies, while blue dots the ones in color. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

The six directors are widely regarded in the historiography of author cinema as highly distinctive and distinguishable; their films therefore present a variety of experimental aesthetic situations, rich scene compositions, and often unconventional and symbolic content.

Table 1 presents the distribution of shot scale for each director, which is also rendered graphically in Fig. 6.

Table 1.

Average number of frames per movie and average shot scale distribution for each director.

Director	Frames per movie (avg)	FS	ECU	CU	MCU	MS	MLS	LS	ELS	IS	NA
Antonioni	6,247.38	0.25	0.01	0.13	0.21	0.14	0.12	0.04	0.01	0.01	0.00
Bergman	5,24.19	0.13	0.06	0.24	0.24	0.09	0.08	0.04	0.01	0.01	0.00
Fellini	7,096.39	0.26	0.01	0.21	0.24	0.10	0.12	0.04	0.02	0.01	0.00
Godard	5,151.29	0.11	0.04	0.16	0.27	0.15	0.14	0.05	0.03	0.02	0.00
Scorsese	7,931.35	0.22	0.01	0.09	0.43	0.09	0.09	0.04	0.02	0.03	0.00
Tarr	8,894.90	0.34	0.08	0.31	0.09	0.06	0.06	0.04	0.01	0.00	0.01

Fig. 6 — Shot scale distribution for each director (0=FS, 1=ECU, 2=CU, 3=MCU, 4=MS, 5=MLS, 6=LS, 7=ELS, 8=IS, 9=NA).
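A distribution of the kind reported in Table 1 could be computed from per-frame labels along the following lines. This is a sketch under assumptions: it takes the integer codes from the Fig. 6 caption as the label encoding (the released files may encode classes differently), and it averages per-movie distributions, whereas the text does not state whether Table 1 averages per movie or pools all frames of a director.

```python
from collections import Counter

# Integer codes as given in the Fig. 6 caption (assumed here to be the label encoding).
CODE_TO_LABEL = {0: "FS", 1: "ECU", 2: "CU", 3: "MCU", 4: "MS",
                 5: "MLS", 6: "LS", 7: "ELS", 8: "IS", 9: "NA"}


def scale_distribution(codes):
    """Return the relative frequency of each class in a sequence of per-frame codes."""
    counts = Counter(codes)
    total = len(codes)
    return {label: (counts.get(code, 0) / total if total else 0.0)
            for code, label in CODE_TO_LABEL.items()}


def average_distribution(movies):
    """Average the per-movie distributions; `movies` holds one list of codes per movie."""
    per_movie = [scale_distribution(codes) for codes in movies]
    return {label: sum(d[label] for d in per_movie) / len(per_movie)
            for label in CODE_TO_LABEL.values()}
```

Averaging per movie gives each film equal weight regardless of its length; pooling frames instead would weight longer films more heavily.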

Fig. 7 shows the transitions between shot scale classes for each director as transition matrices. Because scale is annotated every second, the probabilities of moving to a different scale are much smaller than those of staying in the same one. The values in the matrices are therefore normalized separately: the probabilities of staying in the same class (in blue, on the diagonal) are normalized to the maximum intra-class transition value, while the probabilities of switching to a different class (in red) are normalized to the maximum inter-class transition value.
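The separate normalization of diagonal and off-diagonal entries can be sketched as follows. The sketch assumes integer class codes from 0 to 9 (as in the Fig. 6 caption) and assumes that counts are first converted to row-wise transition probabilities before rescaling; the text does not spell out either detail, and the function names are illustrative.

```python
def transition_counts(codes, n_classes=10):
    """Count transitions between consecutive per-second annotations."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for src, dst in zip(codes, codes[1:]):
        counts[src][dst] += 1
    return counts


def normalized_transitions(codes, n_classes=10):
    """Row-normalize to probabilities, then rescale diagonal and off-diagonal separately."""
    counts = transition_counts(codes, n_classes)
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    # Largest probability of staying in a class, and of switching class.
    max_diag = max(probs[i][i] for i in range(n_classes))
    max_off = max(probs[i][j] for i in range(n_classes)
                  for j in range(n_classes) if i != j)
    return [[(probs[i][j] / max_diag if max_diag else 0.0) if i == j
             else (probs[i][j] / max_off if max_off else 0.0)
             for j in range(n_classes)]
            for i in range(n_classes)]
```

After rescaling, the largest intra-class entry and the largest inter-class entry both equal 1, so the two groups become comparable on a shared color scale even though their raw probabilities differ greatly.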

2. Experimental Design, Materials and Methods

2.1. Data extraction

Before being annotated, frames were extracted at 1 fps, using ffmpeg version 3.4.8, from 124 full movies which correspond to the entire filmographies of 6 important film authors: Martin Scorsese, Jean-Luc Godard, Béla Tarr, Federico Fellini, Michelangelo Antonioni, and Ingmar Bergman. The sampling rate of 1 fps was chosen because a single camera take may contain several scales, e.g., whenever the camera or the objects in the image are moving.
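A comparable extraction could be scripted as below. The text states only the tool and the sampling rate, so the exact options the authors used are not known; this sketch relies on ffmpeg's standard `fps` video filter, and the file naming pattern, output format and function names are assumptions.

```python
import subprocess
from pathlib import Path


def build_extract_command(video_path: str, out_dir: str, fps: int = 1) -> list:
    """Build an ffmpeg command that samples `fps` frames per second as JPEG files."""
    pattern = str(Path(out_dir) / "frame_%06d.jpg")  # hypothetical naming scheme
    return ["ffmpeg", "-i", video_path, "-vf", f"fps={fps}", pattern]


def extract_frames(video_path: str, out_dir: str, fps: int = 1) -> None:
    """Run the extraction; requires ffmpeg to be available on the PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_extract_command(video_path, out_dir, fps), check=True)
```

Calling `extract_frames("movie.mp4", "frames")` would write roughly one image per second of video into the `frames` directory.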

Shot scale annotations were provided by three human coders (two coders plus a third who decided in cases of disagreement) in tabular form, as shown in Fig. 8. Annotations can be retrieved from Mendeley Data.
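The two-coder scheme with a third adjudicator can be illustrated with a short sketch. It is a simplification under assumptions: it treats agreement between the two coders as final, although the abstract notes that the third coder also checked their coding, and it assumes access to per-coder labels, which may not be part of the released files. All names are hypothetical.

```python
def resolve_annotation(coder_a: str, coder_b: str, adjudicator=None) -> str:
    """Return the final label for one frame.

    If the two coders agree, their label stands; otherwise the third
    coder's decision is used.
    """
    if coder_a == coder_b:
        return coder_a
    if adjudicator is None:
        raise ValueError("coders disagree and no adjudicator decision was given")
    return adjudicator


def agreement_rate(labels_a, labels_b) -> float:
    """Fraction of frames on which the two coders gave the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    if not labels_a:
        return 0.0
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)
```

The raw agreement rate is only a first indicator; a chance-corrected measure would be needed for a proper reliability analysis.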

2.2. Materials

The project website provides additional material. First, the authors can be contacted directly to request access to the video frames under a fair use agreement. Second, the website provides the model and the code for building a Convolutional Neural Network (CNN) architecture for automated shot scale recognition. In [14] we tested three CNNs of increasing capacity (AlexNet, GoogLeNet, VGG-16), each trained with an extensive hyperparameter selection process. The best performing CNNs all start from weights pre-trained on ImageNet and are then fine-tuned on the shot scale dataset. Among these networks, VGG-16 performed best, reaching 94% accuracy on the three scales. The project website includes an updated version of the model based on DenseNet, which improves the overall accuracy by 3% with respect to the best model presented in [14]. All related publications [11], [13], [14], which contain comparisons with the state of the art in the field, can also be found on the project website.

Ethics Statement

No human subjects were involved in this work.

CRediT Author Statement

Mattia Savardi: Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing - review & editing, Visualization; András Bálint Kovács: Conceptualization, Methodology, Formal analysis, Investigation, Resources, Data curation; Alberto Signoroni: Methodology, Investigation, Resources, Writing - review & editing, Supervision, Funding acquisition; Sergio Benini: Conceptualization, Methodology, Formal analysis, Investigation, Data curation, Writing - original draft, Writing - review & editing, Supervision, Project administration.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.

Acknowledgements

The authors would like to thank Dr. Michele Svanera for his previous work on shot scale analysis of art films.

Footnotes

Original frames in Fig. 1 have been adapted using the method in [19] to avoid copyright infringement.

References

1. Salt B. 2006. Moving into pictures: More on film history, style, and analysis.
2. Lankhuizen T., Balint K., Savardi M., Konijn E., Bartsch A., Benini S. Shaping film: a quantitative formal analysis of contemporary empathy-eliciting hollywood cinema. Psychology of Aesthetics Creativity and the Arts. 2020. doi: 10.1037/aca0000356.
3. Codispoti M., De Cesarei A. Arousal and attention: picture size and emotional reactions. Psychophysiology. 2007;44(5):680–686. doi: 10.1111/j.1469-8986.2007.00545.x.
4. Canini L., Benini S., Leonardi R. Affective recommendation of movies based on selected connotative features. IEEE Trans. Circuits Syst. Video Technol. 2013;23(4):636–647. doi: 10.1109/TCSVT.2012.2211935.
5. Reeves B., Lang A., Kim E., Tartar D. 1999. The effects of screen size and message content on attention and arousal media psychology.
6. Bálint K., Klausch T., Pólya T. Watching closely. J. Media Psychol. 2018;30(3):150–159. doi: 10.1027/1864-1105/a000189.
7. Rooney B., Bálint K.E. Watching more closely: shot scale affects film viewers’ theory of mind tendency but not ability. Front. Psychol. 2018;8:2349. doi: 10.3389/fpsyg.2017.02349.
8. Balint K., Blessing J., Rooney B. Shot scale matters: the effect of close-up frequency on mental state attribution in film viewers. Poetics. 2020;83. doi: 10.1016/j.poetic.2020.101480.
9. Cutting J., Armstrong K.L. Facial expression, size, and clutter: inferences from movie structure to emotion judgments and back. Attention, Perception & Psychophysics. 2016;78:891–901. doi: 10.3758/s13414-015-1003-5.
10. Cao X. The effects of facial close-ups and viewers’ sex on empathy and intentions to help people in need. Mass Communication and Society. 2013;16:161–178. doi: 10.1080/15205436.2012.683928.
11. Benini S., Savardi M., Balint K., Kovacs A., Signoroni A. On the influence of shot scale on film mood and narrative engagement in film viewers. IEEE Trans. Affect. Comput. 2019:1. doi: 10.1109/taffc.2019.2939251.
12. Raz G., Valente G., Svanera M., Benini S., Kovács A.B. A robust neural fingerprint of cinematic shot-scale. Projections. 01 Dec. 2019;13(3):23–52. doi: 10.3167/proj.2019.130303. https://www.berghahnjournals.com/view/journals/projections/13/3/proj130303.xml
13. Svanera M., Savardi M., Signoroni A., Kovács A.B., Benini S. Who is the film’s director? authorship recognition based on shot features. IEEE Multimedia. 2019;26(4):43–54. doi: 10.1109/MMUL.2019.2940004.
14. Savardi M., Signoroni A., Migliorati P., Benini S. 2018 25th IEEE International Conference on Image Processing (ICIP) 2018. Shot scale analysis in movies by convolutional neural networks; pp. 2620–2624.
15. Benini S., Svanera M., Adami N., Leonardi R., Kovács A.B. Shot scale distribution in art films. Multimed. Tools Appl. 2016;75(23):16499–16527. doi: 10.1007/s11042-016-3339-9.
16. Tsivian Y. Cinemetrics, part of the humanities’s cyberinfrastructure. Digital Tools in Media Studies. 2009;9(93–100):94.
17. Cutting J.E. Narrative theory and the dynamics of popular movies. Psychonomic bulletin & review. 2016;23(6):1713–1743. doi: 10.3758/s13423-016-1051-4.
18. Huang Q., Xiong Y., Rao A., Wang J., Lin D. The European Conference on Computer Vision (ECCV) 2020. Movienet: A holistic dataset for movie understanding.
19. Johnson J., Alahi A., Fei-Fei L. European conference on computer vision. Springer; 2016. Perceptual losses for real-time style transfer and super-resolution; pp. 694–711.
