Abstract
We analyze repeated cross‐sectional survey data collected by the Institute of Global Health Innovation to characterize the perception and behavior of the Italian population during the covid‐19 pandemic, focusing on the period from April 2020 to July 2021. To this end, we propose a Bayesian dynamic latent‐class regression model that accounts for sampling bias by including survey weights in the likelihood function. In the proposed approach, attitudes towards covid‐19 are described via ideal behaviors that are fixed over time and correspond to different degrees of compliance with spread‐preventive measures. The overall tendency toward a specific profile changes dynamically across survey waves via a latent Gaussian process regression that adjusts for subject‐specific covariates. We illustrate the evolution of Italians' behaviors during the pandemic, providing insights on how the proportion of ideal behaviors varied during the phases of the lockdown, while measuring the effect of age, sex, region and employment of the respondents on the attitude toward covid‐19.
Keywords: Bayesian inference, categorical data, dynamic modeling, repeated cross‐sectional data, survey weights
1. INTRODUCTION
The outbreak of the covid‐19 pandemic has impacted our world, with more than 250 million infected people and more than 5 million deaths as of December 2021. 1 From an economic perspective, the covid‐19 outbreak and the national measures to contain the spread of the disease led to severe economic recessions in many sectors, such as tourism, accommodation, and food services. 2 Psychological and social effects on the population are less straightforward to quantify, but some preliminary results suggest that the pandemic has also affected these aspects of human life. 3 , 4 , 5
In this article, we study the evolution of behaviors in compliance with some measures that prevent the spread of covid‐19 (see Table 1). We focus on the Italian population, which represents an interesting case study because Italy was the first European country to introduce a national lockdown to limit the spread of covid‐19, imposing behavioral changes on the population. The adopted measures are likely to have had a psychological and social effect proportionate to their severity and duration. In fact, some early reports, based on online surveys, suggest an increased level of distress, anxiety, and fear. 6 , 7 , 8 Monitoring these aspects, quantifying their evolution over time, and characterizing their impact on individuals is of primary importance to evaluate the wellbeing of a population. Similarly, it is of interest to evaluate the compliance with covid‐19 policies, which can largely depend on personal status. 9
TABLE 1.
List of analyzed survey items with code, label, and description
| Survey Code | Label | Description |
|---|---|---|
| i12_health_1 | ih1 | Worn a face mask outside your home (e.g. when on public transport going to a supermarket or going to a main road) |
| i12_health_2 | ih2 | Washed hands with soap and water |
| i12_health_3 | ih3 | Used hand sanitiser |
| i12_health_4 | ih4 | Covered your nose and mouth when sneezing or coughing |
| i12_health_5 | ih5 | Avoided contact with people who have symptoms or you think may have been exposed to the coronavirus |
| i12_health_6 | ih6 | Avoided going out in general |
| i12_health_7 | ih7 | Avoided going to hospital or other healthcare settings |
| i12_health_8 | ih8 | Avoided taking public transport |
| i12_health_11 | ih11 | Avoided having guests to your home |
| i12_health_12 | ih12 | Avoided small social gatherings (not more than 2 people) |
| i12_health_13 | ih13 | Avoided medium‐sized social gatherings (between 3 and 10 people) |
| i12_health_14 | ih14 | Avoided large‐sized social gatherings (more than 10 people) |
| i12_health_15 | ih15 | Avoided crowded areas |
| i12_health_16 | ih16 | Avoided going to shops |
Note: Subjects can respond to each item with “Not at all”, “Rarely”, “Sometimes”, “Frequently”, and “Always”, according to their level of agreement with each survey item.
Variations of personal routines and practices have been documented; for example, the reduction of social activities and gatherings and the changes in mobility patterns, such as the increase of work‐from‐home practices and the reduction of public transport use. 10 , 11 These behaviors reflect the compliance with the government regulations, as well as the internalization of different recommendations to reduce the spread of the virus, that changed day‐to‐day life. However, compliance with spread reducing measures depends upon personal conditions, psychological status, and many other personal factors, 12 and it is likely to change during different stages of the pandemic.
During the first phase of the pandemic (up to June 2020), there is evidence that the Italian population scrupulously followed government measures; 13 , 14 however, to the best of our knowledge, no analysis has been performed to assess whether such compliance is constant over time, across socio‐demographic groups, or proportional to the severity of the measures adopted by the Italian government throughout the pandemic. To quantify these aspects, we analyze survey data provided by the Institute of Global Health Innovation (ighi) at Imperial College London, in collaboration with the company YouGov, 15 described in Section 3.
We describe the attitude of the Italian population toward covid‐19 throughout the pandemic with a dynamic Bayesian latent class regression model. 16 , 17 Such models assume that the population is composed of groups corresponding to ideal behaviors, that can represent different attitudes towards covid‐19. At any given time, each subject composing the population is associated with one of these ideal profiles, and the categorical variables characterizing behaviors are modeled as conditionally independent given the profile memberships. Latent class models are conceptually simple and have been used as a building block for several methods to analyze categorical variables; for example, in problems including survey weights, 18 , 19 when interest is on characterizing temporal dependence across contingency tables 20 or differences among groups of subjects. 21 See also Chapters 9 and 11.5 of Handbook of mixture analysis 22 for further references.
The article is structured as follows: Section 2 contextualizes our contribution within the current literature, while Section 3 describes the survey data analyzed in this article; Section 4 introduces the proposed dynamic Bayesian latent‐class regression model, and Section 5 illustrates the results and empirical findings. Finally, Section 6 provides a brief discussion.
2. RELATED WORK
During the past two years, there has been a broad interest in modeling the evolution of the covid‐19 pandemic. Indeed, incidence data such as the number of positive individuals, hospitalizations, and intensive care unit admissions have been systematically collected and released to the public on a regular basis; for example, in Italy they have been released daily by the Department of “Protezione Civile” (https://github.com/pcm‐dpc/COVID‐19). Accurate modeling of these data has proven useful to measure the current state of the pandemic, to develop strategies based on empirical evidence, and to evaluate the success of different policies adopted by the governments. In this context, Girardi et al 23 developed a robust non‐linear regression to model and predict the contagion dynamics of covid‐19 in Italy. Alaimo Di Loro et al 24 introduced a novel generalized linear model based on Richards' curves, obtaining accurate short‐term forecasts of incidence indicators. Girardi et al 25 also proposed a change‐point growth model that is able to capture subsequent pandemic waves, while Scrucca 26 developed a real‐time index that summarizes the current state of the pandemic. We also refer to References 27, 28, 29, 30 for additional modeling strategies and analyses of Italian incidence data. Similar analyses have been developed at the European 31 , 32 , 33 and global level. 34 , 35 , 36
Less attention has been devoted to studying the impact of covid‐19 on individual attitudes, analyzing personal behaviors and their interactions with compliance with preventive measures. For example, Krekel et al 37 studied the associations between happiness and level of compliance with government regulations. Behavioral data have also been used as important predictors of the number of covid‐19 cases. 38 In Italy, Duradoni et al 39 studied the psychological profile of people that were compliant with the government regulations a month after the lockdown started, while Guazzini et al 40 tested changes in the psychological adaptation across the first two waves of the pandemic. In this context, we consider the evolution of the compliance of the Italian population, quantifying variations of this compliance over time and across socio‐demographic groups, and studying its associations with the severity of the measures adopted by the Italian government throughout the pandemic.
3. DATA DESCRIPTION
We analyze repeated cross‐sectional survey data provided by the Institute of Global Health Innovation (ighi) at Imperial College London, in collaboration with the company YouGov. 15 Data are publicly available for research purposes at the repository https://github.com/YouGov‐Data/covid‐19‐tracker, along with a brief description of the collected variables. This survey aims to investigate how different populations responded to covid‐19, gathering self‐reported data on several aspects of the pandemic, including objective measurements, such as testing results and observed symptoms, and subjective measurements, such as daily behaviors.
We are interested in subjective measurements describing the compliance with national preventive regulations. These measurements include questions on self‐isolation, avoidance of social gatherings, frequency of hand washing, use of hand sanitizer and contact with other people, among many others. The complete list of variables used in this study is reported in Table 1, which includes 14 out of the 20 subjective measurements available in the survey. The removed items were considered uninformative about the population behavior in relation to the covid‐19 measures adopted by the Italian government. For example, one removed question asked whether children living in the same household were going to school; however, schools were closed for most of the pandemic and students could only attend lectures remotely.
Publicly available data cover survey waves conducted from April 2020 to July 2021. Notably, the survey waves were not administered at regular time intervals; for example, four survey waves were administered in each of April and May 2020, but only two in each of July and August 2020; see Table 2 for details on the administration dates. Without loss of generality, we will indicate this time grid with . In each wave , cross‐sectional data were collected for a representative sample of 1000 statistical units, indexed as . Each unit is associated with a sampling weight and a vector of covariates , including age, sex, region of residence (“North‐West”, “North‐East”, “Center”, “South”, “Islands”) and employment status (“Full‐time employment”, “Part‐time employment”, “Not working”, “Student” and “Retired”). Note that unit at wave indicates a different subject than unit at wave ; for this reason, we include the wave index in the covariate vector as , even if the covariates can be considered fixed over time. Refer to Jones 15 for further information.
TABLE 2.
Calendar date of the 37 analyzed survey waves
| Wave | Date | Wave | Date | Wave | Date | Wave | Date |
|---|---|---|---|---|---|---|---|
| 1 | 2 Apr 2020 | 11 | 8 Jul 2020 | 21 | 16 Dec 2020 | 31 | 5 May 2021 |
| 2 | 8 Apr 2020 | 12 | 23 Jul 2020 | 22 | 6 Jan 2021 | 32 | 19 May 2021 |
| 3 | 16 Apr 2020 | 13 | 7 Aug 2020 | 23 | 13 Jan 2021 | 33 | 2 Jun 2021 |
| 4 | 26 Apr 2020 | 14 | 19 Aug 2020 | 24 | 27 Jan 2021 | 34 | 16 Jun 2021 |
| 5 | 1 May 2020 | 15 | 3 Sep 2020 | 25 | 10 Feb 2021 | 35 | 30 Jun 2021 |
| 6 | 7 May 2020 | 16 | 12 Sep 2020 | 26 | 24 Feb 2021 | 36 | 14 Jul 2021 |
| 7 | 13 May 2020 | 17 | 2 Oct 2020 | 27 | 10 Mar 2021 | 37 | 28 Jul 2021 |
| 8 | 29 May 2020 | 18 | 15 Oct 2020 | 28 | 24 Mar 2021 | | |
| 9 | 10 Jun 2020 | 19 | 28 Oct 2020 | 29 | 7 Apr 2021 | | |
| 10 | 25 Jun 2020 | 20 | 11 Nov 2020 | 30 | 21 Apr 2021 | | |
This data collection scheme is commonly referred to as “repeated cross‐sectional” or “pseudo‐panel” in the literature. We refer to Verbeek 41 for an extensive discussion on the advantages and limitations of repeated cross‐sectional data compared to longitudinal data. In longitudinal data, temporal variations can be modeled, for example, relying on Hidden Markov Models; refer to Latent Markov models for longitudinal data 42 and references therein for more details. These approaches model the transitions across ideal behaviors for each individual, and allow the effects of covariates to be included directly in the transition probabilities that characterize the time dynamics. In our application, we cannot characterize individual trajectories due to the cross‐sectional nature of the survey data. This is in fact a major limitation of cross‐sectional data, along with their inability to monitor changes in personal attitudes. On the other hand, repeated cross‐sectional data make it possible to effectively model average tendencies of a population across time and, since they are less affected by attrition, they typically present larger sample sizes than longitudinal data. 41 Therefore, in Section 4, we describe a dynamic regression model for repeated cross‐sectional data that can characterize average variations of the attitude of the Italian population during the pandemic.
4. METHODS
4.1. Model specification and interpretation
Let denote the responses of subject to the survey items outlined in Table 1 during survey wave ; without loss of generality, we encode these responses as for , with and . We indicate with the observed responses for the interviewed subjects during the survey wave . Our main goal is to study the evolution of the population's behaviors over time and during different stages of the covid‐19 pandemic. To accomplish this goal, we specify a low‐dimensional dynamic model for multivariate categorical data, that characterizes the time‐varying probability mass function of in terms of a set of static latent classes and dynamic class‐memberships. Since at each survey wave we observe statistical units chosen according to a survey design, we adjust the proposed model likelihood to obtain an estimate of the population parameters.
In order to describe the evolution of over time, we assume that the population can be divided into ideal behaviors (or profiles) that express different attitudes toward covid‐19. These ideal profiles can correspond to different behavioral patterns, such as people that rigidly observed all rules and directives to avoid the spread of the disease, or people that kept behaving as usual, ignoring the emergency. The interpretation and the structure of the ideal behaviors is considered fixed over time, while their proportion can change dynamically. For example, it is likely that, before seeing the effect of covid‐19, several people were not concerned about using public transport or having guests at home. However, as the disease spread and the effects became more evident, these inclinations potentially reversed for a subset of individuals.
This evolution is modeled relying on a dynamic latent‐class regression model, graphically described in Figure 1. For each wave , subject is associated with one of the profiles via the discrete latent variable with probability that depends on the observed covariate vector . Given the profile membership , the probability that subject responds with to the survey item is denoted as , with and . Therefore, we assume that the profiles' attributes are constant over time, and can be characterized by the conditional probabilities of responding with a certain category to the different items, namely . Within each mixture component, the observed survey items are modeled as a realization from a product of independent multinomial distributions. Under this conditional independence assumption, the dependence structure of the survey items is entirely induced by marginalizing over the latent profile memberships.
FIGURE 1.

Graphical representation of the mechanism to generate data from model (1)
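The two-step mechanism described above (draw a latent profile, then draw each item independently given that profile) can be illustrated with a short simulation. This is only a sketch: the function name, the membership probabilities, and the item probabilities are hypothetical values chosen for illustration, and only two items are used instead of the 14 in Table 1.

```python
import random


def simulate_subject(membership_probs, item_probs, rng):
    """Draw one subject's latent profile and item responses.

    membership_probs: list of H profile-membership probabilities.
    item_probs: item_probs[h][j] lists the category probabilities
        of item j within profile h.
    Returns the profile index and one category index per item.
    """
    n_profiles = len(membership_probs)
    # Step 1: draw the latent profile from the membership probabilities.
    z = rng.choices(range(n_profiles), weights=membership_probs, k=1)[0]
    # Step 2: given the profile, draw every item independently.
    y = [rng.choices(range(len(p)), weights=p, k=1)[0] for p in item_probs[z]]
    return z, y


# Hypothetical values: 3 profiles, 2 items, 5 ordered categories
# ("Not at all", "Rarely", "Sometimes", "Frequently", "Always").
membership = [0.6, 0.3, 0.1]
item_probs = [
    [[0.02, 0.03, 0.05, 0.20, 0.70], [0.05, 0.05, 0.10, 0.20, 0.60]],
    [[0.05, 0.10, 0.25, 0.30, 0.30], [0.10, 0.20, 0.30, 0.25, 0.15]],
    [[0.40, 0.25, 0.20, 0.10, 0.05], [0.50, 0.20, 0.15, 0.10, 0.05]],
]

rng = random.Random(0)
z, y = simulate_subject(membership, item_probs, rng)
```

Marginally over the latent profile, the items simulated this way are dependent even though they are independent within each profile, which is the point made in the text.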
Potentially, we could rely on alternative model specifications, exploiting the ordering of the survey items categories. For example, we could let the difference of probabilities across adjacent categories for each item to be constant, effectively reducing the number of parameters in the model. However, this formulation implies that the distributions of the survey items are stochastically ordered. This assumption might be non‐trivial to check empirically, in particular if several items are modeled jointly; therefore we prefer to follow the common practice of using a latent class model for multivariate ordered categorical data, since this model is robust to violations of ordering assumptions. 43 , 44
The profile‐specific membership varies across the survey waves and according to subject‐specific covariates. Specifically, the evolution of the profile memberships is modeled through the probability vector , with denoting the probability that subject of survey wave belongs to the ‐th profile and . This parameter is decomposed into two quantities: a profile‐specific component and a subject‐specific effect that linearly depends on their observed covariates, with . The dynamic component characterizes the temporal evolution of the profiles' memberships, while the linear term captures the effect of subject‐specific covariates.
Separating the individual component from the profile‐specific component, which is shared across the population, is particularly relevant in our setting, where we analyze repeated cross‐sectional data that include different subjects across survey waves. Indeed, this structure allows us to estimate the shared dynamic component , adjusting for the different composition of the cross‐sectional population, and measuring the effect of demographic information on the probability of belonging to a certain profile.
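As a numerical illustration of this decomposition, the sketch below computes membership probabilities from a wave-specific intercept plus a linear covariate effect. It assumes a multinomial-logit link with the first profile as reference; the function name, argument names, and all numeric values are hypothetical.

```python
import math


def membership_probabilities(alpha_t, beta, x):
    """Profile-membership probabilities for one subject at one wave.

    alpha_t: dynamic intercepts at this wave for profiles 2..H.
    beta: one coefficient vector per profile 2..H.
    x: covariate vector of the subject.
    The first profile is the reference, with linear predictor 0.
    """
    eta = [0.0] + [
        a + sum(b_k * x_k for b_k, x_k in zip(b, x))
        for a, b in zip(alpha_t, beta)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(eta)
    w = [math.exp(e - m) for e in eta]
    total = sum(w)
    return [v / total for v in w]


# Hypothetical intercepts, coefficients, and covariates.
probs = membership_probabilities(
    alpha_t=[0.5, -1.0],
    beta=[[0.1, -0.2], [0.3, 0.0]],
    x=[1.0, 2.0],
)
```

Because the intercept depends only on the wave and the coefficients only on the profile, two subjects interviewed in the same wave differ in their membership probabilities only through their covariates.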
According to these specifications, the model depicted in Figure 1 can be formalized as
| (1) |
Using the first latent profile as a reference, we let and for and interpret each as the effect of the th covariate on the log‐odds of belonging to profile , instead of the first one, as in multinomial logit regression; refer, for example, to Reference 44. The conditional independence assumption among the categorical variables, and the inclusion of the temporal component in the mixture weights, leads to substantial dimensionality reduction in the number of model parameters, while incorporating borrowing of information across survey waves. The effect of time in the model (1) can be made explicit as follows. Let denote an ‐dimensional design matrix obtained stacking the vectors by rows; that is, the first rows of correspond to the observation of the individuals interviewed at the first survey wave , the rows from to correspond to the individuals interviewed at the second survey wave , and so on. In matrix form, the linear predictor of the second line of Equation (1) can be expressed as
where is a ‐dimensional matrix which identifies the survey wave of each observation, and is a vector of ones with length ; similarly, is a ‐dimensional vector containing the values of the dynamic intercepts. This specification illustrates that time variations in the mean composition of each profile are characterized non‐parametrically, since the model includes a different value of the intercept for each survey wave, instead of assuming a parametric relationship between consecutive values of .
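As a reading aid, one formulation consistent with the verbal description of Equation (1) and of its matrix form is sketched below. All symbols are assumed notation, not necessarily the authors' own: $y_{ijt}$ is the response of subject $i$ to item $j$ at wave $t$, $z_{it}$ the latent profile, $\psi^{(j)}_{hc}$ the probability of category $c$ for item $j$ within profile $h$, $\nu_{ith}$ the membership probability, $\alpha_{ht}$ the dynamic intercept, $\beta_h$ the covariate effects, $p$ the number of items, $d_j$ the number of categories of item $j$, and $H$ the number of profiles.

$$
\begin{aligned}
(y_{ijt} \mid z_{it} = h) &\sim \mathrm{Categorical}\big(\psi^{(j)}_{h1}, \dots, \psi^{(j)}_{h d_j}\big), \qquad j = 1, \dots, p, \\
\mathrm{pr}(z_{it} = h \mid x_{it}) = \nu_{ith} &= \frac{\exp\big(\alpha_{ht} + x_{it}^{\top}\beta_h\big)}{\sum_{l=1}^{H} \exp\big(\alpha_{lt} + x_{it}^{\top}\beta_l\big)}, \qquad h = 1, \dots, H,
\end{aligned}
$$

with $\alpha_{1t} = 0$ and $\beta_1 = 0$ for the reference profile. Stacking the subjects by wave, the linear predictor for profile $h$ would then read

$$
\eta_h = D\,\alpha_h + X\beta_h, \qquad D = \mathrm{blockdiag}\big(1_{n_1}, \dots, 1_{n_T}\big),
$$

where $n_t$ denotes the number of subjects interviewed at the $t$-th wave, $T$ the number of waves, and each row of $D$ has a single one in the column of the wave at which that observation was collected.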
Following Equation (1), the likelihood contribution for subject in the survey wave can be expressed as
| (2) |
where denotes the indicator function for the event . To mitigate the effect of potential bias due to the survey design, we follow the approach described by Vermunt and Magidson 18 and Patterson et al, 19 and incorporate the survey weights into the model via
| (3) |
exponentiating each likelihood contribution in (2) to the corresponding survey weight . The pseudo‐likelihood in (3) is used as a building block of several likelihood‐based procedures that include survey weights 45 , 46 , 47 and, more recently, of some Bayesian methods. 48 Alternatively, in a Bayesian setting, one can use the survey weights to approximate the population generative mechanism and infer characteristics of the non‐sampled units 49 or, when available, use strong prior information to correct for the sampling bias. 50 However, a weighted likelihood approach is conceptually simpler and computationally more efficient when the sampling mechanism is unknown, as in our application. 48
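The weighting scheme can be made concrete on the log scale, where raising each contribution to its survey weight becomes multiplying each log-contribution by that weight. The sketch below assumes the standard latent-class mixture form for the individual contribution; function names and numeric values are hypothetical.

```python
import math


def subject_log_likelihood(y, membership_probs, item_probs):
    """Log of the mixture likelihood of one subject's responses.

    y: observed category index for each item.
    membership_probs: membership probabilities of this subject.
    item_probs: item_probs[h][j][c] is the probability of category c
        for item j within profile h.
    """
    total = 0.0
    for nu_h, psi_h in zip(membership_probs, item_probs):
        contribution = nu_h
        # Items are conditionally independent given the profile.
        for j, c in enumerate(y):
            contribution *= psi_h[j][c]
        total += contribution
    return math.log(total)


def weighted_log_pseudo_likelihood(responses, weights, memberships, item_probs):
    """Sum of survey-weighted log-likelihood contributions."""
    return sum(
        w * subject_log_likelihood(y, nu, item_probs)
        for y, w, nu in zip(responses, weights, memberships)
    )


# Hypothetical toy example: 2 profiles, 1 binary item, 2 subjects.
toy_item_probs = [[[0.9, 0.1]], [[0.2, 0.8]]]
toy_responses = [[0], [1]]
toy_weights = [2.0, 0.5]
toy_memberships = [[0.5, 0.5], [0.5, 0.5]]
log_pl = weighted_log_pseudo_likelihood(
    toy_responses, toy_weights, toy_memberships, toy_item_probs
)
```

With all weights equal to one, this reduces to the ordinary log-likelihood; a subject with weight 2 counts as two identical respondents.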
4.2. Prior specification and posterior computation
We consider a Bayesian approach to inference, using the pseudo‐likelihood outlined in Equation (3). An advantage of this approach is that prior regularization can avoid convergence issues of maximization algorithms, such as Expectation‐Maximization (em), when used in latent‐class regression. 43 , 51 Also, a Bayesian approach simplifies modeling of temporal dependencies across survey waves leveraging a hierarchical dynamic model. To effectively estimate the dynamic intercepts inducing borrowing of information across survey waves, we assume that the variations in profile composition are smooth over time, and leverage a Gaussian Process (gp) prior for the joint distribution of over ; refer, for example, to Gaussian Processes for Machine Learning 52 for an introduction to gp. Recalling that the first group is fixed as a reference, this prior assumes for each latent group
| (4) |
denoting a squared‐exponential covariance function and and corresponding to the variance, length‐scale and noise variance parameters, respectively. 52 The squared exponential function favors smooth transitions across time, with the parameters controlling the overall structure of the covariance function. Since the covariance is parametrized as a function of the squared time lags across all pairs of time points , it accounts for the unequal spacing effectively, inducing higher correlation among closer time points; indeed, continuous stochastic processes such as the gp are appropriate for time series defined over continuous domains, where the spacing across time points is arbitrary, see for example Chapter 6 of Analysis of financial time series. 53
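To see how the covariance handles unequal spacing, the sketch below builds a squared-exponential covariance matrix on the first eight wave dates of Table 2, measured in days from the first wave. The parameterization shown is one common choice and is an assumption, as are the function name and the hyperparameter values.

```python
import math
from datetime import date

# First eight survey waves from Table 2.
wave_dates = [
    date(2020, 4, 2), date(2020, 4, 8), date(2020, 4, 16), date(2020, 4, 26),
    date(2020, 5, 1), date(2020, 5, 7), date(2020, 5, 13), date(2020, 5, 29),
]
# Time in days since the first wave, so spacing is unequal.
times = [(d - wave_dates[0]).days for d in wave_dates]


def squared_exponential_cov(times, variance, length_scale, noise_variance):
    """Covariance matrix with entries
    variance * exp(-lag^2 / (2 * length_scale^2)),
    plus noise_variance on the diagonal."""
    n = len(times)
    cov = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            lag = times[a] - times[b]
            cov[a][b] = variance * math.exp(-(lag ** 2) / (2.0 * length_scale ** 2))
            if a == b:
                cov[a][b] += noise_variance
    return cov


# Hypothetical hyperparameter values, for illustration only.
K = squared_exponential_cov(times, variance=1.0, length_scale=30.0, noise_variance=0.01)
```

Waves 6 and 7 are 6 days apart while waves 7 and 8 are 16 days apart, so the first pair receives the larger covariance without any need to regularize the time grid.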
Prior specification is completed selecting: independent log‐normal distributions with log‐mean 0 and log‐standard deviation 10 for the components of , standard Gaussian distribution for the coefficients , and symmetric Dirichlet distributions for the profile‐specific conditional probabilities , letting
and recalling that for identifiability.
We approximate the posterior distribution of the model parameters via Markov Chain Monte Carlo (mcmc). Specifically, we rely on a Hamiltonian Monte Carlo algorithm, efficiently implemented in the R package rstan, 54 including the weighted likelihood specification of Equation (3).
5. MODELING ATTITUDE TOWARDS COVID‐19
To select the number of latent profiles, we evaluated the model performance in predicting the 14 survey items in Table 1 for various values of , relying on 4‐fold cross‐validation to estimate out‐of‐sample accuracy. The results indicate that three latent profiles provide the best fit to the data; refer to Appendix A for further details on model selection. Posterior inference for the selected model relies on 4000 iterations collected after a burn‐in of 1000. Convergence was assessed via graphical inspection of the trace‐plots, auto‐correlation function and convergence diagnostics. All chains mixed well, with an effective sample size larger than 3500 for all parameters. We did not observe label switching across the chains; see Appendix B for further details. Simulating the 5000 draws from the posterior required approximately 4 hours on a laptop with an Intel i7‐7700HQ cpu and 16GB of ram.
5.1. Latent profiles description
In this Section, we describe the composition of the three considered latent profiles (or groups) and their response pattern , for , estimated via posterior mean and reported in Figure 2.
FIGURE 2.

Posterior estimates for the profile‐specific parameters . The color gradient of the cells varies according to the values of the estimated probabilities, with lighter shades corresponding to smaller values. The number in each cell is the posterior mean of multiplied by 100. For each item, the white text corresponds to the response with higher posterior probability
The first profile is composed of individuals that scrupulously followed spread‐preventive measures. Individuals in this group, with a probability of approximately , always wore masks outside home (ih1), washed their hands frequently (ih2), and avoided taking public transport (ih8). With probability of they avoided going out in general (ih6). Additionally, with a probability of they avoided going to a hospital or health care institution (ih7). Also, they avoided any small gathering (ih12) with probability of , and avoided having guests to their home (ih11) with probability of . We will refer to this group as “meticulous” in the rest of the article.
The second profile is composed of individuals that moderately followed preventive measures. Compared to the previous group, a smaller fraction of individuals avoided going out (ih6) ( probability of responding “Always”). In particular, subjects in this group completely avoided going to shops (ih16) with probability (compared to the of the meticulous group). A larger fraction of individuals considered small gatherings safe; in fact, the probability of completely avoiding small gatherings (ih12) is , compared with of the meticulous profile. Large gatherings (ih14) are also avoided in this group, with a probability of replying “Always” equal to . Compared to the meticulous group, the probability of avoiding public transport (ih8) decreased from to , and the probability of avoiding having guests at home (ih11) decreased from to . Finally, the probability of avoiding going to a hospital (ih7) decreased from to . We will refer to this group as “moderate”.
The third profile is composed of individuals with a more lenient attitude towards covid‐19 measures. In this profile the probability of always wearing a mask outside home (ih1) is compared to and for meticulous and moderate; in addition, the probability of using a mask outside home “Sometimes” or “Rarely” is . The probability of responding “Always” to “Avoided going out in general” (ih6) is , and for the response “Not at all”. This profile is characterized by more permissive behaviors also for gatherings. For example, the probability of responding “Not at all” to items referring to avoiding gatherings is , and for small, medium and large gatherings, respectively (items ih12, ih13 and ih14). Notably, the probability of responding “Always” to “avoiding crowded places” (ih15) is only , compared to and of the moderate and meticulous profile. We refer to this group as “permissive”.
To summarize, the three estimated latent profiles can be interpreted in terms of level of compliance with covid‐19 preventive measures. Some behaviors are similar in the three profiles; for example the modal class for “Avoided going to the hospital or other healthcare setting” (ih7) is “Always” in all the profiles, even if the distribution of the responses is different across groups. Other behaviors switch across profiles; for example in the item “Avoiding going out in general” (ih6), we observe modal category “Always”, “Sometimes” and “Not at all” for the meticulous, moderate, and permissive group, respectively.
5.2. Effects of covariates
Figure 3 illustrates the effects of the covariates on the log‐odds of being assigned to the moderate or permissive group against the meticulous group, which is used as baseline. Our empirical findings suggest that older respondents are more likely to be in the meticulous group rather than in the moderate or permissive ones, since the coefficient for age is negative. For example, the odds of a subject belonging to profile 2 (moderate) instead of 1 (meticulous) decrease times for each 5 years of age, while for profile 3 (permissive) this estimate is . Males are more likely than females to be associated with the moderate and permissive groups; the estimated odds of belonging to profile 2 instead of 1 are times the estimated odds for females. The odds for profile 3 instead of 1 are even larger, with an estimate for males that is times the estimated odds for females. We also observe a clear regional effect: individuals living in the south of Italy or in the Italian Islands report higher probabilities of being associated with the meticulous group; for example, the odds of belonging to profile 3 instead of 1 for subjects in south Italy and Islands are and times the estimated odds for North‐West, respectively. Lastly, students, retired people and people that are not currently working show higher probabilities of being associated with the meticulous group rather than the moderate or permissive ones. These results are in line with what is reported in Carlucci et al 9 and references therein, and suggest more cautious behavior among older individuals, females, residents of south Italy or the Islands, and those without a full‐time occupation.
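The multiplicative factors on the odds quoted above are obtained by exponentiating regression coefficients. As a sketch under assumed notation, with $\beta_{h,\mathrm{age}}$ and $\beta_{h,\mathrm{male}}$ denoting hypothetical names for the age and sex coefficients of profile $h$, and assuming that age enters the model in years:

$$
\frac{\mathrm{odds}(h \text{ vs } 1 \mid \mathrm{age} + 5)}{\mathrm{odds}(h \text{ vs } 1 \mid \mathrm{age})} = \exp\big(5\,\beta_{h,\mathrm{age}}\big),
\qquad
\frac{\mathrm{odds}(h \text{ vs } 1 \mid \mathrm{male})}{\mathrm{odds}(h \text{ vs } 1 \mid \mathrm{female})} = \exp\big(\beta_{h,\mathrm{male}}\big).
$$

A negative coefficient therefore yields a factor below one (the odds shrink), while a positive coefficient yields a factor above one. If age were instead coded in 5‐year units, the first factor would simply be $\exp(\beta_{h,\mathrm{age}})$.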
FIGURE 3.

Posterior distributions of the regression coefficients representing the effect of subject characteristics on the profile memberships
5.3. Evolution of the attitude towards covid‐19
To characterize the evolution of the attitudes towards covid‐19, we rely on the properties of the gp specification (4), providing daily predictions of the proportion of Italians associated with the three profiles described in Section 5.1. The mean parameter of the gp is approximated by predicting the probability of belonging to each of the three considered profiles on a new set of locations, namely a fine grid of equally spaced points between the first and last survey wave. These predictions are mapped to the proportion of Italians associated with each profile described in Section 5.1, at any given time, by setting the values of the covariates in Equation (1) at their population averages. Posterior estimates are displayed in Figure 4.
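One standard way to obtain such daily predictions is the Gaussian process conditional mean, followed by the logit mapping with covariates held fixed. The sketch below illustrates this idea only; whether the authors' computation matches it exactly is an assumption, and the wave times, intercept values, hyperparameters, and average covariate contributions are all hypothetical.

```python
import math


def sq_exp(s, t, variance, length_scale):
    """Squared-exponential covariance between times s and t."""
    return variance * math.exp(-((s - t) ** 2) / (2.0 * length_scale ** 2))


def solve(matrix, rhs):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_predict_mean(obs_times, obs_values, new_times, variance, length_scale, noise_variance):
    """Conditional mean of a zero-mean GP at new_times given obs_values."""
    n = len(obs_times)
    cov = [
        [sq_exp(obs_times[a], obs_times[b], variance, length_scale)
         + (noise_variance if a == b else 0.0) for b in range(n)]
        for a in range(n)
    ]
    coef = solve(cov, obs_values)
    return [
        sum(sq_exp(s, t, variance, length_scale) * c for t, c in zip(obs_times, coef))
        for s in new_times
    ]


def proportions(etas):
    """Map linear predictors of profiles 2..H to H probabilities."""
    eta = [0.0] + list(etas)
    m = max(eta)
    w = [math.exp(v - m) for v in eta]
    total = sum(w)
    return [v / total for v in w]


# Hypothetical intercept estimates at five unequally spaced waves (days).
obs_times = [0, 6, 14, 24, 29]
alpha_2 = [-1.0, -0.9, -0.7, -0.4, -0.2]
alpha_3 = [-2.5, -2.4, -2.0, -1.6, -1.3]
daily_grid = list(range(0, 30))
pred_2 = gp_predict_mean(obs_times, alpha_2, daily_grid, 1.0, 7.0, 1e-6)
pred_3 = gp_predict_mean(obs_times, alpha_3, daily_grid, 1.0, 7.0, 1e-6)
# Covariate contribution evaluated at population averages (hypothetical).
xb_2, xb_3 = 0.1, -0.2
daily_proportions = [
    proportions([a2 + xb_2, a3 + xb_3]) for a2, a3 in zip(pred_2, pred_3)
]
```

In a fully Bayesian analysis this computation would be repeated for each posterior draw, so that credible intervals for the daily proportions are obtained along with point estimates.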
FIGURE 4.

Proportion of Italians associated with the three profiles described in Section 5.1. Dots indicate the observed waves and grey shaded areas the credible interval. Dashed lines correspond to important dates of Italian covid‐19 policy and official announcements: June 3 2020 the end of the first lockdown; October 13 2020 the beginning of the second lockdown; December 27 2020 the European vaccine day; and, March 13 2021 the day of the publication of the national vaccine plan
Each panel in Figure 4 is divided in five areas, separated by dashed lines that correspond to important dates of Italian covid‐19 policy and official announcements. The first phase in Italian lockdown lasted from February 21, 2020 to June 3, 2020, and included several preventative measures such as avoiding leaving the house for non‐essential reasons. In the second phase (June 3, 2020–October 13, 2020) shops, bars, and restaurants were open to the public, although appropriate social distance was still required. The third phase (October 13, 2020–December 27, 2020) correspond to the period of the second Italian lockdown, after covid‐19 cases increased over Europe, till the European vaccine day (27 December 2020); this date correspond to the beginning of vaccination policy in Europe. The fourth period spans from December 27, 2020 to March 13, 2021, corresponding to the publication of the Italian vaccination plan. Finally, the fifth phase after March 13, 2021 correspond to the larger scale diffusion of the vaccine.
According to our analysis, in the interval from April 2 to May 18, 2020, about of the Italian population observed a meticulous behavior, following most covid‐19 preventive measures. In the same period, the proportion of individuals in the moderate profile was , and only for the indulgent group. This is in agreement with what was reported by Graffigna et al 13 and Barari et al, 14 although those studies rely on different methodologies and data.
On May 18, 2020, Prime Minister Conte introduced an easing of the lockdown restrictions, allowing most businesses to open to the public. Commuting across regions remained banned until the official announcement of the second phase, on June 3. In this period, we notice a rapid variation in the composition of the profiles, with the proportions on June 3 corresponding to , , and for the meticulous, moderate, and permissive groups, respectively.
In the period ranging from July to October 2020 the proportion of Italians in the three profiles is essentially constant across time, with values close to , , and , respectively. In this phase, the number of confirmed cases increased (see Figure 5), and half of the Italian population has a moderate attitude towards covid‐19. Also, almost one third of the population is in the permissive group. However, it is worth noting that most of their behaviors, such as not wearing a mask outside, were allowed in this period.
FIGURE 5.

Daily variations in the total number of covid‐19 cases in Italy in log‐scale. Dashed lines correspond to important dates of Italian covid‐19 policy and official announcements: June 3, 2020 the end of the first lockdown; October 13, 2020 the beginning of the second lockdown; December 27, 2020 the European vaccine day; and, March 13, 2021 the day of the publication of the national vaccine plan. The data used for the plot can be downloaded at https://lab24.ilsole24ore.com/coronavirus/
After a new lockdown was enforced on October 13, 2020, we observe significant changes in the proportions of Italians associated with each profile. For example, the proportions of Italians in the meticulous group in the last three observed waves of the third period (October 28, November 11, and December 16, 2020) correspond to , , and . For the moderate group we observe values of , , and while the permissive group drops to , , and . It is worth noting that, although the reported number of covid‐19 cases on November 11 was comparable to that of the early phase of the pandemic, only half of the population shows a meticulous attitude toward covid‐19, compared to of the first lockdown. The permissive group also presents a higher proportion: , compared to of the first phase. These results suggest that, although the increase in the number of cases was comparable between November and April 2020, the behaviors and the reactions of the population were different, with the second lockdown characterized by less meticulous behaviors than the first. A possible explanation involves a different level of awareness of the disease compared to the first lockdown. For example, in the first period of the pandemic, it was recommended to clean streets and surfaces with disinfectant to avoid infection; this practice was later flagged as an exaggerated behavior. 55
The profile composition stabilizes after the vaccine day on December 27, 2020, with averages for the period December 27, 2020–March 13, 2021 of , , and for the meticulous, moderate, and permissive groups, respectively. This composition remained essentially constant until the vaccine became widely available to the public. In the last phase (after March 13, 2021) the proportion of Italians following a meticulous behavior steeply declined, with a value of for the last wave. At the same time, the proportion of permissive individuals increased, with a value at the last wave of . The proportion of moderates stabilized around in this phase. This evolution suggests confidence in the effectiveness of the vaccine.
6. DISCUSSION
We analyzed Italians' attitudes towards covid‐19 leveraging a dynamic Bayesian latent class regression model for survey data. The estimated latent profiles can be interpreted as different degrees of precaution that determine the compliance with the national rules. At the population level, the proportion of Italians associated with each of these profiles follows the various phases of the lockdown. This suggests that Italians have followed the national rules.
There are several potential future directions for this work. When more recent data are released, it would be interesting to analyze how the composition of the profiles changes in relation to the diffusion of new covid‐19 variants, such as Omicron. A further research direction would be to simultaneously study the evolution of attitudes towards covid‐19 in several countries, comparing different nations and highlighting the main differences. However, such an extension is not immediate, since the interpretation of the profiles in different countries might be substantially different and highly influenced by cultural aspects.
APPENDIX A. MODEL SELECTION
A.1.
The number of latent profiles is selected via 4‐fold cross‐validation. Specifically, for each wave we divided the data into 4 folds of equal size, using in turn 3 folds to estimate the model and the remaining one to evaluate its performance. Since subjects differ across waves, folds can be constructed by random sampling, independently across waves. We simulate 4000 samples from the posterior after a burn‐in of 1000, letting the number of groups for each cross‐validation fold. Predictive probabilities for each individual and modality are computed as in Equation (2), using Monte Carlo integration, while the final out‐of‐sample predictions are obtained by selecting the modality with the largest probability. We compare observed and predicted values in terms of accuracy, averaging the results over the four cross‐validation folds.
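The fold construction and the accuracy criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions: the model fit and the Monte Carlo evaluation of Equation (2) are not shown, the predictive probabilities are taken as given, and the function names `assign_folds` and `modal_accuracy` are hypothetical.

```python
import random
from collections import defaultdict

def assign_folds(waves, n_folds=4, seed=0):
    """Return a fold index per subject, drawn independently within each wave."""
    rng = random.Random(seed)
    by_wave = defaultdict(list)
    for i, w in enumerate(waves):
        by_wave[w].append(i)
    folds = [None] * len(waves)
    for members in by_wave.values():
        rng.shuffle(members)
        # Deal shuffled subjects round-robin so fold sizes differ by at most one.
        for position, i in enumerate(members):
            folds[i] = position % n_folds
    return folds

def modal_accuracy(pred_probs, observed):
    """Share of subjects whose most probable modality equals the observed one."""
    hits = 0
    for probs, y in zip(pred_probs, observed):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        hits += predicted == y
    return hits / len(observed)

# Toy data: three waves of eight subjects each, and four held-out predictions.
waves = [w for w in (1, 2, 3) for _ in range(8)]
folds = assign_folds(waves)
acc = modal_accuracy(
    [[0.7, 0.3], [0.4, 0.6], [0.2, 0.8], [0.9, 0.1]],
    [0, 1, 0, 0],
)
```

In a full run, each fold would be held out in turn within every wave, and `modal_accuracy` would be computed per item on the held‐out subjects and then averaged over the four folds.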
Results are reported in Table A1, separately for the 14 response items. Accuracy increases moving from a model with latent profiles to a model with profiles for most items (10 out of 14), while the remaining items are predicted with the same accuracy by the two models. The prediction accuracy stabilizes with the model with and does not increase further with . Therefore, the model with is the most parsimonious model with the best fit to the data, and should be selected.
TABLE A1.
Accuracy, multiplied by 100
| Item label | | | | | | | |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ih1 | 78 | 78 | 78 | 78 | 78 | 78 | 78 |
| ih2 | 61 | 62 | 62 | 62 | 62 | 62 | 62 |
| ih3 | 57 | 58 | 58 | 58 | 58 | 58 | 58 |
| ih4 | 67 | 67 | 67 | 67 | 67 | 67 | 67 |
| ih5 | 65 | 66 | 66 | 66 | 66 | 66 | 66 |
| ih6 | 56 | 57 | 57 | 57 | 57 | 57 | 57 |
| ih7 | 58 | 59 | 59 | 59 | 59 | 59 | 59 |
| ih8 | 78 | 78 | 78 | 78 | 78 | 78 | 78 |
| ih11 | 61 | 62 | 62 | 62 | 62 | 62 | 62 |
| ih12 | 57 | 58 | 58 | 58 | 58 | 58 | 58 |
| ih13 | 67 | 67 | 67 | 67 | 67 | 67 | 67 |
| ih14 | 65 | 66 | 66 | 66 | 66 | 66 | 66 |
| ih15 | 56 | 57 | 57 | 57 | 57 | 57 | 57 |
| ih16 | 59 | 59 | 59 | 59 | 59 | 59 | 59 |
Note: Largest values are highlighted in bold‐face, and correspond to the best model. In case of multiple maxima, the smallest model is preferred and the corresponding accuracy is highlighted in bold‐face.
APPENDIX B. CONVERGENCE DIAGNOSTIC
B.1.
Figure B1 shows traceplots of the marginal posterior distributions for some parameters of the model described in Section 4. The chains exhibit good mixing and show no jumps between profiles, which would indicate label switching; traceplots for the remaining model parameters behave similarly.
FIGURE B1.

Traceplots of the posterior distribution of some illustrative parameters. Light gray lines denote the cumulative means of the parameters at each iteration
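The cumulative means overlaid on the traceplots are a standard visual check: if a chain has stabilized, its running average flattens out. A minimal sketch of that computation follows; the function name `cumulative_mean` and the toy chain are illustrative assumptions, not taken from the authors' code.

```python
from itertools import accumulate

def cumulative_mean(chain):
    """Running mean: element t averages draws 0..t of the chain."""
    return [total / (t + 1) for t, total in enumerate(accumulate(chain))]

# Toy chain; real input would be the posterior draws of one parameter.
running = cumulative_mean([2.0, 4.0, 6.0, 8.0])
# running is [2.0, 3.0, 4.0, 5.0]
```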
Aliverti E, Russo M. Dynamic modeling of the Italians' attitude towards Covid‐19. Statistics in Medicine. 2022;1‐14. doi: 10.1002/sim.9560
Funding information Italian Ministry of Education, Universities and Research (MIUR), Grant/Award Number: MIUR‐prin2017br‐jxs
DATA AVAILABILITY STATEMENT
Data are publicly available for research purposes at the repository https://github.com/YouGov‐Data/covid‐19‐tracker, along with a brief description of the collected variables.
REFERENCES
- 1. World Health Organization (WHO) . Coronavirus disease (covid‐19). 2021; https://covid19.who.int/.
- 2. Papadimitriou E, Cseres‐Gergelyne BZ. Economic sectors at risk due to COVID‐19 disruptions: will men and women in the EU be affected similarly?. Luxembourg: Publications Office of the European Union; 2020.
- 3. Cullen W, Gulati G, Kelly B. Mental health in the Covid‐19 pandemic. QJM: An Int J Med. 2020;113(5):311‐312.
- 4. Balkhi F, Nasir A, Zehra A, Riaz R. Psychological and behavioral response to the coronavirus (COVID‐19) pandemic. Cureus. 2020;12(5).
- 5. American Psychological Association (APA) . Patients with depression and anxiety surge as psychologists respond to the coronavirus pandemic. 2020.
- 6. Forte G, Favieri F, Tambelli R, Casagrande M. The enemy which sealed the world: effects of COVID‐19 diffusion on the psychological state of the Italian population. J Clin Med. 2020;9(6).
- 7. Moccia L, Janiri D, Pepe M, et al. Affective temperament, attachment style, and the psychological impact of the COVID‐19 outbreak: an early report on the Italian general population. Brain Behav Immun. 2020;87:75‐79.
- 8. Motta Zanin G, Gentile E, Parisi A, Spasiano D. A preliminary evaluation of the public risk perception related to the COVID‐19 health emergency in Italy. Int J Environ Res Public Health. 2020;17(9):3024.
- 9. Carlucci L, D'Ambrosio I, Balsamo M. Demographic and attitudinal factors of adherence to quarantine guidelines during COVID‐19: the Italian model. Front Psychol. 2020;11:2702.
- 10. Google LLC . Google COVID‐19 community mobility reports. 2021; Accessed 17, 2021, https://www.google.com/covid19/mobility/.
- 11. Bavadekar S, Dai A, Davis J, et al. Google COVID‐19 search trends symptoms dataset: anonymization process description (version 1.0). arXiv preprint, arXiv:2009.01265 2020.
- 12. Wolff W, Martarelli CS, Schüler J, Bieleke M. High boredom proneness and low trait self‐control impair adherence to social distancing guidelines during the COVID‐19 pandemic. Int J Environ Res Public Health. 2020;17(15):1‐10.
- 13. Graffigna G, Barello S, Savarese M, et al. Measuring Italian citizens' engagement in the first wave of the COVID‐19 pandemic containment measures: a cross‐sectional study. PLoS ONE. 2020;15(9):1‐22.
- 14. Barari S, Caria S, Davola A, et al. Evaluating COVID‐19 public health messaging in Italy: self‐reported compliance and growing mental health concerns. MedRxiv. 2020.
- 15. Jones SP. Imperial College London YouGov Covid Data Hub, v1. 0. Imperial College London Big Data Analytical Unit and YouGov Plc 2020.
- 16. Lazarsfeld PF. The logical and mathematical foundation of latent structure analysis. In: Samuel S, et al., ed. Studies in Social Psychology in World War II Vol IV: Measurement and Prediction; Princeton: Princeton University Press; 1950:362‐412.
- 17. Dayton CM, Macready GB. Concomitant‐variable latent‐class models. J Am Stat Assoc. 1988;83(401):173‐178.
- 18. Vermunt JK, Magidson J. Latent class analysis with sampling weights: a maximum‐likelihood approach. Sociolog Methods Res. 2007;36(1):87‐111.
- 19. Patterson BH, Dayton CM, Graubard BI. Latent class analysis of complex sample survey data: application to dietary data. J Am Stat Assoc. 2002;97(459):721‐741.
- 20. Kunihama T, Dunson DB. Bayesian modeling of temporal dependence in large sparse contingency tables. J Am Stat Assoc. 2013;108(504):1324‐1338.
- 21. Russo M, Durante D, Scarpa B. Bayesian inference on group differences in multivariate categorical data. Comput Stat Data Anal. 2018;126:136‐149.
- 22. Fruhwirth‐Schnatter S, Celeux G, Robert CP. Handbook of Mixture Analysis. Boca Raton, Florida: CRC press; 2019.
- 23. Girardi P, Greco L, Mameli V, et al. Robust inference for non‐linear regression models from the Tsallis score: application to coronavirus disease 2019 contagion in Italy. Stat. 2020;9(1):e309.
- 24. Alaimo Di Loro P, Divino F, Farcomeni A, et al. Nowcasting COVID‐19 incidence indicators during the Italian first outbreak. Stat Med. 2021;40(16):3843‐3864.
- 25. Girardi P, Greco L, Ventura L. Misspecified modeling of subsequent waves during COVID‐19 outbreak: a change‐point growth model. Biom J. 2022;64(3):523‐538.
- 26. Scrucca L. A COVINDEX based on a GAM beta regression model with an application to the COVID‐19 pandemic in Italy. Stat Methods Appl. 2022;1‐20.
- 27. Sebastiani G, Massa M, Riboli E. Covid‐19 epidemic in Italy: evolution, projections and impact of government measures. Eur J Epidemiol. 2020;35(4):341‐345.
- 28. Farcomeni A, Maruotti A, Divino F, Jona‐Lasinio G, Lovison G. An ensemble approach to short‐term forecast of COVID‐19 intensive care occupancy in Italian regions. Biom J. 2021;63(3):503‐513.
- 29. Celani A, Giudici P. Endemic–epidemic models to understand COVID‐19 spatio‐temporal evolution. Spatial Stat. 2021;49:100528.
- 30. D'Urso P, De Giovanni L, Vitale V. A D‐vine copula‐based quantile regression model with spatial dependence for COVID‐19 infection rate in Italy. Spatial Stat. 2022;100586:1‐31.
- 31. Cabras S. A Bayesian‐deep learning model for estimating COVID‐19 evolution in Spain. Mathematics. 2021;9(22).
- 32. Padellini T, Jersakova R, Diggle PJ, et al. Time varying association between deprivation, ethnicity and SARS‐CoV‐2 infections in England: a population‐based ecological study. Lancet Reg Health Eur. 2022;15:100322. doi: 10.1016/j.lanepe.2022.100322
- 33. Selinger C, Choisy M, Alizon S. Predicting COVID‐19 incidence in French hospitals using human contact network analytics. Int J Infect Dis. 2021;111:100‐107.
- 34. Ihme Covid‐19 Forecasting Team . Modeling COVID‐19 scenarios for the United States. Nat Med. 2020;27.
- 35. Reddy T, Shkedy Z, Rensburg JVC, et al. Short‐term real‐time prediction of total number of reported COVID‐19 cases and deaths in South Africa: a data driven approach. BMC Med Res Methodol. 2021;21(1):1‐11.
- 36. Roosa K, Lee Y, Luo R, et al. Real‐time forecasts of the COVID‐19 epidemic in China from February 5th to February 24th, 2020. Infectious Disease Modelling. 2020;5:256‐263.
- 37. Krekel C, Swanke S, De Neve JE, Fancourt D. Are Happier People More Compliant? Global Evidence from Three Large‐Scale Surveys during Covid‐19 Lockdowns. Bonn, Germany: IZA – Institute of Labor Economics; 2020.
- 38. Tripathy D, Camorlinga SG. Prediction of COVID‐19 Cases based on Human Behavior using DNN Regressor for Canada. 2021 IEEE International Conference on Communications Workshops (ICC Workshops). 2021:1‐6.
- 39. Duradoni M, Fiorenza M, Guazzini A. When Italians follow the rules against COVID infection: a psychological profile for compliance. Covid. 2021;1(1):246‐262.
- 40. Guazzini A, Pesce A, Marotta L, Duradoni M. Through the second wave: analysis of the psychological and perceptive changes in the Italian population during the COVID‐19 pandemic. Int J Environ Res Public Health. 2022;19(3).
- 41. Verbeek M. Pseudo‐Panels and Repeated Cross‐Sections. Berlin, Heidelberg: Springer; 2008:369‐383.
- 42. Bartolucci F, Farcomeni A, Pennoni F. Latent Markov Models for Longitudinal Data. Boca Raton, Florida: CRC Press; 2012.
- 43. Linzer DA, Lewis JB, et al. poLCA: an R package for polytomous variable latent class analysis. J Stat Softw 2011; 42(10): 1–29.
- 44. Azzalini A, Scarpa B. Data Analysis and Data Mining: An Introduction. New York; 2012.
- 45. Godambe VP, Thompson ME. Parameters of superpopulation and survey population: their relationships and estimation. Int Stat Rev. 1986;54(2):127‐138.
- 46. Rabe‐Hesketh S, Skrondal A. Multilevel modelling of complex survey data. J R Stat Soc A Stat Soc. 2006;169(4):805‐827.
- 47. Skinner C, Mason B. Weighting in the regression analysis of survey data with a cross‐national application. Can J Stat. 2012;40(4):697‐711.
- 48. Gunawan D, Panagiotelis A, Griffiths W, Chotikapanich D. Bayesian weighted inference from surveys. Aust N Z J Stat. 2020;62(1):71‐94.
- 49. Si Y, Pillai NS, Gelman A. Bayesian nonparametric weighted sampling inference. Bayesian Anal. 2015;10(3):605‐625.
- 50. Gao Y, Kennedy L, Simpson D, Gelman A. Improving multilevel regression and Poststratification with structured priors. Bayesian Anal. 2021;1(1):1‐26.
- 51. Durante D, Canale A, Rigon T. A nested expectation–maximization algorithm for latent class models with covariates. Stat Prob Lett. 2019;146(C):97‐103.
- 52. Rasmussen C, Williams C. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). Cambridge, MA: MIT Press; 2006.
- 53. Tsay RS. Analysis of Financial Time Series. Hoboken, New Jersey: John Wiley & Sons; 2005.
- 54. Stan Development Team . RStan: the R Interface to Stan. 2020. R package version 2.21.2.
- 55. Goldman E. Exaggerated risk of transmission of COVID‐19 by fomites. Lancet Infect Dis. 2020;20(8):892‐893.
