letter
Geophysical Research Letters. 2021 Jan 23;48(2):e2020GL091236. doi: 10.1029/2020GL091236

Observational Constraints on Warm Cloud Microphysical Processes Using Machine Learning and Optimization Techniques

J Christine Chiu 1, C Kevin Yang 1, Peter Jan van Leeuwen 1,2, Graham Feingold 3, Robert Wood 4, Yann Blanchard 5, Fan Mei 6, Jian Wang 7
PMCID: PMC7900997  PMID: 33678926

Abstract

We introduce new parameterizations for autoconversion and accretion rates that greatly improve representation of the growth processes of warm rain. The new parameterizations capitalize on machine‐learning and optimization techniques and are constrained by in situ cloud probe measurements from the recent Atmospheric Radiation Measurement Program field campaign at Azores. The uncertainty in the new estimates of autoconversion and accretion rates is about 15% and 5%, respectively, outperforming existing parameterizations. Our results confirm that cloud and drizzle water content are the most important factors for determining accretion rates. However, for autoconversion, in addition to cloud water content and droplet number concentration, we discovered a key role of drizzle number concentration that is missing in current parameterizations. The robust relation between autoconversion rate and drizzle number concentration is surprising but real, and furthermore supported by theory. Thus, drizzle number concentration should be considered in parameterizations for improved representation of the autoconversion process.

Keywords: accretion, autoconversion, boundary layer cloud, cloud parameterization, machine learning, warm rain

Key Points

  • Machine‐learning trained by in situ data constrains autoconversion and accretion rates with uncertainty of 15% and 5%, respectively

  • There is a surprising relation between autoconversion rate and drizzle number concentration that significantly improves parameterizations

  • The exponent of autoconversion rate dependence on cloud number concentration is about −0.75, smaller in magnitude than that in existing parameterizations

1. Introduction

Warm rain formation plays a crucial role in determining the properties and life cycle of marine boundary layer clouds, and has significant impacts on radiative and hydrological budgets. Yet, many global weather and climate models continue to produce rain too frequently over oceans (e.g., Stephens et al., 2010), too light (Ahlgrimm & Forbes, 2014; Jing et al., 2017), or too heavy (Abel & Boutle, 2012; Bodas‐Salcedo et al., 2008). The intermodel spread in precipitation rate in the southeast Pacific, one of the major marine boundary layer cloud decks, can be an order of magnitude (Wyant et al., 2015). The model discrepancy and spread in precipitation are linked to diverse issues, such as rain drop size distributions, and the representation of boundary layer, autoconversion, and accretion processes. The effects of autoconversion on precipitation can change surface temperature prediction significantly (Golaz et al., 2013).

Many warm rain parameterizations for autoconversion and accretion processes have been developed in the past (e.g., Beheng, 1994; Berry, 1968; Kessler, 1969; Khairoutdinov & Kogan, 2000; Liu & Daum, 2004; Seifert & Beheng, 2001; and many others). These parameterizations have been reviewed critically (Lee & Baik, 2017; Liu & Daum, 2004; Wood, 2005), and confronted with in situ observations (Hsieh et al., 2009; Wood, 2005). By applying in situ size‐resolved cloud measurements to the continuous collection equation, the two aforementioned observational studies showed that parameterized accretion rates generally agree with in situ data, but parameterized autoconversion rates can be significantly different from observational estimates. While these results are encouraging and informative, there has been little follow‐up observational work. It remains unclear how to maximize the use of observations for improving understanding and model representations of these microphysical processes, and how to extend constraints from in situ to remote sensing platforms that can provide continuous observations in various cloud regimes on a global scale.

The objectives of this study are manifold. Instead of evaluating existing parameterizations, here we use in situ observations to build machine‐learning (ML) models to “predict” autoconversion and accretion rates. Since translating ML results to physical formulations is not trivial and remains an active research area, we also perform nonlinear optimizations to fill the gap and to quantify the relationships of autoconversion and accretion with cloud/drizzle properties. These results are compared and contrasted with widely used parameterizations, and the implications are discussed.

2. In Situ Cloud Measurements

In situ cloud measurements were taken from the Aerosol and Cloud Experiments in the Eastern North Atlantic (ACE‐ENA) field campaign, deployed by the Atmospheric Radiation Measurement (ARM) user facility. The aircraft flew near the ARM site on Graciosa Island during two intensive operational periods in June‐July 2017 (IOP1) and January‐February 2018 (IOP2). The cloud types sampled in ACE‐ENA are mainly marine stratocumulus, with some scattered or precipitating cumulus. Measurements from three cloud probes were merged to form combined drop size distributions (DSD). Cloud droplets are defined as those with radii smaller than 25 µm, and drizzle drops are defined as those with radii larger than 25 µm and up to 400 µm. The choice of the cloud/drizzle separation threshold is appropriate for marine stratocumulus (Khairoutdinov & Kogan, 2000; Kogan, 2013). We also define a sample as cloudy when its cloud water content q c ≥ 0.01 g m$^{-3}$. Based on this definition, a total of ∼93,000 in situ cloudy DSDs remain after data screening, comprising 11% drizzle‐free DSDs and 89% drizzling DSDs (see Table S1). The smallest drizzle water content (q r) observed in the drizzling DSDs is on the order of $10^{-5}$ g m$^{-3}$, and thus we set it as the threshold for drizzle delineation, denoted as q r,crit. Property distributions from individual days for cloudy samples are shown in Figures 1a–1d.
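The cloud/drizzle partition described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the campaign's processing code: the function names `bulk_moments` and `classify` are hypothetical, each bin is represented by a single centre radius, a drop of exactly 25 µm is assigned to drizzle, and a cloudy sample with q r below q r,crit is labeled drizzle‐free.

```python
import math

RHO_WATER = 1000.0   # kg m^-3, density of liquid water
R_SPLIT = 25e-6      # m, cloud/drizzle separation radius used in the text
R_MAX = 400e-6       # m, largest drizzle radius considered in the text

def bulk_moments(radii_m, number_m3):
    """Return (q_c, N_c, q_r, N_r) in SI units from bin-centre radii (m)
    and per-bin number concentrations (m^-3)."""
    q_c = n_c = q_r = n_r = 0.0
    for r, n in zip(radii_m, number_m3):
        mass = 4.0 / 3.0 * math.pi * RHO_WATER * r**3   # kg per drop
        if r < R_SPLIT:          # cloud droplet
            n_c += n
            q_c += n * mass
        elif r <= R_MAX:         # drizzle drop (boundary case is an assumption)
            n_r += n
            q_r += n * mass
    return q_c, n_c, q_r, n_r

def classify(q_c, q_r, qc_min=1e-5, qr_crit=1e-8):
    """Label a sample; thresholds are 0.01 g m^-3 and 1e-5 g m^-3 in kg m^-3."""
    if q_c < qc_min:
        return "noncloudy"
    return "drizzling" if q_r >= qr_crit else "drizzle-free"

# Hypothetical two-bin DSD: 100 cm^-3 of 10 um droplets, 0.01 cm^-3 of 50 um drops
q_c, n_c, q_r, n_r = bulk_moments([10e-6, 50e-6], [1e8, 1e4])
label = classify(q_c, q_r)
```

The four returned moments are the inputs used by the models in the next section.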

Figure 1.


Box plots of in situ (a) cloud water content (q c), (b) cloud droplet number concentration (N c), (c) drizzle water content (q r), and (d) drizzle drop number concentration (N r) observed in IOP1 (left panels) and IOP2 (right panels) from the ARM campaign at the Azores. The bottom and top of each box represent the 25th and 75th percentiles, and the line inside the box represents the median. The whiskers mark the 5th and 95th percentiles. Panels (e and f) show the autoconversion rate (P au) and accretion rate (P ac) calculated from observed drop size distributions using the stochastic collection equation formulated as a two‐moment bin model (see Section 3). Based on 1‐s measurements, the sample size for each day is listed in (b) for all conditions, and in (f) for drizzling conditions. Flights on June 29 and July 6, 2017 were excluded due to data availability. ARM, atmospheric radiation measurement; IOP, intensive operational period.

3. Parameterization Derived From Machine Learning Techniques

The in situ DSDs from ACE‐ENA are used to develop two ML models. The first ML model, dubbed “initiation model,” uses two inputs (q c, N c) for predicting autoconversion rate (P au) in drizzle‐absent conditions. The second, dubbed “standard model,” uses four inputs (q c, N c, q r, N r) for predicting P au and accretion rate (P ac) in drizzling conditions. As discussed further below, we use the initiation model to generate nonzero q r and N r values. Once q r and N r exist, the standard model is superior and is used to better predict P au and P ac.

Both models use an artificial neural network. It is a deep feedforward neural network (Schmidhuber, 2015) comprising eight hidden, fully connected layers with 1,024 nodes in each layer. All input and output variables are transformed to their logarithmic forms. Since these input variables have rather different magnitudes, we normalized them using their mean and standard deviation. We used LeakyReLU (Maas et al., 2013) as our activation function. Additionally, the training was performed by the Adam optimizer (Kingma & Ba, 2015), based on a loss function defined as the mean squared error between the true value and the prediction.
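The preprocessing and network structure described above can be illustrated with a small, dependency‐free sketch. It is not the authors' implementation: the helper names are hypothetical, the LeakyReLU slope of 0.01 and the Gaussian weight initialization are assumptions, the hidden width is reduced from 1,024 to 16 so that the example runs quickly, and the Adam training loop is omitted.

```python
import math
import random

def standardize_log(samples):
    """Log10-transform each column, then remove its mean and divide by its std."""
    logs = [[math.log10(v) for v in row] for row in samples]
    n = len(logs)
    cols = list(zip(*logs))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in logs]
    return scaled, means, stds

def leaky_relu(x, slope=0.01):
    # slope for negative inputs is an assumed value
    return x if x > 0 else slope * x

def init_network(n_in, n_hidden_layers, width, n_out, seed=0):
    """Fully connected layers as (weights, biases) pairs with random weights."""
    rng = random.Random(seed)
    sizes = [n_in] + [width] * n_hidden_layers + [n_out]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = math.sqrt(2.0 / fan_in)   # assumed initialization scale
        w = [[rng.gauss(0.0, scale) for _ in range(fan_in)] for _ in range(fan_out)]
        layers.append((w, [0.0] * fan_out))
    return layers

def forward(layers, x):
    """Feedforward pass: LeakyReLU on hidden layers, linear output layer."""
    for i, (w, b) in enumerate(layers):
        x = [sum(wi * xi for wi, xi in zip(row, x)) + bj for row, bj in zip(w, b)]
        if i < len(layers) - 1:
            x = [leaky_relu(v) for v in x]
    return x

def mse(pred, true):
    """Mean squared error, the loss named in the text."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)

# Hypothetical inputs (q_c, N_c, q_r, N_r) in SI units
inputs = [[1e-4, 1e8, 1e-6, 1e4], [3e-4, 5e7, 1e-5, 1e5], [2e-4, 2e8, 1e-7, 1e3]]
scaled, means, stds = standardize_log(inputs)
net = init_network(n_in=4, n_hidden_layers=8, width=16, n_out=2)
log_rates = forward(net, scaled[0])   # untrained outputs for (log P_au, log P_ac)
```

With untrained weights the outputs are meaningless; the sketch only shows how four log‐transformed, standardized inputs map to two outputs through eight hidden layers.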

The training data sets for the two models are different but originate from the same pool of data points. The pool was generated by using the in situ DSDs as the initial conditions and propagating the DSDs forward in time with the stochastic collection equation (SCE). We used the two‐moment bin model of Tzivion et al. (1987) to compute P au and P ac directly from the explicit drop‐drop interaction terms at 1‐s time steps for 10 min. The bin model uses the Hall (1980) kernel. The 10‐min time period is based on the typical in‐cloud residence time (Feingold et al., 1996). Since our focus is on clouds, we exclude noncloudy data points from the pool.

For the initiation ML model, the training data set is based on data points generated from the initially drizzle‐free DSDs in the pool. To ensure that DSDs used for the initiation model are absolutely drizzle free, we exclude DSDs that contain cloud droplets in the instrument size bin (17.5–22.5 µm radius) proximate to the cloud/drizzle boundary (i.e., 25 µm radius), based on the uncertainty of 1.5–5 µm in in situ size measurements (Glienke & Mei, 2019, 2020). We do retain all DSDs that do not have droplets in that bin initially but produce nonzero q r in 5 s. These are practical choices that are as inclusive as possible of DSDs and also facilitate the ML. As shown in Figures 2a and 2b, the initiation model has 25th and 75th percentile errors of –60% and 80%, respectively.

Figure 2.


Plots of the predicted versus the true autoconversion rates from the testing data set, using (a) the initiation and (c) standard machine‐learning model, (e) the KK parameterization, and (g) Equation 6. Panel (a) is a scatter plot, while in all other panels the number of data points is indicated by color. The corresponding histograms of errors (%) from these individual methods are shown in (b, d, f, and h), respectively. The blue, black, and red dashed lines, respectively, represent the 25th, 50th, and 75th percentiles of the data. The corresponding errors for these lines are denoted in each subplot in their own color. For the KK parameterization, the 75th percentile is not plotted because it is out of the axis range. For the standard ML model, the mean error (%) and the mean absolute deviation (%, using the mean as the center point) are denoted in (d).

Note that these initially drizzle‐free DSDs generate q r ranging between $10^{-18}$ and $10^{-9}$ g m$^{-3}$ in 5 s. The lower bound, $10^{-18}$ g m$^{-3}$, is equivalent to one single drizzle drop with a radius of 25 µm in a 10 km × 10 km × 500 m cloudy volume. The upper bound, $10^{-9}$ g m$^{-3}$, is at least four orders of magnitude smaller than any in situ measured q r. Therefore, although the initiation model generates nonzero q r, these numbers are very small and should still be considered as nondrizzling in any practical sense. We emphasize that the initiation ML model is mainly used as a gateway to the standard ML model that requires nonzero q r as input.
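The equivalence quoted for the lower bound can be checked with a short calculation. The only assumption beyond the text is a liquid water density of 1,000 kg m^-3.

```python
import math

RHO_WATER = 1000.0                      # kg m^-3 (assumed)
radius = 25e-6                          # m, drop radius from the text
drop_mass_g = 4.0 / 3.0 * math.pi * RHO_WATER * radius**3 * 1e3   # grams per drop
volume_m3 = 10e3 * 10e3 * 500.0         # 10 km x 10 km x 500 m
q_r_single_drop = drop_mass_g / volume_m3   # g m^-3

print(f"{q_r_single_drop:.1e} g m^-3")  # prints 1.3e-18 g m^-3
```

One 25 µm drop weighs about 6.5 × 10^-8 g, and spreading it over 5 × 10^10 m^3 gives a water content on the order of 10^-18 g m^-3, consistent with the stated lower bound.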

For the standard model, we sample all the cloudy points every 5 s from the pool to form the training data set, as long as their q r ≥ $10^{-18}$ g m$^{-3}$. This threshold is based on the q r magnitude that can be initiated by the aforementioned initiation model, ensuring that the q r ranges of the two models overlap as much as possible. This leads to a total of ∼10.7 M data points. From the rest of the pool that are not sampled for training, we randomly selected 2.5 M points for testing. The ratio between the training and the testing is about 4, similar to the standard practice in ML. Additionally, both the training and testing data sets contain ∼25% data that have a ratio of P au to P ac ≥ 1 (i.e., in the early stages of drizzle formation), and ∼75% data that have a ratio < 1.

Figures 2c, 2d, 3a, and 3b show the performance of the standard ML model. For both P au and P ac, the majority of the data points fall on the 1:1 line in the scatter plots, confirming the appropriateness of the neural network. The uncertainty is 15% for P au, and 5% for P ac. The good performance on the testing data set indicates that the ML model does not suffer from overfitting. Predicting P au and P ac for these 2.5 M points takes about 100 s using a single Intel Xeon E5‐2697V4 processor. Note that training P au and P ac separately using the same input yielded similar results (see Table S2).

Figure 3.


Same as Figure 2, but for accretion rate. Panels (e and f) are based on Equation 7, using 2.5 M data points. Note that estimates from KK and Equation 7 are so similar that the density scatter plots do not show any obvious differences.

Results from the standard ML model are further compared with those from the parameterization proposed in Khairoutdinov and Kogan (2000), dubbed the KK parameterization. KK describes P au and P ac as:

$$P_{au} = 7.42 \times 10^{13}\, q_c^{2.47}\, N_c^{-1.79}\, \rho^{-1.47}, \tag{1}$$
$$P_{ac} = 67\, q_c^{1.15}\, q_r^{1.15}\, \rho^{-1.3}, \tag{2}$$

where ρ is the air density and all variables are in SI units. As shown in Figures 2e, 2f, 3c, and 3d, the KK parameterization predicts P ac reasonably well, with errors smaller than 40%, but significantly overestimates low P au and slightly underestimates high P au. The performance of KK is consistent with the findings in Wood (2005).
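For reference, the KK expressions in Equations 1 and 2 can be written as two small functions. This is a sketch that takes the SI‐unit coefficients quoted above at face value, with negative exponents on N c and ρ as in Khairoutdinov and Kogan (2000); the function names and the sample values are hypothetical.

```python
def kk_autoconversion(q_c, n_c, rho):
    """KK autoconversion rate (kg m^-3 s^-1); q_c in kg m^-3, n_c in m^-3, rho in kg m^-3."""
    return 7.42e13 * q_c**2.47 * n_c**(-1.79) * rho**(-1.47)

def kk_accretion(q_c, q_r, rho):
    """KK accretion rate (kg m^-3 s^-1); q_c and q_r in kg m^-3, rho in kg m^-3."""
    return 67.0 * q_c**1.15 * q_r**1.15 * rho**(-1.3)

# Hypothetical sample: q_c = 0.3 g m^-3, N_c = 100 cm^-3, q_r = 0.01 g m^-3
p_au = kk_autoconversion(3e-4, 1e8, 1.1)
p_ac = kk_accretion(3e-4, 1e-5, 1.1)

# Doubling N_c scales the autoconversion rate by 2**(-1.79), roughly a factor of 0.29
ratio = kk_autoconversion(3e-4, 2e8, 1.1) / p_au
```

The strong negative N c exponent is what makes the KK autoconversion rate so sensitive to droplet number concentration.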

4. Parameterizations Based on a Simple Form

While the ML model can be used to predict P au and P ac successfully, extracting the bulk dependencies that shed light on the underlying physics is not straightforward. To characterize the physical relationships, we assume that process rate, P, representing either P au or P ac, can be parameterized as:

$$P = k\, q_c^{a}\, N_c^{b}\, q_r^{c}\, N_r^{d}. \tag{3}$$

This form is similar to the KK parameterization (Equations 1 and 2), and the parameterization in Liu and Daum (2004), dubbed the LD parameterization, in which

$$P_{au} \propto q_c^{3}\, N_c^{-1}. \tag{4}$$

Unlike the KK and LD parameterizations, we include all observables on the right‐hand side of Equation 3 to cover any possible dependencies, but will examine if all are necessary.

To find the optimal values for the parameters a–d and k in Equation 3, we take the logarithm of both sides, leading to

$$\log P = \log k + a \log q_c + b \log N_c + c \log q_r + d \log N_r. \tag{5}$$

This linear equation allows us to set up a least squares problem, solved with a limited memory quasi‐Newton method (L‐BFGS; Byrd et al., 1995) with wide bounds on the parameter values. The bounds are wide enough to ensure the best values do not end up close to the bounds. Uncertainty estimates are obtained from the inverse of the Hessian. The mean value of k has been corrected for the fact that k is implicitly assumed to be lognormally distributed because of the log transformation between Equations 3 and 5.

We use the testing data set with 2.5 M points to derive the parameters in Equation 3. The in situ q c and q r have an uncertainty of 30%, while N c and N r have an uncertainty of 50% and 20%, respectively (Glienke & Mei, 2019, 2020; Mei et al., 2020). These uncertainties are accounted for in Equation 3, leading to an additive error of 0.8 in Equation 5. Since it is possible that not all the variables constrain the solution, we systematically reduce the number of variables and adjust the additive error accordingly in the minimization.
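The log‐linear fit behind Equation 5 can be illustrated on synthetic data. The sketch below is deliberately simpler than the procedure described above: it uses unweighted ordinary least squares solved through the normal equations rather than bounded L‐BFGS, it ignores the measurement‐error term and the lognormal correction of k, and all names and the synthetic "true" parameters are hypothetical.

```python
import math
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_power_law(predictors, rates):
    """Fit log P = log k + sum_j e_j log X_j; return (k, [e_j])."""
    rows = [[1.0] + [math.log(v) for v in p] for p in predictors]
    y = [math.log(r) for r in rates]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(n)]
    beta = solve_linear(ata, aty)
    return math.exp(beta[0]), beta[1:]

# Synthetic, noise-free samples of (q_c, N_c, N_r) drawn from an assumed power law
rng = random.Random(1)
samples = [(10 ** rng.uniform(-5, -3), 10 ** rng.uniform(7, 9), 10 ** rng.uniform(2, 6))
           for _ in range(200)]
rates = [16.8 * qc**2.015 * nc**(-0.746) * nr**0.640 for qc, nc, nr in samples]

k, exponents = fit_power_law(samples, rates)   # recovers the assumed parameters
```

Dropping a column from `samples` corresponds to the step of systematically reducing the number of variables; with real data the residual spread then indicates how much information each variable carries.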

Table 1 summarizes the parameter estimates and error statistics for predicting P au and P ac, based on the testing data set but with q r ≥ q r,crit. The reason for this restriction on q r is that the power laws are unable to fit a range of q r spanning 18 orders of magnitude. As a result, the sample size was reduced from 2.5 to 2.3 M. If we must predict P au from q c and N c alone, as do existing parameterizations, the corresponding exponents are about 2.90 and –1.69, respectively. Our q c exponent is closer to LD's value (a = 3) than KK's (a = 2.47), and our N c exponent is closer to KK's value (b = –1.79) than LD's (b = –1). As shown in Table 1, adding q r into parameterizations produces a better correlation in P au predictions. However, in general, the parameterizations involving N r tend to perform best and have smaller errors. Once N r is considered in the physical relationship, the exponents for both q c and N c, that is, the sensitivities of P au to these two variables, are reduced in magnitude. Interestingly, we also find that the exponents of N r and N c are nearly reciprocal. The key role of N r in autoconversion rate is counter‐intuitive and will be discussed in the next section.

Table 1.

Optimal Parameters for Representing Autoconversion and Accretion Rates in the Form of $P = k\, q_c^{a} N_c^{b} q_r^{c} N_r^{d}$, where P is in kg m$^{-3}$ s$^{-1}$, q c in kg m$^{-3}$, N c in m$^{-3}$, q r in kg m$^{-3}$, and N r in m$^{-3}$, Using 2.3 M Data Points

| Corr. | Error (%), 25th/50th/75th | k | a | b | c | d |
|---|---|---|---|---|---|---|
| Autoconversion | | | | | | |
| Nonturbulent conditions | | | | | | |
| 0.96 | −39/−8/55 | 2.44 ± 0.05 | 2.0681 ± 0.0007 | −0.7760 ± 0.0007 | −0.1285 ± 0.0004 | 0.7844 ± 0.0005 |
| 0.96 | −39/−9/55 | 5.9 ± 0.1 | 1.9839 ± 0.0008 | −0.7496 ± 0.0007 | −0.0642 ± 0.0005 ᵃ | 0.7043 ± 0.0006 |
| 0.92 | −52/−9/90 | (164 ± 2) E7 | 2.2742 ± 0.0007 | −1.0930 ± 0.0006 | 0.3177 ± 0.0002 | |
| 0.93 | −49/−9/77 | (71.1 ± 1.4) E7 | 2.5247 ± 0.0009 | −1.0548 ± 0.0008 | 0.4185 ± 0.0004 ᵃ | |
| 0.96 | −39/−8/55 | 16.8 ± 0.3 | 2.0150 ± 0.0006 | −0.7461 ± 0.0006 | | 0.6403 ± 0.0003 |
| 0.88 | −59/−8/129 | (375 ± 4) E12 | 2.8957 ± 0.0005 | −1.6945 ± 0.0004 | | |
| Turbulent conditions with a dissipation rate of 400 cm$^{2}$ s$^{-3}$ | | | | | | |
| 0.96 | −39/−8/53 | 11.1 ± 0.2 | 1.9777 ± 0.0007 | −0.7366 ± 0.0006 | | 0.6511 ± 0.0003 |
| Nonturbulent conditions, but only using points with P au ≥ P ac | | | | | | |
| 0.96 | −31/1/49 | (201 ± 7) E6 | 1.7699 ± 0.0018 | −0.7975 ± 0.0014 | 0.8043 ± 0.0009 | |
| 0.96 | −31/1/48 | 1.22 ± 0.05 | 1.7656 ± 0.0017 | −0.7929 ± 0.0013 | | 0.8432 ± 0.0008 |
| 0.84 | −56/3/129 | (73 ± 2) E10 | 2.611 ± 0.0014 | −1.512 ± 0.001 | | |
| Nonturbulent conditions, but only using initially drizzle‐free cloud size distributions | | | | | | |
| 0.89 | −54/−2/113 | (4 ± 2) E17 | 4.08 ± 0.02 | −2.25 ± 0.02 | | |
| Accretion | | | | | | |
| Nonturbulent conditions | | | | | | |
| 0.996 | −22/−3/22 | (94.8 ± 1.5) E5 | 1.4030 ± 0.0007 | −0.3147 ± 0.0006 | 1.3069 ± 0.0004 | −0.2389 ± 0.0004 |
| 0.997 | −25/−3/29 | (89 ± 1) E3 | 1.4159 ± 0.0006 | −0.3018 ± 0.0005 | 1.1172 ± 0.0001 | |
| 0.960 | −64/27/244 | (631 ± 9) E–4 | 1.9603 ± 0.0006 | −0.6487 ± 0.0005 | | 1.2153 ± 0.0001 |
| 0.996 | −23/−6/21 | 69.5 ± 0.2 | 1.1476 ± 0.0002 | | 1.1587 ± 0.0001 | |
| Turbulent conditions with a dissipation rate of 400 cm$^{2}$ s$^{-3}$ | | | | | | |
| 0.997 | −20/−5/17 | 55.2 ± 0.1 | 1.1287 ± 0.0003 | | 1.1462 ± 0.0001 | |

The correlation coefficient (Corr.) and the 25th, 50th, and 75th percentile errors (%) are also listed. The parameter sets used for comparisons to the ML model in Figures 2 and 3 are the rows corresponding to Equations 6 and 7 (k = 16.8 for autoconversion and k = 69.5 for accretion, both under nonturbulent conditions).

ᵃ Instead of q r, we use q r/(q r + q c) in the formula, motivated by Seifert and Beheng (2001).

For accretion, the key roles of q c and q r are consistent with existing parameterizations. Taking the KK parameterization as an example, the exponent for q c and q r is 1.15, which is very close to our exponents, 1.148 for q c and 1.159 for q r.

Based on the performance in errors in Table 1 and the minimum number of variables required to reach a high correlation, we select the following power laws for predicting P au and P ac:

$$P_{au} = 16.8\, q_c^{2.015}\, N_c^{-0.746}\, N_r^{0.640}, \tag{6}$$
$$P_{ac} = 69.5\, q_c^{1.148}\, q_r^{1.159}. \tag{7}$$

Figures 2g, 2h, 3e, and 3f show that the parameterizations from Equations 6 and 7 perform well, but not as well as the standard ML model. The ML model is therefore the desired choice, but Equations 6 and 7 remain a good option compared to KK.
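As a usage illustration, Equations 6 and 7 translate directly into code. The function names and sample values below are hypothetical, inputs are in the SI units listed in Table 1, and the fits were derived only for q r ≥ q r,crit, so applying them outside that range is an extrapolation.

```python
def autoconversion_rate(q_c, n_c, n_r):
    """Equation 6: P_au in kg m^-3 s^-1; q_c in kg m^-3, n_c and n_r in m^-3."""
    return 16.8 * q_c**2.015 * n_c**(-0.746) * n_r**0.640

def accretion_rate(q_c, q_r):
    """Equation 7: P_ac in kg m^-3 s^-1; q_c and q_r in kg m^-3."""
    return 69.5 * q_c**1.148 * q_r**1.159

# Hypothetical sample: q_c = 0.3 g m^-3, N_c = 100 cm^-3, q_r = 0.01 g m^-3, N_r = 0.1 cm^-3
p_au = autoconversion_rate(3e-4, 1e8, 1e5)
p_ac = accretion_rate(3e-4, 1e-5)

# Sensitivity check: doubling N_r raises P_au by 2**0.640, doubling N_c lowers it by 2**(-0.746)
gain_nr = autoconversion_rate(3e-4, 1e8, 2e5) / p_au
loss_nc = autoconversion_rate(3e-4, 2e8, 1e5) / p_au
```

The two sensitivity ratios make the near‐reciprocal dependence on N r and N c explicit.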

We also performed the minimization technique on the drizzle initialization data set using only q c and N c, resulting in

$$P_{au} = 4 \times 10^{17}\, q_c^{4.08}\, N_c^{-2.25}. \tag{8}$$

This is in very close agreement with the analytical expression in Seifert and Beheng (2001) that was based on a gamma distribution for the cloud droplet mass and the Long kernel, in the limit of drizzle initiation.

5. The Dependence of Autoconversion on Drizzle Number Concentration N r

Results in Sections 3 and 4 demonstrate that both P au and P ac are influenced by cloud and drizzle simultaneously. The influence of cloud and drizzle properties on accretion makes sense and is consistent with collision‐coalescence theory and existing parameterizations, but the influence of drizzle on autoconversion is less straightforward.

From the definition of autoconversion, one should not expect a causal relationship between N r and P au. Instead, the dependence of P au on N r represents the influence on P au from the evolution stage of the cloud DSD, which is related to the appearance of raindrops. Such an influence was first pointed out by Cotton (1972), followed by Seifert and Beheng (2001) who incorporated this associated relationship using the ratio of q r to total water content (shown as one of the options in Table 1). Zeng and Li (2020) also demonstrated that q r is a good predictor of the width of the cloud droplet size distribution, and thus P au. However, it remains unclear what is the best form to describe this associated relationship, and whether q r and N r are equally effective predictors.

To understand whether N r contains different information from q r and whether N r is an effective predictor for all coalescence regimes, we conducted a number of ML tests (see Table S2). For the regime of P au ≥ P ac, we found the use of (q c, N c, q r, N r) remains the best, and the performances from (q c, N c, N r) and (q c, N c, q r) are similar. For all regimes, the use of (q c, N c, q r, N r) is better than (q c, N c, N r), and the latter is better than (q c, N c, q r). These results suggest that q r and N r contain different information. They also suggest that q r and N r are equally effective predictors in the autoconversion‐dominant regime, but N r is a better choice for all regimes. This is understandable as q r depends also on the accretion rate, while N r is not affected by accretion.

We further turn to the SCE used to generate the data set to seek a theoretical basis for the associated relationship. The rate of change in the total drop number concentration (N) can be derived from

$$\frac{dN}{dt} = -\frac{1}{2} \int_0^{\infty}\!\!\int_0^{\infty} K(x,y)\, n(x)\, n(y)\, dx\, dy \tag{9}$$

in which x and y represent the mass per drop; n(·) is the drop number concentration as a function of mass; and K(x,y) is the collection kernel.

Defining cloud droplets as those drops with mass smaller than x 0, and rain drops as those with mass larger than x 0 allows us to decompose Equation 9 as:

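$$\begin{aligned}
\frac{dN}{dt} = {} & -\frac{1}{2} \int_0^{x_0}\!\!\int_0^{x_0} K(x,y)\, n(x)\, n(y)\, dx\, dy
 - \frac{1}{2} \int_0^{x_0}\!\!\int_{x_0}^{\infty} K(x,y)\, n(x)\, n(y)\, dx\, dy \\
 & - \frac{1}{2} \int_{x_0}^{\infty}\!\!\int_0^{x_0} K(x,y)\, n(x)\, n(y)\, dx\, dy
 - \frac{1}{2} \int_{x_0}^{\infty}\!\!\int_{x_0}^{\infty} K(x,y)\, n(x)\, n(y)\, dx\, dy.
\end{aligned} \tag{10}$$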

The first integral in Equation 10 denotes the changes in N due to interactions between cloud droplets, that is, via self‐collection and autoconversion. The second and third integrals denote interactions between cloud and rain drops, that is, via accretion. The last integral contains interactions between rain drops, that is, via self‐collection of rain drops. To proceed analytically, we use the kernels of Long (1974),

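$$K(x,y) = \begin{cases} k_c\,(x^2 + y^2), & \text{for } x \text{ and } y < x_0; \\ k_r\,(x + y), & \text{for } x \text{ or } y \ge x_0 \end{cases} \tag{11}$$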

as an example, leading to:

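$$\frac{dN}{dt} = -\underbrace{k_c N_c Z_c}_{\text{1st integral}} - \underbrace{k_r (N_c q_r + q_c N_r)}_{\text{2nd and 3rd integrals}} - \underbrace{k_r N_r q_r}_{\text{last integral}}, \tag{12}$$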

where $Z_c$ is defined as:

$$Z_c \equiv \int_0^{x_0} x^2\, n(x)\, dx = \int_0^{x_0} x \cdot x\, n(x)\, dx = \int_0^{x_0} x\, q(x)\, dx, \tag{13}$$

proportional to the radar reflectivity in the Rayleigh regime. As explained above, the underlying processes for each term in Equation 12 are:

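$$\begin{aligned}
-k_c N_c Z_c &= \left.\frac{dN_c}{dt}\right|_{sc} + \left.\frac{dN_c}{dt}\right|_{au}, \\
-k_r (N_c q_r + q_c N_r) &= \left.\frac{dN_c}{dt}\right|_{ac}, \\
-k_r N_r q_r &= \left.\frac{dN_r}{dt}\right|_{sc},
\end{aligned} \tag{14}$$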

where the subscript “sc” denotes self‐collection.
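The reduction of the double integral to bulk moments can be checked numerically for a discrete drop population, where the integrals become double sums. The sketch below assumes Long's (1974) kernel coefficients in SI units (k c = 9.44 × 10^9 m^3 kg^-2 s^-1 and k r = 5.78 m^3 kg^-1 s^-1), places the cloud/rain separation x 0 at the mass of a 50 µm drop, and uses a hypothetical six‐bin population; the function names are illustrative only.

```python
import math

K_C = 9.44e9   # m^3 kg^-2 s^-1, Long (1974) cloud-cloud coefficient
K_R = 5.78     # m^3 kg^-1 s^-1, Long (1974) coefficient when a rain drop is involved

def drop_mass(radius_m, rho_water=1000.0):
    return 4.0 / 3.0 * math.pi * rho_water * radius_m**3

def long_kernel(x, y, x0):
    """Piecewise kernel of Equation 11."""
    if x < x0 and y < x0:
        return K_C * (x * x + y * y)
    return K_R * (x + y)

def dndt_double_sum(masses, numbers, x0):
    """Discrete analogue of the double integral for dN/dt."""
    total = 0.0
    for xi, ni in zip(masses, numbers):
        for xj, nj in zip(masses, numbers):
            total += long_kernel(xi, xj, x0) * ni * nj
    return -0.5 * total

def dndt_moments(masses, numbers, x0):
    """Same quantity from bulk moments, following Equation 12."""
    cloud = [(x, n) for x, n in zip(masses, numbers) if x < x0]
    rain = [(x, n) for x, n in zip(masses, numbers) if x >= x0]
    n_c = sum(n for _, n in cloud)
    q_c = sum(x * n for x, n in cloud)
    z_c = sum(x * x * n for x, n in cloud)
    n_r = sum(n for _, n in rain)
    q_r = sum(x * n for x, n in rain)
    return -K_C * n_c * z_c - K_R * (n_c * q_r + q_c * n_r) - K_R * n_r * q_r

# Hypothetical population: four cloud bins and two rain bins (numbers in m^-3)
radii = [5e-6, 10e-6, 15e-6, 20e-6, 60e-6, 100e-6]
numbers = [5e7, 3e7, 1e7, 5e6, 1e4, 1e3]
masses = [drop_mass(r) for r in radii]
x0 = drop_mass(50e-6)

full = dndt_double_sum(masses, numbers, x0)
bulk = dndt_moments(masses, numbers, x0)
```

The two results agree to rounding error, which confirms that each group of integrals collapses to the products of moments shown in Equation 12.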

Similar to Equation 9, we can perform the integrals for q c, that is,

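$$\left.\frac{dq_c}{dt}\right|_{ac} = -\int_0^{x_0}\!\!\int_{x_0}^{\infty} x\, K(x,y)\, n(x)\, n(y)\, dy\, dx = -k_r \int_0^{x_0}\!\!\int_{x_0}^{\infty} x\,(x+y)\, n(x)\, n(y)\, dy\, dx$$

which leads to

$$\left.\frac{dq_c}{dt}\right|_{ac} = -k_r (N_r Z_c + q_c q_r). \tag{15}$$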

Let us define

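$$q_c = \bar{x}_c N_c, \tag{16}$$

where $\bar{x}_c$ is the average mass per cloud droplet. Then, we find

$$\frac{dq_c}{dt} = \frac{d(\bar{x}_c N_c)}{dt} = \bar{x}_c \frac{dN_c}{dt} + N_c \frac{d\bar{x}_c}{dt}. \tag{17}$$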

Since

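$$\frac{dq_c}{dt} = \left.\frac{dq_c}{dt}\right|_{au} + \left.\frac{dq_c}{dt}\right|_{ac}, \quad \text{and} \quad \frac{dN_c}{dt} = \left.\frac{dN_c}{dt}\right|_{sc} + \left.\frac{dN_c}{dt}\right|_{au} + \left.\frac{dN_c}{dt}\right|_{ac}, \tag{18}$$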

we can combine Equations 17 and 18 to express the autoconversion process rate as:

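$$\begin{aligned}
P_{au} = -\left.\frac{dq_c}{dt}\right|_{au} &= -\frac{dq_c}{dt} + \left.\frac{dq_c}{dt}\right|_{ac} = -\bar{x}_c \frac{dN_c}{dt} - N_c \frac{d\bar{x}_c}{dt} + \left.\frac{dq_c}{dt}\right|_{ac} \\
&= -\bar{x}_c \left( \left.\frac{dN_c}{dt}\right|_{sc} + \left.\frac{dN_c}{dt}\right|_{au} + \left.\frac{dN_c}{dt}\right|_{ac} \right) + \left.\frac{dq_c}{dt}\right|_{ac} - N_c \frac{d\bar{x}_c}{dt}.
\end{aligned} \tag{19}$$

Inserting Equations 14–16 into Equation 19 yields

$$\begin{aligned}
P_{au} &= \bar{x}_c \left( k_c N_c Z_c + k_r (N_c q_r + q_c N_r) \right) - k_r (N_r Z_c + q_c q_r) - N_c \frac{d\bar{x}_c}{dt} \\
&= k_c q_c Z_c + k_r \bar{x}_c q_c N_r - k_r N_r Z_c - N_c \frac{d\bar{x}_c}{dt} \\
&= k_r q_c^2 \frac{N_r}{N_c} + k_c q_c Z_c - k_r N_r Z_c - N_c \frac{d\bar{x}_c}{dt}
\end{aligned} \tag{20}$$

with $k_r$ = 5.78 m$^{3}$ kg$^{-1}$ s$^{-1}$ and $k_c$ = 9.44 × $10^{9}$ m$^{3}$ kg$^{-2}$ s$^{-1}$ (Long, 1974).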

There are a number of interesting features in Equation 20. First, the right‐hand side is independent of q r. Second, the first term on the right‐hand side has a remarkable resemblance to our power laws, showing the dependence of P au on N r and the reciprocal feature between N c and N r. Since the cloud/rain separation in Long's kernels is at a radius of 50 µm, we extend the derivation of Equation 20 to cases in which the cutoff size is smaller than Long's separation value. As shown in supporting information Text S1, Equation 20 can be rewritten as:

$$P_{au} = k_r q_c^2 \frac{N_r}{N_c} + k_c q_c Z_c - k_r N_r Z_c + \left( k_c \frac{q_c Z_c}{N_c} - k_r \frac{q_c^2}{N_c} - k_c T_c + k_r Z_c \right) N_r' - N_c \frac{d\bar{x}_c}{dt}, \tag{21}$$

where $T_c$ is the third moment of the mass distribution, and $N_r'$ is the number concentration in the range between $x_0$ and the drop mass at 50 µm radius.

To examine the role of the first term in Equation 21, we calculate all terms for a wide range of size distributions approximated by various combinations of lognormal and gamma distributions with realistic cloud and drizzle properties (Figure S1). Our results show that although the first term alone cannot completely replicate P au, compared to the second, third, $N_r'$‐related, and the last term, the first term is closer to P au than each of them in 95%, 100%, 71%, and 100% of all cases, respectively, with the relative contributions of terms dependent on the size distributions. This supports the finding in Table 1 that the first term is a good predictor of P au, and provides evidence as to why our P au estimates show a dependence on N r, and why the inclusion of N r in the autoconversion parameterization is beneficial.

6. The Effect of Turbulence

The ML model and power‐law parameterizations introduced above are based on SCE calculations in nonturbulent conditions. Since small‐scale turbulence can enhance the collection rate (e.g., Ayala et al., 2008; Chen et al., 2018; Grabowski & Wang, 2013; Wang & Grabowski, 2009), we evaluate the turbulence impact by incorporating the enhancement of the collision efficiency tabulated in Wang and Grabowski (2009). Using the enhancement factor under turbulent cloud conditions with a dissipation rate of 400 cm$^{2}$ s$^{-3}$, we found that the exponents in power‐law relationships did not change significantly (see Table 1). Ignoring the turbulence effects in the ML model leads to (−15% ± 13%) errors in P au and (−10% ± 7%) errors in P ac. The medians of the error histograms are −18% for P au and −7% for P ac. Compared to the median errors introduced by the KK parameterization, which are respectively 45% and −20% for P au and P ac (see Figures 2 and 3), the errors due to turbulence collision effects in our ML model are smaller and can be accounted for if the dissipation rate can be estimated from radar or lidar observations.

7. Summary

We have built machine‐learning models to predict autoconversion and accretion rates from cloud and drizzle properties, using cloud probe measurements from the ACE‐ENA campaign in the Azores and the stochastic collection equation formulated as a two‐moment bin model. Overall, the estimated autoconversion and accretion rates from the machine‐learning model agree with the observed rates to within 15% and 5%, respectively. The standard model requires concurrent, separated cloud and drizzle water contents and number concentrations, which can be obtained from in situ observations or retrievals from remote sensing measurements (e.g., Fielding et al., 2015; Mace et al., 2016; Rusli et al., 2017; Wu et al., 2020).

The joint analyses from the machine‐learning model and optimization techniques revealed a robust dependence of autoconversion on drizzle number concentration. The dependence on drizzle number concentration also shows a reciprocity with the dependence on cloud droplet number concentration. These findings are unexpected, because the autoconversion process represents the coalescence between cloud droplets and is causally only related to cloud properties. However, drizzle number concentration does contain information on the width and evolution of the DSD, and hence indirectly on the autoconversion rate. By using simple collection kernels, we replicate the dependence and reciprocity in theoretical derivations. This implies that these features are physical and can be incorporated to improve parameterizations of autoconversion rate. The power‐law parameterizations also suggest that the autoconversion rate relates to cloud droplet number concentration with an exponent of about −0.75, that is, smaller in magnitude than often assumed, which will affect precipitation susceptibility, and therefore warrants further investigation.

Supporting information

Supporting Information S1

Acknowledgments

This research was supported by the Office of Science (BER), DOE under Grants DE‐SC0021167, DE‐SC0013489, DE‐SC0020259, and DE‐89243020SSC000055. Van Leeuwen was supported by the European Research Council under the CUNDA project 694509.

Chiu, J. C., Yang, C. K., van Leeuwen, P. J., Feingold, G., Wood, R., Blanchard, Y., et al. (2021). Observational constraints on warm cloud microphysical processes using machine learning and optimization techniques. Geophysical Research Letters, 48, e2020GL091236. https://doi.org/10.1029/2020GL091236

Data Availability Statement

ARM data are available online through http://www.archive.arm.gov. The work on machine learning used resources of the Compute and Data Environment for Science (CADES) at the Oak Ridge National Laboratory, under DOE Contract No. DE‐AC05‐00OR22725. The training and testing data sets, and the machine‐learning trained models are available freely in the ARM Archive and in Github (https://github.com/yang0920colostate/AuAc).

References

  1. Abel, S. J. , & Boutle, I. A. (2012). An improved representation of the raindrop size distribution for single‐moment microphysics schemes. Quarterly Journal of the Royal Meteorological Society, 138, 2151–2162. [Google Scholar]
  2. Ahlgrimm, M. , & Forbes, R. (2014). Improving the representation of low clouds and drizzle in the ECMWF model based on ARM observations from the Azores. Monthly Weather Review, 142, 668–685. [Google Scholar]
  3. Ayala, O. , Rosa, B. , Wang, L.‐P. , & Grabowski, W. W. (2008). Effects of turbulence on the geometric collision rate of sedimenting droplets. Part 1: Results from direct numerical simulation. New Journal of Physics, 10, 075015 10.1088/1367-2630/10/7/075015 [DOI] [Google Scholar]
  4. Beheng, K. D. (1994). A parameterization of warm cloud microphysical conversion processes. Atmospheric Research, 33, 193–206. [Google Scholar]
  5. Berry, E. X. (1968). Modification of the warm rain process. Preprints, First national conference on weather modification (pp. 81–88). Albany, NY: American Meteorological Society. [Google Scholar]
  6. Bodas‐Salcedo, A. , Webb, M. J. , Brooks, M. E. , Ringer, M. A. , William, K. D. , et al. (2008). Evaluating cloud systems in the Met Office global forecast model using simulated CloudSat radar reflectivities. Journal of Geophysical Research, 113, D00A13 10.1029/2007JD009620 [DOI] [Google Scholar]
  7. Byrd, R. H. , Lu, P. , Nocedal, J. , & Zhu, C. (1995). A limited memory Algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16(5), 1190–1208. [Google Scholar]
  8. Chen, S. , Yau, M. K. , & Bartello, P. (2018). Turbulence effects of collision efficiency and broadening of droplet size distribution in cumulus clouds. Journal of the Atmospheric Sciences, 75, 203–217. 10.1175/JAS-D-17-0123.1 [DOI] [Google Scholar]
  9. Cotton, W. R. (1972). Numerical simulation of precipitation development in supercooled cumuli. Monthly Weather Review, 100, 764–784. 10.1175/1520-0493(1972)100,0764:NSOPDI.2.3.CO.2 [DOI] [Google Scholar]
  10. Feingold, G., Kreidenweis, S. M., Stevens, B., & Cotton, W. R. (1996). Numerical simulation of stratocumulus processing of cloud condensation nuclei through collision‐coalescence. Journal of Geophysical Research, 101, 21391–21402.
  11. Fielding, M. D., Chiu, J. C., Hogan, R. J., Feingold, G., Eloranta, E., O'Connor, E. J., & Cadeddu, M. P. (2015). Joint retrievals of cloud and drizzle in marine boundary layer clouds using ground‐based radar, lidar and zenith radiances. Atmospheric Measurement Techniques, 8, 2663–2683. doi: 10.5194/amt-8-2663-2015
  12. Glienke, S., & Mei, F. (2019). Two‐dimensional stereo (2D‐S) probe instrument handbook. Retrieved from https://www.arm.gov/publications/tech_reports/handbooks/doe-sc-arm-tr-233.pdf
  13. Glienke, S., & Mei, F. (2020). Fast cloud droplet probe (FCDP) instrument handbook. Retrieved from https://www.arm.gov/publications/tech_reports/handbooks/doe-sc-arm-tr-238.pdf
  14. Golaz, J.‐C., Horowitz, L. W., & Levy, H. II (2013). Cloud tuning in a coupled climate model: Impact on 20th century warming. Geophysical Research Letters, 40, 2246–2251. doi: 10.1002/grl.5023
  15. Grabowski, W. W., & Wang, L.‐P. (2013). Growth of cloud droplets in a turbulent environment. Annual Review of Fluid Mechanics, 45, 293–324.
  16. Hall, W. D. (1980). A detailed microphysical model within a two‐dimensional dynamic framework: Model description and preliminary results. Journal of the Atmospheric Sciences, 37, 2486–2507.
  17. Hsieh, W. C., Jonsson, H., Wang, L.‐P., Buzorius, G., Flagan, R. C., Seinfeld, J. H., & Nenes, A. (2009). On the representation of droplet coalescence and autoconversion: Evaluation using ambient cloud droplet size distributions. Journal of Geophysical Research, 114, D07201. doi: 10.1029/2008JD010502
  18. Jing, X., Suzuki, K., Guo, H., Goto, D., Ogura, T., Koshiro, T., & Mülmenstädt, J. (2017). A multimodel study on warm precipitation biases in global models compared to satellite observations. Journal of Geophysical Research: Atmospheres, 122, 11806–11824. doi: 10.1002/2017JD027310
  19. Kessler, E. (1969). On the distribution and continuity of water substance in atmospheric circulations. Meteorological monographs (Vol. 10, pp. 1–84). Boston, MA: American Meteorological Society. doi: 10.1007/978-1-935704-36-2_1
  20. Khairoutdinov, M., & Kogan, Y. (2000). A new cloud physics parameterization in a large‐eddy simulation model of marine stratocumulus. Monthly Weather Review, 128, 229–243.
  21. Kingma, D. P., & Ba, J. L. (2015). Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015, Conference Track Proceedings. Retrieved from http://arxiv.org/abs/1412.6980
  22. Lee, H., & Baik, J.‐J. (2017). A physically based autoconversion parameterization. Journal of the Atmospheric Sciences, 74, 1599–1616.
  23. Liu, Y., & Daum, P. H. (2004). Parameterization of the autoconversion process. Part I: Analytical formulation of the Kessler‐type parameterizations. Journal of the Atmospheric Sciences, 61, 1539–1548.
  24. Long, A. B. (1974). On the evaluation of the collection kernel for the coalescence of water droplets. Journal of the Atmospheric Sciences, 31, 1040–1052.
  25. Maas, A. L., Hannun, A. Y., & Ng, A. Y. (2013). Rectifier nonlinearities improve neural network acoustic models. In Proceedings of the 30th International Conference on Machine Learning.
  26. Mace, G. G., Avey, S., Cooper, S., Lebsock, M., Tanelli, S., & Dobrowalski, G. (2016). Retrieving co‐occurring cloud and precipitation properties of warm marine boundary layer clouds with A‐Train data. Journal of Geophysical Research: Atmospheres, 121, 4008–4033. doi: 10.1002/2015JD023681
  27. Mei, F., Wang, J., Comstock, J. M., Weigel, R., Krämer, M., Mahnke, C., et al. (2020). Comparison of aircraft measurements during GoAmazon2014/5 and ACRIDICON‐CHUVA. Atmospheric Measurement Techniques, 13, 661–684. doi: 10.5194/amt-13-661-2020
  28. Rusli, S. P., Donovan, D. P., & Russchenberg, H. W. J. (2017). Simultaneous and synergistic profiling of cloud and drizzle properties using ground‐based observations. Atmospheric Measurement Techniques, 10, 4777–4803. doi: 10.5194/amt-10-4777-2017
  29. Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61, 85–117. doi: 10.1016/j.neunet.2014.09.003
  30. Seifert, A., & Beheng, K. D. (2001). A double‐moment parameterization for simulating autoconversion, accretion and self‐collection. Atmospheric Research, 59, 265–281. doi: 10.1016/S0169-8095(01)00126-0
  31. Stephens, G. L., L'Ecuyer, T., Forbes, R., Gettelman, A., Golaz, J.‐C., Bodas‐Salcedo, A., et al. (2010). Dreary state of precipitation in global models. Journal of Geophysical Research, 115, D24211. doi: 10.1029/2010JD014532
  32. Tzivion, S., Feingold, G., & Levin, Z. (1987). An efficient numerical solution to the stochastic coalescence equation. Journal of the Atmospheric Sciences, 44(21), 3139–3149.
  33. Wang, L.‐P., & Grabowski, W. W. (2009). The role of air turbulence in warm rain initiation. Atmospheric Science Letters, 10, 1–8.
  34. Wood, R. (2005). Drizzle in stratiform boundary layer clouds. Part II: Microphysical aspects. Journal of the Atmospheric Sciences, 62, 3034–3050.
  35. Wu, P., Dong, X., Xi, B., Tian, J., & Ward, D. M. (2020). Profiles of MBL cloud and drizzle microphysical properties retrieved from ground‐based observations and validated by aircraft in situ measurements over the Azores. Journal of Geophysical Research: Atmospheres, 125, e2019JD032205. doi: 10.1029/2019JD032205
  36. Wyant, M. C., Bretherton, C. S., Wood, R., Carmichael, G. R., Clarke, A., Fast, J., et al. (2015). Global and regional modeling of clouds and aerosols in the marine boundary layer during VOCALS: The VOCA intercomparison. Atmospheric Chemistry and Physics, 15(1), 153–172.
  37. Zeng, X., & Li, X. (2020). Two‐moment bulk parameterization of the drop collection growth in warm clouds. Journal of the Atmospheric Sciences, 77, 797–811.

Associated Data

Supplementary Materials

Supporting Information S1

Data Availability Statement

ARM data are available online at http://www.archive.arm.gov. The machine‐learning work used resources of the Compute and Data Environment for Science (CADES) at the Oak Ridge National Laboratory, under DOE Contract No. DE‐AC05‐00OR22725. The training and testing data sets and the trained machine‐learning models are freely available in the ARM Archive and on GitHub (https://github.com/yang0920colostate/AuAc).


Articles from Geophysical Research Letters are provided here courtesy of Wiley
