PLOS ONE. 2023 Feb 17;18(2):e0279774. doi: 10.1371/journal.pone.0279774

Decomposition of the mean absolute error (MAE) into systematic and unsystematic components

Scott M Robeson 1,*, Cort J Willmott 2
Editor: Fabiana Zama
PMCID: PMC9937461  PMID: 36800326

Abstract

When evaluating the performance of quantitative models, dimensioned errors often are characterized by sums-of-squares measures such as the mean squared error (MSE) or its square root, the root mean squared error (RMSE). In terms of quantifying average error, however, absolute-value-based measures such as the mean absolute error (MAE) are more interpretable than MSE or RMSE. Part of the historical preference for sums-of-squares measures is that they are mathematically amenable to decomposition, allowing ratios to be formed, such as those based on separating MSE into its systematic and unsystematic components. Here, we develop and illustrate a decomposition of MAE into three useful submeasures: (1) bias error, (2) proportionality error, and (3) unsystematic error. This three-part decomposition of MAE is preferable to comparable decompositions of MSE because it provides more straightforward information on the nature of the model-error distribution. We illustrate the properties of our new three-part decomposition using a long-term reconstruction of streamflow for the Upper Colorado River.

Introduction

Across the sciences, model-estimation and -prediction errors are often summarized and analyzed using dimensioned [1] and dimensionless [2] measures. While dimensionless error measures have received considerable attention [3–5], dimensioned measures are better suited to summarizing the magnitude of model error in meaningful units. When given a set of model predictions (P_i, i = 1, 2, …, n), where each P_i corresponds to a reliable observation (O_i), the mean squared error (MSE) and the root mean squared error (RMSE):

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(P_i - O_i)^2 \qquad (1)$$
$$\mathrm{RMSE} = \left[\frac{1}{n}\sum_{i=1}^{n}(P_i - O_i)^2\right]^{1/2} \qquad (2)$$

are routinely reported [6]. The mean absolute error (MAE):

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\lvert P_i - O_i\rvert \qquad (3)$$

is reported less often, even though it has a clearer interpretation than RMSE because MAE is the average error [7].
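As a quick illustration, the three measures can be computed in a few lines of Python. This is only a sketch using the six-point example from Table 1; the paper's own reference code (in R and Matlab) is in the S1 File.

```python
# Eqs 1-3 computed on the six-point example from Table 1.
# Illustrative sketch; the authors' reference code (R/Matlab) is in S1 File.
O = [-3, -2, 2, 3, 4, 5]   # observations O_i
P = [0, -4, 0, 1, 4, 2]    # model predictions P_i
n = len(O)

mse = sum((p - o) ** 2 for p, o in zip(P, O)) / n    # Eq 1: mean squared error
rmse = mse ** 0.5                                    # Eq 2: root mean squared error
mae = sum(abs(p - o) for p, o in zip(P, O)) / n      # Eq 3: mean absolute error
```

For these data, MSE = 5.0 and MAE = 2.0, so RMSE (≈2.24) exceeds MAE, as it always does unless all absolute errors are equal.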

We and others have shown elsewhere that error statistics based on sums-of-squares have a number of issues that make them less interpretable than those based on absolute values [1, 4, 7–12]. This is especially the case when they are used as measures of average model error. An additional drawback to using MSE is that its squared dimensional units are difficult to interpret. As a result, MAE is the preferred measure of average model error. Even so, sums-of-squares measures continue to be assessed and reported, partly due to inertia, but also due to their amenability to mathematical decomposition into additive variance-based measures. In the context of evaluating model error, this property was used by Willmott in 1981 [13] to decompose MSE into systematic (MSEs) and unsystematic (MSEu) components:

$$\mathrm{MSE}_s = \frac{1}{n}\sum_{i=1}^{n}(\hat{P}_i - O_i)^2 \qquad (4)$$
$$\mathrm{MSE}_u = \frac{1}{n}\sum_{i=1}^{n}(P_i - \hat{P}_i)^2 \qquad (5)$$

such that

$$\mathrm{MSE} = \mathrm{MSE}_s + \mathrm{MSE}_u. \qquad (6)$$

For both MSEs and MSEu, ordinary least-squares (OLS) regression of the model predictions on the observations typically is used to obtain P̂_i (i.e., a linear fit of P on O).

As computed above, MSEs is typically interpreted as consistent over- and/or under-prediction of the observations by the model (i.e., the model has non-zero mean bias and/or the regression slope is not one). The unsystematic component provides an estimate of the model’s random error, or scatter about the regression line. Forming the ratios MSEs/MSE and MSEu/MSE gives estimates of the fraction of total error (as estimated by MSE) that is identified as systematic or unsystematic. This decomposition has served as a relatively insightful summary of model error (e.g., [14]) and has been used as a guide to model improvement, because a model with a large amount of systematic error usually can be respecified to reduce the consistent over- or under-prediction.
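The two-part MSE decomposition can be sketched in Python as follows, with P̂ obtained by OLS regression of P on O as described in the text (data again from Table 1; this is an illustrative sketch, not the authors' code):

```python
# Two-part decomposition of MSE (Eqs 4-6) for the Table 1 data (sketch).
O = [-3, -2, 2, 3, 4, 5]
P = [0, -4, 0, 1, 4, 2]
n = len(O)

# OLS regression of P on O gives the regression values P-hat
Obar, Pbar = sum(O) / n, sum(P) / n
slope = (sum((o - Obar) * (p - Pbar) for o, p in zip(O, P))
         / sum((o - Obar) ** 2 for o in O))
Phat = [Pbar + slope * (o - Obar) for o in O]

mse   = sum((p - o) ** 2 for p, o in zip(P, O)) / n       # total MSE
mse_s = sum((ph - o) ** 2 for ph, o in zip(Phat, O)) / n  # systematic, Eq 4
mse_u = sum((p - ph) ** 2 for p, ph in zip(P, Phat)) / n  # unsystematic, Eq 5
# mse_s + mse_u equals mse (Eq 6); here 14.24/6 + 15.76/6 = 30/6
```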

While decomposing MSE into its constituent components has been a useful approach, MSE is a flawed measure of average model error. Using MSE to identify systematic and unsystematic components of error, therefore, can produce misleading summaries of the types of errors that various models contain. Even more importantly, models may be inappropriately adjusted to reduce systematic error that has been misidentified by the MSE-based approach, e.g., when the impacts of outliers are overemphasized. As a result, our goal here is to develop and present a more rational approach for error decomposition that uses MAE as the baseline for average model error.

Decomposition of MAE into three components

Although our goal is to partition MAE into components that represent systematic and unsystematic error patterns, we also want to move beyond the traditional two-part decomposition and further divide systematic errors into two separate components. One indicates the amount of bias in a model; the other represents the extent to which the model predictions systematically under- or over-estimate observations falling below and above the observed mean (i.e., the regression slope is not one). The latter is referred to as proportionality error and is distinct from model bias, which is represented by the under- or over-estimation of the observed mean.

For each of these three types of error—bias, proportionality, and unsystematic—we develop a weighting function that can be used to partition MAE into its three components. We offer a diagram (Fig 1) using a small, synthetic dataset (Table 1) to illustrate the estimation of the three components.

Fig 1. Representations of model-prediction errors showing aspects of the decomposition into systematic and unsystematic components (using data from Table 1).

Fig 1

Predictions (in the upper-left panel) can be decomposed in the traditional way using MSE, as shown in the upper-right panel, where the lengths of the red and blue dotted vertical lines determine the partitioning of the errors. The bottom-left panel shows the predictions after bias is removed (i.e., P′_i), while the bottom-right panel shows the magnitudes of the bias, proportionality, and unsystematic components of our three-part, weight-based decomposition of MAE.

Table 1. Data and error components for the example in Fig 1.

For the components whose sum (Σ) is given in the far-right column, dividing by 6 gives the mean value (i.e., Ō, P̄, MSE, MSEs, MSEu, and MAE). For the other three rows, which contain absolute values and are used to form the b, p_i, and u_i weights that then determine MAEb, MAEp, and MAEu (see Eqs 14–16), the sums are not relevant and, therefore, are not given.

Component              1       2       3       4       5       6       Σ
Observed (O_i)        -3      -2       2       3       4       5       9
Predicted (P_i)        0      -4       0       1       4       2       3
(P_i − O_i)²           9       4       4       4       0       9      30
(P̂_i − O_i)²       0.587   0.140   1.431   2.524   3.926   5.635   14.24
(P_i − P̂_i)²       4.989   5.635   0.646   0.169   3.926   0.392   15.76
|P_i − O_i|            3       2       2       2       0       3      12
b = |MBE|              1       1       1       1       1       1
p_i = |P̂′_i − O_i|  1.766   1.374   0.196   0.589   0.981   1.374
u_i = |P′_i − P̂′_i| 2.237   2.374   0.804   0.411   1.981   0.626

Bias error

Here, we define bias as the component of systematic error that is contained in the over- or under-prediction of the observed mean. This is often referred to as the mean bias error (MBE):

$$\mathrm{MBE} = \bar{P} - \bar{O} \qquad (7)$$

In addition to indicating average over- or under-prediction, MBE can be used to develop a corresponding (to Pi) set of unbiased predicted values:

$$P'_i = P_i - \mathrm{MBE} \qquad (8)$$

The magnitude (absolute value) of MBE can additionally serve as the weight that determines the relative importance of bias to the overall MAE:

$$b = \lvert\mathrm{MBE}\rvert. \qquad (9)$$

The magnitude of bias for our example dataset (Table 1) can be seen in the bottom left panel in Fig 1, where the model predictions systematically underestimate the mean of the observations by 1.
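In Python, the bias component for the Table 1 data looks like this (a sketch, not the authors' S1 code):

```python
# Bias error (Eqs 7-9) for the Table 1 data (illustrative sketch).
O = [-3, -2, 2, 3, 4, 5]
P = [0, -4, 0, 1, 4, 2]
n = len(O)

mbe = sum(P) / n - sum(O) / n       # Eq 7: MBE = Pbar - Obar (here -1)
P_unbiased = [p - mbe for p in P]   # Eq 8: P'_i, the bias-removed predictions
b = abs(mbe)                        # Eq 9: bias weight b (here 1)
```

The bias-removed predictions have the same mean as the observations, which is what "unbiased" means here.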

Proportionality error

In addition to bias error (caused by an incorrect estimation of the observed mean), there is another systematic error related to consistent under- or over-prediction. This error is reflected in the slope of the regression of P′_i on O_i (note that if the regressions are estimated using OLS, then this slope estimate is the same as that of P on O). If the slope of this relationship is anything other than unity, there is proportionality error in the model predictions. A slope of less than one indicates that the model systematically overestimates values below Ō and underestimates those above. Conversely, a slope greater than one indicates that the model systematically underestimates values below Ō and overestimates those above. To estimate proportionality error, we use the unbiased predicted regression values:

$$\hat{P}'_i = \hat{P}_i - \mathrm{MBE} \qquad (10)$$

Given that the OLS solution for P̂ is constrained to pass through (Ō, P̄), P̂′ passes through (Ō, Ō) and, therefore, is unbiased. Weights for the relative importance of proportionality error (for each O_i) are determined using the difference between the unbiased regression values and the observations (the red lines in Fig 1d):

$$p_i = \lvert \hat{P}'_i - O_i \rvert. \qquad (11)$$
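A Python sketch of the proportionality weights for the Table 1 data, with P̂ obtained by OLS regression of P on O as in the text (illustrative only):

```python
# Proportionality weights (Eqs 10-11) for the Table 1 data (sketch).
O = [-3, -2, 2, 3, 4, 5]
P = [0, -4, 0, 1, 4, 2]
n = len(O)

Obar, Pbar = sum(O) / n, sum(P) / n
mbe = Pbar - Obar
slope = (sum((o - Obar) * (p - Pbar) for o, p in zip(O, P))
         / sum((o - Obar) ** 2 for o in O))
Phat = [Pbar + slope * (o - Obar) for o in O]   # OLS regression values
Phat_unbiased = [ph - mbe for ph in Phat]       # Eq 10: bias-removed regression values
p_w = [abs(phu - o) for phu, o in zip(Phat_unbiased, O)]  # Eq 11: p_i weights
# p_w matches the p_i row of Table 1: 1.766, 1.374, 0.196, 0.589, 0.981, 1.374
```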

Unsystematic error

After accounting for bias and proportionality errors, the remaining error is related to scatter about P^. Analogous to the way that the individual components of MSEu are formed, weights for the relative importance of each prediction’s unsystematic error are determined using the difference between the unbiased predictions and the unbiased regression values:

$$u_i = \lvert P'_i - \hat{P}'_i \rvert. \qquad (12)$$

Once again, if OLS regression is used to obtain P̂_i, then the biased and unbiased predictions and regression values produce the same weights:

$$u_i = \lvert P'_i - \hat{P}'_i \rvert = \lvert P_i - \hat{P}_i \rvert. \qquad (13)$$
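The equality of the two forms is easy to check numerically: the MBE shift applied to both P and P̂ cancels inside the absolute value. A Python sketch for the Table 1 data:

```python
# Unsystematic weights (Eqs 12-13) for the Table 1 data (sketch).
O = [-3, -2, 2, 3, 4, 5]
P = [0, -4, 0, 1, 4, 2]
n = len(O)

Obar, Pbar = sum(O) / n, sum(P) / n
mbe = Pbar - Obar
slope = (sum((o - Obar) * (p - Pbar) for o, p in zip(O, P))
         / sum((o - Obar) ** 2 for o in O))
Phat = [Pbar + slope * (o - Obar) for o in O]

# Eq 12 (bias-removed form) and Eq 13 (biased form): the MBE shift cancels,
# so the two give identical weights when P-hat comes from OLS.
u_w_unbiased = [abs((p - mbe) - (ph - mbe)) for p, ph in zip(P, Phat)]
u_w = [abs(p - ph) for p, ph in zip(P, Phat)]
```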

Three-component decomposition of MAE

The three weights for bias, proportionality, and unsystematic error developed above now can be used to scale the individual components of absolute error:

$$\mathrm{MAE}_b = \frac{1}{n}\sum_{i=1}^{n}\frac{b}{b + p_i + u_i}\,\lvert P_i - O_i\rvert, \qquad (14)$$
$$\mathrm{MAE}_p = \frac{1}{n}\sum_{i=1}^{n}\frac{p_i}{b + p_i + u_i}\,\lvert P_i - O_i\rvert, \qquad (15)$$
$$\mathrm{MAE}_u = \frac{1}{n}\sum_{i=1}^{n}\frac{u_i}{b + p_i + u_i}\,\lvert P_i - O_i\rvert. \qquad (16)$$

A clear advantage of this weight-based decomposition of average error is that it uses MAE rather than MSE as the baseline. Another advantage is that predictions that have no error do not contribute to the components. This was not the case with the MSE-based decomposition, where predictions that have no error can substantially influence the values of MSEs and MSEu (e.g., the point that lies directly on the 1:1 line in the upper right panel of Fig 1 contributes substantially to both MSEs and MSEu despite being error-free).

It is possible for the denominator within these summations (i.e., b + p_i + u_i) to be zero, but that can only occur when a model has no bias and the regression line passes through a predicted value that has no error (i.e., when b = 0 and P_i = P̂_i = O_i). If that rare model-prediction event occurs, those elements with b + p_i + u_i = 0 can simply be excluded from the summation.

Given the definitions in Eqs 14–16, MAEb, MAEp, and MAEu sum to MAE:

$$\mathrm{MAE} = \mathrm{MAE}_b + \mathrm{MAE}_p + \mathrm{MAE}_u. \qquad (17)$$

As with MSEs and MSEu, it is instructive to form ratios (i.e., MAEb/MAE, MAEp/MAE, and MAEu/MAE) to identify the proportion of total error contributed by each component. The constraints within the weighted decomposition of MAE diminish MAEb relative to the magnitude of MBE. MBE, therefore, remains a useful metric to be reported when analyzing model error. R and Matlab functions for these calculations are provided in the S1 File.
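Putting the pieces together, the decomposition in Eqs 14–17 can be sketched as a single Python function. This is a minimal translation under the definitions above, not the authors' reference implementation (their R and Matlab functions are in the S1 File):

```python
def mae_decomposition(O, P):
    """Split MAE into bias, proportionality, and unsystematic parts (Eqs 14-16)."""
    n = len(O)
    Obar, Pbar = sum(O) / n, sum(P) / n
    mbe = Pbar - Obar
    slope = (sum((o - Obar) * (p - Pbar) for o, p in zip(O, P))
             / sum((o - Obar) ** 2 for o in O))
    Phat = [Pbar + slope * (o - Obar) for o in O]   # OLS regression values
    b = abs(mbe)                                    # bias weight, Eq 9
    mae_b = mae_p = mae_u = 0.0
    for o, p, ph in zip(O, P, Phat):
        p_i = abs((ph - mbe) - o)   # proportionality weight, Eq 11
        u_i = abs(p - ph)           # unsystematic weight, Eq 13
        denom = b + p_i + u_i
        if denom == 0.0:            # rare error-free case noted in the text
            continue
        ae = abs(p - o)
        mae_b += b / denom * ae
        mae_p += p_i / denom * ae
        mae_u += u_i / denom * ae
    return mae_b / n, mae_p / n, mae_u / n

# Table 1 example: the three components sum to MAE = 2.0 (Eq 17)
mae_b, mae_p, mae_u = mae_decomposition([-3, -2, 2, 3, 4, 5], [0, -4, 0, 1, 4, 2])
```

Ratios such as mae_u / (mae_b + mae_p + mae_u) then give the fractions of total error discussed in the text.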

An example of model-estimation errors

To illustrate the properties of our newly derived measures of model error, we use a tree-ring-based reconstruction model developed by Meko et al. [15]. This reconstruction provides over 1200 years of annually resolved flow predictions for the Upper Colorado River at Lee’s Ferry. The large majority of the annual flow in the Colorado River comes from upstream of Lee’s Ferry [16], so the reconstruction is an essential indicator of historical water availability for the Colorado River. The observed and reconstructed flows are water-year totals for a large river and, therefore, are reported in billions of cubic meters per year. Both the observed data and the reconstructed values are based on estimates of “naturalized” streamflow, which corrects for anthropogenic alterations of flow (i.e., reservoirs, irrigation, etc.). In a recent article [14], the model was bias-corrected so that its empirical probability distribution better matched that of the observations (e.g., compare Fig 2a and 2b). Here, we employ our new decomposition of model errors to compare the three sources of error in the original and bias-corrected reconstructions.

Fig 2. Model-estimation errors before and after bias correction.

Fig 2

Model-estimation errors for (a) the reconstruction of annual Upper Colorado River flow (in billions of cubic meters) from [15] and (b) the same reconstruction after applying the bias-correction procedure of [14].

Prior to bias correction (Fig 2a), the Upper Colorado River reconstruction has low overall error, with an MAE of 2.12 billion m³ (i.e., when compared to the Ō of 18.53 billion m³). The small value of MAEb (0.08 billion m³) also shows that the reconstruction model faithfully reproduces the observed mean. From the scatterplot and the substantial amount (34%) of error in MAEp, however, it is clear that high-flow years are underestimated (and, to a lesser extent, low flows are overestimated). Even with these substantial proportionality errors, the majority (62%) of the mean absolute error is in MAEu, which is desirable (i.e., the majority of error is unsystematic). At the same time, the traditional decomposition into MSEs and MSEu masks the distinction between bias and proportionality error while also underestimating these combined systematic errors, because it inflates its representation of the unsystematic error (MSEu) by squaring the model-predicted deviations from the regression line. As a result, the MSE-based measures suggest that there is little room for improvement when, in fact, there is.

Bias correction (Fig 2b) produces a reconstruction model with an MAE (2.05 billion m³) similar to that of the original model. Bias correction has, however, produced much lower error in the two systematic terms of MAE, reducing MAEb to 3% and MAEp to 19% of MAE. From the slope of the regression line, it is clear that some proportionality error remains that the bias-correction procedure has not entirely removed. The MSE-based measures present a rosier picture of the reduction of systematic error, again due to the inflation of the unsystematic error produced by squaring the deviations around the regression line. Overall, the MAE-based approach shows that there is more room for improvement in the original reconstruction (Fig 2a) and in the bias-correction procedure (Fig 2b) than is evident from the MSE-based measures. In particular, the additional systematic component introduced here, MAEp, suggests that high flows still need to be adjusted upward.

Conclusions

Traditional decomposition of sums-of-squared errors into systematic (MSEs) and unsystematic (MSEu) components has been a popular approach for characterizing the different components of model error. These sums-of-squares-based measures, however, have been shown to be imprecise and, at times, deceptive indicators of average error and its constituents. As a result, evaluations of model estimates and predictions should increasingly use absolute-value-based error measures such as MAE. To fill the need for a decomposition of MAE into its constituent components, we present new measures that are formed as weighted averages of the absolute error. As a result, MAE can now be decomposed into three components that represent bias (MAEb), proportionality (MAEp), and unsystematic (MAEu) errors. These measures provide a more interpretable standard for evaluating model errors while also pointing to more specific types of error that may be reduced.

Supporting information

S1 File

(PDF)

Acknowledgments

The authors appreciate the informative and constructive comments of the two reviewers.

Data Availability

The data for the original reconstruction as well as the bias‐corrected version used here are available from NOAA’s Paleoclimatology Data site: https://www.ncei.noaa.gov/access/paleo-search/study/28810.

Funding Statement

The author(s) received no specific funding for this work.

References

  • 1. Willmott CJ, Robeson SM, Matsuura K. Climate and other models may be more accurate than reported. EOS. 2017. 98. [Google Scholar]
  • 2. Willmott CJ, Robeson SM, Matsuura K., Ficklin DL. Assessment of three dimensionless measures of model performance. Environ Mod Softw. 2015. 73: 167–174. doi: 10.1016/j.envsoft.2015.08.012 [DOI] [Google Scholar]
  • 3. Nash JE, Sutcliffe JV. River flow forecasting through conceptual models part I—A discussion of principles. J Hydrol. 1970. 10: 282–290. doi: 10.1016/0022-1694(70)90255-6 [DOI] [Google Scholar]
  • 4. Legates DR, McCabe GJ Jr. Evaluating the use of “goodness-of-fit” measures in hydrologic and hydroclimatic model validation. Wat Resour Res. 1999. 35: 233–241. doi: 10.1029/1998WR900018 [DOI] [Google Scholar]
  • 5. Willmott CJ, Robeson SM, Matsuura K. A refined index of model performance. Intl J Climatol. 2012. 32: 2088–2094. doi: 10.1002/joc.2419 [DOI] [Google Scholar]
  • 6. Jackson EK, Roberts W, Nelsen B, Williams GP, Nelson EJ, Ames DP. Introductory overview: Error metrics for hydrologic modelling–A review of common practices and an open source library to facilitate use and adoption. Environ Mod Softw. 2019. 119: 32–48. doi: 10.1016/j.envsoft.2019.05.001 [DOI] [Google Scholar]
  • 7. Willmott CJ, Matsuura K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim Res. 2005. 30: 79–82. doi: 10.3354/cr030079 [DOI] [Google Scholar]
  • 8. Gao J. Bias-variance decomposition of absolute errors for diagnosing regression models of continuous data. Patterns. 2021. 2: 100309. doi: 10.1016/j.patter.2021.100309 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Mielke PW, Berry K. Permutation methods: a distance function approach. Springer; 2007.
  • 10. Pontius RG Jr, Thontteh O, Chen H. Components of information for multiple resolution comparison between maps that share a real variable. Environ Ecol Stat. 2008. 15: 111–142. doi: 10.1007/s10651-007-0043-y [DOI] [Google Scholar]
  • 11. Willmott CJ, Matsuura K, Robeson SM. Ambiguities inherent in sums-of-squares-based error statistics. Atmos Environ. 2009. 43: 749–752. doi: 10.1016/j.atmosenv.2008.10.005 [DOI] [Google Scholar]
  • 12. Pontius RG Jr. Metrics That Make a Difference: How to Analyze Change and Error. Advances in Geographic Information Science. Springer Nature; 2022. doi: 10.1007/978-3-030-70765-1 [DOI]
  • 13. Willmott CJ. On the validation of models. Phys Geog. 1981. 2(2): 184–94. doi: 10.1080/02723646.1981.10642213 [DOI] [Google Scholar]
  • 14. Robeson SM, Maxwell JT, Ficklin DL. Bias correction of paleoclimatic reconstructions: A new look at 1,200+ years of Upper Colorado River flow. Geophys Res Lett. 2020. e2019GL086689. [Google Scholar]
  • 15. Meko DM, Woodhouse CA, Baisan CA, Knight T, Lukas JJ, Hughes MK, Salzer MW. Medieval drought in the upper Colorado River Basin. Geophys Res Lett. 2007. L10705. doi: 10.1029/2007GL029988 [DOI] [Google Scholar]
  • 16. Christensen NS, Lettenmaier DP. A multimodel ensemble approach to assessment of climate change impacts on the hydrology and water resources of the Colorado River Basin, Hydrol Earth Sys Sci. 2007. 11: 1417–1434. doi: 10.5194/hess-11-1417-2007 [DOI] [Google Scholar]

Decision Letter 0

Fabiana Zama

26 Oct 2022

PONE-D-22-17609

Decomposition of the mean absolute error (MAE) into systematic and unsystematic components

PLOS ONE

Dear Dr. Robeson,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Dec 10 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

 

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Fabiana Zama

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments:

Dear Prof. Scott M. Robeson,

Based on the reviewers' comments, the decision is to accept your manuscript after minor revisions.

Please follow the points raised by each reviewer.

Kind Regards

Fabiana Zama


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

********** 

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

********** 

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

********** 

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

********** 

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: My areas of expertise focus on metrics to measure error and change, which is the exact topic of the submitted manuscript. I have thought about the concepts of the submitted manuscript for decades. Therefore, I report with confidence that the submitted manuscript is a brilliant major breakthrough. I inserted several comments in the PDF as I read. I ask the authors to consider those comments to increase clarity of the manuscript. The submitted manuscript is short, which is a strength. The manuscript states only what is necessary, which any scientific manuscript should do. The manuscript proposes a method to fix the flawed popular paradigm of squared deviations. The authors’ new method makes much better sense than the popular paradigm. Previous methods that have used Mean Absolute Error (Pontius Jr 2022) do not separate the Mean Absolute Error into as many helpful components as the proposed manuscript does. The manuscript illustrates how the new method has helpful practical implications. Below are ideas to make the manuscript even stronger than it already is.

My browser could not activate the link where the authors have posted the data at https://www.ncdc.noaa.gov/dataaccess/paleoclimatology%E2%80%90data

The example is helpful. It would be more helpful to have a column at the right in Table 1 to show the sum. It would be nicer to have the numbers be simpler such as all whole numbers for Pi and Oi, and to have n be a number that makes easier division than by 7.

In figure 1d, it is not immediately clear to me why MAEb = 0.424 rather than |-0.714|, which is the absolute bias in figure 1c. The reader would understand better if the revised manuscript were to have a sentence to explain why |Bias| does not equal MAEb.

Figures 1 and 2 should be consistent in the number of digits in the results. Report the % to the nearest whole number. The other numbers should have exactly two decimal places.

I thank the authors for using sequential line numbers.

In line 108, I think it would be clearer to eliminate “are conservative and must”

In line 137, the meaning of the quotes around average is unclear. I find the language is imprecise when I see quotes like that.

In line 155, replace “powerful” with “popular”.

In lines 139-140, the implication is profound. Congratulations on the creation of a helpful method.

Readers will be eager to have computer code, say in R, which would inspire more rapid adoption by researchers.

I hope the authors find this review helpful, as I intend it to be. The authors have achieved a major accomplishment.

LITERATURE

Pontius Jr, Robert Gilmore. 2022. Metrics That Make a Difference: How to Analyze Change and Error. Advances in Geographic Information Science. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-70765-1.

Reviewer #2: In the present paper, the authors propose the decomposition of the Mean Absolute Error into three components which represent the model bias, proportionality, and unsystematic components.

The error components are clearly explained through a synthetic example and a data sample.

Therefore, the present study improves the standard measures for evaluating model errors significantly. The analysis of the three components makes it possible to understand their different contribution.

Overall, the paper is well written and well organized, hence the decision is to accept after minor revisions.

Points of attention:

• The data link is not working: https://www.ncdc.noaa.gov/dataaccess/ paleoclimatology‐data

• Page 2 line 16. remove parenthesis

• Page 2 line 41. remove parenthesis

• Page 4 equation (14-16). Should be divided by n.

• Page 5 equation (17) holds due to the definitions in (14)-(16).

********** 

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Robert Gilmore Pontius Jr

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: PONE-D-22-17609_reviewer.pdf

PLoS One. 2023 Feb 17;18(2):e0279774. doi: 10.1371/journal.pone.0279774.r002

Author response to Decision Letter 0


9 Dec 2022

Decomposition of the mean absolute error (MAE) into

systematic and unsystematic components

Response to Reviews

Reviewer #1

My areas of expertise focus on metrics to measure error and change, which is the exact topic of the submitted manuscript. I have thought about the concepts of the submitted manuscript for decades. Therefore, I report with confidence that the submitted manuscript is a brilliant major breakthrough. I inserted several comments in the PDF as I read. I ask the authors to consider those comments to increase clarity of the manuscript. The submitted manuscript is short, which is a strength. The manuscript states only what is necessary, which any scientific manuscript should do. The manuscript proposes a method to fix the flawed popular paradigm of squared deviations. The authors’ new method makes much better sense than the popular paradigm. Previous methods that have used Mean Absolute Error (Pontius Jr 2022) do not separate the Mean Absolute Error into as many helpful components as the proposed manuscript does. The manuscript illustrates how the new method has helpful practical implications. Below are ideas to make the manuscript even stronger than it already is.

Thank you so much for your comments and the very useful feedback throughout your review.

My browser could not activate the link where the authors have posted the data at https://www.ncdc.noaa.gov/dataaccess/paleoclimatology%E2%80%90data

Our apologies. We have corrected the link, which now goes directly to the NOAA site for the particular data used. It also is given below:

https://www.ncei.noaa.gov/access/paleo-search/study/28810

The example is helpful. It would be more helpful to have a column at the right in Table 1 to show the sum. It would be nicer to have the numbers be simpler such as all whole numbers for Pi and Oi, and to have n be a number that makes easier division than by 7.

We modified our example to use a set of 6 numbers. We also added the summation column at the far right. The figure has been updated accordingly.

In figure 1d, it is not immediately clear to me why MAEb = 0.424 rather than |-0.714|, which is the absolute bias in figure 1c. The reader would understand better if the revised manuscript were to have a sentence to explain why |Bias| does not equal MAEb.

We have added the following explanation at the end of the section that discusses Fig. 1, as well as the additional recommendation to continue examining MBE:

The constraints within the weighted decomposition of MAE diminish MAEb relative to the magnitude of MBE. MBE, therefore, remains a useful metric to be reported when analyzing model error.

Figures 1 and 2 should be consistent in the number of digits in the results. Report the % to the nearest whole number. The other numbers should have exactly two decimal places.

We made these corrections.

I thank the authors for using sequential line numbers.

In line 108, I think it would be clearer to eliminate “are conservative and must”

In line 137, the meaning of the quotes around average is unclear. I find the language is imprecise when I see quotes like that.

In line 155, replace “powerful” with “popular”.

We made minor edits to address all of these comments.

In lines 139-140, the implication is profound. Congratulations on the creation of a helpful method.

Thank you!

Readers will be eager to have computer code, say in R, which would inspire more rapid adoption by researchers.

We now provide R and Matlab functions for these calculations in the Supporting Information.
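For readers who want to experiment immediately, a minimal sketch of the basic error measures is given below. This is an illustrative Python translation, not the published R/Matlab functions: the function name `basic_errors` and the sample values are our own, MSE and RMSE follow Eqs. (1)-(2) of the manuscript, and MAE and the mean bias error (MBE) use their standard definitions. The full three-part decomposition of MAE (MAEb, MAEp, MAEu) is defined in the paper and its Supporting Information and is not reproduced here.

```python
import math

def basic_errors(P, O):
    """Compute MAE, MSE, RMSE, and MBE for predictions P vs. observations O.

    MSE and RMSE follow Eqs. (1)-(2) of the manuscript; MAE and the
    (signed) mean bias error use their standard definitions.
    """
    n = len(P)
    d = [p - o for p, o in zip(P, O)]   # signed errors, Pi - Oi
    mae = sum(abs(e) for e in d) / n    # mean absolute error
    mse = sum(e * e for e in d) / n     # mean squared error, Eq. (1)
    rmse = math.sqrt(mse)               # root mean squared error, Eq. (2)
    mbe = sum(d) / n                    # mean bias error (signed)
    return mae, mse, rmse, mbe

# Illustrative values only (not the paper's streamflow example)
P = [2.0, 3.0, 5.0, 4.0, 6.0, 7.0]
O = [1.0, 4.0, 4.0, 5.0, 6.0, 9.0]
mae, mse, rmse, mbe = basic_errors(P, O)
```

Note that MBE retains the sign of the average error, which is why the manuscript recommends continuing to report it alongside the MAE decomposition.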

I hope the authors find this review helpful, as I intend it to be. The authors have achieved a major accomplishment.

We found this review extremely helpful and appreciate your positive assessment.

LITERATURE

Pontius Jr, Robert Gilmore. 2022. Metrics That Make a Difference: How to Analyze Change and Error. Advances in Geographic Information Science. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-70765-1.

This reference was added and we also made the suggested editorial changes that were in the annotated PDF. 

Reviewer #2

In the present paper, the authors propose the decomposition of the Mean Absolute Error into three components which represent the model bias, proportionality, and unsystematic components.

The error components are clearly explained through a synthetic example and a data sample.

Therefore, the present study significantly improves the standard measures for evaluating model errors. The analysis of the three components makes it possible to understand their different contributions.

Overall, the paper is well written and well organized, hence the decision is to accept after minor revisions.

Thank you very much for your comments and for the overall positive assessment of our work.

Points of attention:

• The data link is not working: https://www.ncdc.noaa.gov/dataaccess/ paleoclimatology‐data

Our apologies. We have added the corrected link, which now goes directly to the NOAA site for the particular data used. It also is given below:

https://www.ncei.noaa.gov/access/paleo-search/study/28810

• Page 2 line 16. remove parenthesis

• Page 2 line 41. remove parenthesis

We made these two corrections.

• Page 4, equations (14)-(16): should be divided by n.

Thank you for catching this error. It has been corrected.

• Page 5 equation (17) holds due to the definitions in (14)-(16).

We changed the text here slightly to clarify this point.

Attachment

Submitted filename: Decomposition_MAE_Response_to_Review.pdf

Decision Letter 1

Fabiana Zama

14 Dec 2022

Decomposition of the mean absolute error (MAE) into systematic and unsystematic components

PONE-D-22-17609R1

Dear Dr. Robeson,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Fabiana Zama

Academic Editor

PLOS ONE


Acceptance letter

Fabiana Zama

21 Dec 2022

PONE-D-22-17609R1

Decomposition of the mean absolute error (MAE) into systematic and unsystematic components

Dear Dr. Robeson:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Professor Fabiana Zama

Academic Editor

PLOS ONE

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

S1 File

(PDF)

Attachment

Submitted filename: PONE-D-22-17609_reviewer.pdf

Attachment

Submitted filename: Decomposition_MAE_Response_to_Review.pdf

Data Availability Statement

The data for the original reconstruction as well as the bias‐corrected version used here are available from NOAA’s Paleoclimatology Data site: https://www.ncei.noaa.gov/access/paleo-search/study/28810.

