Nat Energy. 2020 Oct 8;5(12):1051–1052. doi: 10.1038/s41560-020-00711-7

Author Correction: Machine learning model to project the impact of COVID-19 on US motor gasoline demand

Shiqi Ou 1, Xin He 2, Weiqi Ji 3, Wei Chen 4, Lang Sui 2, Yu Gan 5, Zifeng Lu 5, Zhenhong Lin 1, Sili Deng 3, Steven Przesmitzki 2, Jessey Bouchard 2
PMCID: PMC7543031  PMID: 33052987

Correction to: Nature Energy 10.1038/s41560-020-0662-1, published online 17 July 2020.

In the version of this Article originally published, a comprehensive analysis of the model performance was not provided; thus, to avoid potential confusion over the model validation procedure and to provide a better representation of the model performance, the rolling-window cross-validation and out-of-sample testing results have now been included in the corrected Article and its Supplementary Information.

In the Methods section ‘The Mobility Dynamic Index Forecast Module’, the sentence describing the cross-validation method “In addition, cross validation is adopted to search the optimal network structure and avoid overfitting, in which the datasets are divided into training and test datasets by a ratio of 2:1.” has been changed to “In addition, the rolling-window cross-validation is adopted to search the optimal network structure, which is detailed in Supplementary Note 5. Out-of-sample testing is also performed for the selected neural network structure to estimate the performance of the model in predicting future mobility.”

In the Supplementary Information, the original Supplementary Fig. 10 that used R2 to describe the random-split cross-validation results has been replaced by the corrected version that uses the root mean square error (RMSE) to describe the out-of-sample testing results, and the caption has accordingly been updated to read “Root Mean Square Error (RMSE) of the neural network model with 2 hidden layers and 25 nodes. The data before May 15 were used as the training dataset, and the data between May 25 and May 31 were used as the out-of-sample testing dataset. (a) Google mobility: workplaces; (b) Google mobility: retail and recreation; (c) Google mobility: grocery and pharmacy; (d) Google mobility: parks; (e) Apple Mobility.”
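For reference, the RMSE reported in the corrected figure and table is assumed here to follow the standard definition, with $y_i$ the observed mobility value, $\hat{y}_i$ the model prediction, and $n$ the number of evaluation days (notation ours, not the authors'):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$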

Additionally, the original Supplementary Table 4 that used R2 to select the neural network structure has been replaced by the corrected version that uses the RMSE instead to select the neural network structure, and the caption has been updated accordingly to read “Rolling-window cross-validation of the neural network model for different combinations of hidden layers and nodes. The data in the table show the Root Mean Square Error (RMSE) of the training dataset and cross-validation dataset. Yellow highlighted text indicates the layers and the nodes are adopted in the neural network in the PODA model.”

Furthermore, discussion of the rolling-window cross-validation and out-of-sample testing results has been added in Supplementary Note 5: the first paragraph, starting “Supplementary Figure 10 compares the historical mobility data with the results predicted by the trained model…” has been rewritten to read:

“Multiple regularization techniques were adopted to avoid overfitting. We used weight decay (equivalent to L2 regularization) to penalize large neural network weights and keep the model parameters small. We also used mini-batch training with the Adam optimizer; mini-batch training offers a regularizing effect because it adds noise to the learning process. In addition, early stopping was employed to avoid overfitting.

Rolling-window cross-validation was performed to study the effect of the number of layers and nodes on the performance of the neural network model. Supplementary Table 4 lists the Root Mean Square Error (RMSE) of the training and cross-validation datasets. For each combination of layers and nodes, two evaluations were performed, with training datasets spanning the periods before April 15 and before April 29, respectively. For each run, the model was trained on a randomly selected 2/3 of the training dataset. The “Validation dataset” listed in Supplementary Table 4 was used for cross-validation. Generally speaking, the 1-hidden-layer and 2-hidden-layer neural network models achieve better performance than the 3-hidden-layer and 4-hidden-layer models, and they are relatively insensitive to the number of nodes. Overall, the 1-layer-30-node, 1-layer-35-node, 2-layer-25-node, and 2-layer-30-node models are the top performers. The 2-layer-25-node neural network is adopted in the PODA model for this work.

Supplementary Figure 10 shows the out-of-sample testing of the neural network model with 2 hidden layers and 25 nodes. Data before May 15 were used for model training, and data between May 25 and May 31 were used for model testing. The trained model predicts future mobility well for “workplaces”, “retail and recreation”, and “grocery and pharmacy”. The relatively poor performance in predicting “Google parks” and “Apple mobility” is due to their high day-to-day variation. There is no obvious overfitting, as performance on the testing dataset is comparable to that on the training dataset. Finally, the neural network model was retrained on a randomly sampled 2/3 of all available data before June 11 to capture the latest patterns.”
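To make the regularization setup in the quoted note concrete, the following is a minimal sketch (ours, not the authors' code) of weight decay, mini-batch Adam training, and early stopping in PyTorch. The 2-hidden-layer, 25-node shape follows the note; the input size, synthetic data, and hyperparameters are illustrative assumptions.

```python
# Sketch of the regularization techniques described in Supplementary Note 5:
# weight decay (L2), mini-batch Adam, and early stopping. Data are synthetic.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
X = torch.randn(500, 10)   # placeholder daily predictors (hypothetical)
y = torch.randn(500, 1)    # placeholder mobility index target

net = nn.Sequential(
    nn.Linear(10, 25), nn.ReLU(),
    nn.Linear(25, 25), nn.ReLU(),  # 2 hidden layers, 25 nodes each
    nn.Linear(25, 1),
)

# weight_decay adds an L2 penalty on the weights inside the Adam update
opt = torch.optim.Adam(net.parameters(), lr=1e-3, weight_decay=1e-4)
loss_fn = nn.MSELoss()

loader = DataLoader(TensorDataset(X[:400], y[:400]),
                    batch_size=32, shuffle=True)  # mini-batch noise regularizes
val_X, val_y = X[400:], y[400:]

best_val, patience, bad_epochs = float("inf"), 20, 0
for epoch in range(1000):
    for xb, yb in loader:
        opt.zero_grad()
        loss_fn(net(xb), yb).backward()
        opt.step()
    with torch.no_grad():
        val_loss = loss_fn(net(val_X), val_y).item()
    if val_loss < best_val - 1e-6:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:  # early stopping on stalled validation loss
            break
```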
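Likewise, the rolling-window cross-validation and out-of-sample test described in the note can be sketched as follows. The date cutoffs mirror those in the note (training before April 15 / April 29 for cross-validation; training before May 15 and testing on May 25–31 out of sample), but the data, features, and the scikit-learn model are placeholders, and the 2/3 random subsampling of each training window is omitted for brevity.

```python
# Sketch (assumptions, not the authors' code) of rolling-window CV over
# candidate layer/node combinations, followed by an out-of-sample RMSE test.
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
dates = pd.date_range("2020-02-15", "2020-05-31")
X = rng.normal(size=(len(dates), 10))  # placeholder daily predictors
y = X @ rng.normal(size=10) + rng.normal(scale=0.1, size=len(dates))

def rmse(pred, obs):
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def fit(hidden, X_tr, y_tr):
    return MLPRegressor(hidden_layer_sizes=hidden, max_iter=2000,
                        random_state=0).fit(X_tr, y_tr)

# Rolling-window CV: train up to each cutoff, validate on the window after it.
cutoffs = [pd.Timestamp("2020-04-15"), pd.Timestamp("2020-04-29")]
for hidden in [(30,), (35,), (25, 25), (30, 30)]:  # candidate architectures
    scores = []
    for cut in cutoffs:
        tr = dates <= cut
        va = (dates > cut) & (dates <= cut + pd.Timedelta(days=14))
        model = fit(hidden, X[tr], y[tr])
        scores.append(rmse(model.predict(X[va]), y[va]))
    print(hidden, "CV RMSE:", np.mean(scores))

# Out-of-sample test for the selected 2-layer, 25-node network.
tr = dates < pd.Timestamp("2020-05-15")
te = (dates >= pd.Timestamp("2020-05-25")) & (dates <= pd.Timestamp("2020-05-31"))
model = fit((25, 25), X[tr], y[tr])
print("out-of-sample RMSE:", rmse(model.predict(X[te]), y[te]))
```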

The original and corrected Supplementary Fig. 10 and Table 4 are shown in the Supplementary Information for this correction notice.

These corrections have been peer reviewed.

Supplementary information

Supplementary Information (707.2KB, pdf)

Original and corrected Supplementary Fig. 10 and Table 4.

Supplementary information is available for this paper at 10.1038/s41560-020-00711-7.


