Abstract
Heating degree days (HDD) are a concise measure of heating energy requirements and are used to inform decision making about the impact of climate change on heating energy demand. This data paper presents spatial datasets of HDD for Canada for two thirty-year periods, 1951–1980 and 1981–2010, derived from daily temperature observations at meteorological stations over these periods. Stations with no more than ten missing days in a year and at least ten years of data over each thirty-year period were included, resulting in 1339 and 1679 stations for the 1951–1980 and 1981–2010 periods respectively. Mean absolute error (MAE) of the spatial models ranged from 124.2 Celsius degree days (C-days) for the 1951–1980 model (2.4% of the surface mean) to 137.6 C-days for the 1981–2010 model (2.7%). This note presents maps illustrating cross validation errors at a set of representative stations. The grids are available at ∼2 km resolution.
Keywords: Grids, Raster, Temperature, Spatial datasets, Heating degree days, HDD, Historical, Thin plate spline, Climate, ANUSPLIN, Canada
Specifications Table
Subject | Earth and Planetary Sciences |
Specific subject area | Thin plate spline datasets for heating degree days, Canada, spatial dataset, 1981–2010 & 1951–1980 |
Type of data | Geospatial grids |
How data were acquired | Environment and Climate Change Canada (ECCC) provided daily minimum and maximum temperature values at meteorological stations across Canada (1950–2010). |
Data format | Raw: delimited text/ASCII; Analysed: delimited text/ASCII; Final: GeoTIFF |
Description of data collection | Climate data in Canada are collected through a system of weather stations distributed unevenly across the country. Daily minimum and maximum temperature values from 1339 (1951–1980) and 1679 (1981–2010) weather stations were used to calculate HDD values for 1951–1980 and 1981–2010, which were then interpolated and mapped using ANUSPLIN via tri-variate thin-plate splines. |
Data source location | Canada |
Data accessibility | https://osf.io/xkpc7/ |
Value of the Data
- These datasets were developed in part to support updates to tax credits for northern and isolated areas in Canada for the Canadian Finance Department [1].
- Energy analyses rely on HDD to track changes in natural gas and other energy usage. Historical change in HDD is an important factor in energy consumption planning, particularly in northern areas.
- Users can use the dataset to obtain information about heating requirements for any location in Canada for two long-term periods, 1951–1980 and 1981–2010.
- This data description also provides a case study using published output from the ANUSPLIN thin-plate spline program [2].
1. Objective
The ‘degree day’ method is used to calculate the difference between mean daily temperature and any given threshold – typically these differences are summed over a period of interest to provide a measure of heat or cold accumulation through time [3]. Heating degree days (HDD) sum the degree to which average daily temperatures are below the temperature of human comfort, defined as 65°F [4], or in Canada as 18 °C [3], [4], [5], [6]. HDD have been analysed to estimate changes in energy usage [7], impacts of climate change [8], [9], and historical trends [9] with respect to how often and how hard a furnace must work to keep a house warm.
The purpose of this brief report is to introduce HDD datasets for Canada for the 1951–1980 and 1981–2010 periods. These datasets were developed in part to support updates to tax benefits for northern and isolated areas in Canada, which experience higher than average heating costs [1]. We describe these datasets and report on their quality and accuracy.
2. Data Description
2.1. Heating degree day (HDD) datasets
Canada-wide Heating Degree Day (HDD) gridded datasets were generated for two thirty-year periods, 1951–1980 and 1981–2010 (Fig. 1), using tri-variate thin-plate splines in ANUSPLIN [2] version 4.5 and a 60 arc-second (approximately 2 km) Digital Elevation Model [10].
The datasets documented include:
1. Heating Degree Day Data Files containing Heating Degree Day values calculated for in situ temperature monitoring stations (see [11] for the rationale behind a similar methodology). HDD, defined as the annual sum of the positive differences between the base temperature of 18 °C and the daily mean temperature, was calculated from the average of the daily maximum and minimum temperatures according to the following formula:

$$\mathrm{HDD} = \sum_{i} \left( \theta_b - \frac{\theta_{MAX,i} + \theta_{MIN,i}}{2} \right) \qquad (1)$$

where i is the day of the year, θMAX is the daily maximum temperature, θMIN is the daily minimum temperature, θb is the base temperature (18 °C), and only days with θb > (θMAX + θMIN)/2 contribute to the sum.
Raw data file containing minimum and maximum temperatures by station:
Average HDD calculated for the following 30-year periods:
1951–1980 (1339 stations): https://osf.io/he3w8
1981–2010 (1679 stations): https://osf.io/x62vu
The format used to read in these .dat files is provided at:
COMBINED 1951–1980 and 1981–2010 Heating Degree Day Values (.xlsx format):
Average HDD values for 1951–1980 and 1981–2010 for stations with at least 10 years of data, together with the number of years of observation data available in each period.
2. Output from ANUSPLIN (Lis files) – the 1951–1980 and 1981–2010 “Lis” files contain station coordinates (latitude, transformed longitude and transformed elevation), the HDD value for the station, the fitted value (“Fitted_estimate”), and the individual cross-validated values (“CV_estimate”); see [2] for a description of ANUSPLIN output.
Lis files:
1951–1980: https://osf.io/29vk4
1981–2010: https://osf.io/5wyj6
A genericized script to read in the “Lis files” is provided at:
3. Geotiff files – Canada-wide HDD surfaces
1951–1980: https://osf.io/2zu5p
1981–2010: https://osf.io/sb5p3
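As an illustration of the station-level HDD calculation described in item 1 above, the following is a minimal Python sketch rather than the authors' processing code. The function name `annual_hdd` and the sample temperatures are hypothetical; only the 18 °C base temperature and the rule of summing positive differences between the base and the daily mean come from the text.

```python
from typing import Iterable, Tuple

BASE_TEMP_C = 18.0  # base temperature stated in the text


def annual_hdd(daily_min_max: Iterable[Tuple[float, float]],
               base: float = BASE_TEMP_C) -> float:
    """Sum the positive differences between the base temperature and the
    daily mean, where the daily mean is (max + min) / 2."""
    total = 0.0
    for t_min, t_max in daily_min_max:
        t_mean = (t_max + t_min) / 2.0
        if base > t_mean:  # only days colder than the base contribute
            total += base - t_mean
    return total


# Hypothetical three-day record of (min, max) temperatures in degrees Celsius.
sample_days = [(-10.0, 0.0), (5.0, 15.0), (16.0, 26.0)]
sample_hdd = annual_hdd(sample_days)
print(sample_hdd)  # daily means of -5, 10 and 21 give 23 + 8 + 0 = 31.0
```

In practice the input would be a full year of daily records for one station, and the annual totals would then be averaged over the years available in each thirty-year period.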
2.2. Predictive error of ANUSPLIN datasets
ANUSPLIN produces individual station cross-validation (CV) estimates (“CV_Estimate”), which were compared to HDD calculated from station observations. The CV estimates are individually cross-validated values [2]. Mean error (ME) was calculated using the CV estimate minus calculated HDD. ME and Mean Absolute Error (MAE) are presented in C-days as well as a percentage of the surface mean.
ANUSPLIN CV estimates were biased on average by less than 1 C-day for both periods (Table 1). Mean absolute error (MAE) of the ANUSPLIN models ranged from 124.2 C-days for the 1951–1980 model to 137.6 C-days for the 1981–2010 model. The MAE represented 2.7% of the surface mean for the 1981–2010 period compared with 2.4% for the 1951–1980 period.
Table 1.
Time Period | N | ME in C-days (% of Surface Mean) | MAE in C-days (% of Surface Mean) |
---|---|---|---|
1951–1980 | 1339 | 0.00 (0.0%) | 124.2 (2.4%) |
1981–2010 | 1679 | −0.47 (0.0%) | 137.6 (2.7%) |
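The error statistics in Table 1 can be sketched as follows. This is an illustrative Python example under stated assumptions: the function name `cv_error_summary`, the three sample stations, and the surface mean of 5000 C-days are hypothetical, and only the definitions (ME as CV estimate minus calculated HDD, MAE as the mean of the absolute errors, both also expressed as a percentage of the surface mean) follow the text.

```python
from typing import Dict, Sequence


def cv_error_summary(observed: Sequence[float],
                     cv_estimates: Sequence[float],
                     surface_mean: float) -> Dict[str, float]:
    """Return ME and MAE in C-days and as a percentage of the surface mean."""
    # Error is defined as the CV estimate minus the station-calculated HDD.
    errors = [cv - obs for obs, cv in zip(observed, cv_estimates)]
    n = len(errors)
    me = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    return {
        "ME": me,
        "MAE": mae,
        "ME_pct": 100.0 * me / surface_mean,
        "MAE_pct": 100.0 * mae / surface_mean,
    }


# Hypothetical values for three stations (C-days).
observed_hdd = [4000.0, 5000.0, 6000.0]
cv_hdd = [4100.0, 4900.0, 6150.0]
summary = cv_error_summary(observed_hdd, cv_hdd, surface_mean=5000.0)
print(summary)
```

Because positive and negative errors cancel in ME but not in MAE, a near-zero ME alongside a larger MAE, as in Table 1, indicates little systematic bias rather than small errors at individual stations.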
Plots of observed versus predicted values exhibited strong linear relationships with few outliers for both time periods (Fig. 2).
Predictive errors were plotted for 60 stations selected in previous Canadian studies to better reflect the range in latitude, longitude, and elevation across the country [12,14] as compared with the full set of stations, which are concentrated in southern Canada. Of these 60 stations, 56 met the criteria for inclusion in this analysis. Predictive errors at these 56 stations (Fig. 3) were generally highest in mountainous and coastal regions. Higher errors in areas of complex terrain and in coastal areas reflect known challenges with generating spatial models in these highly variable environments from sparse in situ networks [12], [13], [14]. As a percentage, errors were greater in the 1981–2010 period than in the 1951–1980 period.
3. Experimental Design, Materials and Methods
3.1. Data acquisition
Environment and Climate Change Canada (ECCC) provided daily minimum and maximum temperature values at meteorological stations across Canada from 1950 to 2010 [15].
3.2. Data pre-processing
Plots were generated to examine the number of stations available for analysis based on cut-offs associated with the number of missing days in a year and the number of missing years in a normal period (Fig. 4). We selected stations with ≤ 10 missing days in a year and ≥ 10 years in a normal period for the spatial modelling. With these cut-offs, 1339 and 1679 stations were available for analysis in 1951–1980 and 1981–2010 respectively.
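The selection rule above can be sketched in a few lines of Python. This is a hedged illustration, not the authors' pre-processing code: the function name `select_stations`, the input layout (a mapping from station identifier to per-year counts of missing days within one normal period), and the sample stations are all assumptions. Only the two cut-offs (≤ 10 missing days in a year, ≥ 10 such years in a normal period) come from the text.

```python
from typing import Dict, List

MAX_MISSING_DAYS = 10  # a year is usable if it has at most this many missing days
MIN_VALID_YEARS = 10   # a station is kept if it has at least this many usable years


def select_stations(missing_days: Dict[str, Dict[int, int]]) -> List[str]:
    """Return station ids that meet both cut-offs.

    missing_days maps station id -> {year: number of missing days}; years
    with no record at all are assumed to be absent from the inner mapping.
    """
    selected = []
    for station, per_year in missing_days.items():
        valid_years = sum(1 for n in per_year.values() if n <= MAX_MISSING_DAYS)
        if valid_years >= MIN_VALID_YEARS:
            selected.append(station)
    return selected


# Hypothetical stations within the 1951-1980 period.
example = {
    "A": {year: 0 for year in range(1951, 1963)},                        # 12 complete years
    "B": {year: (11 if year < 1956 else 0) for year in range(1951, 1963)},  # only 7 usable years
    "C": {year: 10 for year in range(1971, 1981)},                       # 10 years, each at the limit
}
kept = select_stations(example)
print(kept)  # ['A', 'C']
```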
3.3. Specifics of implementation
Spatial models were developed in ANUSPLIN [2] and resolved into map form using a 60 arc-second (approximately 2 km) DEM [10]. The ANUSPLIN grid was created using latitude, longitude (multiplied by 0.64279), and elevation (multiplied by 1000) as predictors. ANUSPLIN fits partial thin plate smoothing splines constructed from a set of “knots” to noisy multivariate data. A subset of the available observations (in this case, 50%; see Table 2) is selected as knots to limit the complexity of the fitted surface; however, all data points are used to calculate the fitted surface [2].
3.4. Experimental results
In addition to predictive error, the quality of the spatial datasets was evaluated using two diagnostic statistics output by ANUSPLIN:
(a) The ratio of the “signal” (S) to the number of knots (nKTS), written S:nKTS. The signal ranges between zero and the number of stations (or ‘knots’) selected by ANUSPLIN. Ratios between 0.2 and 0.8 are considered acceptable [2,12]. The HDD dataset ratios of 0.48 and 0.57 (Table 2) fall within this range.
(b) Root GCV (RtGCV). The GCV (Generalized Cross Validation) is calculated by removing each data point and summing the square of the difference of each omitted data point from a surface fitted to all remaining data points [16]. RtGCV, the square root of the GCV, essentially provides a spatially averaged estimate of standard error [14]. The RtGCV was 2.7% for 1951–1980 and 3.3% for 1981–2010 as a percentage of the surface mean (Table 2).
Table 2.
Time Period | Surface Mean (in C-days) | S:nKTS | RtGCV C-days (% of surface mean) |
---|---|---|---|
1951–1980 | 5400 | 0.57 (382:669) | 147 (2.7%) |
1981–2010 | 5115 | 0.48 (406:839) | 167 (3.3%) |
3.5. Limitations
Most stations were missing observations for at least some portion of the period considered for this study. With this data report, we published the number of years of data upon which the calculations are based to allow users to make decisions about the use of this dataset. Future work will consider the use of fully in-filled time series for a thirty-year period using estimates for missing HDD values. Canadian in situ stations were concentrated in southern latitudes. Notably much of northern Canada is monitored through a relatively sparse network. To address this feature of the datasets, ANUSPLIN predictions were evaluated for a set of 60 stations selected to better reflect the range in latitude, longitude, and elevation across the country [12].
Ethics Statement
This work did not involve human subjects or experiments using animals.
CRediT Author Statement
Heather MacDonald: Conceptualization, Methodology, Formal Analysis, Validation, Writing - original draft preparation, Writing - review & editing. John Pedlar: Conceptualization, Methodology, Visualization, Formal Analysis, Validation, Writing - original draft preparation, Writing - review & editing. Daniel McKenney: Conceptualization, Methodology, Writing - original draft preparation, Writing - review & editing. Kevin Lawrence: Data curation, Investigation, Validation. Kaitlin de Boer: Data curation, Investigation, Visualization, Writing - review & editing. Michael Hutchinson: Software, Methodology, results validation and review, manuscript editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
Funding to support development of these datasets was provided to the Integrative Ecology and Economics Group at the Great Lakes Forestry Centre (GLFC), Canadian Forest Service, Natural Resources Canada by the Canada1Water (C1W) project, a collaboration between Natural Resources Canada, Agriculture and Agri-Food Canada, and Aquanty Inc. Funding is from the Canadian Safety and Security Program (CSSP), Defence Research and Development Canada (DRDC), and the Geological Survey of Canada Groundwater Geoscience and GEM-GeoNorth programs. Work completed in 2021 and 2022 was partly supported by Environment and Climate Change Canada funding for the “Disseminating high resolution historical spatial climate models for Canada” project. Funding to support this work was also provided by the Canadian Forest Service Forest Climate Change Program. Thanks also go to Pia Papadopol for reviews of earlier drafts of this manuscript.
Data Availability
Heating Degree Days Canada 1951-1980, 1981-2010 (Original data) (Open Science Framework).
References
- 1. Government of Canada. Report on the Task Force of Tax Benefits for Northern and Isolated Areas. (1989).
- 2. Hutchinson M.F., Xu T. Australian National University, Fenner School of Environment and Society; Canberra, Australia: 2013. ANUSPLIN Version 4.4 User Guide. Accessed from http://fennerschool.anu.edu.au/files/anusplin44.pdf
- 3. Climate Atlas of Canada, version 2 (2019). Accessed from https://climateatlas.ca/map/canada/hdd_2030_85# on July 10, 2023.
- 4. Prairie Adaptation Research Collaborative. Sask Adapt. (2023). Accessed from https://www.parc.ca/saskadapt/learn-more/degree-days.html on July 10, 2023.
- 5. climatedata.ca. (2023). Accessed from https://climatedata.ca/var-type/other/ on July 10, 2023.
- 6. Government of British Columbia. 2014. Determining ASHRAE 90.1-2010 Climate Zones. Accessed from https://www2.gov.bc.ca/assets/gov/farming-natural-resources-and-industry/construction-industry/building-codes-and-standards/bulletins/b14-01_determining_ashrae_901-2010_climate_zones.pdf on July 10, 2023.
- 7. Thom H.C.S. The rational relationship between heating degree days and temperature. Mon. Weather Rev. 1954;82:1–6.
- 8. Castañeda M.E., Claus F. Variability and trends of heating degree-days in Argentina. Int. J. Climatol. 2013;33(10):2352–2361. doi: 10.1002/joc.3583.
- 9. Vincent L.A., Zhang X., Mekis É., Wan H., Bush E.J. Changes in Canada's climate: trends in indices based on daily temperature and precipitation data. Atmosphere-Ocean. 2018;56(5):332–349.
- 10. Lawrence K.M., Hutchinson M.F., McKenney D.W. Multi-scale digital elevation models for Canada. Natural Resources Canada, Great Lakes Forestry Centre, Sault Ste Marie, Ontario. Frontline Tech. Note. 2008;109:4. https://cfs.nrcan.gc.ca/publications?id=31499
- 11. D'Amico A., Ciulla G., Panno D., Ferrari S. Building energy demand assessment through heating degree days. Appl. Energy. 2019;242:1285–1306. doi: 10.1016/j.apenergy.2019.03.167.
- 12. MacDonald H., McKenney D.W., Wang X.L., Pedlar J., Papadopol P., Lawrence K., Feng Y., Hutchinson M.F. Spatial models of adjusted precipitation for Canada at varying time scales. J. Appl. Meteorol. Climatol. 2021;60(3):291–304. doi: 10.1175/JAMC-D-20-0041.1.
- 13. Hutchinson M.F., McKenney D.W., Lawrence K., Pedlar J.H., Hopkinson R.F., Milewska E., Papadopol P. Development and testing of Canada-wide interpolated spatial models of daily minimum–maximum temperature and precipitation for 1961–2003. J. Appl. Meteorol. Climatol. 2009;48:725–741. doi: 10.1175/2008JAMC1979.1.
- 14. MacDonald H., McKenney D.W., Papadopol P., Lawrence K., Pedlar J., Hutchinson M.F. North American historical monthly spatial climate dataset, 1901–2016. Sci. Data. 2020;7:411. doi: 10.1038/s41597-020-00737-2.
- 15. Environment and Climate Change Canada (ECCC). 2023. Climate Data Extraction Tool. https://climate-change.canada.ca/climate-data/#/daily-climate-data
- 16. Wahba G. Spline Models for Observational Data. Vol. 59. SIAM Publications; 1990.