Abstract
An ocean surface currents forecasting system, based on a Self-Organizing Maps (SOM) neural network algorithm, high-frequency (HF) ocean radar measurements and numerical weather prediction (NWP) products, has been developed for a coastal area of the northern Adriatic and compared with operational ROMS-derived surface currents. The two systems differ significantly in architecture and algorithms, being based on either unsupervised learning techniques or ocean physics. To compare performance of the two methods, their forecasting skills were tested on independent datasets. The SOM-based forecasting system has a slightly better forecasting skill, especially during strong wind conditions, with potential for further improvement when data sets of higher quality and longer duration are used for training.
Introduction
Ocean and environmental forecasting at regional or basin-wide scales1 can be done in different ways, depending on the processes to be investigated, and the time and spatial scales of interest2. The most common approach at regional scales and for short-term forecasts (up to a week), is to use regional ocean models driven by mesoscale atmospheric model outputs. Forcing is usually carried out through one-way coupling, using multiple nesting techniques and with lateral boundary conditions coming from larger or global models3. Other approaches, developed in recent decades, include multi-ensemble methodologies, different data assimilation techniques4, hybrid modeling5 and neural network modeling6. The latter is often based on the Self-Organizing Maps (SOM) algorithm7.
Since the 2000s the Self-Organizing Maps method has been introduced more extensively in atmospheric and ocean sciences8, above all for mapping of: climate and atmospheric processes9, remote sensing data10, and ocean modeling data11. The method performs best on comprehensive and long time series, and it is particularly suitable for mapping surface current fields from moored current-meters and high-frequency (HF) radars in coastal areas12,13,14. However, a SOM-based operational forecasting method has not been implemented yet in any operational oceanography network, in spite of the fact that a similar approach, in combination with other techniques, is already used for instance in energy systems15, and hydrology16. This paper documents the architecture of a SOM-based forecasting system, specifically designed for surface currents, and its evaluation vs. more conventional ocean forecasting tools (specifically, the ROMS ocean model). The system is based on numerical weather prediction (NWP) from the atmospheric model ALADIN/HR, on HF radar data and on an unsupervised SOM training. Skill assessment for such system is done and relevant metrics are estimated. System performance is compared to the existing operational ocean forecasting system based on the ROMS ocean model, which is coupled with the same ALADIN/HR NWP model.
The SOM-based forecasting system and the operational ROMS setup
The SOM-based operational system was developed with the purpose of providing surface current forecasts (up to 72 h) in the northern Adriatic Sea (Fig. 1) as part of the NEURAL project (http://jadran.izor.hr/neural). The system is based on the assumption that local wind is the major driver of near-surface circulation in the region. The temporal extent of the ocean current forecast window is dictated by the available NWP weather forecast, which extends to 72 hours. The system is based on training and forecast phases (Fig. 2). The main goal of the training phase is to find a robust link between ocean surface currents and surface wind fields. In order to do so, the following steps are undertaken: (1) surface currents in the area are measured using HF radars during a prolonged time interval; (2) sea-surface wind fields are obtained from a high-resolution mesoscale atmospheric model; (3) the characteristic surface current patterns are extracted from HF radar data using the SOM method; (4) joint surface current and wind patterns are extracted from wind and surface currents using the SOM method; (5) then, optimal SOM parameters are estimated and the method applicability is tested on available data. The forecasting phase is based on: (1) forecasting surface winds with a high resolution mesoscale atmospheric model; (2) recognizing characteristic SOM patterns (as determined during the training phase) in the surface wind forecast by searching for the minimum Euclidian distance between characteristic patterns and the NWP-forecasted surface winds; and (3) assigning the corresponding surface current pattern as the surface current forecast.
For this study HF radar data were acquired from four SeaSonde radar systems, located at Bibione (BIB), Aurisina (AUR), Savudrija (SAV) and Zub (ZUB) (Fig. 1), that were operational from 2007 to 2010. Radars worked at 25 MHz, were set up with 512 fft points, 2 Hz sampling rate and a bandwidth of 100 kHz, corresponding to a radial resolution of 1.5 km in range, and 2.3 cm/s in radial velocity. Angular resolution was 5°, and the maximum operating range was 50 km. Hourly surface currents were estimated on a predefined 2 km × 2 km Cartesian grid (Fig. 1) by applying a least-squares algorithm17. Radial currents were quality-checked to remove Doppler lines that were either poorly constrained by signal-to-noise (SNR) ratio or had insufficient spectral quality factor18. Total vectors were also checked for spikes19 and spatial gaps were filled using a Gaussian-weighted spatial interpolation. The period with the best temporal coverage (operational coverage from 60% at the distant points up to 100% in the center of the radar domain) and having the highest quality of data was the period between February and November 2008, so this dataset was used for the SOM training phase.
High-resolution surface wind fields (2 km × 2 km) obtained with the ALADIN (Aire Limitée Adaptation Dynamique Développement Inter National) mesoscale NWP model20, have been used both in training (jointly with surface currents data in the period February-November 2008) and forecast phase. The ALADIN/HR version of the NWP model which has been used for the Adriatic Sea has 37 levels in the vertical, 2 km resolution in horizontal, and a full set of physics parameterizations, including prognostic TKE (turbulent kinetic energy), microphysics and a prognostic convection scheme21. Hourly files were used for the training.
Both surface currents and surface winds were preprocessed with a 33-h low-pass filter, to remove tides and high-frequency atmospheric and ocean processes from the data. Then, current data in the training period (February-November 2008) were introduced to the 4 × 5 SOM of the sheet type and 20 characteristic surface current patterns were mapped. Next, a joint set of surface currents and surface winds was introduced, again to the 4 × 5 SOM of the sheet type and another 20 characteristic joint surface currents surface wind patterns have been obtained. Complex correlations coefficients with values from 0.880 to 0.999 between SOM patterns based on surface current data only and patterns based on the joint surface current and surface wind data were estimated22. High correlation between these two sets of SOM patterns is a prerequisite for creating SOM ocean surface currents based on wind forecasts.
Following the testing phase, three distinct periods, 1: July-September 2009, 2: January-February 2010, and 3: April-July 2010, were used for testing reliability of the SOM-based operational forecast (Fig. 1). BIB and AUR stations were largely out of operation during the three testing periods, therefore operational coverage was as low as 20% at some grid points (Fig. 1), much lower than operational coverage during the training phase (at least 60% at the same Cartesian grid). A forecast of surface currents was verified on measured data and standard metrics (RMSE) were computed. The 4 × 5 SOM matrix was chosen for the forecasting system, resulting in 20 patterns of forecast surface currents, as this choice provides the minimum root-mean-square error (RMSE) between the measured and forecast surface current patterns when compared to other tested sets of SOM matrices (from 2 × 2 to 16 × 12 SOM).
The parallel ocean forecasting system, based on the ROMS ocean model (Regional Ocean Modeling System23), is operational in the Adriatic Sea since 2008. The Adriatic ROMS model has a 2 × 2 km horizontal resolution, 20 sigma layers in the vertical; it is forced by the ALADIN/HR NWP fields at the surface. Lateral boundary conditions come from the larger AREG model24 and river climatology25. Further details of the modeling system can be found in this paper25. Modeled ROMS surface current data filtered with a 33-h low pass filter were recomputed over the HF radar grid for the three testing periods to allow comparison with the SOM-based surface current forecast.
Evaluation of the SOM-based forecast
Average RMSE from the SOM-based forecast is 9.4 cm/s for period 1, 7.7 cm/s for period 2 and 8.4 cm/s for period 3. The best metrics are estimated during the winter period (period 2), while they are much larger during summer and spring periods (periods 1 and 3, respectively). The respective ROMS-based metrics have a similar level of quality, with slightly higher RMSE values during summer (period 1: 10.0 cm/s) and spring (period 3: 9.8 cm/s) and lower during winter period (period 2: 7.5 cm/s). Averaged over all testing periods, the ROMS-based forecast is 8.7% worse than the SOM-based forecast.
We have further estimated both ROMS and SOM forecast system skill changes over different SOM solutions (Fig. 3). For any given SOM pattern we computed: (1) RMSE between the current field characteristic for that pattern and the surface current field obtained from measurements; (2) RMSE between ROMS currents and the surface current field obtained from measurements; both (1) and (2) were computed for those times for which a particular SOM pattern was forecasted. The pattern BMU16, associated to strong bora wind (Fig. 4), has the largest RMSE values. During wintertime (period 2), this pattern appears more frequently than during other investigated periods (9% in period 2, 1.5% in period 1 and 5% during period 3) and is slightly better forecast by the ROMS than by the SOM forecast system (Fig. 3). On the other hand, it is better forecast by the SOM during periods 1 and 3, but its representation in all patterns is smaller (1.5 and 5%, respectively). Another important extreme wind pattern, BMU20, is associated with the sirocco wind blowing from the south (Fig. 4). This pattern has a much lower frequency (up to 3%) during testing periods (Fig. 3). Its SOM-based forecast also provides larger errors than the ROMS forecast during periods 1 and 2, but smaller errors in period 3 when the pattern BMU20 is slightly more frequent. The overall shape of the RMSE distribution over the BMUs (Fig. 3) shows more resemblance between periods 1 and 3, which are characterized by vertical stratification in the sea, than in period 2, when the sea is usually homogenized26. Also, RMSE is much smaller during period 2 than during periods 1 and 3. Therefore, baroclinicity introduces additional error to both SOM- and ROMS-based forecasts.
SOM and ROMS forecast for BMU 16 and BMU 20 are analyzed in more detail. The BMU16 is a result of extreme bora conditions. The bora wind is strongly horizontally variable over a kilometer spatial scale in the northern Adriatic27, and it might not be properly represented by the 2-km resolution ALADIN/HR mesoscale model. Mesoscale atmospheric models tend to largely underestimate energies at high frequencies28, thus implying that higher resolution is needed in the model to fully describe the bora wind dynamics.
For that reason surface currents forecast by the SOM-based operational system – which are based on real measurements – are more intense and reliable during extreme events than surface currents forecast by ROMS. As seen in Fig. 4, the SOM-forecasted surface currents are much stronger (30–40%) than ROMS-forecasted values. Also, ROMS-forecasted fields corresponding to BMU16 wind conditions contain a northward current in the southern part of the domain, being a part of the cyclonic circulation known to occur in the area of the northern Adriatic during strong bora forcing29. By contrast, the SOM-based forecast has very weak currents in the southern part of the domain associated with the BMU16 pattern, followed by a lower RMSE than the one of the ROMS-based forecast (Fig. 4). Two alternative explanations for the discrepancy are feasible: (i) ALADIN/HR wind forcing during wintertime extreme bora wind creates an artificial strengthening of the cyclonic gyre in the area of the northern Adriatic, frequently reproduced by ocean numerical models which use atmospheric mesoscale models30, or (ii) HF radar measurements are of lower quality at the southernmost part of the domain, which is covered by two HF radars only.
The difference between the SOM- and ROMS-based forecasts is also evident during intervals represented by other patterns, like the BMU20 (Fig. 4), which is representative of the second dominant wind in the northern Adriatic, the sirocco31. Although currents are not as strong as during strong bora episodes, the SOM-based forecast still gives 20–40% stronger currents than the ROMS-based forecast. In addition, there is a 10–30% clockwise offset in ROMS-forecasted current direction.
The spatial distribution of RMSE during periods corresponding to BMU16 and the BMU20 patterns (Fig. 4) reveals that both SOM- and ROMS-based forecasting systems reproduce mean current field relatively well (small RMSE in area with temporal coverage higher than 60%). On the other hand, very high RMSE values in areas with relatively low temporal coverage reveal that variability of the measurements is much higher than variability of both SOM and ROMS forecasts and that this variability is not adequately reproduced by either of the two forecast models.
Discussion and perspectives
Both SOM- and ROMS-based forecasting systems perform short-term (72 h) forecasting of surface currents in the northern Adriatic with similar skill and quality. However, a limited dataset used for training of the SOM-based forecast, including only 10 months of HF currents in the northern Adriatic, presumably lower the quality of the SOM-based forecast. An increase of the training dataset to several years would likely improve the skill of the forecast, allowing for better and more reliable selection of an optimal number of SOM patterns. Such datasets are now available from HF radar measurements in numerous other regions of the World Ocean32, so this approach can be easily tested.
There are several advantages when using SOM-based neural networks for forecasting ocean parameters, in our case surface currents. The first is that the system is based on real data, with a much simpler model learning approach than in other techniques based on real data, such as data assimilation33. Another advantage is that, after HF radar time series of sufficient length are collected and a SOM-based forecasting system for an area has been created, HF radars can be redeployed in other coastal areas, potentially allowing for a spatial extension of the SOM-based operational forecasting. Finally, the required time for execution of the SOM-based forecast is much smaller, an order of magnitude, than for forecasts provided by ocean numerical models.
On the other hand, a prerequisite for the proposed forecasting system is a high correlation between predictor and predictand, in our case between wind and surface currents. However, such a high correlation is common for coastal areas34, widening application of a SOM-based forecasting system for surface currents at a number of coastal regions where HF radar measurements are available. Another problem is an under-representation of extreme situations in the forecasting system - and forecasting of extremes has much more practical use than forecasting of mean situations - if not trained by a satisfactorily number of BMU patterns. Such a problem may be overcome by a two-step SOM, i.e. by first training it on all measured current fields and then by re-training it only on those current fields which are representative for extreme conditions (e.g., BMU16 and BMU20 in our case). This would downscale the training process through use of a number of BMU solutions instead of one representative for extreme conditions. However, this should be accompanied with an increasing resolution of atmospheric numerical models - enabling them to recognize more variable wind fields. Also, a use of supervised neural network algorithms35 with prescribed patterns in predictands might be a solution for improving the skill of the SOM-based forecasting system.
The evaluated SOM-based forecasting system in the northern Adriatic is performed on low-pass (33 h) filtered ocean currents and winds, and therefore it does not account for tides and other processes over hourly timescales, such as inertial oscillations or mesoscale phenomena originating outside of the domain. These processes might be significant in marginal seas (tidal currents are about 10 cm/s in the area36). For that reason, we tested both SOM and ROMS-based forecasting systems without preprocessing (low-pass filtering) on the same training and testing dataset, resulting in RMSE increase for about 60% for both forecasting systems compared to the computations with preprocessing. Nevertheless, the RMSE of the SOM forecast is on average about 9.5% smaller than the RMSE of the ROMS forecast. However, the increase of the RMSE in the SOM-based forecasting system might be lowered by removing tides prior to the SOM analyses and re-adding them later to the forecast patterns. Also, a significant increase of the testing dataset might allow for use of much more SOM patterns than used here (20 patterns), which, in the end, might result in more reliable reproduction of all processes appearing in a domain.
The proposed SOM-based forecasting system may be particularly applicable for areas with heavy marine traffic, harbors, touristic areas, etc., where forecasting of surface currents may be essential for preventing and mitigating impacts of marine accidents (e.g., oil spill, rescue at sea, safety of navigation). A similar architecture of the forecasting system could also be appropriate for forecasting different parameters in marine systems (in biogeochemistry, trophic relations, fisheries37) which are too complex for “classical” forecasting tools based on numerical models.
Additional Information
How to cite this article: Vilibić, I. et al. Self-Organizing Maps-based ocean currents forecasting system. Sci. Rep. 6, 22924; doi: 10.1038/srep22924 (2016).
Acknowledgments
We are grateful to all staff involved in maintenance and data collection of high-frequency radar data in the northern Adriatic. Two reviewers are greatly appreciated for their comments to the manuscript. The work was performed within the project “Interpreting and forecasting Adriatic surface currents by an artificial brain (NEURAL)” funded by Unity Through Knowledge Fund (Contract UKF 16/13). The work has also been supported by Croatian Science Foundation under contracts IP-2014-09-5747 (project SCOOL) and IP-2013-11-5928 (project ADAM-ADRIA).
Footnotes
Author Contributions I.V. coordinated the work and wrote the initial version of the text, J.S. and H.M. created the figures, H.K. did the neural network computations, J.S. and H.K. compared SOM and ROMS-based forecasting systems, I.J. ran the ROMS operational model, S.C. reprocessed the HF radar data, M.T. ran the ALADIN/HR operational model, V.D. and D.I. managed the HF radar network, N.Ž. and B.J. contributed to creation of SOM-based forecasting system; all authors actively contributed to the manuscript writing, polishing and improvement.
References
- Bryan K. A numerical method for the study of the circulation of the world ocean. J. Comp. Phys. 4, 347–376 (1969). [Google Scholar]
- Haidvogel D. B. & Beckmann A. Numerical Ocean Circulation Modeling 344 p (Imperial College Press, 1999). [Google Scholar]
- Oddo P. et al. A nested Atlantic-Mediterranean Sea general circulation model for operational forecasting. Ocean Sci. 5, 461–473 (2009). [Google Scholar]
- Krishnamurti T. N. et al. Improved weather and seasonal climate forecasts from multimodel superensemble. Science 285, 1548–1550 (1999). [DOI] [PubMed] [Google Scholar]
- Chassignet E. P. et al. The HYCOM (HYbrid Coordinate Ocean Model) data assimilative system. J. Mar. Syst. 65, 60–83 (2009). [Google Scholar]
- Lee T. L. Neural network prediction of a storm surge. Ocean Eng. 33, 483–494 (2006). [Google Scholar]
- Kohonen T. Self-organized information of topologically correct features maps. Biol. Cybern. 43, 59–69 (1982). [Google Scholar]
- Liu Y. & Weisberg R. H. In Self-Organizing Maps: Applications and Novel Algorithm Design (ed Mwasiagi J. I.) 253–272 (InTech, 2011). [Google Scholar]
- Chattopadhyay R., Vintzileos A. & Zhang C. D. A description of the Madden-Julian Oscillation based on a Self-Organizing Map. J. Clim. 26, 1716–1732 (2013). [Google Scholar]
- Richardson A. J., Risien C. & Shillington F. A. Using self-organizing maps to identify patterns in satellite imagery. Prog. Oceanogr. 59, 223–239 (2003). [Google Scholar]
- Williams R. N., de Souza P. A. & Jones E. M. Analysing coastal ocean model outputs using competitive-learning pattern recognition techniques. Environ. Modell. Softw. 57, 165–176 (2014). [Google Scholar]
- Liu Y. & Weisberg R. H. Patterns of ocean current variability on the West Florida Shelf using the self-organizing map. J. Geophys. Res. 110, C06003 (2005). [Google Scholar]
- Liu Y., Weisberg R. H. & Mooers C. N. K. Performance evaluation of the self-organizing map for feature extraction. J. Geophys. Res. 111, C05018 (2006). [Google Scholar]
- Hisaki Y. Classification of surface current maps. Deep-Sea Res. I 73, 117–126 (2013). [Google Scholar]
- Chen C. S., Duan S. X., Cai T. & Liu B. Y. Online 24-h solar power forecasting based on weather type classification using artificial neural network. Solar Energy 85, 2856–2870 (2010). [Google Scholar]
- Chang L. C., Shen H. Y. & Chang F. Y. Regional flood inundation nowcast using hybrid SOM and dynamic neural networks. J. Hydrol. 519, 476–489 (2014). [Google Scholar]
- Lipa B.J. & Barrick D. Least-squares methods for the extraction of surface currents from Codar crossed-loop data – application at ARSLOE. IEEE J. Ocean. Engineer. 8, 226–253 (1983). [Google Scholar]
- Cosoli S., Bolzon G. & Mazzoldi A. A real-time and offline quality control methodology for SeaSonde high-frequency radar currents. J. Atmos. Oceanic Technol. 29, 1313–1328 (2012). [Google Scholar]
- Kovačević V., Gačić M., Mancero Mosquera I., Mazzoldi A. & Marinetti S. HF radar observations in the northern Adriatic: Surface current field in front of the Venetian Lagoon. J. Mar. Syst. 51, 95–122 (2004). [Google Scholar]
- Aladin International Team. The ALADIN project: Mesoscale modelling seen as a basic tool for weather forecasting and atmospheric research. WMO Bulletin 46, 317–324 (1997). [Google Scholar]
- Tudor M. & Ivatek-Šahdan S. The case study of bura of 1st and 3rd February 2007. Meteorol. Z. 19, 453–466 (2010). [Google Scholar]
- Mihanović H. et al. Surface current patterns in the northern Adriatic extracted from High-Frequency radar data using Self-Organizing Map analysis, J. Geophys. Res. 116, C08033 (2011). [Google Scholar]
- Shchepetkin A. F. & McWilliams J. C. The regional oceanic modeling system (ROMS): A split-explicit, free-surface, topography-following coordinate ocean model. Ocean Modell. 9, 347–404 (2005). [Google Scholar]
- Oddo P., Pinardi N., Zavatarelli M. & Coluccelli A. The Adriatic basin forecasting system. Acta Adriat. 47 (Suppl.), 169–184 (2006). [Google Scholar]
- Janeković I., Mihanović H., Vilibić I. & Tudor M. Extreme cooling and dense water formation estimates in open and coastal regions of the Adriatic Sea during the winter of 2012. J. Geophys. Res. 119, 3200–3218 (2014). [Google Scholar]
- Buljan M. & Zore-Armanda M. Oceanographic properties of the Adriatic Sea. Oceanogr. Mar. Biol. Ann. Rev. 14, 11–98 (1976). [Google Scholar]
- Grubišić V. Bora-driven potential vorticity banners over the Adriatic. Q. J. R. Met. Soc. 130, 2571–2603 (2004). [Google Scholar]
- Skaramock W. C. Evaluating mesoscale NWP models using kinetic energy spectra. Mon. Wea. Rev. 132, 3019–3032 (2004). [Google Scholar]
- Beg-Paklar G., Isakov V. Koračin K., Kourafalou V. & Orlić M. A case study of bora-driven flow and density changes on the Adriatic shelf (January 1987). Cont. Shelf Res. 21, 1751–1783 (2001). [Google Scholar]
- Kuzmić M., Janeković I., Book J. W., Martin P. J. & Doyle J. D. Modeling the northern Adriatic double-gyre response to intense bora wind: A revisit. J. Geophys. Res. 111, C03S13 (2006). [Google Scholar]
- Jurčec V., Ivančan-Picek B., Tutiš V. & Vukičević V. Severe Adriatic jugo wind. Meteorol. Z. 5, 67–75 (1996). [Google Scholar]
- Harlan J. et al. The integrated ocean observing system high-frequency radar network: Status and local, regional, and national applications. Mar. Technol. Soc. J. 44, 122–132 (2010). [Google Scholar]
- Moore A. M. et al. The Regional Ocean Modeling System (ROMS) 4-dimensional variational data assimilation systems Part I - System overview and formulation. Prog. Oceanogr. 91, 34–49 (2011). [Google Scholar]
- Paduan J. D. & Rosenfeld L. K. Remotely sensed surface currents in Monterey Bay from shore-based HF radar (Coastal Ocean Dynamics Application Radar), J. Geophys. Res. 101, 20669–20686 (1996). [Google Scholar]
- Melssen W., Wehrens R. & Buydens L. Supervised Kohonen networks for classification problems. Chemometrics Intell. Lab. Syst. 83, 99–113 (2006). [Google Scholar]
- Malačič V., Viezzoli D. & Cushman-Roisin B. Tidal dynamics in the northern Adriatic Sea, J. Geophys. Res. 105, 26265–26280 (2000). [Google Scholar]
- Cheng L., Lek S., Lek-Ang S. & Li Z. J. Predicting fish assemblages and diversity in shallow lakes in the Yangtze River basin. Limnologica 42, 127–136 (2012). [Google Scholar]