Table 1.
Application of machine learning models in surface water.
Task | Algorithms | Sample size | Input parameters | Evaluation results | Reference |
---|---|---|---|---|---|
DO prediction | BWNN, BANN, WNN, ANN, ARIMA | 370 | DO | BWNN > BANN > WNN > ANN > ARIMA | [18] |
DO prediction | LSTM | 236 | DO | The model performed well at 74% of sites (NSE ≥ 0.4) | [19] |
DO prediction | PNN | 1912 | Cl–, alkalinity, BOD, PO4–P, COD, pH, temperature, NO3–N, Ca2+, P, Mg2+, and EC | Good interpolation performance (R2 = 0.82) | [20] |
DO prediction | CCNN | 232 | DO and water quality parameters (e.g., Cl, NOx, TDS, pH, temperature) | R2 = 0.825, RMSE = 0.550 | [21] |
BOD prediction | DNN, SVR, RF | 32323 | Latitude, longitude, time, site actual depth, sea state, degree of turbulence at sea, wind speed, DO, temperature, salinity, total coliform, light penetration in water, chlorophyll-a, polychlorinated biphenyls plate count, NOx–N, PO4–P, NH3–N, TP, pH, TSS, EC, sample depth, density, and transparency | DNN achieved 19.20%–25.16% lower RMSE than traditional models | [22] |
EC, HCO3–, SO42−, Cl, TDS, Na+, Mg2+, Ca2+ prediction | SVM, ANN | All data since 1960 | Temperature, pH, EC, HCO3–, SO42−, Cl, TDS, Na+, Mg2+, and Ca2+ | SVM > ANN | [23] |
TN, TP prediction | SVM, ANN | 660 | River flow, temperature, flow travel time, rainfall, DO, TN, and TP | SVM > ANN | [24] |
Water quality level prediction | DT, RF, DCF, and 10 other models | 33612 | pH, DO, CODMn, and NH3–N | DT, RF, and DCF provide better predictive performance | [25] |
TRP, NO3–N, TP, and NH4–N prediction | RF | 21657 | EC, turbulence, temperature, DO, pH, chlorophyll-a, and flow rate | Compared with the linear model, RMSE decreased by 60.1% | [26] |
Chlorophyll-a prediction | SVM, ANN | 357 | Chlorophyll-a, PO4–P, NH3–N, NO3–N, temperature, solar radiation, and wind speed | SVM > ANN | [27] |
Algal bloom prediction | ANFIS | 896 | COD, BOD, TOC, TSS, TP, DTP, PO4–P, TN, NO3–N, NH3–N, chlorophyll-a, temperature, precipitation, flowrate, DO, pH, EC, total coliform, and fecal coliform | ANFIS performed best in both quantitative and classification problems | [28] |
Hyperparameter selection optimization | SVR | 223 | BGA-PC, chlorophyll-a, DO, EC, fDOM, turbidity, and pollution sediments | BGA-PC (accuracy = 0.77), chlorophyll-a (0.78), TSS (0.81), fDOM (−), turbidity (0.55) and DO (−) | [29] |
Water pollution monitoring | Attention neural network | 1000 | Water images | The resolution accuracy of clean water was 71.2%, and that of polluted water was 73.6% | [30] |
Water pollution monitoring | CNN, SVM, RF | 81 | Landsat8 images and water quality level | CNN (accuracy = 97.12%) > SVM (96.89%) > RF (86.21%) | [31] |
Heavy metal contamination assessment | PCA | 42 | Cu, Mn, Cr, Zn, Pb, Cd, Ni, and Co | Areas with heavy metal pollution were identified | [32] |
WQI parameter selection | PCA | 240 | Temperature, DO, pH, EC, BOD, NO3–N, fecal coliform, total coliform, turbidity, alkalinity, Cl, COD, NH3–N, total hardness, Ca2+, Mg2+, Na+, TDS, and PO4–P | Nine key parameters were DO, pH, EC, BOD, total coliform, Cl−, Mg, SO42−, and TDS | [33] |
DO, dissolved oxygen; BWNN, bootstrapped wavelet neural network; ANN, artificial neural network; ARIMA, autoregressive integrated moving average; BANN, bootstrapped artificial neural network; WNN, wavelet neural network; LSTM, long short-term memory; NSE, Nash-Sutcliffe efficiency; PNN, polynomial neural network; BOD, biological oxygen demand; COD, chemical oxygen demand; EC, electrical conductivity; CCNN, cascade correlation neural network; TDS, total dissolved solids; RMSE, root mean square error; DNN, deep neural network; SVR, support vector regression; RF, random forest; SVM, support vector machine; TP, total phosphorus; TN, total nitrogen; DT, decision tree; DCF, deep cascade forest; TRP, total reactive phosphorus; ANFIS, adaptive neuro-fuzzy inference system; TOC, total organic carbon; TSS, total suspended solids; DTP, dissolved total phosphorus; BGA-PC, blue-green algae phycocyanin; fDOM, fluorescent dissolved organic matter; CNN, convolutional neural network; PCA, principal component analysis; WQI, water quality index.
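
Several rows in Table 1 report RMSE and NSE. As a reading aid, the sketch below shows how these two metrics are commonly computed from paired observations and predictions. It is a minimal illustration, not the procedure of any cited study: the function names and the sample dissolved-oxygen values are hypothetical. Note that some studies report R2 as the squared correlation coefficient rather than in the 1 − SSE/SST form, which NSE also takes, so R2 values across rows may not be strictly comparable.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    n = len(obs)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / n)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / SST, where SST is taken
    about the mean of the observations. 1.0 is a perfect fit; 0.0 means
    the model is no better than predicting the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


# Hypothetical DO values in mg/L, for illustration only.
observed = [8.1, 7.4, 6.9, 7.8, 8.5]
predicted = [7.9, 7.6, 7.0, 7.5, 8.3]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"NSE  = {nse(observed, predicted):.3f}")
```

Under this definition, a threshold such as NSE ≥ 0.4 means the model's squared error is at most 60% of the variance-based error obtained by always predicting the observed mean.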