Abstract
Water turbidity is a key indicator of water clarity and plays an important role in environmental protection and ecological balance. Because changes in water turbidity images are subtle, the differences captured are often too small to classify reliably. Convolutional neural networks (CNNs) are widely used in image classification and perform well at feature extraction. This study explored the application of CNNs to water turbidity classification. The innovation lies in applying CNNs to water turbidity images, focusing on optimizing the CNN model to improve prediction accuracy and efficiency. The study proposed four CNN models for water turbidity classification based on artificial intelligence, and adjusted the number of model layers to improve prediction accuracy. Experiments were conducted on noise-free and noisy datasets to evaluate the accuracy and running time of the models. The results show that the 10-layer CNN model with a dropout layer achieves a classification accuracy of 96.5% under noisy conditions. This study opens up a new application of CNNs in fine-grained image classification and demonstrates, through experiments, the effectiveness of convolutional neural networks in water turbidity image classification.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-025-93521-4.
Keywords: Water turbidity, AI models, CNN, Accuracy
Subject terms: Environmental sciences, Hydrology, Mathematics and computing
Introduction
Water turbidity1,2 is a crucial indicator of water quality, directly impacting human health and environmental safety. High turbidity levels in water bodies often accompany increased concentrations of microorganisms and pollutants, posing significant public health risks3. Globally, water quality issues have become a critical challenge for many countries4. According to the World Health Organization, millions of cases of illness each year are linked to the consumption of contaminated water, highlighting the importance of monitoring water turbidity to protect public health. Traditional turbidity detection methods, such as spectrophotometry and laser scattering, although accurate, are often expensive and complex, making them unsuitable for large-scale, real-time monitoring5.
With the development of artificial intelligence (AI) technology6, machine learning techniques have been introduced to predict water quality7,8. Several studies have used AI techniques to predict and model water quality parameters such as turbidity in the hope of improving water management capabilities. Upton et al. (2017)9 evaluated current metrics for assessing filtration performance using the computation of predictor variables and the Classification and Regression Tree (CART) algorithm on on-line turbidity data, and found that these metrics did not effectively and consistently summarise the important characteristics of turbidity distributions and associated water quality risks10. Vidyarthi et al. (2023)11 proposed a novel approach integrating ANN and DT for rainfall occurrence forecasting, demonstrating that rules extracted from an ANN model using daily climatic data provide a simple and accurate tool for predicting rainfall occurrence, addressing a gap in existing literature focused on rainfall magnitude forecasting. Alizadeh et al. (2018)12 recorded water quality parameters such as salinity, temperature and turbidity, as well as flow data, for the Wailuku River; various machine learning models, such as artificial neural networks, extreme learning machines and support vector regression, were used to predict the effect of river flow on water quality parameters for the next 2 h. Wang et al. (2021)13 investigated three machine learning models, including artificial neural networks (ANN), genetic programming (GP), and support vector machines (SVM), for better estimation and prediction of tidally averaged sea surface turbidity (SST), using tidal and wave observations from a large tidal bayou along the coast of Jiangsu Province, China, as model inputs. Lo et al. (2023)14 used machine learning regression methods (Random Forest, Gradient Boosting, Backpropagation Neural Networks and Convolutional Neural Networks (CNN)) to construct individual water quality inversion models for ten water parameters. Taking Yuandang Lake in the Yangtze Delta region of China as a case study, it was proposed to collect large-scale data using a multispectral Unmanned Aerial Vehicle (UAV) and retrieve multiple water quality parameters using machine learning algorithms15,16. Vidyarthi et al. (2020)17 examined the potential of Particle Swarm Optimization (PSO), an evolutionary optimization technique, for training ANNs; their case study on the Jardine River Basin demonstrated that PSO outperformed gradient descent (GD) in optimizing ANN performance for rainfall–runoff prediction, showing greater effectiveness across various error statistics. Yao et al. (2023)18 introduced a hybrid model (CEEMDAN-FE-LSTM-Transformer) combining advanced decomposition and neural network techniques to predict total phosphorus concentrations in lakes, achieving an R² of 0.37–0.87; the model identified turbidity and total nitrogen as key factors, providing valuable insights for managing surface water eutrophication in the Taihu Lake basin. He et al. (2023)19 analyzed dissolved oxygen dynamics in the River Thames using superstatistical methods and machine learning, finding that the Informer model achieved the lowest Mean Absolute Error (0.15) for long-term predictions; their results highlight the Light Gradient Boosting Machine for same-time predictions, offering valuable insights for managing river water quality. Gong et al. (2023)20 developed a Transformer-based Shrimp Detector (TSD) to enhance shrimp detection in aquaculture ponds, addressing challenges such as light variations and water turbidity; their method achieved an Average Precision of 82.7%, surpassing mainstream object detection models by integrating a convolutional neural network and an innovative random feature query setting.
These studies explored the application of various machine learning methods and algorithms in hydrological forecasting and water resources management.
Figure 1 shows the data retrieved from the Scopus database using the keywords “water turbidity” and “machine learning”. As seen in Fig. 1, the number of times various machine learning models are mentioned in the water turbidity research literature increased from 2013 to 2023. In particular, some scholars have begun to use CNNs to classify water turbidity21. For example, a CNN-based soft sensor model developed by Lopez-Betancur et al. (2022)22 measures suspended solids and turbidity in liquid samples using smartphone cameras and LED illumination, achieving high accuracy with a pre-trained AlexNet model and MLR. Feizi et al. (2022)23 further developed an image-based deep learning model for water turbidity estimation, aiming to improve accuracy and applicability under various environmental conditions; CNNs were used to estimate water turbidity to reduce reliance on laboratory equipment and cut costs. The study by Ali Mohammed (2024)24 proposes the use of image processing technology and a CNN to estimate water turbidity: the CNN was trained on images of water samples with different turbidity levels, the results were compared to a turbidity scale, and the proposed method achieved an accuracy of 91.6% in detecting five categories of water turbidity, suggesting that image processing technology has potential for measuring water turbidity with high accuracy. The study by Vijay Anand et al. (2023)25 proposes using a CNN, implemented with TensorFlow and Keras, to predict water quality based on its color and other parameters, which can be checked using mobile-captured and Google Earth images. In short, the CNN, as a powerful deep learning algorithm, has been widely used in water environment research26,27. However, despite the significant progress and successful application of CNNs in water turbidity research28, several challenges remain.
One of the main issues is the limited availability of high-quality labeled datasets for training these models, which may hinder their generalization to different water conditions. Additionally, while CNN models have demonstrated high accuracy in controlled environments, their performance may degrade in real-world scenarios, where water turbidity is influenced by various environmental factors such as different lighting conditions, camera quality, and water sample composition. In this study, in addition to considering the impact of water turbidity, various types of noise, such as Gaussian Noise, Salt-and-Pepper Noise, and Joint Noise, were introduced into the dataset. This approach expands the dataset by accounting for more external interference factors to enhance the model’s robustness and adaptability in practical applications. Furthermore, continuous adjustments and optimizations were made to the CNN architecture layers in order to achieve the optimal training performance.
Fig. 1.
Machine learning applications for determination of water turbidity over 2013–2023, based on Scopus database analysis.
Materials and methods
Overall idea
In this study, the overall idea of the research is divided into two parts: (1) Data Preparation, (2) Model Optimization. As shown in Fig. 2, data preparation includes experimental data, turbidity image collection, image feature extraction, image cutting, and image anti-interference processing. The experimental data come from turbidity ratio experiments in the industrial vision laboratory, in which ten turbidity solutions with different CaCO3 concentrations were prepared. A camera then collects different types of images for machine learning. The model optimization part mainly refers to feeding the prepared image data to the CNN model for training, validation, and evaluation, and finding the optimal model by continuously modifying the CNN structure.
Fig. 2.
The overall idea of the research.
Experimental data preparation
To obtain accurately measured turbidity solutions, liquid solutions were prepared using varying weights of calcium carbonate powder (CaCO3) diluted with distilled water to form 10 different liquid solutions. Pure distilled water (PW) was used as a reference sample. For simplicity, the volume of distilled water was fixed at 2000 ml, while the weight of calcium carbonate varied. Table 1 shows the proportions of all calcium carbonate liquid solutions.
Table 1.
The proportion and turbidity measurement of calcium carbonate solution.
| Class | CaCO3 (g) | Water (ml) | Measurement 1 (NTU) | Measurement 2 (NTU) | Measurement 3 (NTU) | Average (NTU) |
|---|---|---|---|---|---|---|
| I | 0.2 | 2000 | 73.2 | 68.1 | 65.4 | 68.90 |
| II | 0.4 | 2000 | 146 | 142 | 135 | 141.00 |
| III | 0.6 | 2000 | 176 | 158 | 150 | 161.34 |
| IV | 0.8 | 2000 | 239 | 232 | 230 | 233.67 |
| V | 1.0 | 2000 | 394 | 380 | 362 | 378.67 |
| VI | 1.2 | 2000 | 417 | 410 | 403 | 410.00 |
| VII | 1.4 | 2000 | 460 | 471 | 461 | 464.00 |
| VIII | 1.6 | 2000 | 641 | 615 | 601 | 619.00 |
| IX | 1.8 | 2000 | 776 | 751 | 727 | 751.34 |
| X | 2.0 | 2000 | 895 | 887 | 852 | 878.00 |
The turbidity meter used was the HI98703 precision portable turbidity meter from Hanna Instruments. The HI98703 is a highly accurate (± 2% of reading plus 0.02 NTU) portable turbidimeter that operates on the principle of angular light scattering. The optical system of the HI9870329 consists of a tungsten lamp, a scattered light detector (90°) and a transmitted light detector (180°). The light beam that passes through the sample is scattered in all directions, and by analysing the angular distribution and intensity of the scattered light, information about the characteristics of the particles in the sample can be deduced. The microprocessor of the HI98703 instrument calculates the NTU value from the signals reaching the two detectors by using an efficient algorithm to correct and compensate for colour interference.
There are many methods of angular light scattering, such as forward scattering, vertical scattering, and backward scattering. The vertical scattering turbidity measurement method in Fig. 3, its scattering principle conforms to Rayleigh scattering:
$$I_s = I_0\,\frac{1+\cos^2\theta}{2R^2}\left(\frac{2\pi}{\lambda}\right)^4\left(\frac{n^2-1}{n^2+2}\right)^2 a^6 \qquad (1)$$
Fig. 3.
Scattering turbidity measurement method.
NTU (Nephelometric Turbidity Unit) is a standardized unit of measurement for the transparency or clarity of water bodies30.
$$I_s = K\,I_0\left(1+\cos^2\theta\right),\qquad K=\frac{1}{2R^2}\left(\frac{2\pi}{\lambda}\right)^4\left(\frac{n^2-1}{n^2+2}\right)^2 a^6 \qquad (2)$$
In formula (1), $I_s$ is the intensity of the scattered light, $I_0$ is the intensity of the incident light, and $\theta$ is the angle between the scattered light and the incident light; the remaining terms, which depend on the light wavelength $\lambda$, particle radius $a$, relative refractive index $n$ and observation distance $R$, can be grouped into the constant $K$, written in the form of formula (2). $K$ is obtained by calibrating against the reading of the instrument and is used to correct that reading.
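As a numerical illustration of the angular dependence in the Rayleigh expression, the factor $1+\cos^2\theta$ can be evaluated at the two detector angles of the HI98703 (a short Python sketch for illustration only; it is not part of the instrument or the study's code):

```python
import numpy as np

def angular_factor(theta_deg):
    """Angular term (1 + cos^2(theta)) of the Rayleigh scattering expression."""
    theta = np.radians(theta_deg)
    return 1.0 + np.cos(theta) ** 2

# The 90-degree scattered-light detector sees half the angular factor
# of the 180-degree transmitted-light detector, all else being equal.
print(angular_factor(90.0))   # 1.0
print(angular_factor(180.0))  # 2.0
```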
As shown in Table 1, the turbidity experiments were conducted by gradually mixing increasing amounts of CaCO3 into distilled water. The turbidity of the water was classified into 10 groups (I to X), and 3 measurements were taken at each amount. The “Measurement (NTU)” columns in the table show the turbidity value of each measurement, and the “Average” column is the average of the three measurements.
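The “Average” column of Table 1 is simply the mean of the three readings; for example, in Python (values transcribed from Table 1):

```python
# Reproduce the "Average" column of Table 1 from the three raw NTU readings
# for two of the ten classes (values transcribed from the paper's Table 1).
measurements = {
    "I": (73.2, 68.1, 65.4),
    "X": (895, 887, 852),
}
averages = {cls: round(sum(vals) / 3, 2) for cls, vals in measurements.items()}
print(averages)  # {'I': 68.9, 'X': 878.0}
```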
Turbidity image collection
As shown in Table 1, the experiment classified the water turbidity into 10 categories, class I to class X. 2000 ml of pure water was taken as the water quality substrate, and CaCO3 powder in amounts of 0.2 g, 0.4 g, 0.6 g, 0.8 g, 1.0 g, 1.2 g, 1.4 g, 1.6 g, 1.8 g and 2.0 g was added and stirred, yielding water with different turbidity values (NTU). Different turbidity values give the water pictures different clarity. After each turbidity solution was made, 100 ml of the 2000 ml solution was stored in a transparent round container with a diameter of about 8 cm, giving 10 different types of water sample. These 10 categories of water with different turbidity were then photographed under the same light source, the same container and the same temperature to obtain the original water quality image data.
A total of 10 × 25 = 250 water quality classification samples were obtained; 4 samples with different CaCO3 weights are shown in Fig. 4. Figure 4 shows that the changes in water turbidity across different concentration ratios are not easily discernible, despite turbidity values ranging from roughly 70 NTU to nearly 900 NTU (Table 1). Given the difficulty in visually distinguishing these turbidity levels, employing a CNN for water quality classification becomes a promising approach.
Fig. 4.
Image of water turbidity at different CaCO3 concentration ratios.
Convolutional neural network
CNN is a class of deep learning models for image recognition, computer vision and other visual tasks31. Several classic CNN networks32 illustrate the lineage. LeNet-5, proposed by Yann LeCun et al. in 1998, is one of the earliest CNNs for handwritten digit recognition and contains convolutional, pooling and fully connected layers33. AlexNet, by Alex Krizhevsky et al., won the ImageNet image classification challenge in 2012; it employs a deeper network structure, uses ReLU activation functions and Dropout regularisation, and introduced GPU acceleration, making the widespread use of CNNs possible. VGG-16/VGG-19, proposed by the Visual Geometry Group (VGG) in 2014, has 16 or 19 convolutional and fully connected layers, all using 3 × 3 convolutional kernels, giving a very regular network structure. GoogLeNet/Inception, proposed by the Google team in 2014, uses the Inception module, a structure that mixes convolutional kernels of different sizes to improve the efficiency and accuracy of the network.
These networks have yielded impressive results in areas such as image recognition and have served as the basis for subsequent research. Over time, many derived and improved network architectures for different tasks and domains have also emerged. In this experiment, we train a customised CNN on water turbidity images using four AI models and compare the results in order to find the best applicable model. To facilitate the comparison, the attributes of the CNN models were unified, as shown in Table 2: the optimizer is uniformly ‘adam’, which adjusts the network’s weights to minimize the loss function; the initial learning rate is set to 0.001, so that at the beginning of training the weights are updated with a small step size; the maximum number of epochs is set to 10, meaning the entire training set is used to train the network 10 times; the mini-batch size is set to 32, since a smaller batch size helps the model generalize better; and the validation frequency is set to 25, which helps monitor the training process.
Table 2.
Properties of CNN neural networks in this study.
| Option | Description | Value |
|---|---|---|
| Optimizer | The optimization algorithm used during training. | ‘adam’ |
| Initial learn rate | The initial learning rate for the optimizer. | 0.001 |
| Max epochs | The maximum number of training epochs. | 10 |
| Mini-batch size | The number of samples in each mini-batch. | 32 |
| Validation data | The dataset used for monitoring model performance. | imdsValidation |
| Validation frequency | The frequency of validation during training. | 25 |
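The names in Table 2 (e.g. `imdsValidation`) suggest the study's code uses MATLAB's deep learning toolbox; the dictionary below is a language-neutral Python sketch of the same hyperparameters, showing how the validation frequency relates to the epoch length:

```python
# Training hyperparameters from Table 2, collected as a plain dictionary.
train_options = {
    "optimizer": "adam",
    "initial_learn_rate": 1e-3,
    "max_epochs": 10,
    "mini_batch_size": 32,
    "validation_frequency": 25,  # validate every 25 iterations
}

# With the noisy dataset (1000 images, 80% used for training) and batches of
# 32, one epoch takes 800 // 32 = 25 iterations, so validation runs roughly
# once per epoch.
iters_per_epoch = 800 // train_options["mini_batch_size"]
print(iters_per_epoch)  # 25
```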
Results and discussion
Image feature extraction
Gray Level Energy and Gradient Energy are two important metrics used for image feature extraction34. They can reflect the grey scale distribution and texture information of an image, which is very useful for the analysis and evaluation of image information such as water turbidity. The following are the definitions of the two metrics:
1) Gray Level Energy : Gray Level Energy, also known as Gray Level Sum of Squares or Normalised Second Order Moment, is the sum of the squares of the grey level values of the pixels in an image. The higher the grey level energy, the more uniform the distribution of grey values in the image, and the richer the main information of the image, which corresponds to the areas of the image with large variations in brightness. Equation (3) is the mathematical expression of Gray Level Energy:
$$E_{\mathrm{gray}} = \sum_{x=0}^{W-1}\sum_{y=0}^{H-1} f(x,y)^2 \qquad (3)$$
2) Gradient energy: Gradient Energy measures the degree of grey scale variation in an image and is the sum of the squares of the gradient magnitudes of the image. A high gradient energy indicates that the image is richer in texture information, i.e. there are more grey transitions and edges in the image. Gradient energy has important applications in areas such as image edge detection and texture analysis. Equation (4) is the mathematical expression of Gradient Energy.
$$E_{\mathrm{grad}} = \sum_{x=0}^{W-1}\sum_{y=0}^{H-1}\left[G_x(x,y)^2 + G_y(x,y)^2\right] \qquad (4)$$

where $G_x$ and $G_y$ are the horizontal and vertical gradients of $f$.
In addition, there are feature functions such as Gray Level Entropy, Gradient Entropy, Gray Level Variance and Gradient Variance, which are more concerned with the uniformity of the distribution of the image.
3) Gray Level Entropy: Gray Level Entropy describes the complexity of the distribution of grey values in the image; the larger the entropy, the more complex the distribution of grey values, i.e. the more diverse the grey values of the image. Equation (5) is the mathematical expression of Gray Level Entropy:
$$H_{\mathrm{gray}} = -\sum_{i=0}^{L-1} p(i)\,\log_2 p(i) \qquad (5)$$

where $p(i)$ is the proportion of pixels with grey level $i$ and $L$ is the number of grey levels.
4) Gradient entropy: Gradient Entropy measures the complexity of the distribution of the gradient magnitude of an image; the larger the entropy, the more complex the distribution of gradient magnitudes, i.e. the richer the texture of the image. Equation (6) is the mathematical expression of Gradient Entropy.
$$H_{\mathrm{grad}} = -\sum_{j} p_g(j)\,\log_2 p_g(j) \qquad (6)$$

where $p_g(j)$ is the proportion of pixels whose gradient magnitude falls in bin $j$.
5) Gray level variance: Gray Level Variance measures the degree of fluctuation of the grey values of the image; the larger the variance, the more variable the grey values and the more uneven the grey distribution of the image. Equation (7) is the mathematical expression of Gray Level Variance.
$$\sigma^2_{\mathrm{gray}} = \frac{1}{WH}\sum_{x=0}^{W-1}\sum_{y=0}^{H-1}\left[f(x,y)-\mu\right]^2 \qquad (7)$$

where $\mu$ is the mean grey value of the image.
6) Gradient variance: Gradient Variance measures the degree of fluctuation of the gradient magnitude of the image; the larger the variance, the more the gradient varies and the less uniformly the texture of the image is distributed. Equation (8) is the mathematical expression of Gradient Variance:
$$\sigma^2_{\mathrm{grad}} = \frac{1}{WH}\sum_{x=0}^{W-1}\sum_{y=0}^{H-1}\left[g(x,y)-\mu_g\right]^2 \qquad (8)$$

where $g(x,y)=\sqrt{G_x(x,y)^2+G_y(x,y)^2}$ is the gradient magnitude and $\mu_g$ its mean.
In Eqs. (3) to (8), $x$ is the horizontal position in the image, ranging from 0 to $W-1$ (where $W$ is the image width); $y$ is the vertical position, ranging from 0 to $H-1$ (where $H$ is the image height); and $f(x, y)$ is the pixel intensity of the image at coordinates $(x, y)$, i.e. the grey value or colour value at that position.
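The six metrics of Eqs. (3) to (8) can be computed from a grayscale image with a short NumPy sketch (an illustrative re-implementation, not the study's original code; the 256-bin histograms used for the two entropies are an assumption):

```python
import numpy as np

def turbidity_image_features(img):
    """Grey-level and gradient statistics of Eqs. (3)-(8) for a 2-D
    grayscale image. Illustrative sketch, not the paper's code."""
    f = img.astype(np.float64)
    gy, gx = np.gradient(f)            # vertical and horizontal gradients
    grad_mag = np.hypot(gx, gy)        # gradient magnitude g(x, y)

    def entropy(values, bins=256):
        hist, _ = np.histogram(values, bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))

    return {
        "gray_energy": float(np.sum(f ** 2)),             # Eq. (3)
        "gradient_energy": float(np.sum(gx**2 + gy**2)),  # Eq. (4)
        "gray_entropy": entropy(f),                       # Eq. (5)
        "gradient_entropy": entropy(grad_mag),            # Eq. (6)
        "gray_variance": float(np.var(f)),                # Eq. (7)
        "gradient_variance": float(np.var(grad_mag)),     # Eq. (8)
    }

# Sanity check: a perfectly flat image has zero gradient energy,
# zero variance and zero entropy.
flat = np.full((8, 8), 128, dtype=np.uint8)
feats = turbidity_image_features(flat)
print(feats["gradient_energy"], feats["gray_variance"])  # 0.0 0.0
```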
Figure 5 shows the water turbidity image at water: 2000 ml and CaCO3: 1.2 g, from which some relevant feature information can be derived. Grey energy and gradient energy carry the main grey-level information of the image and are biased towards the water turbidity information, while grey entropy, gradient entropy, grey variance and gradient variance describe the uniformity of the distribution. Figure 5 contains three histogram panels. The first shows the grey energy and gradient energy of the water turbidity image: the grey energy is small and the gradient energy relatively large, meaning there is no significant difference in brightness, but there are obvious changes in edges, textures or structures. The second panel shows grey entropy and gradient entropy, suggesting that the turbidity of the water body represented by the image may be at a high level. The third panel shows grey variance and gradient variance. Overall, the CaCO3 suspended solids are distributed relatively uniformly in the water body, resulting in a relatively uniform greyscale distribution in the image; however, local areas may be uneven, or light may be significantly scattered and reflected there, so the gradient values are relatively scattered, forming obvious edge or texture features.
Fig. 5.
Water turbidity image features analysis.
Image cutting
During acquisition, images may be affected by the environment; if there is a large difference between the image background and the subject, the accuracy of image recognition suffers. Therefore, feature extraction is needed to extract a representative image region, which improves recognition accuracy. At the same time, the CNN constrains the pixel dimensions of the input image, so the images must be cut before training. Figure 6 shows a before-and-after comparison of cutting a sample image.
Fig. 6.
A comparison before and after the cutting of an image.
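The cutting step can be sketched as a simple centre crop (illustrative only; the paper does not specify its exact cropping routine or output size):

```python
import numpy as np

def center_crop(img, size):
    """Cut a square region of side `size` from the centre of `img`,
    mimicking the cropping applied before images are fed to the CNN.
    Hypothetical helper; the paper's crop routine is not given."""
    h, w = img.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]

img = np.arange(100 * 120).reshape(100, 120)
patch = center_crop(img, 64)
print(patch.shape)  # (64, 64)
```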
Image anti-interference processing
Real images can be affected by various factors and contain a certain amount of noise; image noise can be understood as degradation of the image signal caused by external factors. Usually an image contains more than one type of noise, and generally speaking, noise appears more often in the brighter areas of the image. In this experiment, Gaussian noise, salt-and-pepper noise, and joint noise are added to the water quality images respectively.
(1) Gaussian noise: Gaussian noise is a statistical noise whose probability density distribution is equal to the normal distribution.
(2) Salt-and-Pepper Noise: Salt-and-Pepper Noise is a discrete random noise, a phenomenon in which black pixel dots (pepper noise) and white pixel dots (salt noise) appear randomly in an image, thus creating noise.
(3) Joint Noise: Joint Noise refers to the combined effect of Gaussian noise and salt-and-pepper noise, and its mathematical definition can be expressed as Gaussian noise and salt-and-pepper noise applied to the original image in turn.
In fact, images obtained from the outside world can be disturbed in different ways, degrading image quality. Figure 8 shows the water turbidity images with Gaussian noise, salt-and-pepper noise, and joint noise added to the original cropped images; due to space limitations, only one image each from class I and class X is shown in Fig. 8.
Fig. 7.
Four AI models with different architectures.
Fig. 8.
Schematic of water turbidity cropped images incorporating noise.
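The three noise types can be generated with NumPy as follows (a hedged sketch; the noise strengths `sigma` and `amount` are illustrative assumptions, since the paper does not report its exact noise parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, sigma=10.0):
    """Gaussian noise: add zero-mean normally distributed noise to each pixel."""
    noisy = img.astype(np.float64) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_pepper_noise(img, amount=0.05):
    """Salt-and-pepper noise: flip a random fraction of pixels to 0 or 255."""
    noisy = img.copy()
    mask = rng.random(img.shape)
    noisy[mask < amount / 2] = 0          # pepper (black dots)
    noisy[mask > 1 - amount / 2] = 255    # salt (white dots)
    return noisy

def add_joint_noise(img):
    """Joint noise: Gaussian and salt-and-pepper noise applied in sequence."""
    return add_salt_pepper_noise(add_gaussian_noise(img))

img = np.full((32, 32), 128, dtype=np.uint8)
noisy = add_joint_noise(img)
print(noisy.shape, noisy.dtype)  # (32, 32) uint8
```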
AI model training
The CNN is a deep learning model for image classification. To obtain better results, CNN networks with different structures were designed specifically for the water turbidity image classification task, so as to match and optimize the model best suited to this experiment. In this study, we explore four different classes of convolutional neural network (CNN) models applied to the task of predicting water quality turbidity classification. To construct and validate the performance of these models, we employ a dataset that covers a wide range of data and is composed of two main parts, allowing a more comprehensive exploration of the models' predictive ability.
As shown in Fig. 7, four CNN models with different structures were constructed to predict water quality turbidity classification. In this figure, M1 is the 8-layer CNN architecture, which contains 1 input layer; 3 convolutional layers C1, C2, C3 (each comprising a 2D convolution, a normalization layer, a ReLU activation and a pooling layer); 1 pooling layer (MaxPool); 2 fully connected layers; a softmax layer; and a classification layer. Once the CNN_8-layer network structure is defined, water turbidity image data can be input into the model for training. Figure 9 shows the accuracy and loss curves of the CNN_8-layer network. During training, the network learns to optimize its parameters and weights to minimize the loss function and make accurate predictions on the provided data.
Fig. 9.
CNN 8 layer accuracy and loss curves.
Performance of four different CNN models
Through in-depth training and evaluation on two different datasets, we were able to gain a deep understanding of the applicability, advantages and disadvantages of the different models. The experimental results presented in Table 3 not only demonstrate the predictive ability of the models, but also reveal the combined impact of multiple dimensions such as computation time, accuracy and performance.
Table 3.
Performance of 4 different CNN models.
| No | AI model | Data source | CNN structure | Run time | Accuracy | Performance |
|---|---|---|---|---|---|---|
| M1 | CNN8_Layer | 10 classes of water quality turbidity images, 25 for each class | Figure 7 | 65s | 88.0% | Figure 10(a) |
| M2 | CNN8 + Drop | 67s | 84.0% | Figure 10(c) | ||
| M3 | CNN10_Layer | 145s | 88.0% | Figure 10(e) | ||
| M4 | CNN10 + Drop | 149s | 86.0% | Figure 10(g) | ||
| MN1 | CNN8_Layer | 10 classes of water quality turbidity images, including Gaussian noise, salt-and-pepper noise and joint noise, 100 pictures per class | 266s | 94.0% | Figure 10(b) | |
| MN2 | CNN8 + Drop | 269s | 93.5% | Figure 10(d) | ||
| MN3 | CNN10_Layer | 675s | 95.0% | Figure 10(f) | ||
| MN4 | CNN10 + Drop | 687s | 96.5% | Figure 10(h) |
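For reference, picking the best-performing configuration from the noisy-data rows of Table 3 can be written in a few lines of Python (values transcribed from the table):

```python
# Run times (s) and accuracies (%) of the four noisy-data models,
# transcribed from Table 3.
results_noisy = {
    "MN1 CNN8_Layer": (266, 94.0),
    "MN2 CNN8+Drop": (269, 93.5),
    "MN3 CNN10_Layer": (675, 95.0),
    "MN4 CNN10+Drop": (687, 96.5),
}

# Select the model with the highest accuracy (index 1 of each tuple).
best = max(results_noisy, key=lambda m: results_noisy[m][1])
print(best)  # MN4 CNN10+Drop
```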
To train and evaluate the model, the dataset is divided into a training set and a validation set, where 80% of the data samples are used for the training of the model and the remaining 20% is used to validate the generalisation performance of the model. This division strategy helps avoid overfitting and ensures that the model can adapt to new and unseen data. By employing a diverse dataset, an exhaustive training and validation strategy, and powerful visualisation tools, in-depth analytical and empirical support is provided for model selection and optimisation for the task of water quality turbidity classification and prediction. During the model training process, the predictive performance of the model is visualised using the confusion matrix, a powerful visualisation tool that allows us to gain a deeper understanding of the model’s performance on different datasets. Detailed results can be seen in Fig. 10.
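The 80/20 split and the confusion matrix described above can be sketched as follows (an illustrative NumPy version; the study's own code is linked in the Data availability section):

```python
import numpy as np

rng = np.random.default_rng(42)

def train_val_split(n_samples, val_fraction=0.2):
    """Shuffle sample indices and hold out 20% for validation, as in the paper."""
    idx = rng.permutation(n_samples)
    n_val = int(n_samples * val_fraction)
    return idx[n_val:], idx[:n_val]

def confusion_matrix(y_true, y_pred, n_classes):
    """Count matrix: rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Noisy dataset: 10 classes x 100 images = 1000 samples.
train_idx, val_idx = train_val_split(1000)
print(len(train_idx), len(val_idx))  # 800 200

# Toy 3-class example: the diagonal of the confusion matrix gives accuracy.
cm = confusion_matrix([0, 0, 1, 2], [0, 1, 1, 2], n_classes=3)
print(cm.trace() / cm.sum())  # 0.75
```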
Fig. 10.
(a) CNN 8_layer (Data = 250). (b) CNN 8_layer (Data = 1000). (c) CNN 8_layer+D (Data = 250). (d) CNN 8_layer+D (Data = 1000). (e) CNN 10_layer (Data = 250). (f) CNN 10_layer (Data = 1000). (g) CNN 10_layer+D (Data = 250). (h) CNN 10_layer+D (Data = 1000).
Conclusions
This study systematically examined the performance of different models on original data and on data with added noise, in order to find the best model for water turbidity image datasets. Comparing the results on the “original data” and the “added noise data” gives a better understanding of each model’s performance and applicability. In terms of accuracy, four AI models were evaluated: the “8-layer CNN model”, the “8-layer CNN model with Dropout”, the “10-layer CNN model” and the “10-layer CNN model with Dropout”. The 10-layer CNN model achieved 88% accuracy on the original data and 95% on the noisy data, while the 10-layer CNN model with Dropout reached 96.5% on the noisy data, the best result overall. Across all models, accuracy ranged from 84% to 88% on the original data and from 93.5% to 96.5% after noise data were added. This shows that the models perform better on the enlarged noisy dataset, and that performance also tends to improve as model complexity increases.
However, this study also has some limitations. First, although more complex models improve accuracy, this is often at the expense of increased running time and computing resource consumption, which may affect the practical applicability of the model, especially when dealing with large datasets. In addition, although noise processing is optimized, this study mainly focuses on specific datasets and scenarios, and the generalization ability of the model in other types of datasets has not been fully verified.
Future work could consider introducing more efficient optimization algorithms or improving the existing model structure to further improve performance while reducing computational cost. Secondly, other types of noise processing, or combinations of different noise-robustness techniques, could be explored to improve the applicability of the model in more scenarios. In addition, multi-model fusion strategies could further improve the generalization ability and accuracy of the model, especially in applications with multivariate and complex data.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Acknowledgements
This research is generously supported by the following funding sources: Guangdong Province Climbing Plan (Project No. pdjh2023b1132), Qingyuan Science and Technology Think Tank Special Project (Project No. QYKX2024001), Guangdong Provincial Department of Education Key Special Projects (Project Nos. 2022ZDZX1073 and 2023ZDZX1086), and Dongguan Science and Technology of Social Development Program (No. 20211800904472).
Author contributions
Conceptualization, N.Y. and Y.C.; methodology, N.Y. and J.G.; software, N.Y. and S.L.; validation, Y.X. and W.G.; formal analysis, N.Y. and Y.C.; data curation, N.Y.; writing—original draft preparation, N.Y.; writing—review and editing, N.Y. and Y.C.; project administration, N.Y. and R.L.; funding acquisition, N.Y. and Y.C.
Data availability
The code link: https://github.com/NieYing666/-Scientific-Reports-Paper-Code.git.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Ying Nie, Email: nieying2022ch@163.com.
Yuqiang Chen, Email: chenyq@dgpt.edu.cn.
References
- 1. Ali, N. S. et al. Performance of a solar photocatalysis reactor as pretreatment for wastewater via UV, UV/TiO2, and UV/H2O2 to control membrane fouling. Sci. Rep. 12(1) (2022).
- 2. Lyons, K. J. et al. Monitoring groundwater quality with real-time data, stable water isotopes, and microbial community analysis: A comparison with conventional methods. Sci. Total Environ. 864 (2023).
- 3. Anyanwu, I. N. et al. Pollution of the Niger Delta with total petroleum hydrocarbons, heavy metals and nutrients in relation to seasonal dynamics. Sci. Rep. 13(1) (2023).
- 4. McDowell, R. W. et al. Difficulties in using land use pressure and soil quality indicators to predict water quality. Sci. Total Environ. 935 (2024).
- 5. Irfan, M. et al. Distance and weightage-based identification of most critical and vulnerable locations of surface water pollution in Kabul River tributaries. Sci. Rep. 13(1) (2023).
- 6. Granata, F. & Di Nunno, F. Neuroforecasting of daily streamflows in the UK for short- and medium-term horizons: A novel insight. J. Hydrol. 624 (2023).
- 7. Severati, A. et al. The autospawner system - automated ex situ spawning and fertilisation of corals for reef restoration. J. Environ. Manage. 366 (2024).
- 8. Igwegbe, C. A. et al. Purification of aquaculture effluent using Picralima nitida seeds. Sci. Rep. 12(1) (2022).
- 9. Upton, A. et al. Rapid gravity filtration operational performance assessment and diagnosis for preventative maintenance from on-line data. Chem. Eng. J. 313, 250–260 (2017).
- 10. Gabisa, E. W. & Ratanatamskul, C. Recycling of waste coffee grounds as a photothermal material modified with ZnCl2 for water purification. Sci. Rep. 14(1), 10811 (2024).
- 11. Vidyarthi, V. K. & Jain, A. Advanced rule-based system for rainfall occurrence forecasting by integrating machine learning techniques. J. Water Resour. Plan. Manag. 149(1) (2023).
- 12. Alizadeh, M. J. et al. Effect of river flow on the quality of estuarine and coastal waters using machine learning models. Eng. Appl. Comput. Fluid Mech. 12(1), 810–823 (2018).
- 13. Wang, Y. et al. Predicting water turbidity in a macro-tidal coastal bay using machine learning approaches. Estuar. Coast. Shelf Sci. 252 (2021).
- 14. Lo, Y. et al. Medium-sized lake water quality parameters retrieval using multispectral UAV image and machine learning algorithms: A case study of the Yuandang Lake, China. Drones 7(4) (2023).
- 15. Afshari Nia, M., Panahi, F. & Ehteram, M. Convolutional neural network-ANN-E (Tanh): A new deep learning model for predicting rainfall. Water Resour. Manage. 37(4), 1785–1810 (2023).
- 16. Cho, E. et al. Identifying subsurface drainage using satellite big data and machine learning via Google Earth Engine. Water Resour. Res. 55(10), 8028–8045 (2019).
- 17. Vidyarthi, V. K. & Chourasiya, S. Particle swarm optimization for training artificial neural network-based rainfall-runoff model, case study: Jardine River basin. Micro-Electronics Telecommun. Eng., 641–647 (2020).
- 18. Yao, J., Chen, S. & Ruan, X. Interpretable CEEMDAN-FE-LSTM-Transformer hybrid model for predicting total phosphorus concentrations in surface water. J. Hydrol. 629 (2024).
- 19. He, H. et al. Analyzing spatio-temporal dynamics of dissolved oxygen for the River Thames using superstatistical methods and machine learning. Sci. Rep. 14(1) (2024).
- 20. Gong, B., Jing, L. & Chen, Y. TSD: Random feature query design for transformer-based shrimp detector. Comput. Electron. Agric. 221 (2024).
- 21. Xu, R.-Z. et al. Attention improvement for data-driven analyzing fluorescence excitation-emission matrix spectra via interpretable attention mechanism. Npj Clean Water 7(1) (2024).
- 22. Lopez-Betancur, D. et al. Convolutional neural network for measurement of suspended solids and turbidity. Appl. Sci. 12(12) (2022).
- 23. Feizi, H. et al. An image-based deep learning model for water turbidity estimation in laboratory conditions. Int. J. Environ. Sci. Technol. 20(1), 149–160 (2022).
- 24. Mohammed, A. Determine water turbidity by using image processing technology. Int. J. Intell. Syst. Appl. Eng. 12(3), 4260–4265 (2024).
- 25. AL. V A M E. Water quality prediction using CNN. J. Phys. Conf. Ser. 2484, 012051 (2023).
- 26. Santos, M. et al. Spatio-temporal dynamics of phytoplankton community in a well-mixed temperate estuary (Sado Estuary, Portugal). Sci. Rep. 12(1) (2022).
- 27. Esteki, R. et al. Investigating the improvement of the quality of industrial effluents for reuse with added processes: coagulation, flocculation, multi-layer filter and UV. Sci. Rep. 14(1) (2024).
- 28. Montúfar-Romero, M. et al. Feasibility of aquaculture cultivation of elkhorn sea moss (Kappaphycus alvarezii) in a horizontal long line in the tropical Eastern Pacific. Sci. Rep. 13(1) (2023).
- 29. Hanna HI98703-02 Turbidity Portable Meter (2023).
- 30. Farkas, K. et al. Implications of long-term sample storage on the recovery of viruses from wastewater and biobanking. Water Res. 265 (2024).
- 31. Talukdar, S. et al. Optimisation and interpretation of machine and deep learning models for improved water quality management in Lake Loktak. J. Environ. Manage. 351 (2024).
- 32. Zhao, R. et al. Deep learning and its applications to machine health monitoring. Mech. Syst. Signal Process. 115, 213–237 (2019).
- 33. Zakir Hossain, M. D. et al. A comprehensive survey of deep learning for image captioning. ACM Comput. Surv. 51(6) (2019).
- 34. Housh, M. & Ostfeld, A. An integrated logit model for contamination event detection in water distribution systems. Water Res. 75, 210–223 (2015).