Image Recognition of Mine Water Inrush Based on Bilinear Convolutional Neural Network with Few-Shot Learning

Shuai Zhang; Yuanze Du; Yingwang Zhao; Lifu Zhou

2024 Feb 27;9(10):12027–12036. doi: 10.1021/acsomega.3c09735


PMCID: PMC10938431 PMID: 38496943

Abstract


With the increasingly widespread application of deep learning in coal mining, image recognition of mine water inrush has become an active research topic. Underground environments are complex, and the images have high noise and low brightness. Additionally, mine water inrush is accidental, and few actual image samples are available. Therefore, this paper proposes an algorithm that recognizes mine water inrush images based on few-shot deep learning. Given the characteristics of coal wall water seepage images, a bilinear neural network was used to extract the image features and enhance the network’s fine-grained image recognition. First, features were extracted using a bilinear convolutional neural network. Second, the network was pre-trained based on cosine similarity. Finally, the network was fine-tuned for the predicted image. The method was compared with monolinear (single-branch) feature extraction under both large-sample and few-shot learning. According to the experimental results, the recognition rate reaches 95.2% for few-shot learning based on a bilinear neural network, thus demonstrating its effectiveness.

1. Introduction

Mine water inrush is one of the main disasters in coal mines. In recent years, the major accidents in coal mines have been water disasters, causing heavy casualties and property losses. As computer technology has developed, deep learning¹⁻³ technology has gradually been applied to mine engineering⁴⁻⁷ and water resource engineering.⁸⁻¹⁰ Existing methods have mostly used traditional image recognition, which recognizes water inrush¹¹ in only a particular scene, but the mine environment is harsh and the image scenes are diverse, so traditional image recognition has difficulty accurately recognizing complex scenes. Compared with traditional image recognition, image recognition algorithms based on deep learning have strong generalization ability and robustness, and they can adapt to image recognition in multiple scenes. However, these applications rely on deep learning with a large number of samples; small sample sizes result in overfitting and low recognition rates. Most importantly, water inrush rarely occurs in mines, so obtaining real samples of water inrush is difficult. To solve this problem, few-shot learning¹² was used to identify water inrush from a limited number of samples. Few-shot learning classifies new images using a small amount of labeled training data: the image to be predicted is compared with the few labeled images in each category, and their similarity determines its category.

Among the methods currently in use, Liu et al.¹³ proposed a coal and coal gangue detection method based on YOLOv4, and Wang¹⁴ proposed a new method of obtaining coal texture features using a co-occurrence matrix, which was input into the neural network as a feature vector. Zhang et al.¹⁵ summarized the image recognition of coal and rock and pointed out the existing problems. Li et al.¹⁶ proposed a new mine water source discrimination method based on a generative adversarial network using the fluorescence spectrum of water samples. Alfarzaeai et al.¹⁷ used a convolutional neural network (CNN) and thermal imaging to identify coal gangue. Si et al.¹⁸ proposed a coal rock–recognition algorithm based on a deep CNN, which solved the overfitting problem of the CNN and had a good recognition effect on coal rock images. These studies achieved good results in their respective fields, but few scholars have conducted in-depth research on the image recognition of mine water inrush. The main reason is that mine water inrush is highly accidental; therefore, very few samples have been recorded, and large-scale learning is difficult to conduct. Image recognition based on few-shot learning is well suited to this shortage of samples.

Early few-shot learning methods used generative models and data augmentation.¹⁹ These methods achieved some success, but problems such as overfitting remain. Subsequently, meta-learning²⁰ offered a new approach to few-shot learning. Unlike traditional learning models, meta-learning guides learning by comparison. For example, the Siamese neural network proposed by Koch et al.²¹ is a type of meta-learning. As an application case, Li et al.²² proposed a feature-learning module based on the metric learning algorithm, which specifically extracted “intraclass commonality” and “interclass uniqueness”. The most commonly used matching method in meta-learning is metric learning,²³ which classifies the extracted feature vectors using a distance measure. In metric methods, the cosine distance²⁴ is often used to calculate the similarity of two feature vectors. Because underground conditions are dim, the feature extraction network has difficulty extracting the features of water seepage in the coal wall and cannot judge whether water inrush has occurred. Therefore, in image recognition, the focus is on the fine-grained classification of images.

Lin et al.²⁵ proposed a bilinear convolutional neural network (Bilinear-CNN), which uses two feature extraction networks and intersects the feature maps in pairs to ensure that the image features are not lost. Considering the small number of mine water inrush samples and the problem of fine-grained image classification, a few-shot learning method based on metric classification was used to classify images, and a bilinear neural network was used to extract feature maps and improve the recognition rate for fine-grained images.

This study used the training method of a matching network to obtain the feature vector of the image and then fine-tuned the model on the support and target sets. The fine-tuning effect is shown through experimental comparison. Because mine water inrush appears as water flow or water droplets attached to the coal wall, and the coal wall is dark in color and has an uneven surface, recognizing when water is seeping from the surface of the coal wall is difficult. For feature extraction, previous few-shot recognition algorithms have used monolinear neural networks, which are prone to losing many image details. Therefore, images with such features were treated as a fine-grained classification problem.²⁶⁻³¹ Furthermore, bilinear neural networks were used for feature extraction during pretraining to enhance the feature channels of the images and recognize the “hanging sweat” feature of mine water inrush. Finally, we compared the feature extraction effects of bilinear convolutional neural networks based on Resnet18, Resnet50, and VGG16.

2. Methods

2.1. Network Structure

The few-shot task divides the data set into multiple meta-tasks in the training phase so that the model learns to generalize across changing categories. In the testing phase, a small number of samples in a new category are classified directly. In the training phase, C categories are selected from the source data set, along with a batch of data from each category, for similarity pretraining. The target set is the data set that needs to be identified. It is divided into C categories, and K samples from each category are selected as the model’s support set. Then, an image is randomly selected from the remaining target data as the prediction object of the model, called the query. This problem type is defined as C-way K-shot.
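The episode construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `sample_episode`, the dictionary layout, and the toy file names are all assumptions made for the example.

```python
import random


def sample_episode(data_by_class, c_way, k_shot, seed=None):
    """Sample one C-way K-shot episode.

    data_by_class: dict mapping a class label to a list of sample ids
                   (hypothetical layout chosen for this sketch).
    Returns (support, query): support maps each chosen class to K samples,
    and query is one (sample, label) pair drawn from the remaining samples.
    """
    rng = random.Random(seed)
    classes = rng.sample(sorted(data_by_class), c_way)
    support, remaining = {}, []
    for label in classes:
        items = list(data_by_class[label])
        rng.shuffle(items)
        support[label] = items[:k_shot]                      # K shots per class
        remaining += [(x, label) for x in items[k_shot:]]    # candidates for the query
    query = rng.choice(remaining)
    return support, query


# Toy target set with the three categories used in this study.
data = {
    "inrush": [f"inrush_{i}" for i in range(10)],
    "no_water": [f"dry_{i}" for i in range(10)],
    "other_water": [f"other_{i}" for i in range(10)],
}
support, query = sample_episode(data, c_way=3, k_shot=3, seed=0)
```

With `c_way=3` and `k_shot=3`, this corresponds to the 3-way 3-shot setting used for fine-tuning in Section 3.2.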

The main process of identifying mine water inrush images in this paper is as follows. First, the source data set is input into the bilinear neural network to obtain the corresponding feature vectors. The source data are divided into anchor, positive, and negative images; the cosine similarity of the positive and negative images with the anchor image is computed; and the mean squared error (MSE) loss is backpropagated to obtain the similarity weights of the network (Figure 1). Then, on the target set, the K shots and a query image are used as the model input, and the pretrained weights are loaded to obtain the K-shot and query feature vectors. The model is fine-tuned by comparing the cosine similarity of the shot and query feature vectors (Figure 2), using cross-entropy as the loss and backpropagation to update the parameters of the K-shot feature vectors. Finally, an image is selected from the remaining target set samples as the predicted image, and its feature vector is generated by the bilinear neural network loaded with the trained weights. The cosine similarity between this vector and the K-shot feature vectors is calculated, and the category with the highest score is taken as the prediction.
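The final prediction step can be illustrated with a small NumPy sketch. It assumes that the K shot features of each class are averaged into one class vector before comparison, which the text does not state; the function names and the toy feature values are likewise hypothetical.

```python
import numpy as np


def l2_normalize(v, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    return v / (np.linalg.norm(v, axis=axis, keepdims=True) + eps)


def predict(query_feat, support_feats):
    """Return (predicted class index, cosine score per class).

    support_feats: array of shape (C, K, D), K feature vectors per class.
    query_feat:    array of shape (D,).
    Assumption: each class is represented by the mean of its K shots.
    """
    class_vectors = l2_normalize(support_feats.mean(axis=1))   # (C, D)
    scores = class_vectors @ l2_normalize(query_feat)          # cosine similarity
    return int(np.argmax(scores)), scores


# Toy 3-way 3-shot example with 4-dimensional features.
support = np.stack([
    np.tile(np.eye(3, 4)[c], (3, 1)) + 0.01 * np.arange(12).reshape(3, 4)
    for c in range(3)
])
query = np.array([0.1, 0.9, 0.0, 0.1])
predicted, scores = predict(query, support)
```

Here the query lies closest to the second class direction, so `predicted` is 1.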

2.2. Bilinear Feature Extraction Network

Most current few-shot feature extraction processes use simple neural networks, such as Conv-4 networks with four convolutional layers, VGG networks, or ResNet networks. Because these simple CNN structures lose some features, the hanging sweat feature cannot be detected; therefore, in this few-shot learning process, a bilinear neural network was used to extract image features. The Bilinear-CNN consists of two identical CNNs, and eq 1 expresses the Bilinear-CNN as a four-element tuple function

where f_A and f_B are feature extraction functions based on a CNN, I is an image, and l is a pixel position in the image. Then, the outputs at all positions are summed and converted to a vector x

Then, moment normalization and L2 normalization were applied to x to obtain the bilinear feature vector z₁
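The definitions above match the Bilinear-CNN formulation of Lin et al.²⁵ As a sketch under that assumption, the chain from the four-element tuple to z₁ can be written as follows. The pooling function P and classification function C are taken from that formulation rather than from this text, the signed square root is an assumed reading of the moment normalization step, and no equation numbers are assigned because the original numbering cannot be recovered here.

```latex
\begin{aligned}
\mathcal{B} &= (f_A,\ f_B,\ \mathcal{P},\ \mathcal{C}) \\
\mathrm{bilinear}(l, I, f_A, f_B) &= f_A(l, I)^{\mathsf{T}}\, f_B(l, I) \\
\Phi(I) &= \sum_{l} \mathrm{bilinear}(l, I, f_A, f_B) \\
x &= \mathrm{vec}\big(\Phi(I)\big) \\
y &= \mathrm{sign}(x)\,\sqrt{|x|} \\
z_1 &= \frac{y}{\lVert y \rVert_2}
\end{aligned}
```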

The bilinear neural network uses the matrix outer product of the last convolutional layer of the two neural networks to fuse the two sets of feature networks, which permits greater retention of image features, thereby increasing the fine-grained recognition of images. Therefore, seepage features can be well identified in mine water inrush situations.

Figure 3 shows the structure of the bilinear neural network. Two ResNet18 networks were used as the two branches of the bilinear neural network, and each branch outputs 512 feature maps. According to eq 1, the final output of the model was 512 × 512 feature maps. After eqs 2–5 were applied, 512 × 512 parameters were obtained, which is too many final output parameters and makes convergence difficult when the MSE loss is used. Therefore, in this study, a fully connected layer (eq 6) was added after the output layer of the neural network, which not only ensured that the feature maps would not be lost but also reduced the number of parameters so that the model would converge during training.
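Section 3.3 notes that pairwise intersection of C feature maps per branch yields C² outputs, which the fully connected layer then reduces. A minimal NumPy sketch of that pooling-then-projection step follows. It uses toy sizes (8 channels per branch and a 4-dimensional projection) to stay light; the signed square root, the function name `bilinear_pool`, the random weights, and the projection size are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def bilinear_pool(feat_a, feat_b, eps=1e-12):
    """Bilinear pooling of two feature maps.

    feat_a: (Ca, H, W) and feat_b: (Cb, H, W) from the two branches.
    Returns a unit-length vector of size Ca * Cb.
    """
    ca, h, w = feat_a.shape
    cb = feat_b.shape[0]
    a = feat_a.reshape(ca, h * w)
    b = feat_b.reshape(cb, h * w)
    phi = a @ b.T                          # sum over positions of outer products
    x = phi.reshape(-1)                    # flatten to a vector
    y = np.sign(x) * np.sqrt(np.abs(x))    # signed square root (assumed normalization)
    return y / (np.linalg.norm(y) + eps)   # L2 normalization


rng = np.random.default_rng(0)
channels, out_dim = 8, 4                   # toy sizes; the paper uses 512 channels
feat_a = rng.standard_normal((channels, 7, 7))
feat_b = rng.standard_normal((channels, 7, 7))

z = bilinear_pool(feat_a, feat_b)          # length channels ** 2 = 64
fc_weight = rng.standard_normal((out_dim, z.size)) * 0.1   # hypothetical FC weights
embedding = fc_weight @ z                  # reduced feature vector of length out_dim
```

With 512 channels per branch, the pooled vector would have 512² = 262,144 entries, which is why a projection to a much smaller dimension helps the similarity loss converge.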

2.3. Cosine Similarity Based on the Bilinear Neural Network

As shown in Figure 4, the feature vector of each class is denoted as w_j (j = 1, 2, …, N), where N is the number of sample categories. The query image to be compared is denoted as x_ij (i = 1, 2, …, P), where P is the number of test images in this category. According to the formula for cosine similarity, we define
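For reference, the standard cosine similarity between a class vector and a query vector, which is presumably the definition intended here, is shown below in the notation of this section.

```latex
\cos(w_j, x_{ij}) = \frac{w_j^{\mathsf{T}}\, x_{ij}}{\lVert w_j \rVert_2 \,\lVert x_{ij} \rVert_2}
```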

2.4. Loss Function Calculation

Few-shot learning examines the similarity between two images and calculates the similarity loss function to update the network. Therefore, in this study, the similarity of the two images was obtained by calculating the cosine distance of the two feature vectors of the two images. To make the loss function reflect the distance error, MSE was used to calculate the loss function as follows

where n represents the total number of positive and negative samples, y_i is the cosine distance value, and t_ij is the target corresponding to the cosine value. Therefore, when the gradient is updated, the weight is updated in the direction of the positive sample, and the result is the similarity weight of the image.
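As a rough illustration of this loss, the sketch below computes the MSE between cosine values and their targets for one anchor image. The target values (1 for positive pairs, 0 for negative pairs) and the function names are assumptions; the text does not specify the targets.

```python
import numpy as np


def cosine(u, v, eps=1e-12):
    """Cosine similarity of two 1-D vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps))


def similarity_mse(anchor, positives, negatives, pos_target=1.0, neg_target=0.0):
    """MSE between cosine values y and targets t over all n pairs.

    pos_target and neg_target are assumed values, not taken from the paper.
    """
    y = np.array([cosine(anchor, p) for p in positives]
                 + [cosine(anchor, q) for q in negatives])
    t = np.array([pos_target] * len(positives) + [neg_target] * len(negatives))
    return float(np.mean((y - t) ** 2))


anchor = np.array([1.0, 0.0])
same = np.array([1.0, 0.0])
other = np.array([0.0, 1.0])

good_loss = similarity_mse(anchor, positives=[same], negatives=[other])
bad_loss = similarity_mse(anchor, positives=[other], negatives=[same])
```

When positives align with the anchor and negatives are orthogonal to it, the loss is close to zero; when the roles are swapped, the loss is close to 1, so the gradient pushes the features toward the positive sample.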

3. Experimental Analysis

Based on the small number of water inrush samples and the nonwater inrush samples collected, the data were divided into three categories: mine water inrush, water free, and industrial water (other water data sets). The samples were augmented. Then, the data were divided into a training set and a validation set in an 8:2 ratio; similarity training was performed on the training set, and model fine-tuning and data prediction were performed on the validation set. The recognition rates of the large-sample and few-shot learning methods were compared, as were the recognition rates of the bilinear and monolinear neural networks. Finally, the importance of the fully connected layer in the bilinear neural network was determined through experiments.

3.1. Data Set Analysis

The analysis of this study is divided into three parts. First, the role of the fully connected layer in the bilinear neural network is determined through the loss function curve. Second, based on the three convolutional neural networks ResNet-18, ResNet-50, and Vgg16, the accuracy of small-sample learning and large-sample learning is compared to determine the accuracy of the model. Finally, the monolinear and bilinear feature extraction effects of the three convolutional neural networks are compared through the CAM³² diagram, and the results are explained.

3.1.1. Data Set Division

During training and prediction, both the source and target data sets were divided into three categories: water inrush images, no water images, and images with water but not water inrush (water spraying along the working face, water in the roadway, etc.). This division was used because water accumulates in the wells and water sprays along the working face in situations other than water inrush (Figure 5), and such images are easily confused with mine water inrush. Therefore, to improve the model’s ability to generalize and to verify the category recognition, these images were assigned to a separate category.

In few-shot recognition, when the target and source data sets contain different types of data, the final prediction results of the model vary greatly. Therefore, both our source and target data sets used mine water inrush images to improve the model’s performance.

3.1.2. Data Augmentation

This study collected 160 images of water inrush, 73 images of no water, 39 images of mine shafts with minor water accumulation, and images of industrial water used in industrial production. Each image is an RGB color image. The mine water inrush and other water samples are relatively few compared with the non-inrush samples. This imbalance biased the model toward non-inrush cases during training, resulting in misjudgments and reducing the generalizability of the model.

To overcome these effects, the images were rotated and image noise was added to make each data set more balanced. This approach expanded the water inrush and other water data sets, enhancing the model’s discriminatory capability. The final number of data sets constructed is shown in Table 1.
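A minimal NumPy sketch of this kind of augmentation is given below. The paper does not state the rotation angles or the noise model, so the 90-degree rotation steps, the Gaussian noise, and the parameter values are assumptions chosen for illustration.

```python
import numpy as np


def augment(image, k_rot, noise_std, rng):
    """Rotate an RGB image by k_rot * 90 degrees and add Gaussian noise.

    image: uint8 array of shape (H, W, 3).
    Rotation in 90-degree steps and Gaussian noise are assumed choices.
    """
    rotated = np.rot90(image, k=k_rot, axes=(0, 1))
    noisy = rotated.astype(np.float32) + rng.normal(0.0, noise_std, rotated.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)   # keep valid pixel range


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(48, 64, 3), dtype=np.uint8)   # stand-in for a mine image
augmented = augment(image, k_rot=1, noise_std=8.0, rng=rng)
```

Applying several rotation and noise settings to each image in the smaller categories is one way to reach the roughly balanced counts reported in Table 1.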

Table 1. Data Set Distribution.

data set	before expansion	after expansion
no water	73	226
mine water inrush	160	225
other water	39	222


3.2. Evaluation Results

In our experimental setup, the commonly used feature extraction networks Vgg16, ResNet-50, and ResNet-18 were used to extract image features. Each model was trained for 4000 iterations, and the loss value during training was plotted for each model. The results are shown in Figure 6. For each model, the loss curves of the bilinear and monolinear extraction methods were compared. Because the bilinear neural network intersects the feature maps of the two branches during feature extraction, it has more parameters than the monolinear neural network. In stage I, the loss of the monolinear model drops faster, and its initial loss value is smaller. In stage II, the monolinear model converges very quickly. Because ResNet-50 has 49 convolutional layers, far more than Vgg16 and ResNet-18, its loss function converges slowly.

Figure 6. Loss changes for different models, including (a) Vgg16, (b) Resnet50, and (c) Resnet18.

The models were then compared on the validation set under different training methods. Table 2 summarizes their performance in three groups. The first group compares bilinear (B) and monolinear (M) results for each network under the large-sample (big data) training method (BL). The second and third groups report the accuracy of each network under monolinear few-shot learning (MF) and bilinear few-shot learning (BF), respectively. The effect of fine-tuning on the recognition rate is also compared.

Table 2. Accuracy of Different Training Methods.

methods	fine-tuning	accuracy
BL + M-Resnet-18	none	81.8%
BL + M-Vgg16	none	84.8%
BL + M-Resnet-50	none	84.8%
BL + B-Resnet-18	none	87.9%
BL + B-Vgg16	none	87.9%
BL + B-Resnet-50	none	88.1%
MF + Resnet-18	no	70%
MF + Vgg16	no	70%
MF + Resnet-50	no	72%
MF + Resnet-18	yes	90.5%
MF + Vgg16	yes	90.9%
MF + Resnet-50	yes	92.9%
BF + Resnet-18	no	71%
BF + Vgg16	no	73%
BF + Resnet-50	no	75%
BF + Resnet-18	yes	90.5%
BF + Vgg16	yes	92.2%
BF + Resnet-50	yes	95.2%


As shown in Table 2, when the data set is small, the fine-tuned few-shot models outperform ordinary large-sample learning. Under each training method, bilinear and monolinear feature extraction were compared; with the same network model, the bilinear variant was on average about 3 percentage points more accurate than the monolinear one. Fine-tuning has a large impact on the prediction results of the network. The data set is divided into three categories, and fine-tuning is performed in the 3-way 3-shot setting. The performance of the model improves significantly after fine-tuning.

As shown in the table, among the three models, the recognition rates of Vgg16 and ResNet-18 are very close, with little difference in performance. ResNet-50 has the highest recognition rate, reaching 88.1, 92.9, and 95.2% under the three training methods, respectively. According to these results, the few-shot training method based on the bilinear ResNet-50 model performs better than the other methods.
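The average gap between bilinear and monolinear extraction can be checked directly from the values in Table 2. The short script below does this arithmetic; the grouping labels and variable names are only for this example.

```python
# Accuracies (%) copied from Table 2, in the order ResNet-18, Vgg16, ResNet-50.
accuracy = {
    "large-sample (BL)": {"M": [81.8, 84.8, 84.8], "B": [87.9, 87.9, 88.1]},
    "few-shot, no fine-tuning": {"M": [70.0, 70.0, 72.0], "B": [71.0, 73.0, 75.0]},
    "few-shot, fine-tuned": {"M": [90.5, 90.9, 92.9], "B": [90.5, 92.2, 95.2]},
}

# Bilinear minus monolinear accuracy for each network, per training method.
gains = {
    name: [b - m for m, b in zip(group["M"], group["B"])]
    for name, group in accuracy.items()
}
mean_gain = {name: sum(v) / len(v) for name, v in gains.items()}

all_gains = [d for v in gains.values() for d in v]
overall_gain = sum(all_gains) / len(all_gains)

for name, value in mean_gain.items():
    print(f"{name}: {value:.2f} percentage points")
print(f"overall: {overall_gain:.2f} percentage points")
```

The overall mean is about 2.6 percentage points, in line with the roughly 3% figure quoted above; the gap is largest under large-sample training and smallest after fine-tuning.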

3.3. Influence of the Fully Connected Layer on the Prediction Results

Feature maps are intersected pairwise in bilinear neural networks, so the final output layer has C² feature maps, which leads to too many final parameters. With the MSE loss function used during training, the loss computed on the output of the bilinear neural network was difficult to converge. To solve this problem, a fully connected layer was added after the output layer of the bilinear neural network.

Doing so not only preserves the intersection of the feature maps but also reduces the number of parameters when calculating the loss, which makes convergence easier (Figure 7).

Figure 7. Prediction results with and without a fully connected layer, where epoch represents the number of iterations and loss is the loss function value of each iteration. (a) The scenario without a fully connected layer and (b) the scenario with an added fully connected layer.

3.4. Comparison of the Two Feature Extraction Methods for Different CNN Models

Before a mine water inrush, a few water droplets seep out. This seepage is a precursor to the water inrush. It shows little grayscale variation and occupies little image detail, so such small features are easily lost when image features are extracted. Bilinear neural networks can largely prevent this loss. To better compare the feature extraction results of bilinear and monolinear neural networks, CAM plots are used to visualize the extraction results.

As shown in Figures 8–10, the bilinear and monolinear methods are compared on the basis of the three models. Panel (a) shows the original test image, and panels (b) and (c) show the bilinear and monolinear CAM maps, respectively. In general, the darker the color on the activation map, the greater the weight and the more attention the model pays to that position. Whether water inrush has occurred is judged from the image features. For example, the fourth test image, which shows coal wall water inrush, contains two features: water and the coal wall. Water inrush must therefore be judged from both features. As another example, the third test image shows water spraying on the working face. If the model attends only to the water and the coal wall and ignores the sprinkler, the image is easily misjudged as mine water inrush. Therefore, the model must attend not only to the water and the coal wall but also to the sprinkler.
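For readers unfamiliar with CAM, the sketch below shows the standard class activation map computation: a weighted sum of the last-layer feature maps, clipped at zero and scaled to [0, 1]. How the weights are obtained for the cosine-similarity classifier used here is not described in the text, so the weights, the function name, and the toy feature maps are assumptions.

```python
import numpy as np


def class_activation_map(feature_maps, class_weights):
    """Standard CAM: weighted sum of feature maps for one class.

    feature_maps:  (C, H, W) activations of the last convolutional layer.
    class_weights: (C,) weights linking each channel to the target class
                   (assumed to be available; not specified in the paper).
    Returns an (H, W) map scaled to [0, 1].
    """
    cam = np.tensordot(class_weights, feature_maps, axes=1)   # (H, W)
    cam = np.maximum(cam, 0.0)                                # keep positive evidence
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy example: two channels, each responding at one location.
feature_maps = np.zeros((2, 4, 4))
feature_maps[0, 1, 2] = 5.0
feature_maps[1, 3, 3] = 2.0
weights = np.array([1.0, 0.5])
cam = class_activation_map(feature_maps, weights)
```

In this toy case the map peaks at position (1, 2) and shows a weaker response at (3, 3); upsampling such a map to the image size and overlaying it gives heat maps of the kind compared in Figures 8–10.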

Figure 8. Visualization results based on ResNet-50. (a) Original RGB image, (b) bilinear extraction result, and (c) monolinear extraction result.

Figure 9. Visualization results based on ResNet-18. (a) Original RGB image, (b) bilinear extraction result, and (c) monolinear extraction result.

Figure 10. Visualization results based on Vgg16. (a) Original RGB image, (b) bilinear extraction result, and (c) monolinear extraction result.

Figure 8 shows the activation maps of ResNet-50. The monolinear network concentrates on a single position of the image, which, as mentioned earlier, easily leads to misjudgment. The bilinear model focuses not only on water but also on coal walls and sprinklers. These features allow the model to determine that the image shows water spraying on the working face rather than mine water inrush. For ResNet-18 (Figure 9), the gap is less obvious, but the region of interest of the bilinear network is more precise. For Vgg16 (Figure 10), the activation map of the monolinear network is indistinct, making it difficult for the model to make judgments, whereas the bilinear activation map shows that the model focuses on a wider area. Together, these results show that the bilinear neural network represents details more accurately and that ResNet-50 characterizes image features more faithfully, resulting in a higher recognition rate.

4. Discussion

For water flow image recognition, many traditional image recognition algorithms³³,³⁴ use feature descriptors to extract image features after preprocessing. Such descriptors often fail to capture high-level semantic information and complex features, which results in low learning efficiency; moreover, each detection step is independent, with no global optimization scheme. When there are few training samples, deep learning algorithms designed for large data sets overfit, resulting in poor generalization at test time. The convolutional neural network is capable of representation learning: it can extract high-order features from the input and ensure the translation invariance of the features. However, for images with complex textures, it is difficult to ensure that features are not lost, so complex preprocessing of the images is required.

The collected samples clearly show that mine water inrush images are few in number, are captured in complex and dim environments, and have complex textures. The few-shot learning algorithm and bilinear feature extraction algorithm in this study are well suited to such problems. Although the bilinear neural network increases the number of parameters, the exported model does not lose calculation speed when making predictions.

The few-shot recognition algorithm classifies images by the cosine similarity of the features of two images. In essence, it compares the similarity of the two sets of features instead of directly judging which category an image belongs to. Because the bilinear neural network has two feature extractors, the channels of the two feature maps are multiplied in pairs so that features are not lost; as a result, the model pays more attention to the seepage area, which improves the final recognition rate.

5. Conclusions

Mine water inrush images are very rare in engineering, and few image samples are available. Additionally, the images contain disturbances, such as low underground light, construction water, surface water reflection, and coal wall seepage. This paper proposes a few-shot recognition algorithm based on a bilinear neural network that uses the strong fine-grained recognition capability of bilinear neural networks to recognize water inrush images in complex environments. The following conclusions are drawn.

(1) The few-shot recognition algorithm compares images and can effectively solve the problem of insufficient training samples. The recognition rates of large-sample and few-shot image recognition algorithms were compared, and the effectiveness of few-shot image recognition was verified.
(2) Image features were extracted using bilinear neural network feature map superposition. The heat map results indicate that the bilinear neural network can extract the features of underground water seepage with good accuracy and improve the feature vector. Because the MSE loss function was used in this study, the fully connected layer affected the final convergence result.
(3) This study provides a new idea and method for identifying water inrush in mines. The method is based on deep learning, which can effectively identify underground water inrush and improve mine production intelligence. However, this study compares only the water inrush and nonwater inrush situations, and further research on graph water characteristics is needed.

Acknowledgments

Thanks go to all authors for their contributions to the conception and design of the study. Material preparation, data collection, and analysis were performed by S.Z., Y.D., Y.Z., and L.Z. The first draft of the manuscript was written by S.Z., and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

This research was financially supported by the Major Research Instrument Development Project of the National Natural Science Foundation of China (42027801) and the China National Natural Science Foundation (42202283, 42202274, 41877186). The authors thank the editor and reviewers for their constructive suggestions.

The authors declare no competing financial interest.

Notes

Compliance with Ethical Standards: The authors have no potential conflicts of interest. Research does not involve human participants and/or animals. All authors have given informed consent. All authors consent to participate. All authors consent to publish. The data and materials are available.

References

Byun H.; Kim J.; Yoon D.; et al. A deep convolutional neural network for rock fracture image segmentation. Earth Sci. Inf. 2021, 14 (4), 1937–1951. 10.1007/s12145-021-00650-1. [DOI] [Google Scholar]
Jayapriya K.; Jacob I. J.; Darney P. E. Hyperspectral image classification using multi-task feature leverage with multi-variant deep learning. Earth Sci. Inf. 2020, 13 (4), 1093–1102. 10.1007/s12145-020-00485-2. [DOI] [Google Scholar]
Yue Z.; Yan B.; Liu H.; et al. An Effective Method for Underwater Biological Multi-Target Detection Using Mask Region-Based Convolutional Neural Network. Water 2023, 15 (19), 3507. 10.3390/w15193507. [DOI] [Google Scholar]
Srivastava H.; Sarawadekar K.. A Depthwise Separable Convolution Architecture for CNN Accelerator. In 2020 IEEE Applied Signal Processing Conference ASPCON; IEEE: Kolkata, India, 2020; pp 1–5.
Yang Y.; Yue J. H.; Li J.; et al. Mine Water Inrush Sources Online Discrimination Model Using Fluorescence Spectrum and CNN. IEEE Access 2018, 6, 47828–47835. 10.1109/ACCESS.2018.2866506. [DOI] [Google Scholar]
Sun E.; Nieto A.; Li Z.; et al. An integrated information technology assisted driving system to improve mine trucks-related safety. Safety Sci. 2010, 48 (10), 1490–1497. 10.1016/j.ssci.2010.07.012. [DOI] [Google Scholar]
Chen J.; Hu Y.; Chen S.; et al. Spatial Wave Measurement Based on U-net Convolutional Neural Network in Large Wave Flume. Water 2023, 15 (4), 647. 10.3390/w15040647. [DOI] [Google Scholar]
Wang J. H.; Lin G. F.; Chang M. J.; et al. Real-Time Water-Level Forecasting Using Dilated Causal Convolutional Neural Networks. Water Resour. Manage. 2019, 33 (11), 3759–3780. 10.1007/s11269-019-02342-4. [DOI] [Google Scholar]
Wu Z.; Ma B.; Wang H.; et al. Identification of sensitive parameters of urban flood model based on artificial neural network. Water Resour. Manage. 2021, 35 (7), 2115–2128. 10.1007/s11269-021-02825-3. [DOI] [Google Scholar]
Hrnjica B.; Bonacci O. Lake Level Prediction using Feed Forward and Recurrent Neural Networks. Water Resour. Manage. 2019, 33 (7), 2471–2484. 10.1007/s11269-019-02255-2. [DOI] [Google Scholar]
Zeng Y.; Meng S.; Wu Q.; et al. Ecological water security impact of large coal base development and its protection. J. Hydrol. 2023, 619, 129319 10.1016/j.jhydrol.2023.129319. [DOI] [Google Scholar]
Liu H.; Qiu Q.; Wu L.; et al. Few-shot learning for name entity recognition in geological text based on GeoBERT. Earth Sci. Inf. 2022, 979–991. 10.1007/s12145-022-00775-x. [DOI] [Google Scholar]
Liu Q.; Li J.; Li Y.; et al. Recognition Methods for Coal and Coal Gangue Based on Deep Learning. IEEE Access 2021, 9, 77599–77610. 10.1109/ACCESS.2021.3081442. [DOI] [Google Scholar]
Wang W. Z.; Wang Z. Characteristic analysis and recognition of coal and rock based on visual technology. Coal Technol. 2014, 33 (2), 272–274. [Google Scholar]
Zhang S.; Zhang M.. On Identification of Coal and Rock Images, 2018; IEEE, 2018. [Google Scholar]
Li J.; Yong Y.; Ge H.; et al. Generative adversarial nets in laserinduced fluorescence Spectrum image recognition of mine water inrush. Int. J. Distrib. Sens. Networks 2019, 15 (10), 1550147719884894. 10.1177/1550147719884894. [DOI] [Google Scholar]
Alfarzaeai M. S.; Niu Q.; Zhao J.. et al. Coal/Gangue Recognition Using Convolutional Neural Networks and Thermal Images. In IEEE Access; IEEE, 2020; pp 76780–76789.
Si L.; Xiong X.; Wang Z.; et al. A Deep Convolutional Neural Network Model for Intelligent Discrimination between Coal and Rocks in Coal Mining Face. Math. Probl. Eng. 2020, 2020, 2616510. 10.1155/2020/2616510. [DOI] [Google Scholar]
Wong A.; Yuille A.. One Shot Learning via Compositions of Meaningful Patches. In Proceedings of the IEEE International Conference on Computer Vision 2015; pp 1197–1205.
Finn C.; Abbeel P.; Levine S.. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. In International Conference on Machine Learning; PMLR, 2017; pp 1126–1135.
Koch G.; Zemel R.; Salakhutdinov R.. Siamese Neural Networks for One-shot Image Recognition ICML Deep Learning Workshop 2015; Vol. 2 (1), .
Li H.; Eigen D.; Dodge S.. et al. Finding Task-Relevant Features for Few-Shot Learning by Category Traversal. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2019; pp 1–10.
Atkeson C. G.; Moore A. W.; Schaal S. Locally Weighted Learning for Control. Artif. Intell. Rev. 1997, 11 (1–5), 75–113. 10.1023/A:1006511328852. [DOI] [Google Scholar]
Vinyals O.; Blundell C.; Lillicrap T. P.. et al. Matching Networks for One Shot Learning. In Advances in Neural Information Processing Systems 2016.
Lin T.; RoyChowdhury A.; Maji S. Bilinear CNN Models for Fine-Grained Visual Recognition. In Proceedings of the IEEE International Conference on Computer Vision ICCV; IEEE, 2015; pp 1449–1457.
Lin T. Y.; RoyChowdhury A.; Maji S. Bilinear Convolutional Neural Networks for Fine-Grained Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40 (6), 1309–1322.
Kong S.; Fowlkes C. Low-Rank Bilinear Pooling for Fine-Grained Classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition CVPR 2017; pp 365–374.
Sun Q.; Wang Q.; Zhang J.; et al. Hyperlayer bilinear pooling with application to fine-grained categorization and image retrieval. Neurocomputing 2018, 282, 174–183. 10.1016/j.neucom.2017.12.020.
Gao Y.; Beijbom O.; Zhang N.; et al. Compact Bilinear Pooling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016; pp 317–326.
Wei X.; Zhang Y.; Gong Y.; et al. Grassmann Pooling as Compact Homogeneous Bilinear Pooling for Fine-Grained Visual Classification. In Proceedings of the European Conference on Computer Vision ECCV 2018; pp 355–370.
Cui Y.; Zhou F.; Wang J.; et al. Kernel Pooling for Convolutional Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition CVPR 2017; pp 2921–2930.
Hacıefendioğlu K.; Demir G.; Başağa H. B. Landslide detection using visualization techniques for deep convolutional neural network models. Nat. Hazards 2021, 109 (1), 329–350. 10.1007/s11069-021-04838-y.
Lin F.; Chang W. Y.; Lee L. C.; et al. Applications of Image Recognition for Real-Time Water Level and Surface Velocity. In 2013 IEEE International Symposium on Multimedia; IEEE: Anaheim, CA, USA, 2013; pp 259–262.
Mettes P.; Tan R. T.; Veltkamp R. On the Segmentation and Classification of Water in Videos. In 2014 International Conference on Computer Vision Theory and Applications VISAPP; IEEE: Lisbon, Portugal, 2014.

