Heliyon. 2024 Mar 3;10(6):e26169. doi: 10.1016/j.heliyon.2024.e26169

Can public opinions improve the effect of financial early warning? -- an empirical study on the new energy industry

Ziya Yang a, Yucheng Zhu b,1, Jiaxin Chen a,1, Songyan Xie a,1, Cheng Liu c,
PMCID: PMC10965472  PMID: 38545220

Abstract

Public opinion significantly affects investor decision-making and stock prices, which ultimately has an impact on the long-term development of the new energy industry. This paper examines the impact of public opinion on the efficacy of financial risk early warning and seeks to establish an enhanced financial risk early warning model for new energy listed companies. To achieve this, we collect the financial data and public evaluation texts of 185 new energy listed companies and convert the text into sentiment indicators, which are combined with financial indicators to build a financial risk early warning model for new energy listed companies. The contributions of this paper are as follows: (1) Experimental validation demonstrates that combining 7 deep learning models with the Bagging algorithm improves the accuracy of the sentiment analysis model, which reaches 84.09%. (2) The accuracy of financial early warning models is generally enhanced after adding sentiment indicators; among them, the BP neural network model reaches an accuracy of 95.78%. (3) Through clustering analysis, the evaluation models can effectively divide the warning intervals, thereby bolstering the interpretability and applicability of early warning results. Therefore, we suggest that public opinion be taken into consideration when establishing a financial early warning system. Aside from improving the early warning effect, it can also be used as a separate indicator for daily monitoring.

Keywords: New energy, Early warning system, Sentiment analysis, Financial analysis, Data mining, Deep learning

1. Introduction

In recent years, the new energy industry has experienced unprecedented growth, playing a crucial role in achieving industrial transformation and reducing carbon emissions [1]. However, an increasing number of new energy enterprises have fallen into financial crises due to the lagging development of the industry, inefficient use of policy subsidy funds, and imperfect energy-related systems [2,3]. Instituting a robust early warning system could empower these companies to anticipate potential financial crises in advance. Nonetheless, existing early warning systems have many limitations. For instance, the typical early warning system relies exclusively on financial indicators, without taking into account the role of non-financial indicators. Moreover, these systems cannot monitor market conditions in real time, which restricts their outputs to semi-annual and annual reports. Besides, most studies rely on basic statistical models. These models not only lack precision but are also incompatible with pre-trained architectures, and they need to be refined whenever the data is updated.

At present, the mainstream financial early warning models can be divided into several categories, including statistical methods, deep learning models, neural network models, and combination models of multiple methods [4]. However, most of them have certain limitations. The single-variable early warning model focuses on one variable, which often leads to errors [5]. The logistic model, while mathematically explainable, relies heavily on historical data, which reduces its capacity for dynamic early warning. Besides, the logistic model makes strict assumptions about the samples that do not match reality well [6,7]. Other models such as the support vector machine (SVM) are accurate but not suited to every task, and the algorithm can be difficult to apply because training requires solving a quadratic programming problem, which is memory-intensive for complex tasks [8]. Compared to these, neural networks have strong adaptability, better generalization ability, and no strict constraints on data. They are widely used in bankruptcy forecasting, financial market forecasting, and securities markets [9]. Additionally, the division of enterprises is not objective: past early warning systems can only divide companies into crisis and non-crisis enterprises, which is not realistic [10], [11], [12].

The stock market remains acutely responsive to market sentiment. With the development of big data analysis, many studies now include sentiment indicators in financial early warning. Experiments show that sentiment indicators enhance the efficacy of financial early warning models [13]. Apart from improving model accuracy, sentiment analysis offers companies insight into their corporate image and helps identify potential PR crises; it can also feed automated weekly and monthly reports as a separate market warning indicator. Nonetheless, most research tends to rely on an emotion dictionary instead of a financial emotion binary dataset. While these dictionaries are user-friendly, they have inherent limitations, often leading to compromised accuracy [14]. Because deep learning can make full use of contextual information, multi-layered neural networks can be employed to extract data features and thus achieve better learning performance. Using pre-trained models saves subsequent development time, reduces the difficulty of use, and enhances model accuracy [15,16]. Consequently, our experiment adopts 7 deep learning models and combines them with the Bagging algorithm. In this way, both the accuracy of the sentiment model and the performance of the early warning model are improved.

Therefore, this paper delineates three objectives.

  • 1.

    Enhancing the accuracy of sentiment analysis models tailored to Chinese financial texts.

  • 2.

    Assessing the potential of investor emotion indicators in refining financial risk early warning models.

  • 3.

    Segmenting early warning intervals with greater precision.

Based on the above objectives, our strategy is to combine seven deep learning models with the Bagging algorithm for sentiment analysis and to add the sentiment indicators when building a financial early warning system. To segment the warning intervals, we deploy clustering techniques and the TOPSIS model. Concurrently, we compare different dimension reduction methods and early warning models. To this end, we collected the 2021 financial statements of 185 new energy listed companies and the related comment texts. After sentiment analysis, the public sentiment indicators were calculated and then added to the financial data for comparison. One dataset includes only the 24 financial indicators, and the other adds the public sentiment indicators to the financial indicators. The data after dimension reduction is used as the input of the classification models. Additionally, the cluster analysis model is used to divide the warning intervals so that accuracy can be calculated by comparing them with the classification models' results. Finally, our study culminates in a comprehensive financial evaluation, supplemented with strategic recommendations.

2. Literature review

2.1. Financial early warning model

2.1.1. Traditional statistical model

In 1932, Fitzpatrick built the first single-variable early warning model. In 1966, Beaver presented the first relatively complete, statistically based single-variable discriminant model and found that the cash flow to total debt ratio discriminated well over the entire five-year period, whereas the predictive ability of the current asset ratio was much weaker [17]. In 1968, Altman used 66 listed companies as samples to construct a multivariate discriminant model based on 22 initial financial ratios of these companies and applied it to bankruptcy identification and financial analysis, which led to the Z-score discriminant model [18]. In 1980, Ohlson began to use the logistic model to predict financial distress [19]. In 2020, Toan Luu Duc Huynh et al. employed transfer entropy, a non-parametric statistic, to explore the possible role of cryptocurrencies in financial modeling and risk management within the energy markets [20].

2.1.2. Machine learning model

The rise of computer science has seen an increase in the application of machine learning and deep learning to financial analysis. In 1990, Marcus D. Odom and Ramesh Sharda compared neural networks with multiple discriminant analysis in financial early warning. They introduced neural networks into financial early warning for the first time and demonstrated their superiority over discriminant analysis on smaller datasets [21]. Endri built four early warning models based on SVM to predict the delisting of Islamic stocks (ISSI) [22]. Li and Jingxiang used embedded feature selection and introduced the LASSO penalty into the standard SVM; according to their results, L1-SVM shows a better ability to eliminate redundant features when classifying [23]. Samitas et al. used SVM to build an early warning system (EWS) for financial crises and confirmed that an EWS with SVM can effectively trigger early warning signals [24]. Tan et al. compared the Probit model and the ANN model in predicting the financial distress of credit cooperatives and found that the ANN model outperforms the Probit model [25]. Romil Rawat et al. used machine learning models to detect Emotet malware infections and identify Emotet-related congestion flows in the finance industry since 2014 [26]. Sun, Xiaojun, and Yalin Lei divided the warning intervals through cluster analysis and the osculating value method and built a BP neural network early warning model for China's listed mining companies. This classification method helps people better understand the early warning level of enterprises, and the model achieved an accuracy rate of 87.5% [27]. Du, Guansan, Zixian Liu, and Haifeng Lu used a genetic algorithm to optimize the BP neural network. By dividing the warning interval using the 3σ rule, they achieved 97% accuracy [28].

2.2. Sentiment analysis

Sentiment analysis methods mainly include emotion dictionaries and machine learning [29]. Al-Ghuribi et al. used TF-IDF weighting based on emotion dictionaries to calculate the emotional intensity of text, achieving favorable results on large-scale unlabeled text [30]. Maqsood et al. extracted more than 5000 English words from stock-related evaluations on the Twitter platform, used the SentiWordNet dictionary to divide the Twitter vocabulary into positive, neutral, and negative emotions, and classified the emotions of the Twitter evaluations, though these methods face challenges from the emotional categories of texts [31]. Moreover, emotion dictionaries may not be updated in time to cover emerging words. Tim Loughran and Bill McDonald highlighted the inadequacy of general dictionaries for financial contexts: almost three-quarters of the words recognized as negative by a general dictionary are not usually considered negative in financial text [32]. However, Jie Wang et al. pointed out that data annotation in the financial field is difficult and expensive, which also hinders sentiment analysis models that rely on manually annotated financial text [33]. With the advent of deep learning, newer models improved text analysis capabilities. Bo Wang and Min Liu proposed a convolutional neural network based on a multi-layer attention mechanism for sentence relationship classification, which can better mine structured features in text [34]. Min Yang et al. proposed a Chinese-text sentiment analysis model based on ELMo and RNN, which uses the ELMo model to learn from the pre-training corpus. Experimental results show that ELMo-RNN can effectively improve the accuracy of text sentiment analysis [35].

2.3. Application of sentiment analysis in the financial field

Asur and Huberman's work suggested that stock market trends might be discerned from text data mining [36]. He Z et al. investigated the effect of investor risk compensation (IRC) on stock market returns and the role of investor sentiment in the link between IRC and stock returns, and found that investor sentiment has a significant impact on stock returns [37]. Z Jin et al. used investor sentiment and Empirical Mode Decomposition (EMD) to break down the complex stock price series and employed LSTM to analyze the relationships among time series data, confirming that investor sentiment can effectively improve the accuracy of closing price prediction [38]. Tang et al. proposed a Target-Dependent LSTM, which divides the input text into two parts and sends them to the LSTM according to the position of the target, improving on the performance of traditional LSTM models [39]. Salunkhel's research blended transfer learning and regression for sentiment prediction. The transfer learning method uses BERT with different regression methods, of which linear support vector regression performed best [40]. S Wu et al. proposed S–I-LSTM, a stock price prediction method that combines multiple data sources and investor sentiment [41]. AB Eliacik et al. proposed a method that uses a PageRank-based algorithm to identify influential users and analyze sentiment polarity in topic-based microblogging communities. The study validates the approach on real-world Twitter datasets and shows its correlation with the behavior of the stock exchange [42]. In 2022, Marco Ortu et al. employed four deep learning algorithms (MLP, CNN, LSTM, ALSTM) and three types of features (technical, transaction, and social indicators) to predict cryptocurrency prices. They used Bidirectional Encoder Representations from Transformers (BERT) to extract social media indicators. Their most important finding for hourly frequency classification with unrestricted models is that adding transaction and social media indicators effectively improves the average accuracy, precision, recall, and F1 score [43].

The new energy industry is developing rapidly, but its financial security remains a concern, and the existing financial risk early warning systems have the following problems. The first is the oversimplification of early warning indicators: when selecting them, researchers focus only on financial indicators and ignore the role of other indicators, such as investor sentiment indicators and macroeconomic indicators. The second is that the division of early warning intervals is not reasonable. Most previous studies divided companies into two categories (healthy and in crisis) based on the label of whether the company was specially treated. We consider this classification crude and not objective. Besides, the research above shows that public opinion has a significant impact on the stock prices of listed companies, and sentiment analysis of stock market texts is widely used in both stock forecasting and market supervision. However, these studies leave several problems unsolved. For instance, the amount of extracted text data is often too small, which lowers accuracy and makes the experiments unrepresentative. Besides, the vocabulary used to construct emotion lexicons is limited, whereas the financial vocabulary in practical use is highly specialized and emotionally varied. Finally, most of the research relies on a sentiment lexicon and a single model for analysis, which yields low accuracy and cannot be applied widely.

To address these concerns, we conduct sentiment analysis on stock market public opinion text using 7 deep learning models combined with the Bagging algorithm. The sentiment indicators are combined with traditional financial indicators to construct a more comprehensive financial early warning model for new energy enterprises. Given the limitations of statistical models, our focus remains on deep learning frameworks. Additionally, we propose a more nuanced company classification method using a cluster analysis model and the Entropy-TOPSIS evaluation model. We conclude with an empirical analysis of the early warning effect.

3. Materials and methods

3.1. Data

3.1.1. Sample selection

First, we obtained 192 stocks of new energy listed companies, updated on September 24, 2022, from the Huaxi Security website as the pre-selected sample. We subsequently gathered the 2021 financial statements for these entities from the Cninfo website, the QianZhan database, and the NetEase Finance website. However, because the statements of several companies were incomplete, our final sample consisted of 185 new energy companies. Given our objective of assessing the impact of public sentiment indicators on financial early warning systems, we also obtained the 2021 comments on these 185 listed companies from http://guba.eastmoney.com, as shown in Fig. 1. We then randomly selected comments on listed companies outside the sample, manually annotated their emotional valence as either positive or negative, and thereby created a Chinese financial sentiment binary dataset of 8392 comments. We also considered other popular websites where investors communicate and obtain information, such as XueQiu, Straight Flush, and Guotai Junan. However, we opted against these platforms because their texts are often long and mix relevant with irrelevant information, which complicates the analysis. Eastmoney, a large investor website in China, holds a huge volume of text data, and most of its text is at the sentence level, which is easier to collect and analyze and greatly reduces labor costs.

Fig. 1. Comments in the Eastmoney guba.

3.1.2. The indexes of financial early warning system

Based on previous research and our investigation of new energy listed companies, we decided to build the model from 6 aspects: solvency, operating capacity, profitability, development capacity, cash flow, and public opinions. The first five aspects are financial indicators. These 6 aspects and the related indicators are described below and listed in Table 1.

  • 1.

    Solvency: Solvency refers to the ability of an enterprise to repay various matured debts, which are generally divided into short-term solvency and long-term solvency. Timely repayment of matured debts stands as an important indicator for assessing a company's financial health. Analyzing solvency offers insights into a company's sustained operational capability, which is helpful to predict the future situation of enterprises.

  • 2.

    Operating capacity: Operating capacity reflects the status of capital turnover, which can be analyzed to understand the business status and management level of the enterprise. The turnover of funds is closely related to each link in the enterprise's supply, production, and sales. A problem in any link will affect the normal turnover of funds and, in turn, the profitability and financial status of the enterprise.

  • 3.

    Profitability: Profitability refers to the ability of an enterprise to obtain profits. It is an important business objective of an enterprise and the material basis for its survival and development. Beyond its relevance to the interests of business owners, it is an essential guarantee that the business can repay its debts.

  • 4.

    Development capacity: This aspect assesses a company's growth potential within its operations, including expansion of scale, sustained profit growth, and enhanced market competitiveness. By analyzing the development capacity of enterprises, we can deduce their development potential and predict their business prospects, thus providing an important basis for managers and investors to make decisions.

  • 5.

    Cash flow: This refers to the cash inflow, cash outflow and the total amount generated from operating activities, investment activities and financing activities. It can be utilized to evaluate the enterprise's operational health, the availability of cash to repay debts, and the asset's liquidity.

  • 6.

    Public opinions: Through sentiment analysis of text, we can obtain public sentiment indicators and understand how the public evaluates a company. Investors are very sensitive to the rise and fall of a company's stock price. And their perspectives reflect whether they are optimistic about a company, furnishing us with invaluable insights.

Table 1. Financial indicators.

Secondary indicators | Tertiary indicators
Solvency capacity | Equity ratio (x1)
Solvency capacity | Debt to equity ratio (x2)
Solvency capacity | Cash flow ratio (x3)
Solvency capacity | Current ratio (x4)
Solvency capacity | Quick ratio (x5)
Solvency capacity | Asset liability ratio (x6)
Operational capacity | Accounts receivable turnover rate (x7)
Operational capacity | Inventory turnover rate (x8)
Operational capacity | Current assets turnover rate (x9)
Operational capacity | Fixed assets turnover rate (x10)
Operational capacity | Total asset turnover rate (x11)
Profitability | Ratio of profits to cost (x12)
Profitability | Net assets per share (x13)
Profitability | Earnings per share (x14)
Profitability | Operating profit margin (x15)
Profitability | Net profit margin (x16)
Profitability | Gross profit margin (x17)
Profitability | Return on total assets (x18)
Development capacity | Operating revenue growth rate (x19)
Development capacity | Total asset growth rate (x20)
Development capacity | Net asset growth rate (x21)
Cash flow | Ratio of net operating cash flow to sales revenue (x22)
Cash flow | Return on operating cash flows of assets (x23)
Cash flow | Ratio of net operating cash flow to liabilities (x24)
Public opinions | Sentiment indicator 1 (x25)
Public opinions | Sentiment indicator 2 (x26)

3.2. Experimental framework

The entire experimental framework encompasses three parts, namely semantic analysis, data processing, and model construction, as shown in Fig. 2.

Fig. 2. Experimental flow chart.

Step 1. Semantic analysis: After data collection, the three initial datasets are formed: financial statement data, public comments on listed companies, and manually annotated data. Through the comprehensive evaluation of 7 deep learning models and the Bootstrap Aggregating algorithm [44], the sentiment indicators are obtained. Combining them with the financial statement data yields two datasets, one with sentiment indicators and one without. In this step, we explore how to enhance the accuracy of sentiment analysis models based on Chinese financial text.

Step 2. Data processing: This step consists of two parts, dimensionality reduction and clustering. To identify the superior method, the dimensionality of the two datasets was reduced by 9 models and the results were clustered by K-means [45]. The best result is used as the model input of the third step. Clustering categorized the 185 firms into five distinct classes. The Entropy-TOPSIS model [46] was employed to assess these classes and thereby determine the financial standing of each. This step is instrumental in dividing the warning intervals of the early warning model.

Step 3. Model construction: The results of the evaluation in Step 2 are used as labels. Decision tree [47], Naive Bayes [48], K-NN [49], SVM [50] and BP neural network [51] models are then used to conduct classification experiments. Comparing the output of each model with the labels gives the accuracy of every financial risk early warning model. In this step, we also explore the optimization effect of the investor sentiment indicators on the financial early warning model.
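The clustering part of Step 2 can be illustrated with a minimal K-means sketch in plain Python. It is not the authors' implementation: the function name `kmeans`, the naive initialization from the first k points, and the toy two-dimensional data are all assumptions made for illustration, whereas the paper clusters 185 firms into five classes after dimensionality reduction.

```python
def kmeans(points, k, iters=50):
    """Plain K-means; returns (labels, centroids).

    Assumption: centroids start at the first k points, which keeps the
    sketch deterministic. Real use would need a better initialization.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(dim) / len(members) for dim in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Toy data: two well-separated groups (hypothetical, not the paper's data).
points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
labels, centroids = kmeans(points, k=2)
```

The cluster labels produced this way carry no ordering by themselves, which is why the paper ranks the resulting classes with the Entropy-TOPSIS model before using them as warning levels.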

3.3. Method

3.3.1. Bootstrap aggregating

Originally proposed by Leo Breiman in 1996, Bootstrap aggregating (Bagging) is an ensemble learning algorithm that generates different predictors and combines them into an aggregate predictor; it is one of the most effective computationally intensive procedures for improving unstable estimators or classifiers [52]. The Bagging algorithm can be combined with other classification and regression algorithms to improve their accuracy and stability and to avoid overfitting by reducing the variance of the results. For sentiment analysis, we use the Bagging algorithm to make a comprehensive evaluation based on seven deep learning models.

As can be seen from Fig. 3, the training set of each individual weak learner in Bagging is obtained by random sampling. Random sampling is performed t times, yielding t sample sets, each with a sample size of m. The t sample sets are trained independently to obtain t base models; because the models do not depend on each other, they can be trained in parallel. Finally, the strong learner is obtained by applying an aggregation strategy to these t base learners. For regression, the aggregation is the average given in equation (1), where the number of base learners is written M. The following uses a regression function to illustrate the advantages of Bagging over a single model.

$$y_{com}(x) = \frac{1}{M}\sum_{m=1}^{M} y_m(x) \tag{1}$$

where $y_{com}(x)$ is the output of the aggregated predictor, $y_m(x)$ is the output of the $m$-th base learner, and $M$ is the number of base learners.
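As a rough illustration of bootstrap sampling and the averaging in equation (1), the sketch below trains several base learners on resampled data and aggregates them. Everything in it is an assumption for illustration: the base learner is a hypothetical 1-nearest-neighbour regressor rather than one of the seven deep learning models, the data is a toy set, and `majority_vote` shows the usual counterpart of averaging for class labels such as positive/negative sentiment.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def fit_nearest_neighbour(sample):
    """Hypothetical base learner: predicts the target of the closest x in its sample."""
    def predict(x):
        nearest = min(sample, key=lambda pair: abs(pair[0] - x))
        return nearest[1]
    return predict


def bagging_predict(models, x):
    """Equation (1): average the outputs of the M base learners."""
    return sum(model(x) for model in models) / len(models)


def majority_vote(labels):
    """Aggregation for class labels: the most frequent label wins."""
    return max(set(labels), key=labels.count)


rng = random.Random(0)
data = [(x, 2.0 * x) for x in range(10)]  # toy (x, y) pairs
models = [fit_nearest_neighbour(bootstrap_sample(data, rng)) for _ in range(7)]
y_com = bagging_predict(models, 3)        # aggregated prediction at x = 3
vote = majority_vote([1, 0, 1, 1, 0, 1, 0])
```

Using an odd number of voters, such as seven, avoids ties in a binary majority vote.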

Fig. 3. The structure of Bagging.

Assume that the regression function to be predicted is $h(x)$. The output of each model can then be written as the true value plus an error, as in equation (2), where $\epsilon_m(x)$ is the error of the $m$-th base learner.

$$y_m(x) = h(x) + \epsilon_m(x) \tag{2}$$

The expected squared error of a single model is given by equation (3),

$$E_x\left[\{y_m(x) - h(x)\}^2\right] = E_x\left[\epsilon_m(x)^2\right] \tag{3}$$

where the left-hand side is the expected squared error of a single model and $E_x[\epsilon_m(x)^2]$ is the variance of the error of the $m$-th base learner (for a zero-mean error).

The average of the expected squared errors of the $M$ models acting individually is given by equation (4).

$$E_{AV} = \frac{1}{M}\sum_{m=1}^{M} E_x\left[\epsilon_m(x)^2\right] \tag{4}$$

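The expected squared error of the aggregated predictor is shown in equation (5). Assuming that the errors have zero mean and are uncorrelated, the result shows that the average error of a model can be reduced by a factor of M simply by averaging M versions of the model, which means that Bagging reduces the variance. In practice the errors of the base learners are usually correlated, so the actual reduction is smaller.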

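$$\begin{aligned}
E_{com} &= E_x\left[\{y_{com}(x) - h(x)\}^2\right] \\
&= E_x\left[\left\{\frac{1}{M}\sum_{m=1}^{M} y_m(x) - h(x)\right\}^2\right] \\
&= E_x\left[\left\{\frac{1}{M}\sum_{m=1}^{M} \left(h(x) + \epsilon_m(x)\right) - h(x)\right\}^2\right] \\
&= E_x\left[\left\{\frac{1}{M}\sum_{m=1}^{M} \epsilon_m(x) + h(x) - h(x)\right\}^2\right] \\
&= E_x\left[\left\{\frac{1}{M}\sum_{m=1}^{M} \epsilon_m(x)\right\}^2\right] \\
&= \frac{1}{M} E_{AV}
\end{aligned} \tag{5}$$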
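The relation $E_{com} = E_{AV}/M$ can be checked numerically with a small Monte Carlo sketch. It assumes that the errors of the M base learners are independent draws from a standard normal distribution, which is the idealized case in which the last step of equation (5) holds; the function name `simulate` and all parameter values are illustrative choices.

```python
import random


def simulate(M, trials=20000, seed=0):
    """Estimate E_AV and E_com for M base learners with independent N(0, 1) errors."""
    rng = random.Random(seed)
    e_av = 0.0
    e_com = 0.0
    for _ in range(trials):
        errors = [rng.gauss(0.0, 1.0) for _ in range(M)]
        e_av += sum(e * e for e in errors) / M   # equation (4), one draw of x
        e_com += (sum(errors) / M) ** 2          # equation (5), one draw of x
    return e_av / trials, e_com / trials


e_av, e_com = simulate(M=7)
ratio = e_av / e_com   # close to M when the errors are uncorrelated
```

With correlated errors, for example if every learner shared a common error component, the ratio would fall below M, which is why the factor-of-M reduction is best read as an upper bound on the benefit.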

3.3.2. Entropy – TOPSIS

Since the actual financial status of new energy listed companies is unknown, the accuracy of the final classification model cannot be measured directly. Therefore, before establishing the early warning model, it is necessary to determine the actual financial status of these listed companies so that it can be compared with the output of the model. First, dimensionality reduction is performed on the two datasets to remove redundant features while retaining the information in the data. Next, K-means clustering is used to divide the 185 listed companies into 5 categories. These 5 categories are then evaluated and ranked as healthy, good, normal, mild warning, and serious warning.

For the selection of evaluation models, we chose Entropy-TOPSIS. TOPSIS was proposed by Hwang and Yoon in 1981, and the standard TOPSIS method attempts to obtain the scheme that is closest to the best point and farthest from the worst point. The entropy method is simply utilized to determine the weights of individual indicators [53]. TOPSIS requires that all indicators be positive, that is, the larger the value, the better the situation.

Hence, we need to convert negative indicators before calculating. After the negative indicators are processed, all data is normalized and the average value of each indicator over all listed companies in each category is calculated. Subsequent steps involve determining the optimal and worst values and measuring the distance between each category and these two points. Because all indicators have been made positive, the optimal value is the maximum of each indicator and the worst value is the minimum. Based on the results, we can conduct a comprehensive evaluation: the higher the comprehensive evaluation score, the better the financial situation of the company.

  • 1.

    The specific steps of the entropy weight method are as follows

Step 1. Standardization of indicators.

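Equation (6) is used for standardization of positive indicators and equation (7) is used for standardization of negative indicators.

$$D'_{i,j} = \frac{D_{i,j} - \min\{D_{1,j},\ldots,D_{n,j}\}}{\max\{D_{1,j},\ldots,D_{n,j}\} - \min\{D_{1,j},\ldots,D_{n,j}\}} \tag{6}$$

$$D'_{i,j} = \frac{\max\{D_{1,j},\ldots,D_{n,j}\} - D_{i,j}}{\max\{D_{1,j},\ldots,D_{n,j}\} - \min\{D_{1,j},\ldots,D_{n,j}\}} \tag{7}$$

where $D_{i,j}$ is the original value of indicator $j$ for scheme $i$, $D'_{i,j}$ is its standardized value, and $n$ is the total number of schemes.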

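Step 2. Calculate the information entropy of the evaluation index by equation (8).

$$H_i = -K\sum_{j=1}^{m} f_{ij}\ln f_{ij}, \quad i = 1, 2, \ldots, n \tag{8}$$

where $f_{ij} = D_{ij} / \sum_{j=1}^{m} D_{ij}$ is the proportion computed from the standardized values $D_{ij}$, $K = 1/\ln m$, $0 \le H_i \le 1$, and it is stipulated that $f_{ij}\ln f_{ij} = 0$ when $f_{ij} = 0$.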

Step 3. Calculate the weight of each indicator by equation (9).

$$w_i = \frac{1 - H_i}{n - \sum_{i=1}^{n} H_i}, \quad i = 1, 2, 3, \ldots, n \tag{9}$$
  • 2.

    The specific steps of the TOPSIS method are as follows:

Step 1. Standardization of indicators.

The formulas for standardization of indicators align with the entropy weight method, shown as equation (6) and equation (7).

Step 2. Determine the positive and negative ideal solutions of the evaluation object.

Compute the weighted standardized values by equation (10), then determine the positive ideal solution $R^+ = \{r_1^+, r_2^+, \ldots, r_n^+\}$ and the negative ideal solution $R^- = \{r_1^-, r_2^-, \ldots, r_n^-\}$. For positive indicators, the positive ideal solution is the maximum value and the negative ideal solution is the minimum value. For negative indicators, the positive ideal solution is the minimum value and the negative ideal solution is the maximum value.

$$r_{ij} = w_{ij} \cdot c_{ij} \tag{10}$$

where $w_{ij}$ is the weight of the indicator, $c_{ij}$ is the standardized value of indicator $j$ for scheme $i$, and $r_{ij}$ is that value after weighting.

Step 3. Calculate the Euclidean distance by equation (11) and equation (12).

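The Euclidean distance from each decision scheme to $R^+$ is denoted $S^+$, and the distance to $R^-$ is denoted $S^-$.

$$S_i^+ = \sqrt{\sum_{j=1}^{n}\left(r_{ij} - r_j^+\right)^2} \tag{11}$$

$$S_i^- = \sqrt{\sum_{j=1}^{n}\left(r_{ij} - r_j^-\right)^2} \tag{12}$$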

Step 4. Calculate the score by equation (13).

Calculate the proximity to the positive ideal solution, denoted $C_i^+$ $(i = 1, 2, \ldots, m)$. A larger $C_i^+$ indicates that the decision is closer to the positive ideal solution, and a smaller $C_i^+$ indicates that the decision is closer to the negative ideal solution.

$$C_i^+ = \frac{S_i^-}{S_i^- + S_i^+} \tag{13}$$
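The entropy weight and TOPSIS steps above can be combined in a short plain-Python sketch. It is an illustration under stated assumptions rather than the authors' code: rows of the matrix are schemes (here, company categories) and columns are indicators, the entropy of each indicator is taken over the schemes, and the function names and the toy data are hypothetical.

```python
import math


def standardize(matrix, positive):
    """Min-max standardization per indicator: equation (6) if positive, else (7)."""
    out_cols = []
    for col, is_pos in zip(zip(*matrix), positive):
        lo, hi = min(col), max(col)
        span = hi - lo  # assumed non-zero: the indicator must vary across schemes
        if is_pos:
            out_cols.append([(v - lo) / span for v in col])
        else:
            out_cols.append([(hi - v) / span for v in col])
    return [list(row) for row in zip(*out_cols)]


def entropy_weights(std):
    """Equations (8) and (9): one entropy and one weight per indicator."""
    m = len(std)         # number of schemes
    n = len(std[0])      # number of indicators
    k = 1.0 / math.log(m)
    entropies = []
    for j in range(n):
        col = [row[j] for row in std]
        total = sum(col)
        h = 0.0
        for v in col:
            f = v / total
            if f > 0:    # convention: f * ln(f) = 0 when f = 0
                h -= k * f * math.log(f)
        entropies.append(h)
    denom = n - sum(entropies)
    return [(1 - h) / denom for h in entropies]


def topsis_scores(std, weights):
    """Equations (10) to (13): closeness of each scheme to the positive ideal."""
    r = [[w * c for w, c in zip(weights, row)] for row in std]
    best = [max(col) for col in zip(*r)]
    worst = [min(col) for col in zip(*r)]
    scores = []
    for row in r:
        s_plus = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, best)))
        s_minus = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, worst)))
        scores.append(s_minus / (s_plus + s_minus))
    return scores


# Toy data: 3 schemes, 2 indicators; the second indicator is a negative one.
data = [[0.9, 0.2], [0.5, 0.5], [0.1, 0.8]]
std = standardize(data, positive=[True, False])
weights = entropy_weights(std)
scores = topsis_scores(std, weights)
```

In this toy case the first scheme is best on both indicators and the third is worst on both, so the scores order the schemes from best to worst, which is the kind of ranking used to name the five warning levels.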

3.3.3. Back propagation neural network

The BP neural network (back propagation neural network) is a multilayer feedforward network trained according to the error backpropagation algorithm, and it is one of the most widely used neural network models [54]. The core idea of the BP neural network is gradient descent, which minimizes the sum of squared errors between the actual output value and the desired output value [55]. The process is divided into two stages. The first stage is the forward propagation of information, from the input layer, through the hidden layer, and finally to the output layer. The second stage is the backpropagation of the error, from the output layer, through the hidden layer, and ultimately to the input layer, during which the weights and biases of each layer are adjusted.

For comparison, we also construct decision tree, Naive Bayes, K-NN, and SVM models.

Fig. 4 shows the structure of the BP neural network. The training process of the BP neural network includes the following steps.

Fig. 4. The structure of the BP neural network.

Step 1. Network initialization.

  • 1)

    Determine the input vector: $x = [x_1, x_2, \ldots, x_n]^T$ ($n$ is the number of neurons in the input layer).

  • 2)

    Determine the output vector $y$ and the desired output vector $o$: $y = [y_1, y_2, \ldots, y_q]^T$ ($q$ is the number of neurons in the output layer), $o = [o_1, o_2, \ldots, o_q]^T$.

  • 3)

    Determine the hidden layer output vector: $b = [b_1, b_2, \ldots, b_p]^T$ ($p$ is the number of neurons in the hidden layer).

  • 4)

    Initialize the connection weights from the input layer to the hidden layer: $w_{ij} = [w_{1j}, w_{2j}, \ldots, w_{tj}, \ldots, w_{nj}]^T$, $j = 1, 2, \ldots, p$.

  • 5)

    Initialize the connection weights from the hidden layer to the output layer: $w_{jk} = [w_{1k}, w_{2k}, \ldots, w_{tk}, \ldots, w_{pk}]^T$, $k = 1, 2, \ldots, q$.

Step 2. Calculation of the hidden layer output $b_j$.

  • 1)

    Calculate the activation value $s_j$ for each neuron in the hidden layer by equation (14).

$$s_j = \sum_{i=1}^{n} w_{ij}x_i - \theta_j, \quad (j = 1, 2, \ldots, p) \tag{14}$$

  • 2)

    Calculate the output value of hidden layer unit $j$. Substitute the activation value of equation (14) into the activation function in equation (15) to obtain the output value of hidden layer unit $j$.

$$b_j = f(s_j) = \frac{1}{1 + e^{-\sum_{i=1}^{n} w_{ij}x_i + \theta_j}}, \quad (j = 1, 2, \ldots, p) \tag{15}$$

where $w_{ij}$ is the weight from the input layer to the hidden layer and $\theta_j$ is the threshold value of the hidden layer unit.

Step 3. The calculation of the output layer $y_k$.

  • 1) Calculate the activation value $s_k$ of each neuron in the output layer by equation (16).

$$s_k=\sum_{j=1}^{p} w_{jk}b_j-\theta_k,\quad k=1,2,\ldots,q \tag{16}$$

  • 2) Calculate the actual output value $y_k$ for unit k of the output layer by equation (17).

$$y_k=f(s_k)=\frac{1}{1+e^{-\sum_{j=1}^{p} w_{jk}b_j+\theta_k}},\quad k=1,2,\ldots,q \tag{17}$$

where $w_{jk}$ is the weight from the hidden layer to the output layer and $\theta_k$ is the threshold of output layer unit k.
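The forward propagation in Steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the nested-list weight layout, and the toy parameter values are all hypothetical.

```python
import math

def sigmoid(s):
    """Logistic activation f(s) = 1 / (1 + e^(-s))."""
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, w_ih, theta_h, w_ho, theta_o):
    """Forward pass through one hidden layer, following equations (14)-(17).

    x       : n input values
    w_ih    : n x p weights, w_ih[i][j] links input i to hidden unit j
    theta_h : p hidden-layer thresholds
    w_ho    : p x q weights, w_ho[j][k] links hidden unit j to output unit k
    theta_o : q output-layer thresholds
    """
    n, p, q = len(x), len(theta_h), len(theta_o)
    # Equations (14)-(15): weighted sum minus threshold, then sigmoid
    b = [sigmoid(sum(w_ih[i][j] * x[i] for i in range(n)) - theta_h[j])
         for j in range(p)]
    # Equations (16)-(17): same computation on the hidden outputs
    y = [sigmoid(sum(w_ho[j][k] * b[j] for j in range(p)) - theta_o[k])
         for k in range(q)]
    return b, y

# Hypothetical toy network: 2 inputs, 2 hidden units, 1 output
b, y = forward(
    x=[1.0, 2.0],
    w_ih=[[1.0, 0.0], [0.0, 1.0]],
    theta_h=[1.0, 2.0],
    w_ho=[[1.0], [1.0]],
    theta_o=[1.0],
)
print(b, y)
```

In this toy setting every weighted sum equals its threshold, so each activation value is zero and each unit outputs f(0) = 0.5.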

Step 4. Determine whether the iteration is over.

Evaluate whether the error between the network output $y_k$ and the expected output $o_k$ meets the accuracy requirement. If it does, the network iteration ends. Otherwise, apply Step 5 and Step 6 to update the weights and thresholds, recalculate the network output by equations (14)–(17), and compare it with the expected value again. Repeat until the accuracy requirement is reached.

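Step 5. Weight update.

$$w_{ij}(t+1)=w_{ij}(t)+\eta\left[(1-\beta)D(t)+\beta D(t-1)\right],\quad i=1,2,\ldots,n \tag{18}$$
$$w_{jk}(t+1)=w_{jk}(t)+\eta\left[(1-\beta)D(t)+\beta D(t-1)\right],\quad j=1,2,\ldots,p \tag{19}$$

The weights are updated through equations (18) and (19), where $\eta>0$ is the learning rate, $D(t)=-\partial J/\partial w_{ij}(t)$ in equation (18) and $D(t)=-\partial J/\partial w_{jk}(t)$ in equation (19), and $\beta$ is the momentum factor with $0\le\beta\le 1$.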

Step 6. Threshold update.

Update $\theta_j$ and $\theta_k$ by equations (20) and (21), respectively, based on the error between the network output $y_k$ and the expected output $o_k$.

$$\theta_j(t+1)=\theta_j(t)+\eta\, b_j(1-b_j)\sum_{k=1}^{q} w_{jk}(o_k-y_k) \tag{20}$$
$$\theta_k(t+1)=\theta_k(t)+(o_k-y_k) \tag{21}$$
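The momentum-style weight update of equations (18) and (19) can be illustrated on a single scalar weight. The sketch below is only an illustration under stated assumptions: the toy error function J(w) = (w − 3)², the starting value, the hyperparameters, and the convention D(−1) = 0 are hypothetical and are not taken from the paper.

```python
def momentum_update(w, d_now, d_prev, eta, beta):
    """One step of equations (18)/(19):
    w(t+1) = w(t) + eta * [(1 - beta) * D(t) + beta * D(t-1)]."""
    return w + eta * ((1.0 - beta) * d_now + beta * d_prev)

def minimize(grad, w0, eta=0.1, beta=0.5, steps=200):
    """Iterate the update with D(t) = -dJ/dw evaluated at w(t)."""
    w, d_prev = w0, 0.0  # assumption: no previous descent direction at t = 0
    for _ in range(steps):
        d_now = -grad(w)
        w = momentum_update(w, d_now, d_prev, eta, beta)
        d_prev = d_now
    return w

# Hypothetical toy error J(w) = (w - 3)^2, so dJ/dw = 2 * (w - 3)
w_final = minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_final)
```

With these settings the iterate approaches the minimizer w = 3. Setting beta to 0 reduces the rule to plain gradient descent, while a larger beta gives more weight to the previous descent direction D(t−1).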

4. Results

4.1. Semantic analysis results

We collect the comments on 185 new energy listed companies from http://guba.eastmoney.com for the last quarter of 2021. Each company has approximately 16,000 comment sentences. Considering the particularity and professionalism of financial vocabulary, we also collect 8392 reviews of other listed companies and manually annotate them as positive or negative. We then divide the data into a training set, test set, and validation set in a 7:2:1 ratio.
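A 7:2:1 split of this kind can be sketched as follows. The shuffling step, the fixed seed, and the rounding of set sizes are assumptions made for illustration; the paper does not state how the partition was drawn.

```python
import random

def split_7_2_1(samples, seed=0):
    """Shuffle the samples and split them into training, test, and
    validation sets at a 7:2:1 ratio (the remainder goes to validation)."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # hypothetical seed for reproducibility
    n_train = int(len(items) * 0.7)
    n_test = int(len(items) * 0.2)
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    validation = items[n_train + n_test:]
    return train, test, validation

# Stand-in for the 8392 annotated reviews: integer ids instead of text
train, test, validation = split_7_2_1(range(8392))
print(len(train), len(test), len(validation))
```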

Fig. 5 below provides review samples. Despite the abundance of slang and specialized vocabulary, all comments can be split into sentences, which significantly simplifies manual labeling. For example, the comment “No fun, the empty dog!” represents a negative emotion; the "empty dog" refers to bears hired to deceive people into quickly selling their chips. Comments like “The market is red, why are you still green?” also represent a negative emotion: in China, "red" and "green" symbolize a rise and a fall in stock prices, respectively. During labeling, we often see comments such as "diving champion", "Start diving?!", "The little leeks were cut again", and "The poor leek doesn't know what happened yet". "Diving" refers to a sharp drop in a stock price, which falls as rapidly as a diver entering the water. "Leek" describes retail investors, who are in a disadvantageous position in the stock market: when major shareholders sell stocks at a high level, the money of retail investors is harvested like leeks.

Fig. 5. Examples of comments.

It is worth mentioning that most previous studies used panel data: they averaged the financial indicators over the previous three years, labeled each company by whether it had received special treatment (ST), and then modeled the relationship between the two for financial forecasting. This approach has two problems. First, because new energy is a fast-growing industry, the financial statistics of the previous three years cannot accurately reflect the current state of the businesses. Second, dividing all enterprises into only two broad categories (crisis and healthy) on the basis of the ST label is not realistic. In this paper, we divide the companies into five categories, so that when a mild crisis occurs, companies can act to stop it from getting worse. This accomplishes the same goal as forecasting, namely halting the progress of a crisis.

Seven deep learning models were trained on the dataset, namely MLP [56], RNN [57], LSTM [58], GRU [59], CNN + LSTM [60], BiLSTM [61], and TextCNN [62], and the accuracy of each single model was measured on the test set. The training set and test set are used to help train the models.

For classification models, a confusion matrix is commonly used to evaluate performance. Several evaluation indicators can be derived from the confusion matrix, and they assess the strengths and weaknesses of the classification results from different angles.

  • TP: The actual category of the sample is positive, and the model prediction is positive.

  • FN: The actual category of the sample is positive, but the model prediction is negative.

  • FP: The actual category of the sample is negative, but the model prediction is positive.

  • TN: The actual category of the sample is negative, and the model prediction is negative.

According to Fig. 6, the confusion matrix, we can calculate the following evaluation indicators.

  • Precision: The proportion of correctly predicted positive samples out of all predicted positives, which is calculated by equation (22). This metric assesses the accuracy of the detector in terms of successful detection.

  • Recall: The proportion of actual positive samples that are accurately predicted as positive, shown as equation (23). It evaluates the coverage of the detector across all objects subject to inspection.

  • Accuracy: The proportion of samples that the model predicts correctly out of the total number of samples. The value is calculated through equation (24).

Fig. 6. Confusion matrix.

F-Score: Given that there's often a trade-off between precision and recall, the F-Score serves as a comprehensive index to balance the two, calculated by equation (25). It evaluates the classification model more comprehensively. A larger F-score indicates a higher quality model.

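$$\mathrm{Precision}=\frac{TP}{TP+FP} \tag{22}$$
$$\mathrm{Recall}=\frac{TP}{TP+FN} \tag{23}$$
$$\mathrm{Accuracy}=\frac{TP+TN}{TP+FN+FP+TN} \tag{24}$$
$$F\text{-}score=\frac{(1+\beta^{2})\cdot\mathrm{Precision}\cdot\mathrm{Recall}}{\beta^{2}\cdot\mathrm{Precision}+\mathrm{Recall}} \tag{25}$$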
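As a worked illustration of equations (22)–(25), the following sketch computes the four indicators from the cells of a binary confusion matrix. The counts are hypothetical and are not results from this study; β = 1 gives the usual F1-score.

```python
def classification_metrics(tp, fn, fp, tn, beta=1.0):
    """Evaluation indicators derived from a binary confusion matrix."""
    precision = tp / (tp + fp)                    # equation (22)
    recall = tp / (tp + fn)                       # equation (23)
    accuracy = (tp + tn) / (tp + fn + fp + tn)    # equation (24)
    f_score = ((1 + beta ** 2) * precision * recall
               / (beta ** 2 * precision + recall))  # equation (25)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f_score": f_score}

# Hypothetical counts for 100 labeled comments
metrics = classification_metrics(tp=45, fn=5, fp=15, tn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Here precision (0.75) is lower than recall (0.90), and the F1-score (about 0.82) lies between the two, which is why it is used as a single balanced indicator.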

After training, we use the validation set to verify the models. MLP, RNN, LSTM, GRU, CNN + LSTM, BiLSTM, TextCNN, and the mixed model, which uses the Bagging algorithm to combine the seven deep learning models into a comprehensive evaluation, are each evaluated on the validation set. The classification results are shown in Fig. 7, and the results of cross-validation are shown in Fig. 8. From the confusion matrices, we calculate the evaluation metrics of the test set, which are shown in Table 2. The accuracy of each single model is close to 80%, ranging from 75.08% to 79.70%. Notably, the accuracy of the comprehensive evaluation using the Bagging algorithm reaches 84.09%. Hence, we selected the mixed model to carry out the subsequent experiments.

Fig. 7. Confusion matrices of the classification results.

Fig. 8. Cross-validation.

Table 2. Metrics of classification results.

| Models | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|
| BiLSTM | 0.82 | 0.80 | 0.81 | 0.7809 |
| TextCNN | 0.82 | 0.83 | 0.82 | 0.7941 |
| CNN + LSTM | 0.83 | 0.78 | 0.81 | 0.7819 |
| GRU | 0.82 | 0.80 | 0.81 | 0.7839 |
| LSTM | 0.83 | 0.83 | 0.83 | 0.7970 |
| MLP | 0.82 | 0.78 | 0.80 | 0.7719 |
| RNN | 0.81 | 0.75 | 0.78 | 0.7508 |
| Model mixed | | | | 0.8409 |
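The mixed model aggregates the outputs of the seven classifiers. One common way to do this is a majority vote over the predicted labels. The sketch below assumes that rule and uses made-up predictions; the aggregation actually used in this study may differ in detail.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one label per sample.

    predictions: list of equal-length label lists, one list per model.
    With an odd number of models and binary labels, ties cannot occur.
    """
    n_samples = len(predictions[0])
    return [
        Counter(model[i] for model in predictions).most_common(1)[0][0]
        for i in range(n_samples)
    ]

# Hypothetical outputs of seven models on four comments (1 = positive, 0 = negative)
model_outputs = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]
print(majority_vote(model_outputs))
```

A vote of this kind can be more accurate than any single voter when the individual models make partly different mistakes, which is consistent with the mixed model outperforming each single model in Table 2.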

Based on the mixed model, we conduct semantic analysis on the 185 companies and obtain positive and negative evaluations of their comment texts. For each company, the number of positive evaluations is P and the number of negative evaluations is N. Sentiment index 1 and sentiment index 2 are then calculated through equation (26) and equation (27). The sentiment indexes are added to the original financial indicator data for comparative analysis.

$$\text{sentiment index 1}=\frac{P-N}{P+N} \tag{26}$$
$$\text{sentiment index 2}=\frac{P}{P+N} \tag{27}$$
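The two indexes are simple ratios of comment counts. A minimal sketch follows; the counts are hypothetical, and the zero-comment guard is an added assumption.

```python
def sentiment_indexes(p, n):
    """Return (sentiment index 1, sentiment index 2) for one company.

    p: number of comments classified as positive
    n: number of comments classified as negative
    """
    total = p + n
    if total == 0:
        raise ValueError("at least one classified comment is required")
    index_1 = (p - n) / total  # equation (26), ranges from -1 to 1
    index_2 = p / total        # equation (27), ranges from 0 to 1
    return index_1, index_2

# Hypothetical company with 120 positive and 80 negative comments
index_1, index_2 = sentiment_indexes(120, 80)
print(index_1, index_2)
```

Index 1 measures the net balance of opinion and is negative when negative comments dominate, while index 2 is the share of positive comments. The two are linearly related (index 1 = 2 × index 2 − 1).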

4.2. Dimension reduction and clustering results

Dimension reduction chiefly aims to minimize redundancy while preserving the dominant data constituents, namely the feature vectors with significant differences in data distribution. To avoid data variability and reduce the complexity of the model, L1 regularization is applied first. Most financial early warning models are constructed using dimensionality reduction methods such as principal component analysis or factor analysis. Here, we use varied methods to explore the impact of different dimension reduction methods on the clustering effect.

We apply nine models, namely DNN [63], t-SNE [64], LLE [65], PCA [66], IPCA (Incremental PCA) [67], KPCA (Kernel PCA) [68], Factor Analysis [69], SparsePCA [70], and TruncatedSVD [71], to reduce the data from 24 dimensions (financial indicators only) and from 26 dimensions (with sentiment indicators) to 6 dimensions. K-means clustering is then carried out, and the results are shown in Fig. 9. The results indicate that the clustering performance of the Deep Neural Network (DNN) method is better than that of the others, as evidenced by the greater dispersion of data points among the five categories. In contrast, Factor Analysis, Kernel PCA (KPCA), SparsePCA, TruncatedSVD, and Locally Linear Embedding (LLE) exhibit comparatively lower clustering quality, with distinct categories lying closer together in the visual representation. According to the model comparison, DNN, realized as an AE (autoencoder), is chosen as the dimension reduction model, and the model parameters are shown in Table 3.
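The clustering step can be illustrated with a plain K-means (Lloyd's algorithm) implementation. The sketch below is a simplified stand-in: it uses toy two-dimensional points and two clusters rather than the six-dimensional reduced data and five categories of this study, and the choice of the first k points as initial centroids is an assumption made for determinism.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [list(p) for p in points[:k]]  # assumption: first k points as seeds
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical toy data with two well-separated groups
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```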

Fig. 9. The results of dimension reduction and clustering.

Table 3. Model parameters of DNN.

| Layer (type) | Output Shape | Param # |
|---|---|---|
| Input_1 (InputLayer) | [(None, 26)] | 0 |
| batch_normalization (BatchNo | (None, 26) | 104 |
| dense (Dense) | (None, 256) | 6912 |
| dense_1 (Dense) | (None, 128) | 32896 |
| dense_2 (Dense) | (None, 64) | 8256 |
| dense_3 (Dense) | (None, 10) | 650 |
| dense_4 (Dense) | (None, 6) | 66 |
| dense_5 (Dense) | (None, 10) | 70 |
| dense_6 (Dense) | (None, 64) | 704 |
| dense_7 (Dense) | (None, 128) | 8320 |
| dense_8 (Dense) | (None, 256) | 33024 |
| dense_9 (Dense) | (None, 26) | 6682 |
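The parameter counts in Table 3 follow from the layer widths: a fully connected layer with n_in inputs and n_out outputs has n_in × n_out weights plus n_out biases, and a batch normalization layer stores four values per feature. The short check below reproduces the table from the widths alone; the variable names are illustrative.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Layer widths read from the "Output Shape" column of Table 3:
# encoder 26 -> 256 -> 128 -> 64 -> 10 -> 6, decoder 6 -> 10 -> 64 -> 128 -> 256 -> 26
widths = [26, 256, 128, 64, 10, 6, 10, 64, 128, 256, 26]
dense_counts = [dense_params(a, b) for a, b in zip(widths, widths[1:])]

# Batch normalization keeps scale, shift, moving mean, and moving variance per feature
batch_norm_params = 4 * 26

print(dense_counts)
print(batch_norm_params + sum(dense_counts))
```

The 6-unit layer (dense_4) is the bottleneck of the autoencoder, and its output serves as the six-dimensional representation used for clustering.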

4.3. Results of early warning model

Through dimension reduction and clustering, the original 185 rows and 24 columns of financial indicator data become 5 rows and 6 columns. Likewise, the data with 185 rows and 26 columns (24 financial metrics plus 2 sentiment metrics) is also reduced to 5 rows and 6 columns. These correspond to 5 types of companies, each with 6 indicators. Entropy-TOPSIS requires all indicators to be forward, meaning that the higher the indicator value, the better; however, the direction of the reduced-dimension data is not straightforward to discern. Hence, we restore the indicators of the companies to the original 24 and 26 dimensions and evaluate them on this basis. The next step is to make all indicators positive and calculate the average value of each metric for each category. After determining the weight of each indicator and the best and worst points, we calculate the distance of each type of company from the positive ideal solution and from the negative ideal solution, and finally give a comprehensive score. The higher the score, the better the financial position of the company.
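The scoring procedure described above can be sketched as follows. This is a generic entropy-weighted TOPSIS under stated assumptions, not necessarily the exact variant used in this study: the min-max normalization, the Euclidean distance, the function names, and the toy matrix are all illustrative, and every indicator is assumed to be forward and non-constant across rows.

```python
import math

def min_max(matrix):
    """Rescale each column to [0, 1] (assumes forward, non-constant indicators)."""
    cols = list(zip(*matrix))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [[(v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs)]
            for row in matrix]

def entropy_weights(norm):
    """Entropy weight method: indicators that vary more across rows weigh more."""
    m = len(norm)
    divergences = []
    for col in zip(*norm):
        total = sum(col)
        # Terms with value 0 are skipped, since 0 * ln(0) is taken as 0
        entropy = -sum((v / total) * math.log(v / total)
                       for v in col if v > 0) / math.log(m)
        divergences.append(1.0 - entropy)
    return [d / sum(divergences) for d in divergences]

def topsis_scores(norm, weights):
    """Relative closeness to the ideal solution; higher is better."""
    weighted = [[w * v for w, v in zip(weights, row)] for row in norm]
    best = [max(c) for c in zip(*weighted)]
    worst = [min(c) for c in zip(*weighted)]
    scores = []
    for row in weighted:
        d_pos = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, best)))
        d_neg = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, worst)))
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Hypothetical category averages: 3 categories x 2 forward indicators
data = [[9.0, 8.0], [5.0, 6.0], [1.0, 2.0]]
norm = min_max(data)
weights = entropy_weights(norm)
scores = topsis_scores(norm, weights)
print(weights, scores)
```

In this toy matrix the first row is best on every indicator and the last row is worst on every indicator, so they receive scores of 1 and 0, with the middle row in between.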

The evaluation results of the five types of companies without the sentiment indexes were 0.6793, 0.5454, 0.5118, 0.3766, and 0.2616, respectively, corresponding to health, good, normal, mild warning, and serious warning. With the sentiment indexes added, the evaluation results are 0.6547, 0.5409, 0.5375, 0.3855, and 0.2516, respectively. The ranking of the five types is the same in both cases; only the scores differ. The ranking of listed companies and the number and proportion of each category are shown in Table 4.

Table 4. Financial status evaluation results.

| Level | I | II | III | IV | V |
|---|---|---|---|---|---|
| Financial status | Health | Good | Normal | Mild warning | Serious warning |
| Class of clustering | 0 | 1 | 2 | 3 | 4 |
| Number of each class | 83 | 30 | 40 | 21 | 11 |
| Proportion | 44.87% | 16.21% | 21.62% | 11.35% | 5.95% |

Finally, the evaluation results are used as labels, and on that basis we construct the early warning model. We comprehensively evaluate the performance of several classification algorithms, including Decision Trees, Gaussian Naïve Bayes, KNN, SVM, and BP neural networks. These models were chosen due to their distinct characteristics and applicability in financial forecasting tasks. The dataset was divided into training and test sets in a 7:3 ratio.

Results on the test set are shown in Table 5. Notably, our experimental analysis revealed the significant impact of incorporating sentiment indicators into the construction of our financial early warning model. Without sentiment indicators, the accuracy of the Decision Tree, KNN, Gaussian Naive Bayes, and BP neural network models ranged between 85% and 89%, while the SVM model achieved an accuracy of 90%. The inclusion of sentiment indicators improved the accuracy of all models, with each surpassing the 90% accuracy threshold. On average, the model accuracy witnessed an enhancement of approximately 5.34%. Among all the models, the BP neural network performed best, achieving an accuracy of 95.78% after the integration of sentiment indicators. This is particularly noteworthy given that previous studies reported an accuracy level of approximately 85%.

Table 5. Model comparison.

| Model | Without sentiment indexes | Added sentiment indexes |
|---|---|---|
| Decision Tree | 0.8827 | 0.9107 |
| KNN | 0.8571 | 0.9285 |
| Gaussian NB | 0.8763 | 0.9464 |
| SVM | 0.9007 | 0.9239 |
| BP | 0.8892 | 0.9578 |

5. Discussion

5.1. Model improvements

In this paper, we address three primary concerns of financial risk early warning models and establish an enhanced financial risk early warning model for new energy enterprises. The core aspects of these improvements are as follows.

  • 1) Firstly, how can we improve the effect of a sentiment analysis model based on Chinese financial text? Annotated Chinese public commentary text data for the financial domain is notably sparse. Compared with film reviews and shopping reviews, stock market texts contain much industry slang and many professional terms, and the types of emotions are more diverse. We use seven sentiment analysis models (MLP, RNN, LSTM, GRU, CNN + LSTM, BiLSTM, and TextCNN) together with the Bagging algorithm to convert public opinion text into sentiment indicators. In previous studies, emotional dictionaries and single models achieved an accuracy of roughly 65%–70%. Our approach, which combines seven deep learning models with the Bagging algorithm, achieves an accuracy of 84.09%.

  • 2) Secondly, do investor sentiment indicators improve the financial risk early warning model? Our experiments show that integrating sentiment indicators can enhance the model's accuracy. The BP neural network model had the highest accuracy, reaching 95.78%. In addition to improving the accuracy of the model, sentiment analysis can also be used on its own to monitor public opinion about a company, which enables automated regular reporting, saves labor costs, and helps prevent public relations crises. A stock market is highly susceptible to information: public opinion profoundly influences investor decisions, which in turn can have significant implications for stock prices.

  • 3) Thirdly, how can the early warning intervals be divided more accurately? Unlike previous studies that broadly categorized companies as ST or non-ST, our methodology offers a more nuanced approach. Traditional categorization methods directly link the past financial situation with the future financial situation, which leads to heuristic decision bias. We divide the companies into five categories through cluster analysis and score them comprehensively through the evaluation model. The advantage is that companies can be divided into more categories, which offers a more objective and comprehensive perspective. Previous divisions allowed only two categories, with no distinction in severity, and thus oversimplified complex situations.

5.2. Financial analysis of the new energy industry

Based on the experimental results, about 61% of the A-share new energy listed companies are in a good state or above, and 82.7% of the listed companies exhibit no signs of financial crisis. Therefore, we reckon that the industry as a whole has good resistance to financial risk and is developing stably.
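These two shares can be recovered from the class counts reported in Table 4 (83 healthy, 30 good, and 40 normal companies out of 185):

```latex
\frac{83+30}{185}=\frac{113}{185}\approx 61.1\%,
\qquad
\frac{83+30+40}{185}=\frac{153}{185}\approx 82.7\%
```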

Companies in the first level report an operating profit margin that exceeds the industry average by 2.09 percentage points and is 1.86 times that of companies in normal financial positions. Additionally, their net profit margin is 2.52 times higher than the industry average. Gross margin and return on total assets are also significantly higher than the average level. In terms of solvency, the current ratio and quick ratio of these healthy companies are 1.72 and 1.33, respectively. That is, each unit of current liabilities is covered by 1.72 units of current assets and by 1.33 units of quick assets. Meanwhile, the accounts receivable turnover rate and inventory turnover rate are also much higher than the industry average. The liquidity of accounts receivable is an important factor affecting the credibility of the quick ratio, which helps explain why the quick ratio of these listed companies is so good. While these enterprises earn high profits, they also attach great importance to repaying debts on time and to inventory management in order to improve the efficiency of capital utilization.

Based on an analysis of 11 other companies in severe financial crisis, profitability indicators and operating capacity indicators are the most distinguishable. From the perspective of profitability indicators, the factors leading to the serious financial crisis are net profit margin, operating profit margin and the ratio of profit to cost. Among the 11 companies, 5 have negative net profit margins and operating margins, and 4 have a negative profit to cost ratio. Other positive ones are also generally low, with the highest net profit margin at just 3.08%. This may be because new energy projects generally require large investments and have a long project cycle. If the investment fails, the consequences caused by the inability to recover the capital investment are more serious than in other industries. In addition, because the new energy industry is an emerging industry, iteration is fast. In the process of investment project implementation, the enterprise does not have a unified reference standard, the management system is not perfect, and the implementation process lacks effective supervision, which may lead to high costs, poor operating effect, and ultimately business failure.

At the same time, the short-term solvency of these enterprises is poor, and the current ratio and quick ratio are lower than the industry average. Consequently, if current assets are not realized in time, a financial crisis will result as capital turnover becomes problematic. Furthermore, their accounts receivable account for a relatively high proportion and are growing every year. Coupled with poor operating capacity, accounts receivable turnover days and inventory turnover days are getting longer and longer, making it easy to fall into a debt crisis. Follow-up research can focus on these two aspects, focusing on analyzing the root causes of enterprise crises, and studying how to better monitor the changes of these two indicators while building an early warning system.

5.3. Limitations and future prospects

One of the primary limitations of our study pertains to the application of sentiment analysis. We acknowledge that Transformer models, known for their state-of-the-art performance in natural language understanding, typically demand access to extensive training datasets to achieve optimal results. However, due to the limited scale of the dataset, we were unable to leverage more advanced text analysis models. Future research could continuously expand the dataset to apply more advanced models or explore ways to overcome data constraints. Transfer learning and domain adaptation may also enhance sentiment analysis in data-scarce financial contexts.

Another significant limitation pertains to the construction of our financial early warning system. In our approach, we employed classification models to carry out financial early warning. This limitation is notable because it restricts the system's capability to support vertical comparisons among companies within the same interval. Our model, while effective in providing risk assessments, does not offer fine-grained risk quantification. Therefore, we think that applying predictive models may make it feasible to precisely delineate thresholds and map the predicted scores of individual companies into distinct risk intervals. This approach could enable more nuanced and fine-grained comparisons among companies within the same category.

6. Conclusions

In this paper, drawing on the annual financial statements and associated reviews of 185 new energy companies listed in the A-share segment in 2021, we curated a dataset based on 24 financial indicators from 5 aspects. Through manual annotation, we randomly selected comments on other listed companies and labeled them with positive and negative sentiments, forming a unique Chinese financial sentiment binary dataset comprising 8329 comments. Sentiment analysis is carried out on this basis. Using seven deep learning models coupled with the Bagging algorithm, the accuracy on the validation set reaches 84.09%, a marked 15% improvement over preceding research. Two sentiment indicators are obtained through sentiment analysis and added to the financial indicator dataset for comparative experiments. According to the experimental analysis, the BP neural network model has the highest accuracy, reaching 95.78%. Adding sentiment indicators also increased the accuracy of the other models. However, our study still has certain limitations. Firstly, the scale of the Chinese financial sentiment binary dataset is still quite small, which limits the application of more advanced models such as Transformer. Secondly, in the construction of our early warning system, our choice of a classification model led to somewhat coarse results. Consequently, we propose that future research should continue to expand the dataset and use more precise models. Regarding the construction of early warning models, predictive models could be explored to enhance vertical comparability.

In light of our findings, we suggest that: while conducting regular annual report analysis, new energy listed companies should establish a financial early warning system. Dynamic early warning allows firms to detect financial crises in advance, enabling effective responses to prevent mild financial issues from escalating into severe crises. In addition, new energy companies should pay more attention to profitability indicators and operating capacity indicators, especially in contexts where accounts receivable metrics display unfavorable trends. If the proportion of accounts receivable continues to increase, and accounts receivable turnover continues to decline, managers should take measures.

Data availability statement

The data that support the findings of this study are available from the corresponding author, [CL], upon reasonable request.

CRediT authorship contribution statement

Ziya Yang: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Conceptualization. Yucheng Zhu: Validation, Software, Data curation. Jiaxin Chen: Writing – review & editing, Validation, Software, Data curation. Songyan Xie: Writing – review & editing, Software, Formal analysis, Data curation. Cheng Liu: Methodology, Investigation, Data curation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors of this study sincerely thank Professor Liu Cheng of Sichuan Agricultural University for his guidance.

Contributor Information

Ziya Yang, Email: 202009344@stu.sicau.edu.cn.

Yucheng Zhu, Email: 202008084@stu.sicau.edu.cn.

Jiaxin Chen, Email: 202108991@stu.sicau.edu.cn.

Songyan Xie, Email: 202109324@stu.sicau.edu.cn.

Cheng Liu, Email: liucheng@sicau.edu.cn.

References

  • 1.Zou Caineng, et al. Energy revolution: from a fossil energy era to a new energy era. Nat. Gas. Ind. B. 2016;3(1):1–11. [Google Scholar]
  • 2.Xu Bin, Lin Boqiang. Assessing the development of China's new energy industry. Energy Econ. 2018;70:116–131. [Google Scholar]
  • 3.Zhang Liang, et al. Based on information fusion technique with data mining in the application of finance early-warning. Procedia Computer Science. 2013;17:695–703. [Google Scholar]
  • 4.Wei Xianfu. A method of enterprise financial risk analysis and early warning based on decision tree model. Secur. Commun. Network. 2021:2021. [Google Scholar]
  • 5.Altman Edward I. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. J. Finance. 1968;23(4):589–609. [Google Scholar]
  • 6.Lyu Jincheng. Construction of enterprise financial early warning model based on logistic regression and BP neural network. Comput. Intell. Neurosci. 2022;2022 doi: 10.1155/2022/2614226. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Greene William H. Pearson Education India; 2003. Econometric Analysis. [Google Scholar]
  • 8.Horak Jakub, Vrbka Jaromir, Suler Petr. Support vector machine methods and artificial neural networks used for the development of bankruptcy prediction models and their comparison. J. Risk Financ. Manag. 2020;13(3):60. [Google Scholar]
  • 9.Yang Baoan, et al. An early warning system for loan risk assessment using artificial neural networks. Knowl. Base Syst. 2001;14(5–6):303–306. [Google Scholar]
  • 10.Yim Juliana, Mitchell Heather. Comparison of country risk models: hybrid neural networks, logit models, discriminant analysis and cluster techniques. Expert Syst. Appl. 2005;28(1):137–148. [Google Scholar]
  • 11.Shen Guicheng. The prediction model of financial crisis based on the combination of principle component analysis and support vector machine. Open J. Soc. Sci. 2014;2(9):204. [Google Scholar]
  • 12.Yi Wang. Z-score model on financial crisis early-warning of listed real estate companies in China: a financial engineering perspective. Systems Engineering Procedia. 2012;3:153–157. [Google Scholar]
  • 13.Jia Keliang, Li Zhinuo. 2020 International Conference on Computer Information and Big Data Applications (CIBDA) IEEE; 2020. Chinese micro-blog sentiment classification based on emotion dictionary and semantic rules; pp. 309–312. [Google Scholar]
  • 14.Wang Yingjie, et al. A review of the application of natural language processing in the field of text sentiment analysis. Comput. Appl. 2022;42(4):1011–1020. [Google Scholar]
  • 15.Zhang Lei, Wang Shuai, Liu Bing. Deep learning for sentiment analysis: a survey. Wiley Interdisciplinary Reviews: Data Min. Knowl. Discov. 2018;8(4):e1253. [Google Scholar]
  • 16.Dang Nhan Cach, María N. Moreno-García, and Fernando De la Prieta. "Sentiment analysis based on deep learning: a comparative study. Electronics. 2020;9(3):483. [Google Scholar]
  • 17.Beaver William H. Financial ratios as predictors of failure. J. Account. Res. 1966:71–111. [Google Scholar]
  • 18.Altman Edward I. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. J. Finance. 1968;23(4):589–609. [Google Scholar]
  • 19.Ohlson James A. Financial ratios and the probabilistic prediction of bankruptcy. J. Account. Res. 1980:109–131. [Google Scholar]
  • 20.Huynh T.L.D., Shahbaz M., Nasir M.A., et al. Financial modelling, risk management of energy instruments and the role of cryptocurrencies. Ann. Oper. Res. 2020:1–29. [Google Scholar]
  • 21.Odom Marcus D., Ramesh Sharda. 1990 IJCNN International Joint Conference on Neural Networks. IEEE; 1990. A neural network model for bankruptcy prediction; pp. 163–168. [Google Scholar]
  • 22.Endri E., Kasmir K., Syarif A. Delisting sharia stock prediction model based on financial information: support Vector Ma-chine. Decision Science Letters. 2020;9(2):207–214. [Google Scholar]
  • 23.Li Jingxiang, et al. Feature selection for support vector machine in the study of financial early warning system. Qual. Reliab. Eng. Int. 2014;30(6):867–877. [Google Scholar]
  • 24.Samitas A., Kampouris E., Kenourgios D. Machine learning as an early warning system to predict financial crisis. Int. Rev. Financ. Anal. 2020;71 [Google Scholar]
  • 25.Tan Clarence NW., Dihardjo Herlina. A study of using artificial neural networks to develop an early warning predictor for credit union financial distress with comparison to the probit model. Manag. Finance. 2001;27(4):56–77. [Google Scholar]
  • 26.Rawat R., Rimal Y.N., William P., et al. Malware threat affecting financial organization analysis using machine learning approach. Int. J. Inf. Technol. Web Eng. 2022;17(1):1–20. [Google Scholar]
  • 27.Sun Xiaojun, Lei Yalin. Research on financial early warning of mining listed companies based on BP neural network model. Resour. Pol. 2021;73 [Google Scholar]
  • 28.Du Guansan, Liu Zixian, Lu Haifeng. Application of innovative risk early warning mode under big data technology in Inter-net credit financial risk assessment. J. Comput. Appl. Math. 2021;386 [Google Scholar]
  • 29.Kalyani J., Bharathi P., Jyothi P. 2016. Stock Trend Prediction Using News Sentiment analysis[J] arXiv preprint arXiv:1607.01958. [Google Scholar]
  • 30.AL-Ghuribi S.M., Mohd Noah S.A., Tiun S. Unsupervised semantic approach of aspect-based sentiment analysis for large-scale user reviews. IEEE Access. 2020;8:218592–218613. [Google Scholar]
  • 31.Maqsood H., Mehmood I., Maqsood M., et al. A local and global event sentiment based efficient stock exchange forecasting using deep learning. Int. J. Inf. Manag. 2020;50:432–451. [Google Scholar]
  • 32.Loughran T., McDonald B. When is a liability not a liability? Textual analysis, dictionaries, and 10-ks. J. Finance. 2011;(66):35–65. [Google Scholar]
  • 33.Wang Jie, Xu Bingxin, Zu Yujie. Deep learning for aspect-based sentiment analysis." 2021 international conference on machine learning and intelligent systems engineering (MLISE) IEEE. 2021:267–271. [Google Scholar]
  • 34.Wang L., Cao Z., Melo G.D., et al. Proceedings of Meeting of the Association for Computational Linguistics; 2016. Relation Classification via Multi-Level Attention CNNs[C] pp. 1298–1307. [Google Scholar]
  • 35.Yang M., Xu J., Luo K., et al. Sentiment analysis of Chinese text based on Elmo-RNN model[C] J. Phys. Conf. 2021;1748(2) IOP Publishing. [Google Scholar]
  • 36.Asur Sitaram, Huberman Bernardo A. vol. 1. IEEE; 2010. Predicting the future with social media. (2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology). [Google Scholar]
  • 37.He Z., He L., Wen F. Risk compensation and market returns: the role of investor sentiment in the stock market. Emerg. Mark. Finance Trade. 2019;55(3):704–718. [Google Scholar]
  • 38.Jin Z., Yang Y., Liu Y. Stock closing price prediction based on sentiment analysis and LSTM. Neural Comput. Appl. 2020;32:9713–9729. [Google Scholar]
  • 39.Tang Duyu, et al. Effective LSTMs for target-dependent sentiment classification. 2015 arXiv preprint arXiv:1512.01100. [Google Scholar]
  • 40.Salunkhe Ashish, Mhaske Shubham. Aspect based sentiment analysis on financial data using transferred learning approach using pre-trained BERT and regressor model. Int. Res. J. Eng. Technol.(IRJET) 2019;6:1097–1101. [Google Scholar]
  • 41.Wu S., Liu Y., Zou Z., et al. S_I_LSTM: stock price prediction based on multiple data sources and sentiment analysis. Connect. Sci. 2022;34(1):44–62. [Google Scholar]
  • 42.Eliacik A.B., Erdogan N. Influential user weighted sentiment analysis on topic based microblogging community. Expert Syst. Appl. 2018;92:403–418. [Google Scholar]
  • 43.Ortu Marco, et al. On technical trading and social media indicators for cryptocurrency price classification through deep learning. Expert Syst. Appl. 2022;198 [Google Scholar]
  • 44.Breiman Leo. Bagging predictors. Mach. Learn. 1996;24(2):123–140. [Google Scholar]
  • 45.Kanungo Tapas, et al. An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 2002;24(7):881–892. [Google Scholar]
  • 46.Olson David L. Comparison of weights in TOPSIS models. Math. Comput. Model. 2004;40(7–8):721–727. [Google Scholar]
  • 47.Song Yan-Yan, Lu Ying. Decision tree methods: applications for classification and prediction. Shanghai Archives of Psychiatry. 2015;27(2):130. doi: 10.11919/j.issn.1002-0829.215044. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Rish Irina. An empirical study of the naive Bayes classifier. IJCAI 2001 workshop on empirical methods in artificial intelligence. 2001;3(22) [Google Scholar]
  • 49.Guo Gongde, et al. OTM Confederated International Conferences "On the Move to Meaningful Internet Systems". Springer; Berlin, Heidelberg: 2003. KNN model-based approach in classification. [Google Scholar]
  • 50.Cherkassky Vladimir, Ma Yunqian. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Networks. 2004;17(1):113–126. doi: 10.1016/S0893-6080(03)00169-2. [DOI] [PubMed] [Google Scholar]
  • 51.Jin Wen, et al. vol. 3. IEEE; 2000. The improvements of BP neural network learning algorithm. (WCC 2000-ICSP 2000. 2000 5th International Conference on Signal Processing Proceedings). 16th world computer congress 2000. [Google Scholar]
  • 52.Bühlmann Peter, Yu Bin. Analyzing bagging. Ann. Stat. 2002;30(4):927–961. [Google Scholar]
  • 53.Behzadian Majid, et al. A state-of-the-art survey of TOPSIS applications. Expert Syst. Appl. 2012;39(17):13051–13069. [Google Scholar]
  • 54.Li Jing, et al. Advances in Computer Science and Information Engineering. Springer; Berlin, Heidelberg: 2012. Brief introduction of back propagation (BP) neural network algorithm and its improvement; pp. 553–558. [Google Scholar]
  • 55.Jin Wen, et al. vol. 3. IEEE; 2000. The improvements of BP neural network learning algorithm. (WCC 2000-ICSP 2000. 2000 5th International Conference on Signal Processing Proceedings). 16th world computer congress 2000. [Google Scholar]
  • 56.Pinkus Allan. Approximation theory of the MLP model in neural networks. Acta Numerica. 1999;8:143–195. [Google Scholar]
  • 57.Williams Ronald J., Zipser David. A learning algorithm for continually running fully recurrent neural networks. Neural Comput. 1989;1(2):270–280. [Google Scholar]
  • 58.Gers Felix A., Jürgen Schmidhuber, Cummins Fred. Learning to forget: continual prediction with LSTM. Neural Comput. 2000;12(10):2451–2471. doi: 10.1162/089976600300015015. [DOI] [PubMed] [Google Scholar]
  • 59.Rana Rajib. Gated recurrent unit (GRU) for emotion classification from noisy speech. 2016 arXiv preprint arXiv:1612.07778. [Google Scholar]
  • 60.Wang Jin, et al. vol. 2. 2016. Dimensional sentiment analysis using a regional CNN-LSTM model. (Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Short Papers). [Google Scholar]
  • 61.Xu Guixian, et al. Sentiment analysis of comment texts based on BiLSTM. IEEE Access. 2019;7:51522–51532. [Google Scholar]
  • 62.Guo Bao, et al. Improving text classification with weighted word embeddings via a multi-channel TextCNN model. Neurocomputing. 2019;363:366–374. [Google Scholar]
  • 63.Li Guanpeng, et al. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. 2017. Understanding error propagation in deep learning neural network (DNN) accelerators and applications. [Google Scholar]
  • 64.Van der Maaten Laurens, Hinton Geoffrey. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008;9(11):2579–2605. [Google Scholar]
  • 65.Zhang Yiling, et al. A multitask multiview clustering algorithm in heterogeneous situations based on LLE and LE. Knowl. Based Syst. 2019;163:776–786. [Google Scholar]
  • 66.Martinez Aleix M., Kak Avinash C. PCA versus LDA. IEEE Trans. Pattern Anal. Mach. Intell. 2001;23(2):228–233. [Google Scholar]
  • 67.Jeong Dong Hyun, et al. iPCA: an interactive system for PCA-based visual analytics. Comput. Graph. Forum. 2009;28(3) Oxford, UK: Blackwell Publishing Ltd. [Google Scholar]
  • 68.Cao L.J., et al. A comparison of PCA, KPCA and ICA for dimensionality reduction in support vector machine. Neurocomputing. 2003;55(1–2):321–336. [Google Scholar]
  • 69.Rummel Rudolf J. Northwestern University Press; 1988. Applied Factor Analysis. [Google Scholar]
  • 70.Zou Hui, Hastie Trevor, Tibshirani Robert. Sparse principal component analysis. J. Comput. Graph Stat. 2006;15(2):265–286. [Google Scholar]
  • 71.Hansen Per Christian. The truncated SVD as a method for regularization. BIT Numerical Mathematics. 1987;27(4):534–553. [Google Scholar]

Associated Data


Data Availability Statement

The data that support the findings of this study are available from the corresponding author, [CL], upon reasonable request.

