Springer Nature - PMC COVID-19 Collection
2023 Jan 5:1–22. Online ahead of print. doi: 10.1007/s11063-022-11137-5

A Radial Basis Function Neural Network for Stochastic Frontier Analyses of General Multivariate Production and Cost Functions

Parag C Pendharkar
PMCID: PMC9815069  PMID: 36624804

Abstract

Production function techniques often impose functional form and other restrictions that limit their applicability. One common limitation in popular production function techniques is the requirement that all inputs and outputs must be positive numbers. There is a need to develop a production function analysis technique that is less restrictive in the assumptions it makes, and inputs it can process. This paper proposes such a general technique by linking fields of neural networks and econometrics. Specifically, two radial basis function (RBF) neural networks are proposed for stochastic production and cost frontier analyses. The functional forms of production and cost functions are considered unknown except that they are multivariate. Using simulated and real-world datasets, experiments are performed, and results are provided. The results illustrate that the proposed technique has broad applicability and performs equal to or better than the traditional stochastic frontier analysis technique.

Keywords: Radial basis functions, Stochastic frontier analysis, Neural networks, Production functions, Cost functions

Introduction

There are two popular techniques for production frontier analysis. The first is data envelopment analysis (DEA), which uses a non-parametric approach to fit a piecewise linear production frontier to a set of data points. Standard production function assumptions of monotonicity and convexity are made while fitting the production frontier. The second is an econometric approach that assumes a known production function form (Cobb–Douglas, Translog, etc.), a standard normal distribution for random errors in production (shocks), and a one-sided half-normal distribution for technical inefficiency. The econometric approach allows for the inclusion of statistical noise and production errors, which the DEA method does not allow. The DEA method, in turn, does not require knowledge of the production function form and measures both technical and scale inefficiencies. Besides these two primary approaches, many other approaches have been proposed in the literature, including Free Disposal Hull [1], stochastic non-smooth envelopment of data [2], Support Vector Frontiers [3] and kernel regression [4] approaches.

Neural networks (NN) are widely used for function approximation, density estimation, kernel regression, classification [5], data generation [6], deep learning [7] and interpolation problems [8]. The NN literature has matured over the last three decades and is also continuously growing [9]. The complementary nature of these two fields was noticed nearly two decades ago, and some studies have used the DEA to improve the forecasting capabilities of NN [10]. Other studies have used hybrid RBF neural networks, and DEA for classification [11], regression, segmentation and cluster analysis problems [12]. This paper illustrates that a simple combination of radial basis function (RBF) neural network and econometric frontier analysis approach yields a simple and powerful approach for stochastic frontier analysis (SFA). The approach is similar to kernel regression approaches [4] used in econometric literature [13]. The primary difference is that the production function estimation used in this approach is not a regression function, but it is a function learned by interpolation. The use of kernel regression can be incorporated into the proposed approach by using a standard normalization procedure. There are only two requirements for this technique to work. First, the production function should not be univariate and second, all data points must have unique values [14]. A reader may note that the DEA frontier may also be considered an interpolation frontier developed by using certain selected data points that define a convex set boundary. The overall proposed procedure is lightweight and is deployable as IoT using an approach similar to the deep learning application used in COVID-19 detection [15].

The novelty of the proposed approach, when compared to the existing popular production frontier analysis approaches, is as follows. The traditional DEA approach [16] assumes that data are generated from a common production process, that inputs and outputs are convex, that measurements contain no errors, that the relationship between inputs and outputs is monotonic, and that inputs and outputs take non-zero values [17]. When these assumptions are satisfied, DEA models perform very well. However, DEA models are often applied to datasets from service [18] and software industries [19], where most of these assumptions are either violated or not appropriate. When the assumptions are violated, the resulting DEA scores cannot be considered reliable. Even when all assumptions are satisfied, DEA models are still criticized for rating too many decision-making units (DMUs) as fully efficient [20]. The approach presented in this paper does not make the restrictive assumptions of the DEA approach. Both negative and positive data can be used in the proposed approach, and the approach does not rate too many DMUs as fully efficient (100% efficiency score). Popular econometric SFA assumes a predetermined production function form such as the Cobb–Douglas or Translog production functions [21]. The assumption of a predetermined production function form is even more restrictive than the DEA assumptions. The production function form of the SFA model requires computing logarithms of inputs and outputs, so negative values cannot be incorporated. Additionally, when predetermined production functions do not fit a dataset, all of the errors are attributed to random noise and all DMUs get an efficiency score of 1. Kernel regression approaches [4] are rarely used because they require solving an inconsistent set of equations with no guarantee of obtaining a solution.
The approach presented in this paper does not make any restrictive assumptions, does not use predetermined production function forms, and does not require solving an inconsistent set of equations. The approach is computationally efficient and will provide reasonably unbiased DMU efficiency estimates for any production function dataset, including datasets with either negative inputs or negative output. The provided framework is also general enough for future extensions and use of other heuristic procedures [22].

The rest of the paper is organized as follows. In Sect. 2, a basic introduction to the SFA and formulation of the RBF neural network model are provided. In Sect. 3, the RBF neural network model is applied to simulated and real-world datasets and results are reported. In Sect. 4, the paper concludes with a summary and provides directions for future research.

Stochastic Frontier Analysis Preliminaries and RBF Neural Network for Stochastic Frontier Analysis

Assume a firm using an input vector of n inputs, x, to produce an output y. If there are k such firms in an industry, then the production function for the industry may be written as follows:

$$y_i = f(x_i; \beta) + v_i - u_i, \tag{1}$$

where i = 1,…,k, β is a vector of technology parameters to be estimated, and $u_i \ge 0$ is the one-sided technical inefficiency term; the associated technical efficiency is the ratio of the current output of the ith firm to the maximum feasible output. The term $v_i$ is two-sided and represents statistical noise or random shocks [23]. If the production function is Cobb–Douglas [24], then the stochastic production function can be written as follows:

$$\ln q_i = \beta_0 + \sum_{j=1}^{n} \beta_j \ln z_{ji} + v_i - u_i. \tag{2}$$

The term $v_i$ is called statistical noise. It is assumed to be independent and identically distributed with zero mean and a two-sided normal distribution. The one-sided technical inefficiency term, $u_i$, may follow any one of several distributions, including half-normal, exponential and truncated normal distributions. The variables $q_i$ and $z_{ji}$ are output and input quantities.

If statistical noise and technical efficiency distributions are assumed to be independent of each other and input variables, then it is easy to define likelihood functions for the aforementioned predetermined distributions and obtain maximum likelihood estimates. For half-normal case [25], the expected value of ui can be computed using the following formula:

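$$E[u_i \mid e_i] = \frac{\sigma\lambda}{1+\lambda^2}\left[\frac{\varphi(e_i\lambda/\sigma)}{\Phi(-e_i\lambda/\sigma)} - \frac{e_i\lambda}{\sigma}\right], \tag{3}$$

where φ(·) is the standard normal density function, Φ(·) is the standard normal cumulative distribution function, $e_i = v_i - u_i$, $\lambda = \sigma_u/\sigma_v$, and $\sigma = \sqrt{\sigma_u^2 + \sigma_v^2}$. The technical efficiency, $TE_i$, is computed using the following formula:

$$TE_i = 1 - E[u_i \mid e_i]. \tag{4}$$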
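As a minimal sketch of Eqs. (3) and (4), the conditional mean can be evaluated with the Python standard library once $\sigma_u$ and $\sigma_v$ have been estimated (for example, by maximum likelihood). The function names below are illustrative and are not taken from the paper.

```python
import math

def norm_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_inefficiency_half_normal(e, sigma_u, sigma_v):
    """E[u_i | e_i] for the half-normal case, following Eq. (3)."""
    sigma = math.sqrt(sigma_u ** 2 + sigma_v ** 2)
    lam = sigma_u / sigma_v
    z = e * lam / sigma
    return sigma * lam / (1.0 + lam ** 2) * (norm_pdf(z) / norm_cdf(-z) - z)

def technical_efficiency(e, sigma_u, sigma_v):
    """TE_i = 1 - E[u_i | e_i], following Eq. (4)."""
    return 1.0 - expected_inefficiency_half_normal(e, sigma_u, sigma_v)
```

A more negative residual $e_i$ yields a larger expected inefficiency and hence a lower efficiency score, which is the behaviour the composed-error model is meant to capture.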

When the asymmetric distribution of ui is assumed to be exponential [25], the expected value of ui is computed using the following formula:

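$$E[u_i \mid e_i] = \left(e_i - \theta\sigma_v^2\right) + \sigma_v\,\frac{\varphi\!\left((e_i - \theta\sigma_v^2)/\sigma_v\right)}{\Phi\!\left((e_i - \theta\sigma_v^2)/\sigma_v\right)}, \tag{5}$$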

where θ = 1/σv.

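Some researchers have argued against the use of pre-determined half-normal and exponential distributions because both of these distributions have a mode at zero [26]. This mode of zero causes technical efficiency scores to be artificially high. To remedy this shortcoming, truncated normal distributions are often used. For these distributions, $u_i \sim \varphi^{+}(\mu, \sigma_u^2)$ and $v_i \sim \varphi(0, \sigma_v^2)$. For the truncated model [27], the expected value of $u_i$ is computed using the following formula:

$$E[u_i \mid e_i] = \frac{\sigma\lambda}{1+\lambda^2}\left[\frac{\varphi\!\left(\frac{e_i\lambda}{\sigma} + \frac{\mu}{\sigma\lambda}\right)}{\Phi\!\left(-\frac{e_i\lambda}{\sigma} + \frac{\mu}{\sigma\lambda}\right)} - \frac{e_i\lambda}{\sigma} + \frac{\mu}{\sigma\lambda}\right]. \tag{6}$$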

While truncated and two-parameter gamma distribution models remedy some of the deficiencies of half-normal and exponential models, these models increase the complexity of the parameter estimation process. This increase in complexity often outweighs any potential benefits these models bring. As a result, half-normal and exponential distribution models are more popular models in the SFA literature [26].

When the production function form is not known, an RBF neural network can be used to approximate the functional form of the production function from a given dataset. RBF neural networks are a broad set of procedures that can universally approximate any continuous function over a compact set [28]. The RBF approximates $f(x_i; \beta)$ using the following Green's function:

$$f(x_i; \beta) \approx \sum_{r=1}^{N} \beta_r\, G(\lVert x_i - t_r \rVert), \tag{7}$$

where N is the number of basis functions, $\beta_r$ is a set of weights, and $t_r$ are the basis function centers that are either randomly picked from the training data or are selected using cluster analysis techniques, such as K-means cluster analysis or the expectation-maximization algorithm. A popular choice for Green's function G(·) is the multivariate Gaussian function [29]:

$$\phi_r(x_i) = G(\lVert x_i - t_r \rVert) = \exp\!\left(-\gamma \lVert x_i - t_r \rVert^2\right), \tag{8}$$

where γ > 0 is a user-defined scalar that controls dispersion, and $x_i$ is the input vector under consideration. Figure 1 illustrates an RBF neural network for three inputs, one output and five hidden nodes or basis functions (N = 5). This network is a universal approximator of any multivariate continuous function for a sufficiently large number of hidden units [29]. What makes the RBF neural network useful for SFA is that the two layers of the network can be learned independently using different techniques [30]. Multiple outputs can be added, but a reader may note that the terms $v_i$ and $u_i$ will then be computed for each output separately. The issue of dealing with multiple outputs is considered out of the scope of the current research.
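The hidden-layer computation in Eqs. (7) and (8) can be sketched in a few lines of Python. This is an illustrative sketch rather than the paper's implementation; the names `gaussian_rbf_features` and `rbf_output` are assumptions, and the centers, γ and weights are taken as given.

```python
import math

def gaussian_rbf_features(X, centers, gamma):
    """Return phi_r(x_i) = exp(-gamma * ||x_i - t_r||^2) for every row x_i
    of X and every center t_r (Eq. 8). Rows index data points, columns
    index basis functions."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, t)))
         for t in centers]
        for x in X
    ]

def rbf_output(phi_row, beta):
    """Weighted sum of basis function values for one data point (Eq. 7)."""
    return sum(b * p for b, p in zip(beta, phi_row))
```

Each basis function equals 1 when the input coincides with its center and decays toward 0 as the input moves away, at a rate set by γ.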

Fig. 1.

An RBF for Stochastic Production Frontier Analysis

The number of hidden nodes in the RBF neural network is a design issue, and the size of the learning data sample is one limiting factor in selecting it. A decision-maker must weigh the number of parameters against the size of the dataset so that the hidden-to-output layer parameters can be estimated reliably. The number of parameters that need to be estimated is (N + 3), where N is the number of hidden nodes. The additional 3 parameters are the intercept and the terms $v_i$ and $u_i$. Since the dataset size is k, N should satisfy the constraint N ≤ (k − 3). When N satisfies this constraint, the problem of solving an inconsistent set of equations is avoided when computing the hidden-to-output layer parameters. The constraint N ≤ (k − 3) highlights the inherent limitation of applying traditional kernel regression procedures to learning stochastic frontier functions. Kernel regression approaches, such as those proposed by Fan et al. [4], cannot be incorporated into the RBF framework because kernel regression procedures require N = k. Generally, kernel regression solution approaches can only be inexact heuristic procedures because the number of parameters estimated in kernel regression stochastic frontier models is higher than the size of the dataset. The lower bound on the number of hidden nodes should be two times the number of input nodes [31]. The lower and upper bounds on the number of hidden nodes set a limit on the minimum number of examples required to build an RBF neural network model for SFA. Any time (2⋅n) > (k − 3), the dataset should be considered too small for RBF neural network analysis. The Cover theorem [31] suggests that the number of hidden nodes in an RBF network should not be greater than half the total number of examples in a training dataset. Thus, a reasonable heuristic for the number of hidden nodes is $N \in [2n, k/2]$, as long as $2n \le k/2$.
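The sizing heuristic above can be expressed as a small check. The sketch below is an assumption-laden illustration: the function name is hypothetical, and rounding k/2 down for odd k is a choice made here, not something the text specifies.

```python
def hidden_node_range(n_inputs, n_samples):
    """Return (lower, upper) bounds on the number of hidden nodes N.

    lower = 2 * n (twice the number of inputs)
    upper = k / 2 (half the number of examples, rounded down here)
    Raises ValueError when the dataset is too small for the heuristic.
    """
    lower = 2 * n_inputs
    upper = n_samples // 2
    if lower > n_samples - 3 or lower > upper:
        raise ValueError("dataset too small for RBF neural network analysis")
    return lower, upper
```

For example, `hidden_node_range(2, 30)` returns `(4, 15)` for a bivariate input and 30 examples.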

Once the exact number of hidden nodes is determined, the centers (tr) for the hidden nodes can be learned using cluster analysis techniques (K-Means or expectation-maximization algorithm). When the K-means cluster analysis procedure is used to find centers, additional information, such as the number of clusters with zero memberships, can be used to decide whether the number of hidden nodes selected in an RBF neural network is too high. Too many clusters with zero cluster memberships mean that the number of hidden nodes is larger than necessary.
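One way to obtain the centers and the cluster membership counts is a plain K-means (Lloyd's algorithm) loop. The sketch below is a generic standard-library implementation, not the software used in the paper; the initialization from randomly sampled data points and the handling of empty clusters (the old center is kept) are assumptions.

```python
import random

def _sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def _assign(X, centers):
    """Index of the nearest center for every data point."""
    return [min(range(len(centers)), key=lambda r: _sq_dist(x, centers[r]))
            for x in X]

def kmeans_centers(X, n_centers, n_iter=100, seed=0):
    """Return (centers, membership_counts) after K-means clustering."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(list(X), n_centers)]
    for _ in range(n_iter):
        labels = _assign(X, centers)
        updated = []
        for r in range(n_centers):
            members = [x for x, lab in zip(X, labels) if lab == r]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centers[r])  # empty cluster: keep old center
        if updated == centers:
            break
        centers = updated
    final_labels = _assign(X, centers)
    counts = [final_labels.count(r) for r in range(n_centers)]
    return centers, counts
```

The number of zero-membership clusters is then `counts.count(0)`, which can serve as the diagnostic described above for deciding whether N is too large.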

The output of the RBF neural network from Fig. 1 is similar to Eq. (2) when the log terms (i.e., ln(·)) on the right-hand side of the equation are replaced by radial basis functions (ϕ(·)), and when the dependent variable, $y_i$, replaces $\ln(q_i)$. Thus, the SFA procedure that is used to solve Eq. (2) can also be used to solve for the parameters on the lines connecting hidden-to-output nodes. The only difference between Eq. (2) and the output from the RBF neural network is that N > n. In other words, the number of parameters ($\beta_j$) to be estimated in the RBF neural network will be higher. The learning mechanisms for input-to-hidden nodes and hidden-to-output nodes are independent of each other. This independence allows for their implementation at different times, with different software, and even on different computers. The results will have to be combined in the end, but the RBF framework provides a lot of implementation flexibility. Standard SFA solution procedures can be used to solve for the hidden-to-output layer parameters [23]. This paper uses the standard half-normal distribution assumption for parameter $u_i$ (Eq. 3), and a two-sided zero-mean normal distribution assumption for parameter $v_i$.

In Eq. (1), the function $f(x_i; \beta)$ may be considered either an interpolation function or a regression function because the SFA problem is not a forecasting problem. The RBF function approach, discussed so far, was from an interpolation function point of view. The RBF neural network framework allows for the use of both interpolation functions and regression functions [32]. Kernel regression neural networks are called normalized RBF neural networks [33]. Such normalized RBF neural networks require the satisfaction of the following constraint:

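$$\sum_{i=1}^{k} \phi_r(x_i) = 1, \quad \text{for all } r \in \{1, \dots, N\}. \tag{9}$$

If $\bar{\phi}$ denotes the normalized kernel that satisfies Eq. (9), then the value of such a kernel for basis function $r \in \{1, \dots, N\}$ can be obtained by using the following formula:

$$\bar{\phi}_r(x_i) = \frac{\exp\!\left(-\gamma \lVert x_i - t_r \rVert^2\right)}{\sum_{i=1}^{k} \exp\!\left(-\gamma \lVert x_i - t_r \rVert^2\right)}. \tag{10}$$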
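A short sketch of the normalization in Eqs. (9) and (10) follows. As the equations are written, each basis function is divided by its sum over the k data points, so every column of the feature matrix sums to one; the function name is illustrative.

```python
import math

def normalized_rbf_features(X, centers, gamma):
    """Normalized Gaussian RBF values (Eq. 10).

    Rows index data points, columns index basis functions. Each column is
    divided by its sum over all data points, so it satisfies Eq. (9).
    """
    raw = [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, t)))
         for t in centers]
        for x in X
    ]
    n_centers = len(centers)
    col_sums = [sum(row[r] for row in raw) for r in range(n_centers)]
    return [[row[r] / col_sums[r] for r in range(n_centers)] for row in raw]
```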

The γ parameter controls dispersion, and its value decides if radial-basis functions are peaked or flat. To avoid selecting values that will lead to extremely peaked or flat distributions, the literature [34] suggests using the following value for γ.

$$\gamma = \frac{N}{\max \lVert t_s - t_r \rVert^2}, \quad s, r \in \{1, \dots, N\}, \text{ and } s \ne r. \tag{11}$$
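Eq. (11) can be checked against the cluster centers reported in Table 3 for Model 1 with N = 5. The sketch below uses only the standard library; the function name is illustrative.

```python
from itertools import combinations

def gamma_heuristic(centers):
    """gamma = N / max squared distance between any two distinct centers."""
    max_sq = max(
        sum((a - b) ** 2 for a, b in zip(s, t))
        for s, t in combinations(centers, 2)
    )
    return len(centers) / max_sq

# Cluster centers for Model 1, N = 5 (Table 3).
centers_n5 = [
    (11.469, 8.543),
    (10.298, 10.015),
    (11.713, 11.370),
    (8.866, 8.837),
    (9.370, 10.846),
]
gamma = gamma_heuristic(centers_n5)
print(round(gamma, 3))  # 0.344, the value reported in Table 3 for N = 5
```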

For stochastic cost efficiency, an analogous derivation of the dual cost problem [35] is as follows:

$$\ln c_i = \beta_0 + \beta_y \ln q_i + \sum_{j=1}^{n} \beta_j \ln p_{ji} + v_i + u_i, \tag{12}$$

where $c_i$ is cost, $q_i$ is output and $p_{ji}$ are input prices.

The stochastic frontier model for both production and cost function can be generalized into the following form:

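$$y_i = \beta_0 + \sum_{j=1}^{w} \beta_j x_{ji} + v_i - a\,u_i, \tag{13}$$

where

$$a = \begin{cases} 1, & \text{for production functions} \\ -1, & \text{for cost functions} \end{cases} \quad \text{and} \quad w = \begin{cases} n, & \text{for production functions} \\ n + 1, & \text{for cost functions.} \end{cases}$$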

Taking the production function from Eq. (2) and mapping its corresponding variables to Eq. (13) provides $y_i = \ln(q_i)$ and $x_{ji} = \ln(z_{ji})$. Performing a similar exercise for Eq. (12) and Eq. (13) provides $y_i = \ln(c_i)$ and $x_{ji} = \ln(p_{ji})$. The extra term in the value of w, for cost functions, is the term $x_{ji} = \ln(q_i)$ when j = w = n + 1. For the RBF neural networks, the ln(·) values will be replaced by ϕ(·) basis function values, and the values for n will be replaced by N, which is the number of radial basis functions.

Datasets, Experiments and Results

To test RBF neural network and normalized RBF models for SFA, simulated and real-world datasets were used. In particular, the following two models [4] were used to generate simulated datasets:

Model 1: $y_i = 1 + x_{1i} + x_{2i} + \epsilon_i$, and

Model 2: $y_i = 1 + \ln(1 + x_{1i}) + \ln(1 + x_{2i}) + \epsilon_i$,

where $\epsilon_i = v_i - u_i$, with bivariate regressors (their distribution is given only as an inline graphic in the original). In the hidden-to-output layer SFA algorithm, for the RBF shown in Fig. 1, a half-normal technical inefficiency distribution was used. Datasets were generated using Monte Carlo simulations with known technical efficiency. The true technical efficiency of each observation, $e^{-u_i}$, was compared with the predicted technical efficiency from the RBF models. Eleven datasets were generated; the first one was used for illustration purposes to facilitate reader understanding of the procedure, and the other 10 datasets were used for statistical analysis. All datasets had a sample size of 30 (i.e., k = 30). For bivariate inputs, the number of hidden nodes heuristic was computed with the lower bound on the number of hidden nodes as 4 and the upper bound as 15. In other words, N should be greater than 4 and less than 15. In the experiments conducted in this section, three values of N ∈ {5, 7, 10} were used. The reader can see in the following illustrative examples that higher values of N often lead to zero membership clusters in K-means cluster analysis. Additionally, the results show that values of N that are close to its lower bound are sufficient for accurate results.
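The simulation design can be sketched as follows. This is only an illustration of the structure of Models 1 and 2 with a half-normal inefficiency term: the regressor distribution and all numeric defaults (`x_mean`, `x_sd`, `sigma_v`, `sigma_u`) are hypothetical placeholders chosen here, not the settings used in the paper, and the function name is an assumption.

```python
import math
import random

def simulate_dataset(model, k=30, x_mean=10.0, x_sd=1.0,
                     sigma_v=1.0, sigma_u=1.0, seed=0):
    """Draw k observations from Model 1 or Model 2.

    All distributional settings are hypothetical defaults (assumptions).
    v is two-sided normal noise; u = |N(0, sigma_u^2)| is half-normal.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(k):
        x1 = rng.gauss(x_mean, x_sd)
        x2 = rng.gauss(x_mean, x_sd)
        v = rng.gauss(0.0, sigma_v)
        u = abs(rng.gauss(0.0, sigma_u))
        if model == 1:
            frontier = 1.0 + x1 + x2
        elif model == 2:
            frontier = 1.0 + math.log(1.0 + x1) + math.log(1.0 + x2)
        else:
            raise ValueError("model must be 1 or 2")
        rows.append({
            "x1": x1, "x2": x2, "v": v, "u": u,
            "y": frontier + v - u,
            "true_efficiency": math.exp(-u),  # known efficiency, e^{-u}
        })
    return rows
```

Because u is drawn explicitly, the true efficiency of every simulated observation is known and can later be compared with the model estimates.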

Tables 1 and 2 show the illustrative data generated for Model 1 and Model 2, respectively. The value of $y_i$ is not reported because it can be computed from the data provided. For example, the value of $y_i$ for Model 1 is $\vartheta^T h$, where $h^T = [1, x_{1i}, x_{2i}, v_i, -u_i]$ and $\vartheta^T = [1, 1, 1, 1, 1]$. The reported data are rounded to three decimal places. Tables 3 and 4 illustrate the input-to-hidden layer cluster analysis results for different values of N (i.e., $t_r$ vector values) for Model 1 and Model 2, respectively. As expected, with a higher value of N, cluster memberships get sparse, and for Model 1 with N = 10 there are a couple of zero membership clusters. In all experiments of this section, the value of γ is computed using Eq. (11). One noticeable difference between Model 1 and Model 2 is that the value of γ is fairly stable around 0.3 for Model 1, whereas it increases with N for Model 2. Once the $t_r$ vector values and γ are known, values of $\phi_r(x_i)$ and $\bar{\phi}_r(x_i)$ can be computed using Eqs. (8) and (10), respectively. Once these values are computed, the dataset is ready for the traditional SFA algorithm so that hidden-to-output layer analysis can be conducted.
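As a worked example of recovering the unreported output value, the first row of Table 1 can be plugged into the Model 1 relationship. The snippet is a small arithmetic check using the rounded table values.

```python
import math

# First row of Table 1 (Model 1): x1, x2, v, u.
x1, x2, v, u = 9.117, 10.825, 0.441, 1.181

theta = [1, 1, 1, 1, 1]
h = [1, x1, x2, v, -u]

# y_i is the inner product of theta and h, i.e. 1 + x1 + x2 + v - u.
y = sum(t * hi for t, hi in zip(theta, h))

# The true efficiency is e^{-u}.
true_efficiency = math.exp(-u)

print(round(y, 3), round(true_efficiency, 3))  # 20.202 0.307
```

The computed efficiency of 0.307 agrees with the last column of the first row in Table 1.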

Table 1.

Illustrative Monte-Carlo Dataset for Model 1

x1 x2 v u e^(−u) (True efficiency)
9.117 10.825 0.441 1.181 0.307
10.083 9.771 −0.205 6.529 0.001
9.396 10.262 −0.978 3.587 0.028
9.631 9.102 −1.573 1.096 0.334
9.162 7.843 −0.37 7.237 0.001
9.718 10.094 −0.221 2.221 0.109
13.57 9.049 −0.535 0.929 0.395
13.408 11.173 −0.262 1.626 0.197
11.147 11.735 −1.095 0.367 0.693
10.787 9.628 0.549 2.887 0.056
8.722 11.193 −2.175 0.762 0.467
9.415 10.954 0.145 5.675 0.003
9.381 8.585 0.669 3.661 0.026
10.535 9.967 −0.849 1.129 0.323
10.483 10.261 −0.057 0.157 0.854
10.108 10.523 0.292 5.862 0.003
9.66 10.677 −0.615 1.696 0.183
9.074 10.674 −0.014 0.233 0.792
10.005 10.678 0.412 0.242 0.785
11.139 9.687 −0.937 2.384 0.092
10.427 10.379 0.835 3.131 0.044
10.181 7.747 1.095 2.407 0.09
10.657 8.832 0.321 1.407 0.245
10.584 11.202 0.015 0.14 0.87
8.384 9.339 −1.165 1.06 0.346
10.019 10.328 0.495 3.77 0.023
9.574 11.508 −1.216 3.108 0.045
7.966 9.111 1.591 1.281 0.278
8.671 9.041 0.526 0.233 0.792
9.68 9.507 0.265 5.373 0.005

Table 2.

Illustrative Monte-Carlo Dataset for Model 2

ln(1 + x1) ln(1 + x2) v u e^(−u) (True efficiency)
2.456 2.460 −0.507 1.217 0.296
2.413 2.456 −1.483 5.749 0.003
2.548 2.414 −0.106 7.367 0.001
2.343 2.540 0.245 0.273 0.761
2.421 2.368 0.146 0.074 0.929
2.315 2.407 −1.118 0.600 0.549
2.324 2.344 −0.597 3.115 0.044
2.366 2.392 −0.679 1.588 0.204
2.455 2.462 0.955 6.651 0.001
2.569 2.280 −0.131 0.322 0.725
2.468 2.415 0.649 1.317 0.268
2.296 2.329 0.599 0.123 0.884
2.379 2.467 −0.193 1.093 0.335
2.346 2.201 −1.981 2.291 0.101
2.371 2.202 0.363 1.649 0.192
2.500 2.536 0.704 1.124 0.325
2.432 2.384 −0.753 4.977 0.007
2.484 2.201 −0.017 0.931 0.394
2.451 2.232 −0.442 1.860 0.156
2.323 2.390 1.701 3.074 0.046
2.479 2.426 −0.031 1.515 0.220
2.267 2.428 0.204 0.121 0.886
2.468 2.491 −1.048 0.764 0.466
2.451 2.428 −0.735 1.643 0.193
2.493 2.391 −1.357 5.030 0.007
2.260 2.489 0.729 3.519 0.030
2.379 2.570 0.607 0.567 0.567
2.400 2.400 −0.828 0.260 0.771
2.407 2.376 2.888 4.538 0.011
2.475 2.488 0.919 5.380 0.005

Table 3.

Input to Hidden Layers Cluster Centers, Cluster Memberships and γ (Model 1)

N x1 x2 Cluster Membership γ (using Eq. 11)
5 11.469 8.543 3 0.344
5 10.298 10.015 10 0.344
5 11.713 11.370 3 0.344
5 8.866 8.837 6 0.344
5 9.370 10.846 8 0.344
7 10.118 10.168 9 0.322
7 10.861 9.382 3 0.322
7 11.147 11.735 1 0.322
7 10.584 11.202 1 0.322
7 9.054 8.681 7 0.322
7 9.280 10.870 7 0.322
7 13.489 10.111 2 0.322
10 10.866 11.469 2 0.336
10 13.570 9.049 1 0.336
10 10.576 9.949 6 0.336
10 8.340 9.164 3 0.336
10 10.000 8.140 3 0.336
10 9.528 10.702 11 0.336
10 11.927 8.692 0 0.336
10 13.408 11.173 1 0.336
10 11.564 11.223 0 0.336
10 9.564 9.065 3 0.336

Table 4.

Input to Hidden Layers Cluster Centers, Cluster Memberships and γ (Model 2)

N x1 x2 Cluster Membership γ (using Eq. 11)
5 2.479 2.451 10 50.359
5 2.334 2.269 4 50.359
5 2.385 2.397 8 50.359
5 2.326 2.499 5 50.359
5 2.502 2.237 3 50.359
7 2.405 2.384 5 55.429
7 2.297 2.398 6 55.429
7 2.488 2.415 5 55.429
7 2.449 2.480 7 55.429
7 2.527 2.240 2 55.429
7 2.389 2.212 3 55.429
7 2.361 2.555 2 55.429
10 2.488 2.415 5 71.667
10 2.396 2.461 2 71.667
10 2.302 2.408 3 71.667
10 2.361 2.555 2 71.667
10 2.413 2.209 4 71.667
10 2.405 2.384 5 71.667
10 2.569 2.280 1 71.667
10 2.471 2.487 5 71.667
10 2.310 2.336 2 71.667
10 2.260 2.489 1 71.667

Tables 5 and 6 illustrate the results of experiments with illustrative example datasets for two models and two RBFs. The last row of the tables shows the root-mean-square (RMS) error between the true efficiency score columns shown in Tables 1 and 2 and estimated technical efficiency scores reported in different columns of Tables 5 and 6. The lower RMS values were obtained when either N = 5 or when N = 7. Figures 1 and 2 illustrate plots of true efficiency and estimated technical efficiency scores for RBF and Normalized RBF for different values of N for Model 1. Figures 3 and 4 show a similar plot for Model 2.
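The RMS error reported in the last row of Tables 5 and 6 can be reproduced directly from the tabulated columns. The sketch below does so for Model 1 with the RBF model and N = 5, using the rounded values from Tables 1 and 5; the function name is illustrative.

```python
import math

def rms_error(true_scores, estimated_scores):
    """Root-mean-square difference between two equally long score lists."""
    n = len(true_scores)
    return math.sqrt(
        sum((t - e) ** 2 for t, e in zip(true_scores, estimated_scores)) / n
    )

# True efficiency, last column of Table 1 (Model 1).
true_eff = [0.307, 0.001, 0.028, 0.334, 0.001, 0.109, 0.395, 0.197, 0.693,
            0.056, 0.467, 0.003, 0.026, 0.323, 0.854, 0.003, 0.183, 0.792,
            0.785, 0.092, 0.044, 0.090, 0.245, 0.870, 0.346, 0.023, 0.045,
            0.278, 0.792, 0.005]

# Estimated efficiency, RBF with N = 5, first score column of Table 5.
rbf_n5 = [0.400, 0.002, 0.014, 0.160, 0.000, 0.144, 1.000, 0.870, 0.051,
          0.191, 0.029, 0.004, 0.054, 0.185, 0.723, 0.003, 0.101, 0.682,
          1.000, 0.060, 0.080, 0.082, 1.000, 0.272, 0.051, 0.045, 0.011,
          0.201, 1.000, 0.015]

rms = rms_error(true_eff, rbf_n5)  # close to the 0.297 reported in Table 5
```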

Table 5.

RBF and Normalized RBF Technical Efficiencies for Model 1

k  RBF (N = 5)  RBF (N = 7)  RBF (N = 10)  Normalized RBF (N = 5)  Normalized RBF (N = 7)  Normalized RBF (N = 10)
1 0.400 0.600 0.811 0.400 0.603 0.810
2 0.002 0.002 0.012 0.002 0.002 0.012
3 0.014 0.014 0.014 0.014 0.014 0.015
4 0.160 0.102 1.000 0.158 0.102 1.000
5 0.000 0.000 0.003 0.000 0.000 0.003
6 0.144 0.129 0.268 0.143 0.130 0.274
7 1.000 0.369 1.000 1.000 0.366 1.000
8 0.870 1.000 1.000 0.878 1.000 1.000
9 0.051 0.438 0.388 0.052 0.444 0.380
10 0.191 0.216 0.243 0.189 0.216 0.244
11 0.029 0.120 1.000 0.029 0.120 1.000
12 0.004 0.004 0.007 0.004 0.004 0.006
13 0.054 0.045 0.315 0.054 0.044 0.312
14 0.185 0.239 0.389 0.184 0.239 0.399
15 0.723 1.000 1.000 0.717 1.000 1.000
16 0.003 0.004 0.004 0.003 0.004 0.004
17 0.101 0.095 0.101 0.101 0.095 0.099
18 0.682 1.000 1.000 0.683 1.000 1.000
19 1.000 1.000 1.000 1.000 1.000 1.000
20 0.060 0.067 0.017 0.059 0.067 0.018
21 0.080 0.109 0.096 0.080 0.109 0.097
22 0.082 0.501 1.000 0.082 0.503 1.000
23 1.000 1.000 1.000 1.000 1.000 1.000
24 0.272 0.519 0.362 0.276 0.513 0.359
25 0.051 0.095 0.038 0.051 0.095 0.039
26 0.045 0.046 0.067 0.044 0.046 0.068
27 0.011 0.017 0.179 0.011 0.017 0.182
28 0.201 1.000 0.569 0.202 1.000 0.532
29 1.000 1.000 1.000 1.000 1.000 1.000
30 0.015 0.010 0.075 0.014 0.011 0.075
RMS 0.297 0.296 0.377 0.298 0.296 0.377

Bold values indicate the lowest RMS error

Table 6.

RBF and Normalized RBF Technical Efficiencies for Model 2

k  RBF (N = 5)  RBF (N = 7)  RBF (N = 10)  Normalized RBF (N = 5)  Normalized RBF (N = 7)  Normalized RBF (N = 10)
1 0.377 0.815 0.617 0.421 0.965 0.473
2 0.001 0.003 0.003 0.001 0.003 0.002
3 0.001 0.000 0.001 0.001 0.000 0.003
4 1.000 1.000 0.580 1.000 1.000 0.396
5 1.000 1.000 0.999 1.000 0.627 0.794
6 0.066 0.208 0.129 0.063 0.242 0.603
7 0.014 0.020 0.013 0.014 0.018 0.014
8 0.049 0.121 0.102 0.048 0.093 0.111
9 0.007 0.016 0.012 0.008 0.019 0.009
10 0.922 0.281 0.622 0.759 0.224 1.000
11 0.933 1.000 1.000 1.000 1.000 1.000
12 1.000 1.000 1.000 1.000 1.000 1.000
13 0.264 0.940 1.000 0.278 1.000 1.000
14 0.023 0.028 0.045 0.024 0.027 1.000
15 0.569 0.972 0.998 0.580 1.000 0.469
16 1.000 1.000 1.000 1.000 1.000 1.000
17 0.003 0.004 0.004 0.003 0.003 0.003
18 1.000 1.000 0.981 0.870 1.000 1.000
19 0.367 0.471 0.254 0.348 0.481 0.218
20 0.096 0.268 0.152 0.092 0.283 0.484
21 0.455 0.472 0.435 0.494 0.520 0.517
22 0.364 0.719 0.996 0.328 0.972 1.000
23 0.388 0.794 0.428 0.427 1.000 0.599
24 0.150 0.247 0.254 0.162 0.244 0.169
25 0.003 0.002 0.002 0.003 0.002 0.004
26 0.029 0.022 0.120 0.027 0.023 0.496
27 1.000 0.992 0.343 0.962 0.897 0.953
28 0.227 0.467 0.546 0.230 0.336 0.342
29 0.144 0.189 0.192 0.144 0.120 0.137
30 0.028 0.052 0.029 0.031 0.066 0.043
RMS 0.299 0.349 0.317 0.296 0.378 0.358

Bold values indicate the lowest RMS error

Fig. 1.

Plot of RBF scores for different N values and True Efficiency Score

Fig. 2.

Plot of Normalized RBF scores for different N values and True Efficiency Score

Fig. 3.

Plot of RBF scores for different N values and True Efficiency Score

Fig. 4.

Plot of Normalized RBF scores for different N values and True Efficiency Score

For the illustrative example, since six different estimated efficiency scores are available (three for RBF and three for Normalized RBF), it is possible to average these scores and create an ensemble score. This ensemble score can then be compared with the true efficiency score values. The primary advantage of doing this is that it reduces the risk of an estimated efficiency score being drastically different from the true efficiency score. When ensemble efficiency scores were computed, the RMS errors between true efficiency scores and ensemble efficiency scores for Models 1 and 2 were 0.279 and 0.291, respectively. In the illustrative example, the RMS values for ensemble efficiency scores are slightly lower than the lowest RMS values reported in Tables 5 and 6. Although this is a fortunate occurrence, the primary objective of ensemble scores is lowering the error in estimating true efficiency scores, and the only guarantee is that the RMS value of the ensemble efficiency scores will not be the highest when compared to the individual estimators used in computing them [36]. Figures 5 and 6 illustrate the ensemble efficiency scores and true efficiency scores for the two models.
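The ensemble score described above is a simple per-observation average across estimators. The following sketch illustrates it; the function names are assumptions, and the toy numbers in the second half are made up purely to show the RMS property.

```python
import math

def ensemble_scores(score_columns):
    """Average the efficiency scores of several estimators per observation.

    score_columns is a list of equally long lists, one list per estimator.
    """
    return [sum(vals) / len(vals) for vals in zip(*score_columns)]

def rms_error(true_scores, estimated_scores):
    n = len(true_scores)
    return math.sqrt(
        sum((t - e) ** 2 for t, e in zip(true_scores, estimated_scores)) / n
    )

# First observation of Table 5: the six estimates for k = 1.
first_row = ensemble_scores([[0.400], [0.600], [0.811], [0.400], [0.603], [0.810]])

# Toy illustration (made-up numbers): the ensemble RMS error never exceeds
# the largest RMS error among the individual estimators.
truth = [0.5, 0.5]
estimators = [[0.4, 0.7], [0.8, 0.5]]
ensemble_rms = rms_error(truth, ensemble_scores(estimators))
worst_rms = max(rms_error(truth, est) for est in estimators)
```

Here `first_row[0]` is the ensemble score for the first observation of Model 1, and `ensemble_rms <= worst_rms` holds for any set of estimators because the error of an average is the average of the errors.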

Fig. 5.

Plot of True and Ensemble Efficiency Scores for Model 1

Fig. 6.

Plot of True and Ensemble Efficiency Scores for Model 2

Ten additional sample replications were considered for the two models, and Tables 7 and 8 illustrate the RMS errors between true efficiency and estimated efficiency for different models and different RBF techniques. The best RMS error for each sample is underlined in the Tables.

Table 7.

RBF and Normalized RBF RMS Errors for Model 1

Replication sample  RBF (N = 5)  RBF (N = 7)  RBF (N = 10)  Normalized RBF (N = 5)  Normalized RBF (N = 7)  Normalized RBF (N = 10)
1 0.332 0.356 0.386 0.332 0.357 0.399
2 0.393 0.580 0.462 0.316 0.713 0.464
3 0.833 0.835 0.832 0.835 0.835 0.835
4 0.345 0.435 0.425 0.345 0.436 0.425
5 0.279 0.286 0.385 0.279 0.286 0.376
6 0.415 0.395 0.492 0.416 0.396 0.494
7 0.353 0.367 0.406 0.353 0.367 0.398
8 0.407 0.370 0.491 0.407 0.372 0.507
9 0.379 0.474 0.493 0.367 0.477 0.495
10 0.399 0.422 0.511 0.403 0.422 0.523

Table 8.

RBF and Normalized RBF RMS Errors for Model 2

Replication sample  RBF (N = 5)  RBF (N = 7)  RBF (N = 10)  Normalized RBF (N = 5)  Normalized RBF (N = 7)  Normalized RBF (N = 10)
1 0.311 0.300 0.347 0.311 0.300 0.348
2 0.390 0.591 0.393 0.386 0.636 0.385
3 0.807 0.292 0.807 0.806 0.291 0.807
4 0.309 0.212 0.394 0.308 0.214 0.401
5 0.298 0.336 0.366 0.299 0.321 0.375
6 0.258 0.253 0.422 0.278 0.253 0.400
7 0.330 0.432 0.418 0.330 0.431 0.419
8 0.355 0.377 0.439 0.339 0.382 0.405
9 0.219 0.277 0.313 0.219 0.278 0.315
10 0.250 0.321 0.399 0.250 0.314 0.364

Pairwise two-tailed t-tests on the difference in means were performed to compare mean RMS values, for a given technique and a given model, across different values of N. For Table 7, the difference of means between N = 5 and N = 10 for the RBF technique was significant with a |t|-value (df = 9) of 6.86; the mean RMS for N = 5 was lower at 0.4135 vs. approximately 0.49 for the RBF with N = 10. A similar result was obtained for Normalized RBF with a |t|-value (df = 9) of 6.35; the mean RMS for N = 5 was lower at 0.4053 vs. approximately 0.49 for Normalized RBF with N = 10. Using data for Model 2, the results were similar as well. For the RBF technique, the difference of means between N = 5 and N = 10 was significant with a |t|-value (df = 9) of 4.49; the mean RMS for N = 5 was lower at 0.3527 vs. approximately 0.43 for the RBF with N = 10. For the Normalized RBF and Model 2, the difference of means between N = 5 and N = 10 was significant with a |t|-value (df = 9) of 5.02; the mean RMS for N = 5 was lower at 0.3526 vs. approximately 0.42 for the Normalized RBF with N = 10. All differences in means between N = 7 and N = 5, and between N = 7 and N = 10, were non-significant. When found significant, the differences in RMS means were significant at a 99% level of statistical confidence.
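The reported |t|-value for the RBF technique on Model 1 is consistent with a paired t-test over the ten replications in Table 7. The sketch below recomputes it with the standard library; treating the test as paired is an inference from the reported degrees of freedom (df = 9), and the function name is illustrative.

```python
import math
import statistics

def paired_t_statistic(a, b):
    """t statistic of the paired differences b - a, with len(a) - 1 df."""
    diffs = [y - x for x, y in zip(a, b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

# RMS errors from Table 7 (Model 1), RBF technique.
rbf_n5 = [0.332, 0.393, 0.833, 0.345, 0.279, 0.415, 0.353, 0.407, 0.379, 0.399]
rbf_n10 = [0.386, 0.462, 0.832, 0.425, 0.385, 0.492, 0.406, 0.491, 0.493, 0.511]

t_value = paired_t_statistic(rbf_n5, rbf_n10)
mean_n5 = statistics.mean(rbf_n5)    # 0.4135
mean_n10 = statistics.mean(rbf_n10)  # about 0.49

# The two-tailed critical value at the 99% level with 9 df is about 3.25,
# so a |t| near 6.86 is significant at that level.
```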

The results indicate that smaller-size RBF (N = 5) does better than the RBF neural networks of size N = 10. Also, between the RBF-based interpolation model and the Normalized RBF-based kernel technique, no technique appears to be a clear winner because the differences in RMS means for best-performing RBF and Normalized RBF techniques at N = 5 are non-significant. The lowest RMS values, at N = 5, are nearly shared in equal numbers by both techniques. The RMS errors are fairly stable for both techniques for a fixed value of N, so picking either model for estimating the efficiency score should be sufficient. Generally, as shown in the illustrative example, larger-size neural networks with large values of N will increase the likelihood of zero membership clusters and smaller-size neural network models may be slightly better to use and may result in lower RMS errors. Unfortunately, there are no performance guarantees in deciding the number of hidden nodes for a neural network, and a designer is always better served in trying out a different number of hidden nodes before settling for the best model design. In simulated datasets, the values of true efficiency were known. In real-world datasets, the value of true efficiency will be unknown and ensemble models can also be used in lowering the estimation error between true efficiency and model-estimated efficiency scores.

Comparing the models proposed in this research with traditional SFA models is difficult because traditional SFA models assume a predetermined production function form. When the Cobb–Douglas production function form is used for Model 1 datasets, all errors are attributed to random errors, and u_i = 0 for all examples in the datasets. This results in efficiency scores of 1 for all examples. Model 2 is a log-linear representation of the following Cobb–Douglas production function form:

eyi=e1+x1i1+x2i.

Traditional SFA on Model 2 gave competitive results because Model 2 strictly adheres to the Cobb–Douglas production function form. Table 9 lists the results of traditional SFA for Models 1 and 2.

Table 9.

Traditional SFA Results

Replication Sample   Model 1 (RMS)   Model 2 (RMS)
1                    0.820           0.216
2                    0.823           0.388
3                    0.835           0.804
4                    0.809           0.196
5                    0.775           0.310
6                    0.774           0.247
7                    0.799           0.193
8                    0.838           0.341
9                    0.828           0.237
10                   0.856           0.231

All RBF models from Table 7 outperformed the traditional SFA model on Model 1. For Model 2, pairwise two-tailed t-tests for differences in means between the RBF models and the traditional SFA model were significant only for the RBF with N = 10 (|t|-value = 4.56, df = 9) and the Normalized RBF with N = 10 (|t|-value = 4.29, df = 9); in both cases the traditional SFA had lower RMS values. The results indicate that the proposed models of appropriate size (i.e., N values) generally perform similarly to or better than the traditional SFA model, but the underlying production function type heavily governs the performance of the traditional SFA model. Figures 7 and 8 compare the best-case error scores (the numbers underlined in Tables 7 and 8) with the traditional SFA RMS errors from Table 9. The RBF Model 1 scores are better than the SFA Model 1 scores. No significant difference in means was found between the best RBF Model 2 scores and the SFA Model 2 scores.
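As a rough aid to reading Table 9, the sketch below summarizes the traditional SFA RMS errors per model using only the Python standard library. The variable and function names are illustrative and do not come from the paper; the values are transcribed from Table 9.

```python
import statistics

# Traditional SFA RMS errors per replication sample, transcribed from Table 9.
SFA_RMS_MODEL_1 = [0.820, 0.823, 0.835, 0.809, 0.775, 0.774, 0.799, 0.838, 0.828, 0.856]
SFA_RMS_MODEL_2 = [0.216, 0.388, 0.804, 0.196, 0.310, 0.247, 0.193, 0.341, 0.237, 0.231]


def summarize(rms_values):
    """Return the mean, sample standard deviation, and median of RMS errors."""
    return {
        "mean": statistics.mean(rms_values),
        "stdev": statistics.stdev(rms_values),
        "median": statistics.median(rms_values),
    }


summary_model_1 = summarize(SFA_RMS_MODEL_1)
summary_model_2 = summarize(SFA_RMS_MODEL_2)

for name, summary in (("Model 1", summary_model_1), ("Model 2", summary_model_2)):
    print(name, {key: round(value, 4) for key, value in summary.items()})
```

The Model 2 mean is pulled upward by replication sample 3 (RMS of 0.804), which is why the median is reported alongside it.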

Fig. 7.

Best RBF Model 1 Error vs. Traditional SFA Model 1 Error Plot

Fig. 8.

Best RBF Model 2 Error vs. Traditional SFA Model 2 Error Plot

To further compare RBF models with the traditional SFA model, a real-world dataset [37] was obtained for which the Cobb–Douglas function is known to fit well. The data come from the artisanal fishing sector in Nigeria and are shown in Table 10. The 19-year dataset shows the number of canoes, the number of fishers, and the number of fish (annual production) caught by Nigerian fishers. The number of canoes decreased over time as a result of motorization, while the number of fishers remained relatively constant because larger canoes employ more fishers per canoe. The following Cobb–Douglas production function was used for traditional SFA on the Nigerian artisanal fisheries data:

$$\text{Production} = \beta_0 \,\text{Canoes}^{\beta_1}\,\text{Fishers}^{\beta_2}.$$

Table 10.

The Nigerian Artisanal Fisheries Data (1976–1994)

Year   Canoes    Fishers   Production (in thousands)
1976   134,337   413,832   327,561
1977   137,447   424,838   331,280
1978   138,447   425,298   336,138
1979   133,728   446,152   356,888
1980   133,723   459,065   274,158
1981   120,142   440,592   323,916
1982   105,239   416,959   377,683
1983   129,555   472,122   376,984
1984   109,638   342,219   246,784
1985    80,688   302,234   140,873
1986    77,134   408,927   160,169
1987    76,644   437,465   145,755
1988    77,144   447,850   185,181
1989    77,155   470,250   171,332
1990    76,981   452,187   170,459
1991    77,093   457,102   168,211
1992    77,076   459,847   184,407
1993    77,050   456,381   106,276
1994    77,073   457,775   124,117
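Taking logarithms turns the Cobb–Douglas form into a linear relationship, ln(Production) = ln(β0) + β1 ln(Canoes) + β2 ln(Fishers). The sketch below fits that log-linear form to the Table 10 data by ordinary least squares. This is an illustration of the functional form only: it is not the maximum-likelihood SFA estimator with a composed error term, and the function names are hypothetical.

```python
import math

# (year, canoes, fishers, production) transcribed from Table 10.
FISHERIES = [
    (1976, 134337, 413832, 327561), (1977, 137447, 424838, 331280),
    (1978, 138447, 425298, 336138), (1979, 133728, 446152, 356888),
    (1980, 133723, 459065, 274158), (1981, 120142, 440592, 323916),
    (1982, 105239, 416959, 377683), (1983, 129555, 472122, 376984),
    (1984, 109638, 342219, 246784), (1985, 80688, 302234, 140873),
    (1986, 77134, 408927, 160169), (1987, 76644, 437465, 145755),
    (1988, 77144, 447850, 185181), (1989, 77155, 470250, 171332),
    (1990, 76981, 452187, 170459), (1991, 77093, 457102, 168211),
    (1992, 77076, 459847, 184407), (1993, 77050, 456381, 106276),
    (1994, 77073, 457775, 124117),
]


def fit_log_linear_cobb_douglas(rows):
    """OLS fit of ln(P) = ln(b0) + b1*ln(C) + b2*ln(F); returns (ln_b0, b1, b2)."""
    x1 = [math.log(canoes) for _, canoes, _, _ in rows]
    x2 = [math.log(fishers) for _, _, fishers, _ in rows]
    y = [math.log(production) for _, _, _, production in rows]
    n = len(rows)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products (centering improves conditioning).
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
    s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    ln_b0 = my - b1 * m1 - b2 * m2
    return ln_b0, b1, b2


def log_residuals(rows, ln_b0, b1, b2):
    """Residuals of the fitted log-linear form, one per year."""
    return [
        math.log(p) - (ln_b0 + b1 * math.log(c) + b2 * math.log(f))
        for _, c, f, p in rows
    ]


ln_b0, b1, b2 = fit_log_linear_cobb_douglas(FISHERIES)
residuals = log_residuals(FISHERIES, ln_b0, b1, b2)
print(f"ln(beta0) = {ln_b0:.4f}, beta1 = {b1:.4f}, beta2 = {b2:.4f}")
```

In a stochastic frontier setting, the residual would be further decomposed into a symmetric noise term and a one-sided inefficiency term; the plain OLS residuals computed here make no such distinction.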

For the RBF and Normalized RBF models, a value of N = 5 was used. Table 11 lists the efficiency scores obtained by the three models. All models have similar statistical performance, and the correlation coefficient between the traditional SFA model and the RBF models was about 0.78. The two-tailed |t|-statistics for differences in means between the SFA model and the RBF models were 0.78 (RBF, df = 18) and 0.81 (Normalized RBF, df = 18), respectively.

Table 11.

Efficiency Scores for Nigerian Artisanal Fisheries Data (1976–1994)

Year   SFA     RBF (N = 5)   Normalized RBF (N = 5)
1976   0.914   0.941         0.941
1977   0.910   0.975         0.974
1978   0.910   1.000         1.000
1979   0.923   0.986         0.986
1980   0.881   0.760         0.760
1981   0.930   0.761         0.760
1982   0.958   1.000         1.000
1983   0.933   1.000         1.000
1984   0.922   1.000         1.000
1985   0.910   0.979         0.979
1986   0.925   0.923         0.922
1987   0.911   0.804         0.804
1988   0.937   1.000         1.000
1989   0.928   0.892         0.890
1990   0.929   0.916         0.915
1991   0.927   0.895         0.893
1992   0.936   0.977         0.976
1993   0.848   0.567         0.565
1994   0.880   0.660         0.658
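The summary statistics quoted for Table 11 can be checked directly from the tabulated scores. The sketch below assumes the comparison is a paired t-test across the 19 years, which is consistent with the reported df = 18 but is an inference rather than something stated in the text. It also computes the Pearson correlation. Function names are illustrative.

```python
import math
import statistics

# Efficiency scores for 1976-1994, transcribed from Table 11.
SFA = [0.914, 0.910, 0.910, 0.923, 0.881, 0.930, 0.958, 0.933, 0.922, 0.910,
       0.925, 0.911, 0.937, 0.928, 0.929, 0.927, 0.936, 0.848, 0.880]
RBF = [0.941, 0.975, 1.000, 0.986, 0.760, 0.761, 1.000, 1.000, 1.000, 0.979,
       0.923, 0.804, 1.000, 0.892, 0.916, 0.895, 0.977, 0.567, 0.660]
NORMALIZED_RBF = [0.941, 0.974, 1.000, 0.986, 0.760, 0.760, 1.000, 1.000, 1.000, 0.979,
                  0.922, 0.804, 1.000, 0.890, 0.915, 0.893, 0.976, 0.565, 0.658]


def paired_t_statistic(first, second):
    """Return (t, df) for a paired t-test on two equal-length samples."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1


def pearson_correlation(first, second):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_a, mean_b = statistics.mean(first), statistics.mean(second)
    cross = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    ss_a = sum((a - mean_a) ** 2 for a in first)
    ss_b = sum((b - mean_b) ** 2 for b in second)
    return cross / math.sqrt(ss_a * ss_b)


t_rbf, df_rbf = paired_t_statistic(SFA, RBF)
t_nrbf, df_nrbf = paired_t_statistic(SFA, NORMALIZED_RBF)
r_rbf = pearson_correlation(SFA, RBF)

print(f"RBF:            |t| = {abs(t_rbf):.2f} (df = {df_rbf}), r = {r_rbf:.2f}")
print(f"Normalized RBF: |t| = {abs(t_nrbf):.2f} (df = {df_nrbf})")
```

Under the pairing assumption, the computed values should land close to the reported |t|-statistics and correlation coefficient, which suggests the paired interpretation matches how the figures were obtained.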

The results of experiments on both simulated and real-world datasets indicate that the proposed RBF models perform as well as or better than the traditional SFA model, which gives the RBF model broad applicability. The proposed RBF model is not computationally expensive compared to traditional neural networks that are also capable of general approximations [38]. RBF models do have a few limitations. Their applicability is limited to problems with at least two independent variables, and sample sizes must be adequate for RBF learning. The traditional SFA model can be used to test the returns-to-scale relationship between production function inputs and output. Because the production function equation is unknown in the RBF models, such a relationship cannot be tested with them.

Summary, Conclusions and Directions for Future Work

This paper connects two established fields, neural networks and econometrics, and presents a new RBF neural network framework for SFA. Using simulated and real-world datasets, this paper proposes and tests two RBF models for SFA and reports the results. The primary benefit of the proposed models is that they always provide results as long as the problem has two or more inputs and the dataset satisfies the (2⋅n) ≤ (k−3) constraint. Any advances in the econometric literature, including the use of different inefficiency score distributions or the use of panel data, can be directly incorporated into the proposed framework with minor modifications.

The unique aspect of this paper is the use of an interpolation model for SFA. Interpolation models appear to have received little attention in the econometrics literature. The current study illustrates that the results of these models are not drastically different from those of kernel regression models, and future research may consider using similar models for production frontier analysis. The current study used only one output in the production function. The framework allows for multiple outputs, but each output has to be treated as a unique independent production function in the hidden-to-output layer of the RBF neural network. Overall, the production functions for the outputs are not entirely independent, because the input-to-hidden layer nodes and the basis centers are the same for all production function outputs. These challenges of incorporating multiple outputs into the RBF neural network model may be addressed by future researchers.
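The shared-hidden-layer structure described above can be sketched as a forward pass. The example assumes a Gaussian basis with a single common width and no bias term, which may differ from the exact form of Eq. 7. The centers and weights are hypothetical values chosen for illustration, not estimates obtained by the paper's learning procedure.

```python
import math


def gaussian_basis(x, center, width):
    """Gaussian radial basis value for input x around one center."""
    squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-squared_distance / (2.0 * width ** 2))


def hidden_layer(x, centers, width):
    """Input-to-hidden layer: one activation per center, shared by all outputs."""
    return [gaussian_basis(x, center, width) for center in centers]


def predict_outputs(x, centers, width, weights_per_output):
    """Hidden-to-output layer: each output has its own weights over shared activations."""
    activations = hidden_layer(x, centers, width)
    return [
        sum(weight * activation for weight, activation in zip(weights, activations))
        for weights in weights_per_output
    ]


# Hypothetical centers and weights, for illustration only.
centers = [(0.0, 0.0), (1.0, 1.0)]
weights_per_output = [[1.0, 0.0], [0.0, 1.0]]

outputs = predict_outputs((0.0, 0.0), centers, 1.0, weights_per_output)
print(outputs)
```

Because the activations are computed once and reused, the outputs are coupled through the centers even though their weight vectors are estimated separately, which is the dependence the paragraph above points out.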

The current paper used standard methods and functions for the RBF neural network. Different methods and learning techniques can also be used with possible slight differences in results. The Green’s function (Eq. 7) used in the current research was a multivariate Gaussian function and cluster centers were determined using the K-Means cluster analysis. Other Green’s functions and cluster analysis techniques can be used for the input-to-hidden layer unsupervised learning. Additionally, standard Newton techniques were used to compute connection weights for hidden-to-output layer neurons. Expectation maximization [39] and simulated likelihood [40] approaches can also be used to estimate these connection weights. When trying different techniques and solution procedures, ensembles may be built to lower estimation errors. Future research is needed to investigate the merits of such procedures.

Declarations

Conflict of interest

There are no competing financial or non-financial interests to disclose.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Deprins D, Simar L, Tulkens H (1984) Measuring labor inefficiencies in post offices. In: The performance of public enterprises: concepts and measurements. Amsterdam, The Netherlands, pp 243–267
  • 2. Kuosmanen T, Kortelainen M. Stochastic non-smooth envelopment of data: semi-parametric frontier estimation subject to shape constraints. J Prod Anal. 2012;38:11–28. doi: 10.1007/s11123-010-0201-3.
  • 3. Valero-Carreras D, Aparicio J, Guerrero NM. Support vector frontiers: a new approach for estimating production functions through support vector machines. Omega Int J Manag Sci. 2021. doi: 10.1016/j.omega.2021.102490.
  • 4. Fan Y, Li Q, Weersink A. Semiparametric estimation of stochastic production frontier models. J Bus Econ Stat. 1996;14:460–468.
  • 5. Lippmann R. An introduction to computing with neural nets. IEEE ASSP Magazine. 1987;4:4–22. doi: 10.1109/MASSP.1987.1165576.
  • 6. Bongini P, Bianchini M, Scarselli F. Molecular generative graph neural networks for drug discovery. Neurocomputing. 2021;450:242–252. doi: 10.1016/j.neucom.2021.04.039.
  • 7. Bhosale YH, Patnaik SK. Application of deep learning techniques in diagnosis of covid-19 (coronavirus): a systematic review. Neural Process Lett. 2022. doi: 10.1007/s11063-022-11023-0.
  • 8. Haykin S. Neural networks, 2nd edn. Upper Saddle River, NJ: Prentice Hall; 1999.
  • 9. Abdou MA. Literature review: efficient deep neural networks techniques for medical image analysis. Neural Comput Appl. 2022;34:5791–5812. doi: 10.1007/s00521-022-06960-9.
  • 10. Pendharkar P, Rodger J. Technical efficiency-based selection of learning cases to improve forecasting accuracy of neural networks under monotonicity assumption. Decis Support Syst. 2003;36:117–136. doi: 10.1016/S0167-9236(02)00138-0.
  • 11. Pendharkar PC. A hybrid radial basis function and data envelopment analysis neural network for classification. Comput Oper Res. 2011;38:256–266. doi: 10.1016/j.cor.2010.05.001.
  • 12. Pendharkar PC. A hybrid radial basis function DEA and its applications to regression, segmentation and cluster analysis problems. Mach Learn Appl. 2021;6:100092.
  • 13. Kneip A, Simar L. A general framework for frontier estimation with panel data. J Productivity Anal. 1996;7:187–212. doi: 10.1007/BF00157041.
  • 14. Micchelli CA. Interpolation of scattered data: distance matrices and conditionally positive definite functions. Constr Approx. 1986;2:11–22. doi: 10.1007/BF01893414.
  • 15. Bhosale YH, Patnaik SK (2022) IoT deployable lightweight deep learning application for COVID-19 detection with lung diseases using RaspberryPi. In: Proceedings of IEEE international conference on IoT and blockchain technologies. IEEE, Ranchi, India, pp 1–6
  • 16. Charnes A, Cooper WW, Rhodes E. Measuring the efficiency of decision making units. Eur J Oper Res. 1978;2:429–444. doi: 10.1016/0377-2217(78)90138-8.
  • 17. Banker RD, Charnes A, Cooper WW. Some models for estimating technical and scale inefficiencies in data envelopment analysis. Manag Sci. 1984;30:1078–1092. doi: 10.1287/mnsc.30.9.1078.
  • 18. Banker RD, Kauffman RJ, Morey RC. Measuring gains in operational efficiency from information technology: a study of the Positran deployment at Hardee’s Inc. J Manag Inform Syst. 1990;7:29–54. doi: 10.1080/07421222.1990.11517888.
  • 19. Pendharkar P. Scale economies and production function estimation for object-oriented software component and source code documentation size. Eur J Oper Res. 2006;172:1040–1050. doi: 10.1016/j.ejor.2004.10.023.
  • 20. Dyson RG, Thanassoulis E. Reducing weight flexibility in data envelopment analysis. J Oper Res Soc. 1988;39:563–576. doi: 10.1057/jors.1988.96.
  • 21. Kitchenham BA. The question of scale economies in software – why cannot researchers agree? Inf Softw Technol. 2002;44:13–24. doi: 10.1016/S0950-5849(01)00204-X.
  • 22. Pendharkar PC, Koehler GJ. A general steady state distribution based stopping criteria for finite length genetic algorithms. Eur J Oper Res. 2007;176:1436–1451. doi: 10.1016/j.ejor.2005.10.050.
  • 23. Aigner DJ, Lovell CAK, Schmidt P. Formulation and estimation of stochastic frontier production functions. J Econ. 1977;6:21–37. doi: 10.1016/0304-4076(77)90052-5.
  • 24. Meeusen W, van den Broeck J. Efficiency estimation from Cobb–Douglas production functions with composed error. Int Econ Rev. 1977;18:435–444. doi: 10.2307/2525757.
  • 25. Jondrow J, Lovell CAK, Materov IS, Schmidt P. On the estimation of technical inefficiency in the stochastic frontier production function model. J Econ. 1982;23:269–274.
  • 26. Murillo-Zamorano LR. Economic efficiency and frontier techniques. J Econ Surv. 2004;18:33–77. doi: 10.1111/j.1467-6419.2004.00215.x.
  • 27. Greene WH, Schmidt SS (1993) Oxford University Press, Oxford, pp 68–119
  • 28. Park J, Sandberg IW. Universal approximation using radial-basis-function networks. Neural Comput. 1991;3:246–257. doi: 10.1162/neco.1991.3.2.246.
  • 29. Poggio T, Girosi F (1990) Networks for approximation and learning. In: Proceedings of IEEE. pp 1481–1497
  • 30. Lowe D (1991) What have neural networks to offer statistical pattern processing. In: Proceedings of the SPIE Conference on Adaptive Signal Processing. San Diego, CA, pp 460–471
  • 31. Cover TM. Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Trans Electron Comput. 1965;EC-14:326–334. doi: 10.1109/PGEC.1965.264137.
  • 32. Xu L, Krzyzak A, Yuille A. On radial basis function nets and Kernel regression: statistical consistency, convergence rates, and receptive field size. Neural Netw. 1994;7:609–628. doi: 10.1016/0893-6080(94)90040-X.
  • 33. Krzyzak A, Linder T, Lugosi G. Nonparametric estimation and classification using radial basis functions. IEEE Trans Neural Netw. 1996;7:475–487. doi: 10.1109/72.485681.
  • 34. Lowe D (1989) Adaptive radial basis function nonlinearities, and the problem of generalisation. In: First IEE International Conference on Artificial Neural Networks. London, UK, pp 171–175
  • 35. Kumbhakar SC, Lovell CAK. Stochastic frontier analysis. Cambridge, UK: Cambridge University Press; 2000.
  • 36. Pendharkar PC. Ensemble based ranking of decision making units. Inform Syst Oper Res. 2013;51:151–159.
  • 37. Amire AV (2003) Monitoring, measurement and assessment of fishing capacity: the Nigerian experience. United Nations
  • 38. Wang S. Adaptive non-parametric efficiency frontier analysis: a neural-network-based model. Comput Oper Res. 2003;30(2):279–295. doi: 10.1016/S0305-0548(01)00095-8.
  • 39. de Andrade BB, Souza GS. The EM algorithm for standard stochastic frontier models. Pesquisa Operacional. 2019;39:361–378. doi: 10.1590/0101-7438.2019.039.03.0361.
  • 40. Greene WH. Simulated likelihood estimation of the Normal-Gamma stochastic frontier function. J Prod Anal. 2003;19:179–190. doi: 10.1023/A:1022853416499.

Articles from Neural Processing Letters are provided here courtesy of Nature Publishing Group
