Predictive Model Markup Language (PMML) Representation of Bayesian Networks: An Application in Manufacturing

Saideep Nannapaneni; Anantha Narayanan; Ronay Ak; David Lechevalier; Rachael Sexton; Sankaran Mahadevan; Yung-Tsun Tina Lee

doi:10.1520/SSMS20180018

. Author manuscript; available in PMC: 2019 Jul 2.

Published in final edited form as: Smart Sustain Manuf Syst. 2018;2:10.1520/SSMS20180018. doi: 10.1520/SSMS20180018

Predictive Model Markup Language (PMML) Representation of Bayesian Networks: An Application in Manufacturing

Saideep Nannapaneni ¹, Anantha Narayanan ², Ronay Ak ³, David Lechevalier ⁴, Rachael Sexton ³, Sankaran Mahadevan ⁵, Yung-Tsun Tina Lee ³

PMCID: PMC6604043 NIHMSID: NIHMS1523705 PMID: 31276104

Abstract

Bayesian networks (BNs) represent a promising approach for the aggregation of multiple uncertainty sources in manufacturing networks and other engineering systems for the purposes of uncertainty quantification, risk analysis, and quality control. A standardized representation for BN models will aid in their communication and exchange across the web. This paper presents an extension to the Predictive Model Markup Language (PMML) standard, for the representation of a BN, which may consist of discrete variables, continuous variables, or their combination. The PMML standard is based on Extensible Markup Language (XML) and used for the representation of analytical models. The BN PMML representation is available in PMML v4.3 released by the Data Mining Group. We demonstrate the conversion of analytical models into the BN PMML representation, and the PMML representation of such models into analytical models, through a Python parser. The BNs obtained after parsing PMML representation can then be used to perform Bayesian inference. Finally, we illustrate the developed BN PMML schema for a welding process.

Keywords: PMML, Bayesian networks, Uncertainty, XML, Analytics, Standard, Manufacturing

1. Introduction

Recent technological developments in sensors and data collection have resulted in the availability of large amounts of data from the operation of engineering systems such as manufacturing, aerospace, and civil infrastructure systems [1]. However, in the era of digital transformation, data collection alone is not enough to make smart decisions. To create a smart manufacturing environment where all available information—from within the plant floor and from along the supply chain—is captured in real-time, data analytics techniques are used to derive actionable insights to improve the efficiency of the process and product quality. Such data-driven decision-making can provide manufacturing sector with several advantages [2]. One of the key aspects within data mining is the construction of appropriate analytical (predictive) models from data to enable the required analyses such as surface roughness prediction [3], process monitoring [4], health management, prognostics and diagnostics [5–7], optimization [8], and uncertainty quantification [9,10]. Several types of analytical models to represent the data have been developed and used, such as support vector machines (SVMs), neural networks (NNs), Gaussian process regression (GPR), decision trees, and Bayesian networks (BNs). In addition to building such analytical models, it is also necessary to be able to exchange these models across different data mining environments. This model exchange capability enables collaborative development of analytical models and rapid model deployment without allocating the additional resources for rebuilding the models. For universal communication of models, standards and protocols [11] are necessary that result in standardized representations of the analytical models.

There are two prominent standards that are being used for the communication of analytical models between two data mining environments: (1) Predictive Model Markup Language (PMML) [12], and (2) Portable Format for Analytics (PFA) [13]. PMML represents the de facto standard for the representation and communication of predictive analytic models [12,14]. PFA represents an emerging JSON-based standard for the representation of statistical models and their associated model and data transformations. Similar to PFA, PMML documents are also intermediate text files, in the Extensible Markup Language (XML) file format [14], produced by data analysis tools and later consumed by a scoring engine in the production environment. Both these standards are being developed and maintained by the Data Mining Group (DMG), an independent consortium that develops data mining standards [15]. A primary difference between the two formats is that PFA enables analytics along with model representation and PMML is primarily used for model representation. Between these two standards, PMML has been widely used and supported by several statistical programming tools such as R, Python and SAS, therefore, this paper primarily focusses on PMML. PMML defines a set of rules for the representation of analytical models (called the PMML schema) such that they can easily be exchanged. The entire model development and deployment can be carried out in the following steps: (1) An analytical model such as a BN is created using available data; (2) The analytical model is converted to a PMML representation (using a parser) and communicated to an appropriate practitioner; and (3) The practitioner imports the PMML representation into an analytical model and uses it for his/her needs accordingly. It should be noted that PMML representation is only a textual representation of a model and has no predictive capabilities. The PMML representation needs to be first converted to an analytical model in a programming language, such as R or Python, for prediction.

As mentioned above, several types of manufacturing process analytics are carried out, one of which is uncertainty quantification, which refers to the quantification of several uncertainty sources, and the aggregation and effect on the quantities of interest (QoIs). The QoIs in the manufacturing domain typically refer to the Key Performance Indicators (KPIs) such as energy consumption per part, cycle time, total process time for manufacturing/machining a part, throughput, cost per part, and overall equipment effectiveness (OEE). In complex multistage and multi-level manufacturing processes, the evaluation of KPIs and the uncertainty associated with them becomes increasingly difficult as the uncertainty aggregates from multiple individual processes that affect the overall KPIs. The different uncertainty sources in manufacturing analytics can be related to the inputs (i.e., variability in the input material properties such as density), process variability, models (i.e., uncertainty in the modeling of individual manufacturing processes), and sensors (associated with measurement errors). In our earlier work [9,16], we have used a BN as a mathematical framework for the aggregation of uncertainty from multiple sources to quantify the uncertainty in the KPIs. BNs facilitate two types of analyses: (1) Forward uncertainty propagation, where the uncertainty in the outputs (such as KPIs) can be estimated, given the uncertainty about various inputs, model parameters, and model errors, and (2) Bayesian inference, where unmeasured inputs or model parameters can be estimated given observations of the outputs.

BNs have been used for various applications in manufacturing such as predicting surface roughness [17], tool wear [18,19], maintenance [20], fault diagnosis [21], job-shop scheduling [22], supply chain diagnostics [23], and life cycle analysis [24]. Their ability to incorporate various uncertainty sources and facilitate uncertainty quantification analysis makes them particularly useful for the manufacturing domain. To ease the adoption of BNs in manufacturing through a variety of commercial tools, we believe that it is beneficial to have a standardized representation of such models. To this end, this paper develops a standardized representation of a Bayesian network using PMML, which as mentioned above is the de facto standard for the representation and communication of predictive analytical models.

The previous version of PMML, v4.2.1 [25], did not have a PMML schema for BN representation. However, a schema for a Naïve Bayes model was available. A Naïve Bayes model is a probabilistic classification model and can be considered as a special case of a two-level BN, where the input features are in the first level and the classification class is in the second level [26]. Since Naïve Bayes is a classification model, it cannot be used for the probabilistic prediction of KPIs, which needs to be performed using a regression model constructed using available data. On the other hand, a BN can accommodate dependence between several parameters/features, can facilitate multi-level modeling for the purposes of classification or regression, as opposed to a Naïve Bayes model, which is a two-level classification model that assumes independence between features when conditioned on the classification class. Therefore, we have developed a generalized schema for a BN, and it is available in the latest version of PMML, v4.3 [27]. In this paper, we explain the BN PMML schema and demonstrate its usage using a real-world case study of a welding process. It is worth mentioning that PMML v4.3 also has a schema for Gaussian process regression (GPR) [28], which provides probabilistic prediction. There are a few differences between a BN and a GPR. As mentioned earlier, a BN can be used for both forward uncertainty propagation and Bayesian inference whereas a GPR is typically used for prediction at new input values. A BN is used to aggregate uncertainty from multiple processes in a manufacturing network and estimate the overall uncertainty in the output QoIs resulting from several individual manufacturing processes [29]. A primary disadvantage of a BN compared to GPR is that the prediction or inference can be computationally expensive as it is mostly sampling-based, although there are special cases of a BN where analytical estimation is possible; these cases are later mentioned in Section 2.2.

The overall contributions made through this paper are: (1) Standardized representation of a BN using PMML; (2) Development of a Python-based parser for conversion of PMML representations into analytical models and vice-versa; and (3) Demonstration of proposed methods using a welding process case study.

The rest of the paper is organized as follows. Section 2 provides a brief background of BNs, their construction, and inference techniques. Section 3 discusses the BN PMML representation, and Section 4 presents a BN PMML parser for conversion of a PMML descriptive model to an analytical model, which can be used for forward uncertainty propagation and Bayesian inference. Section 5 illustrates the proposed methodology using the case study of a welding process followed by concluding remarks and future work in Section 6.

2. Theory of Bayesian Networks

A BN represents the joint probability distribution of a set of random variables through a directed acyclic graphical model, consisting of nodes and directed arcs where nodes represent random variables and arcs represent the dependence between the nodes. Using the directional information from the graphical model, the joint probability of the random variables is represented as a product of conditional and marginal probability distributions. The joint probability of n random variables, X = {X₁, X₂, …X_n} can be represented as

f (X) = \prod_{i = 1}^{n} f (X_{i} | P a_{X_{i}})

(1)

where $P a_{X_{i}}$ represents the set of parent nodes of X_i, i.e., nodes from which the arcs direct to X_i and $f (X_{i} | P a_{X_{i}})$ represents the conditional probability distribution of X_i conditioned on its parent nodes. If X_i has no parent nodes (also referred to as root nodes), then $f (X_{i} | P a_{X_{i}})$ represents the marginal distribution of X_i. It should be noted that the joint probability will be equal to the product of marginal probabilities when all the nodes are independent to each other. For illustration, consider a 5-node BN shown in Figure 1. Using the dependence information from Figure 1, the joint probability distribution of X₁, X₂, X₃, X₄, and X₅ can be described as

f (X_{1}, X_{2}, X_{3}, X_{4}, X_{5}) = f (X_{1}) f (X_{2}) f (X_{3} | X_{1}, X_{2}) f (X_{4} | X_{3}, X_{1}) P (X_{5} | X_{2})

(2)

Figure 1: — An illustrative Bayesian network.

Techniques for the construction of BNs are discussed in Section 2.1. BNs are primarily used to update our knowledge on a subset of random variables when another disjoint subset of random variables is observed. Depending upon the network complexity, several analytical and approximate techniques were developed in the literature to perform Bayesian inference; these techniques are discussed in Section 2.2.

2.1. Bayesian network construction

Construction of a Bayesian network involves estimation of dependence relationships between variables, both qualitatively and quantitatively through marginal and conditional probability distributions. The techniques for the construction of BNs can be categorized into three types: (1) Using available mathematical models [16]; (2) Data-driven approaches [30]; and (3) Hybrid approaches [29,31]. The available mathematical models can be derived from the physics of the system, or empirical models constructed from a combination of expert knowledge and experiments regarding the system. When mathematical models are unavailable but data regarding system operation is available, the BN can be constructed using BN learning algorithms, which are discussed later in this section. In some cases, it may be possible that the available mathematical models explain some aspects of the system, and data is available regarding the unknown aspects of the system. In such scenarios, a hybrid approach for learning is implemented where a BN is built in two steps. In the first step, a partial BN is constructed using available mathematical models or expert domain knowledge. In the second step, the partial BN constructed in the first step is used as a starting point (as opposed to a completely unconnected graph in the case of data-driven approach) to learn the remaining dependencies between the variables. We discuss below the learning algorithms for BN construction from data. Constructing a BN involves learning the dependence between variables, both qualitatively and quantitatively. Qualitative learning is typically referred to as structure learning whereas quantitative learning involves the estimation of parameters of the conditional and marginal probability distributions associated with several nodes in the Bayesian network. The techniques for learning a BN structure is discussed below.

Given an amount of data on a set of random variables, we first identify the BN structure (network topology) that best explains the dependence between the variables. The structure learning algorithms can be classified into three categories: (1) Constraint-based; (2) Score-based; and (3) Hybrid, i.e., a combination of constraint-based and score-based methods [30]. Constraint-based methods perform conditional independence tests between the random variables, and then identify the BN structure that satisfies the independence tests. Commonly used independence tests for discrete variables include Mutual Information Test (MIT), G-test and Chi-squared test, and those for continuous variables include MIT and conditional correlations [30,32]. The expressions of MIT for discrete and continuous variables are given in Eqs. 3 and 4, respectively. Note that MIT can be used between discrete-discrete and continuous-continuous variable pairs but not for discrete-continuous variable pairs.

I_{X, Y} = \sum_{Y} \sum_{X} f (x, y) l o g (\frac{f (x, y)}{f (x) f (y)})

(3)

I_{X, Y} = \int_{Y} \int_{X} f (x, y) l o g (\frac{f (x, y)}{f (x) f (y)})

(4)

where f(x, y) represents the joint probability distribution of X and Y, and f(x) and f(y) represent the marginal distributions of X and Y respectively. The conditional and marginal distributions of the considered variables in Eqs. 3 and 4 are derived from the available data in the learning process. Some constraint-based algorithms include the PC algorithm [33], Grow-Shrink [34], and Incremental Association Markov Blanket (IAMB) [35] along with its variants such as Fast Incremental Association and Interleaved Incremental Association [35,36].

Score-based methods assign a score for every possible BN; this score is calculated based on the goodness-of-fit of the BN in fitting the available data. Using the scoring measure, a set of heuristic optimization techniques are used to learn the BN structure that optimizes the defined score. Commonly used scoring measures include Likelihood, Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Bayesian Dirichlet Equivalence (BDe), and Minimum Description Length (MDL) [30]. Some commonly used measures (BIC and BDe) are provided in Eqs. 5 and 6. The BDe score refers to the posterior probability of a BN given data.

B I C = k \times ln (n) - 2 \times ln (L)

(5)

f (G | D) \propto f (D | G) f (G) = f (G) \int f (D | G, Θ) f (Θ | G) d Θ

(6)

In Eq. 5, L, k and n represent the likelihood of observing the available data given a BN, the number of free parameters that are estimated, and the number of available data samples, respectively. In Eq. 6, G and D refer to a BN structure and available data, respectively. Some commonly used optimization techniques for obtaining an optimal BN include greedy search algorithms such as Hill-Climbing and Tabu search [30]. A key difference between the constraint-based algorithms and score-based algorithms is that the former may result in a partially directed graph whereas the latter always results in a directed graph. The inability of the constraint-based methods to obtain a directed graph is because the directions cannot be deduced from the available data. The directionality of the graph is particularly useful to decompose the joint probability distribution into a set of conditional and marginal probability distributions; this decomposition is useful to carry out Bayesian inference.

Hybrid algorithms employ a combination of constraint-based and score-based techniques to obtain an optimal BN structure. The optimal BN through hybrid algorithms is obtained in two stages. In the first stage, constraint-based methods are used to obtain a partially directed graph. In the second stage, the score-based methods are used to obtain the orientation of undirected edges in the partially directed graph that best explains the available data. Some examples of hybrid learning algorithms include Max-Min Hill Climbing and 2-phase Restricted Maximization [37]. The next step after obtaining an optimal BN structure is the estimation of the parameters in the conditional dependence relationships between the random variables. Typically, the principle of Maximum Likelihood is used for parameter learning.

Incorrect Bayesian network learning can lead to incorrect model predictions, which can result in incorrect decision-making when the model is used for further analytics such as probabilistic design optimization and system health diagnosis and prognosis. When the Bayesian network is learnt from validated mathematical models, its structure (dependence relationships) may be known precisely. Conversely, Bayesian network learning from data often depends on the amount of available data, and the complexity of model (e.g., maximum number of parent nodes for a child node) [38]. In addition, different learning methods (constraint-based, score-based or hybrid) and different learning metrics within each method (e.g., different scoring metrics in score-based methods) may result in different Bayesian networks. Therefore, the constructed Bayesian network models need to be validated using additional data (collected independently of training data) before using them for further analytics and decision-making. The readers are referred to [39] for a review of validation techniques.

2.2. Uncertainty propagation and Bayesian inference

The BN constructed using the techniques in Section 2.1 can now be used for inferring the posterior distributions of unobserved variables (denoted as X_unobs) using any data (D) on the observable variables (X_obs) via Bayes’ theorem as

f (X_{u n o b s} | X_{o b s} = D) = \frac{f (X_{o b s} = D | X_{u n o b s}) f (X_{u n o b s})}{\int f (X_{u n o b s}, X_{o b s} = D) d X_{u n o b s}}

(7)

In Eq. 7, the terms f(X_unobs|X_obs = D), f(X_unobs), and f(X_obs = D|X_unobs) represent the posterior distributions of unobserved variables, their prior distributions of observed variables. The denominator term, ∫ f(X_unobs, X_obs = D) dX_unobs, refers to the probability of observable variables to be equal to the data; this is a deterministic value that can be computed by marginalizing over the unobservable variables. Updating in a generic BN can be computationally intractable due to high-dimensional integration, i.e., exact inference is prohibitively expensive. However, exact algorithms are available for updating in special classes of BNs such as discrete BNs, conditional linear Gaussian networks, and networks where conditional dependence relationships are modeled using mixtures of truncated exponentials or truncated polynomials [40]. When exact updating techniques are not available, sampling-based techniques are used to obtain the posterior distributions. In the case of forward uncertainty propagation when the observed variables are the inputs, the distributions of downstream variables can be obtained using Monte Carlo sampling over the conditional dependence relationships. For Bayesian inference, techniques such as Markov Chain Monte Carlo (MCMC) methods [41], variational methods [42], bootstrap filters [43], approximate Bayesian computation (ABC) [44], and unscented transforms [45] can be used to obtain the posterior distributions of the upstream nodes given observations of the downstream nodes. The above BN learning procedure and Bayesian inference are summarized in the flowchart shown in Figure 2.

Figure 2. — Flowchart summarizing the Bayesian network learning and inference procedures.

3. Bayesian Network – PMML Schema

PMML is an XML-based standard that defines schemas for representing different machine learning models. The schemas are defined using the XML Schema Definition Language (XSD) standard [46]. In this section, we describe the schema that we developed to represent BNs. While we use some XSD terminology in describing this schema, the conceptual model may be applied in any notation or environment. The figures in this section were generated by a software tool for viewing and editing XSDs. The XSD schema itself is tool agnostic, and may be used with any XML processing tool. An XML document contains a sequence of ‘elements’ and their ‘attributes’, and elements may contain other elements in a hierarchical tree-like structure. The schema defines the types of these elements and attributes, and their structure. In addition, the schema for BN uses other terms already defined in the PMML standard, and we refer the reader to the PMML standard for more details on these terms [12].

A BN typically consists of two types of nodes: root nodes and non-root nodes. Root nodes are variables that can be described without dependence on any other nodes. Similarly, non-root nodes are variables that are not independent but are described based on their parent nodes. The root nodes and non-root nodes can be either discrete or continuous in nature. The following information is needed for a complete description of any BN.

Marginal probability tables for discrete root nodes, i.e., a set of all possible values that can be taken by a discrete variable and their probabilities.
Marginal probability distributions of continuous root nodes, which include a distribution type (e.g., Gaussian, Uniform, etc.) and its corresponding distribution parameters.
Conditional probability distribution tables of discrete non-root (child) nodes defined on a set of parent nodes, i.e., for a given realization of a discrete parent node or a given range of continuous parent node, a conditional probability table (possible values and their probabilities) is defined.
Conditional probability distributions of continuous non-root nodes defined over a set of parent nodes; this conditional distribution consists of a distribution type and its parameters dependent on the parent nodes.

We describe here the PMML schema to represent the above information. Figure 3 provides a high-level view of the BN PMML schema. We first define an element type called BayesianNetworkmodel, which will be the root element of our BN model. Figure 3 shows the main constituents of a BayesianNetworkModel in PMML. The attributes of this element are shown in the box at the top, along with the attribute type. Its child elements are shown in the rounded rectangles on the bottom branch of the figure. The generic BN PMML representation provides a set of rules based on which instance BN models can be defined. modelName represents the name of the instance BN model such as “Welding Process BN”. functionName represents if the model is of classification or regression type, and its type MINING-FUNCTION is a special type defined elsewhere in the PMML standard. algorithmName refers to the name of the algorithm (e.g., MCMC) that is used for prediction process. isScorable identifies if the model can be used to perform predictions. Most of the elements shown in Figure 3 are generic and apply to all predictive models in PMML, except the element BayesianNetworkNodes, which was specifically defined for describing BNs. The description regarding the standard elements can be found in [15] and [47], and in this paper, we focus on the BayesianNetworkNodes element. Figure 4 describes the BayesianNetworkNodes schema. The nodes in a BN can be either discrete or continuous, which is specified in Figure 4.

Figure 4. — Schema of a *BayesianNetworkNodes* element.

3.1. DiscreteNode element

The schema for the DiscreteNode is provided in Figures 5 and 6. A DiscreteNode element has two attributes: name and count. name refers to the variable name. count is an optional attribute that refers to the total number of cases (frequency) in the data used to compute the probability values, and is useful for updating the probability values when new data is available. DiscreteNode is a complex element, i.e., it is dependent on other elements such as DerivedField, DiscreteConditionalProbability, and ValueProbability. A discrete variable can be either a root node or a non-root node. If a discrete node is root node, then the possible values and their probabilities are defined using the ValueProbability element, as shown in Figure 7. ValueProbability is a basic element, i.e., this does not depend on any other elements, as opposed to DiscreteNode. The possible values are provided using the value attribute, and the probabilities are provided using the probability attribute. The probability attribute is of type PROB-NUMBER, which is a special type defined in the PMML standard, and represents real numbers between 0.0 and 1.0. As opposed to a discrete root node, a discrete non-root node is defined using the DiscreteConditionalProbability element. As discussed above, a conditional probability table of a discrete non-root node is described based on the values of the parent nodes. In the DiscreteConditionalProbability element, the parent node values are provided using the ParentValue element and the probabilities conditional on these parent node values are provided using the ValueProbability element. The ParentValue, as shown in Figure 8, is also a basic element where the parent nodes and their values are given using the parent and value attributes. The parent attribute is of type FIELD-NAME, which specifies that the value of parent must be another element in the model (i.e. the parent node must exist in the BN model). In some cases, when a discrete non-root node is dependent on a continuous node, the continuous node is discretized and the conditional probability table of the discrete node is defined in each discretized range of the continuous variable. This discretization transformation is described using the DerivedField element, a standard PMML element that is specified once but used in multiple model elements within the PMML document for reducing the size of PMML documents that contain multiple model elements (refer to PMML v4.3 [27] for more details). Thus, schema elements are defined to describe discrete root and non-root nodes. Consider a discrete non-root node K₂ with two possible values, K₂ = 0,1 dependent on two nodes (parent nodes): a discrete root node, K₁, which has two possible values, K₂ = 0,1; and a continuous root node, J₁, which is defined as a uniform distribution between −1 and 1. A sample conditional probability table for K₂ can be defined as shown in Table 1. The discretization of the continuous node (see Table 1), can be represented using DerivedField element. In Table 1, Pr(.) represents the probability function, which outputs the probability of an event. In the following subsection, we consider continuous nodes.

Figure 5. — Schema of a *DiscreteNode* element.

Figure 6. — Schema of a *DiscreteConditionalProbability* element.

Figure 7. — Schema of a *ValueProbability* element.

Figure 8. — Schema of a *ParentValue* element.

TABLE 1.

Conditional probability table of a discrete variable with discrete and continuous parent variables.

	K₁ = 0 −1 ≤ J₁ < 0	K₁ = 0 0 ≤ J₁ ≤ 1	K₁ = 1 −1 ≤ J₁ < 0	K₁ = 1 0 ≤ J₁ ≤ 1
Pr(K₂ =0\|K₁, J₁)	0.2	0.4	0.5	0.8
Pr(K₂ = 1\|K₁, J₁)	0.8	0.6	0.5	0.2

Open in a new tab

3.2. ContinuousNode element

Similar to a discrete node, a continuous node is defined using the ContinuousNode element, as illustrated in Figure 9. The ContinuousNode element has the same two attributes as the DiscreteNode element, i.e., name and count. The name of the continuous node and the number of entries used to define the probability distribution of the continuous node are described using the name and count attributes. The probability distributions of continuous root and non-root nodes are described using ContinuousDistribution and ContinuousConditionalProbability elements, shown in Figures 10 and 11 respectively. In this paper, we provide four possible options for a continuous distribution: Normal (Gaussian), Lognormal, Uniform, and Triangular. Each distribution type has its associated distribution parameters. For example, the parameters of a Normal distribution are the mean and standard deviation whereas a Triangular distribution has three parameters: lower and upper bounds, and the mode value. For illustration, the schema elements of a Normal distribution are provided in Figures 12–14. Similarly, schema elements for the other three distribution types are defined but are not shown in this paper. Please refer to [27] for more details. The ContinuousConditionalProbability element is used to describe a continuous non-root node, which can have both discrete and continuous parent nodes. The ContinuousConditionalProbability element is defined over two other elements: ParentValue and ContinuousDistribution. The discrete parent nodes and their values are defined using the ParentValue element whereas the ContinuousDistribution element is used to define the conditional probability distributions dependent on the continuous parent nodes.

Figure 10. — Schema of a *ContinuousDistribution* element.

Figure 11. — Schema of a *ContinuousConditionalProbability* element.

Figure 12. — Schema of a *NormalDistributionForBN* element.

Figure 14. — Schema of a *Variance* element.

Consider a continuous non-root node J₂ whose parent nodes consist of a discrete node K₁ with two possible values K₁ = 0,1 and a continuous node J₁. A sample conditional distribution of J₂ can be defined as follows: f(J₂|J₁, K₁ = 0)~N(3 × J₁, 1) and f(J₂|J₁, K₁ = 1)~N(5 × J₁, 1). Here, f(.) represents a probability distribution and N(.) represents a Gaussian/Normal distribution. For every value of the discrete parent node (K₁ = 0,1) a continuous variable is defined through a probability distribution whose parameters are dependent on the continuous parent nodes (J₁).

3.3. Representation of functional relationships between variables

As discussed in Section 2.1, one of the procedures for the construction of a BN is by using available physics-based models. Physics-based models often represent functional relationships between the inputs and the output, i.e., for given values of the inputs, the output is a deterministic quantity. Since physics-based models are affected by several uncertainty sources (such as model parameters, model form assumptions, numerical approximations), the output can be stochastic if model uncertainty is considered [48]. Such stochastic models can be represented using the techniques presented in Sections 3.1 and 3.2. In this section, we discuss the representation of such functional nodes. In the BN terminology, the inputs can be considered as parent nodes of an output (non-root node). Let X represents a non-root node, and Π_X represents its parent nodes with X = g(Π_X), where g(.) represents a functional relationship between the inputs (Π_X) and output (X).

We discuss two approaches here regarding the representation of functional nodes. The first approach approximates the functional node as a continuous variable with a Gaussian conditional probability distribution (CPD), given as N(g(Π_X), ∈), i.e., the mean of the Gaussian CPD is the functional relationship between the inputs (Π_X) and output (X), and the standard deviation is equal to ∈, a small number close to zero (e.g., the machine precision, which is in the order of 10⁻¹⁵). A Gaussian CPD is used here; however, any continuous distribution such as lognormal or triangular can also be used by choosing their parameters appropriately. The second approach to represent a functional node is to model it as a discrete node with two states whose conditional probability table $(X | Π_{X} = Π_{X}^{*})$ is defined as follows.

Here, $Π_{X}^{*}$ is realization of Π_X on which the conditional probability table of X is defined. The two states of X are $X = g (Π_{X}^{*})$ and $X \neq g (Π_{X}^{*})$ . As X has a functional relationship, the probability of X being equal to $g (Π_{X}^{*})$ is always equal to 1, and 0 otherwise. A primary drawback with the second approach is that the states of the output variable (X) change with the values of its parent nodes $(Π_{X} = Π_{X}^{*})$ . For example, if $Π_{X} = Π_{X}^{*}$ , then $g (Π_{X}^{*})$ is one of the states; however, if $Π_{X} = Π_{X}^{+}$ , then $g (Π_{X}^{*})$ may not be one of the states but $g (Π_{X}^{+})$ is. Such changing states cannot be captured using the schema for discrete variables discussed in Section 3.2, where the states are known and their corresponding probabilities are learnt from physics-based models, expert knowledge, or data. So, we adopt the first approach in this paper to represent a functional node.

4. BN PMML Parser

The subsequent step after the development of the BN PMML schema is the development of a parser that can translate an analytical BN model into its corresponding PMML representation following the above schema. Here, analytical BN model is defined as a BN model on which predictions can be performed as opposed to a “descriptive” BN model such as its PMML representation, on which predictions cannot be performed directly. A “descriptive” BN model only illustrates the variables present in a BN and the dependence relationships between them. One popular Python library for probabilistic programming is PyMC3, which is primarily concerned with building and sampling the posterior distributions of Bayesian models [49]. It is possible to define a BN using PyMC3, and subsequently perform sampling or include observations that will change the posterior distributions. This makes the model generated using PyMC3 an example of an analytical model. PMML functions as a language-agnostic “descriptive model”, to which one would need a parser that can translate to and from analytic libraries like PyMC3. To illustrate an implementation of such a parser, a tool was developed in the Python programming language^*. This example tool utilizes pyMCNet, an extension to PyMC3 that allows for the definition of BNs using graphs via the NetworkX package [50], which enables fast prototyping and visualization of BN models. Currently, only a few continuous node types are supported in this tool, though in theory any distribution allowed by PyMC3 could be added. Because the PMML schema allows BN nodes to be defined using expressions, some parsing is done using the symbolic-math library SymPy [51], to allow node expression definitions using other node values. Figures 15 and 16 provide an overview of the parser. Figure 15 illustrates the flow of transformation from Python to PMML format while Figure 16 illustrates the flow of transformation from PMML to Python. In particular, the parser can be used to convert a PyMC3 model in Python into its corresponding PMML representation and vice versa. The Python-to-PMML direction maps the names of XML node definitions to known PyMC3 random variables, using Sympy to parse functional node relationships, which then populates a pyMCNet graph. The PMML-to-Python direction maps nodes from PMML to PyMC3 distributions and their parameters, from which the parameter functional relationships (FieldRef in PMML [27]) are used to add edges in the pyMCNet graph.

Figure 15. — Flowchart illustrating the functionality of Python to PMML parser

Figure 16. — Flowchart illustrating the functionality of PMML to Python parser

5. Case study: Welding process

In this section, we demonstrate the BN PMML representation for the energy prediction of a welding process and its parsing into an analytical model. Welding represents a fabrication process of joining two metal parts to form a single part and is one of the fundamental processes in manufacturing. There are different types of welding processes, i.e., different ways to join two metal parts such as electric arc welding and torch welding [52]. Electric arc welding using an electric arc while torch welding uses an oxyacetylene torch to melt and join the metal parts. Energy efficiency represents a key metric for sustainability evaluation of a manufacturing process [53]. An approach for energy efficiency evaluation is the comparison of theoretical energy consumption and the real-world energy consumption of the welded products. The theoretical energy computation is greatly affected by several uncertainty sources such as the parameters of the welding process; therefore, their identification and quantification are necessary for a comprehensive sustainability evaluation. In the case study below, we describe the arc welding process and equations for computing the theoretically minimum energy consumption, identify the uncertain parameters, and connect them to the overall energy using a BN. We construct the associated BN and represent it using the BN PMML standard or schema. In the presence of any process data, the process parameters are updated using Bayes theorem; these updated parameters are used to update the minimum energy value, which can later be used for sustainability evaluation in the presence of experimental energy values.

5.1. Process description

Consider the cross-section of the weld between two metal pieces as shown in Figure 17 [16]. If L represents the length of the weld, l, h, g, t and e represent the weld parameters, as shown in Figure 16, then the overall volume of the weld, V, can be calculated as shown in Eq. 8.

V = L \times (0.75 \times l \times h + g \times t + 0.5 \times (l - g) \times (t - e))

(8)

Assuming that the metal and the filler are of the same material, the minimum theoretical energy required for the welding process is given as

E = ρ \times V \times (C_{p} \times (T_{f} - T_{i}) + H)

(9)

where ρ, C_p, T_f, T_i, and H represent the density of the welding material, heat capacity, and final and initial temperatures of the weld and the latent heat, respectively. Using the above equations, a BN showing the dependence relationships for the energy prediction of a welding process is shown in Figure 18.

Figure 18. — BN for energy prediction of welding process using physics-based models [16].

For illustration purposes, this example considered a physics-based deterministic model for the description of energy consumption in the welding process. In some cases, a physics-based model may not accurately quantify the energy consumption, due to several uncertainty sources such as model form assumptions, unknown model parameters, and numerical approximations. We refer the reader to the work of Nannapaneni and Mahadevan [48] for more details regarding the representation and quantification of model uncertainty sources. Section 2 discussed three ways to construct a BN – physics models, data-driven, or their combination. This example illustrates the BN construction using available physics-based models. The parameters (L, t, e, H, C_p, T_i, and T_f) are assumed to be known while the parameters (l, g, and h) are assumed to be uncertain and quantified using Gaussian distributions whose parameters are unknown. However, prior knowledge regarding the distribution parameters are assumed available. In addition, the density of the weld material is not known precisely but prior knowledge is assumed available. The values of all the known and unknown parameters are given in Table 3. The BN of the welding process, after removing all the known variables in Figure 18 and adding nodes for the distribution parameters of l, g, and h, is given in Figure 19.

TABLE 3.

Welding process parameters.

Parameter	Value
L(mm)	500
t(mm)	15
e(mm)	11
H(kJ/kg)	270
C_p(kJ/kg.K)	0.5
T_i(K)	300
T_f(K)	1600
l	N(μ_l, σ_l)
μ _l	U(8.3,8.6)
σ _l	U(0.2,0.7)
g	N(μ_g, σ_g)
μ _g	U(1.6,2.2)
σ _g	U(0.05,0.2)
h	N(μ_h, σ_h)
μ _h	U(2.5,2.8)
σ _h	U(0.3,0.6)
ρ(kg/m³)	N(8250,10)

Open in a new tab

In Figure 19, green circular, yellow circular and blue squared represent uniform, normal, and deterministic nodes respectively. Nodes are either observed (grayed-out) or unobserved. The edge variable indicated the type of node relationship, either a deterministic/functional relationship (“x”), or a stochastic (in this case, location parameter “μ” or scale parameter “σ”). Details of the BN construction and inference regarding the welding process were discussed in [16]. To model sensor uncertainty, the Energy node is split into a deterministic node E_d (described as E above in Figure 19), and a stochastic node E_L with uncertainty that models the likelihood of our sensor data. It should be noted that the volume (V) and Energy (E) are functional nodes, i.e., given the values of the parent nodes, the values of V and E are known deterministically due to the presence of functional relationships as shown in Eqs. 8 and 9. Following the procedure for the representation V and E are represented as discrete nodes, each with two states dependent on the values of their parent nodes.

5.2. PMML Representation Parsing and Prediction

The PMML representation of the BN for welding process using the BN PMML schema described in Section 3 is given in Figure 20, where scheme elements such as MiningSchema, and DataFields are not shown, and indicated with a (…), for brevity. Note the static node mu_l, the deterministic node “E_d”, and the stochastic node “E_L” having a functional relationship with “E_d”. The PMML representation of the BN for the welding process described using the BN PMML schema can now be shared with whomever necessary for their computational requirements without rebuilding the BN. After transferring the BN PMML file of the welding process, the Python parser described in Section 4 is used to convert the PMML file into an analytical model in Python using the PyMC3 package [49]. For ease in notation, we term the environment where the BN PMML representation of the welding process is created as the training environment and the environment where this PMML file is used for prediction as testing environment following the notation in [28].

Let 100 data points be available in the testing environment on the parameters l, h, g, and E_L collected through sensors. The sensor uncertainty in the measurement of l, h and g, is assumed to be quantified using a Gaussian distribution with zero mean and standard deviation of 0.1 mm. Similarly, the sensor uncertainty in the energy measurement is quantified using a Gaussian distribution with zero mean and a standard deviation of 1 kJ. A Gaussian distribution with zero mean is used to model the sensor uncertainty as both the positive and negative values occur with equal probability due to the symmetric nature of a Gaussian distribution. This allows E_L itself to be modeled as a Gaussian, namely, E_L ~ N(E_d, 1). The objective is to estimate the distribution parameters μ, σ (mean and standard deviation) for dimension variables l.h and g, along with our belief of the weld filler material density (ρ); this is performed using the Bayes’ theorem as

f (μ_{l}, σ_{l}, μ_{h}, σ_{l}, μ_{g}, σ_{g}, ρ | l_{o b s}, h_{o b s}, g_{o b s}, E_{o b s}) \propto f (l_{o b s}, h_{o b s}, g_{o b s}, E_{o b s} | μ_{l}, σ_{l}, μ_{h}, σ_{l}, μ_{g}, σ_{g}, ρ) f (μ_{l}, σ_{l}, μ_{h}, σ_{l}, μ_{g}, σ_{g}, ρ)

(9)

The No-U-Turn Sampling algorithm (NUTS), which uses adaptive Hamiltonian Monte Carlo for efficient sampling [54], is used for inference; the prior and posterior distributions along with the true value used to generate synthetic data, are shown in Figure 21. A total of 80,000 samples were generated using the NUTS algorithm and the last 40,000 are used for constructing the posterior distributions. The first 40,000 samples are ignored (burn-in) to account for the convergence of Markov chain of samples [41]. It can be observed that the variance of posterior distributions of the dimensional parameters (l, g, h) have reduced due to the observation data. The variance reduction density (ρ) is not significant; this can be attributed to the insensitivity of the observed parameters to the energy consumption as illustrated in our previous work using variance-based global sensitivity analysis [16].

Figure 21. — Comparison of the prior and posterior distributions for the inference parameters (a) mean of parameter ‘l’, (b) standard deviation of ‘l’, (c) mean of parameter ‘h’, (d) standard deviation of parameter ‘h’, (e) mean of parameter ‘e’, (f) standard deviation of parameter ‘e’, and (g) density.

6. Conclusion and Future Work

This paper presents the Predictive Model Markup Language (PMML) representation of a generic Bayesian network (BN), which may contain discrete (categorical) variables, continuous variables, or their combination. BNs are probabilistic acyclic graphical models, which have been studied for a variety of applications such as uncertainty quantification, design optimization under uncertainty, risk analysis, and quality control. BNs represent a joint probability distribution over a set of random variables through a combination of marginal and conditional distributions. The BN PMML schema accommodates both discrete variables and continuous variables described by Normal (Gaussian), Lognormal, Uniform, and Triangular distributions. The availability of such a PMML schema helps the exchange of BN models across the PMML compliant software platforms; this enables industry practitioners to use the models directly without rebuilding their own, and hence saving valuable time and computational resources. PMML schema-based models are descriptive in nature; i.e., they cannot directly be used for prediction or inference. Therefore, a parser was developed that allows converting a descriptive BN PMML model into an analytical model in Python using the PyMC3 and NetworkX packages. The analytical model is then used to perform predictions using Markov Chain Monte Carlo methods. In addition, the developed parser also enables the conversion of an analytical model in Python into its corresponding PMML representation. Thus, the parser enables conversion in both directions: from descriptive PMML platform into an analytical model in Python and vice versa. In this paper, we demonstrate the developed methods for PMML representation and parsing for a BN related to a welding process.

Future work should accommodate other parametric distribution types, such as Beta, Multinomial, Dirichlet, and Exponential, and non-parametric distributions, such as kernel density estimations to represent both discrete and continuous variables. In this work, a functional node is approximated using a Gaussian conditional probability distribution with the mean equal to the functional relationship and a small standard deviation, which is in the order of the machine precision. This approach creates errors in the computation; therefore, future work should consider effective representation of functional (deterministic) nodes in the BN PMML schema. In addition, PMML schema for variants of BNs such as dynamic BNs and hierarchical BNs need to be considered. Future work will also be aimed at adding support for generic BNs in the Portable Format for Analytics (PFA) standard. Since PFA offers standard ways to procedurally specify data manipulation algorithms, it offers the possibility of describing advanced inference algorithms within the standard model in a tool independent way, which is currently not possible with PMML.

Figure 13. — Schema of a *Mean* element.

TABLE 2.

Conditional probability table of a functional variable.

	$Π_{X} = Π_{X}^{*}$
$Pr (X = g (Π_{X}^{}) \| Π_{X} = Π_{X}^{})$	1
$Pr (X \neq g (Π_{X}^{}) \| Π_{X} = Π_{X}^{})$	0

Open in a new tab

Acknowledgements

The research reported in this paper is supported by funds from the National Institute of Standards and Technology (NIST) under cooperative agreements with Vanderbilt University (Grant No. 70NANB16H297) and University of Maryland (Grant No. 70NANB14H250), and by the NIST Foreign Guest Researcher Program. The support is gratefully acknowledged.

Disclaimer

Certain commercial systems are identified in this paper. Such identification does not imply recommendation or endorsement by NIST; nor does it imply that the products identified are necessarily the best available for the purpose. Further, any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NIST or any other supporting U.S. government or corporate organizations.

*

https://github.com/usnistgov/pmml_pymcBN

References

[1].Dubey R, Gunasekaran A, Childe SJ, Wamba SF, and Papadopoulos T, 2016, “The Impact of Big Data on World-Class Sustainable Manufacturing,” Int. J. Adv. Manuf. Technol, 84(1–4), pp. 631–645. [Google Scholar]
[2].Kang HS, Lee JY, Choi S, Kim H, Park JH, Son JY, Kim BH, and Noh, S. Do, 2016, “Smart Manufacturing: Past Research, Present Findings, and Future Directions,” Int. J. Precis. Eng. Manuf. - Green Technol, 3(1), pp. 111–128. [Google Scholar]
[3].Asiltürk İ, and Çunkaş M, 2011, “Modeling and Prediction of Surface Roughness in Turning Operations Using Artificial Neural Network and Multiple Regression Method,” Expert Syst. Appl, 38(5), pp. 5826–5832. [Google Scholar]
[4].Ak R, Helu MM, and Rachuri S, 2015, “Ensemble Neural Network Model for Predicting the Energy Consumption of a Milling Machine,” 20th Design for Manufacturing and the Life Cycle Conference, Boston, MA. [Google Scholar]
[5].Lee J, Wu F, Zhao W, Ghaffari M, Liao L, and Siegel D, 2014, “Prognostics and Health Management Design for Rotary Machinery Systems - Reviews, Methodology and Applications,” Mech. Syst. Signal Process, 42(1–2), pp. 314–334. [Google Scholar]
[6].Zhao R, Wang D, Yan R, Mao K, Shen F, and Wang J, 2017, “Machine Health Monitoring Using Local Feature-Based Gated Recurrent Unit Networks,” IEEE Trans. Ind. Electron, 65(2), pp. 1539–1548. [Google Scholar]
[7].Jardine AKS, Lin D, and Banjevic D, 2006, “A Review on Machinery Diagnostics and Prognostics Implementing Condition-Based Maintenance,” Mech. Syst. Signal Process, 20(7), pp. 1483–1510. [Google Scholar]
[8].Jayal AD, Badurdeen F, Dillon OW, and Jawahir IS, 2010, “Sustainable Manufacturing: Modeling and Optimization Challenges at the Product, Process and System Levels,” CIRP J. Manuf. Sci. Technol, 2(3), pp. 144–152. [Google Scholar]
[9].Nannapaneni S, and Mahadevan S, 2016, “Manufacturing Process Evaluation Under Uncertainty: A Hierarchical Bayesian Network Approach,” Proceedings of the ASME 2016 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, p. V01BT02A026–V01BT02A026. [Google Scholar]
[10].Bhinge R, Biswas N, Dornfeld D, Park J, Law KH, Helu M, and Rachuri S, 2014, “An Intelligent Machine Monitoring System for Energy Prediction Using a Gaussian Process Regression,” 2014 IEEE International Conference on Big Data (Big Data), pp. 978–986. [Google Scholar]
[11].Guazzelli A, Stathatos K, and Zeller M, 2009, “Efficient Deployment of Predictive Analytics through Open Standards and Cloud Computing,” ACM SIGKDD Explor. Newsl, 11(1), p. 32. [Google Scholar]
[12].Guazzelli A, Zeller M, Lin W, and Williams G, 2009, “PMML: An Open Standard for Sharing Models,” R J, 1(1), pp. 60–65. [Google Scholar]
[13].Pivarski J, Bennett C, and Grossman RL, 2016, “Deploying Analytics with the Portable Format for Analytics (PFA),” Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ‘16, pp. 579–588. [Google Scholar]
[14].Bray T, Paoli J, Sperberg-McQueen CM, Maler E, and Yergeau F, 1997, “Extensible Markup Language (XML),” World Wide Web J, 2(4), pp. 27–66. [Google Scholar]
[15].“The Data Mining Group (DMG)” [Online]. Available: http://dmg.org. [Accessed: 10-Jul-2018]. [Google Scholar]
[16].Nannapaneni S, Mahadevan S, and Rachuri S, 2016, “Performance Evaluation of a Manufacturing Process under Uncertainty Using Bayesian Networks,” J. Clean. Prod, 113, pp. 947–959. [Google Scholar]
[17].Correa M, Bielza C, Ramirez MDJ, and Alique JR, 2008, “A Bayesian Network Model for Surface Roughness Prediction in the Machining Process,” Int. J. Syst. Sci, 39(12), pp. 1181–1192. [Google Scholar]
[18].Tobon-Mejia DA, Medjaher K, and Zerhouni N, 2012, “CNC Machine Tool’s Wear Diagnostic and Prognostic by Using Dynamic Bayesian Networks,” Mech. Syst. Signal Process, 28, pp. 167–182. [Google Scholar]
[19].Nannapaneni S, Mahadevan S, and Dubey A, 2018, “Real-Time Control of a Cyber-Physical Manufacturing Process under Uncertainty,” ASME 2018 International Manufacturing Science and Engineering Conference, College Station, Texas, USA. [Google Scholar]
[20].Kurz D, Kaspar J, and Pilz J, 2011, “Dynamic Maintenance in Semiconductor Manufacturing Using Bayesian Networks,” Automation Science and Engineering (CASE), 2011 IEEE Conference on, pp. 238–243. [Google Scholar]
[21].Chan A, and McNaught KR, 2008, “Using Bayesian Networks to Improve Fault Diagnosis during Manufacturing Tests of Mobile Telephone Infrastructure,” Journal of the Operational Research Society, pp. 423–430. [Google Scholar]
[22].Masruroh NA, and Poh KL, 2007, “A Bayesian Network Approach to Job-Shop Rescheduling,” IEEM 2007: 2007 IEEE International Conference on Industrial Engineering and Engineering Management, pp. 1098–1102. [Google Scholar]
[23].Kao H-Y, Huang C-H, and Li H-L, 2005, “Supply Chain Diagnostics with Dynamic Bayesian Networks,” Comput. Ind. Eng, 49(2), pp. 339–347. [Google Scholar]
[24].Zhu JY, and Deshmukh A, 2003, “Application of Bayesian Decision Networks to Life Cycle Engineering in Green Design and Manufacturing,” Eng. Appl. Artif. Intell, 16(2 SPEC.), pp. 91–103. [Google Scholar]
[25].“Predictive Model Markup Language (PMML) v4.2.1” [Online]. Available: http://dmg.org/pmml/v4-2-1/GeneralStructure.html. [Accessed: 10-Jul-2018]. [DOI] [PMC free article] [PubMed] [Google Scholar]
[26].Friedman N, Geiger D, and Goldszmidt M, 1997, “Bayesian Network Classifiers,” Mach. Learn, 29(2–3), pp. 131–163. [Google Scholar]
[27].“PMML 4.3 - General Structure” [Online]. Available: http://dmg.org/pmml/v4-3/GeneralStructure.html. [Accessed: 10-Jul-2018]. [Google Scholar]
[28].Park J, Lechevalier D, Ak R, Ferguson M, Law K, Lee Y, and Rachuri S, 2017, “Gaussian Process Regression (GPR) Representation in Predictive Model Markup Language (PMML),” Smart Sustain. Manuf. Syst, 1(1), pp. 121–141. [DOI] [PMC free article] [PubMed] [Google Scholar]
[29].Nannapaneni S, Mahadevan S, Dubey A, Lechevalier D, Narayanan A, and Rachuri S, 2017, “Automated Uncertainty Quantification through Information Fusion in Manufacturing Processes,” Smart Sustain. Manuf. Syst, 1(1), pp. 153–177. [Google Scholar]
[30].Scutari M, 2010, “Learning Bayesian Networks with the Bnlearn R Package,” J. Stat. Softw, 35(3), pp. 1–22.21603108 [Google Scholar]
[31].Bartram G, and Mahadevan S, 2014, “Integration of Heterogeneous Information in SHM Models,” Struct. Control Heal. Monit, 21(3), pp. 403–422. [Google Scholar]
[32].Neapolitan RE, 2004, Learning Bayesian Networks, Pearson Prentice Hall. [Google Scholar]
[33].Spirtes P, Glymour C, and Scheines R, 1993, Causation, Prediction, and Search, Springer-Verlag, New York. [Google Scholar]
[34].Margaritis D, 2003, “Learning Bayesian Network Model Structure from Data,” Carnegie Mellon University. [Google Scholar]
[35].Tsamardinos I, Aliferis CF, and Statnikov A, 2003, “Algorithms for Large Scale Markov Blanket Discovery,” The 16th International Florida Artificial Intelligence Research Society Conference, American Associ ation for Artificial Intelligence, St. Augustine, Florida, pp. 376–380. [Google Scholar]
[36].Yaramakala S, and Margaritis D, 2005, “Speculative Markov Blanket Discovery for Optimal Feature Selection,” Proceedings - IEEE International Conference on Data Mining, ICDM, pp. 809–812. [Google Scholar]
[37].Tsamardinos I, Brown LE, and Aliferis CF, 2006, “The Max-Min Hill-Climbing Bayesian Network Structure Learning Algorithm,” Mach. Learn, 65(1), pp. 31–78. [Google Scholar]
[38].Campos C. P. de, Tong Y, and Ji Q, 2008, “Constrained Maximum Likelihood Learning of Bayesian Networks for Facial Action Recognition,” Computer Vision – ECCV 2008, Springer, Berlin, Heidelberg, pp. 168–181. [Google Scholar]
[39].Ling Y, and Mahadevan S, 2013, “Quantitative Model Validation Techniques: New Insights,” Reliab. Eng. Syst. Saf, 111, pp. 217–231. [Google Scholar]
[40].Simonsson I, and Mostad P, 2016, “Exact Inference on Conditional Linear Γ-Gaussian Bayesian Networks,” Proceedings of the Eighth International Conference on Probabilistic Graphical Models, pp. 474–486. [Google Scholar]
[41].Hastings WK, 1970, “Monte Carlo Sampling Methods Using Markov Chains and Their Applications,” Biometrika, 57(1), pp. 97–109. [Google Scholar]
[42].Fox CW, and Roberts SJ, 2012, “A Tutorial on Variational Bayesian Inference,” Artif. Intell. Rev, 38(2), pp. 85–95. [Google Scholar]
[43].Smith AFM, and Gelfand AE, 1992, “Bayesian Statistics without Tears: A Sampling-Resampling Perspective,” Am. Stat, 46(2), pp. 84–88. [Google Scholar]
[44].Csillery K, Blum M, Gaggiotti O, and Francois O, 2010, “Approximate Bayesian Computation (ABC) in Practice,” Trends Ecol. Evol, 25(7), pp. 410–418. [DOI] [PubMed] [Google Scholar]
[45].Li C, and Sankaran M, 2018, “Efficient Approximate Inference in Bayesian Networks with Continuous Variables,” Reliab. Eng. Syst. Saf, 169, pp. 269–280. [Google Scholar]
[46].“W3C XML Schema Definition Language (XSD) 1.1 Part 1: Structures” [Online]. Available: https://www.w3.org/TR/xmlschema11-1/. [Accessed: 10-Jul-2018].
[47].Pechter R, 2009, “What’s PMML and What’s New in PMML 4.0?,” ACM SIGKDD Explor. Newsl, 11(1), p. 19. [Google Scholar]
[48].Nannapaneni S, and Mahadevan S, 2016, “Reliability Analysis under Epistemic Uncertainty,” Reliab. Eng. Syst. Saf, 155, pp. 9–20. [Google Scholar]
[49].Salvatier J, Wiecki TV, and Fonnesbeck C, 2016, “Probabilistic Programming in Python Using PyMC3,” PeerJ Comput. Sci, 2, p. e55. [DOI] [PMC free article] [PubMed] [Google Scholar]
[50].Hagberg AA, Schult DA, and Swart PJ, 2008, “Exploring Network Structure, Dynamics, and Function Using NetworkX,” Proceedings of the 7th Python in Science Conference, pp. 11–16. [Google Scholar]
[51].Meurer A, Smith CP, Paprocki M, Čertík O, Kirpichev SB, Rocklin M, Kumar Am., Ivanov S, Moore JK, Singh S, Rathnayake T, Vig S, Granger BE, Muller RP, Bonazzi F, Gupta H, Vats S, Johansson F, Pedregosa F, Curry MJ, Terrel AR, Roučka Š, Saboo A, Fernando I, Kulal S, Cimrman R, and Scopatz A, 2017, “SymPy: Symbolic Computing in Python,” PeerJ Comput. Sci, 3, p. e103. [Google Scholar]
[52].Weman K, 2012, Welding Processes Handbook, Woodhead Publishing. [Google Scholar]
[53].Mani M, Madan J, Lee JH, Lyons KW, and Gupta SK, 2014, “Sustainability Characterisation for Manufacturing Processes,” Int. J. Prod. Res, 52(20), pp. 5895–5912. [Google Scholar]
[54].Hoffman MD, and Gelman A, “The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo,” J. Mach. Learn. Res, 15, pp. 1593–1623. [Google Scholar]

[R1] [1].Dubey R, Gunasekaran A, Childe SJ, Wamba SF, and Papadopoulos T, 2016, “The Impact of Big Data on World-Class Sustainable Manufacturing,” Int. J. Adv. Manuf. Technol, 84(1–4), pp. 631–645. [Google Scholar]

[R2] [2].Kang HS, Lee JY, Choi S, Kim H, Park JH, Son JY, Kim BH, and Noh, S. Do, 2016, “Smart Manufacturing: Past Research, Present Findings, and Future Directions,” Int. J. Precis. Eng. Manuf. - Green Technol, 3(1), pp. 111–128. [Google Scholar]

[R3] [3].Asiltürk İ, and Çunkaş M, 2011, “Modeling and Prediction of Surface Roughness in Turning Operations Using Artificial Neural Network and Multiple Regression Method,” Expert Syst. Appl, 38(5), pp. 5826–5832. [Google Scholar]

[R4] [4].Ak R, Helu MM, and Rachuri S, 2015, “Ensemble Neural Network Model for Predicting the Energy Consumption of a Milling Machine,” 20th Design for Manufacturing and the Life Cycle Conference, Boston, MA. [Google Scholar]

[R5] [5].Lee J, Wu F, Zhao W, Ghaffari M, Liao L, and Siegel D, 2014, “Prognostics and Health Management Design for Rotary Machinery Systems - Reviews, Methodology and Applications,” Mech. Syst. Signal Process, 42(1–2), pp. 314–334. [Google Scholar]

[R6] [6].Zhao R, Wang D, Yan R, Mao K, Shen F, and Wang J, 2017, “Machine Health Monitoring Using Local Feature-Based Gated Recurrent Unit Networks,” IEEE Trans. Ind. Electron, 65(2), pp. 1539–1548. [Google Scholar]

[R7] [7].Jardine AKS, Lin D, and Banjevic D, 2006, “A Review on Machinery Diagnostics and Prognostics Implementing Condition-Based Maintenance,” Mech. Syst. Signal Process, 20(7), pp. 1483–1510. [Google Scholar]

[R8] [8].Jayal AD, Badurdeen F, Dillon OW, and Jawahir IS, 2010, “Sustainable Manufacturing: Modeling and Optimization Challenges at the Product, Process and System Levels,” CIRP J. Manuf. Sci. Technol, 2(3), pp. 144–152. [Google Scholar]

[R9] [9].Nannapaneni S, and Mahadevan S, 2016, “Manufacturing Process Evaluation Under Uncertainty: A Hierarchical Bayesian Network Approach,” Proceedings of the ASME 2016 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, p. V01BT02A026–V01BT02A026. [Google Scholar]

[R10] [10].Bhinge R, Biswas N, Dornfeld D, Park J, Law KH, Helu M, and Rachuri S, 2014, “An Intelligent Machine Monitoring System for Energy Prediction Using a Gaussian Process Regression,” 2014 IEEE International Conference on Big Data (Big Data), pp. 978–986. [Google Scholar]

[R11] [11].Guazzelli A, Stathatos K, and Zeller M, 2009, “Efficient Deployment of Predictive Analytics through Open Standards and Cloud Computing,” ACM SIGKDD Explor. Newsl, 11(1), p. 32. [Google Scholar]

[R12] [12].Guazzelli A, Zeller M, Lin W, and Williams G, 2009, “PMML: An Open Standard for Sharing Models,” R J, 1(1), pp. 60–65. [Google Scholar]

[R13] [13].Pivarski J, Bennett C, and Grossman RL, 2016, “Deploying Analytics with the Portable Format for Analytics (PFA),” Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ‘16, pp. 579–588. [Google Scholar]

[R14] [14].Bray T, Paoli J, Sperberg-McQueen CM, Maler E, and Yergeau F, 1997, “Extensible Markup Language (XML),” World Wide Web J, 2(4), pp. 27–66. [Google Scholar]

[R15] [15].“The Data Mining Group (DMG)” [Online]. Available: http://dmg.org. [Accessed: 10-Jul-2018]. [Google Scholar]

[R16] [16].Nannapaneni S, Mahadevan S, and Rachuri S, 2016, “Performance Evaluation of a Manufacturing Process under Uncertainty Using Bayesian Networks,” J. Clean. Prod, 113, pp. 947–959. [Google Scholar]

[R17] [17].Correa M, Bielza C, Ramirez MDJ, and Alique JR, 2008, “A Bayesian Network Model for Surface Roughness Prediction in the Machining Process,” Int. J. Syst. Sci, 39(12), pp. 1181–1192. [Google Scholar]

[R18] [18].Tobon-Mejia DA, Medjaher K, and Zerhouni N, 2012, “CNC Machine Tool’s Wear Diagnostic and Prognostic by Using Dynamic Bayesian Networks,” Mech. Syst. Signal Process, 28, pp. 167–182. [Google Scholar]

[R19] [19].Nannapaneni S, Mahadevan S, and Dubey A, 2018, “Real-Time Control of a Cyber-Physical Manufacturing Process under Uncertainty,” ASME 2018 International Manufacturing Science and Engineering Conference, College Station, Texas, USA. [Google Scholar]

[R20] [20].Kurz D, Kaspar J, and Pilz J, 2011, “Dynamic Maintenance in Semiconductor Manufacturing Using Bayesian Networks,” Automation Science and Engineering (CASE), 2011 IEEE Conference on, pp. 238–243. [Google Scholar]

[R21] [21].Chan A, and McNaught KR, 2008, “Using Bayesian Networks to Improve Fault Diagnosis during Manufacturing Tests of Mobile Telephone Infrastructure,” Journal of the Operational Research Society, pp. 423–430. [Google Scholar]

[R22] [22].Masruroh NA, and Poh KL, 2007, “A Bayesian Network Approach to Job-Shop Rescheduling,” IEEM 2007: 2007 IEEE International Conference on Industrial Engineering and Engineering Management, pp. 1098–1102. [Google Scholar]

[R23] [23].Kao H-Y, Huang C-H, and Li H-L, 2005, “Supply Chain Diagnostics with Dynamic Bayesian Networks,” Comput. Ind. Eng, 49(2), pp. 339–347. [Google Scholar]

[R24] [24].Zhu JY, and Deshmukh A, 2003, “Application of Bayesian Decision Networks to Life Cycle Engineering in Green Design and Manufacturing,” Eng. Appl. Artif. Intell, 16(2 SPEC.), pp. 91–103. [Google Scholar]

[R25] [25].“Predictive Model Markup Language (PMML) v4.2.1” [Online]. Available: http://dmg.org/pmml/v4-2-1/GeneralStructure.html. [Accessed: 10-Jul-2018]. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R26] [26].Friedman N, Geiger D, and Goldszmidt M, 1997, “Bayesian Network Classifiers,” Mach. Learn, 29(2–3), pp. 131–163. [Google Scholar]

[R27] [27].“PMML 4.3 - General Structure” [Online]. Available: http://dmg.org/pmml/v4-3/GeneralStructure.html. [Accessed: 10-Jul-2018]. [Google Scholar]

[R28] [28].Park J, Lechevalier D, Ak R, Ferguson M, Law K, Lee Y, and Rachuri S, 2017, “Gaussian Process Regression (GPR) Representation in Predictive Model Markup Language (PMML),” Smart Sustain. Manuf. Syst, 1(1), pp. 121–141. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R29] [29].Nannapaneni S, Mahadevan S, Dubey A, Lechevalier D, Narayanan A, and Rachuri S, 2017, “Automated Uncertainty Quantification through Information Fusion in Manufacturing Processes,” Smart Sustain. Manuf. Syst, 1(1), pp. 153–177. [Google Scholar]

[R30] [30].Scutari M, 2010, “Learning Bayesian Networks with the Bnlearn R Package,” J. Stat. Softw, 35(3), pp. 1–22.21603108 [Google Scholar]

[R31] [31].Bartram G, and Mahadevan S, 2014, “Integration of Heterogeneous Information in SHM Models,” Struct. Control Heal. Monit, 21(3), pp. 403–422. [Google Scholar]

[R32] [32].Neapolitan RE, 2004, Learning Bayesian Networks, Pearson Prentice Hall. [Google Scholar]

[R33] [33].Spirtes P, Glymour C, and Scheines R, 1993, Causation, Prediction, and Search, Springer-Verlag, New York. [Google Scholar]

[R34] [34].Margaritis D, 2003, “Learning Bayesian Network Model Structure from Data,” Carnegie Mellon University. [Google Scholar]

[R35] [35].Tsamardinos I, Aliferis CF, and Statnikov A, 2003, “Algorithms for Large Scale Markov Blanket Discovery,” The 16th International Florida Artificial Intelligence Research Society Conference, American Associ ation for Artificial Intelligence, St. Augustine, Florida, pp. 376–380. [Google Scholar]

[R36] [36].Yaramakala S, and Margaritis D, 2005, “Speculative Markov Blanket Discovery for Optimal Feature Selection,” Proceedings - IEEE International Conference on Data Mining, ICDM, pp. 809–812. [Google Scholar]

[R37] [37].Tsamardinos I, Brown LE, and Aliferis CF, 2006, “The Max-Min Hill-Climbing Bayesian Network Structure Learning Algorithm,” Mach. Learn, 65(1), pp. 31–78. [Google Scholar]

[R38] [38].Campos C. P. de, Tong Y, and Ji Q, 2008, “Constrained Maximum Likelihood Learning of Bayesian Networks for Facial Action Recognition,” Computer Vision – ECCV 2008, Springer, Berlin, Heidelberg, pp. 168–181. [Google Scholar]

[R39] [39].Ling Y, and Mahadevan S, 2013, “Quantitative Model Validation Techniques: New Insights,” Reliab. Eng. Syst. Saf, 111, pp. 217–231. [Google Scholar]

[R40] [40].Simonsson I, and Mostad P, 2016, “Exact Inference on Conditional Linear Γ-Gaussian Bayesian Networks,” Proceedings of the Eighth International Conference on Probabilistic Graphical Models, pp. 474–486. [Google Scholar]

[R41] [41].Hastings WK, 1970, “Monte Carlo Sampling Methods Using Markov Chains and Their Applications,” Biometrika, 57(1), pp. 97–109. [Google Scholar]

[R42] [42].Fox CW, and Roberts SJ, 2012, “A Tutorial on Variational Bayesian Inference,” Artif. Intell. Rev, 38(2), pp. 85–95. [Google Scholar]

[R43] [43].Smith AFM, and Gelfand AE, 1992, “Bayesian Statistics without Tears: A Sampling-Resampling Perspective,” Am. Stat, 46(2), pp. 84–88. [Google Scholar]

[R44] [44].Csillery K, Blum M, Gaggiotti O, and Francois O, 2010, “Approximate Bayesian Computation (ABC) in Practice,” Trends Ecol. Evol, 25(7), pp. 410–418. [DOI] [PubMed] [Google Scholar]

[R45] [45].Li C, and Sankaran M, 2018, “Efficient Approximate Inference in Bayesian Networks with Continuous Variables,” Reliab. Eng. Syst. Saf, 169, pp. 269–280. [Google Scholar]

[R46] [46].“W3C XML Schema Definition Language (XSD) 1.1 Part 1: Structures” [Online]. Available: https://www.w3.org/TR/xmlschema11-1/. [Accessed: 10-Jul-2018].

[R47] [47].Pechter R, 2009, “What’s PMML and What’s New in PMML 4.0?,” ACM SIGKDD Explor. Newsl, 11(1), p. 19. [Google Scholar]

[R48] [48].Nannapaneni S, and Mahadevan S, 2016, “Reliability Analysis under Epistemic Uncertainty,” Reliab. Eng. Syst. Saf, 155, pp. 9–20. [Google Scholar]

[R49] [49].Salvatier J, Wiecki TV, and Fonnesbeck C, 2016, “Probabilistic Programming in Python Using PyMC3,” PeerJ Comput. Sci, 2, p. e55. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R50] [50].Hagberg AA, Schult DA, and Swart PJ, 2008, “Exploring Network Structure, Dynamics, and Function Using NetworkX,” Proceedings of the 7th Python in Science Conference, pp. 11–16. [Google Scholar]

[R51] [51].Meurer A, Smith CP, Paprocki M, Čertík O, Kirpichev SB, Rocklin M, Kumar Am., Ivanov S, Moore JK, Singh S, Rathnayake T, Vig S, Granger BE, Muller RP, Bonazzi F, Gupta H, Vats S, Johansson F, Pedregosa F, Curry MJ, Terrel AR, Roučka Š, Saboo A, Fernando I, Kulal S, Cimrman R, and Scopatz A, 2017, “SymPy: Symbolic Computing in Python,” PeerJ Comput. Sci, 3, p. e103. [Google Scholar]

[R52] [52].Weman K, 2012, Welding Processes Handbook, Woodhead Publishing. [Google Scholar]

[R53] [53].Mani M, Madan J, Lee JH, Lyons KW, and Gupta SK, 2014, “Sustainability Characterisation for Manufacturing Processes,” Int. J. Prod. Res, 52(20), pp. 5895–5912. [Google Scholar]

[R54] [54].Hoffman MD, and Gelman A, “The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo,” J. Mach. Learn. Res, 15, pp. 1593–1623. [Google Scholar]

PERMALINK

Predictive Model Markup Language (PMML) Representation of Bayesian Networks: An Application in Manufacturing

Saideep Nannapaneni

Anantha Narayanan

Ronay Ak

David Lechevalier

Rachael Sexton

Sankaran Mahadevan

Yung-Tsun Tina Lee

Abstract

1. Introduction

2. Theory of Bayesian Networks

Figure 1:

2.1. Bayesian network construction

2.2. Uncertainty propagation and Bayesian inference

Figure 2.

3. Bayesian Network – PMML Schema

Figure 3.

Figure 4.

3.1. DiscreteNode element

Figure 5.

Figure 6.

Figure 7.

Figure 8.

TABLE 1.

3.2. ContinuousNode element

Figure 9.

Figure 10.

Figure 11.

Figure 12.

Figure 14.

3.3. Representation of functional relationships between variables

4. BN PMML Parser

Figure 15.

Figure 16.

5. Case study: Welding process

5.1. Process description

Figure 17.

Figure 18.

TABLE 3.

Figure 19.

5.2. PMML Representation Parsing and Prediction

Figure 20.

Figure 21.

6. Conclusion and Future Work

Figure 13.

TABLE 2.

Acknowledgements

*

References

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases