Multi-factor association dependence modelling for project risk analysis

Byung-Cheol Kim

doi:10.1016/j.mex.2021.101443

. 2021 Jul 5;8:101443. doi: 10.1016/j.mex.2021.101443

Multi-factor association dependence modelling for project risk analysis

Byung-Cheol Kim ^1,^⁎

PMCID: PMC8374643 PMID: 34430332

Abstract

This article presents an analytic (non-simulation) dependence model for quantitative project risk analysis. The multi-factor association model (MFAM) accounts for multiple association factors in typical project settings and provides a closed-form solution to a complete and mathematically consistent correlation matrix. Given standardized, ubiquitous project plans (e.g., work breakdown structure, resource allocation, or risk register), the MFAM establishes a hierarchical tree of association factors, which is subsequently encoded into an analytic model for quantitative risk analysis. In this article, we present the MFAM programmed in Microsoft Excel and demonstrate the computational efficiency of the MFAM using two alternative schedules with distinct resource utilizations. With the enhanced robustness in dealing with multiple risk factors in a project and the computational efficiency from the non-simulation second-moment approach, the MFAM offers additional flexibility for and scalability to large-scale project risk analysis problems. The modelling procedures and solutions presented in this article highlight three potentials of the MFAM as a robust quantitative risk analysis tool.

• The MFAM can be fully programmed in Microsoft Excel using only basic cell functions.

• Under mild assumptions, the MFAM provides reliable risk estimates comparable to Monte Carlo simulation.

• The MFAM is scalable to high-dimensional risk problems (i.e., with multi-thousands or more) with an affordable computational burden.

Keywords: Project risk analysis, Multi-factor dependence model, Structured association, Second-moment approach

Graphical abstract

Specifications table

Subject area	Engineering
More specific subject area	Project risk analysis
Method name	Multi-Factor Association Model (MFAM)
Name and reference of original method	EJOR-D-20-01833R1“Multi-factor Dependence Modelling with Specified Marginals and Structured Association in Large-scale Project Risk Assessment”, European Journal of Operational Research.
Resource availability	Microsoft Excel.

Open in a new tab

*Method details

Outline

This article presents the multi-factor association model (MFAM) for large-scale project risk analysis [3]. The MFAM is an analytic (non-simulation) model that generates a mathematically consistent correlation matrix given marginal distributions and auxiliary risk factors that induce inter-variable associations between the marginals.

In practice, the MFAM can be used to enhance the realism of dependence modelling and risk analysis in three notable ways.

•
Multi-factors. The MFAM enhances the flexibility of dependence modelling by accounting for multiple risk factors commonly observed in real-life project settings (e.g., weather, work types, resources, and risk register).
•
Scalability. The MFAM provides a parameterized, closed-form solution to the correlation matrix generation problem, which is scalable to large-scale projects according to the data available in individual projects.
•
Computational tractability. The MFAM adopts an analytic approach (i.e., the second-moment approach) and provides a computationally tractable algorithm to generate a mathematically consistent correlation matrix.

This article highlights the modelling details and applications of the MFAM in typical project risk analysis problems. Extended applications of the KFAM in more general settings are straightforward.

Multi-factor association modelling

Dependence modelling for risk analysis can be carried out in two ways: the correlation-driven approach and the factor-driven approach [3]. The correlation-driven approach builds a correlation matrix from all pairwise correlation coefficients among the dependent variables in a system. In contrast, the factor-driven approaches introduce auxiliary variables as a proxy of the pairwise association among relevant random variables [1,5].

The primary advantages of the factor-driven approaches are twofold. First, it is often more intuitive to elicit inter-variable associations in terms of the common factors that may influence two or more random variables. Second, a correlation matrix generated by a factor-driven approach is guaranteed to be mathematically consistent [3].

Specifically, the MFAM can be implemented following the four steps in Fig. 1.

Four steps of the MFAM Analysis:

1.
Specify marginal variables. Primary random variables of an uncertain system are identified and specified in terms of probabilistic estimates. To this end, the uniform, normal, triangular, and PERT distributions are commonly used.
2.
Specify association factors. Association factors (AFs), also called common causes, are identified and specified as independent random variables.
3.
Load AFs to marginals. The relationships between the marginal variables and the risk factors are specified.
4.
Generate a correlation matrix. Once a vector of factored random variables is established with a known mathematically consistent correlation matrix, the risk model can be analyzed as a fully-specified joint distribution. Then the results can be visualized in terms of the risk profiles (i.e., histograms and cumulative probability distributions).

MFAM algorithm

The MFAM adopts a multiplicative factor approach in which factor-driven association is generated by multiplying all relevant AFs to individual marginals.

{X_{i}} = {R_{i} b_{i}}, R_{i} = \prod_{k \in θ_{i}} r_{i k}

where i =1,…,d; R_i is an aggregate factor defined as a product of the AFs (r_ik) linked to a variable b_i, and; θ_i is a set of AFs associated with b_i. Since the marginals of X are constructed from b multiplied by the relevant AFs, the former is referred to as ‘factored’ variables, whereas the latter is referred to as ‘base’ variables hereinafter.

This section presents the generic algorithm for the MFAM programming. The key notations used in the derivation are summarized below.

•
X = (X₁,…, X_d)^T: a d × 1 vector of factored variables.
•
b = (b₁,…, b_d)^T: a d × 1 vector of base variables.
•
r = (r₁,…, r_K)^T: a K × 1 vector of AFs.
•
θ_i: a set of AFs associated with b_i.
•
Ψ = [d, K]: a d × K allocation matrix (AM) of the K association factors loaded onto d base variables.
•
$R_{i} = \prod_{k \in θ_{i}} r_{i k}$ : an aggregate factor as a product of the AFs associated with b_i.
•
E[•]: the expectation operator that returns the mean value of a variable.
•
V[•]: an operator that returns the variance of a variable.
•
Sd[•]: an operator that returns the standard deviation of a variable.

A generic algorithm for the MFA correlation computation is presented below.

MFAM algorithm[3]:

Module.1 Read input parameters: b, r, and Ψ.

§1.1 Read the base variables, b = (b₁,…, b_d)^T: the means (

E_{b}

) and variances (

V_{b}

§1.2 Read the association factors, r = (r₁,…, r_K)^T: the means (

E_{r}

) and variances (

V_{r}

§1.3 Read the allocation matrix, Ψ (d × K).

Module.2 Compute auxiliary parameter vectors

E_{r s}

and

E_{b s}

§2.1

E_{b s} = {E [b_{i}^{2}]} = V_{b} + {E_{b}}^{\circ 2}

§2.2

E_{r s} = {E [r_{k}^{2}]} = V_{r} + {E_{r}}^{\circ 2}

Module.3 Compute the means and variances of the factored variables, X = (X₁,…, X_d)^T.

§3.1

E_{X} = {(\prod {E_{r}}^{\circ ψ_{i •}}) {(E_{b})}_{i}}, i = 1, . . ., d

§3.2

V_{X} = {(\prod {E_{r s}}^{\circ ψ_{i •}}) {(E_{b s})}_{i} - (\prod {E_{r}}^{\circ 2 ψ_{i •}}) {(E_{b})}_{i}^{2}}, i = 1, . . ., d

Module.4 Generate the correlation coefficient matrix (i =1,…,d; j =1,…,d)

§4.1

Σ_{r} (i, j | b, r, Ψ) = {\begin{matrix} \frac{V [R_{i \cap j}] E [R_{i - j}] E [R_{j - i}] {(E_{b})}_{i} {(E_{b})}_{j}}{{[{(V_{X})}_{i} \times {(V_{X})}_{j}]}^{0.5}} & i f i \neq j \\ 1 & i f i = j \end{matrix}

where

V [R_{i \cap j}] = \prod {E_{r s}}^{\circ (ψ_{i •} \circ ψ_{j •})} - \prod {E_{r}}^{\circ 2 (ψ_{i •} \circ ψ_{j •})}

E [R_{i - j}] = \prod {E_{r}}^{\circ (ψ_{i •} - ψ_{i •} \circ ψ_{j •})}

E [R_{j - i}] = \prod {E_{r}}^{\circ (ψ_{j •} - ψ_{i •} \circ ψ_{j •})}

Open in a new tab

Note: θ_i∩j is the intersection of θ_i and θ_j, whereas θ_i-j (or θ_j-i) is the set difference of θ_i and θ_j (or θ_j and θ_i); ψ_i_• (ψ_j_•) is a vector of the i-th (j-th) row of Ψ; and, ° is the Hadamard operator for entry-wise product or power computation [2].

Test projects

The MFAM algorithm provides a robust solution to the problem of project risk modelling and analysis. In this paper, we consider a subproject of six work packages (WPs): wp₁, wp₂, …, wp₆. Our goal is to assess the cost uncertainty of the subproject under two resource scenarios: Plan 1 and Plan 2.

•
Plan 1 (One-crew scenario). All WPs are carried out by one crew (Crew A) along with one crane (and crane operator) for wp₂ and wp₆. As a result, the six WPs form a single path of work sequence. See Fig. 2.
•
Plan 2 (Two-crew scenario). Two crews (Crew A and Crew B) are allocated concurrently, and each crew carries out three WPs. See Fig. 2 for the details. As a result, Plan 2 offers a time-saving schedule. Yet, Plan 2 increases the workflow complexity with two parallel paths.

The two plans reflect two common challenges in project resource planning: resource leveling and schedule acceleration. That is, whenever possible the maximum resource demands in a project should be minimized simply to make things easier to manage. Yet, there are situations where hiring more people or other renewable resources (e.g., equipment) accelerates the progress and ultimately the project completion.

Now we are going to carry out a quantitative analysis of the three plans and provide the project manager with some actionable information. To start with, we observe that the cost performance of individual WPs is influenced by the production rates of the resources allocated to the WP. This notion of qualitative association induced by common work types or common resources shared by multiple performance units is defined as the structured association (SA) in Kim [3]. Using the SA approach, the two plans in Fig. 2 can be schematically represented as the association factor trees in Fig. 3.

Fig 3 — Association factor trees. r₁ (Crew A), r₂ (Crane), and r₃ (Crew B)

MFAM programming in Microsoft Excel®

The MFAM algorithm in Section 3 is programmed in Microsoft Excel, following the four-steps in Fig. 1. The resulting Excel MFAM for Plan 2 is shown in Fig. 4.

Fig 4 — Excel MFAM for Plan 2. (Source file: XL-MFAM [3]_Plan-2.xlsx)

Step 1. Specify marginal variables:

The WP costs are estimated in terms of the three-point technique. We consider a situation where all WPs have the same three-point estimates: Optimistic (o) = 0.7, Most Likely (m) = 1.0, and Pessimistic (p) = 1.6. Note that the three-point estimates of the WP costs are stored in Cells “C8:E13” of the Excel MFAM in Fig. 4.

The MFAM adopts the second-moment approach. Thus, a three-point estimate needs to be converted to the corresponding mean and variance. We consider a case where the WP costs have the triangular distribution. Then the corresponding second-moment estimates (SME) are determined as:

E [b_{i}] = \frac{o_{i} + m_{i} + p_{i}}{3}, i = 1, . . ., 6

S d [b_{i}] = \sqrt{\frac{o_{i}^{2} + m_{i}^{2} + p_{i}^{2} - o_{i} m_{i} - m_{i} p_{i} - p_{i} o_{i}}{18}}, i = 1, . . ., 6

Note that the variances are replaced by standard deviations (Sd) for computational simplicity. In the Excel MFAM, the SMEs are calculated in Cells “F8:G13”. For example, $E [b_{1}]$ and $S d [b_{1}]$ in Cell “F8:G8:” are

“F8”: “=(C8+D8+E8)/3” and

(F1.1)

“G8”: “=SQRT((SUMSQ(C8:E8)-C8*D8-D8*E8-E8*C8)/18)”.

(F1.2)

In addition, an auxiliary variable, the mean of a squared base estimate (E_bs) in MFAM Algorithm §2.1, is introduced in Cells “H8:H13”.

E [b_{i}^{2}] = V [b_{i}] + E {[b_{i}]}^{2}, i = 1, . . ., 6

For example, $E [b_{1}^{2}]$ in Cell “H8” is:

“H8”: “=G8^2+F8^2”

(F1.3)

Step 2. Specify association factors:

The AFs identified in the association factor trees (Fig. 3) are estimated in terms of their means and standard deviations and presented in Table 1. Note that Crew B parameters are chosen in a way that represents a situation where a secondary crew (Crew B) is less efficient than the primary crew (Crew A) available in the project. Specifically, the mean = 1.2 (> 1.0) reflects that Crew B is less productive, on average, than Crew A. Moreover, the standard deviation = 0.3 also indicates that the Crew B performance is less reliable than Crew A (with standard deviation = 0.1).

Table 1.

Association factor estimates.

Parameters	Crew A (r₁)	Crane (r₂)	Crew B (r₃)
Mean	1.00	1.00	1.20
St. dev.	0.100	0.100	0.300

Open in a new tab

In the Excel MFAM, the risk factor means and standard deviations are stored in Cells “C17:E18”. Note the means of the squared risk factors (E_rs) in MFAM Algorithm §2.2 are also introduced in Cells “C19:E19”.

E [r_{k}^{2}] = V [r_{k}] + E {[r_{k}]}^{2}, k = 1, 2, 3

For example, $E [r_{1}^{2}]$ in Cell “C19” is:

“C19”: “=C18^2+C17^2”

(F2)

Step 3. Load AFs to marginals.

Given a set of base (factor-free) estimates and a set of AFs, the confounding relationships between them are represented using an allocation matrix (AM, Ψ). This process is straightforward and self-evident. For example, the Plan 2 factor tree in Fig. 3 is coded in Cell “I8:K13” of the Excel MFAM (Fig. 4).

It is worth noting that the three steps discussed above correspond to the first two modules of the MFAM algorithm. The corresponding cells of the modules in the Plan 2 Excel MFAM are summarized in Fig. 5 below.

Fig 5 — Excel MFAM coding - Modules 1 and 2.

Then, the factored variable calculation in Module 3 of the MFAM algorithm can be coded as follows. First, the mean values of the factored variables are computed from MFAM §3.1:

E_{X} = {(\prod {E_{r}}^{\circ ψ_{i •}}) {(E_{b})}_{i}}, i = 1, . . ., 6

For example, $E_{X} (i = 1)$ in Cell “L8” is

“L8”: “{=F8*PRODUCT($C$17:$E$17^I8:K8)}”.

(F3.1)

Likewise, the standard deviation of a factor variable is computed from MFAM §3.2:

S d_{X} = \sqrt{(\prod {E_{r s}}^{\circ ψ_{i •}}) {(E_{b s})}_{i} - (\prod {E_{r}}^{\circ 2 ψ_{i •}}) {(E_{b})}_{i}^{2}}, i = 1, . . ., 6

For example, $S d_{X} (i = 1)$ in Cell “M8” is

“M8”: “{=SQRT(H8*PRODUCT($C$19:$E$19^I8:K8)-F8^2*PRODUCT($C$17:$E$17^(2*I8:K8)))}”.

(F3.2)

Then Module 3 of the Excel MFAM is completed as in Fig. 6.

Step 4. Construct the correlation matrix.

Given the three inputs (i.e., b, r, and Ψ), the MFAM offers a closed-form solution to the problem of generating a mathematically consistent correlation matrix. In the Excel MFAM, the correlation formula in MFAM §4.1 was directly entered as a cell formula (Fig. 7), which turns out to be rather lengthy. For ease of understanding, we decomposed the correlation formula into five components as follows.

Σ_{r} {(i, j | b, r, Ψ)}_{i \neq j} = \frac{V [R_{i \cap j}] E [R_{i - j}] E [R_{j - i}] {(E_{b})}_{i} {(E_{b})}_{j}}{{(S d_{X})}_{i} {(S d_{X})}_{j}}

For example, the cell formula for $Σ_{r} (1, 2)$ in Cell “D24” consists of five components:

“D24”: “{=(F4.a)*(F4.b)*(F4.c)*(F4.d)/(F4.e)}”

(F4)

$V [R_{1 \cap 2}]$	{(PRODUCT(E.rs^(INDEX(AM,$B24,0)INDEX(AM,D$23,0)))-PRODUCT(E.r^(INDEX(AM,$B24,0)INDEX(AM,D$23,0)))^2)}	(F4.a)
$E [R_{1 - 2}]$	{(PRODUCT(E.r^(INDEX(AM,$B24,0)-INDEX(AM,$B24,0)*INDEX(AM,D$23,0))))}	(F4.b)
$E [R_{2 - 1}]$	{(PRODUCT(E.r^(INDEX(AM,D$23,0)-INDEX(AM,$B24,0)*INDEX(AM,D$23,0))))}	(F4.c)
${(E_{b})}_{1} {(E_{b})}_{2}$	INDEX(E.b,$B24)*INDEX(E.b,D$23)	(F4.d)
${(S d_{X})}_{1} {(S d_{X})}_{2}$	INDEX(S.x,$B24)/INDEX(S.x,D$23)	(F4.e)

Open in a new tab

Note: E.b = “F8:F13”; S.x = “M8:M13”; AM = “I8:K13”; E.r = “C17:E17”; E.rs = “C19:E19”.

Once the correlation formula is entered for $Σ_{r} (1, 2)$ , it can be copied to all other off-diagonal elements of the correlation matrix. Finally, the diagonal elements of the correlation matrix are entered as 1’s, which completes the MFAM modelling.

Method validation

The Excel MFAM in Section 5 is generic and can be used for both Plan 1 and 2 in Fig. 3. The inputs and resulting correlation matrix for Plan 1 are presented in Fig. 8.

Fig 8 — Excel MFAM for Plan 1. (Source file: XL-MFAM [3]_Plan-1.xlsx)

Given a set of the factored costs and the corresponding correlation matrix, calculation of the first two moments of the total cost is straightforward using basic statistics. That is, the mean of the sum of random variables is the sum of their means; and the variance of the sum of dependent random variables is the sum of the elements in the variance-covariance matrix.

E [T C] = \sum_{i = 1}^{6} X_{i}

V [T C] = {Sd}_{X}^{T} Σ_{r} S d_{X}

For example, given the Excel MFAM (Figs. 4 and 8), the mean and variance of the total cost (TC) are calculated as:

mean(TC) “=SUM(L8:L13)”

(F5.1)

variance(TC) “=MMULT(MMULT(TRANSPOSE(M8:M13),C24:H29), M8:M13)”

(F5.2)

The MFAM results of the two plans are summarized in Table 2. The table also shows the results from Monte Carlo simulation. The simulations were carried out using @RISK from Palisade (https://www.palisade.com) with 50,000 iterations.

Table 2.

Total cost risks.

Cases	Plan 1			Plan 2
	Mean	St. dev.	P-90	Mean	St. dev.	P-90
Independent	6.600	0.458	7.187	6.600	0.458	7.187
MFAM	6.600	0.835	7.670	7.260	1.190	8.785
Simulation	6.600	0.834	7.694	7.260	1.191	8.819

Open in a new tab

Note: P-90 represents the total cost at 90% confidence level.

For ease of comparison, the resulting risk profiles of the total costs are displayed in Fig. 9. The results also show the effects of the factor-induced correlations against the independence assumption. Note that the MFAM distributions in Fig. 9 are fitted by the lognormal distribution to account for possible skewness commonly observed in project costs and times. Overall, the results show a good match between the MFAM and the simulation. For more detailed comparison, please refer to the supplementary Excel MFAM files.

In practice, quantitative risk assessments provide vital information for risk-adjusted project planning and control. Specifically, the results from MFAM can be used to set the amount of contingency reserve at a chosen confidence level. For this purpose, the total cost risks in Fig. 9 can be displayed as cumulative risk profiles as in Fig. 10. From the graph, the decision maker intuitively grasps the effects of dependence on the total cost risk under the two resource plans. More specifically, Δ₁ and Δ₂ in Fig. 10 represent the additional budgets for the contingency at P-90 under Plan 1 and Plan 2, respectively.

Fig 10 — Effects of factor-induced dependence on risk-adjusted budgets.

Conclusions

This article presents the underlying ideas, generic algorithm, and programming codes of the MFAM for project risk analysis with correlation. The MFAM was programmed in Microsoft Excel using only the cell functions commonly used in practice.

The MFAM provides an analytic (non-simulation) and closed-form solution to the problem of correlation matrix specification and estimation. As a result, the MFAM offers distinct computational advantages over simulation. The MFAM can be further calibrated and optimized according to the data available in individual projects, while preserving the scalability to high-dimensional project risks.

The computational efficiency of the MFAM can be further improved by programming the MFAM algorithm into a VBA (Visual Basic for Applications) code in Excel, which is reserved for the readers as an exciting learning-by-doing exercise.

[4]

Supplementary material and/or Additional information:

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The author sincerely appreciates the insightful comments and encouraging suggestions from the Editor and the three reviewers during the review of the European Journal of Operational Research.

Footnotes

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.mex.2021.101443.

Appendix. Supplementary materials

mmc1.xlsx^{(1.4MB, xlsx)}

mmc2.xlsx^{(1.4MB, xlsx)}

References

1.Cho S. A linear Bayesian stochastic approximation to update project duration estimates. Eu. J. Operation. Res. 2009;196(2):585–593. [Google Scholar]
2.Fallat S.M., Johnson C.R. Hadamard powers and totally positive matrices. Linear Algebra Appl. 2007;423(2-3):420–427. [Google Scholar]
3.Kim B.-C. Dependence modeling for large-scale project cost and time risk assessment: additive risk factor approaches. IEEE Trans. Eng. Manage. 2021 https://ieeexplore.ieee.org/abstract/document/9332228 Early Access: [Google Scholar]
4.Kim B.-C. Multi-factor dependence modelling with specified marginals and structured association in large-scale project risk assessment. Eu. J. Operation. Res. 2021 Under Review. [Google Scholar]
5.van Dorp J.R. A dependent project evaluation and review technique: A Bayesian network approach. Eu. J. Operation. Res. 2020;280(2):689–706. [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

mmc1.xlsx^{(1.4MB, xlsx)}

mmc2.xlsx^{(1.4MB, xlsx)}

[bib0001] 1.Cho S. A linear Bayesian stochastic approximation to update project duration estimates. Eu. J. Operation. Res. 2009;196(2):585–593. [Google Scholar]

[bib0002] 2.Fallat S.M., Johnson C.R. Hadamard powers and totally positive matrices. Linear Algebra Appl. 2007;423(2-3):420–427. [Google Scholar]

[bib0003] 3.Kim B.-C. Dependence modeling for large-scale project cost and time risk assessment: additive risk factor approaches. IEEE Trans. Eng. Manage. 2021 https://ieeexplore.ieee.org/abstract/document/9332228 Early Access: [Google Scholar]

[bib0004] 4.Kim B.-C. Multi-factor dependence modelling with specified marginals and structured association in large-scale project risk assessment. Eu. J. Operation. Res. 2021 Under Review. [Google Scholar]

[bib0005] 5.van Dorp J.R. A dependent project evaluation and review technique: A Bayesian network approach. Eu. J. Operation. Res. 2020;280(2):689–706. [Google Scholar]

PERMALINK

Multi-factor association dependence modelling for project risk analysis

Byung-Cheol Kim

Abstract

Graphical abstract

Outline

Multi-factor association modelling

Fig. 1.

MFAM algorithm

Test projects

Fig. 2.

Fig. 3.

MFAM programming in Microsoft Excel®