Algorithm 2 Hybrid Model-2
Input: Original data, LSTM-Autoencoder anomaly labels, Mahalanobis anomaly labels
Output: Anomaly rates per feature, overall anomaly rate, evaluation scores, potential energy savings
 1: Load LSTM-Autoencoder anomaly labels
 2: Load Mahalanobis anomaly labels
 3: Read and normalize original data
 4: Create hybrid anomaly labels (LSTM-Autoencoder OR Mahalanobis)
 5: for each feature i in features do
 6:     labels <- combine LSTM-Autoencoder and Mahalanobis anomaly matrices for feature i
 7:     silhouetteScores <- calculateSilhouette(normalizedFeatures[i], labels)
 8:     meanSilhouetteScore <- mean(silhouetteScores)
 9:     dbi <- calculateDaviesBouldin(normalizedFeatures[i], labels)
10:     chs <- calculateCalinskiHarabasz(normalizedFeatures[i], labels)
11:     Print meanSilhouetteScore, dbi, chs, anomaly rate for feature i
12: end for
13: overallAnomalyRate <- calculateOverallAnomalyRate(hybridLabels)
14: Print overallAnomalyRate
15: anomalousPower <- identifyAnomalousPower(power, hybridLabels)
16: potentialEnergySavings <- calculateEnergySavings(anomalousPower)
17: Print potentialEnergySavings
18: for each feature i in features do
19:     Plot normal and anomalous data points for feature i
20: end for

function calculateDaviesBouldin(X, labels)
    k <- max(labels)
    clusterMeans <- zeros(k, size(X, 2))
    clusterS <- zeros(k, 1)
    for i <- 1 to k do
        clusterPoints <- X[labels == i, :]
        clusterMeans[i, :] <- mean(clusterPoints, 1)
        clusterS[i] <- mean(sqrt(sum((clusterPoints − clusterMeans[i, :]).^2, 2)))
    end for
    R <- zeros(k)
    for i <- 1 to k do
        for j <- 1 to k do
            if i != j then
                R[i, j] <- (clusterS[i] + clusterS[j])/sqrt(sum((clusterMeans[i, :] − clusterMeans[j, :]).^2))
            end if
        end for
    end for
    D <- max(R, [], 2)
    dbi <- mean(D)
    return dbi
end function

function calculateCalinskiHarabasz(X, labels)
    k <- max(labels)
    n <- size(X, 1)
    clusterMeans <- zeros(k, size(X, 2))
    overallMean <- mean(X)
    betweenClusterDispersion <- 0
    withinClusterDispersion <- 0
    for i <- 1 to k do
        clusterPoints <- X[labels == i, :]
        clusterSize <- size(clusterPoints, 1)
        clusterMeans[i, :] <- mean(clusterPoints, 1)
        betweenClusterDispersion <- betweenClusterDispersion + clusterSize × sum((clusterMeans[i, :] − overallMean).^2)
        withinClusterDispersion <- withinClusterDispersion + sum(sum((clusterPoints − clusterMeans[i, :]).^2))
    end for
    chs <- (betweenClusterDispersion/(k − 1))/(withinClusterDispersion/(n − k))
    return chs
end function
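
As an illustration only, the label-combination step and the two index functions of Algorithm 2 can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`hybrid_labels`, `anomaly_rate`, `davies_bouldin`, `calinski_harabasz`) are hypothetical, anomaly labels are assumed to be 0/1 flags, and samples are grouped by their distinct label values rather than by looping from 1 to max(labels), so that 0/1 flags yield two groups (normal and anomalous). The silhouette and energy-savings steps are left out because the algorithm does not define them.

```python
import math


def hybrid_labels(lstm_flags, maha_flags):
    """Element-wise OR of two 0/1 anomaly flag sequences (step 4)."""
    if len(lstm_flags) != len(maha_flags):
        raise ValueError("label sequences must have the same length")
    return [int(bool(a) or bool(b)) for a, b in zip(lstm_flags, maha_flags)]


def anomaly_rate(flags):
    """Fraction of samples flagged as anomalous (assumed meaning of the rate)."""
    return sum(flags) / len(flags)


def _centroid(points):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(points) for col in zip(*points)]


def _dist(p, q):
    """Euclidean distance between two rows."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def _group(X, labels):
    """Map each distinct label value to the rows carrying it."""
    clusters = {}
    for row, lab in zip(X, labels):
        clusters.setdefault(lab, []).append(row)
    return clusters


def davies_bouldin(X, labels):
    """Mean over groups of the worst (S_i + S_j) / d(c_i, c_j) ratio."""
    clusters = _group(X, labels)
    if len(clusters) < 2:
        raise ValueError("need at least two groups")
    cents = {c: _centroid(pts) for c, pts in clusters.items()}
    # S_i: mean distance of a group's points to its own centroid.
    scatter = {c: sum(_dist(p, cents[c]) for p in pts) / len(pts)
               for c, pts in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / _dist(cents[i], cents[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)


def calinski_harabasz(X, labels):
    """(between-group dispersion / (k - 1)) / (within-group dispersion / (n - k))."""
    clusters = _group(X, labels)
    n, k = len(X), len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")
    overall = _centroid(X)
    between = within = 0.0
    for pts in clusters.values():
        c = _centroid(pts)
        between += len(pts) * _dist(c, overall) ** 2
        within += sum(_dist(p, c) ** 2 for p in pts)
    return (between / (k - 1)) / (within / (n - k))


if __name__ == "__main__":
    # Toy single-feature data, invented for illustration only.
    X = [[0.0], [0.2], [0.1], [5.0], [5.2]]
    labels = hybrid_labels([0, 0, 0, 1, 0], [0, 0, 0, 0, 1])
    print(labels, anomaly_rate(labels))
    print(davies_bouldin(X, labels), calinski_harabasz(X, labels))
```

Grouping by distinct label values is a deliberate deviation from the pseudocode: with 0/1 flags, `k <- max(labels)` would give k = 1, and the Calinski-Harabasz denominator `k − 1` would be zero. Both indices are also undefined when two group centroids coincide or when the within-group dispersion is zero, and this sketch does not guard against those cases.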