Abstract
We identify compositionally complex alloys (CCAs) that offer exceptional mechanical properties for elevated temperature applications by employing machine learning (ML) in conjunction with rapid synthesis and testing of alloys for validation to accelerate alloy design. The advantages of this approach are scalability, rapidity, and reasonably accurate predictions. ML tools were implemented to predict Young’s modulus of refractory-based CCAs by employing different ML models. Our results, in conjunction with experimental validation, suggest that average valence electron concentration, the difference in atomic radius, a geometrical parameter λ and melting temperature of the alloys are the key features that determine the Young’s modulus of CCAs and refractory-based CCAs. The Gradient Boosting model provided the best predictive capabilities (mean absolute error of 6.15 GPa) among the models studied. Our approach integrates high-quality validation data from experiments, literature data for training machine-learning models, and feature selection based on physical insights. It opens a new avenue to optimize the desired materials property for different engineering applications.
Subject terms: Structural materials, Theory and computation
Introduction
The conventional alloying method almost always starts with one or two principal metallic elements and advances by incorporating additional alloying elements to engineer the desired mechanical and chemical properties1–3. The mechanical and chemical properties of the resulting alloy therefore remain controlled by the principal elements. For instance, Fe is the principal element in steels, Cu/Zn in brass, Ni/Co in superalloys and Ti in titanium alloys4–6. About 15 years ago, Yeh and Cantor7,8 introduced a novel alloy concept known as high entropy alloys (HEAs), which consist of multiple principal elements (N = 5 or more elements) in near-equiatomic percentages. The increased complexity introduces higher configurational entropy (growing as kBT N ln N, where T is the temperature) compared to conventional alloys. As the number of elements N increases, the number of pairs grows as ~N² and raises the probability of favorable pair-driven formation enthalpy, which introduces a complex-chemistry effect (often referred to as a “cocktail effect”). The mixing of multi-principal elements generally introduces four core effects: high mixing entropy, lattice distortions, slow diffusion, and a “cocktail” effect, which result in a simple microstructure and excellent mechanical properties9–13. Further study revealed that several HEAs, such as the Mo0.5AlNbTa0.5TiZr system, did not overcome the enthalpic contributions because of their comparatively lower configurational entropies and formed secondary phases instead of only solid-solution phases. A more general terminology for such alloy systems has therefore emerged, compositionally complex alloys (CCAs)14,15, which is the naming convention used throughout this paper.
The number of elemental compositions is much higher in CCAs than that of traditional metallic alloys because CCAs comprise multiple-principal elements16. Moreover, a broader range of compositional space provides an opportunity to improve mechanical properties, such as Young’s modulus, yield strength, and hardness. However, it is extremely challenging to select the appropriate composition by trial-and-error experiment or intuition17. Atomistic modeling, such as molecular dynamics (MD), density functional theory (DFT), and thermodynamic modeling have been devoted to study phase stabilization, solidification, and crystallization kinetics of CCAs18–25. These techniques are computationally expensive, challenging to apply to the study of large polycrystalline samples, time consuming, and hence cannot be used on a large scale to narrow down the search space. Moreover, the variety of microstructures gives rise to complex and computationally expensive calculations compared to traditional alloys and hence it is challenging to predict the chemistries and compositions for a target property.
Nowadays, data-driven research, and more specifically ML, which is widely used in self-driving cars26, image classification27, web searches28, and fraud detection29, is also employed to solve different challenges in materials science30. For instance, Zhang et al.19 found that atomic size difference (δ), mixing entropy (ΔSmix) and mixing enthalpy (ΔHmix) are the most important features in phase selection of HEAs. Singh et al.31–33 used high-throughput DFT to predict properties through the chemical ranges and revealed correlations with valence electron concentration (VEC), size-difference (bandwidth) and vacancies. Roy et al.34 proposed that the average melting temperature (Tm) is the most important feature to predict the Young’s modulus of low, medium and high entropy alloys. Recent efforts utilizing ML35 considered two additional features, the Pauling electronegativity difference and the difference in VEC, and used a neural network (NN) to predict the phases that form in these CCAs. Thus, different features control each property of the alloy, and the importance of features varies from property to property.
Here, we have employed tree-based ensemble ML models, linear regression ML models, and kernel-based ML models to predict the Young’s modulus of CCAs consisting of refractory elements. This work initially identified VEC, average melting temperature and difference in atomic radii as the most important physical properties that control the Young’s modulus of CCAs. The study compared the relative merits of different ML models for a training set of refractory alloy data that was gathered from published literature. The model prediction was then validated against the Young’s modulus measured for 32 new alloys synthesized and tested as part of this work. The findings offer considerable promise for alloy down-selection based on ML models validated against high-quality experimental data of known provenance.
Methodology
Training data collection and feature selection
Data on Young's modulus for CCAs were collected from existing literature34,36–38. Two different data sets were used for model training. The first data set contains 154 alloys with a mixture of refractory and non-refractory alloys. The second data set contains 96 refractory alloys of Mo, Nb, Ta, W, mixed with some other elements like Al, Cr and Ni. Both datasets are presented in Tables 1 and 2 in the supplementary section. The goal of using two different data sets (one with a mixture of refractory and non-refractory alloys and the other with only refractory alloys) was to examine the effect of the elemental composition of training data on the reliability of the prediction with respect to experimentally synthesized validation data.
To train the ML models, we calculated 11 features for each of these alloys. These features are listed in Table 1. Past studies have shown that all of these features have a direct effect on the Young’s modulus for any alloy. To obtain these features, we collected elemental data identified from domain knowledge, such as Pauling electronegativity, VEC, lattice constant, melting temperature, mixing enthalpy and atomic radii. We then used Python scripts to calculate the features listed in Table 1.
Table 1.
Feature | Description | References |
---|---|---|
Δχ | Difference in Pauling electronegativity weighted by composition Ci for each element i | 39 |
ΔHmix | Mixing enthalpy derived from enthalpies Hij for a pair of elements i and j | 40 |
ΔSmix | Mixing entropy; R is the universal gas constant | 41 |
δ | Difference in atomic radius ri weighted by composition Ci for each element i | 42 |
Δa | Difference in lattice constants weighted by composition Ci for each element i | Analogous to |
ΔTm | Difference in melting temperatures weighted by composition Ci for each element i | Analogous to |
λ | A geometrical parameter | 43 |
Ω | Parameter for predicting solid state formation | 42 |
Tm | Average melting temperature calculated by rule of mixture | 44 |
am | Average lattice constant calculated by rule of mixture | 44 |
VEC | Average valence electron concentration calculated by rule of mixture | 40 |
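As an illustration of how composition-weighted features of this kind can be computed, the following is a minimal sketch and not the authors' script. It assumes the definitions commonly used in the HEA literature (rule of mixture for VEC and Tm, δ = sqrt(Σ Ci (1 − ri/r̄)²), ΔSmix = −R Σ Ci ln Ci, and λ = ΔSmix/δ²), with δ expressed as a fraction rather than a percentage. The elemental values in `ELEMENTS` are approximate handbook numbers chosen for illustration, and the function name `alloy_features` is hypothetical.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol K)

# Illustrative elemental data (approximate handbook values, an assumption of
# this sketch): valence electron count, melting temperature (K), radius (pm).
ELEMENTS = {
    "Mo": {"vec": 6, "tm": 2896.0, "r": 139.0},
    "Ta": {"vec": 5, "tm": 3290.0, "r": 146.0},
    "W": {"vec": 6, "tm": 3695.0, "r": 139.0},
    "Ti": {"vec": 4, "tm": 1941.0, "r": 147.0},
    "Zr": {"vec": 4, "tm": 2128.0, "r": 160.0},
}


def alloy_features(composition):
    """Return a dict of features for {element: amount}; amounts are normalized."""
    total = sum(composition.values())
    c = {el: x / total for el, x in composition.items() if x > 0}

    def rule_of_mixture(key):
        return sum(ci * ELEMENTS[el][key] for el, ci in c.items())

    r_bar = rule_of_mixture("r")
    # Atomic size mismatch (as a fraction, not percent).
    delta = math.sqrt(
        sum(ci * (1 - ELEMENTS[el]["r"] / r_bar) ** 2 for el, ci in c.items())
    )
    # Ideal configurational (mixing) entropy.
    ds_mix = -R * sum(ci * math.log(ci) for ci in c.values())
    return {
        "VEC": rule_of_mixture("vec"),
        "Tm": rule_of_mixture("tm"),
        "delta": delta,
        "dS_mix": ds_mix,
        "lambda": ds_mix / delta**2 if delta > 0 else float("inf"),
    }


features = alloy_features({"Mo": 50, "Ta": 50})
for name, value in features.items():
    print(f"{name}: {value:.4g}")
```

The remaining features in Table 1 (for example Δχ, Δa and ΔTm) follow the same pattern of a composition-weighted deviation from the mean, given the corresponding elemental property.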
To see the association between the features, we examined the Pearson correlation coefficients (PCC). Figure 1 shows the PCC for the mixed alloys data set and for the refractory alloys data set. In the PCC “heatmap”, P = + 1 indicates a strong positive correlation and P = − 1 indicates a strong negative correlation. Figure 1 indicates the absence of any significant correlation amongst any pair of features except a and am from Fig. 1a. However, the ML models we considered here can deal with the multicollinearity, and hence this correlation will not have any significant impact on the predictions. Therefore we considered all the features in the model.
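The Pearson coefficient used for the heatmap can be sketched in a few lines. This is a minimal, dependency-free illustration; the column names and numbers below are hypothetical toy values, not entries from the training sets.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def correlation_matrix(columns):
    """Pairwise Pearson coefficients for a dict of {feature name: values}."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy feature columns for four alloys.
toy = {
    "VEC": [5.2, 5.5, 5.8, 6.0],
    "Tm": [2700.0, 2850.0, 3000.0, 2950.0],
    "delta": [0.045, 0.030, 0.028, 0.020],
}
matrix = correlation_matrix(toy)
for a, row in matrix.items():
    print(a, {b: round(p, 2) for b, p in row.items()})
```

Each diagonal entry equals 1 by construction, and off-diagonal values near ±1 flag feature pairs that carry largely redundant information.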
Validation data preparation and Young’s modulus measurement
An experimental data set was used to validate the final model predictions. The validation set consisted of 32 alloys in the Mo-based family of refractory CCAs, including Mo, Ta, W, Ti, Zr, Al and Cr. The validation alloys used in the study were prepared at the Ames Lab Materials Preparation Center in the form of thin metal plates/foils. The alloys (1.5 g each) with selected compositions were synthesized by arc melting using a 32-cavity arc melting system (MTI Corp., SP-MAM32). The actual compositions of the alloys after arc melting were quantified by energy dispersive spectroscopy (EDS). The densities of the samples were measured by the Archimedes method. The arc-melted buttons were then sliced by electrical-discharge machining into near-cylinder shapes (two parallel sides) with thicknesses of ~3 mm. The elastic modulus values were measured on the cylinders by the ultrasonic pulse-echo technique using a digital ultrasonic thickness gauge (Olympus, 38DL PLUS).
Machine learning models construction
To predict the Young’s modulus, we used eight models: four tree-based ensemble methods (Gradient Boosting, Ada Boost, Extreme Gradient Boost (XGBoost) and Random Forest (RF)), two linear models (LASSO regression and Ridge regression), and two kernel-based methods (Gaussian Process Regression and Support Vector Machine (SVM)). Once the data were collected and the feature values were calculated, the eight models were trained on the two data sets separately. We thus obtained 16 models, 8 for the larger data set with both refractory and non-refractory alloys and 8 for the smaller data set with only refractory alloys. Five-fold cross-validation was used to determine the errors, because it gives a more robust estimate of the errors than a single train-test split. Many metrics can quantify the predictive strength of a model, such as the root-mean-squared (RMS) error, the mean-squared error, the mean-absolute error (MAE), and the coefficient of determination R². We chose the MAE as our metric because it most closely matches the form in which error is reported in most experimental measurements. We also report the R² values for the optimized models.
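The cross-validation step can be sketched with scikit-learn as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the data are synthetic stand-ins (96 rows and 11 columns to mirror the refractory set), the model uses default hyper-parameters, and shuffling the folds is a choice of this example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_validate

# Synthetic stand-in data: 96 "alloys" x 11 "features" and a modulus in GPa.
rng = np.random.default_rng(0)
X = rng.uniform(size=(96, 11))
y = 200 + 80 * X[:, 0] - 30 * X[:, 1] + rng.normal(scale=5, size=96)

# Five folds; shuffling is an assumption of this sketch.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_validate(
    GradientBoostingRegressor(random_state=0),
    X,
    y,
    cv=cv,
    scoring=("neg_mean_absolute_error", "r2"),
    return_train_score=True,
)

# scikit-learn reports the negated MAE so that larger is always better.
train_mae = -scores["train_neg_mean_absolute_error"]
test_mae = -scores["test_neg_mean_absolute_error"]
print(f"train MAE = {train_mae.mean():.2f} ± {train_mae.std():.2f} GPa")
print(f"test MAE  = {test_mae.mean():.2f} ± {test_mae.std():.2f} GPa")
print(f"test R2   = {scores['test_r2'].mean():.2f} ± {scores['test_r2'].std():.2f}")
```

Reporting the mean and standard deviation over the five folds gives both an accuracy estimate and a rough measure of robustness.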
The errors were minimized by performing hyper-parameter optimization using the grid-search algorithm. This algorithm determines the test error for every combination of the supplied hyper-parameter values, and the combination with the lowest error was selected for our model. Each of the algorithms has a different set of hyper-parameters. Once the best hyper-parameters were selected, the optimized model was used to make predictions for our validation set, whose Young’s modulus had been measured experimentally. Finally, the uncertainty of the predictions (i.e., the standard deviation) was calculated by the bootstrapping method, resampling 100 times for each case. Cross-validation and grid search were performed using the scikit-learn45 library in Python, and all the ML models were employed through the scikit-learn machine learning library for the Python language46, except for XGBoost, which was implemented through the library created by Tianqi Chen47.
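A grid search of this kind might look like the sketch below. The hyper-parameter names are real `GradientBoostingRegressor` arguments, but the grid values and the synthetic data are assumptions made for illustration; the grids used in the study are reported in its supplementary material and are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (not the alloy data sets).
rng = np.random.default_rng(0)
X = rng.uniform(size=(96, 11))
y = 200 + 80 * X[:, 0] - 30 * X[:, 1] + rng.normal(scale=5, size=96)

# Hypothetical grid: every combination (2 x 2 x 2 = 8) is evaluated.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
}

search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_absolute_error",  # negated so that larger is better
    cv=5,
)
search.fit(X, y)

best_model = search.best_estimator_  # refit on all data with the best settings
print("best hyper-parameters:", search.best_params_)
print(f"best cross-validated MAE: {-search.best_score_:.2f} GPa")
```

Because the cost grows with the product of the grid sizes, the grid is usually kept coarse at first and refined around the best combination.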
Results and discussion
Model optimization
The ML models were first trained on both data sets. The hyper-parameters were optimized and then the training and validation error were calculated using five-fold cross-validation. We used these hyper-parameters to construct our final optimized models. The optimized hyperparameters are presented in the supplementary section (Table 3 in the supplementary section). These hyperparameters were used to predict the Young’s modulus for the unseen data i.e. the experimentally synthesized validation data set. The cross-validated MAE and R2 values for all the models are presented in Table 2. From Table 2 it is clear that the performance of the Gradient Boosting model is superior to other models both in terms of accuracy (i.e., the MAE is lower and R2 is higher than any other models) and robustness (i.e., the standard deviation of cross-validation is lower). Because of this excellent performance, we will discuss the feature importance and prediction of Young’s modulus generated by the Gradient Boosting model.
Table 2.
Model | Cross-validated training MAE (GPa), refractory and non-refractory dataset | Cross-validated training MAE (GPa), refractory dataset | Cross-validated test MAE (GPa), refractory and non-refractory dataset | Cross-validated test MAE (GPa), refractory dataset | Cross-validated training R², refractory and non-refractory dataset | Cross-validated training R², refractory dataset | Cross-validated test R², refractory and non-refractory dataset | Cross-validated test R², refractory dataset |
---|---|---|---|---|---|---|---|---|
Gradient Boosting | 0.42 ± 0.26 | 0.36 ± 0.16 | 10.37 ± 1.59 | 6.15 ± 1.19 | 0.99 ± 0.003 | 0.99 ± 0.007 | 0.71 ± 0.080 | 0.90 ± 0.036 |
XGBoost | 0.33 ± 0.28 | 1.04 ± 0.48 | 10.32 ± 1.50 | 6.68 ± 1.22 | 0.99 ± 0.003 | 0.99 ± 0.008 | 0.70 ± 0.076 | 0.89 ± 0.038 |
RF | 5.63 ± 0.59 | 5.54 ± 0.63 | 13.53 ± 1.50 | 9.00 ± 1.08 | 0.95 ± 0.009 | 0.96 ± 0.010 | 0.68 ± 0.076 | 0.89 ± 0.031 |
Ada Boost | 12.79 ± 0.94 | 5.54 ± 0.84 | 18.02 ± 1.57 | 9.31 ± 1.53 | 0.86 ± 0.021 | 0.97 ± 0.011 | 0.62 ± 0.080 | 0.88 ± 0.051 |
SVM | 14.78 ± 1.61 | 1.90 ± 0.57 | 17.83 ± 1.99 | 6.41 ± 1.39 | 0.64 ± 0.060 | 0.97 ± 0.013 | 0.54 ± 0.074 | 0.87 ± 0.053 |
Lasso regression | 19.29 ± 1.44 | 17.53 ± 1.14 | 21.09 ± 1.64 | 18.16 ± 1.41 | 0.60 ± 0.060 | 0.72 ± 0.049 | 0.51 ± 0.076 | 0.67 ± 0.172 |
Ridge regression | 19.37 ± 1.40 | 33.18 ± 3.32 | 21.24 ± 1.95 | 33.34 ± 3.26 | 0.60 ± 0.057 | 0.075 ± 0.007 | 0.51 ± 0.082 | 0.018 ± 0.065 |
Gaussian process | 33.52 ± 1.90 | 34.08 ± 3.28 | 33.81 ± 1.92 | 34.55 ± 3.32 | 4.95 E−6 ± 4.9 E−7 | 1.35 E−5 ± 2.2 E−6 | 0.04 ± 0.028 | 0.090 ± 0.067 |
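For reference, the two metrics reported in Table 2 can be computed as in the following minimal sketch. The numbers in the example are hypothetical toy values, not results from the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference, in the units of the target (GPa here)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical measured and predicted moduli in GPa.
measured = [250.0, 260.0, 270.0]
predicted = [255.0, 258.0, 262.0]
print(f"MAE = {mean_absolute_error(measured, predicted):.2f} GPa")
print(f"R2  = {r2_score(measured, predicted):.3f}")
```

An R² near zero, as seen for some models in Table 2, means the model does little better than predicting the mean of the data.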
In our data sets, tree-based ensemble models perform better than the other models in predicting Young’s modulus. Ensemble algorithms have also shown better performance in other studies that predict materials properties34,48,49. Ensemble methods are meta-algorithms that combine several base models to produce a better predictive model. A bagging ensemble method can be used to decrease variance, and a boosting ensemble method can be used to decrease bias. A boosting method converts weak learners into strong ones50–52. Decision stumps are usually used as the base weak learners, but this is not always the case. Most boosting methods build models in a stage-wise fashion and generalize the model by optimizing an arbitrary differentiable loss function. Boosting methods also help prevent over-fitting to some extent. Additionally, boosting methods handle non-linear relations between target properties and features and help to deal with collinearity among the features. Furthermore, most boosting methods provide the feature importance associated with the model, which is needed to determine which features influence Young’s modulus the most. Boosting methods are, however, affected by the presence of outliers. Hence, it is recommended to perform outlier analysis before training the model.
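The stage-wise idea behind boosting can be illustrated with a deliberately small example. The sketch below is a toy gradient-boosting loop for squared-error loss with decision stumps on a single feature; it is not the scikit-learn implementation used in the study, and the data, function names, and settings are hypothetical.

```python
import numpy as np


def fit_stump(x, residual):
    """Least-squares decision stump (one split) on a 1-D feature."""
    best = None
    for threshold in np.unique(x)[:-1]:  # keep both sides non-empty
        left = residual[x <= threshold]
        right = residual[x > threshold]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, threshold, left.mean(), right.mean())
    return best[1:]  # (threshold, left value, right value)


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)


def gradient_boost(x, y, n_stages=50, learning_rate=0.1):
    """Add stumps one stage at a time, each fitted to the current residuals."""
    base = y.mean()
    prediction = np.full_like(y, base, dtype=float)
    stumps = []
    for _ in range(n_stages):
        residual = y - prediction  # negative gradient of 0.5 * (y - f)^2
        stump = fit_stump(x, residual)
        prediction += learning_rate * predict_stump(stump, x)
        stumps.append(stump)
    return base, stumps, prediction


# Hypothetical non-linear toy target.
x = np.linspace(0.0, 1.0, 40)
y = 250.0 + 20.0 * np.sin(2.0 * np.pi * x)
base, stumps, fitted = gradient_boost(x, y)

mse_before = np.mean((y - y.mean()) ** 2)
mse_after = np.mean((y - fitted) ** 2)
print(f"training MSE: {mse_before:.2f} -> {mse_after:.2f}")
```

Each stump alone is a weak learner, but the shrunken sum of many stumps follows the non-linear target closely, which is the behavior described above.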
Feature importance
After training the models on both data sets containing refractory and non-refractory alloys using the optimized hyper-parameters, we determined the feature importance associated with the Gradient Boosting model. Feature importance is simply the score assigned to the features based on how useful they are at predicting a target variable. The feature importance for the larger data set containing both the refractory and non-refractory alloys, and smaller data set only with refractory alloys are presented in Fig. 2a,b, respectively. From feature importance, it is clear that the sequence of the features is not identical for both data sets. However, the smaller training data set showed better prediction accuracy as indicated in Table 2. Hence, we selected the important features generated from the smaller data set presented in Fig. 2b. In the next paragraph, we are going to explain the physical significance of some of the important features for the Young’s modulus of CCAs.
We found that VEC was the most important feature and had importance higher than 0.7. While it is not shown here, it is important to mention that other ML models i.e. XGBoost and RF showed good prediction capabilities and identified the VEC as the most important feature with an importance of more than 0.7. In the elastic limit and at a constant value of Poisson’s ratio, the Young’s modulus is related to the bulk modulus (Eq. 1) and hence we will explain the physics of the Young’s modulus dependence on VEC by exploring the physical relationship between bulk modulus and VEC53,54.
$$K = \frac{E}{3(1 - 2\nu)} \qquad (1)$$
Here, K, E and ν are the bulk modulus, Young’s modulus and Poisson’s ratio, respectively. Gilman et al.53,54 reported that materials with higher valence electron density (VED, valence electrons per unit volume) possess higher bulk modulus. The bulk modulus increases as the number of valence electrons increases and decreases as the atomic size increases. It is determined predominantly by the resistance of the valence electrons to compression. In a metallic system, electrons behave like a dense gas, or liquid, with only a very small amount of viscosity. Hence, the greater the electron density, the greater the resistance to compression, and the higher the bulk modulus and the Young’s modulus. For instance, osmium possesses a VED 17% higher than that of diamond and correspondingly exhibits a bulk modulus 4% greater53,54. Although we considered VEC instead of VED in this work, the Young’s modulus still trends upward with VEC for both the training and validation data sets, as presented in Fig. 3a,b. Our calculated feature importance indicates that the melting point of the alloys, which is an indirect metric of bond strength34,55, has an impact on Young’s modulus, which generally increases with increasing melting temperature as presented in Fig. 3c,d. The geometrical parameter λ, which is a function of the mixing entropy (ΔSmix) and the difference in atomic radii (δ), has a significant impact on Young’s modulus. The δ parameter has an impact on cohesive energy, and Young’s modulus increases with increasing cohesive energy56,57. In our case, a lower value of δ results in a higher Young’s modulus, as presented in Fig. 3e,f. The difference in atomic radius influences the distribution of alloying elements and the metallic bond energy. The electronegativity has an impact on the electron density of atoms, and a larger value of electronegativity results in a higher Young’s modulus of metallic alloys58.
Additionally, larger electronegativity differences (Δχ) and higher mixing enthalpy (ΔHmix) increase the probability of forming brittle intermetallic phases, which have lower Young’s modulus. Therefore, these two parameters could play an important role in determining the Young’s modulus of CCAs34.
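The link between the two moduli used in this argument is the isotropic elastic relation K = E / [3(1 − 2ν)]. As a small worked example, the sketch below evaluates it for a Young’s modulus of 260 GPa; the Poisson’s ratio of 0.30 is a hypothetical value chosen for illustration and was not measured in this work.

```python
def bulk_modulus(youngs_modulus, poisson_ratio):
    """Bulk modulus of an isotropic elastic solid: K = E / (3 * (1 - 2 * nu))."""
    if not -1.0 < poisson_ratio < 0.5:
        raise ValueError("Poisson's ratio must lie in (-1, 0.5) for a stable solid")
    return youngs_modulus / (3.0 * (1.0 - 2.0 * poisson_ratio))


E = 260.0  # Young's modulus in GPa
nu = 0.30  # hypothetical Poisson's ratio
K = bulk_modulus(E, nu)
print(f"K = {K:.1f} GPa")
```

At a fixed Poisson’s ratio the two moduli are proportional, which is why trends in bulk modulus with electron density carry over to Young’s modulus.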
It is important to mention that Roy et al.34 predicted Young’s modulus of low, medium and high entropy alloys composed of 5 elements by employing Gradient Boosting method and found that average melting temperature (Tm) was the most important feature without considering the impact of VEC. Corresponding MAE for their study was 23.59 GPa. In this study, we achieved significantly better performance (MAE = 6.15 GPa) by considering VEC in the feature sets. From the above discussion, we propose that VEC is the most important feature that determines the Young’s modulus of this refractory alloy system. Therefore, it is essential to include VEC as a key parameter in the design of new CCAs with tailored Young’s modulus.
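Feature importances of the kind discussed in this section can be read from a fitted scikit-learn model through its `feature_importances_` attribute. The sketch below is illustrative only: the data are synthetic, the target is constructed so that the first column dominates, and the four feature names are placeholders rather than the full feature set of Table 1.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

feature_names = ["VEC", "Tm", "delta", "lambda"]  # placeholder labels

# Synthetic data in which the first column drives most of the target.
rng = np.random.default_rng(0)
X = rng.uniform(size=(96, 4))
y = 200 + 100 * X[:, 0] + 10 * X[:, 1] + rng.normal(scale=2, size=96)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
importances = model.feature_importances_
ranking = sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)
for name, score in ranking:
    print(f"{name:>6}: {score:.3f}")
```

Because these scores depend on the training data, the ranking can change between data sets, which is consistent with the differences reported between Fig. 2a and Fig. 2b.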
Experimental validation
We finally used the trained Gradient Boosting model to predict the Young’s modulus of unseen CCAs, namely the 32 experimentally synthesized CCAs composed mostly of Mo–Ta–Ti–W–Zr elements. As the experimental validation alloys are all refractory alloys, we examined how the type of training set affects the prediction of Young’s modulus. When we trained the Gradient Boosting model with the larger data set containing both refractory and non-refractory alloys, the predictions of the Young’s modulus were significantly off compared to the experimentally measured values, as presented in Fig. 4a. The predicted value consistently underestimated the experimental value. In contrast, we achieved excellent predictions when we considered only the refractory alloys to train the Gradient Boosting model, as presented in Fig. 4b. Only 2 predictions (alloy numbers 6 and 8) out of 32 alloys are outside of the 68.3% confidence interval (± σ, where σ is the standard deviation of each prediction). Table 3 presents the experimental Young’s modulus, the mean predicted Young’s modulus with the percentage error, and the standard deviation when the model was trained with refractory alloys. 26 of the alloys had errors ≤ 5%, and a few of the predictions are almost identical to the experimental values.
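The ± σ values come from bootstrap resampling of the training set. The following sketch shows the general procedure under stated assumptions: it resamples the training rows with replacement 100 times, refits a model each time, and takes the mean and standard deviation of the resulting predictions. To keep the example dependency-light, an ordinary least-squares fit stands in for the Gradient Boosting model, and all data and function names are hypothetical.

```python
import numpy as np


def bootstrap_predict(fit_predict, X_train, y_train, X_new, n_resamples=100, seed=0):
    """Mean and standard deviation of predictions over bootstrap refits."""
    rng = np.random.default_rng(seed)
    n = len(y_train)
    preds = np.empty((n_resamples, len(X_new)))
    for b in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # draw n rows with replacement
        preds[b] = fit_predict(X_train[idx], y_train[idx], X_new)
    return preds.mean(axis=0), preds.std(axis=0, ddof=1)


def least_squares_fit_predict(X_train, y_train, X_new):
    """Stand-in model (an assumption of this sketch): linear fit with intercept."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef


# Synthetic training data and three synthetic "new alloys".
rng = np.random.default_rng(1)
X_train = rng.uniform(size=(40, 2))
y_train = 240 + 20 * X_train[:, 0] + rng.normal(scale=3, size=40)
X_new = rng.uniform(size=(3, 2))

mean_pred, sigma = bootstrap_predict(least_squares_fit_predict, X_train, y_train, X_new)
for m, s in zip(mean_pred, sigma):
    print(f"prediction = {m:.1f} ± {s:.1f} GPa")
```

Any model with a fit-then-predict interface can be passed in place of the stand-in, so the same loop applies to a tree-based regressor.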
Table 3.
Alloy number | Alloy composition (actual at. % compositions by EDS) | Experimental Young’s modulus (GPa) | Mean predicted Young’s modulus (GPa) | % Error | Standard deviation, ± σ (GPa) |
---|---|---|---|---|---|
1 | Mo85.25Ta9.52Ti2.29Zr2.94 | 257.6 | 248.2 | 3.7 | 18.9 |
2 | Mo82.23W1.29Ta9.46Ti3.27Zr3.36Al0.39 | 260 | 246.5 | 5.2 | 17.6 |
3 | Mo82.93W2Ta9.89Ti2.4Zr2.72Al0.05 | 256.3 | 247.4 | 3.5 | 18.1 |
4 | Mo80.67W3.3Ta10.34Ti2.45Zr3.13Al0.05Cr0.06 | 264.7 | 253.9 | 4.1 | 18.6 |
5 | Mo76.41W7.23Ta10.69Ti2.33Zr3.17Al0.16 | 268.9 | 255.5 | 5.0 | 17.6 |
6 | Mo78.92W4.27Ta10.72Ti2.7Al3.39 | 273.7 | 244.1 | 10.8 | 17.1 |
7 | Mo84.31W2.48Ta5.84Ti2.64Zr2.95Al1.79 | 261.2 | 245.1 | 6.2 | 17.3 |
8 | Mo85.25W3.05Ta5.51Ti2.28Zr3.39Al0.23Cr0.29 | 272.2 | 248.8 | 8.6 | 18.5 |
9 | Mo79.73W0.09Ta12.36Ti3.92Zr3.88Cr0.03 | 243.9 | 245.5 | − 0.7 | 17.4 |
10 | Mo78.53W1.06Ta12.53Ti3.68Zr4.18Cr0.03 | 237.6 | 245.3 | − 3.2 | 17.3 |
11 | Mo78.58W2.14Ta11.19Ti3.79Zr4.3 | 240 | 246.1 | − 2.6 | 17.5 |
12 | Mo75.86W3.13Ta12.65Ti3.89Zr4.47 | 250.3 | 245.6 | 1.9 | 17.4 |
13 | Mo75.66W3.69Ta12.2Ti3.8Zr4.65 | 238.4 | 245.6 | − 3.0 | 17.4 |
14 | Mo73.77W7.67Ta10.17Ti3.7Zr4.69 | 265.8 | 252.7 | 4.9 | 17.4 |
15 | Mo81.5W1.63Ta6.37Ti3.9Zr4.51Al1.96Cr0.13 | 241.1 | 244.6 | − 1.4 | 17.8 |
16 | Mo78.86W2.93Ta7.48Ti3.69Zr5.36Cr1.68 | 257.4 | 248.0 | 3.7 | 17.5 |
17 | Mo79.92Ta9.87Ti4.69Zr5.45Cr0.07 | 237.1 | 245.3 | − 3.5 | 17.9 |
18 | Mo76.31W0.41Ta9.3Ti6.22Zr7.29Al0.37Cr0.08 | 246.3 | 242.8 | 1.4 | 18.4 |
19 | Mo80.87W1.02Ta6.98Ti5.23Zr5.88Al0.03 | 249.7 | 246.4 | 1.3 | 18.0 |
20 | Mo76.47W3.17Ta8.64Ti5.25Zr6.45Cr0.02 | 247.3 | 246.0 | 0.5 | 17.5 |
21 | Mo73.61W5.27Ta10.49Ti4.71Zr5.93 | 240.1 | 246.2 | − 2.5 | 17.5 |
22 | Mo71.98W6.62Ta9.97Ti5.06Zr6.32Cr0.06 | 241 | 246.2 | − 2.2 | 17.3 |
23 | Mo80.03W1.49Ta4.47Ti5.24Zr6.01Al2.73Cr0.04 | 240.4 | 236.9 | 1.4 | 19.8 |
24 | Mo78.09W3.06Ta4.93Ti4.92Zr7.9Cr1.1 | 243.6 | 245.7 | − 0.9 | 18.7 |
25 | Mo81.65W0.17Ta18.12Ti0.05 | 260.1 | 255.1 | 1.9 | 16.3 |
26 | Mo78.35W1.61Ta20.03 | 266.1 | 254.8 | 4.2 | 15.9 |
27 | Mo76.96W2.93Ta20Ti0.1 | 267.1 | 255.3 | 4.4 | 15.7 |
28 | Mo75.99W3.83Ta20.18 | 270.8 | 256.1 | 5.4 | 16.1 |
29 | Mo76.32W3.14Ta20.48Ti0.05Cr0.01 | 255.8 | 255.3 | 0.2 | 15.7 |
30 | Mo74.54W4.2Ta21.25 | 270.4 | 257.0 | 5.0 | 15.3 |
31 | Mo80.97W3.88Ta14.61Zr0.04Al0.49 | 265.3 | 254.8 | 4.0 | 16.3 |
32 | Mo77.21W4.17Ta17.69Ti0.34Zr0.07Al0.1Cr0.41 | 272.8 | 255.3 | 6.4 | 15.7 |
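The derived columns of Table 3 can be checked with a short calculation. The sketch below uses three rows copied from the table and assumes, based on the signs in the table, that the percentage error is defined as (experimental − predicted) / experimental × 100; the function names are hypothetical.

```python
# (alloy number, experimental E, mean predicted E, standard deviation), in GPa,
# copied from Table 3.
rows = [
    (2, 260.0, 246.5, 17.6),
    (6, 273.7, 244.1, 17.1),
    (9, 243.9, 245.5, 17.4),
]


def percent_error(experimental, predicted):
    """Signed error relative to the measurement (sign convention inferred)."""
    return 100.0 * (experimental - predicted) / experimental


def within_one_sigma(experimental, predicted, sigma):
    """True if the measurement lies inside prediction ± sigma."""
    return abs(experimental - predicted) <= sigma


for number, e_exp, e_pred, sigma in rows:
    err = percent_error(e_exp, e_pred)
    inside = within_one_sigma(e_exp, e_pred, sigma)
    print(f"alloy {number}: error = {err:+.1f} %, within ± sigma: {inside}")
```

A positive error corresponds to a prediction below the measured value, and a negative error to a prediction above it.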
From Fig. 4 and Table 3 we conclude that the quality of the training data is very important to predict the target property accurately. We have a larger training set (154 alloys) with refractory and non-refractory alloys. On the other hand, we have a smaller training set (96 alloys) only with refractory alloys. Since the training set was more homogeneous for the smaller data set, we achieved better predictions. Moreover, the predicted Young’s modulus followed the trend with the experimental Young’s modulus with some exceptions as presented in Fig. 4b. Therefore, it is not only the size of the training data but also the quality and relevance of the training data that are important for better predictions.
Conclusion
We have presented an approach that uses ML with high throughput experimental synthesis and mechanical testing of alloys to predict the Young’s modulus of CCAs reliably. We conclude that among the eight ML models we used, Gradient Boosting had the best predictive strength. The prediction of Young’s modulus was influenced by the model chosen and by the composition of training data. Our experimental validation set was composed of refractory alloys, and when the models were trained with data containing only refractory alloys, the predictions were closer to the experimental values. This shows that when training ML models to predict characteristics of alloys, it is advantageous to include alloys of similar composition in the training data set. The valence electron concentration is the most important feature governing the Young’s modulus of refractory CCAs and can be used to rapidly screen alloys. Since feature importance also appears to be influenced by the choice of training data set, it is important to choose carefully the training data set based on the type of alloy being studied and validate against high-quality experimental data of known provenance. The integration of experimental synthesis and testing, machine learning, and physics-based interpretation demonstrated in this work holds considerable promise for alloy design and property prediction.
Acknowledgements
This effort was principally supported by the U.S. Department of Energy's (DOE) Office of Energy Efficiency and Renewable Energy (EERE) under the Advanced Manufacturing Office (Project WBS 2.1.0.19) through Ames Laboratory, which is operated for the U.S. DOE by Iowa State University under contract DE-AC02-07CH11358. HK was supported in part through the National Science Foundation (NSF) Mathematical Sciences Graduate Internship (MSGI) Program sponsored by the NSF Division of Mathematical Sciences. This program is administered by the Oak Ridge Institute for Science and Education (ORISE) through an interagency agreement between the U.S. Department of Energy (DOE) and NSF. ORISE is managed for DOE by ORAU. This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of its employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
Author contributions
H.K. and M.F.N.T. contributed equally to this work as first authors. They constructed the data sets, performed the data analysis and wrote the manuscript. G.O. synthesized and characterized the validation data set. A.R., G.B., G.O., J.C., and D.J. oversaw the results and discussion and reviewed the manuscript. R.D. provided technical expertise on CCA data, extracted data from research articles, oversaw the results and reviewed the manuscript.
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Code availability
The codes that support the findings of this study are available from the corresponding author upon reasonable request.
Competing interests
The authors declare no competing interests.
Footnotes
These authors contributed equally: Hrishabh Khakurel and M. F. N. Taufique.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-021-96507-0.
References
- 1.Huang SC, et al. Mechanical properties of zirconium-based random alloys: Alloying elements and composition dependencies. Comput. Mater. Sci. 2017;127:60–66. doi: 10.1016/j.commatsci.2016.10.028. [DOI] [Google Scholar]
- 2.Inoue A, et al. Marzouki, development and applications of highly functional Al-based materials by use of metastable phases. Mater. Res. 2015;18:1414–1425. doi: 10.1590/1516-1439.058815. [DOI] [Google Scholar]
- 3.Abdelaziz MH, Paradis M, Samuel AM, Doty HW, Samuel FH. Effect of aluminum addition on the microstructure, tensile properties, and fractography of cast Mg-based alloys. Ann. Mater. Sci. Eng. 2017;2:1–10. [Google Scholar]
- 4.Schinhammer M, Hänzi AC, Löffler JF, Uggowitzer PJ. Design strategy for biodegradable Fe-based alloys for medical applications. Acta Biomater. 2010;6:1705–1713. doi: 10.1016/j.actbio.2009.07.039. [DOI] [PubMed] [Google Scholar]
- 5.Long H, Mao S, Liu Y, Zhang Z, Han X. Microstructural and compositional design of Ni-based single crystalline superalloys—A review. J. Alloy. Compd. 2018;743:203–220. doi: 10.1016/j.jallcom.2018.01.224. [DOI] [Google Scholar]
- 6.Hayama AOF, et al. Effects of composition and heat treatment on the mechanical behavior of Ti–Cu alloys. Mater. Des. 2014;55:1006–1013. doi: 10.1016/j.matdes.2013.10.050. [DOI] [Google Scholar]
- 7.Yeh JW, et al. Nanostructured highentropy alloys with multiple principal elements: Novel alloy design concepts and outcomes. Adv. Eng. Mater. 2004;6:299–303. doi: 10.1002/adem.200300567. [DOI] [Google Scholar]
- 8.Cantor B, Chang ITH, Knight P, Vincent AJB. Microstructural development in equiatomic multicomponent alloys. Mater. Sci. Eng. A. 2004;375–377:213–218. doi: 10.1016/j.msea.2003.10.257.
- 9.Yim D, Kim HS. Fabrication of the high-entropy alloys and recent research trends: A review. Korean J. Met. Mater. 2017;55:671–683.
- 10.Ren B, et al. Corrosion behavior of CuCrFeNiMn high entropy alloy system in 1 M sulfuric acid solution. Mater. Corros. 2012;63:828–834. doi: 10.1002/maco.201106072.
- 11.Kang YB, Shim SH, Lee KH, Hong SI. Dislocation creep behavior of CoCrFeMnNi high entropy alloy at intermediate temperatures. Mater. Res. Lett. 2018;6:689–695. doi: 10.1080/21663831.2018.1543731.
- 12.Fu ZQ, MacDonald BE, Monson TC. Influence of heat treatment on microstructure, mechanical behavior, and soft magnetic properties in an fcc-based Fe29Co28Ni29Cu7Ti7 high-entropy alloy. J. Mater. Res. 2018;33:2214–2222. doi: 10.1557/jmr.2018.161.
- 13.Tikhonovsky MA, Salishchev GA, Yurchenko NY, Stepanov ND, Zherebtsov SV. Aging behavior of the HfNbTaTiZr high entropy alloy. Mater. Lett. 2018;211:87–90. doi: 10.1016/j.matlet.2017.09.094.
- 14.Qiu Y, et al. A lightweight single-phase AlTiVCr compositionally complex alloy. Acta Mater. 2017;123:115–124. doi: 10.1016/j.actamat.2016.10.037.
- 15.Jensen JK, et al. Characterization of the microstructure of the compositionally complex alloy Al1Mo0.5Nb1Ta0.5Ti1Zr1. Scr. Mater. 2016;121:1–4. doi: 10.1016/j.scriptamat.2016.04.017.
- 16.Ye YF, Wang Q, Lu J, Liu CT, Yang Y. High-entropy alloy: Challenges and prospects. Mater. Today. 2016;19:349–362. doi: 10.1016/j.mattod.2015.11.026.
- 17.Miracle DB, Senkov ON. A critical review of high entropy alloys and related concepts. Acta Mater. 2017;122:448–511. doi: 10.1016/j.actamat.2016.08.081.
- 18.Ma D, Grabowski B, Körmann F, Neugebauer J, Raabe D. Ab initio thermodynamics of the CoCrFeMnNi high entropy alloy: Importance of entropy contributions beyond the configurational one. Acta Mater. 2015;100:90–97. doi: 10.1016/j.actamat.2015.08.050.
- 19.Zhang C, Zhang F, Chen S, Cao W. Computational thermodynamics aided high-entropy alloy design. JOM. 2012;64:839–845.
- 20.Jiang C, Uberuaga BP. Efficient ab initio modeling of random multicomponent alloys. Phys. Rev. Lett. 2016;116:105501. doi: 10.1103/PhysRevLett.116.105501.
- 21.Saal JE, Berglund IS, Sebastian JT, Liaw PK, Olson GB. Equilibrium high entropy alloy phase stability from experiments and thermodynamic modeling. Scr. Mater. 2017;146:5–8. doi: 10.1016/j.scriptamat.2017.10.027.
- 22.Lederer Y, Toher C, Vecchio KS, Curtarolo S. The search for high entropy alloys: A high-throughput ab-initio approach. Acta Mater. 2018;159:364–383. doi: 10.1016/j.actamat.2018.07.042.
- 23.Sanchez JM, Vicario I, Albizuri J, Guraya T, Garcia JC. Phase prediction, microstructure and high hardness of novel light-weight high entropy alloys. J. Mater. Res. Technol. 2018;424:1–9.
- 24.Tapia AJSF, Yim D, Kim HS, Lee BJ. An approach for screening single phase high-entropy alloys using an in-house thermodynamic database. Intermetallics. 2018;101:56–63. doi: 10.1016/j.intermet.2018.07.009.
- 25.Senkov ON, Miller JD, Miracle DB, Woodward C. Accelerated exploration of multi-principal element alloys with solid solution phases. Nat. Commun. 2015;6:6529. doi: 10.1038/ncomms7529.
- 26.Bojarski, M. et al. End to end learning for self-driving cars. Preprint at arXiv:1604.07316 (2016).
- 27.He K, Zhang X, Ren S, Sun J. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In: Bajcsy R, Hager G, editors. 2015 IEEE International Conference on Computer Vision (ICCV). IEEE; 2015. pp. 1026–1034.
- 28.Pazzani M, Billsus D. Learning and revising user profiles: The identification of interesting web sites. Mach. Learn. 1997;27:313–331. doi: 10.1023/A:1007369909943.
- 29.Chan PK, Stolfo SJ. Toward scalable learning with non-uniform class and cost distributions: A case study in credit card fraud detection. In: Agrawal R, Stolorz P, Piatetsky G, editors. KDD’98 Proc. Fourth International Conference on Knowledge Discovery and Data Mining. AAAI Press; 1998. pp. 164–168.
- 30.Rickman JM, Balasubramanian G, Marvel CJ, Chan HM, Burton M-T. Machine learning strategies for high-entropy alloys. J. Appl. Phys. 2020;128:221101. doi: 10.1063/5.0030367.
- 31.Singh P, Sharma A, Smirnov AV, Diallo MS, Ray P, Balasubramanian G, Johnson DD. Design of high-strength refractory complex solid-solution alloys. npj Comput. Mater. 2018;4:16. doi: 10.1038/s41524-018-0072-0.
- 32.Singh P, Smirnov AV, Alam A, Johnson DD. First-principles prediction of incipient order in arbitrary high-entropy alloys: Exemplified in Ti0.25CrFeNiAlx. Acta Mater. 2020;189:248–254. doi: 10.1016/j.actamat.2020.02.063.
- 33.Singh P, et al. Vacancy-mediated complex phase selection in high entropy alloys. Acta Mater. 2020;194:540–546. doi: 10.1016/j.actamat.2020.04.063.
- 34.Roy A, Babuska T, Krick B, Balasubramanian G. Machine learned feature identification for predicting phase and Young’s modulus of low-, medium- and high-entropy alloys. Scr. Mater. 2020;185:152–158. doi: 10.1016/j.scriptamat.2020.04.016.
- 35.Islam N, Huang W, Zhuang HL. Machine learning for phase selection in multi-principal element alloys. Comput. Mater. Sci. 2018;150:230–235. doi: 10.1016/j.commatsci.2018.04.003.
- 36.Senkov O, Miracle D, Chaput K, Couzinie J. Development and exploration of refractory high entropy alloys—A review. J. Mater. Res. 2018;33:3092–3128. doi: 10.1557/jmr.2018.153.
- 37.Li W, Liu P, Liaw PK. Microstructures and properties of high-entropy alloy films and coatings: A review. Mater. Res. Lett. 2018;6(4):199–229. doi: 10.1080/21663831.2018.1434248.
- 38.Couzinié J-P, Senkov ON, Miracle DB, Dirras G. Comprehensive data compilation on the mechanical properties of refractory high-entropy alloys. Data Brief. 2018;21:1622–1641. doi: 10.1016/j.dib.2018.10.071.
- 39.Fang S, Xiao X, Xia L, Li W, Dong Y. Relationship between the widths of supercooled liquid regions and bond parameters of Mg-based bulk metallic glasses. J. Non-Cryst. Solids. 2003;321:120–125. doi: 10.1016/S0022-3093(03)00155-8.
- 40.Guo S, Ng C, Lu J, Liu CT. Effect of valence electron concentration on stability of fcc or bcc phase in high entropy alloys. J. Appl. Phys. 2011;109:103505. doi: 10.1063/1.3587228.
- 41.Takeuchi A, Inoue A. Classification of bulk metallic glasses by atomic size difference, heat of mixing and period of constituent elements and its application to characterization of the main alloying element. Mater. Trans. 2005;46:2817–2829. doi: 10.2320/matertrans.46.2817.
- 42.Yang X, Zhang Y. Prediction of high-entropy stabilized solid-solution in multi-component alloys. Mater. Chem. Phys. 2012;132:233–238. doi: 10.1016/j.matchemphys.2011.11.021.
- 43.Singh AK, Kumar N, Dwivedi A, Subramaniam A. A geometrical parameter for the formation of disordered solid solutions in multi-component alloys. Intermetallics. 2014;53:112–119. doi: 10.1016/j.intermet.2014.04.019.
- 44.Senkov ON, Wilks GB, Miracle DB, Chuang CP, Liaw PK. Refractory high-entropy alloys. Intermetallics. 2010;18:1758–1765. doi: 10.1016/j.intermet.2010.05.014.
- 45.Breiman, L. Arcing The Edge. Technical Report 486. Statistics Department, University of California, Berkeley (1997).
- 46.Pedregosa F, et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011;12:2825–2830.
- 47.Chen, T. & Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794 (2016).
- 48.Mamun O, Wenzlick M, Hawk J, et al. A machine learning aided interpretable model for rupture strength prediction in Fe-based martensitic and austenitic alloys. Sci. Rep. 2021;11:5466. doi: 10.1038/s41598-021-83694-z.
- 49.Mamun O, Wenzlick M, Sathanur A, et al. Machine learning augmented predictive and generative model for rupture life in ferritic and austenitic steels. npj Mater. Degrad. 2021;5:20. doi: 10.1038/s41529-021-00166-5.
- 50.Schapire RE. The strength of weak learnability. Mach. Learn. 1990;5:197–227.
- 51.Friedman JH. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001;29:1189–1232. doi: 10.1214/aos/1013203451.
- 52.Freund Y, Schapire RE. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 1997;55:119–139. doi: 10.1006/jcss.1997.1504.
- 53.Gilman JJ. Electronic Basis of the Strength of Materials, Chapter 12. Cambridge University Press; 2003.
- 54.Gilman JJ, Cumberland RW, Kaner RB. Design of hard crystals. Int. J. Refract. Met. Hard Mater. 2006;24:1–5. doi: 10.1016/j.ijrmhm.2005.05.015.
- 55.Rickman JM. Data analytics and parallel-coordinate materials property charts. npj Comput. Mater. 2018;4:5. doi: 10.1038/s41524-017-0061-8.
- 56.Roy A, Sreeramagiri P, Babuska T, Krick B, Ray PK, Balasubramanian G. Lattice distortion as an estimator of solid solution strengthening in high-entropy alloys. Mater. Charact. 2021;172:110877. doi: 10.1016/j.matchar.2021.110877.
- 57.Pettifor DG. Electron theory of metals. In: Cahn RW, Haasen P, editors. Physical Metallurgy. Elsevier; 1983.
- 58.Li K, Kang C, Xue D. Electronegativity calculation of bulk modulus and band gap of ternary ZnO-based alloys. Mater. Res. Bull. 2012;47:2902–2905. doi: 10.1016/j.materresbull.2012.04.115.
Data Availability Statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.
The code that supports the findings of this study is available from the corresponding author upon reasonable request.