Machine Learning-Driven Comparative Analysis and Optimization of Cu-Ni-Si and Cu Low Alloys: From Data-Driven Interpretation to Inverse Design
Keywords:
Cu-Ni-Si alloys, Machine learning, Electrical conductivity, Hardness prediction, SHAP analysis, Alloy design
Abstract
The development of high-performance copper alloys requires balancing mechanical strength and electrical conductivity, properties that are often inversely correlated due to competing strengthening mechanisms. This study presents a comparative machine learning analysis of Cu-Ni-Si and Cu low alloys using a curated dataset of 1690 entries derived from the Gorsse et al. database, comprising 1507 samples with hardness measurements and 1685 samples with electrical conductivity data. Three ensemble-based regression algorithms (Random Forest, XGBoost, and Gradient Boosting) were trained to predict Vickers hardness (HV) and electrical conductivity (%IACS) from an augmented feature set encompassing alloy composition, thermomechanical processing parameters, missingness indicators, and physics-informed descriptors (valence electron concentration, atomic size mismatch, electronegativity difference, and Ni:Si atomic ratio). XGBoost achieved the best performance for hardness prediction (R² = 0.8554, RMSE = 29.90 HV), while Gradient Boosting performed best for electrical conductivity (R² = 0.8400, RMSE = 5.96 %IACS). Tree-based feature importance, averaged across models, identified valence electron concentration as the most influential predictor for hardness (39.9%), followed by aging temperature (11.2%), while Cu content dominated conductivity prediction (37.7%), followed by aging time (8.9%). Complementary SHAP analysis confirmed these trends and additionally revealed directional relationships and nonlinear feature interactions. Composition-grouped cross-validation by unique alloy formula (K = 10) yielded substantially lower performance, with grouped CV R² = 0.438 for hardness and 0.293 for conductivity, indicating that generalization to unseen alloy formulations remains limited.
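The composition-grouped split described above can be sketched as follows. This is a minimal illustration rather than the study's code: the function name `grouped_kfold`, the greedy fold-balancing rule, and the toy formula strings are all assumptions made for the example. scikit-learn's `GroupKFold` provides comparable behaviour for real datasets.

```python
from collections import defaultdict


def grouped_kfold(groups, k):
    """Yield (train_idx, test_idx) pairs in which every row that shares a
    group label (here: an alloy formula) falls into the same fold."""
    # Collect the row indices belonging to each unique group.
    rows_by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        rows_by_group[label].append(idx)
    if k > len(rows_by_group):
        raise ValueError("k cannot exceed the number of unique groups")

    # Greedy balancing (an assumption of this sketch): place the largest
    # groups first, always into the fold that currently holds the fewest rows.
    folds = [[] for _ in range(k)]
    ordered = sorted(rows_by_group, key=lambda g: (-len(rows_by_group[g]), str(g)))
    for label in ordered:
        smallest = min(range(k), key=lambda f: len(folds[f]))
        folds[smallest].extend(rows_by_group[label])

    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for j in range(k) if j != f for i in folds[j])
        yield train_idx, test_idx


# Hypothetical toy data: each string stands in for one unique alloy formula,
# repeated once per processing condition.
formulas = [
    "Cu-2Ni-0.5Si", "Cu-2Ni-0.5Si", "Cu-3Ni-0.7Si",
    "Cu-0.3Cr", "Cu-0.3Cr", "Cu-0.1Zr",
]
splits = list(grouped_kfold(formulas, k=3))
for train_idx, test_idx in splits:
    train_formulas = {formulas[i] for i in train_idx}
    test_formulas = {formulas[i] for i in test_idx}
    # No alloy formula may appear on both sides of a split.
    assert train_formulas.isdisjoint(test_formulas)
```

Because all rows of a given formula are held out together, a score computed on these splits reflects performance on formulations the model has never seen, which is the quantity the grouped CV R² values report.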
The models were further applied to practical tasks, including property prediction for new alloy compositions, processing parameter optimization via differential evolution with metallurgical constraints (achieving hardness up to 293.9 HV or conductivity up to 45.7 %IACS for the same base composition, with prediction intervals reported), and inverse design to identify alloy formulations meeting specified target properties. This work demonstrates the potential of interpretable machine learning to support copper alloy development by enabling rapid computational screening of the compositional and processing parameter space, subject to the generalization limitations identified herein.
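The processing-parameter optimization step can be illustrated with a basic differential evolution loop. This is a sketch under stated assumptions, not the study's implementation: the quadratic `toy_hardness` function is a hypothetical stand-in for a trained model, the metallurgical constraints are reduced to simple box bounds on aging temperature and log aging time, and the control parameters (`f`, `cr`, population size) are illustrative. In practice, `scipy.optimize.differential_evolution` is a ready-made alternative.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, generations=200,
                           f=0.7, cr=0.9, seed=0):
    """Minimise `objective` inside the box `bounds` (DE/rand/1/bin scheme)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    value = min(max(value, lo), hi)  # clip to the allowed range
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]


def toy_hardness(x):
    """Hypothetical surrogate: hardness (HV) peaking at 450 degC and
    log10(aging time / h) = 0.5. Not fitted to any real data."""
    temp_c, log_time_h = x
    return 250.0 - 0.002 * (temp_c - 450.0) ** 2 - 30.0 * (log_time_h - 0.5) ** 2


# Box bounds standing in for metallurgical constraints (assumed values).
bounds = [(300.0, 600.0), (-1.0, 2.0)]
# DE minimises, so hardness is maximised by minimising its negative.
best_x, best_neg_hv = differential_evolution(lambda x: -toy_hardness(x), bounds)
print(f"aging temperature ~ {best_x[0]:.1f} degC, "
      f"log10(time/h) ~ {best_x[1]:.2f}, hardness ~ {-best_neg_hv:.1f} HV")
```

Swapping `toy_hardness` for a conductivity surrogate, or for a weighted combination of both properties, gives the corresponding conductivity-oriented or trade-off search over the same bounds.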