a) Scores plots of OPLS-DA modeling of BAL cell proteome alterations between Smokers (circles) vs Never-smokers (open triangles). The OPLS-DA model suggested perfect separation and predictive power between Smokers and Never-smokers (p(CV-ANOVA) = 6.2 × 10⁻¹⁹) using 506 significant proteins (|p(corr)| > 0.34, n = 42; Additional file 1: Table S2). Stratification by gender revealed a highly significant separation in females (b; p(CV-ANOVA) = 3.2 × 10⁻¹⁰), fitted using 401 significant proteins (Additional file 1: Table S3), with a near-perfect predictive power of 98% (R² = 0.98, Q² = 0.94). In males, the group separation was also significant (c; p(CV-ANOVA) = 4.5 × 10⁻⁶), fitted using 301 significant proteins (Additional file 1: Table S4), with a predictive power of 87% (R² = 0.95, Q² = 0.87). As displayed in the Venn diagram in panel d, 199 proteins were altered in both female and male Smokers, while 202 proteins were altered only in female Smokers and 102 proteins were altered only in male Smokers. An additional 75 proteins were altered only in the joint gender model. The majority of the significantly altered proteins from the gender-stratified models were also altered in the joint gender model: 95% in male and 86% in female Smokers. Keys: t[1]: scores of the OPLS-DA predictive component; to[1]: scores of the first orthogonal component of the OPLS-DA model; p: cross-validated (CV)-ANOVA p-value for significance of group separation in the model.