The Statistical Significance of Stepwise Regression Models Developed by Forward Selection: A Monte Carlo Calibration

By Shelby H. McIntyre, David Bruce Montgomery, V. “Seenu” Srinivasan, Barton Weitz
1981, Working Paper No. 624

Information for evaluating the statistical significance of stepwise regression models developed with a forward selection procedure is presented. Cumulative distributions of the adjusted coefficient of determination (adjusted R²) under the null hypothesis of independence between the dependent variable and m potential independent variables are derived from a Monte Carlo study. The study design employed sample sizes of 25, 50, and 100; 10, 20, and 40 available independent variables; and three criteria for entering variables into the regression model. The results reveal that two well-known rules for testing statistical significance are severely biased, which demonstrates the desirability of using the Monte Carlo cumulative adjusted R² distributions developed in this paper. Extensions to the correlated predictor case are considered.
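
The kind of Monte Carlo calibration described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not state the three inclusion criteria, so the sketch assumes the simplest possible rule (stop after a fixed number of steps, `k`), draws uncorrelated standard normal predictors, and uses the classical adjusted R² formula. The function names, the choice of `k=3`, and the 200 replications are all hypothetical and chosen only to keep the example small.

```python
import numpy as np

def forward_selection_adj_r2(X, y, k):
    """Greedy forward selection of k columns of X.

    Returns the classical adjusted R^2 of the final k-variable model.
    ASSUMPTION: a fixed number of steps k stands in for the paper's
    inclusion criteria, which the abstract does not specify.
    """
    n, m = X.shape
    selected, remaining = [], list(range(m))
    tss = np.sum((y - y.mean()) ** 2)
    rss = tss
    for _ in range(k):
        best_j, best_rss = None, np.inf
        for j in remaining:
            # Fit intercept + already selected columns + candidate j.
            A = np.column_stack([np.ones(n), X[:, selected + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ coef
            cand_rss = resid @ resid
            if cand_rss < best_rss:
                best_j, best_rss = j, cand_rss
        selected.append(best_j)
        remaining.remove(best_j)
        rss = best_rss
    r2 = 1.0 - rss / tss
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def null_adj_r2_distribution(n, m, k, reps, seed=0):
    """Sorted adjusted R^2 values when y is independent of all m predictors."""
    rng = np.random.default_rng(seed)
    out = np.empty(reps)
    for i in range(reps):
        X = rng.standard_normal((n, m))   # ASSUMPTION: uncorrelated predictors
        y = rng.standard_normal(n)        # null hypothesis: y unrelated to X
        out[i] = forward_selection_adj_r2(X, y, k)
    return np.sort(out)

# One cell of the design: n = 25 observations, m = 10 candidate predictors.
dist = null_adj_r2_distribution(n=25, m=10, k=3, reps=200)
crit_95 = np.quantile(dist, 0.95)  # empirical 5% critical value for adjusted R^2
print(f"Empirical 95th percentile of adjusted R^2 under the null: {crit_95:.3f}")
```

An observed adjusted R² from a forward-selected model of the same size would be compared against `crit_95` rather than against a conventional F-based threshold, since the latter ignores the selection step. Repeating the simulation over the other sample sizes and predictor counts named in the abstract would give one empirical distribution per design cell.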