Summary

Predictor selection is a complex issue that has been studied in many fields, including statistics, data analysis, prediction, and machine learning. In data science, it is addressed at two levels: the individual attribute and multiple attributes. At the individual attribute level, the work is called univariate analysis. It mainly studies the relationship between each individual attribute and the dependent variable, as we did in Chapters 4 and 5. Correlation analysis between an individual attribute and the dependent variable indicates the predictive power of that attribute, so predictors can be chosen according to their strength. Multiple attribute analysis, called multivariate analysis, focuses on the covariance among the attributes. It identifies strong correlation or collinearity between attributes, so that only one representative attribute from each correlated group needs to be selected as a predictor. Predictor selection is also influenced by the model being constructed. This will become clear in later chapters, after model construction is introduced.
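
The two levels of analysis summarized above can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure used in the earlier chapters: the attribute names (`x1`, `x1_rescaled`, `x2`), the toy values, the helper `select_predictors`, and both thresholds are assumptions made up for this example. The sketch first ranks attributes by the absolute Pearson correlation with the dependent variable, then skips any attribute that is nearly collinear with one already selected.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def select_predictors(features, target, min_target_corr=0.5, max_pair_corr=0.9):
    """Hypothetical helper: both thresholds are illustrative assumptions."""
    # Individual attribute level: strength of each attribute against the target.
    strength = {name: abs(pearson(col, target)) for name, col in features.items()}
    ranked = sorted(
        (name for name in strength if strength[name] >= min_target_corr),
        key=lambda name: strength[name],
        reverse=True,
    )
    # Multiple attribute level: keep one representative per correlated group.
    selected = []
    for name in ranked:
        collinear = any(
            abs(pearson(features[name], features[kept])) >= max_pair_corr
            for kept in selected
        )
        if not collinear:
            selected.append(name)
    return selected, strength


# Toy data, invented for illustration only.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
features = {
    "x1": x1,
    "x1_rescaled": [2 * v + 1 for v in x1],  # perfectly collinear with x1
    "x2": [5, 1, 4, 2, 5, 1, 4, 2],          # weakly related to the target
}
target = [1.1, 2.3, 2.9, 4.2, 4.8, 6.1, 7.2, 7.9]

selected, strength = select_predictors(features, target)
print("selected predictors:", selected)
print("correlation with target:", strength)
```

With this toy data, only one of the two collinear attributes is kept and the weak attribute is dropped. In practice the thresholds depend on the data and, as noted above, on the model that is eventually constructed.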