Advanced Predictive Modeling Using IBM SPSS Modeler (v18.1.1)
This course presents advanced models for predicting categorical and continuous targets. Before reviewing the models, the course addresses data preparation issues such as partitioning, detecting anomalies, and balancing data. Participants are then introduced to PCA/Factor, a technique that reduces the number of fields to a smaller set of core fields, referred to as components or factors. The next units focus on supervised models, including Decision List, Support Vector Machines, Random Trees, and XGBoost. The final units review methods for combining supervised models and executing them in a single run, for both categorical and continuous targets.
• Business Analysts
• Data Scientists
• Users of IBM SPSS Modeler responsible for building predictive models
- Preparing data for modeling
- Reducing data with PCA/Factor
- Creating rulesets for flag targets with Decision List
- Exploring advanced supervised models
- Combining models
- Finding the best supervised model
• Familiarity with the IBM SPSS Modeler environment (creating, editing, opening, and saving streams).
• Familiarity with basic modeling techniques, gained either by completing the courses Predictive Modeling for Categorical Targets Using IBM SPSS Modeler and/or Predictive Modeling for Continuous Targets Using IBM SPSS Modeler, or through experience with predictive models in IBM SPSS Modeler.
1: Preparing data for modeling
• Address general data quality issues
• Handle anomalies
• Select important predictors
• Partition the data to better evaluate models
• Balance the data to build better models
2: Reducing data with PCA/Factor
• Explain the idea behind PCA/Factor
• Determine the number of components/factors
• Explain the principle of rotating a solution
3: Creating rulesets for flag targets with Decision List
• Explain how Decision List builds a ruleset
• Use Decision List interactively
• Create rulesets directly with Decision List
4: Exploring advanced supervised models
• Explain the principles of Support Vector Machine (SVM)
• Explain the principles of Random Trees
• Explain the principles of XGBoost
5: Combining models
• Use the Ensemble node to combine model predictions
• Improve model performance by meta-level modeling
6: Finding the best supervised model
• Use the Auto Classifier node to find the best model for categorical targets
• Use the Auto Numeric node to find the best model for continuous targets
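
As a rough illustration of the idea behind unit 5 (combining model predictions), the sketch below shows three common combination rules in plain Python: majority voting and confidence-weighted voting for a categorical target, and averaging for a continuous target. This is not the Ensemble node's implementation and does not use any IBM SPSS Modeler API. The function names, the labels "T" and "F", and the sample predictions are all hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict
from statistics import mean


def majority_vote(labels):
    """Return the label predicted by the most models.

    Ties resolve to the label that was seen first.
    """
    counts = Counter(labels)
    return max(counts, key=counts.get)


def confidence_weighted_vote(predictions):
    """Return the label with the largest summed confidence.

    `predictions` is a list of (label, confidence) pairs, one per model.
    """
    totals = defaultdict(float)
    for label, confidence in predictions:
        totals[label] += confidence
    return max(totals, key=totals.get)


def average_prediction(values):
    """Combine continuous predictions by taking their mean."""
    return mean(values)


# Hypothetical predictions from five models for a single record.
# Three models weakly predict "T"; two models strongly predict "F".
model_outputs = [("T", 0.51), ("T", 0.52), ("T", 0.55), ("F", 0.95), ("F", 0.90)]

vote_result = majority_vote([label for label, _ in model_outputs])
weighted_result = confidence_weighted_vote(model_outputs)

# Hypothetical predictions from three models for a continuous target.
numeric_result = average_prediction([10.0, 12.0, 14.0])

print(vote_result, weighted_result, numeric_result)
```

With these sample values the two voting rules disagree: a simple vote follows the three weak "T" predictions, while the confidence-weighted rule follows the two strong "F" predictions. That difference is one reason the choice of combination rule matters when several models are combined.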