Predictive Analytics Proof of Concept (POC) September 2014
Additional Information on the Sac State Predictive Analytics POC EDUCAUSE 2014 Poster Session
Proof of Concept (POC) Objectives
1. Provide predictive insights for a university-wide strategic issue/program (e.g. student success and student retention)
2. Demonstrate the capability of predictive analytics for broader application
3. Develop expertise with IBM SPSS Modeler in partnership with the vendor and key campus leaders
4. Identify gaps in the data and next steps for architecting and deploying a predictive analytics solution
Predictive Analytics Journey
Subset of Indicators and Milestones identified by the Institute for Higher Education Leadership and Policy (IHELP): “Student Flow Analysis: CSU Student Progress Toward Graduation” *
Indicators:
1. Enroll full time
2. Earn summer credits
3. Complete a college success course or first-year experience program
Milestones:
1. First semester grade point average
2. Second semester grade point average
SPSS Modeler “Stream” Using Factors from Published Study
Inside the “Super Node” Additional Data Prep
“Auto Prep” Option in SPSS Modeler Choose Speed, Accuracy, or Manual
How Good was the POC Model?
SPSS Modeler Predictions for Each Student in Cohort
Predictive Analytics POC Lessons Learned
1. Learned how to use IBM SPSS Modeler
2. Data Prep is key, and time consuming!
3. Consider moving some of the Data Prep to the ETL layer (i.e. model the data so it can easily be used as input for analytics)
4. You must “know your data”
5. You must be familiar with statistical methods to prep the data properly and to understand the results
6. Optimal Predictive Analytics Project Team: Data Modeler, BI Analyst, Subject Matter Expert from functional area, and Data Scientist
7. Correlation vs. Cause
8. The output may be one step in developing advising programs, identifying advising cohorts, or advising individuals; however, caution should be taken in directly advising a student based on one predictive model looking 5 years out
9. Predictive analytics is an on-going, iterative process
10. There is an opportunity to write the predicted outcomes to the data warehouse and use them to track the usefulness of the model and to create dashboards to track the success of resulting programs
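As an illustration of lesson 10, the sketch below stores per-student predictions and later compares them with actual outcomes. It is only a minimal sketch under stated assumptions: an in-memory SQLite database stands in for the data warehouse, and the table names, column names, student IDs, model version label, and values are all hypothetical, not taken from the POC.

```python
import sqlite3

# In-memory SQLite stands in for the data warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical tables: one for model output, one for observed outcomes.
conn.execute(
    """CREATE TABLE predicted_outcome (
           student_id TEXT,
           model_version TEXT,
           predicted_grad INTEGER,   -- 1 = predicted to graduate
           confidence REAL,
           PRIMARY KEY (student_id, model_version))"""
)
conn.execute(
    """CREATE TABLE actual_outcome (
           student_id TEXT PRIMARY KEY,
           graduated INTEGER)"""     # 1 = graduated
)

# Made-up rows for illustration only.
predictions = [
    ("S001", "poc-v1", 1, 0.91),
    ("S002", "poc-v1", 0, 0.67),
    ("S003", "poc-v1", 1, 0.58),
    ("S004", "poc-v1", 0, 0.80),
]
conn.executemany("INSERT INTO predicted_outcome VALUES (?, ?, ?, ?)", predictions)

actuals = [("S001", 1), ("S002", 0), ("S003", 0), ("S004", 0)]
conn.executemany("INSERT INTO actual_outcome VALUES (?, ?)", actuals)


def model_hit_rate(conn, model_version):
    """Share of students whose stored prediction matched the actual outcome.

    Returns None when no predictions for that model version have outcomes yet.
    """
    total, hits = conn.execute(
        """SELECT COUNT(*), SUM(p.predicted_grad = a.graduated)
           FROM predicted_outcome p
           JOIN actual_outcome a ON a.student_id = p.student_id
           WHERE p.model_version = ?""",
        (model_version,),
    ).fetchone()
    return hits / total if total else None


print(model_hit_rate(conn, "poc-v1"))
```

Keeping a model version on each stored prediction lets a dashboard compare successive iterations of the model against the same actual outcomes.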
Additional POC Work
1. In addition to focusing on the IHELP indicators and milestones, several models using a broader set of data from the data warehouse were developed
2. Experimented with different cohort years and different targets
3. Used IBM SPSS Modeler to develop a basic POC Faculty Retention Model
4. Used IBM SPSS Modeler for descriptive analytics for AD ASTRA event and scheduling data
Predictive Analytics Next Steps
1. Link to campus strategic plan, identify an opportunity for predictive analytics to contribute to its success, and build target models with the “optimal team” as described previously (tight collaboration with campus functional areas)
2. Continue to develop models focused on student success, but explore other areas such as university advancement, scheduling, etc.
3. Move from using flat file extracts to connecting IBM SPSS Modeler directly to the data warehouse
4. Develop data models and ETLs to better prep the data for predictive analytics and data mining
5. Identify missing data or data gaps and close the gaps if possible with the data that we have
6. Partition data to develop the model on a subset of data and then test its predictive power on the remaining set
7. Continue to learn and build expertise with SPSS Modeler and its capabilities as well as continue to build expertise in statistical methods in general
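The partitioning step listed among the next steps can be sketched outside SPSS Modeler as follows. This is a minimal illustration, not the POC's method: the cohort is synthetic toy data generated in the code, the 70/30 split is an assumed ratio, and the "model" is a deliberately simple GPA cutoff rule chosen only to show fitting on one partition and scoring on the other.

```python
import random


def partition(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(records, cutoff):
    """Fraction of records where 'GPA >= cutoff' matches the graduated flag."""
    return sum((gpa >= cutoff) == graduated for gpa, graduated in records) / len(records)


def fit_gpa_cutoff(train):
    """Pick the GPA cutoff with the best accuracy on the training partition."""
    candidates = sorted({gpa for gpa, _ in train})
    return max(candidates, key=lambda cutoff: accuracy(train, cutoff))


# Synthetic toy cohort (illustration only): (first-semester GPA, graduated).
rng = random.Random(0)
cohort = []
for _ in range(200):
    gpa = round(rng.uniform(0.0, 4.0), 2)
    graduated = gpa + rng.gauss(0, 0.8) >= 2.0
    cohort.append((gpa, graduated))

train, test = partition(cohort)
cutoff = fit_gpa_cutoff(train)

# The held-out score is the one that estimates predictive power on new students.
print("train accuracy:", accuracy(train, cutoff))
print("test accuracy: ", accuracy(test, cutoff))
```

Fixing the seed makes the split reproducible, so repeated runs of the model-building process can be compared on the same held-out students.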