Ivy Tech develops machine learning algorithm to identify at-risk students and provide early intervention

By shifting to Google Cloud Platform, Ivy Tech scales to manage 12 million data points from student interactions, improving the AI engine it built to analyze student engagement and predict course outcomes with 80% accuracy.

As universities move more of their services online, they find themselves awash in the data they collect and back up every day. Ivy Tech, a community college with campuses across Indiana, has over 50,000 students whose daily interactions with the institution have generated over 26 terabytes of stored data. Lige Hensley, Chief Technology Officer at Ivy Tech, and his team realized that this enormous data warehouse gave them a unique opportunity to ask new questions about student behavior. How, for example, might they analyze the data to help students better succeed in the classroom? It was also a chance to shift a paradigm. “In higher ed, most data analysis is done backwards,” Hensley explains. “It tells you in hindsight what worked and what didn’t. It’s reactive.” But for their data to help students most, it needed to be predictive. The team started calling their data warehouse the New Thing, and soon it became NewT for short.

"By shifting to GCP and TensorFlow we can do a lot more what ifs much faster."

Lige Hensley, Chief Technology Officer, Ivy Tech

Helping students succeed using AI

Ignoring course grades, attendance, and demographic data, the team built an algorithm to look for patterns in students’ online behavior based on anonymized and aggregated metadata from their learning management system. At first they built the algorithms by hand, refining as they went along. Then, as they added more and more training data, they shifted to machine learning, which led to new insights: for example, they discovered that the more students interacted with the college online, the higher their course grades were. Soon the algorithms could predict a student’s final grade in a course with 60%-70% accuracy by week two of the semester.
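
To make the idea concrete, here is a minimal sketch of predicting a course outcome from early online engagement. It is not Ivy Tech's algorithm, which the article does not describe in detail. It assumes a single hypothetical feature (logins in the first two weeks), a pass/fail label, and a plain logistic regression trained by gradient descent. The data, the student IDs, and every function name are invented for illustration.

```python
import math

# Invented toy data, NOT Ivy Tech's: logins during the first two weeks of a
# course, and whether the student later earned a C or better (1) or not (0).
LOGINS = [1, 2, 3, 4, 8, 9, 10, 12]
PASSED = [0, 0, 0, 0, 1, 1, 1, 1]
SCALE = 10.0  # keeps the feature near the 0-1 range so gradient descent is stable


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(logins, passed, learning_rate=1.0, epochs=5000):
    """Fit a one-feature logistic regression with batch gradient descent."""
    weight, bias = 0.0, 0.0
    n = len(logins)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(logins, passed):
            error = sigmoid(weight * x / SCALE + bias) - y
            grad_w += error * x / SCALE
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias


def pass_probability(model, logins):
    """Predicted probability that a student with this many logins passes."""
    weight, bias = model
    return sigmoid(weight * logins / SCALE + bias)


def flag_at_risk(model, week_two_logins, threshold=0.5):
    """Return the IDs whose predicted chance of passing is below the threshold."""
    return sorted(
        student_id
        for student_id, logins in week_two_logins.items()
        if pass_probability(model, logins) < threshold
    )


model = train(LOGINS, PASSED)
# Hypothetical anonymized student IDs mapped to their week-two login counts.
at_risk = flag_at_risk(model, {"s1": 1, "s2": 11, "s3": 3})
print(at_risk)
```

In this toy setup the two low-engagement students are flagged and the high-engagement student is not. A real system would draw on many more signals and far more data, and the 0.5 threshold here is an arbitrary assumption.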

In the fall of 2016 they decided to pilot an experiment. Pulling data from 10,000 course sections, they identified 16,000 students statistically at risk for failing by the second week of the semester. Over the next two weeks they assigned outreach workers to call each student and record the results, which were surprising. Some students had lost their electricity and didn’t know Ivy Tech offered emergency resources. Others were missing course materials so the team intervened with the bookstore to make sure the books became available. Most of the obstacles students encountered were not strictly academic, and many were easily solved once the institution knew about them. By the end of the semester the pilot had helped 3,000 of the students they contacted get a grade of C or better—and 98% of them were either neutral or happy about being contacted. “We had the largest percentage drop in bad grades (Ds and Fs) that the college had recorded in fifty years,” Hensley reports. “That one phone call wasn’t everything but it certainly made a bigger dent than we had ever seen.” Now he estimates that Project Student Success has helped 34,712 students—and counting.

Leveraging GCP tools to build a flexible AI engine

Now the team can run their models and generate predictions every day with 80% accuracy. The ever-growing volume of data continually improves the predictions, but also creates new challenges. “We couldn’t buy a box big enough to do this anymore,” Hensley asserts. “Going to the cloud was the only way to go. We had hit the limit.” A year ago, the team decided to migrate to the integrated suite of AI tools on Google Cloud Platform (GCP), using TensorFlow to build their machine learning system. P.J. Hinton, Senior Data Scientist at Ivy Tech, points out that “instead of having to beg for more memory or more cores we can use TensorFlow to scale up. The fact that TensorFlow is open source and you can get it running on your development environment to test locally before pushing it up to ML Engine is very attractive to me so I can test before it costs more. Cloud Dataprep helps you get data into the system quickly, join data sets together without SQL, and then look at patterns to get a feel for it. And it’s very good at summarizing data numerically and visually so you can see what’s going on.”

“Bringing data together from a variety of sources is what tells the complete picture,” Hensley argues. “Looking at student engagement on learning management systems alone can’t do that.” He notes that most data analysis programs simply build new case studies to analyze new scenarios: they can add data, but as the scenarios change, each model becomes obsolete and must be rebuilt from scratch. NewT instead provides an AI engine that can flexibly create its own models as scenarios change.

Hensley and Hinton have a long list of next questions to ask through NewT: how do one semester’s grades predict long-term student performance? How do Ivy Tech’s different campuses across the state create different learning environments and student outcomes? How do student grades map against workplace-readiness certifications? These questions, and their answers, lie just around the corner. “By looking at the data holistically we’ve learned a lot,” Hensley says. “We’re combining AI and more traditional statistical analysis for more valuable results. This pointed us in directions we didn’t know to look. By shifting to GCP and TensorFlow we can do a lot more what ifs much faster. Now we can take it to the next level, which is where we’re at today.”

"The fact that Tensorflow Library is open source and lets me test locally is very important to me for managing costs."

P.J. Hinton, Senior Data Scientist, Ivy Tech
