We operate a number of heavy repair facilities that perform maintenance and upgrades on our rail car fleet. When a car needs maintenance and arrives at a shop, the extent of labor required is not known until the car is thoroughly inspected and an estimate is written. Rail car type, miles traveled, number of loads and unloads, commodity properties, customer handling, compliance programs coming due, and other factors can all influence the amount of repair work required. Better maintenance predictions would improve the scheduling of rail cars and their allocation across shops.
We have a rudimentary system in place based on random forests built in Python (pandas, scikit-learn, Orange) / Excel, but it needs refinement.
The blue-sky objective would be to build this insight into a tool that helps direct shop loading decisions (e.g., can we justify sending this car 500 miles farther to a shop with more available capacity?).
Experiential Learning Program Details
|School||University of Notre Dame Mendoza College of Business|
|Engagement Format||Capstone - Small Team Consulting Project - Students work in small groups of 2-6 directly with faculty and host company project champions on developing real solutions to real-world challenges.|
|Students Enrolled||5 Students per Group (61 Enrolled in Program)|
|Meeting Day & Time||Monday OR Wednesday (3:00 - 4:50 PM ET)|
|Student Time Commitment||4-7 Hours Per Week|
|Company Time Commitment||2 Hours|
|Touchpoints & Assignments||Due Date||Submission|
Key Project Milestones
February 7, 2020 - Define
Introduction to GATX’s business with a focus on rail car repair operations. The current process and preliminary data will be shared with students for initial exploration.
The goal of the project is to predict the expected labor hours required in several categories of repair (cleaning / commodity flare, mechanical, interior blast, interior lining, exterior paint) using known characteristics of the car. Accurate labor predictions will inform future labor loads and can be used to estimate expected cycle times of repair events.
Define data scope (time period, data items)
Finalize milestones and project charter
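The prediction goal stated for this milestone (labor hours per repair category from known car characteristics) can be sketched as a multi-output regression. The example below is only an illustration under stated assumptions: the data is synthetic, and every column name (`car_age_years`, `miles_since_last_shop`, `load_cycles`, and the `*_hours` targets) is hypothetical rather than taken from GATX's systems. It uses a random forest because the existing system is described as random-forest based.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Repair categories named in the project description (column names are hypothetical).
TARGETS = ["cleaning_hours", "mechanical_hours", "interior_blast_hours",
           "interior_lining_hours", "exterior_paint_hours"]

def make_synthetic_cars(n_rows: int, seed: int = 0) -> pd.DataFrame:
    """Build a made-up data set; every column and relationship here is an assumption."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "car_age_years": rng.integers(1, 40, n_rows),
        "miles_since_last_shop": rng.uniform(5_000, 200_000, n_rows),
        "load_cycles": rng.integers(10, 500, n_rows),
    })
    for i, target in enumerate(TARGETS):
        # Arbitrary synthetic signal plus noise, only so the model has something to fit.
        df[target] = (0.5 * (i + 1) * df["car_age_years"]
                      + df["miles_since_last_shop"] / 20_000
                      + rng.normal(0, 2, n_rows)).clip(lower=0)
    return df

cars = make_synthetic_cars(300)
features = cars.drop(columns=TARGETS)

# RandomForestRegressor accepts a 2-D target, so one model predicts all categories.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(features, cars[TARGETS])

# Predict per-category hours for a few cars and add a total per car.
predicted = pd.DataFrame(model.predict(features.head(3)), columns=TARGETS)
predicted["total_hours"] = predicted.sum(axis=1)
print(predicted.round(1))
```

A single multi-output model is one possible design; fitting a separate model per repair category is an equally reasonable alternative if the categories respond to different features.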
February 28, 2020 - Measure/Analyze
The data provided will come from production databases and will be shared in its raw, uncleaned state. It will need to be cleaned and wrangled into a form suitable for input into a machine learning algorithm.
Once the data is in a manageable state, the team will need to decide which data is relevant to the problem at hand. Feature selection can be justified with statistics or by knowledge of the relationship between the feature and the target.
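As a rough illustration of the cleaning step described above, the sketch below tidies a small invented extract with pandas. The column names, car IDs, values, and the specific data problems (trailing spaces, duplicated rows, numbers stored as text, negative hours) are all assumptions made for the example; the real production data may have entirely different issues.

```python
import pandas as pd

# Hypothetical raw extract: names, values, and problems are invented for illustration.
raw = pd.DataFrame({
    "Car ID ": ["C001", "C002", "C002", "C003", "C004"],
    "Car Type": ["tank", "Tank ", "Tank ", "HOPPER", None],
    "Miles": ["120,500", "88000", "88000", "n/a", "45,250"],
    "Mechanical Hours": [14.5, 22.0, 22.0, -3.0, 9.0],
})

def clean(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Normalize column names to snake_case.
    out.columns = (out.columns.str.strip().str.lower()
                   .str.replace(" ", "_", regex=False))
    # Drop exact duplicate rows (e.g., a repair event extracted twice).
    out = out.drop_duplicates()
    # Standardize categorical text and label missing values explicitly.
    out["car_type"] = out["car_type"].str.strip().str.lower().fillna("unknown")
    # Parse numbers stored as text; unparseable entries become NaN.
    out["miles"] = pd.to_numeric(
        out["miles"].str.replace(",", "", regex=False), errors="coerce")
    # Treat negative labor hours as data-entry errors and drop those rows.
    out = out[out["mechanical_hours"] >= 0]
    return out.reset_index(drop=True)

cleaned = clean(raw)
print(cleaned)
```

Whether to drop, impute, or flag a suspect row is a judgment call that should be agreed with the data owners; dropping is used here only to keep the example short.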
Feature selection
Regression model selection
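One way to justify feature selection with statistics is a univariate F-test between each candidate feature and the target. The sketch below uses scikit-learn's `SelectKBest` with `f_regression` on synthetic data; the feature names and the relationship between features and target are assumptions for illustration, and the choice of `k=2` is arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_regression

rng = np.random.default_rng(0)
n = 500

# Synthetic candidates: two informative features and two pure-noise features.
X = pd.DataFrame({
    "car_age_years": rng.uniform(1, 40, n),
    "miles_since_last_shop": rng.uniform(5_000, 200_000, n),
    "noise_a": rng.normal(size=n),
    "noise_b": rng.normal(size=n),
})
# Made-up target: labor hours driven by age and mileage plus random noise.
y = 0.8 * X["car_age_years"] + X["miles_since_last_shop"] / 10_000 + rng.normal(0, 1, n)

# Score each feature against the target and keep the k highest-scoring ones.
selector = SelectKBest(score_func=f_regression, k=2).fit(X, y)
scores = pd.Series(selector.scores_, index=X.columns).sort_values(ascending=False)
selected = list(X.columns[selector.get_support()])

print(scores.round(1))
print("selected:", selected)
```

The F-test only captures linear, one-feature-at-a-time relationships, so domain knowledge or model-based importance measures may still be needed for features that matter only in combination.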
March 27, 2020 - Improve
The team will develop a method for applying predictive analytics and evaluate the effectiveness of their method.
This method will be used by the business user(s) on an ongoing basis to evaluate current loads and inform daily operational loading decisions. As such, the method must be well-defined and programmatic. Required inputs and generated outputs must be documented for repeatability.
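To make the documented inputs enforceable rather than descriptive, the input specification can be expressed in code and checked before each run. The sketch below is one minimal way to do that; the specification contents, column names, and the `validate_input` helper are hypothetical and stand in for whatever the team ultimately defines.

```python
import pandas as pd

# Hypothetical specification of the prediction input file: column name -> kind.
PREDICTION_INPUT_SPEC = {
    "car_id": "text",
    "car_type": "text",
    "miles_since_last_shop": "number",
}

def validate_input(df: pd.DataFrame, spec: dict) -> list:
    """Return a list of problems; an empty list means the file matches the spec.

    Every column in the spec must be present, and columns marked "number"
    must have a numeric dtype. Text columns are only checked for presence.
    """
    problems = []
    for column, kind in spec.items():
        if column not in df.columns:
            problems.append(f"missing column: {column}")
        elif kind == "number" and not pd.api.types.is_numeric_dtype(df[column]):
            problems.append(f"column {column} should be numeric")
    return problems

good = pd.DataFrame({
    "car_id": ["C001"],
    "car_type": ["tank"],
    "miles_since_last_shop": [120500.0],
})
bad = pd.DataFrame({
    "car_id": ["C002"],
    "miles_since_last_shop": ["88,000"],  # number stored as text
})

print(validate_input(good, PREDICTION_INPUT_SPEC))
print(validate_input(bad, PREDICTION_INPUT_SPEC))
```

Returning a list of problems instead of raising on the first one lets a business user fix an input file in a single pass.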
Specification of training input file (time scope, file format, column names, data types, etc.)
Specification of prediction input file
Specification of output file
Training program / script
Final prediction program / script
Evaluation of model (cross-validation scores [R², RMSE, MAE, etc.] and known issues)
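The cross-validation scores listed in the evaluation deliverable (R², RMSE, MAE) can be computed in one pass with scikit-learn's `cross_validate`. The sketch below assumes synthetic data and a random forest regressor purely for illustration; the fold count and model settings are arbitrary choices, not recommendations.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_validate

rng = np.random.default_rng(0)
n = 300

# Synthetic features (car age, miles) and a made-up labor-hours target.
X = np.column_stack([rng.uniform(1, 40, n), rng.uniform(5_000, 200_000, n)])
y = 0.8 * X[:, 0] + X[:, 1] / 10_000 + rng.normal(0, 1, n)

# scikit-learn reports error metrics negated so that higher is always better.
scoring = {
    "r2": "r2",
    "rmse": "neg_root_mean_squared_error",
    "mae": "neg_mean_absolute_error",
}
cv = KFold(n_splits=5, shuffle=True, random_state=0)
results = cross_validate(
    RandomForestRegressor(n_estimators=100, random_state=0),
    X, y, cv=cv, scoring=scoring,
)

# Average across folds and flip the sign of the error metrics back to positive.
summary = {
    "r2": results["test_r2"].mean(),
    "rmse": -results["test_rmse"].mean(),
    "mae": -results["test_mae"].mean(),
}
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

RMSE and MAE are in the same units as the target (labor hours), which makes them easier to communicate to shop management than R² alone.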
April 17, 2020 - Control
Define how the technical output can be utilized by the business for sustained effectiveness.
Identify any known gaps in the final model and suggest solutions for these gaps.
Mockups / suggestions for a front-end user interface for use by the operations scheduler and shop management
Final report out (slide deck, white paper, or similar) targeted toward management
  - Audience for the report should be assumed to be technical (familiar with stats) but not experts in ML or predictive analytics
  - Target a ~15 minute presentation or ~5 minute read