Time series forecasting is used to predict future values of important quantities or locations that have been measured over time. Values at different points in time are not independent: they are correlated in both time directions, although a causal relationship is possible only in the forward direction. The data also often show cyclical patterns, whether daily, weekly, monthly, or yearly. For example, outdoor temperatures have daily and yearly seasonality, and sales typically have weekly and yearly cycles.
New data scientists often make the mistake of treating time-series data like independent tabular data, ignoring these constraints. The resulting models will: 1) be overconfident in what they know (as will the model creators), and 2) fail to account for problem-specific factors that can give additional insight and accuracy. Time series techniques were created to address these issues.
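The dependence between time points can be made concrete by measuring autocorrelation, the correlation of a series with a shifted copy of itself. The sketch below is only an illustration: the hourly temperature series is synthetic and noise-free, and the function name is our own. On real data the peaks and troughs would be less sharp.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive integer `lag`."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    covariance_sum = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance_sum / variance_sum

# Synthetic "temperature" with a 24-step daily cycle over two weeks
# (made-up data, used only to illustrate the idea).
hourly_temps = [15 + 5 * math.sin(2 * math.pi * t / 24) for t in range(24 * 14)]

acf_1 = autocorrelation(hourly_temps, 1)    # neighboring hours: strongly positive
acf_12 = autocorrelation(hourly_temps, 12)  # half a day apart: strongly negative
acf_24 = autocorrelation(hourly_temps, 24)  # one full cycle apart: strongly positive
```

Independent tabular rows would show autocorrelation near zero at every lag. Here neighboring values are almost redundant, and the strong correlation at lag 24 is the signature of the daily cycle.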
Applications
Demand Forecasting
Demand Forecasting is the practice of predicting future sales and plays a crucial role in nearly every industry. In the Consumer Goods industry, these forecasts inform decisions on inventory management, production planning, and capacity optimization, and they provide essential input into pricing, market potential, and business growth strategies. Although no model can perfectly predict the future, approaches and techniques are becoming so sophisticated that many manufactured items are produced just-in-time, reducing the need for storage and extra transportation. In other industries, such as energy or healthcare, demand forecasts are imperative so that high demand for electricity is planned for and life-saving commodities, such as blood, are acquired when a shortage is anticipated.
Extrapolation of demand forecasts can help prepare a company by providing a probable financial outlook based on each aspect of manufacturing and expected sales, allowing the organization to be more efficient in resource allocation. Nearly every business decision in an organization, from purchase of raw materials to human resource planning, depends on these forecasts.
Survival Analysis
Survival models predict how long it will take for an event to occur. They have their origin in healthcare, where they were used to predict how long someone would live by looking at the trajectory of a handful of their medical records over time. In the general case, the event could be an injured worker returning to work, a machine requiring maintenance, or a client moving to a competitor.
One assumes the event will eventually happen in every case, but your data records the outcome for only some of them. For the other cases, the event may not have happened yet, or it may already have happened without your knowing about it; this is called “censored” data. The challenge is to predict the time to the event intelligently for all cases, even those not in your data. The key is to use the data to estimate probability distributions, which the models can “condition” on the specifics of each case’s inputs.
Users unfamiliar with the survival model framework are tempted to turn the task into a binary classification problem: set a time and then predict whether an entity will survive until that time, for example, how likely an injured employee is to return to work within six months. The problem with this approach is twofold. First, the binning decision throws away information: it treats someone who returns after a month the same as someone who returns one day short of six months, and it treats someone who returns one day past the six-month mark as different from someone who returned a day earlier. This is a major issue with binning or categorizing continuous data. Second, if the question changes to “returns within three months” or “doesn’t return for nine months”, the modelers must recode their data and create a new model, which may have very different characteristics and issues.
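To make the contrast with binary classification concrete, here is a minimal sketch of the Kaplan-Meier estimator, one standard way to estimate a survival distribution from censored data. It is an illustration, not the method used in any engagement described here. The return-to-work records are hypothetical, and the function names are our own. Kaplan-Meier does not condition on each case's inputs; a regression-style survival model would be needed for that.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: how long each case was followed (here, days off work).
    observed:  True if the event was seen at that time, False if censored.
    """
    event_times = sorted({d for d, o in zip(durations, observed) if o})
    curve, survival = [], 1.0
    for t in event_times:
        # Censored cases still count as "at risk" until they drop out.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, o in zip(durations, observed) if o and d == t)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve

def prob_event_by(curve, horizon):
    """Estimated probability that the event has happened by `horizon`."""
    survival = 1.0
    for t, s in curve:
        if t > horizon:
            break
        survival = s
    return 1.0 - survival

# Hypothetical records: days until return to work, or until follow-up ended.
days = [30, 45, 60, 90, 120, 150, 179, 181, 200, 270]
returned = [True, True, False, True, True, False, True, True, False, True]

curve = kaplan_meier(days, returned)
p_three_months = prob_event_by(curve, 90)
p_six_months = prob_event_by(curve, 180)
p_nine_months = prob_event_by(curve, 270)
```

One fitted curve answers the three-month, six-month, and nine-month questions without recoding the data, and the censored cases contribute information for as long as they were observed instead of being dropped or mislabeled.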
Case Studies
- We have used Survival Models to predict:
  - Customers continuing a subscription
  - Employees returning to work from on-the-job injuries
- Effects of government actions during the COVID-19 crisis for a European government’s leadership team (using epidemiological compartment models)
- Unusual spending patterns in government grant recipients (comparing grants to peers)
- Faulty machine sensors (using changepoint detection and classification)
Call Center Staffing Demand Forecast
In this engagement, we partnered with a global fast-food enterprise and its internal call center, which owner-operators contact with questions for the support center or corporate headquarters. Inquiries to the call center cover any topic, but this project focused on IT calls. Its goal was to collaboratively analyze call-center trends and to deploy a minimum viable forecast of call-center demand that would demonstrate applicability for call-center staff planning.
Our client had recently taken call-center staffing in-house from a third-party vendor, and many processes were still in flux. We were able to visualize the data and provide early insights to the client team. We used an Erlang calculator to forecast inbound and outbound calls. We overlaid historic call data with staffing levels to indicate where the call center was likely understaffed, and we put the historic and forecasted data into a series of Tableau workbooks to help inform scheduling decisions.
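For readers unfamiliar with Erlang calculators, the sketch below shows the Erlang C formula, which is commonly used to size call-center staffing. It is an assumption that a calculator of this kind is representative; the text above does not say which Erlang formula or tool was used, and the call volumes, handle times, and function names here are hypothetical.

```python
import math

def erlang_c_wait_probability(offered_load, agents):
    """Erlang C: probability that an arriving call has to queue.

    offered_load is arrival rate times average handle time, in erlangs.
    """
    if agents <= offered_load:
        return 1.0  # more work arrives than can be served; every call waits
    # Erlang B blocking probability via the standard stable recurrence.
    blocking = 1.0
    for k in range(1, agents + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    utilization = offered_load / agents
    return blocking / (1.0 - utilization * (1.0 - blocking))

def service_level(calls_per_hour, avg_handle_seconds, agents, target_seconds):
    """Fraction of calls answered within target_seconds."""
    load = calls_per_hour * avg_handle_seconds / 3600.0
    if agents <= load:
        return 0.0
    p_wait = erlang_c_wait_probability(load, agents)
    decay = math.exp(-(agents - load) * target_seconds / avg_handle_seconds)
    return 1.0 - p_wait * decay

def agents_needed(calls_per_hour, avg_handle_seconds, target_seconds, target_level):
    """Smallest agent count that meets the service-level target."""
    load = calls_per_hour * avg_handle_seconds / 3600.0
    agents = max(1, math.floor(load) + 1)
    while service_level(calls_per_hour, avg_handle_seconds, agents, target_seconds) < target_level:
        agents += 1
    return agents

# Hypothetical hour: 100 calls, 5-minute average handle time,
# target of 80% of calls answered within 20 seconds.
required = agents_needed(100, 300, 20, 0.80)
```

Feeding forecasted call volumes for each interval into a function like `agents_needed` turns a demand forecast into a staffing requirement, which can then be compared against scheduled staffing levels.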
Shipping Logistics
Our well-known logistics company client was three months into a highly visible analytics project with an urgent need to generate forecast results. Given the strategic importance of this effort, they needed to quickly scale a prototype forecast model into their automated production system, which interfaced with a new platform for their planners. We provided a bridge between technical experts and the development team responsible for the platform. We worked collaboratively with the prototype model authors, software developers and architects, database administrators, and business stakeholders to ensure that our framework met requirements, interfaced with existing systems, and provided the flexibility needed for future development. We also offered a valued perspective on statistical and optimization techniques for the prototype designers.
We implemented an automated framework for inbound demand forecasting at a major logistics company. In three weeks we delivered a functioning forecasting framework, and within six months we had scaled it to produce 35 million forecasts for more than 2,000 locations in under one hour. The framework features automated execution and algorithm selection at short-, medium-, and long-term horizons. At a two-week horizon, our forecasts had a median accuracy of 88%.
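Automated algorithm selection can be sketched in a few lines. The example below is not the framework described above. It is a simplified illustration under our own assumptions: three basic candidate forecasters, a holdout of the most recent observations, and mean absolute error as the selection criterion. The volume data and all names are hypothetical.

```python
def naive(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon

def seasonal_naive(history, horizon, season=7):
    """Repeat the most recent full season (weekly by default)."""
    return [history[-season + (h % season)] for h in range(horizon)]

def moving_average(history, horizon, window=7):
    """Repeat the mean of the last `window` observations."""
    average = sum(history[-window:]) / window
    return [average] * horizon

CANDIDATES = {
    "naive": naive,
    "seasonal_naive": seasonal_naive,
    "moving_average": moving_average,
}

def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def select_algorithm(series, horizon):
    """Pick the candidate with the lowest error on the last `horizon` points."""
    train, holdout = series[:-horizon], series[-horizon:]
    errors = {
        name: mean_absolute_error(holdout, forecaster(train, horizon))
        for name, forecaster in CANDIDATES.items()
    }
    best = min(errors, key=errors.get)
    return best, errors

# Hypothetical daily inbound volume with a repeating weekly pattern.
weekly_pattern = [100, 120, 130, 125, 140, 80, 60]
inbound_volume = weekly_pattern * 8

best_short, errors_short = select_algorithm(inbound_volume, horizon=7)
best_medium, errors_medium = select_algorithm(inbound_volume, horizon=14)
```

Running the selection separately for each horizon allows a different algorithm to win at short, medium, and long range. The holdout is always the most recent data, so the evaluation respects time order.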