Synthetic Controls: A Smarter Way to Measure Change

A quick-service restaurant corporation wanted to determine accurately how a variety of factors, both controllable and uncontrollable, affected its business. Elder Research developed a flexible, easy-to-use software library for fitting synthetic controls to data when traditional control groups are unavailable. Synthetic controls are a data-driven way to estimate what would have happened without a change, helping businesses measure true impact.

The Challenge

Businesses often need to know the causal effect of an action or an event such as adverse weather, a nearby location closing, or an internal business decision. For example, did introducing a new menu item boost sales or cut into the sales of other items?

The gold standard for this kind of estimation requires defining a randomized control group, a set of locations for which the action will not take place. The ability to randomly assign locations into test and control groups makes it possible to measure the effect of the event on the test group by comparing test locations with their control counterparts.

But what if traditional control groups are unavailable?

Random assignment is not always possible. In real-world scenarios, changes can happen without a control group for comparison, and many outside factors can also affect the results alongside the actions we can control.

Without a control group, we can observe what happens after a given event. But distinguishing between the changes caused by the event itself and the changes caused by external factors is difficult or impossible.

To identify these effects, we effectively need two copies of the location of interest: one that experienced the event and one that did not. A synthetic control stands in for the copy that did not.

The Solution

To meet this challenge, Elder Research applied a technique developed in the field of economics: synthetic controls. Since we did not have an actual control group, we constructed one from locations that did not experience the event, finding the weighted combination of them that best reproduced the location we wanted to study.

This blend of unaffected locations was carefully matched to the affected location for the period before the event occurred. Once the synthetic control was set, we were able to compare the outcome in the test location to our simulated prediction of what would have happened had the event not occurred.

In this way, we quantified how much of a change we saw after an event was due to the event itself.
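As a rough illustration of the idea (not the client package itself), the sketch below fits a synthetic control on made-up data. It assumes the common formulation in which the blend uses non-negative weights that sum to one, and it uses a simple projected gradient descent in NumPy as a stand-in for the open-source solvers the package supported. The function names, data, and effect size are all hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == 1}."""
    u = np.sort(v)[::-1]                       # sorted descending
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]   # last index meeting the condition
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def fit_weights(donors_pre, target_pre, n_iter=10000):
    """Projected gradient descent on ||donors_pre @ w - target_pre||^2."""
    n_donors = donors_pre.shape[1]
    w = np.full(n_donors, 1.0 / n_donors)      # start from an equal blend
    # Step size 1/L, where L bounds the curvature of the squared error.
    step = 1.0 / (2.0 * np.linalg.norm(donors_pre, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * donors_pre.T @ (donors_pre @ w - target_pre)
        w = project_to_simplex(w - step * grad)
    return w

# Made-up weekly sales: 60 weeks, 5 unaffected locations, event at week 40.
rng = np.random.default_rng(0)
n_weeks, n_pre = 60, 40
donors = 100.0 + rng.normal(0.0, 10.0, size=(n_weeks, 5))
target = donors @ np.array([0.5, 0.3, 0.2, 0.0, 0.0]) + rng.normal(0.0, 1.0, n_weeks)
target[n_pre:] += 10.0                         # simulated effect of the event

# Fit on the pre-event weeks only, then project the blend across all weeks.
weights = fit_weights(donors[:n_pre], target[:n_pre])
synthetic = donors @ weights                   # estimate of "no event" sales
lift = target - synthetic
print("weights:", weights.round(3))
print("mean post-event lift:", lift[n_pre:].mean().round(2))
```

Fitting on the pre-event weeks alone is the key design choice: the weights never see post-event data, so any gap that opens after the event cannot be an artifact of the fit.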

Because our client’s analysts preferred to work with the Python programming language, we authored a Python package for deriving and working with these synthetic controls. We knew the client would have a wide range of scenarios that could benefit from synthetic controls, so we designed the package to work with common Python data formats and integrate seamlessly with the data in the client’s ecosystem. Our solution included:

  • support for several open-source solvers used to find the optimal synthetic control solution to match the test location.
  • functions to test the robustness of the fitted synthetic control, to make sure that the results were stable and not overly reliant on any one control location.
  • functions to plot test and control data, including lifts attributable to the test event (see figure below), and to calculate confidence intervals and compute statistical significance.
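The robustness check described in the list above can be sketched as a leave-one-out loop: refit the control with each unaffected location removed in turn and see whether the estimated lift moves. This is only an illustration under stated assumptions. The helper names and data are hypothetical, and unconstrained least squares is used as a deliberately simple stand-in fitter, not the method of the package.

```python
import numpy as np

def fit_ols_weights(donors_pre, target_pre):
    # Stand-in fitter (an assumption for brevity): plain least squares.
    # A real synthetic control usually constrains the weights.
    w, *_ = np.linalg.lstsq(donors_pre, target_pre, rcond=None)
    return w

def leave_one_out_lifts(donors, target, n_pre):
    """Average post-event lift after dropping each control location in turn."""
    lifts = {}
    for j in range(donors.shape[1]):
        kept = np.delete(donors, j, axis=1)            # remove location j
        w = fit_ols_weights(kept[:n_pre], target[:n_pre])
        lifts[j] = float((target[n_pre:] - kept[n_pre:] @ w).mean())
    return lifts

# Made-up data: 6 control locations sharing a seasonal pattern, event at week 40.
rng = np.random.default_rng(1)
n_weeks, n_pre = 60, 40
weeks = np.arange(n_weeks)
common = 100.0 + 5.0 * np.sin(2 * np.pi * weeks / 52)
donors = np.outer(common, np.linspace(0.8, 1.2, 6)) + rng.normal(0.0, 2.0, (n_weeks, 6))
target = donors.mean(axis=1) + rng.normal(0.0, 1.0, n_weeks)
target[n_pre:] += 8.0                                  # simulated effect of the event

loo = leave_one_out_lifts(donors, target, n_pre)
for j, value in loo.items():
    print(f"without location {j}: mean lift = {value:.2f}")
```

If the estimates stay close to one another, the result does not hinge on any single control location; a large swing when one location is dropped would be a reason to look more closely at that location.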

Synthetic controls figure for case study

On the left, the red line tracks one location’s weekly sales, and we observe a sales boost beginning at the gray vertical line—which marks the closure of a nearby location. The corresponding blue line is this location’s control: a combination of locations outside of the market, chosen to closely match the target location in the shaded pre-closure period.

By comparing the actual series to the control, the package helps to reveal sales lift, both weekly (center) and cumulatively (right). The yellow-shaded region indicates the range of lifts to be expected 90 percent of the time in the absence of a true effect. The size of the measured lift relative to this range provides assurance that the effect is real.
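The weekly and cumulative lift described above, together with a 90 percent no-effect range, can be sketched as follows. The case study does not say how the package computed its shaded region, so the approach here is an assumption: it resamples the pre-event gaps between the actual and control series to imitate post-event weeks with no true effect, which treats those weekly gaps as independent. The function name and data are hypothetical.

```python
import numpy as np

def lift_with_null_band(actual, synthetic, n_pre, n_sims=2000, level=0.90, seed=0):
    """Weekly and cumulative post-event lift with a resampled no-effect range."""
    rng = np.random.default_rng(seed)
    gap = actual - synthetic
    pre_gap, weekly = gap[:n_pre], gap[n_pre:]
    cumulative = np.cumsum(weekly)
    # Resample pre-event gaps to imitate post-event weeks with no true effect.
    sims = rng.choice(pre_gap, size=(n_sims, len(weekly)), replace=True)
    tail = (1.0 - level) / 2.0
    q = [100 * tail, 100 * (1 - tail)]
    weekly_band = np.percentile(sims, q)                        # (low, high)
    cum_band = np.percentile(np.cumsum(sims, axis=1), q, axis=0)  # shape (2, n_post)
    return weekly, cumulative, weekly_band, cum_band

# Made-up series: the control tracks the location until the event at week 40.
rng = np.random.default_rng(2)
n_weeks, n_pre = 60, 40
weeks = np.arange(n_weeks)
synthetic = 100.0 + 5.0 * np.sin(2 * np.pi * weeks / 52)
actual = synthetic + rng.normal(0.0, 2.0, n_weeks)
actual[n_pre:] += 6.0                                  # simulated effect of the event

weekly, cumulative, weekly_band, cum_band = lift_with_null_band(actual, synthetic, n_pre)
print("weekly no-effect range:", weekly_band.round(2))
print("final cumulative lift:", cumulative[-1].round(1))
print("final cumulative no-effect range:", cum_band[:, -1].round(1))
```

A cumulative lift that ends well outside its no-effect range is the kind of evidence the right-hand panel of the figure conveys.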

The Results

We applied the synthetic control package to several test cases, including restaurant closings, extreme weather events, and the boost to or cannibalization of existing menu items when new items are introduced. In each of these cases, we were able to simulate what would have happened if the event had not occurred and calculate confidence intervals for the effects of the event.

For example, in the case of new menu items, our client was able to determine whether the sales benefit of introducing new menu items was offset by losses in sales for other items. Because the synthetic control groups were matched in time to the test groups, we could isolate the effects of the events of interest from other external factors that might also affect sales. This allowed the client to judge the dollar impacts of a wide range of events, both those controlled by the client and those that were not.

Further, because we created the package in collaboration with the client and wrote documentation materials to make it accessible, the client was able to continue using it following our engagement and to introduce it to new teams across their organization. Thus, rather than a one-off analysis, we provided the client with a versatile tool that could be reapplied to many current and future situations, such as sporting events, promotional campaigns, road construction, new location openings or nearby competitors.

By making an easy-to-use software solution for synthetic control, we helped our client’s analysts shorten the time required to go from the observation of an uncontrolled business event to actionable results with an understanding of real dollar impacts.