Don't you need multiple years of data to understand seasonality?

The short answer is: no, you don’t need multiple years of data because in an experimental context we don’t need to explicitly control for seasonality. This is because the control group in our experiment will, presumably, experience seasonality in the same way as the treatment group. So when we are comparing the treatment group to the control group, the impacts of seasonality (and many other things!) are already “controlled for” just by the nature of controlled experimentation.

Given the importance of controlling for seasonality with multiple years of data in the context of a media mix model, it’s easy to understand why some marketing analysts assume that you need to explicitly control for seasonality in a GeoLift experiment. However, due to the magic of active intervention via a structured experiment and a light “parallel trends” assumption, we should be covered in the context of GeoLift.

To start, let’s imagine that we’re working in the context of a clinical trial testing a medication that treats the symptoms of the flu. We structure a standard randomized controlled trial to run in October, recruiting 5,000 individuals with flu symptoms. Half will receive a placebo and the other half will receive our new medication. At the end of the trial we’ll compare time-to-resolution between our treated subjects and our placebo subjects.

So the question is: do we need to “control for” seasonality? We ran this experiment during flu season, so do we need to control for that?

The answer is: no! Because of how we structured the experiment, we believe that seasonality should be equally impacting our test and control groups so we don’t need to “control” for seasonality at all.

The situation is similar in the context of a GeoLift experiment. The idea behind the synthetic control method is that we’re creating a modeled control group that, at least in theory, should be similar to what we’d get if we did a large-scale randomized experiment like in the clinical trial example above. So the theory is similar: we believe that our control geographies should experience seasonality the same way that our treatment geographies do, and so when we measure the “lift” of treatment above control, seasonality is already accounted for (to the extent that it’s impacting both the treatment and control geographies during the experimental period).
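To make the cancellation concrete, here is a minimal simulation sketch. It is not GeoLift code: the function name `simulate_lift`, the baseline of 100, the weekly sine-wave "season", and the true lift of 5 are all assumptions made up for illustration. The point it shows is that a seasonal swing shared by treatment and control drops out of the treatment-minus-control difference.

```python
import math
import random


def simulate_lift(seasonal_amplitude, true_lift=5.0, days=28, seed=0):
    """Return the estimated lift: mean of (treatment - control) per day.

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    diffs = []
    for day in range(days):
        # Shared seasonal swing: hits both groups identically on a given day.
        season = seasonal_amplitude * math.sin(2 * math.pi * day / 7)
        control = 100.0 + season + rng.gauss(0, 1)
        treatment = 100.0 + season + true_lift + rng.gauss(0, 1)
        diffs.append(treatment - control)
    return sum(diffs) / len(diffs)


# Same random noise (same seed), with and without a large seasonal swing.
flat = simulate_lift(seasonal_amplitude=0.0)
seasonal = simulate_lift(seasonal_amplitude=40.0)
print(f"no seasonality:     {flat:.3f}")
print(f"strong seasonality: {seasonal:.3f}")
```

The two estimates agree up to floating-point rounding, because the seasonal term is added to both groups and subtracts out. The cancellation only holds to the extent that seasonality really does hit both groups the same way, which is the “parallel trends” assumption mentioned above.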

There is one “watchout” that is particularly important in the context of synthetic controls, where we aren’t doing true prospective randomization and we’re often working with relatively small sample sizes:

Highly seasonal businesses might have higher variance during a holiday season (e.g., when the total volume of sales spikes dramatically around Black Friday). This can cause problems in the design stage because you could design an experiment that’s well-powered for business-as-usual periods but that is not well-powered during the high-variance holiday season.

This means that it can be useful to use last year’s holiday sales when you’re designing your analysis. It can also be helpful to include a full year’s worth of data when doing the analysis, so that the synthetic control method can attempt to find control markets that match the treatment markets during prior holiday seasons as well as more recently.
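As a rough illustration of why holiday-season variance matters at the design stage, the sketch below uses the textbook two-sample normal approximation for the minimum detectable effect. This is a simplification and not GeoLift’s own power analysis; the function name `minimum_detectable_effect` and the standard deviations of 10 and 30 are hypothetical values chosen for the example.

```python
from statistics import NormalDist


def minimum_detectable_effect(sd, n_per_group, alpha=0.05, power=0.8):
    """Approximate smallest true difference in means detectable at the given
    two-sided significance level and power (equal-variance normal model)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sd * (2 / n_per_group) ** 0.5


# Hypothetical noise levels: business-as-usual vs. a high-variance holiday period.
usual_mde = minimum_detectable_effect(sd=10.0, n_per_group=20)
holiday_mde = minimum_detectable_effect(sd=30.0, n_per_group=20)
print(f"business-as-usual MDE: {usual_mde:.2f}")
print(f"holiday-season MDE:    {holiday_mde:.2f}")
```

Under this approximation the detectable effect scales linearly with the standard deviation, so tripling the noise triples the lift you would need in order to detect it with the same design. A design sized on business-as-usual data can therefore be underpowered if it actually runs over the holidays.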
