From Pre-Post to Difference-in-Differences: Enhancing Your Evaluation Strategy
When evaluating the impact of a program or intervention on your clients, one of the most common methods is the pre-post test design. This approach involves collecting data before and after the intervention to measure any changes. It’s a straightforward approach, but it cannot answer an important question about the change: why it happened or even what contributed to it.
You could run a hypothesis test on the change score and conclude, ‘there is a statistically significant difference between time A and time B.’ However, you can’t attribute that change to the program itself. So how can we improve our evaluations to better understand the true effect of our programs?
Difference-in-Differences provides one accessible approach.
What is Difference-in-Differences (DiD)?
Difference-in-Differences addresses many of the challenges associated with pre-post test designs by introducing a comparison group that does not receive the intervention. This comparison group helps us account for external factors that could influence the outcome over time, independent of the intervention. Here’s how DiD works:
First, we measure the same outcomes for both the treatment group (which receives the intervention) and the control group (which doesn’t) before the intervention begins (A and B in the graph above). Second, during our evaluation period, we aim to prevent any spillover effect by keeping the groups separate from one another. We also assume that, without the intervention, both groups would follow the same trend on the outcomes over time. Third, once our evaluation period concludes, we measure the same outcomes for both groups again (C and D in the graph above) and calculate each group’s change over time (C-A and D-B). The difference between those two changes, (C-A) minus (D-B), is the estimated effect of the intervention. By accounting for external factors in this way, DiD gives us a clearer understanding of how much of the observed change can be attributed to the intervention.
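To make the arithmetic concrete, here is a minimal sketch in Python using hypothetical group means labeled to match the graph. The numbers (and the assumption that both groups start at the same level) are purely illustrative, not figures from any real program.

```python
# Hypothetical group means, labeled to match the graph:
# A/B are pre-intervention means, C/D are post-intervention means.
A = 40.0  # treatment group, pre-intervention (hypothetical)
B = 40.0  # control group, pre-intervention (hypothetical)
C = 60.0  # treatment group, post-intervention (hypothetical)
D = 45.0  # control group, post-intervention (hypothetical)

treatment_change = C - A  # change observed in the treatment group
control_change = D - B    # change the treatment group would likely have seen
                          # anyway, under the parallel-trends assumption

did_estimate = treatment_change - control_change
print(did_estimate)  # 15.0 -> the change attributable to the intervention
```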
Example: Evaluating a Foster Care Support Program
Let’s imagine you’re evaluating two versions of a foster care support program. The newer program offers additional support but comes at a higher cost compared to the original program. You want to know if the increased cost is justified by better outcomes.
Step 1: Pre-Test
At the start of the program, you measure key outcomes for both groups: those enrolled in the newer program and those in the original program. These are your pre-intervention measurements (A & B in the graph above).
Step 2: Control for Spillover
Throughout the study, you ensure that there is no ‘spillover effect’ between the groups, meaning the clients in each group don’t influence each other.
Step 3: Post-Test
After the intervention period, you measure the same outcomes again for both groups (C & D in the graph above).
Step 4: Calculate the Impact
To calculate the impact of the new program, you compare the change in outcomes for the treatment group (C-A) with the change in outcomes for the control group (D-B). This gives you the difference-in-differences: the effect of the newer program relative to the original program.
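If you have client-level data, a common way to estimate the same quantity, and to get the kind of hypothesis test mentioned earlier, is an ordinary least squares regression with a treatment-by-period interaction. The sketch below uses simulated data purely for illustration; the column names, sample sizes, and effect sizes are assumptions, not results from any real program.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200  # hypothetical number of clients per group

# Simulate client-level outcomes in long format: one row per client per period.
# Both groups drift upward slightly over time; the newer program adds an
# extra 15-point improvement after the intervention (assumed for illustration).
rows = []
for treated in (0, 1):          # 1 = newer program, 0 = original program
    for post in (0, 1):         # 1 = post-intervention, 0 = pre-intervention
        mean = 40 + 5 * post + 15 * treated * post
        rows.append(pd.DataFrame({
            "outcome": mean + rng.normal(0, 10, size=n),
            "treated": treated,
            "post": post,
        }))
df = pd.concat(rows, ignore_index=True)

# "treated * post" expands to treated + post + treated:post.
# The coefficient on the interaction term is the DiD estimate, and it comes
# with a standard error and p-value for a formal hypothesis test.
model = smf.ols("outcome ~ treated * post", data=df).fit()
print(model.params["treated:post"])  # DiD point estimate, close to 15
```

The interaction coefficient plays the same role as (C-A) minus (D-B) in the group-means calculation; the regression simply lets you quantify the uncertainty around it.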
Why Difference-in-Differences improves your claims
Using a pre-post test design without a comparison group, you could say, “The outcome changed by 15% over the time period.” While this gives you some insight, you wouldn’t be able to confidently say that the change was due to your program; it could be due to any number of external factors.
With Difference-in-Differences, however, you could say, “The newer program led to a 15% improvement in [specific outcome] compared to the original program.” By comparing trends between the groups, DiD strengthens your causal claims: you can attribute the change more confidently to the intervention.
When to use Difference-in-Differences
DiD is particularly useful when you want to understand the effect of an intervention but can’t randomly assign participants to treatment and control groups. This is common in many real-world nonprofit settings. By leveraging a comparison group, you can still derive meaningful insights and make stronger causal claims about your program’s effectiveness.
What’s Next?
In our next post, we’ll explore an important question: What if the program we’re interested in evaluating has already started?