Hey everyone!
Welcome to the second part of our exploration of Causal Impact analysis. The first part gave a quick overview of the theory. Now let's roll up our sleeves and apply it in practice.
We will walk through a specific dataset, showing how to run the library and interpret the results.
By the end, you should be able to use the Causal Impact library and draw sound conclusions from its output.
Imagine launching a wide advertising campaign in the UK to promote a new app feature, aiming to increase installations by reaching a larger audience through bloggers. A classic A/B test is awkward here: putting part of that audience in a control group that never sees the advertised feature might create a negative impression.
To address this, the feature is rolled out to all of Region B, while Region A, which gets no brand campaign, serves as the control group.
The task is to compare the two groups and find out whether the brand campaign influenced the growth of installations in Region B.
Step 1: Packages
First, install the CausalImpact package with pip, then import it:
from causal_impact import CausalImpact
Step 2: Dataset Preparation
Load the dataset you want to analyze. Below is an example dataset with installs:
| date | y_installs (Test Group) | x_installs (Control Group) |
|------------|--------------------------|----------------------------|
| 2023-01-01 | ... | ... |
| 2023-01-02 | ... | ... |
| ... | ... | ... |
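To make the expected shape concrete, here is a minimal sketch that builds a frame with the same layout. All values are synthetic and purely illustrative, and `installs.csv` is a hypothetical file name. The sketch assumes the dates are stored as strings in the index, so that they can be compared with a string rollout date later on.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
dates = pd.date_range("2023-01-01", "2023-01-31", freq="D")

# Control group (Region A) and test group (Region B) daily installs.
x_installs = 1000 + rng.normal(0, 20, size=len(dates))
y_installs = 0.8 * x_installs + rng.normal(0, 10, size=len(dates))

# Dates go into the index as strings; the two install series are the columns.
df = pd.DataFrame(
    {"y_installs": y_installs, "x_installs": x_installs},
    index=pd.Index(dates.strftime("%Y-%m-%d"), name="date"),
)

# With real data, something like this would give the same shape
# ("installs.csv" is a hypothetical file name):
# df = pd.read_csv("installs.csv", index_col="date")

print(df.head())
```

Keeping the test series in the first column and the control series in the second matches the order in which the columns are renamed to `y` and `x` in the next step.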
Step 3: Calculations and Graph
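date_infer = '2023-01-18'  # Date of feature rollout
df.columns = ['y', 'x']  # df is indexed by date; y = test group, x = control group
ci = CausalImpact(df, date_infer, n_seasons=7)  # n_seasons=7: weekly seasonality on daily data
result = ci.run(max_iter=1000, return_df=True)
ci.plot()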
Step 4: Reading Graph Results
Causal Impact produces several outputs, but two are especially useful: the graphs above and a summary of impact.
Plots:
Observation vs Predictions:
Difference and Cumulative Impact:
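To see what these two panels represent, it can help to rebuild them by hand on toy data. The sketch below is not what the library does internally: it swaps in a plain least-squares fit of the test series on the control series, and every number in it is hypothetical. The idea is the same, though: learn the relationship before the launch, project it forward as the counterfactual, and look at the gap.

```python
import numpy as np

# Hypothetical daily installs: 10 pre-launch days, then 5 post-launch days.
x = np.array([100, 102, 98, 101, 99, 103, 97, 100, 102, 98,
              101, 99, 100, 102, 98], dtype=float)  # control region
y = 2.0 * x + 10.0                                   # test region tracks control
n_pre = 10
y[n_pre:] += 30.0                                    # injected campaign effect

# Fit y ~ a * x + b using the pre-launch period only.
a, b = np.polyfit(x[:n_pre], y[:n_pre], deg=1)

# Counterfactual: what y would have looked like without the campaign.
pred = a * x + b

# "Difference" panel: observed minus predicted, day by day.
pred_diff = y - pred

# "Cumulative Impact" panel: running total of the difference after launch.
cumulative = np.cumsum(pred_diff[n_pre:])

print(np.round(pred_diff[n_pre:], 2))
print(np.round(cumulative, 2))
```

Because the toy data is noise-free, the fit recovers the injected effect of 30 installs per day, which adds up to 150 over the five post-launch days. Real data is noisy, which is why the library's plots also show uncertainty bands around the prediction.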
Relative Uplift Calculation:
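import numpy as np
index_infer = np.where(np.array(df.index) == date_infer)[0][0]  # position of the rollout date
post_infer_result = result[result.index >= index_infer]  # assumes result has a positional integer index
rel_diff = post_infer_result.pred_diff / post_infer_result.pred  # daily difference relative to the prediction
print('Relative uplift is {}%'.format(np.round(rel_diff.mean() * 100, 1)))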
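This calculation expresses the uplift relative to the predicted baseline: for each day after the rollout, the difference between observed and predicted installs is divided by the prediction, and the daily ratios are then averaged. The result is a single percentage that summarizes how effective the advertising campaign was.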
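A small worked example shows how this average behaves. The numbers below are hypothetical; only the column names `pred` and `pred_diff` are taken from the snippet above. The sketch also computes a second, slightly different summary, total difference over total prediction, because the two can disagree when daily volumes vary.

```python
import pandas as pd

# Hypothetical post-launch rows; column names mirror those used above.
post = pd.DataFrame({
    "pred": [200.0, 250.0, 400.0],     # predicted installs without the campaign
    "pred_diff": [20.0, 50.0, 40.0],   # difference between observed and predicted
})

# Mean of daily ratios (what the snippet above computes).
mean_of_ratios = (post["pred_diff"] / post["pred"]).mean()

# Ratio of totals: total extra installs over total predicted installs.
ratio_of_totals = post["pred_diff"].sum() / post["pred"].sum()

print(f"Mean of daily ratios: {mean_of_ratios * 100:.1f}%")
print(f"Ratio of totals: {ratio_of_totals * 100:.1f}%")
```

Here the daily ratios are 10%, 20% and 10%, so their mean is about 13.3%, while the ratio of totals is 110 / 850, about 12.9%. The mean of ratios weights every day equally; the ratio of totals weights high-volume days more. Which one to report depends on the question being asked.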
By comparing the test and control groups, we gained insights into whether the brand campaign influenced the growth of installations in the region.
The visualizations, especially the Observation vs Predictions and Difference and Cumulative Impact plots, show how the intervention's effect develops over time.
Armed with this hands-on experience, you are now equipped to leverage Causal Impact analysis in real-world scenarios, making informed decisions based on statistical significance and relative uplift. Cheers to unraveling insights and driving data-driven strategies!