Add Comparative Interrupted Time Series documentation with geo-experimentation example #595
Co-authored-by: drbenvincent <[email protected]>
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage Diff
Metric     main     #595
Coverage   93.21%   93.21%
Files      35       35
Lines      5511     5511
Branches   358      358
Hits       5137     5137
Misses     246      246
Partials   128      128
drbenvincent
left a comment
We already have an entry in the glossary for Comparative interrupted time-series. But we should add a sphinx reference back to the glossary in the first mentions of the term in the notebook. We should also slightly expand that glossary description with a bit more detail.
Under "Key characteristics of CITS" add that additional predictors (like temperature or time of year) can also be added
None of the outputs in the notebook are displaying in the rendered docs html. Do you need me to manually execute the notebook? I can push the executed notebook if so.
- Expanded glossary entry for CITS with implementation details
- Added note about additional predictors (temperature, time of year)
- Replaced markdown links with proper Sphinx citations (:cite:)
- Added glossary term references using {term} syntax
- Executed notebook to display outputs in rendered docs
Co-authored-by: drbenvincent <[email protected]>
All addressed in commit a5f944c:
drbenvincent
left a comment
The pre-commit checks are failing. Running the pre-commit hooks should auto-resolve some or all of the issues.
Referring back to Juan's blog post, there is no campaign_end. We should remove that variable, and from the plot.
And the start date of the campaign is wrong. In the original blog post it is listed as start_campaign = datetime(2022, 7, 1). Check correctness of dates in the text.
Remove the line az.style.use("arviz-doc")
- Updated campaign start date to July 1, 2022 (was incorrectly May 2)
- Updated text to reflect July 1-31, 2022 campaign period
- Removed campaign_end variable (not in original blog post)
- Removed campaign end line from visualization
- Removed az.style.use("arviz-doc") line
- Fixed formatting issues (pre-commit checks now pass)
Co-authored-by: drbenvincent <[email protected]>
All addressed in commit 3ed8d81:
The notebook now correctly matches Juan's blog post with a July 1-31, 2022 campaign period.
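The corrected campaign window can be sketched as below. Only `start_campaign = datetime(2022, 7, 1)` comes from the review; the observation window (`first_day`, `last_day`) and the helper `is_post_treatment` are hypothetical names and values chosen for illustration, and the notebook's actual data range may differ.

```python
from datetime import datetime, timedelta

# From the review: the campaign starts on July 1, 2022.
start_campaign = datetime(2022, 7, 1)

# Assumed observation window (hypothetical, not taken from the notebook).
first_day = datetime(2022, 4, 1)
last_day = datetime(2022, 7, 31)

# One entry per calendar day in the assumed window.
n_days = (last_day - first_day).days + 1
days = [first_day + timedelta(days=i) for i in range(n_days)]


def is_post_treatment(day: datetime) -> bool:
    """Flag observations on or after the campaign start."""
    return day >= start_campaign


pre = [d for d in days if not is_post_treatment(d)]
post = [d for d in days if is_post_treatment(d)]
print(f"pre-treatment days: {len(pre)}, post-treatment days: {len(post)}")
```

With no `campaign_end` variable, everything from `start_campaign` onward is treated as the post-treatment period.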
drbenvincent
left a comment
Change the section title from "Visualize Pre-Treatment Trends" to "Visualise data"
In that section, add a new scatter plot of just pre-treatment data. Control on the x axis, treatment on the y-axis. Add some explanation to point out the correlation of the two.
Please use the following text to update the end of the notebook, probably the "when to use CITS" section
Differences (why CITS is often preferred when feasible)
- Control for concurrent events
  • ITS: vulnerable if something else changes at the same time (e.g., a new guideline, coding change, economic shift).
  • CITS: the control series helps account for those shared shocks, improving internal validity.
- Core identifying assumption
  • ITS: assumes that, absent the intervention, the pre-intervention trend would have continued (and no other intervention affects the series at that time).
  • CITS: assumes that, absent the intervention, the treated and control would have followed similar trajectories (often framed as a “parallel trends” idea, but allowing rich pre-trends and seasonality).
- Data requirements
  • ITS: one series is enough.
  • CITS: needs a credible control series (and must check for spillovers/contamination).
- Failure modes
  • CITS can be biased if the control is poorly chosen (different underlying drivers), if the intervention indirectly affects the control (spillover), or if other changes affect treated and control differently.
⸻
When to use which
• Use ITS when a good control group/series does not exist, but you have many pre- and post-intervention observations and can argue no major coincident changes.
• Use CITS when you can identify a credible comparison series and want stronger protection against coincident events and secular trends.
- Changed section title from "Visualize Pre-Treatment Trends" to "Visualise data"
- Added scatter plot showing correlation between treatment and control in pre-treatment period
- Added explanation text highlighting the correlation between units
- Completely rewrote "When to Use CITS" section with detailed comparisons:
  - Differences between ITS and CITS (concurrent events, assumptions, data requirements, failure modes)
  - Clear guidance on when to use ITS vs CITS vs Synthetic Control
- Improved structure and clarity of key takeaways
Co-authored-by: drbenvincent <[email protected]>
All addressed in commit a27ec36:
The scatter plot shows strong correlation (r ≈ 0.976) between treatment and control in the pre-treatment period, supporting the use of CITS.
drbenvincent
left a comment
The original blog post used a Student T likelihood. We should also do that, and we can do that through the use of custom priors with the Prior class. There is probably an example in one of the other notebooks.
Change the notebook title to "Comparative Interrupted Time Series: a geo-experimentation example"
- Changed title to "Comparative Interrupted Time Series: a geo-experimentation example"
- Added import for Prior from pymc_extras.prior
- Updated LinearRegression model to use Student T likelihood (nu=4) matching original blog post
- Model now uses robust Student T distribution instead of Normal for observations
- Pre-commit checks pass, notebook executes successfully
Co-authored-by: drbenvincent <[email protected]>
Both addressed in commit 41346bf:
The model now matches the approach from Juan's original blog post with Student T likelihood for more robust inference.
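Why a Student T likelihood with nu=4 is considered more robust than a Normal one can be seen by comparing log densities at increasingly extreme residuals. The sketch below uses only the standard library and the textbook density formulas; it is not the notebook's model code, and the function names are made up for illustration.

```python
import math


def normal_logpdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Log density of a Normal(mu, sigma) distribution."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)


def student_t_logpdf(x: float, nu: float = 4.0, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Log density of a location-scale Student T with nu degrees of freedom."""
    z = (x - mu) / sigma
    return (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1) / 2 * math.log1p(z * z / nu)
    )


# Residuals measured in units of sigma: typical, moderate, and outlying.
for resid in (0.0, 2.0, 5.0):
    print(
        f"residual={resid:.1f}  "
        f"normal={normal_logpdf(resid):.3f}  "
        f"student_t(nu=4)={student_t_logpdf(resid):.3f}"
    )
```

The Normal log density falls off quadratically in the residual, whereas the Student T falls off logarithmically, so a single outlying observation pulls the fitted regression far less under the Student T likelihood.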
Bear in mind I've pushed some commits to update and re-run the notebook.
bugbot run
PR Summary
Add multiple CITS/geo-experimentation and ITS-related citations to
Written by Cursor Bugbot for commit 188e814.
✅ Bugbot reviewed your changes and found no bugs!
Streamlined and clarified markdown explanations in the CITS notebook for improved readability and conciseness. Combined and reworded sections to better explain comparative interrupted time series, its implementation, assumptions, and practical guidance, while removing redundant bullet points and enhancing the overall narrative flow.
Plan: Comparative Interrupted Time Series Documentation
All tasks completed successfully!