jwdid: Flexible Estimation of Staggered DID Designs
February 5, 2025
In this framework, the TE for \(i\) is defined as:
\[\theta_i = y_{i,1}(1) - y_{i,1}(0)\]
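This unit-level effect may not be useful on its own, since only one of the two potential outcomes is ever observed for a given unit. Instead, we focus on something different: the “average treatment effect” on the treated (ATET or ATT):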
\[\begin{aligned} ATT &= E[\theta_i | D_i = 1] = E[y_{i,1}(1) - y_{i,1}(0) | D_i = 1] \\ &=E_1[\theta_i] = E_1[y_{i,1}(1)] - E_1[ y_{i,1}(0) ] \end{aligned} \]
But this is still not identified, because \(E_1[y_{i,1}(0)]\) is not observed.
How do we calculate \(E(y_{i,1}(0)|D=1)\)?
First, decompose it, where \(\lambda_i = y_{i,1}(0) - y_{i,0}(0)\) is the change in the untreated outcome: \[E_1(y_{i,1}(0))=E_1(y_{i,0}(0)) + E_1(\lambda_i)\]
Under no anticipation: \(E_1(y_{i,0}(0))=E_1(y_{i,0}(1))=E_1(y_{i,0})\)
Under parallel trends: \(E_1(\lambda_i)=E_0(\lambda_i)=E_0(y_{i,1}-y_{i,0})\)
Thus, putting it all together:
\[\begin{aligned} ATT &= E_1[y_{i,1}] - [ E_1(y_{i,0}) + (E_0[y_{i,1}]-E_0[y_{i,0}]) ] \\ &= [E_1[y_{i,1}] - E_1(y_{i,0})] - [ E_0[y_{i,1}]-E_0[y_{i,0}] ] \end{aligned} \]
Setting aside the estimation of standard errors, the ATT can be estimated by simply comparing average outcomes for the treated and control groups before and after treatment,
or by using the following regression:
\[y_{it} = \alpha + \beta D_i + \gamma t + \theta (D_i \times t) + \epsilon_{it} \]
Where \(\theta\) is the ATT.
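As a small numerical sketch of the two routes above (comparison of means and the saturated regression), the following uses made-up toy data. It is written in Python rather than Stata purely to show the arithmetic; the data values and function names are illustrative assumptions, not part of the original slides.

```python
from statistics import mean

# Hypothetical toy data: (D_i, t, y_it). All values are made up.
rows = [
    (1, 0, 10.0), (1, 0, 12.0), (1, 1, 17.0), (1, 1, 19.0),
    (0, 0, 8.0), (0, 0, 10.0), (0, 1, 11.0), (0, 1, 13.0),
]

def cell_mean(d, t):
    """Average outcome for group d (1 = treated) in period t (1 = post)."""
    return mean(y for di, ti, y in rows if di == d and ti == t)

# Route 1: difference of before/after changes across the two groups.
treated_change = cell_mean(1, 1) - cell_mean(1, 0)
control_change = cell_mean(0, 1) - cell_mean(0, 0)
att = treated_change - control_change

# Route 2: in the saturated 2x2 regression, OLS coefficients are
# functions of the same four cell means.
alpha = cell_mean(0, 0)                      # control, pre
beta = cell_mean(1, 0) - cell_mean(0, 0)     # pre-period gap
gamma = control_change                       # common trend
theta = cell_mean(1, 1) - alpha - beta - gamma

print(att, theta)
```

Both routes return the same number because the regression with \(D_i\), \(t\), and \(D_i \times t\) is saturated in the four group-by-period cells.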
Answering some of these tough questions!
\[y_{it} = \gamma t + \theta (D_i \times t) + \beta_i + \epsilon_{it} \]
\[y_{it} = \theta (W_{it}) + \beta_i + \gamma_t + \epsilon_{it} \]
\[y_{it} = \theta (W_{it}) + \beta_i + \gamma_t + \delta X_i + \epsilon_{it}\]
\[\begin{aligned} \text{PTA}&: E_1(y_{i,1}-y_{i,0})=E_0(y_{i,1}-y_{i,0}) \\ \text{CPTA}&: E_1(y_{i,1}-y_{i,0}|X)=E_0(y_{i,1}-y_{i,0}|X) \end{aligned} \]
\[ATT(X) = E_1[y_{i,1}|X] - E_1[y_{i,0}|X] - [E_0[y_{i,1}|X] - E_0[y_{i,0}|X]] \]
\[\begin{aligned} y_{it} &= \alpha &&+ \beta D &&+ \gamma t &&+ \theta (D \times t) \\ &+ \lambda X &&+ \lambda_D X \times D &&+ \lambda_T X \times t &&+ \lambda_{DT} \color{blue}{\tilde X} (D \times t) \\ &+ \epsilon_{it} \end{aligned} \]
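A short sketch of why the demeaned covariate (in blue) matters, under the assumption that \(\tilde X = X - E_1[X]\), i.e. \(X\) centered at its mean among the treated (the slide does not spell this definition out). In the interacted model, the effect of treatment at a given \(X\) is the coefficient on \(D \times t\) plus its interaction with \(\tilde X\), and averaging over the treated removes the second term:

\[\begin{aligned} ATT(X) &= \theta + \lambda_{DT} \tilde X \\ E_1[ATT(X)] &= \theta + \lambda_{DT} E_1[\tilde X] = \theta + \lambda_{DT} (E_1[X] - E_1[X]) = \theta \end{aligned} \]

If \(X\) were used without demeaning, the same average would be \(\theta + \lambda_{DT} E_1[X]\), so \(\theta\) alone would no longer be the ATT.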
csdid[2] forces you to use time-fixed characteristics with panel data, but not for repeated cross-section (RC) data.
jwdid does not impose this restriction.
\[y_{it} = \beta_i + \gamma_t + \theta W_{it} + \epsilon_{it}\]
Related Stata commands: did2s and did_imputation, csdid[2], jwdid.
Wrong:
\[y_{it} = \beta_i + \lambda_t + \theta (W_{it}) + \epsilon_{it}\]
Right:
\[y_{it} = \beta_i + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) + \epsilon_{it}\]
\[y_{it} = \beta_i + \lambda_t + \sum_{g=g_0}^G \sum_{t=t_0}^{t=g-2} \theta_{gt} \mathbb{1}(g, t) + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) + \epsilon_{it}\]
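To make the double sums concrete, here is a minimal sketch that lists which cohort-by-period cells \((g, t)\) receive their own \(\theta_{gt}\). It is written in Python only to illustrate the indexing; the cohorts, periods, and function names are hypothetical, and this is not how jwdid builds its regressors internally.

```python
def treatment_cells(cohorts, periods, include_pre=False):
    """List the (g, t) cells that get their own theta_gt.

    Post-treatment cells satisfy t >= g. With include_pre=True, cells with
    t <= g - 2 are added, so g - 1 is left out as the reference period.
    """
    cells = []
    for g in cohorts:
        for t in periods:
            if t >= g or (include_pre and t <= g - 2):
                cells.append((g, t))
    return cells

def indicator(unit_cohort, t, cell):
    """1(g, t): equals 1 if the unit's cohort and the period match the cell."""
    return int((unit_cohort, t) == cell)

cohorts = [3, 4]          # hypothetical first-treatment periods
periods = range(1, 6)     # hypothetical calendar periods 1..5

post_cells = treatment_cells(cohorts, periods)
all_cells = treatment_cells(cohorts, periods, include_pre=True)
print(post_cells)
print(all_cells)
```

With two cohorts and five periods, the post-treatment specification has five \(\theta_{gt}\) terms, and adding the pre-treatment cells raises this to eight; never-treated units match none of the cells.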
csdid[2] without controls and with a balanced panel
\[y_{it} = \beta_g + \lambda_t + \color{red}{\sum_{g=g_0}^G \sum_{t=0}^{t=g-2} \theta_{gt} \mathbb{1}(g, t)} + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) + \epsilon_{it}\]
jwdid may produce too many ATT(G,T) estimates to analyze \(\rightarrow\) aggregate!
Simple (estat simple): \(ATT = \frac{\sum_g \sum_t \theta_{gt} \omega(g, t) \mathbb{1}(t\geq g)}{\sum_g \sum_t \omega(g, t) \mathbb{1}(t\geq g)}\)
Group (estat group): \(ATT(g) = \frac{\sum_{t} \theta_{gt} \omega(g, t)\mathbb{1}(t\geq g)}{\sum_{t} \omega(g, t)\mathbb{1}(t\geq g)}\)
Time (estat calendar): \(ATT(t) = \frac{\sum_{g} \theta_{gt} \omega(g, t)\mathbb{1}(t\geq g)}{\sum_{g} \omega(g, t)\mathbb{1}(t\geq g)}\)
Event (estat event): \(ATT(e) =\frac{\sum_g \sum_t \theta_{gt} \omega(g, t) \mathbb{1}(t-g=e) }{\sum_g \sum_t \omega(g, t)\mathbb{1}(t-g=e)}\)
where \(\omega(g, t)\) is the weight (the total number of units in group \(g\) observed at time \(t\)).
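The four aggregations share one pattern: a weighted average of \(\theta_{gt}\) over a subset of cells. The sketch below reproduces that arithmetic in Python with made-up estimates and cell sizes; the dictionaries and the `aggregate` helper are hypothetical, and it illustrates the formulas rather than the estat implementation.

```python
from collections import defaultdict

# Hypothetical ATT(g,t) estimates and cell sizes, keyed by (cohort g, period t).
theta = {(2, 2): 1.0, (2, 3): 2.0, (2, 4): 3.0, (3, 3): 2.0, (3, 4): 4.0}
omega = {(2, 2): 10, (2, 3): 10, (2, 4): 10, (3, 3): 30, (3, 4): 30}

def aggregate(key, post_only=True):
    """Weighted average of theta[g, t], grouped by key(g, t).

    post_only=True applies the 1(t >= g) restriction used by the simple,
    group, and calendar formulas; the event formula groups by t - g instead.
    """
    num, den = defaultdict(float), defaultdict(float)
    for (g, t), th in theta.items():
        if post_only and t < g:
            continue
        k = key(g, t)
        num[k] += th * omega[(g, t)]
        den[k] += omega[(g, t)]
    return {k: num[k] / den[k] for k in num}

simple = aggregate(lambda g, t: "all")["all"]
by_group = aggregate(lambda g, t: g)
by_time = aggregate(lambda g, t: t)
by_event = aggregate(lambda g, t: t - g, post_only=False)

print(simple, by_group, by_time, by_event)
```

Because the larger cohort carries more weight, the simple ATT here sits closer to that cohort's average than an unweighted mean of the five cells would.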
Allowing for heterogeneity in \(X\) is simple: consider a flexible model with interactions! \[\begin{aligned} y_{it} &= \color{red}{\beta_0} &&+ \beta_i &&+ \lambda_t &&+ \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) &&+ \\ &\ \ \ \ \delta X_{it} &&+ \sum_g \delta_g X_{it} \mathbb{1}(g) &&+ \sum_t \delta_t X_{it} \mathbb{1}(t) &&+ \sum_{g=g_0}^G \sum_{t=g}^{t=T} \delta_{gt} \color{blue}{X_{it}} \mathbb{1}(g, t) &&+\\ &\ \ \ \ \epsilon_{it} \end{aligned} \]
Considerations:
In Stata, jwdid can also estimate both types of models:
* Demeaning Data (Θ is ATT(G,T))
jwdid y, tvar(tvar) ivar(ivar) gvar(gvar) [never]
* Using X as is (Θ is not ATT(G,T))
* May be faster
jwdid y, tvar(tvar) ivar(ivar) gvar(gvar) [never] xasis
xasis?
\[\begin{aligned} \theta_{gt} &= E[\hat y_{i,t}(X_{it},\mathbb{1}(g,t)=1) - \hat y_{i,t}(X_{it},\mathbb{1}(g,t)=0)|g,t] \end{aligned} \]
There are a few changes to consider:
\[\begin{aligned} y_{it}^* &= \beta_g + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) \\ E(y_{it}) &= G(y_{it}^*) \end{aligned} \]
Thus, we need a model appropriate for \(G()\) (logit, poisson, tobit, etc.).
jwdid can estimate these types of models simply by using method().
jwdid will use cohort FE instead of individual FE.
ppmlhdfe (so far) would still add individual fixed effects.
cre option.
jwdid
Note that \(\theta_{gt}\) is not the ATT(G,T) on the outcome, but on the latent variable.
However, one can request aggregations on the latent or outcome scale afterwards.
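To illustrate the gap between the latent-scale coefficient and the outcome-scale effect, the sketch below assumes an exponential mean function \(G(\cdot) = \exp(\cdot)\), as in a Poisson model, and averages the difference in predictions with and without the treatment indicator. The index values, the coefficient, and the helper name are made up for illustration; this is Python arithmetic, not jwdid output.

```python
import math

# Hypothetical latent index without treatment for three treated observations
# in one (g, t) cell, and a hypothetical latent-scale coefficient theta_gt.
xb_untreated = [0.0, 0.5, 1.0]
theta_gt = 0.2

def outcome_att(xb, theta, G=math.exp):
    """Average of G(xb + theta) - G(xb) over the treated observations."""
    return sum(G(v + theta) - G(v) for v in xb) / len(xb)

att_outcome = outcome_att(xb_untreated, theta_gt)
print(theta_gt, att_outcome)
```

With a nonlinear \(G()\), the outcome-scale effect depends on where each observation sits on the index, so it generally differs from \(\theta_{gt}\) itself.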
jwdid already incorporates this by using flexible specifications, but produces average ATTs.
With margins, it is possible to estimate ATTs for different discrete subgroups (not continuous):
** setup
jwdid y i.x1 x2, tvar(tvar) ivar(ivar) gvar(gvar) never
** ATTs for specific groups
estat simple // average ATT
estat simple, over(x1) // ATT estimated for each group of x1
// For observations where x2 is between 0 and 1
estat [simple|calendar|group|event], ores( x2>0 & x2<1 )
// For observations where x1 is 0
estat [simple|calendar|group|event], ores( x1==0 )
The jwdid framework can be adapted to these scenarios, albeit with limitations.
It requires a variable, trtvar, that defines treatment intensity.
** setup
jwdid y i.x1 x2, tvar(tvar) ivar(ivar) gvar(gvar) trtvar(trtvar) [never]
** ATT aggregation
*** Estimates Treatment effect, assuming Full Intensity (T=1)
estat [simple|group|calendar|event]
*** Estimates Treatment effect, assuming intensity as observed
estat [simple|group|calendar|event] , asis
ores() or over() can be used to estimate ATTs for different subgroups.
\[\begin{aligned} y_{it} &= \beta_g + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \sum_h \theta_{gth} \mathbb{1}(g, t, h) + \epsilon_{it} \end{aligned} \]
Estimation with jwdid is very simple:
** setup
jwdid y i.x1 x2, tvar(tvar) ivar(ivar) gvar(gvar) xattvar( trt_l trt_h) [never]
** Assume trt_m is the base treatment. trt_l and trt_h are potential treatments.
** The base-line treatment will be dropped.
estat [simple|group|calendar|event]
Each of these aggregations provides a single ATT.
With over() or ores(), one could estimate ATT heterogeneity.
An advantage of jwdid is that we do not need to be concerned with the estimation of standard errors.
Add vce(unconditional) to aggregation commands:
estat [simple|group|calendar|event], vce(unconditional)
reghdfe or ppmlhdfe as the estimation method.
regress cre or poisson cre instead (Stata does this).
jwdid has other “advanced” options that could further help model specification.
fevar(): Allows introducing FE other than panel (only with reghdfe or ppmlhdfe)
\[\begin{aligned} y_{it} &= \beta_g + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) + \omega_j+ \epsilon_{it} \end{aligned} \]
exovar(): Variables that are not interacted with the treatment cohort \(G\), with time \(T\), or with both.
\[\begin{aligned} y_{it} &= \beta_g + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) + \phi X_{it} + \epsilon_{it} \end{aligned} \]
xtvar() and xgvar(): variables that will only interact with the time or group fixed effects. \[\begin{aligned}
y_{it} &= \beta_g + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) +
\sum_{g=g_0}^G \gamma_{g} \mathbb{1}(g) X_{it} + \epsilon_{it} \\
y_{it} &= \beta_g + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) +
\sum_{t=t_0}^T \gamma_{t} \mathbb{1}(t) X_{it} + \epsilon_{it}
\end{aligned}
\]
anticipation(#): Sets a different period as the baseline for the treatment (default is 1, i.e. g-1).
hettype(): Imposes some restrictions on the heterogeneity type. The default is timecohort.
Other types: time, cohort, event, twfe, eventcohort
And for event aggregation:
window(#1 #2): as-is window
cwindow(#1 #2): censored window
csdid, did_imputation, did_multiple_dyn and jwdid
jwdid to be as flexible as possible
If you are interested, you can install the latest version of jwdid using
net install jwdid, from(https://raw.githubusercontent.com/friosavila/stpackages/main)
Oceania Stata Conference 2025