# How to `suest` a HDFE

I propose a feasible strategy to compare coefficients across models with high-dimensional fixed effects.
Tags: Stata, Programming, Fixed Effects

Author: Fernando Rios-Avila

Published: July 14, 2023

## Introduction

A question I have seen online many, many, many times is how to compare coefficients across models that have been estimated with a high-dimensional set of fixed effects.

The usual answer has always been to `suest` both equations, or to stack both equations and compare the effects. However, `suest` will not work with `reghdfe` or `xtreg`, and stacking equations is even less intuitive.

Today, however, I will present an easy way to do this with a little command of my own creation, and also with some simple syntax.

Setup

To use the strategies I present here you will need `reghdfe` (from `ssc`) and `cre` (from `fra`, my own repository). You will also need `frause` (from `ssc`).
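A minimal sketch of the installation step. The `ssc install` lines are standard, and `ftools` is included because `reghdfe` depends on it. The repository URL and the `fra install` syntax are assumptions about how the `fra` installer works, so check the repository page if they fail.

``````stata
* Packages available from SSC
ssc install ftools, replace     // dependency of reghdfe
ssc install reghdfe, replace
ssc install frause, replace

* cre from the author's repository
* (URL and -fra install- syntax are assumptions, not confirmed by this post)
net install fra, replace from(https://friosavila.github.io/stpackages)
fra install cre, replace
``````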

## `suest` the problem

Let's start with a simple wage regression model, where we aim to compare the coefficients of men and women. For this, we will use the `oaxaca` data set and a simple Mincerian regression model.

First, let's estimate both models:

``````
qui:frause oaxaca, clear
qui:reg lnwage educ exper tenure age if female==0
est sto male
qui:reg lnwage educ exper tenure age if female==1
est sto female
``````

Then we use `suest` to put them together and test whether the coefficients differ from each other. For this I will use the `lincom` and `test` commands:

``````
qui: suest male female
lincom [male_mean]:educ-[female_mean]:educ
test [male_mean=female_mean]:educ
test [male_mean=female_mean], common
``````
``````
 ( 1)  [male_mean]educ - [female_mean]educ = 0

------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -.0300371   .0119579    -2.51   0.012    -.0534741   -.0066001
------------------------------------------------------------------------------

 ( 1)  [male_mean]educ - [female_mean]educ = 0

           chi2(  1) =    6.31
         Prob > chi2 =    0.0120

 ( 1)  [male_mean]educ - [female_mean]educ = 0
 ( 2)  [male_mean]exper - [female_mean]exper = 0
 ( 3)  [male_mean]tenure - [female_mean]tenure = 0
 ( 4)  [male_mean]age - [female_mean]age = 0

           chi2(  4) =   26.63
         Prob > chi2 =    0.0000
``````

I could also use more involved methods, like writing my own `ml` or `gmm` program, but there is no need for that in this simple case.
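The `[male_mean]` and `[female_mean]` prefixes used above are the equation names that `suest` builds from the stored estimate names: results from `regress` get a `_mean` equation for the coefficients and a `_lnvar` equation for the log variance. If you are unsure how to refer to a coefficient, the `coeflegend` option lists the names. A small sketch using the same two models:

``````stata
frause oaxaca, clear
quietly regress lnwage educ exper tenure age if female==0
estimates store male
quietly regress lnwage educ exper tenure age if female==1
estimates store female

* show the _b[eqname:varname] label of every coefficient
suest male female, coeflegend

* the same difference that lincom reports, computed by hand
display _b[male_mean:educ] - _b[female_mean:educ]
``````

The `display` line should reproduce the point estimate reported by `lincom` above.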

## Stacking

The next option is stacking. This sounds difficult, but it is nothing more than the old trick of interactions. We simply estimate a model where all covariates are interacted with our sampling indicator (gender):

``````
qui:reg lnwage i.female##c.(educ exper tenure age), robust
lincom 1.female#c.educ
test 1.female#c.educ
test 1.female#c.educ 1.female#c.exper 1.female#c.tenure 1.female#c.age
``````
``````
 ( 1)  1.female#c.educ = 0

------------------------------------------------------------------------------
      lnwage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |   .0300371   .0119956     2.50   0.012     .0065062    .0535681
------------------------------------------------------------------------------

 ( 1)  1.female#c.educ = 0

       F(  1,  1424) =    6.27
            Prob > F =    0.0124

 ( 1)  1.female#c.educ = 0
 ( 2)  1.female#c.exper = 0
 ( 3)  1.female#c.tenure = 0
 ( 4)  1.female#c.age = 0

       F(  4,  1424) =    6.62
            Prob > F =    0.0000
``````

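Again, we obtain the same point estimate as before, with the sign flipped because the interaction measures the female coefficient minus the male one, and nearly the same standard error.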
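Because the stacked model is fully interacted, it also contains the group-specific slopes. As a small sketch, the main effect is the slope for men (the base category), and adding the interaction gives the slope for women:

``````stata
frause oaxaca, clear
quietly regress lnwage i.female##c.(educ exper tenure age), robust

* return to education for men (female==0 is the base category)
lincom educ

* return to education for women: main effect plus interaction
lincom educ + 1.female#c.educ
``````

These two point estimates should coincide with the `educ` coefficients from the separate regressions for each gender.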

But now the hard question. What if we have a HDFE?

## Stacking FE

To simulate the situation of a high-dimensional FE, I will use `age`. This still allows me to obtain point estimates using a simple regression with dummies (and thus `suest`), and to compare them with the alternative:

``````
qui:reg lnwage educ exper tenure i.age if female==0
est sto male
qui:reg lnwage educ exper tenure i.age if female==1
est sto female
qui:suest male female, cluster(age)
lincom [male_mean]:educ-[female_mean]:educ
test [male_mean=female_mean]:educ exper tenure
``````
``````
 ( 1)  [male_mean]educ - [female_mean]educ = 0

------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -.0354668   .0095251    -3.72   0.000    -.0541357   -.0167978
------------------------------------------------------------------------------

 ( 1)  [male_mean]educ - [female_mean]educ = 0
 ( 2)  [male_mean]exper - [female_mean]exper = 0
 ( 3)  [male_mean]tenure - [female_mean]tenure = 0

           chi2(  3) =   30.72
         Prob > chi2 =    0.0000
``````

Now the second method, using `reghdfe` with both the covariates and the fixed effect interacted with gender:

``````
egen age_fem = group(age female)
qui:reghdfe lnwage i.female##c.(educ exper tenure), abs(female#age) cluster(age_fem)
lincom 1.female#c.educ
test 1.female#c.educ 1.female#c.exper 1.female#c.tenure
``````
``````
 ( 1)  1.female#c.educ = 0

------------------------------------------------------------------------------
      lnwage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |   .0354668   .0104804     3.38   0.001     .0146424    .0562911
------------------------------------------------------------------------------

 ( 1)  1.female#c.educ = 0
 ( 2)  1.female#c.exper = 0
 ( 3)  1.female#c.tenure = 0

       F(  3,    89) =    6.49
            Prob > F =    0.0005
``````
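As a rough cross-check that does not depend on `reghdfe`, the same stacked model can be estimated with the built-in `areg`, absorbing the age-by-gender groups. This is only a sketch: the point estimates should match, but `areg` and `reghdfe` may apply different degrees-of-freedom corrections, so the standard errors need not agree exactly.

``````stata
frause oaxaca, clear

* one group per age-gender cell, as in the reghdfe call above
egen age_fem = group(age female)

* absorbing the cells is equivalent to a separate set of age dummies per gender;
* the main effect of female is collinear with the cells and gets dropped
areg lnwage i.female##c.(educ exper tenure), absorb(age_fem) vce(cluster age_fem)

lincom 1.female#c.educ
``````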

## Using `suest` and the correlated random effects model `cre`

Now we use a Correlated Random Effects (CRE) model to estimate the FE models:

``````
qui:cre, keep prefix(m1) abs(age):reg lnwage educ exper tenure if female==0
est sto male
qui:cre, keep prefix(m2) abs(age):reg lnwage educ exper tenure if female==1
est sto female
qui:suest male female, cluster(age)
lincom [male_mean]:educ-[female_mean]:educ
test [male_mean=female_mean]:educ exper tenure
``````
``````
 ( 1)  [male_mean]educ - [female_mean]educ = 0

------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -.0354668   .0095251    -3.72   0.000    -.0541357   -.0167978
------------------------------------------------------------------------------

 ( 1)  [male_mean]educ - [female_mean]educ = 0
 ( 2)  [male_mean]exper - [female_mean]exper = 0
 ( 3)  [male_mean]tenure - [female_mean]tenure = 0

           chi2(  3) =   30.72
         Prob > chi2 =    0.0000
``````

This gives exactly the same result as `suest` with the age dummies!
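To see why a CRE model can stand in for the fixed effects, here is a hand-rolled sketch of the Mundlak device for the male sample: add the group means of each covariate to a plain regression and compare with the within estimator. I am assuming this is the idea behind `cre`; the `mean_*` variable names are hypothetical and are not the names `cre` creates.

``````stata
frause oaxaca, clear

* keep the male estimation sample so the group means match the regression sample
keep if female==0 & !missing(lnwage, educ, exper, tenure, age)

* group means of each covariate within the "fixed effect" (age)
foreach v of varlist educ exper tenure {
    bysort age: egen double mean_`v' = mean(`v')
}

* Mundlak / CRE regression: covariates plus their group means
regress lnwage educ exper tenure mean_educ mean_exper mean_tenure

* within (fixed effects) estimator for comparison
areg lnwage educ exper tenure, absorb(age)
``````

The coefficients on `educ`, `exper` and `tenure` should be identical in both regressions. Unlike `areg`, the first one is a plain `regress`, which is what lets `suest` work with it.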

## Hard Example

Let's do this with a harder example, using the `nlswork` dataset to compare wage regression coefficients between north and south:

``````
webuse nlswork, clear
egen cl = group(idcode south)
qui: reghdfe ln_wage i.south##c.(age msp not_smsa c_city union tenure hours), abs(idcode#south) cluster(cl)
test 1.south#c.age 1.south#c.msp 1.south#c.union
``````
``````
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)
(8 missing values generated)

 ( 1)  1.south#c.age = 0
 ( 2)  1.south#c.msp = 0
 ( 3)  1.south#c.union = 0

       F(  3,  3586) =    1.24
            Prob > F =    0.2941
``````

And the same test using CRE:

``````
webuse nlswork, clear
qui:cre, abs(idcode) keep prefix(m1): regress ln_wage age msp not_smsa c_city union tenure hours if south==0
est sto north
qui:cre, abs(idcode) keep prefix(m2): regress ln_wage age msp not_smsa c_city union tenure hours if south==1
est sto south
qui:suest north south
test [north_mean=south_mean]: age msp union
``````
``````
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)

 ( 1)  [north_mean]age - [south_mean]age = 0
 ( 2)  [north_mean]msp - [south_mean]msp = 0
 ( 3)  [north_mean]union - [south_mean]union = 0

           chi2(  3) =    3.82
         Prob > chi2 =    0.2821
``````
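The `suest` call above uses its default robust variance, without clustering. A possible variation, sketched here without having run it, is to cluster on the absorbed variable as in the `age` example. The `cre` syntax is copied from the block above; `vce(cluster idcode)` is a standard `suest` option. No output is shown because the result is not part of the original analysis.

``````stata
webuse nlswork, clear
qui:cre, abs(idcode) keep prefix(m1): regress ln_wage age msp not_smsa c_city union tenure hours if south==0
est sto north
qui:cre, abs(idcode) keep prefix(m2): regress ln_wage age msp not_smsa c_city union tenure hours if south==1
est sto south

* same comparison, now clustering on the absorbed individual identifier
qui:suest north south, vce(cluster idcode)
test [north_mean=south_mean]: age msp union
``````

Note that some individuals appear in both regions, so clustering on `idcode` here allows for correlation across the two equations, while the `reghdfe` run above clusters on `idcode` by `south` cells.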

## Conclusions

There you have it: two ways to compare coefficients across two models, using interactions or `suest`.

Both provide the same point estimates, and very similar test results, if you cluster the standard errors on the absorbed variable.

I hope you find it useful.