```
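* Load the example data (frause is available from SSC)
qui:frause oaxaca, clear
* Wage equation for men; store the results for later use
qui:reg lnwage educ exper tenure age if female==0
est sto male
* Same specification for women
qui:reg lnwage educ exper tenure age if female==1
est sto female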
```

## Introduction

A question I have seen online many…many…many times is how to compare the coefficients of models that have been estimated using a high-dimensional set of fixed effects.

The usual first answer has always been…to `suest` both equations, or to *stack* them, and compare the effects. However, `suest` will not work with `reghdfe` nor `xtreg`, and *stacking* equations is even less intuitive.

Today, however, I will show you an easy way to do this with a little command of my own creation, and also using some simple syntax.

To use the strategies I present here you will need `reghdfe` (from `ssc`) and `cre` (from `fra`, my own repository). You will also need `frause`, from `ssc`.
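As a rough sketch, installation could look like the lines below. The `ssc install` commands are standard Stata. Two things are assumptions on my part and not stated elsewhere in this post: that `ftools` is needed as a dependency of `reghdfe`, and that `fra` is already installed and accepts the `fra install` syntax shown.

```stata
* Community-contributed commands from SSC
ssc install ftools, replace
ssc install reghdfe, replace
ssc install frause, replace
* ASSUMPTION: fra is already installed and uses this syntax to fetch cre
fra install cre, replace
```

If the last line fails, check how `fra` itself is installed before retrying.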

## `suest` the problem

Let's start with a simple wage regression model, where we aim to compare the coefficients of men and women. For this, we will use the `oaxaca` dataset and a simple *Mincerian* regression model.
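Written out, the specification implied by the regression commands in this post is the following, estimated separately for each gender $g$. The $\beta$ and $\varepsilon$ symbols are notation I am introducing here for illustration.

$$
\ln(\text{wage}_i) = \beta_0^{g} + \beta_1^{g}\,\text{educ}_i + \beta_2^{g}\,\text{exper}_i + \beta_3^{g}\,\text{tenure}_i + \beta_4^{g}\,\text{age}_i + \varepsilon_i
$$

The question is then whether $\beta_k^{male} = \beta_k^{female}$ for one or more of the slopes.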

First, let's estimate both models and store them (this is the code block at the top of this post), and then use `suest` to put them together and test whether the coefficients differ from each other. For this I will use the `lincom` and `test` commands:

```
qui: suest male female
lincom [male_mean]:educ-[female_mean]:educ
test [male_mean=female_mean]:educ
test [male_mean=female_mean], common
```

```
 ( 1)  [male_mean]educ - [female_mean]educ = 0

------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -.0300371   .0119579    -2.51   0.012    -.0534741   -.0066001
------------------------------------------------------------------------------

 ( 1)  [male_mean]educ - [female_mean]educ = 0

           chi2(  1) =    6.31
         Prob > chi2 =    0.0120

 ( 1)  [male_mean]educ - [female_mean]educ = 0
 ( 2)  [male_mean]exper - [female_mean]exper = 0
 ( 3)  [male_mean]tenure - [female_mean]tenure = 0
 ( 4)  [male_mean]age - [female_mean]age = 0

           chi2(  4) =   26.63
         Prob > chi2 =    0.0000
```
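For a single restriction, `lincom` and `test` rely on the same Wald statistic. As a sketch, with $\hat V$ and $\widehat{Cov}$ taken from the joint covariance matrix that `suest` builds (the notation is mine):

$$
z = \frac{\hat\beta^{m}_{educ} - \hat\beta^{f}_{educ}}{\sqrt{\hat V(\hat\beta^{m}_{educ}) + \hat V(\hat\beta^{f}_{educ}) - 2\,\widehat{Cov}(\hat\beta^{m}_{educ}, \hat\beta^{f}_{educ})}}, \qquad \chi^2_1 = z^2
$$

With the numbers reported above, $z = -0.0300371 / 0.0119579 \approx -2.51$ and $z^2 \approx 6.31$, which matches the `chi2(1)` reported by `test`.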

I could also use more involved methods, like writing my own `ml` or `gmm` estimator, but there is no need for that in this simple case.

## Stacking

The next option is *stacking*. This sounds difficult, but it is nothing more than the old trick of interactions. We simply need to estimate a model where all covariates are interacted with our sampling indicator (gender):

```
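* All covariates interacted with the group indicator
qui:reg lnwage i.female##c.(educ exper tenure age), robust
lincom 1.female#c.educ
test 1.female#c.educ
* Joint test, with variable names written out in full
test 1.female#c.educ 1.female#c.exper 1.female#c.tenure 1.female#c.age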
```

```
 ( 1)  1.female#c.educ = 0

------------------------------------------------------------------------------
      lnwage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |   .0300371   .0119956     2.50   0.012     .0065062    .0535681
------------------------------------------------------------------------------

 ( 1)  1.female#c.educ = 0

       F(  1,  1424) =    6.27
            Prob > F =    0.0124

 ( 1)  1.female#c.educ = 0
 ( 2)  1.female#c.exper = 0
 ( 3)  1.female#c.tenure = 0
 ( 4)  1.female#c.age = 0

       F(  4,  1424) =    6.62
            Prob > F =    0.0000
```

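Again, we obtain the same point estimate as before (with the sign flipped, because the interaction measures women minus men), along with nearly identical standard errors and test results.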
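A quick way to convince yourself that the interaction is just the difference between the two separate regressions is the sketch below. It assumes `frause` is installed; the scalar names `b_m` and `b_f` are mine and purely illustrative.

```stata
* Sketch: the interaction coefficient equals the female-minus-male difference
qui: frause oaxaca, clear
* Return to education for men
qui: reg lnwage educ exper tenure age if female==0
scalar b_m = _b[educ]
* Return to education for women
qui: reg lnwage educ exper tenure age if female==1
scalar b_f = _b[educ]
* Fully interacted (stacked) model
qui: reg lnwage i.female##c.(educ exper tenure age), robust
display "separate regressions: " b_f - b_m
display "interaction term:     " _b[1.female#c.educ]
```

The two displayed numbers should agree, since a fully interacted model reproduces the coefficients of the group-by-group regressions; only the standard errors depend on how the model is stacked.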

But now the hard question: what if we have high-dimensional fixed effects (HDFE)?

## Stacking FE

To simulate the situation of a high-dimensional FE, I will use `age` as the fixed effect. This still allows me to obtain point estimates using a simple regression with dummies (and, say, `suest`), and to compare them with the alternative:

```
qui:reg lnwage educ exper tenure i.age if female==0
est sto male
qui:reg lnwage educ exper tenure i.age if female==1
est sto female
qui:suest male female, cluster(age)
lincom [male_mean]:educ-[female_mean]:educ
test [male_mean=female_mean]:educ exper tenure
```

```
 ( 1)  [male_mean]educ - [female_mean]educ = 0

------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -.0354668   .0095251    -3.72   0.000    -.0541357   -.0167978
------------------------------------------------------------------------------

 ( 1)  [male_mean]educ - [female_mean]educ = 0
 ( 2)  [male_mean]exper - [female_mean]exper = 0
 ( 3)  [male_mean]tenure - [female_mean]tenure = 0

           chi2(  3) =   30.72
         Prob > chi2 =    0.0000
```

Now the second method, using `reghdfe`:

```
* One cluster per age-by-gender cell, matching the absorbed fixed effects
egen age_fem = group(age female)
qui:reghdfe lnwage i.female##c.(educ exper tenure), abs(female#age) cluster(age_fem)
lincom 1.female#c.educ
test 1.female#c.educ 1.female#c.exper 1.female#c.tenure
```

```
 ( 1)  1.female#c.educ = 0

------------------------------------------------------------------------------
      lnwage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |   .0354668   .0104804     3.38   0.001     .0146424    .0562911
------------------------------------------------------------------------------

 ( 1)  1.female#c.educ = 0
 ( 2)  1.female#c.exper = 0
 ( 3)  1.female#c.tenure = 0

       F(  3,    89) =    6.49
            Prob > F =    0.0005
```
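In equation form, the stacked specification behind the `reghdfe` call can be sketched as follows. The notation is mine: $x_i$ collects `educ`, `exper` and `tenure`, and $\alpha$ is a separate fixed effect for every age-by-gender cell.

$$
\ln(\text{wage}_i) = x_i'\beta + \text{female}_i \cdot x_i'\delta + \alpha_{\text{age}_i,\,\text{female}_i} + \varepsilon_i
$$

Here $\beta$ holds the male coefficients and $\delta$ the female-minus-male differences, so `lincom 1.female#c.educ` reports $\hat\delta_{educ}$ and the joint `test` checks $H_0: \delta = 0$. Absorbing `female#age` rather than `age` alone is what lets the fixed effects differ by gender, just as they do when the two models are estimated separately.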

## Hard Example

Let's do this with a harder example, using the `nlswork` dataset to compare the coefficients of a wage regression between north and south:

```
webuse nlswork, clear
egen cl = group(idcode south)
qui: reghdfe ln_wage i.south##c.(age msp not_smsa c_city union tenure hours) , abs(idcode#south) cluster(cl)
test 1.south#c.age 1.south#c.msp 1.south#c.union
```

```
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)
(8 missing values generated)
( 1) 1.south#c.age = 0
( 2) 1.south#c.msp = 0
( 3) 1.south#c.union = 0
F( 3, 3586) = 1.24
Prob > F = 0.2941
```
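The same setup extends to a joint test on every slope, which is the stacked counterpart of the `common` option used with `suest` earlier. A minimal sketch, reusing the model above and only spelling out all seven interaction terms:

```stata
webuse nlswork, clear
egen cl = group(idcode south)
qui: reghdfe ln_wage i.south##c.(age msp not_smsa c_city union tenure hours), abs(idcode#south) cluster(cl)
* Joint test that all seven slopes are equal across regions
test 1.south#c.age 1.south#c.msp 1.south#c.not_smsa 1.south#c.c_city 1.south#c.union 1.south#c.tenure 1.south#c.hours
```

I have not reported the output here; the point is only the syntax.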

The same comparison can also be done using `cre` (correlated random effects):

```
webuse nlswork, clear
qui:cre, abs(idcode) keep prefix(m1): regress ln_wage age msp not_smsa c_city union tenure hours if south==0
est sto north
qui:cre, abs(idcode) keep prefix(m2): regress ln_wage age msp not_smsa c_city union tenure hours if south==1
est sto south
qui:suest north south
test [north_mean=south_mean]: age msp union
```

```
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)
( 1) [north_mean]age - [south_mean]age = 0
( 2) [north_mean]msp - [south_mean]msp = 0
( 3) [north_mean]union - [south_mean]union = 0
chi2( 3) = 3.82
Prob > chi2 = 0.2821
```

## Conclusions

There you have it. Two ways to compare coefficients across two models: using interactions, or using `suest`.

Both provide essentially the same results, as long as you cluster the standard errors on the absorbed variable.

Hope you find it useful!