Confidence intervals (CI), also known as interval estimates, are ranges of values that are provided alongside point estimates as potential values for population parameters, based on the estimation method, the level of confidence, and the degree of precision of those estimates.
A common confusion that arises with CI is their interpretation. It's important to emphasize that CI are just estimates, and are not guaranteed to contain the true parameter.
For example, when we say the estimate for \(\alpha\) is 1 with a 95% CI of \((0.5, 1.5)\), it doesn't mean that the true parameter will be contained in that confidence interval with a 95% probability.
Instead, it means that if we repeat the sampling exercise a large number of times, and construct a CI of the same level each time, the intervals will contain the true parameter 95% of the time.
```stata
clear
set scheme white2
set obs 200
gen p = (_n*2-1)/(2*_N)
gen sd = (1/sqrt(500))
gen b1 = invnormal(p)*sd
gen ll = b1 - sd*invnormal(0.95)
gen uu = b1 + sd*invnormal(0.95)
gen n = p
two (rarea ll uu n if uu<0, color(red%50)) (rarea ll uu n if ll>0, color(red%50)) ///
    (rarea ll uu n if inrange(0,ll,uu), color(green%20)) (line b1 n, color(gs1)), ///
    legend(order(1 "Rejection of Null" 3 "Null not Rejected"))
graph export uci1.png, width(1000) height(500) replace
```
Figure 1: 90% CI (uci1.png)

As an example, Figure 1 shows 90% confidence intervals for a coefficient that, by construction, has a true value of zero. Looking at the CI, the middle 90% of the coefficients contain the true parameter of zero, but the upper and lower 5% do not. Thus, with a 90% confidence interval, we have a 10% chance of rejecting the null when it is true.
The Problem of Simultaneous Hypotheses
While CI are useful for understanding the level of precision of an estimate, they are not a good tool when one is interested in testing multiple hypotheses at the same time.
Consider the following example:
Say that you have access to two independent sources of data, and you test the same hypothesis in both. Say, on top of this, that you use 90% confidence intervals to test your hypothesis. How likely is it that at least one of the tests will come out positive?
The answer in this case is straightforward. The chance that the first test is a false positive is 10%. When it is negative (90% of the time), there is still a 10% chance of a false positive in the second sample: \(10\% + 90\% \times 10\% = 19\%\). So using a second sample almost doubles the probability of finding a false significant result.
Let's take the exercise a bit further. If we run the same experiment on 7 different samples, the chance of finding a positive result is now over 50%. It's easy to see that one can falsify results simply by increasing the number of samples used for the analysis.[1]
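To make the arithmetic concrete, here is a quick check of both numbers, assuming independent tests each at the 10% level:

```stata
* Probability of at least one false positive across k independent tests at the 10% level
display "2 samples: " 1 - 0.9^2    // .19
display "7 samples: " 1 - 0.9^7    // .5217031, i.e. over 50%
```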
What other alternatives do we have? Well, there are at least two:
Use joint hypothesis tests (Wald tests). These adjust for possible correlations (or lack thereof) among the estimated parameters, and provide a single statistic that tests whether at least one of the parameters is different from zero.
Use simultaneous confidence intervals, which adjust the standard critical values so that we jointly reject the coefficients of interest in only \(\alpha\%\) of cases. I like to think of this as a way to translate joint tests into single-hypothesis testing.
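As a quick illustration of the first alternative, Stata's test command performs a joint Wald-type test after estimation (the regression below is just a stand-in example):

```stata
* Joint test that both slope coefficients are zero (illustration only)
sysuse auto, clear
reg price mpg foreign, robust
test mpg foreign
```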
How does this work?
Let's reconsider the previous example. When I have two independent samples, the naive joint share of false significance tests is given by:

\[\alpha_{12} = \alpha_1 + (1-\alpha_1)\,\alpha_2\]
where \(1-\alpha_j\) is the level of confidence you would normally use for single hypothesis testing.
Now, we know \(\alpha_j\) is too high, and that it has to be the same for both samples. Our goal is to find an appropriate \(\alpha\) so that the joint probability of a false positive equals the desired level \(\alpha_{12}\):

\[\alpha_{12} = \alpha + (1-\alpha)\,\alpha = 2\alpha - \alpha^2\]
This suggests that the level of significance we need to use in each individual sample should be:
\[\alpha = 1-\sqrt{1-\alpha_{12}}\]
For the previous case, if one is aiming for a 90% joint confidence level, we need to set \(\alpha_{12}\) at 10% (0.1), which implies that each individual test needs to use an \(\alpha\) of 0.0513, or 5.13%. That is almost half of what we started with.
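A one-line check of that number:

```stata
* Per-test alpha that yields a 10% joint false-positive rate over two independent tests
display 1 - sqrt(1 - 0.10)    // .0513167
```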
Of course, if you have more than two coefficients to test, or if there are correlations among them, the formula above needs to be adjusted, which is essentially what formal multiple-testing approaches do.
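For what it is worth, under the (strong) assumption of independent tests, the same derivation extends to \(k\) tests as \(\alpha = 1-(1-\alpha_{joint})^{1/k}\). The snippet below is only a sketch of that independence case, not a general correction:

```stata
* Per-test alpha under independence, for k joint tests at a 10% joint level
forvalues k = 2/5 {
    display "k = `k': alpha = " 1 - (1 - 0.10)^(1/`k')
}
```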
UCI via Influence Functions (IF)
The first time I came across the idea of uniform confidence intervals was when I started working on csdid (@callaway_santanna_2021), where both standard errors and confidence intervals are estimated via influence functions.
This is done as follows:
Derive the influence functions of your estimator.
Use a multiplicative wild bootstrap to perturb the IF, then estimate standard errors and the absolute values of the t-statistics for all your parameters.
For each iteration of the wild bootstrap, keep the largest t-statistic. You now have a single column of t-statistics.
From this new variable choose the (\(1-\alpha\)) percentile. This will be your new critical value.
Re-estimate your Confidence intervals.
Because the new critical value was constructed by looking at the highest t-statistic, it will have the same properties as the manually adjusted significance level described above.
Here is a small example of how it works.
```stata
sysuse auto, clear
gen one = 1
mata
    x   = st_data(., "mpg foreign one")
    y   = st_data(., "price")
    n   = rows(y)
    xx  = cross(x,x);  xy = cross(x,y)
    ixx = invsym(xx)
    b   = ixx*xy;  e = y :- x*b
    // influence function of the OLS coefficients
    iff = 74*(x:*e)*ixx
    // multiplicative wild bootstrap: perturb the IF with standard normal weights
    iffm = J(1000, 3, 0)
    for(i=1; i<=1000; i++) {
        iffm[i,] = mean(iff:*rnormal(n,1,0,1))
    }
    se = diagonal(variance(iffm):^.5)'
    // max |t| per draw, and its 95th percentile as the adjusted critical value
    ts = abs(iffm:/se); ts = rowmax(ts); _sort(ts,1)
    tcrit = ts[ceil(0.95*rows(ts))]
    st_matrix("tbl", (b, se', b:-tcrit*se', b:+tcrit*se'))
end
```
First, I estimate the influence function iff for the model coefficients. Then, I perturb it using standard normal noise. I estimate standard errors and t-statistics based on the perturbed data iffm, and obtain the adjusted critical value tcrit. This critical value is then used to reconstruct the confidence intervals.
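The comparison itself is produced by relabeling the matrix built above and listing it next to the regression output, along with the adjusted critical value:

```stata
reg price mpg foreign, robust
matrix rowname tbl = mpg foreign cons
matrix colname tbl = b se ll uu
matrix list tbl
mata: "Adjusted Critical Value"
mata: tcrit
```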
Finally, I compare the results from the regression analysis with the intervals based on the adjusted critical value. As expected, the latter are a bit wider than the original confidence intervals.
Via Simulation
The strategy I show above is relatively simple if you know how to estimate the IF. However, beyond simple statistics, obtaining them may be too involved for the average user.
So, one alternative I’m suggesting here is to do the same, but via simulation.
For any regression analysis, obtain the vector of coefficients b and the variance-covariance matrix V.
Draw N normally distributed samples with mean zero and variance-covariance V.
Obtain the z/t statistics from the simulated coefficients and the model's estimated variances.
Just like before, obtain the new critical value based on the new set of t-statistics, and adjust the confidence intervals (a sketch of these steps follows below).
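Here is a minimal sketch of those steps in Mata, assuming standard e(b) and e(V) matrices after regress. This is only an illustration of the idea, not the uci command itself:

```stata
* Sketch of the simulation approach: draw from N(0, V), take the max |z| per draw,
* and use its 95th percentile as the adjusted critical value.
sysuse auto, clear
quietly reg price mpg foreign, robust
mata
    b     = st_matrix("e(b)")        // 1 x k coefficient vector
    V     = st_matrix("e(V)")        // k x k variance-covariance matrix
    se    = sqrt(diagonal(V))'       // 1 x k standard errors
    reps  = 999
    // draws from N(0, V) via the Cholesky factor of V
    Z     = rnormal(reps, cols(b), 0, 1) * cholesky(V)'
    tmax  = rowmax(abs(Z :/ se))     // largest |z| within each draw
    _sort(tmax, 1)
    tcrit = tmax[ceil(0.95*reps)]    // adjusted critical value
    st_matrix("tbl2", (b', se', (b - tcrit*se)', (b + tcrit*se)'))
end
matrix colnames tbl2 = b se ll uu
matrix list tbl2
```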
Here is where my new command uci comes into play. First, you may need to install it using fra. If you want to use fra, please see https://friosavila.github.io/chatgpt/fra_03_30_2023/ for instructions:
fra install uci, replace
checking uci consistency and verifying not already installed...
installing into c:\ado\plus\...
installation complete.
The command does not have a helpfile yet, but its use is straightforward.
If you run it after any regression command, it will take the information from that regression and obtain the modified confidence intervals.
If one wants to use other pre-saved information, one can do so with the options bmatrix() and vmatrix().
One can choose how many repetitions to run using reps(#), where the default is 999.
It's possible to set a seed using rseed() for cases where replication is needed.
You can also set the confidence level with level(#), where the default is 95.
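For example, a call using pre-saved results might look like the following; the exact syntax here is my assumption based on the options listed above, since there is no helpfile yet:

```stata
* Hypothetical call with pre-saved results (option syntax assumed from the list above)
matrix b = e(b)
matrix V = e(V)
uci, bmatrix(b) vmatrix(V) reps(1999) rseed(101) level(90)
```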
So, let me redo the above example:
```stata
reg price mpg foreign, robust
uci
```
Linear regression Number of obs = 74
F(2, 71) = 12.72
Prob > F = 0.0000
R-squared = 0.2838
Root MSE = 2530.9
------------------------------------------------------------------------------
| Robust
price | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
mpg | -294.1955 60.33645 -4.88 0.000 -414.503 -173.8881
foreign | 1767.292 607.7385 2.91 0.005 555.4961 2979.088
_cons | 11905.42 1362.547 8.74 0.000 9188.573 14622.26
------------------------------------------------------------------------------
Uniform Confidence Intervals based on Simulation
rtable2[3,5]
b se t ll lu
mpg -294.19553 60.336453 -4.8759169 -434.12725 -154.26381
foreign 1767.2922 607.73848 2.9079815 357.83104 3176.7534
_cons 11905.415 1362.5472 8.7376169 8745.409 15065.422
Repetitions:999
Level:95
new T:2.31919
This provides slightly different results, because of how the standard errors are estimated and because of the random nature of the bootstrap procedure, but it should be a good approximation to the uniform confidence intervals.
Conclusions
This post aims to provide a small description of what uniform confidence intervals are and how to estimate them.
It also introduces a new small command, uci, which should help estimate this type of confidence interval for almost any model.
Comments and suggestions are welcome.
Til next time.
Footnotes
[1] This will not happen if we just increase the sample size.