Chapter 8 Bootstrap

8.1 Principles of Bootstrap

Here is an example in the spirit of Efron's original 1979 paper.

##      mean        sd 
## 0.8155341 0.2507561
##       mean              sd        
##  Min.   :0.3411   Min.   :0.1109  
##  1st Qu.:0.7191   1st Qu.:0.2171  
##  Median :0.9299   Median :0.2615  
##  Mean   :0.9348   Mean   :0.2663  
##  3rd Qu.:1.1501   3rd Qu.:0.3134  
##  Max.   :1.9982   Max.   :0.5270
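The code behind this output is not shown; a minimal nonparametric bootstrap sketch with a hypothetical sample `x` (an assumption, not the data used above) produces output of the same shape:

```r
## Nonparametric bootstrap of the sample mean and standard deviation.
## The sample x here is a placeholder; the data used above are not shown.
set.seed(123)
x <- rexp(30)                                   # hypothetical observed sample
stat <- function(s) c(mean = mean(s), sd = sd(s))
stat(x)                                         # point estimates
B <- 1000
boot <- t(replicate(B, stat(sample(x, replace = TRUE))))
summary(boot)                                   # bootstrap distributions
```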

8.2 Parametric Bootstrap

A parametric model is fitted to the data, with the model parameters estimated by likelihood methods, moment methods, or other methods. Parametric bootstrap samples are then generated from the fitted model, and the quantity to be bootstrapped is computed from each sample.

The parametric bootstrap is often used to approximate the null distribution of a test statistic that is otherwise unwieldy, while accounting for the uncertainty in parameter estimation.

8.2.1 Goodness-of-Fit Test

Consider a goodness-of-fit test based on the Kolmogorov–Smirnov (KS) statistic, with two different null hypotheses: \[ H_0: \mbox{ the data follow the $N(2, 2^2)$ distribution,} \] and \[ H_0: \mbox{ the data follow a normal distribution.} \] Note that the first hypothesis is a simple hypothesis while the second is a composite hypothesis.

## 
##  One-sample Kolmogorov-Smirnov test
## 
## data:  pvals[1, ]
## D = 0.043786, p-value = 0.04322
## alternative hypothesis: two-sided
## 
##  One-sample Kolmogorov-Smirnov test
## 
## data:  pvals[2, ]
## D = 0.47133, p-value < 2.2e-16
## alternative hypothesis: two-sided

Compare the histograms of the p-values obtained from applying ks.test() to \(N(2, 2^2)\) and to \(N(\hat\mu, \hat\sigma^2)\) with fitted mean \(\hat\mu\) and variance \(\hat\sigma^2\). The first one is what is expected from a \(U(0, 1)\) distribution, but the second one is not, which confirms that applying ks.test() directly to a fitted distribution is invalid for testing goodness of fit of a hypothesized distribution whose unknown parameters must be estimated.
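A sketch of how such p-values can be generated; the sample size, number of replicates, and use of histograms here are assumptions:

```r
## p-values of ks.test() under a simple null (known parameters) versus
## naive use with estimated parameters; settings are illustrative.
set.seed(123)
nrep <- 1000; n <- 100
pvals <- matrix(NA, 2, nrep)
for (j in seq_len(nrep)) {
  x <- rnorm(n, mean = 2, sd = 2)
  pvals[1, j] <- ks.test(x, "pnorm", 2, 2)$p.value            # simple H0
  pvals[2, j] <- ks.test(x, "pnorm", mean(x), sd(x))$p.value  # naive composite H0
}
hist(pvals[1, ])   # close to U(0, 1)
hist(pvals[2, ])   # piled up near 1: the naive test is invalid
```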

A parametric bootstrap procedure can be employed for such tests. The test statistic remains the same, but its null distribution is approximated by the parametric bootstrap.

## Warning in ks.test(pvals, "punif"): ties should not be present for the
## Kolmogorov-Smirnov test
## 
##  One-sample Kolmogorov-Smirnov test
## 
## data:  pvals
## D = 0.042632, p-value = 0.05277
## alternative hypothesis: two-sided
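A minimal sketch of a parametric bootstrap KS test of normality with estimated parameters (the number of bootstrap samples and the example data are assumptions); the key step is re-estimating the parameters on each bootstrap sample:

```r
## Parametric bootstrap KS test of normality with estimated parameters.
ks.boot <- function(x, B = 200) {
  n <- length(x)
  mu <- mean(x); sigma <- sd(x)
  d0 <- ks.test(x, "pnorm", mu, sigma)$statistic
  dstar <- replicate(B, {
    xs <- rnorm(n, mu, sigma)                         # sample from fitted model
    ks.test(xs, "pnorm", mean(xs), sd(xs))$statistic  # re-estimate each time
  })
  mean(dstar >= d0)                                   # bootstrap p-value
}
set.seed(123)
ks.boot(rnorm(100, mean = 2, sd = 2))
```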

Note that the KS statistic and the CvM statistic are functionals of the empirical distribution and the fitted parametric distribution. Faster alternatives are possible with the multiplier CLT (Kojadinovic and Yan 2012).

The chi-squared test is another goodness-of-fit test. Here is an example of testing for the generalized Pareto distribution. The correct degrees of freedom depend on the estimation method (Chernoff and Lehmann 1954).

## [1] 0.6938062
## [1] 0.007492507
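A self-contained sketch of such a test. The hand-rolled GPD functions, sample size, bin count, and parameter values are all illustrative assumptions; the two printed values mimic the contrast between a bootstrap p-value and a naive chi-squared p-value computed with a guessed degrees of freedom:

```r
## Chi-squared goodness-of-fit test for the generalized Pareto distribution
## (GPD, location 0), with the null distribution of the statistic
## approximated by parametric bootstrap. Settings are illustrative.
qgpd <- function(u, scale, shape) scale / shape * ((1 - u)^(-shape) - 1)
rgpd <- function(n, scale, shape) qgpd(runif(n), scale, shape)

fit.gpd <- function(x) {                 # MLE; log-scale keeps scale positive
  nll <- function(p) {
    scale <- exp(p[1]); shape <- p[2]
    z <- 1 + shape * x / scale
    if (any(z <= 0)) return(Inf)
    val <- length(x) * log(scale) + (1 / shape + 1) * sum(log(z))
    if (!is.finite(val)) Inf else val
  }
  p <- optim(c(0, 0.1), nll)$par
  c(scale = exp(p[1]), shape = p[2])
}

chisq.stat <- function(x, scale, shape, k = 10) {  # k equiprobable bins
  breaks <- c(0, qgpd((1:(k - 1)) / k, scale, shape), Inf)
  e <- length(x) / k
  sum((table(cut(x, breaks)) - e)^2 / e)
}

set.seed(123)
x <- rgpd(200, scale = 1, shape = 0.2)
p <- fit.gpd(x)
s0 <- chisq.stat(x, p["scale"], p["shape"])
sstar <- replicate(200, {                # refit the model for each sample
  xs <- rgpd(200, p["scale"], p["shape"])
  ps <- fit.gpd(xs)
  chisq.stat(xs, ps["scale"], ps["shape"])
})
mean(sstar >= s0)                                  # bootstrap p-value
pchisq(s0, df = 10 - 1 - 2, lower.tail = FALSE)    # naive df choice
```

The naive p-value assumes the classical \(k - 1 - p\) degrees of freedom, which Chernoff and Lehmann (1954) showed is not correct when the parameters are estimated from the raw data rather than the binned counts; the bootstrap sidesteps the issue.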

8.4 Semiparametric Bootstrap

Heffernan and Tawn (2004) applied a semiparametric bootstrap method (Davison 1997) in multivariate extreme value modeling. The challenge here is that, although the univariate margins fit well with generalized extreme value (GEV) distributions, an extreme-value dependence structure may be too strong an assumption to be practical. How does one obtain bootstrap samples from a semiparametric model where the marginal models are specified but the dependence structure is not?

To ensure that the bootstrap samples replicate both the marginal and the dependence features of the data, Heffernan and Tawn (2004) proposed a two-step bootstrap algorithm: a nonparametric bootstrap is employed first to preserve the dependence structure, and a parametric step is then carried out to assess the uncertainty in the estimation of the parametric marginal models. The precise procedure is as follows.

  1. Transform the original data to have Gumbel margins using the marginal models fitted to the original data.
  2. Obtain a nonparametric bootstrap sample by sampling with replacement from the transformed data.
  3. Change the marginal values of this bootstrap sample so that the marginal distributions remain Gumbel while the associations between the ranked points in each component are preserved: for each \(i = 1, \ldots, d\), where \(d\) is the multivariate dimension, replace the ordered sample of component \(Y_i\) with an ordered sample of the same size from the standard Gumbel distribution.
  4. Transform the resulting sample back to the original margins using the marginal models estimated from the original data.

The bootstrap samples obtained this way have the desired univariate marginal distributions and a dependence structure entirely consistent with the data, as determined by the associations between the ranks of the components of the variables.

##  num [1:10, 1:6, 1:1000] 0.344 0.333 0.467 0.188 0.467 ...
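In code, the algorithm might be sketched as follows; the data matrix `dat` is a hypothetical placeholder, and the empirical distribution stands in for the fitted GEV marginal models purely for brevity:

```r
## Two-step semiparametric bootstrap in the style of Heffernan and Tawn (2004).
## dat is hypothetical; empirical margins replace the fitted GEV models here.
set.seed(123)
n <- 10; d <- 6; B <- 1000
dat <- matrix(runif(n * d), n, d)          # placeholder for the real data
## transform to standard Gumbel margins via the (empirical) marginal cdf
u <- apply(dat, 2, function(y) rank(y) / (n + 1))
g <- -log(-log(u))
boot <- array(NA, c(n, d, B))
for (b in seq_len(B)) {
  gs <- g[sample(n, replace = TRUE), ]     # step 1: preserve dependence
  for (j in seq_len(d)) {                  # step 2: refresh each margin,
    gnew <- sort(-log(-log(runif(n))))     #   keeping the rank associations
    gs[order(gs[, j]), j] <- gnew
  }
  ## back-transform to the original margins via the empirical quantiles
  boot[, , b] <- sapply(seq_len(d),
                        function(j) quantile(dat[, j], exp(-exp(-gs[, j]))))
}
str(boot)
```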

What if the data is not balanced and the blocks are of different sizes?

8.5 Multiplier Bootstrap

8.5.1 Multiplier Central Limit Theorem

The theoretical background and presentation of the multiplier CLT can be found in Section 2.9 of Van Der Vaart and Wellner (1996). Let \(X_1, \ldots, X_n\) be a random sample from a distribution \(P\). Let \(\delta_x(t) = I(x \le t)\). With the notation \(Z_i(t) = \delta_{X_i}(t) - P(t)\), the empirical CLT can be written as \[ \frac{1}{\sqrt{n}} \sum_{i = 1}^n Z_i \to \mathbb{G}, \] where \(\mathbb{G}\) is a Brownian bridge.

Let \(\xi_1, \ldots, \xi_n\) be a set of iid random variables with mean zero and variance 1, independent of \(X_1, \ldots, X_n\). The multiplier CLT asserts that \[ \frac{1}{\sqrt{n}} \sum_{i=1}^n \xi_i Z_i \to \mathbb{G}, \] under suitable (Donsker-type) conditions on the distribution \(P\).

A more refined and deeper result is the conditional multiplier CLT, which states that \[ \frac{1}{\sqrt{n}} \sum_{i=1}^n \xi_i Z_i \to \mathbb{G}, \] given almost every sequence of \(Z_1, Z_2, \ldots\), under only slightly stronger conditions.

For illustration, consider a scalar random variable. We first look at the unconditional version; that is, the observed data are not fixed.
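A minimal sketch of the unconditional version; the centered exponential distribution and the simulation sizes are assumptions:

```r
## Unconditional illustration: both the data and the multipliers are redrawn
## in every replicate. The settings here are illustrative.
set.seed(123)
n <- 200; nrep <- 1000
stat <- replicate(nrep, {
  z  <- rexp(n) - 1            # Z_i with mean zero and variance one
  xi <- rnorm(n)               # multipliers with mean zero and variance one
  sum(xi * z) / sqrt(n)
})
c(mean = mean(stat), sd = sd(stat))   # close to the limiting N(0, 1)
```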

Now we look at the conditional version where the observed data are conditioned on.
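A matching sketch of the conditional version, where a single hypothetical sample is held fixed and only the multipliers are redrawn:

```r
## Conditional illustration: the observed sample is fixed; only the
## multipliers vary across replicates. The data settings are assumptions.
set.seed(123)
n <- 200
x <- rexp(n)                     # one fixed observed sample
z <- x - mean(x)                 # centered observations
stat <- replicate(1000, sum(rnorm(n) * z) / sqrt(n))
c(sd = sd(stat), sample.sd = sqrt(mean(z^2)))  # the two should be close
```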

The process version retains the dependence structure which saves much computing time in many applications such as goodness-of-fit tests.

##    user  system elapsed 
##   2.303   0.000   2.303
##    user  system elapsed 
##   2.101   0.000   2.101
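For instance, a multiplier version of the KS statistic under a simple null can be sketched as follows; the uniform null and the choice of evaluation grid are assumptions, and the supremum is approximated over the observed points:

```r
## Multiplier bootstrap of the KS statistic under a simple U(0, 1) null.
set.seed(123)
n <- 200
x <- runif(n)
s <- sort(x)                               # evaluation grid: the data points
Fn <- ecdf(x)
d0 <- max(abs(sqrt(n) * (Fn(s) - s)))      # observed statistic (approximate)
cent <- sweep(outer(x, s, "<="), 2, Fn(s)) # I(X_i <= s_j) - Fn(s_j)
mks <- replicate(1000, max(abs(colSums(rnorm(n) * cent))) / sqrt(n))
mean(mks >= d0)                            # multiplier p-value
```

Because the centered indicator matrix `cent` is computed once, each replicate reduces to a matrix product with a fresh multiplier vector, which is the source of the speed advantage over resampling-based schemes.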

Application of the multiplier CLT needs the asymptotic representation of the random quantities or process of interest.

Question: for goodness-of-fit tests with unknown parameters, how would the procedure of the conditional multiplier approach change? This is the subject of Kojadinovic and Yan (2012).

8.6 Exercises

  1. Suppose that \(X\) and \(Y\) are independent \(\Gamma(\alpha_1, \beta_1)\) and \(\Gamma(\alpha_2, \beta_2)\) variables, where \(\Gamma(a, b)\) has mean \(ab\). We are interested in point and interval estimation of \(\theta = \mathbb{E}(X) / \mathbb{E}(Y)\) based on two independent samples of sizes \(n_1\) and \(n_2\), respectively. Consider, for example, \(\alpha_1 = \beta_1 = 2\), \(\alpha_2 = 4\), \(\beta_2 = 2\), \(n_1 = 10\), and \(n_2 = 15\). Set the random seed to 123 for reproducibility. Let \(\bar X\) and \(\bar Y\) be the sample means. Consider the statistic \(T = \bar X / \bar Y\).

    1. Given the sample, draw bootstrap samples of \(T\) using the nonparametric method and the parametric method with sample size \(B = 1000\).
    2. Correct the bias of \(T\) in estimating \(\theta\).
    3. Construct a 95% bootstrap percentile confidence interval for \(\theta\).
    4. Repeat the experiments 1000 times. Compare the average bias with the exact bias; compare the empirical coverage of the 95% bootstrap confidence interval with the nominal level.
  2. One goodness-of-fit diagnostic is the QQ plot. Consider a random sample of size \(n\) from \(N(\mu, \sigma^2)\). A QQ plot displays the empirical quantiles against the theoretical quantiles. For uncertainty assessment, a pointwise confidence interval constructed from simulation is often overlaid. In practice, the parameters of the distribution to be checked are unknown, and the estimation uncertainty needs to be taken into account. Let \(n = 10\), \(\mu = 0\), and \(\sigma^2 = 1\).

    1. Construct a QQ plot with pointwise confidence intervals with known \(\mu\) and \(\sigma^2\).
    2. Construct a QQ plot with pointwise confidence intervals with estimated \(\mu\) and \(\sigma^2\). The uncertainty in estimated parameters can be realized by bootstrapping.
    3. Repeat with sample size \(n \in \{20, 30\}\).

References

Chernoff, Herman, and E. L. Lehmann. 1954. “The Use of Maximum Likelihood Estimates in \(\chi^2\) Tests for Goodness of Fit.” The Annals of Mathematical Statistics 25: 579–86.

Davison, Anthony Christopher. 1997. Bootstrap Methods and Their Application. Cambridge University Press.

Heffernan, Janet E, and Jonathan A Tawn. 2004. “A Conditional Approach for Multivariate Extreme Values (with Discussion).” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 66 (3). Wiley Online Library: 497–546.

Kojadinovic, Ivan, and Jun Yan. 2012. “Goodness-of-Fit Testing Based on a Weighted Bootstrap: A Fast Large-Sample Alternative to the Parametric Bootstrap.” Canadian Journal of Statistics 40 (3): 480–500. https://doi.org/10.1002/cjs.11135.

Van Der Vaart, Aad W, and Jon A Wellner. 1996. Weak Convergence and Empirical Processes: With Applications to Statistics. Springer.