Bootstrap How-to

Posted on May 2, 2012


To find an interval estimate for a numerical quantity such as the mean height of all 17-year-old males, you can use a technique called the bootstrap. This page gives you instructions and background.

  1. Start with a collection of data. You’re going to generalize from this dataset.
  2. Make a measure for the thing you want to estimate, e.g., the mean of the height.
  3. Right-click the collection and make a sample.
  4. Inspect the new collection. Use its Sample panel. Change the number being sampled to the number of cases in the original collection. (So if the original collection had 52 cases, the sample will also have 52 cases.)
  5. Make sure that with replacement is checked.
  6. Collect measures from the sample (as many as practical, e.g., 1000). Notice that this is from the sample, not from the original collection. The name of the measures collection will be something like Measures from Sample of Collection1.
  7. Plot the measure and look at its (probably bell-shaped) distribution.
  8. Put the 5th and 95th percentile lines on the plot. Use Plot Value with the formula percentile(95, ), then do the same with percentile(5, ).
  9. Read off the interval! The values between the two lines make up a 90% interval estimate.
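
For readers who want to see the same procedure outside the point-and-click steps, here is a rough Python sketch of steps 1 through 9. It is only an illustration, not what the software does internally: the function name `bootstrap_interval`, its parameters, and the `heights` values are all made up for this example.

```python
import random
import statistics

def bootstrap_interval(data, measure=statistics.mean, n_resamples=1000,
                       lower=5, upper=95, seed=None):
    """Percentile bootstrap interval for `measure` (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(data)
    measures = []
    for _ in range(n_resamples):
        # Steps 3-5: sample WITH replacement, same size as the original data.
        sample = rng.choices(data, k=n)
        # Step 6: collect the measure from each sample.
        measures.append(measure(sample))
    # Step 8: quantiles(n=100) returns 99 cut points; index p-1 is the
    # p-th percentile.
    cuts = statistics.quantiles(measures, n=100, method="inclusive")
    return cuts[lower - 1], cuts[upper - 1]

# Made-up heights in cm, for illustration only (not real data).
heights = [168, 172, 175, 169, 181, 177, 174, 170, 179, 183,
           166, 176, 178, 171, 173, 180, 167, 175, 182, 174]

# Step 9: read off the interval.
low, high = bootstrap_interval(heights, seed=1)
print(f"90% bootstrap interval for the mean: {low:.1f} to {high:.1f}")
```

Passing a different `measure` (for example `statistics.median`) gives an interval for that quantity instead, just as making a different measure in step 2 would.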

