In an earlier post, we talked about how to do statistical tests and estimates using summary statistics.

But sometimes you start with a one- or two-way table of counts, and you want the actual collection of data. Here’s one way to get it. It’s a workaround, but it’s reasonably quick:

1. Figure out how many *different* cases you need. It’s usually the number of cells in the table. For example, suppose you want to reproduce the data in the table shown. There are 12 different cases, just different numbers of each one.
2. Pick the first type of case. You’d make two columns, **Sex** and **Marital**, and one case: **Male**, **MarriedP**.
3. Make a summary table and put both attributes on it. You’ll only see one case, but that will change.
4. Select the single case and Copy it.
5. Paste it, repeatedly, until you have 10 of them.
6. Select all ten. Copy them.
7. Paste until you have 110. Of course, you’re watching the summary table from step 3 in order to tell if you’re done.
8. Make the next case: **Female**, **MarriedP**.
9. Repeat steps 4–7 until you have 114. (At the end, when you have 110, you’ll need to copy and paste 4 more.)
10. Do the same for the other types of case.
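If you'd rather build the cases outside Fathom and import them, the copy-and-paste steps above amount to repeating each type of case as many times as its cell count. Here is a minimal sketch in plain Python, not a Fathom feature. The function name `expand_counts` is made up for illustration, and only the two cells whose counts appear in the steps are filled in; the other cells of the table would be added the same way.

```python
from collections import Counter

# Counts per (Sex, Marital) cell. Only the two cells mentioned in the
# steps are listed here; add the remaining cells of your own table.
cell_counts = {
    ("Male", "MarriedP"): 110,
    ("Female", "MarriedP"): 114,
}


def expand_counts(counts):
    """Return one row per counted case, like pasting each case n times."""
    rows = []
    for (sex, marital), n in counts.items():
        rows.extend({"Sex": sex, "Marital": marital} for _ in range(n))
    return rows


cases = expand_counts(cell_counts)

# This tally plays the role of the summary table: it should match the
# counts you started from.
tally = Counter((row["Sex"], row["Marital"]) for row in cases)
```

Comparing `tally` with `cell_counts` is the same check as watching the summary table while you paste.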

I recently used this to make a “population” from which we polled. Every student got an identical population of 10,000 voters, and did sampling to make polls of various numbers of voters. Note that Fathom won’t copy more than 5,000 cases at a time.
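To make the polling idea concrete, here is a rough sketch of the same activity in plain Python rather than Fathom. The candidate labels and the 5,200/4,800 split are hypothetical (the real population isn't given here), and `make_population` and `take_poll` are names invented for this example.

```python
import random
from collections import Counter


def make_population(counts):
    """Build a list with one entry per voter from a dict of counts."""
    population = []
    for label, n in counts.items():
        population.extend([label] * n)
    return population


def take_poll(population, size, rng):
    """Sample `size` voters without replacement and tally their answers."""
    return Counter(rng.sample(population, size))


# Hypothetical split of the 10,000 voters.
population = make_population({"Candidate A": 5200, "Candidate B": 4800})

rng = random.Random(2012)  # fixed seed so every run gives the same polls

# Polls of various sizes drawn from the same population.
polls = {size: take_poll(population, size, rng) for size in (100, 400, 1000)}
```

Because everyone starts from an identical population, differences between polls come only from sampling.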


Posted in:

*How-to, work-arounds*
May 27th, 2012 → 9:53 am
