Herbert M Barber, Jr, PhD, PhD
Perhaps one of the most misunderstood terms today is sampling. The term comes up particularly during election seasons, but sampling is used in numerous fields every day. Even so, most people seem to have a poor understanding of sampling, even in its most rudimentary form.
The idea behind sampling is to allow a subset of a population to represent the population. For example, we would likely prefer to investigate a subset, or sample, from a population of ten million people rather than investigate all ten million. Consequently, ensuring that our sample is drawn such that inferences can be made across the population becomes imperative, and this is where people begin to misunderstand sampling.
There are several sampling techniques, but we roughly divide them into probabilistic sampling and non-probabilistic sampling. Probabilistic sampling withstands scientific scrutiny; non-probabilistic sampling, well, does not. So, in situations involving billions of dollars, or where the decisions rendered are of paramount importance, probabilistic sampling is the recommended approach.
Simple random sampling is the most widely recognized probabilistic sampling technique. It requires that each element within a population have an equal chance of being selected as part of the subset, or sample. The elementary example involves placing numbers in a hat, with each person drawing a number until the numbers are gone. Unfortunately, the draws in this example are not independent of one another: each number removed from the hat changes what remains for everyone who follows. Treating such draws as though they were independent introduces bias and undermines validity.
For every draw to be independent, each person drawing must have an equal chance of drawing any of the numbers in the pool, which in practice means returning each number to the hat before the next draw (sampling with replacement). For example, if there are ten persons and ten numbers in the hat, the tenth person must have an equal chance of drawing any of the ten numbers, not simply whatever is left after the others have drawn.
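As a rough illustration of the hat example, the following Python sketch contrasts drawing until the hat is empty with drawing from a full hat every time. The function names, the fixed seed, and the use of Python's standard random module are assumptions made for illustration; they are not part of the original example.

```python
import random
from collections import Counter


def draw_without_replacement(pool, k, rng):
    """Each drawn number leaves the hat, so no value can repeat."""
    return rng.sample(pool, k)


def draw_with_replacement(pool, k, rng):
    """Each number goes back into the hat, so every draw sees the full pool."""
    return rng.choices(pool, k=k)


rng = random.Random(42)      # arbitrary seed, used only for repeatability
hat = list(range(1, 11))     # ten numbers in the hat, one per person

no_replacement = draw_without_replacement(hat, 10, rng)
with_replacement = draw_with_replacement(hat, 10, rng)

print(sorted(no_replacement))  # every number appears exactly once
print(with_replacement)        # numbers may repeat or be missing

# Tally what the tenth person draws over many rounds when the hat is
# refilled before every draw; each number should come up about equally often.
tenth_draws = Counter(
    draw_with_replacement(hat, 10, rng)[-1] for _ in range(10_000)
)
print(sorted(tenth_draws.items()))
```

In the first scheme the tenth person can only take whatever single number remains; in the second, the tenth person chooses from all ten numbers, as the paragraph above requires.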
Systematic sampling requires one simply to select every nth element in the population, be it persons or widgets. Ideally, the elements should first be arranged randomly, with no defined order, and a number should be selected as the counter, or nth number. The remaining task is to draw every nth element to serve in the sample. However, the question often arises as to how to select that nth number, or interval size. Should it be every 10th element, every 100th element, etc.? While there are different ways of selecting this interval, a suggested way is to use the ratio of population size to intended sample size (nth interval = population size / intended sample size). For example, drawing a sample of 500 from a population of 10,000 gives an interval of 20, so every 20th element is selected. Of course, this raises the question of how large a sample should be to offer a reasonable level of validity.
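The interval rule above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name systematic_sample, the population of 1,000 widgets, and the sample size of 50 are hypothetical, and the random starting point within the first interval is a common convention that the text above does not specify.

```python
import random


def systematic_sample(population, sample_size, rng=None):
    """Select every nth element, with n = population size // sample size."""
    if sample_size <= 0 or sample_size > len(population):
        raise ValueError("sample_size must be between 1 and the population size")
    rng = rng or random.Random()
    # nth interval = population size / intended sample size (rounded down)
    interval = len(population) // sample_size
    # Assumption: start at a random position inside the first interval
    start = rng.randrange(interval)
    # Take every nth element from the start, trimming any extras
    return population[start::interval][:sample_size]


widgets = list(range(1, 1001))  # hypothetical population of 1,000 widgets
sample = systematic_sample(widgets, 50, random.Random(7))

print(len(sample))   # 50 elements
print(sample[:5])    # consecutive elements are 20 apart
```

With 1,000 widgets and an intended sample of 50, the interval works out to 20, so the sample consists of every 20th widget counted from the starting position.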