Variation of Recatch Example

Originally posted on Monday, July 13, 2009 2:13:07 PM.

“I would love to see an example following upon the idea of estimating the population of all prospective clients which uses similar sampling method as recatching example. Could you do it for me?

Best regards”


We might need more details to work out the specific mechanics of this one, but we can discuss the concept. First, it is worth pointing out that the recatch example is just a way of using two independent sampling methods and comparing the overlap. In the case of the fish in the lake, the sampling methods were sequential (one was done after the other) and the overlap of the samples was determined by the tags placed on the first sample of fish. When the second sample of fish was gathered, the proportion of that sample with tags showed how much of it had also been caught in the first sample. If both samples are random, that proportion should be roughly the same as the first sample's share of the whole population. From this and knowledge of each sample size, the entire population could be estimated.
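To make the arithmetic concrete, here is a minimal sketch in Python of the two-sample estimate (usually called the Lincoln-Petersen estimator). The function name and the numbers in the usage line are illustrative assumptions, not figures from the question above.

```python
def estimate_population(first_sample_size, second_sample_size, overlap):
    """Two-sample estimate of total population size.

    Reasoning: the tagged share of the second sample should be close to
    the first sample's share of the whole population, so
        overlap / second_sample_size ~= first_sample_size / population
    and solving for population gives the formula below.
    """
    if overlap <= 0:
        raise ValueError("need at least one individual found in both samples")
    return first_sample_size * second_sample_size / overlap


# Illustrative numbers only: tag 1,000 fish, later catch 1,000 fish,
# and find that 50 of the second catch carry tags.
print(estimate_population(1000, 1000, 50))  # 20000.0
```

With only a handful of individuals in the overlap, the estimate swings widely from one count to the next, so treat a single number like this as a rough range rather than a precise figure.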

But we don’t have to think of this as sequential sampling where the first sample leaves a mark of some kind (e.g., the tags on the fish) so that we can see the overlap in the second sample. We can also run the samples at the same time, as long as we can identify individuals. People are simple enough to identify (since they have names, unique email addresses, etc.), so we don’t have to “tag” them between samples. (This is convenient, since I find that people rarely sit still while I try to apply the tag gun to their earlobe.)

So if we had two independent sources attempt to identify prospects out of a population pool, we could estimate the size of the prospect population. If two independent teams were using two different methods (perhaps two different phone surveyors or two different teams surveying people in malls), and if identifying information were captured for each respondent, then the two teams could compare notes after the survey and determine how many individuals came up in both surveys.
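Comparing notes can be as simple as intersecting two lists of identifiers. The sketch below assumes each team recorded an email address for every prospect it found; the addresses, the `normalize` helper, and the function name are all hypothetical, made up for illustration.

```python
def normalize(email):
    """Treat addresses that differ only in case or stray spaces as one person."""
    return email.strip().lower()


def estimate_prospects(survey_a, survey_b):
    """Return (population estimate, overlap count) from two lists of identifiers."""
    a = {normalize(e) for e in survey_a}
    b = {normalize(e) for e in survey_b}
    both = a & b  # individuals who turned up in both surveys
    if not both:
        raise ValueError("no overlap between the surveys; cannot estimate")
    return len(a) * len(b) / len(both), len(both)


# Hypothetical survey results (example.com addresses are placeholders).
phone_survey = ["ann@example.com", "bob@example.com", "Cy@Example.com ",
                "dee@example.com", "eli@example.com", "fay@example.com"]
mall_survey = ["cy@example.com", "gus@example.com", "FAY@example.com",
               "hal@example.com", "ivy@example.com"]

estimate, overlap = estimate_prospects(phone_survey, mall_survey)
print(overlap, estimate)  # 2 15.0
```

Real surveys would find far more people than this, and the matching step matters: if the same person gives a different address to each team, the overlap is undercounted and the estimate comes out too high.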

The trick would be to find sampling methods that were truly independent of each other and that sampled evenly across the target population. If the population were “prospects in the city of Houston” and the sampling methods were mall surveys, then we should consider the possibility that not all prospects are equally likely to visit malls. If both survey methods were biased in the same way (tending to sample the same small subset of the target population), then the overlap would be inflated and the “recatch” method would underestimate the population size. If we used two completely different sampling methods (one mall survey and one phone survey) and the two methods were biased in a way that made prospects found by one method less likely to be found by the other, then the overlap would be too small and the method would overestimate the total population.
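The direction of each bias is easy to check with a little expected-value arithmetic. The sketch below splits a population of 10,000 prospects into two equal segments (say, frequent mall visitors and everyone else) and assumes a chance of being reached by each method for each segment. Every rate here is a made-up assumption, and within a segment the two methods are assumed to be independent.

```python
def expected_estimate(segments):
    """Expected two-sample estimate for a population made of segments.

    Each segment is (size, p_a, p_b): the number of prospects in it and the
    chance that method A and method B each reach any one of them.
    """
    n_a = sum(size * p_a for size, p_a, p_b in segments)         # expected size of sample A
    n_b = sum(size * p_b for size, p_a, p_b in segments)         # expected size of sample B
    both = sum(size * p_a * p_b for size, p_a, p_b in segments)  # expected overlap
    return n_a * n_b / both


# True population is 10,000 in every scenario; all rates are hypothetical.
unbiased = expected_estimate([(5000, 0.10, 0.10), (5000, 0.10, 0.10)])
same_bias = expected_estimate([(5000, 0.18, 0.18), (5000, 0.02, 0.02)])
opposite_bias = expected_estimate([(5000, 0.18, 0.02), (5000, 0.02, 0.18)])

for label, value in [("unbiased", unbiased),
                     ("both methods favor the same segment", same_bias),
                     ("methods favor opposite segments", opposite_bias)]:
    print(f"{label}: about {value:,.0f}")
```

Under these assumed rates, the unbiased case recovers the true 10,000, the shared-bias case comes in at roughly 6,100, and the opposite-bias case at roughly 27,800, even though each method finds about 1,000 prospects in all three scenarios.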

As you can see, there are many variations on this method and each has its challenges. The error could be high but, as I point out in the book, if the result tells you more than you knew before, then it can still be a useful measurement.


Doug Hubbard