CLASSIFICATION OF ERRORS
Since the sample is only part of the whole, extrapolation inevitably leads to errors. These are of two kinds: sampling error (“random error”) and non-sampling error (“systematic error”). The latter is often called “bias,” without connoting any prejudice. Sampling error results from the luck of the draw when choosing a sample: we get a few too many units of one kind, and not enough of another. The likely impact of sampling error is usually quantified using the “SE,” or standard error. With probability samples, the SE can be estimated using (i) the sample design and (ii) the sample data.
As the “sample size” (the number of units in the sample) increases, the SE goes down, albeit rather slowly (for a simple random sample, roughly in proportion to one over the square root of the sample size). If the population is relatively homogeneous, the SE will be small: the degree of heterogeneity can usually be estimated from sample data, using the standard deviation or some analogous statistic. Cluster samples, especially those with large clusters, tend to have large SEs, although such designs are often cost-effective.
Non-sampling error is often the more serious problem in practical work, but it is harder to quantify and receives less attention than sampling error.
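As a rough illustration of how the two kinds of error behave as the sample grows, the sketch below combines them in the usual way, root-mean-square error = sqrt(bias^2 + SE^2), with the simple-random-sample approximation SE = SD / sqrt(n). The SD of 0.5 and the bias of 0.03 are made-up values chosen only for illustration, and the function names are hypothetical.

```python
import math

def standard_error(sd: float, n: int) -> float:
    """SE of a sample average under simple random sampling: SD / sqrt(n)."""
    return sd / math.sqrt(n)

def rmse(bias: float, se: float) -> float:
    """Root-mean-square error combining systematic and random error."""
    return math.sqrt(bias ** 2 + se ** 2)

# Hypothetical values: a 0/1 variable with SD 0.5 and a fixed bias of 0.03.
SD = 0.5
BIAS = 0.03

for n in (100, 400, 1600, 6400, 25600):
    se = standard_error(SD, n)
    print(f"n={n:>6}  SE={se:.4f}  RMSE={rmse(BIAS, se):.4f}")
```

Under these assumptions, quadrupling the sample size only halves the SE, and the combined error never falls below the assumed bias, however large the sample.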
Non-sampling error cannot be controlled by making the sample bigger. Indeed, bigger samples are harder to manage. Increasing the size of the sample, which is beneficial from the perspective of sampling error, may be counter-productive from the perspective of non-sampling error.
[Running head: Sampling]
Non-sampling error itself can be broken down into three main categories: (i) selection bias, (ii) non-response bias, and (iii) response bias.
We discuss these in turn.
(i) “Selection bias” is a systematic tendency to exclude one kind of unit or another from the sample. With a convenience sample, selection bias is a major issue. With a well-designed probability sample, selection bias is minimal. That is the chief advantage of probability samples.
(ii) Generally, the people who hang up on you are different from the ones who are willing to be interviewed. This difference exemplifies non-response bias. Extrapolation from respondents to non-respondents is problematic, due to non-response bias. If the response rate is high (most interviews are completed), non-response bias is minimal. If the response rate is low, non-response bias is a problem that needs to be considered. At the time of writing, U.S. government surveys that accept any respondent in the household have response rates over 95%. The best face-to-face research surveys in the U.S., interviewing a randomly-selected adult in a household, get response rates over 80%. The best telephone surveys get response rates approaching 60%. Many commercial surveys have much lower response rates, which is cause for concern.
(iii) Respondents can easily be led to shade the truth, by interviewer attitudes, the precise wording of questions, or even the juxtaposition of one question with another. These are typical sources of response bias.
Sampling error is well-defined for probability samples. Can the concept be stretched to cover convenience samples? That is debatable (see below). Probability samples are expensive, but minimize selection bias, and provide a basis for estimating the likely impact of sampling error. Response bias and non-response bias affect probability samples as well as convenience samples.
TRADING NON-RESPONDENTS FOR RESPONDENTS
Many surveys have a planned sample size: if a non-respondent is encountered, a respondent is substituted. That may be helpful in controlling sampling error, but makes no contribution whatsoever to reducing bias.
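The effect of substitution can be sketched with a small simulation. Everything here is an assumption made for illustration: the function name `run_survey`, the true support share of one half, and the response probabilities (supporters answer 80% of the time, others 40%). The sketch also counts the non-respondents encountered along the way.

```python
import random

def run_survey(planned_n, p_support, p_respond_yes, p_respond_no, rng):
    """Contact random people until planned_n respondents are obtained.

    Every non-respondent is replaced by a fresh random draw (substitution).
    Returns (estimated support share, number of non-respondents encountered).
    """
    respondents_yes = 0
    respondents = 0
    nonrespondents = 0
    while respondents < planned_n:
        supports = rng.random() < p_support
        p_respond = p_respond_yes if supports else p_respond_no
        if rng.random() < p_respond:
            respondents += 1
            respondents_yes += supports
        else:
            nonrespondents += 1  # a substitute is drawn on the next pass
    return respondents_yes / planned_n, nonrespondents

rng = random.Random(0)  # fixed seed so the sketch is repeatable
estimate, refusals = run_survey(
    planned_n=2000, p_support=0.5, p_respond_yes=0.8, p_respond_no=0.4, rng=rng
)
print(f"true share 0.500, estimate {estimate:.3f}, non-respondents {refusals}")
```

With these made-up rates the sample always reaches its planned size, but the share among respondents is centered on 0.5 x 0.8 / (0.5 x 0.8 + 0.5 x 0.4) = 2/3 rather than 1/2. Substitution fixes the count, not the bias.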
If the survey is going to extrapolate from respondents to non-respondents, it is imperative to know how many non-respondents were encountered.
HOW BIG SHOULD THE SAMPLE BE?
There is no definitive statistical answer to this familiar question. Bigger samples have less sampling error. On the other hand, smaller samples may be easier to manage, and have less non-sampling error.
[Running head: David A. Freedman]
Bigger samples are more expensive than smaller ones: generally, resource constraints will determine the sample size. If a pilot study is done, it may be possible to judge the implications of sample size for accuracy of final estimates.
The size of the population is seldom a determining factor, provided the focus is on relative errors. For example, the percentage breakdown of the popular vote in a U.S. presidential election (with 200 million potential voters) can be estimated reasonably well by taking a sample of several thousand people. Of course, choosing a sample from 200 million people all across the U.S. is a lot more work than sampling from a population of 200,000 concentrated in Boise, Idaho.
STRATIFICATION AND WEIGHTS
Often, the sampling frame will be partitioned into groupings called “strata,” with simple random samples drawn independently from each stratum. If the strata are relatively homogeneous, there is a gain in statistical efficiency. Other ideas of efficiency come into play as well. If we sample blocks in a city, some will be sparsely populated. To save interviewer time, it may be wise to sample such blocks at a lower rate than the densely-populated ones.
If the objective is to study determinants of poverty, it may be advantageous to over-sample blocks in poorer neighborhoods.
If different strata are sampled at different rates, analytic procedures must take sampling rates into account. The “Horvitz-Thompson” estimator, for instance, weights each unit according to the inverse of its selection probability.
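A minimal sketch of that weighting, using the city-blocks setting as a stand-in. The block counts, population sizes, and sampling rates are invented for illustration, and `horvitz_thompson_total` is a hypothetical helper name. With a simple random sample drawn without replacement inside each stratum, a block's selection probability is taken to be (blocks sampled in the stratum) / (blocks in the stratum).

```python
import random

def horvitz_thompson_total(values, probs):
    """Estimate a population total: each sampled value is weighted by
    1 / (its probability of being selected into the sample)."""
    return sum(y / p for y, p in zip(values, probs))

# Hypothetical frame: 900 sparsely populated blocks (about 10 residents each)
# and 100 densely populated blocks (about 200 residents each).
rng = random.Random(1)
sparse = [rng.randint(5, 15) for _ in range(900)]
dense = [rng.randint(150, 250) for _ in range(100)]
true_total = sum(sparse) + sum(dense)

# Assumed design: simple random sample within each stratum,
# 5% of the sparse blocks and 50% of the dense blocks.
n_sparse, n_dense = 45, 50
sample_sparse = rng.sample(sparse, n_sparse)
sample_dense = rng.sample(dense, n_dense)

values = sample_sparse + sample_dense
probs = [n_sparse / len(sparse)] * n_sparse + [n_dense / len(dense)] * n_dense

ht_total = horvitz_thompson_total(values, probs)
# Ignoring the weights treats the sample as if it were self-weighting.
naive_total = (sum(values) / len(values)) * (len(sparse) + len(dense))

print(f"true total      {true_total}")
print(f"HT estimate     {ht_total:.0f}")
print(f"unweighted est. {naive_total:.0f}")
```

Because the dense blocks are heavily over-sampled in this made-up design, the unweighted figure comes out several times too large, while the weighted one lands close to the true total.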
This estimator is unbiased, although its variance may be high. Failure to use proper weights generally leads to bias, which may be large in some circumstances. (With convenience samples, there may not be a convincing way to control bias by using weights.) An estimator based on a complex design will often have a larger variance than the corresponding estimator based on a simple random sample of the same size: clustering is one reason, variation in weights is another. The ratio of the two variances (complex design over simple random sample) is called “the design effect.”
RATIO AND DIFFERENCE ESTIMATORS
Suppose we have to audit a large population of claims to determine their total audited value, which will be compared to the “book value.” Auditing the whole population is too expensive, so we take a sample. A relatively large percentage of the value is likely to be in a small percentage of claims. Thus, we may over-sample the large claims and under-sample the small ones, adjusting later by use of weights. For the moment, however, let us consider a simple random sample.
Suppose we take the ratio of the total audited value in the sample claims to the total book value of those same sample claims, then multiply by the total book value of the population. This is a “ratio estimator” for the total audited value of all claims in the population. Ratio estimators are biased, because their denominators are random: but the bias can be estimated from the data, and is usually offset by a reduction in sampling variability. Ratio estimators are widely used.
Less familiar is the “difference estimator.” In our claims example, we could take the difference between the audited value and book value for each sample claim. The sample average (dollars per claim) could then be multiplied by the total number of claims in the population. This estimator for the total difference between audited and book value is unbiased, and is often competitive with the ratio estimator.
Ratio estimators and the like depend on having additional information about the population being sampled. In our example, we need to know the number of claims in the population, and the book value for each; the audited value would be available only for the sample. For stratification, yet other information about the population would be needed. We might use the number of claims and their book value, for several different strata defined by size of claim.
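The ratio and difference estimators for the claims example can be sketched as follows. The population is synthetic (5,000 claims with lognormal book values, audited at 70% to 100% of book), the sample is a simple random sample of 200 claims, and the function names are hypothetical. None of these figures come from a real audit.

```python
import random

def ratio_estimate(sample_audited, sample_book, population_book_total):
    """(sample audited total / sample book total) * population book total."""
    return sum(sample_audited) / sum(sample_book) * population_book_total

def difference_estimate(sample_audited, sample_book, population_size):
    """Mean (audited - book) per sampled claim, scaled up by the number of
    claims; this estimates the total difference, not the total value."""
    diffs = [a - b for a, b in zip(sample_audited, sample_book)]
    return sum(diffs) / len(diffs) * population_size

# Hypothetical population: book values are known for every claim;
# audited values would in practice be known only for the sample.
rng = random.Random(2)
book = [round(rng.lognormvariate(6, 1), 2) for _ in range(5000)]
audited = [round(b * rng.uniform(0.7, 1.0), 2) for b in book]

idx = rng.sample(range(len(book)), 200)  # simple random sample of claims
s_book = [book[i] for i in idx]
s_audited = [audited[i] for i in idx]

book_total = sum(book)
est_total = ratio_estimate(s_audited, s_book, book_total)
est_diff = difference_estimate(s_audited, s_book, len(book))

print(f"true audited total     {sum(audited):.0f}")
print(f"ratio estimate         {est_total:.0f}")
print(f"true total difference  {sum(audited) - book_total:.0f}")
print(f"difference estimate    {est_diff:.0f}")
print(f"implied audited total  {book_total + est_diff:.0f}")
```

The ratio estimator needs the population book total, and the difference estimator needs the number of claims, which is the additional information the paragraph above refers to.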
Stratification improves accuracy when there is relevant additional information about the population.
COMPUTING THE STANDARD ERROR
With simple random samples, the sample average is an unbiased estimate of the population average, assuming that response bias and non-response bias are negligible. The SE for the sample average is generally well approximated by the SD of the sample, divided by the square root of the sample size. With complex designs, there is no simple formula for variances; procedures like “the jackknife” may be used to get approximate variances. (The SE is the square root of the variance.) With non-linear statistics like ratio estimators, the “delta method” can be used.
THE SAMPLING DISTRIBUTION
We consider probability samples, setting aside response bias and non-response bias. An estimator takes different values for different samples (“sampling variability”); the probability of taking on any particular value can, at least in principle, be determined from the sample design. The probability distribution for the estimator is its “sampling distribution.” The expected value of the estimator is the center of its sampling distribution, and the SE is the spread. Technically, the “bias” in an estimator is the difference between its expected value and the true value of the estimand.
SOME EXAMPLES
In 1936, Franklin Delano Roosevelt ran for his second term, against Alf Landon.
Most observers expected FDR to swamp Landon, but not the Literary Digest, which predicted that FDR would get only 43% of the popular vote. (In the election, FDR got 62%.) The Digest prediction was based on an enormous sample, with 2.4 million respondents. Sampling error was not the issue. The problem must then be non-sampling error, and to find its source, we need to consider how the sample was chosen.
The Digest mailed out 10 million questionnaires and got 2.4 million replies, leaving ample room for non-response bias. Moreover, the questionnaires were sent to people on mailing lists compiled from car ownership lists and telephone directories, among other sources. In 1936, cars and telephones were not as common as they are today, and the Digest mailing list was overloaded with people who could afford what were luxury goods in the depression era. That is selection bias.
We turn now to 1948, when the major polling organizations (including Gallup and Roper) tapped Dewey, rather than Truman, for the presidency. According to one celebrated headline,
DEWEY AS GOOD AS ELECTED, STATISTICS CONVINCE ROPER.
The samples were large: tens of thousands of respondents. The issue was non-sampling error, the problem being with the method used to choose the samples. That was “quota sampling.” Interviewers were free to choose any subjects they liked, but certain numerical quotas were prescribed. For instance, one interviewer had to choose 7 men and 6 women; of the men, 4 had to be over 40 years of age; and so forth.
Quotas were set so that, in the aggregate, the sample closely resembled the population with respect to gender, age, and other control variables. But the issue was, who would vote for Dewey? Within each of the sample categories, some persons were more likely than others to vote Republican. No quota could be set on likely Republican voters, their number being unknown at the time of the survey. As it turns out, the interviewers preferred Republicans to Democrats, not only in 1948 but in all previous elections where the method had been used.
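How a quota sample can match the population on its control variables and still miss on the vote can be sketched with a toy simulation. The four quota cells, their sizes, the Republican share in each, and the "preference" parameter are all invented for illustration. The model of interviewer discretion (a Democrat who is approached gets interviewed only with probability 1/preference) is an assumption, not a description of how the 1948 interviewers actually behaved.

```python
import random

rng = random.Random(3)

# Hypothetical population: four quota cells with made-up sizes and
# made-up shares of Republican voters in each cell.
cells = {
    ("men", "over 40"):    {"size": 20000, "rep_share": 0.50},
    ("men", "under 40"):   {"size": 30000, "rep_share": 0.40},
    ("women", "over 40"):  {"size": 22000, "rep_share": 0.48},
    ("women", "under 40"): {"size": 28000, "rep_share": 0.42},
}
total = sum(c["size"] for c in cells.values())
true_share = sum(c["size"] * c["rep_share"] for c in cells.values()) / total

def fill_quota(quota, rep_share, preference):
    """Fill one cell's quota. The interviewer approaches people at random but
    accepts a Democrat only with probability 1/preference, so preference=1
    means no favoritism. Returns the number of Republicans interviewed."""
    republicans = 0
    interviewed = 0
    while interviewed < quota:
        is_rep = rng.random() < rep_share
        if is_rep or rng.random() < 1 / preference:
            interviewed += 1
            republicans += is_rep
    return republicans

def quota_sample_share(sample_size, preference):
    """Republican share in a quota sample whose cells match the population."""
    reps = 0
    n = 0
    for c in cells.values():
        quota = round(sample_size * c["size"] / total)  # matches demographics
        reps += fill_quota(quota, c["rep_share"], preference)
        n += quota
    return reps / n

print(f"true Republican share     {true_share:.3f}")
print(f"quota sample, no tilt     {quota_sample_share(10000, 1.0):.3f}")
print(f"quota sample, slight tilt {quota_sample_share(10000, 1.2):.3f}")
```

In both runs the sample has exactly the right mix of men, women, and age groups. Only the choice of individuals within each cell differs, and under these assumptions a modest tilt is enough to move the estimate by several percentage points, far more than the sampling error at this sample size.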
Interviewer preference for Republicans is another example of selection bias. In 1936, 1940, and 1944, Roosevelt won by substantial margins: selection bias in the polls did not affect predictions by enough to matter. But the 1948 election was a much closer contest, and selection bias tilted the balance in the polls. Quota sampling looks reasonable: it is still widely used. Since 1948, however, the advantages of probability sampling should be clear to all.
Our final example is a proposal to adjust the U.S. census. This is a complicated topic, but in brief, a special sample survey (“Post Enumeration Survey”) is done after the census, to estimate error rates in the census. If error rates can be estimated with sufficient accuracy, they can be corrected. The Post Enumeration Survey is a stratified block cluster sample, along the lines described above. Sample sizes are huge (700,000 people in 2000), and sampling error is under reasonable control.
Non-sampling error, however, remains a problem, relative to the small errors in the census that need to be fixed. For discussion from various perspectives, see Imber (2001). Also see Freedman and Wachter (2003).
SUPERPOPULATION MODELS
Samples of convenience are often analyzed as if they were simple random samples from some large, poorly-defined parent population. This unsupported assumption is sometimes called the “super-population model.” The frequency with which the assumption has been made in the past does not provide any justification for making it again, and neither does the grandiloquent name. Assumptions have consequences, and should only be made after careful consideration: the problem of induction is unlikely to be solved by fiat.
For discussion, see Berk and Freedman (1995).
An SE for a convenience sample is best viewed as a de minimis error estimate: if this were (contrary to fact) a simple random sample, the uncertainty due to randomness would be something like the SE. However, the calculation should not be allowed to divert attention from non-sampling error, which remains the primary concern. (The SE measures sampling error, and generally ignores bias.)
SOME PRACTICAL ADVICE
Survey research is not easy; helpful advice will be found in the references below. Much attention needs to be paid in the design phase. The research hypotheses should be defined, together with the target population. If people are to be interviewed, the interviewers need to be trained
From:
http://216.239.41.104/search?q=cache:UxKhkqkAyLYJ:stat-www.berkeley.edu/~census/sample.pdf+small+sampling+extrapolation+to+larger+population&hl=en
Mroop, surely someone as sophisticated and urbane as yourself has encountered the notion that extrapolating conclusions from a small sample to the larger aggregate can, and certainly often DOES, lead to errors.
For example, the "vast majority" of North Americans, at this time, are white.
So, if you were to accept that this "vast majority" of North Americans is white, and go and randomly select pedestrians (every fourth pedestrian walking north on a certain street) in HARLEM... I would tend to believe you might not get the same proportion of whites to non-whites that you would if you selected people walking in downtown Salt Lake City on December 23rd.
I think that making a statement that the vast majority of file sharers know ANYTHING might not qualify as good scientific rigor, my friend.
I think I probably took as many statistics courses as you, but you went the way of the law, and I went the way of the healing arts.