·
*Why sample?* Most of the time, it is not possible to observe every case of something. Instead, we select a **sample** from the total **population** of cases.
Sampling always introduces some error, or deviation from the true distribution of the population, but it is possible to estimate that error and report it with the results. The estimated error will depend on the size of the sample (the larger the sample, the smaller the error introduced by sampling). What matters is not the size of the sample in relation to the total population (i.e., the proportion of the population) but its absolute size: whether the total population is large or small, a certain sample size will be needed to obtain a specified “confidence interval.” The necessary size of the sample will depend on the use to which it will be put: the greater the variability in the responses, or the riskier a false conclusion, the larger the sample must be.

·
*Types of samples:* Some sampling schemes are so common that they have their own names:

o *Population “sample”:* Sample “everyone”

o *Simple random sample:* Cases are randomly selected from the total population. This is the “put the names in a hat” method. The modern, quantitative approach is to use a random number generator (BTW, Excel spreadsheets have a function that will do this) to select cases from a list.
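A quick sketch of the same idea in Python (the case names and counts here are invented for illustration; `random.sample` plays the part of the hat):

```python
import random

random.seed(42)  # fix the seed so the draw is reproducible
# A made-up roster standing in for the full list of cases.
population = [f"case_{i}" for i in range(1, 1001)]

# random.sample draws without replacement -- the "names in a hat" draw.
sample = random.sample(population, k=100)

print(len(sample), len(set(sample)))  # 100 100 -- no case is drawn twice
```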

o *Systematic sample:* This is the approach of picking every n-th name on each page of a phone book, for example. This will also generate a random sample, if there is no bias in the original list. It could result in an undersampling of a particular ethnic group if its members are represented by a small set of family names (e.g., a large number of Koreans share the family name “Lee,” and a large number of Hmong share the family name Vang).
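A sketch of the systematic approach in code, assuming a made-up alphabetized roster in place of the phone book:

```python
import random

random.seed(3)
# A made-up alphabetized roster standing in for the phone book.
roster = [f"name_{i:04d}" for i in range(1000)]

k = 10                       # sampling interval: take every 10th name
start = random.randrange(k)  # random starting point within the first interval
systematic_sample = roster[start::k]

print(len(systematic_sample))  # 100 names, evenly spaced through the list
```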

o *Stratified sample:* One way to ensure that groups, perhaps small in number but significant for other reasons, are included is to “stratify” the sample by subgroup, and then randomly select from within each group. If one wanted to look at the effect of, say, a proposed education policy on school districts, one might want to select an equal number of rural and urban districts on the assumption that the impact on them might be different.
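A minimal sketch of stratifying and then drawing, assuming a made-up frame of 90 districts tagged urban or rural:

```python
import random

random.seed(1)
# A made-up frame of 90 school districts, tagged urban or rural.
districts = [(f"district_{i}", "urban" if i % 3 == 0 else "rural")
             for i in range(90)]

# Group the frame by stratum...
strata = {}
for name, kind in districts:
    strata.setdefault(kind, []).append(name)

# ...then randomly select an equal number from within each stratum.
n_per_stratum = 10
stratified = {kind: random.sample(names, n_per_stratum)
              for kind, names in strata.items()}

print({kind: len(chosen) for kind, chosen in stratified.items()})
```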

o *Cluster
sample:* Sometimes you have
access to a number of groups (say, cohorts composed of people who used your
agency’s services, grouped by week for all of last year). In that case, you might choose to
“sample” the groups and use the information from all the members of
each group you sample. This is
useful when you have reason to believe that the group characteristics may be
important in themselves (perhaps your staff went through a major training
exercise midway through the year….).
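A sketch of cluster sampling under those assumptions (the weekly cohorts here are invented): sample the groups, then keep every member of each sampled group.

```python
import random

random.seed(7)
# Made-up weekly cohorts of service users for one year (52 clusters of 10).
cohorts = {week: [f"client_w{week}_{i}" for i in range(10)]
           for week in range(1, 53)}

# Sample the *groups* (weeks), then use all members of each sampled group.
sampled_weeks = random.sample(sorted(cohorts), k=8)
cluster_sample = [person for week in sampled_weeks for person in cohorts[week]]

print(len(sampled_weeks), len(cluster_sample))  # 8 weeks, 80 clients
```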

·
*Determining
Sample Size:* There
are different ways to determine sample size, depending on how you will be using
the sample (what sort of statistical test you will be using):

o *For a proportion:*

§
*n* = r(1 – r)[1.96 / (p’ – p*)]^{2}, where p* is the hypothesized value, p’ is the obtained value, and r is the true proportion underlying the observation.

§
This can be simplified to 0.25(1.96/Δ)^{2}, where Δ is the difference between the obtained and hypothesized values and the true proportion is hypothesized as .5 (this is the most restrictive assumption; as the true proportion deviates from .5, the minimum sample size decreases as well).

§
Note that sample size (n) is determined by the difference you can accept between what you expect (p*) and what you get (p’).
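The formula can be sketched as a small Python function (the function name is mine, not a standard one):

```python
def sample_size_proportion(delta, r=0.5, z=1.96):
    """Minimum n for a test of proportions: r*(1 - r)*(z/delta)^2."""
    return r * (1 - r) * (z / delta) ** 2

# The worst case is r = 0.5, which gives the 0.25*(1.96/delta)^2 shortcut;
# as r moves away from 0.5, the required n shrinks.
print(sample_size_proportion(0.1))          # about 96
print(sample_size_proportion(0.1, r=0.2))   # about 61 -- smaller, as promised
```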

o *For
an average:*

§
*n* = [1.96σ / (x* – x̄)]^{2} + 1, where x* is the hypothesized mean and x̄ is the observed mean (average).

§
This can be simplified to (1.96σ/Δ)^{2} + 1, where Δ is the difference between the observed and hypothesized means.

§ Note here that sample size (n) is determined not only by the acceptable difference between the hypothesized and obtained means, but also by the dispersion about the mean (the standard deviation, σ).
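The same formula as a Python sketch (the function name is mine), showing how both σ and Δ drive the required sample size:

```python
def sample_size_mean(sigma, delta, z=1.96):
    """Minimum n for a test of means: (z*sigma/delta)^2 + 1."""
    return (z * sigma / delta) ** 2 + 1

# Either a larger spread (sigma) or a tighter tolerance (delta) drives n up.
print(sample_size_mean(sigma=1.0, delta=0.1))   # about 385
print(sample_size_mean(sigma=2.0, delta=0.1))   # about 1538 -- roughly 4x
```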

o *Applied example*

§
Suppose you want to get an estimate of public
support for a bond referendum which is coming up in your district. How many people would you have to
survey, assuming you can accept a 10-pt. (0.1) difference between what the
survey says and what the voters are actually thinking? Suppose you can only accept a 5-pt.
(0.05) spread?

·
The calculating formula is 0.25(1.96/Δ)^{2}, so 0.25(1.96/0.1)^{2} = 96, so you would need to survey about 100 people.

·
If you need to be more precise, 0.25(1.96/.05)^{2} = 385, so you would need to survey almost 4 times as many people to get a doubling in accuracy (you are squaring the difference, remember?).
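A quick arithmetic check of those bond-referendum numbers:

```python
n_10pt = 0.25 * (1.96 / 0.1) ** 2    # 10-point spread
n_5pt = 0.25 * (1.96 / 0.05) ** 2    # 5-point spread

print(round(n_10pt), round(n_5pt))   # 96 384 (the text rounds the latter up to 385)
print(round(n_5pt / n_10pt, 6))      # 4.0: halving the spread quadruples the sample
```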

§
Suppose you want to survey the community to estimate income security in the community. Suppose you anticipate that the true average will be around 5 (on a 10-pt. index), and that the average deviation about that mean will probably be 1 pt. How many people would you have to survey if you need to be accurate within 0.1 pts? Suppose you need to be accurate within 0.05 pts?

·
The calculating formula is (1.96σ/Δ)^{2} + 1.

·
So, to be accurate within 0.1 pts, you would
need to sample 385 people; doubling your accuracy to 0.05 pts. would require 1,538 people.

§
What happens with the previous problem if the
average deviation narrows to 0.5 instead of 1 pt.?

·
With the average deviation cut in half, you would need to sample only 97 people to get within 0.1 points of the true average value, and a sample of 385 people would get you within 0.05 points.
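Checking all four of those figures at once:

```python
# Each line is (1.96 * sigma / delta)^2 + 1, rounded to the nearest person.
print(round((1.96 * 1.0 / 0.1) ** 2 + 1))    # 385
print(round((1.96 * 1.0 / 0.05) ** 2 + 1))   # 1538
print(round((1.96 * 0.5 / 0.1) ** 2 + 1))    # 97
print(round((1.96 * 0.5 / 0.05) ** 2 + 1))   # 385
```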

When I was an undergraduate, one of my professors put it this way: “Whether a finding *has* importance *is* important.”

There are a number of terms and techniques used in statistics to estimate the importance of the findings, given the quality of the data. All of these terms, taken together, are called “sensitivity measures.” Sensitivity measures the amount of “error” in a mathematical analysis. “Error,” in this sense, is a technical term meaning the amount of deviation of the observed values from the “true” (or “population”) values, due to:

·
*Constant error:* Effects which distort the measures in
one direction

·
*Random error:* Effects which obscure possible effects
(i.e., which distort the measures in all directions).

There are four terms which, taken together, determine the sensitivity of an analysis:

·
*Reliability:* The degree to which a
measure generates similar responses over time and across situations. It is mostly affected by random
error. Reliability can be tested
by:

o *Test-Retest:* Same subjects retake the test after some lapse of time and results are compared for consistency.

o *Alternate
forms:* A group of subjects take
alternate forms of the same test and the results are compared for consistency.

o *Subsample:* A sample of the subject pool is called
back for a retest, another sample is given an alternate form, and the results
of all three conditions (first form, alternate form, retest of first form) are
compared for consistency.

Each of these techniques carries its own validity problems. Retest is subject to maturation and learning problems (the subjects have had additional life experiences and have had some initial experience with the test when they sit for the retest). Alternate forms cannot answer *which* form is more accurate. Subsample shares all of these weaknesses.

·
*Validity:* Measure of the extent to which measures
accurately reflect what they are supposed to measure—to what extent are
observed differences “true”?
Validity is mostly affected by constant error. You have already considered the issue of
validity in your discussion of questionnaire design. To summarize, there are two issues of
validity:

o *Internal
validity:* How effective are the
measures within the confines of the study?

o *External validity:* Can the results obtained be generalized to the broader population? This involves questions of the representativeness of the subject group, the possibility that the selection or the measurement process might have affected the outcome (in addition to the experimental treatment), and the possibility that multiple conditions present in the study might be necessary for the experimental effect to occur.

·
*Confidence limits:* Estimate of the error that can be
expected in a measure, just due to the nature of measurement. “Confidence” is a term with
a very precise meaning in quantitative analysis, so use it in technical writing
*only *when you intend the technical meaning. It can be used to estimate the sample
size needed for a study. The
general formula is of the form:

o CL = 1.96(σ / √(n – 1))

o Try this formula with the problem above for the sample size for a test of means.

§ Inserting the values for a standard deviation of 1 and a sample of 385, the formula returns a Confidence Interval of 0.1.

§ With a standard deviation of 1 and a sample of 1,538, the Confidence Interval is 0.05.

§ Isn’t it exciting when things work out the way the theory says they should?
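A sketch of that check in Python (the function name is mine):

```python
import math

def confidence_limit(sigma, n, z=1.96):
    """CL = 1.96 * sigma / sqrt(n - 1): the .05-level margin of error."""
    return z * sigma / math.sqrt(n - 1)

print(round(confidence_limit(1, 385), 3))    # 0.1
print(round(confidence_limit(1, 1538), 3))   # 0.05
```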

·
*Significance:* Test to determine whether observed
results could be explained by chance deviation from expected (average)
values. “Significance,”
too, has a technical meaning and should only be used in technical writing with
that intention.

o Significance testing is based on the assumption of a “normal curve” (that is the infamous “bell-shaped” curve, the profile through a pile of sand that has dripped through your fingers—the distribution of objects whose interaction with each other is independent and random).

o The normal curve can be completely described by two terms: the mean (μ) and the standard deviation (σ). [I am using Greek letters here because I am referring to “ideal” parameters of a population, not those actually observed in a sample.] The mean measures the central point (the median and the mode will be at the same point in this ideal distribution). The standard deviation measures the dispersion of cases about the mean.

o Any observation can be located on an ideal normal curve through its “z-score.” Z is a number which converts the value obtained from observation into a location on the normal curve. It takes the form

§
*z* = (x
– μ) / σ

Or, the distance of the observation from the mean, divided by the dispersion (standard deviation) about the mean. Z-scores, then, are commonly used to gauge levels of significance.
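A one-line version in Python (the observation of 7 on a mean-5, sd-1 scale is made up for illustration):

```python
def z_score(x, mu, sigma):
    """Distance of observation x from the mean, in standard-deviation units."""
    return (x - mu) / sigma

# Made-up reading: an observation of 7 on a scale with mean 5 and sd 1.
z = z_score(7, 5, 1)
print(z)              # 2.0
print(abs(z) > 1.96)  # True: past the .05 cut-off discussed below
```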

o Significance levels represent a trade-off between two potential types of error—the possibility that one could deny that there is a real difference (when there really is one) and the possibility that one could accept that there is a real difference (when there really isn’t one). The relationship between the two is inverse—as one goes up, the other goes down. In the ideal, the two lines cross at the .01 level of significance; this represents the balance between affirming a false result and denying a true result. It is the level commonly used in bench science. In medical science, where the consequences of affirming a false result could be fatal, researchers tend to take a stricter stand and use a .005 level. In the social sciences, where experimental control is weaker and the data are often less precisely measured, we commonly use the .05 level.

o A “.05 level of significance” means that 5% of cases belonging to a normal distribution will fall outside of the points with z-scores of ±1.96. Any smaller deviation from the mean can be expected to occur simply by chance 95% of the time. And this explains what that “1.96” was doing in the formulas above: it was setting the bounds for the .05 level of significance.

o To use the z-score (i.e., 1.96 for the .05 level), simply multiply the sample standard deviation by 1.96. Anything “outside” that value is not likely to belong to the sample distribution.

- Develop a dataset with 3 different items, at least one based on averages and at least one based on proportions, which focus on an issue/question that will be important to you in your profession. You can use data you gathered for an earlier assignment, or get data from the US Census’ “American Fact Finder” website for the *American Community Survey* (ACS): (http://factfinder.census.gov/servlet/DTGeoSearchByListServlet?ds_name=ACS_2002_EST_G00_&_lang=en&_ts=110716884202)
- Calculate the mean and standard deviation for the averages, and calculate the standard deviation for the proportions. Use 1/4 the standard deviation as your estimate for the difference between the expected and the obtained values for each of your 3 items.
- Estimate the sample size of records (observations, cities, people, etc.) you would need to be confident that the mean you actually obtained is close to the real value.
- Write a brief memo to me as if I were your supervisor on the job, recommending a sampling frame and justifying it quantitatively.


© 2006 A.J.Filipovitch

Revised 5 October 2008