ANOVA (Analysis of Variance): Similar to a t-test, which determines
whether the difference between two group means is statistically significant,
ANOVA is a test that allows comparison among multiple
groups. Multiple t-tests are less desirable here because,
as the number of groups increases, the number of needed
comparisons grows quickly. For instance, for seven groups,
there are twenty-one pairwise t-tests. If we test twenty-one
pairs at the usual 95 percent confidence level, it would not
be surprising for at least one pair to appear significant by
chance alone, since each test has a 5 percent chance of
producing a false positive.
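The arithmetic behind the seven-group example can be sketched as follows. This is a rough illustration that assumes the pairwise tests are independent, which real t-tests on shared groups are not; the function names are hypothetical.

```python
from math import comb


def pairwise_tests(groups: int) -> int:
    """Number of distinct pairs of groups, i.e. t-tests needed."""
    return comb(groups, 2)


def familywise_error_rate(tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across `tests` comparisons.

    Assumes the tests are independent, which is only an approximation.
    """
    return 1.0 - (1.0 - alpha) ** tests


m = pairwise_tests(7)
print(m)                                   # 21
print(round(familywise_error_rate(m), 3))  # 0.659
```

Under that independence assumption, roughly two times in three at least one of the twenty-one tests would look significant even if no group truly differed.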
Bivariate Correlation: A measure of association (strength)
or relationship between two variables. Correlations
are useful when researchers want to compare the attitudes,
beliefs, or behavior of different groups.
Bonferroni Comparisons:
A method that, like ANOVA, allows many comparisons.
The Bonferroni correction is a statistical adjustment
for multiple comparisons that guards against false positives,
typically by dividing the significance level by the number
of comparisons.
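A minimal sketch of the Bonferroni adjustment, assuming the common form in which the significance level is divided by the number of comparisons. The p-values below are hypothetical and used only for illustration.

```python
def bonferroni_threshold(alpha: float, comparisons: int) -> float:
    """Per-comparison significance level after the Bonferroni correction."""
    return alpha / comparisons


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag which p-values stay significant at the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Hypothetical p-values from four comparisons.
p_values = [0.001, 0.02, 0.04, 0.30]
print(bonferroni_threshold(0.05, len(p_values)))  # 0.0125
print(significant_after_bonferroni(p_values))     # [True, False, False, False]
```

Two of the hypothetical results (0.02 and 0.04) would pass an uncorrected 0.05 cutoff but not the corrected one.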
CATI (Computer-Assisted Telephone Interviewing): An interviewing method
in which questions appear on a computer screen and answers
are entered directly into the computer. The method reduces
interviewers' clerical errors and speeds up data processing
(Bradburn and Sudman, 1988).
Cronbach's Alpha: Cronbach's alpha is a test for a model
or survey's internal consistency. Sometimes called a
"scale reliability coefficient."
Disproportionate Stratified Sampling Design: A method used to
reach a higher proportion of a target group efficiently
while still representing those in the target group who
live in areas with a lower density of that group.
The disproportionate stratified sample
provides a highly accurate sampling frame, thereby reducing
the cost per effective interview. Typically, all telephone
exchanges within a target area are listed in descending
order by concentration of the target population. Exchanges
are then divided into strata based on the incidence
of the target population. Each stratum generally contains
the same number of target population households. For
example, roughly 25 percent of households served by
telephone exchanges with the highest incidence are placed
in the first stratum, followed by those with the next
largest incidence, and so on, with a fourth stratum
containing the 25 percent with the lowest incidence.
At this point, most sampling designs employ
an optimal allocation scheme. This "textbook"
approach allocates interviews to a stratum proportionate
to the number of target population households, but inversely
proportionate to the square root of the relative cost,
the relative cost in this situation being a simple function
of the incidence. As such, the number of completed interviews
increases as you move from a lower incidence stratum
to higher incidence strata. This is a known, formulaic
approach to allocation that provides a starting point
for discussions of sample allocation and associated
costs.
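The allocation rule described above can be sketched as follows. The source says only that relative cost is "a simple function of the incidence"; this sketch assumes cost is inversely proportional to incidence, and the incidence figures, household counts, and function name are hypothetical.

```python
from math import sqrt


def optimal_allocation(total_interviews, households, relative_costs):
    """Allocate interviews proportional to N_h / sqrt(c_h) per stratum."""
    scores = [n / sqrt(c) for n, c in zip(households, relative_costs)]
    total = sum(scores)
    return [total_interviews * s / total for s in scores]


# Hypothetical four strata, highest incidence first, each holding the
# same number of target population households.
incidences = [0.40, 0.20, 0.10, 0.05]
households = [25_000] * 4

# Assumption: cost per completed interview is proportional to 1 / incidence.
relative_costs = [1.0 / p for p in incidences]

allocation = optimal_allocation(1000, households, relative_costs)
for incidence, interviews in zip(incidences, allocation):
    print(f"incidence {incidence:.2f}: {interviews:.0f} interviews")
```

With equal household counts per stratum, the allocation under this cost assumption is proportional to the square root of the incidence, so higher incidence strata receive more interviews.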
Sample generation within each defined
stratum uses a strict EPSEM sampling
procedure, giving every telephone number in that stratum
an equal probability of selection. Across strata, however,
numbers in higher incidence strata are more likely
to be dialed, and numbers in the lowest incidence
stratum are least likely to be dialed. This procedure
can double, or even triple, the incidence of reaching
a target household compared with the general RDD
incidence of that target population. The disproportionality
of the sampling scheme is later taken into account with
weighting, which balances the sample back to the
population's true parameters.
This process does have one principal cost:
its impact on the design effect of the study. Simply
stated, the design effect is the measure of the precision
that is lost in any complex probability design, compared
to what the precision would have been had the study
been conducted using simple RDD methodology. Any stratified
or other complex sampling design, "pound for pound,"
will increase the standard errors of all estimates.
This loss can also be expressed as the number of effective
interviews, which is the number of unweighted interviews
divided by the design effect.
Thus, the larger the design effect, the
smaller the number of effective interviews, and therefore
the larger the standard errors associated with the study.
The size of the design effect in this type of study
is determined by the amount of disproportionality introduced
into the design. A design that roughly doubles the incidence
of the target population will typically create a design
effect of somewhere around 1.5, a small price to
pay considering the cost savings associated with a doubling
of survey incidence.
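The relationship between design effect and effective interviews can be sketched as follows. The source defines effective interviews as unweighted interviews divided by the design effect but gives no formula for the design effect itself; Kish's approximation based on the weights is used here as an assumption, and the weights are hypothetical.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect due to unequal weights:
    n * sum(w^2) / (sum(w))^2. Equal weights give exactly 1.0.
    """
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def effective_interviews(unweighted_interviews, design_effect):
    """Number of unweighted interviews divided by the design effect."""
    return unweighted_interviews / design_effect


# A design effect of 1.5 on 1,000 interviews, as in the text's example.
print(round(effective_interviews(1000, 1.5)))  # 667

# Hypothetical weights: half the sample weighted 1, half weighted 3.
weights = [1.0] * 500 + [3.0] * 500
print(kish_design_effect(weights))  # 1.25
```

A study with 1,000 completed interviews and a design effect of 1.5 therefore has about the precision of a simple sample of roughly 667.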
For an example of disproportionate
stratified sampling in action, see the methodology
page of the 2004 National Public Radio/Kaiser Family
Foundation/Kennedy School of Government Immigration
Survey. Thanks to David Dutwin and Melissa Herrmann,
ICR/International Communications Research, for this
discussion of disproportionate stratified sampling design.
EPSEM Samples: EPSEM samples are probability
samples where each observation in the population
has the same known probability of being selected into
the sample (EPSEM stands for equal probability of selection
method; see Kish, 1965, for a comprehensive discussion of sampling
techniques). EPSEM samples have certain desirable properties;
for example, the simple formulas for computing means,
standard deviations, and so on can be applied to estimate
the respective parameters in the population.
General Social Survey (GSS): The General
Social Survey, conducted since 1972 by the National
Opinion Research Center (NORC), consists of a large
variety of important social indicators. Many of the
questions have been asked for a number of years, which
makes the GSS useful for measuring trends. Moreover,
the large number of interviews in the cumulative dataset
makes it possible to learn about the attitudes and beliefs
of small demographic groups.
Independent Samples T-Tests: An independent samples t-test
is used when you want to compare the means on a dependent
variable (e.g., SAT score) for two independent groups
(e.g., men and women).
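As a rough illustration, the test statistic for two independent groups can be computed as below. This sketch assumes the classic pooled-variance (equal variances) form of the test; the scores are hypothetical, and turning the statistic into a p-value would still require a t distribution table or a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for an independent samples t-test,
    assuming both groups share a common variance.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical test scores for two independent groups.
scores_a = [510, 540, 580, 600, 620]
scores_b = [500, 520, 560, 590, 630]
t, df = pooled_t_statistic(scores_a, scores_b)
print(f"t = {t:.3f} with {df} degrees of freedom")
```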
Leaner Question:
A follow-up question used to encourage initially undecided
respondents to choose between alternatives, usually
political candidates.
Linear Regression:
A method for estimating the conditional expected value of
one ("dependent") variable given the values
of some other ("independent") variable(s).
For instance, if we want to determine the relationship
between height and weight for a sample of people, linear
regression attempts to explain the relationship with
a straight line fit to the data.
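A minimal sketch of fitting such a line by ordinary least squares with one independent variable. The height and weight figures are hypothetical, and the function name is an illustration rather than any particular library's API.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical heights (inches) and weights (pounds).
heights = [62, 65, 68, 70, 73]
weights = [120, 140, 155, 170, 190]

slope, intercept = fit_line(heights, weights)
predicted = intercept + slope * 67  # expected weight at 67 inches
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"predicted weight at 67 inches: {predicted:.1f}")
```

The slope estimates how much the expected weight changes for each additional inch of height in this sample.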
List Samples: With list samples, potential respondent
names come from records or lists, which are generally
supplied by the clients. For example, for a survey of
patrons of local libraries, the sample may begin with
a list of persons who have library cards or who have
used library services.
Samples drawn from such lists usually
are generated by a random selection process. Using lists
often makes it possible to link information from records
(e.g., employer records or service records) with information
in the survey. In addition, targeted respondents can
be reached efficiently when working from current, comprehensive
lists, thus keeping costs down.
Metropolitan Statistical Area (MSA): An MSA is a county or
group of contiguous counties that contains at least
one city with a population of 50,000 or more or includes
a Census Bureau-defined urbanized area of at least 50,000,
with a metropolitan population of at least 100,000.
In addition to the county containing the main city or
urbanized area, an MSA may contain other counties that
are metropolitan in character and are economically and
socially integrated with the central counties. In New
England, cities and towns, rather than counties, are
used to define MSAs.
MSAs are defined by the U.S. Office of
Management and Budget (OMB). The MSA standards are revised
before each decennial census. When U.S.
Census data become available, the standards are
applied to define the actual MSAs.
Most Recent Birthday Method: A way to choose one respondent
randomly in a household by asking to interview the eligible
person who had the most recent birthday.
Oversample:
A sampling procedure designed to give a demographic
or geographic population a larger proportion of representation
in the sample than the population's proportion of representation
in the overall population. Oversamples are often used
to study the attitudes or behavior of groups that make
up a small proportion of the total population. For instance,
one might oversample African Americans for a study on
discrimination, or people ages 65 and over for a study
about Medicare.
Positivity Bias: The tendency of
respondents who do not have strong opinions to give
a positive rather than a negative response if pushed
to make a choice.
Random Digit Dialing (RDD): The selection of telephone
numbers for a telephone sample by computer generation
from the list of working telephone exchanges. RDD procedures
have the advantage of including unlisted numbers, which
would be missed if numbers were drawn from a telephone
book.
From Norman M. Bradburn and Seymour
Sudman, Polls and Surveys: Understanding What They Tell
Us. San Francisco: Jossey-Bass, 1988.
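A simplified sketch of RDD number generation: pick a working exchange at random and append four random digits, so listed and unlisted numbers are equally likely to be drawn. The exchanges below are fictional, and real RDD systems add steps not described here, such as screening out nonworking numbers.

```python
import random


def generate_rdd_sample(exchanges, size, seed=None):
    """Draw `size` distinct numbers, each a random working exchange
    followed by a random four-digit suffix.
    """
    if size > len(set(exchanges)) * 10_000:
        raise ValueError("not enough possible numbers for the requested size")
    rng = random.Random(seed)
    numbers = set()
    while len(numbers) < size:
        exchange = rng.choice(exchanges)
        suffix = rng.randrange(10_000)
        numbers.add(f"{exchange}-{suffix:04d}")
    return sorted(numbers)


# Hypothetical list of working exchanges (area code plus prefix).
working_exchanges = ["617-555", "508-555"]
sample = generate_rdd_sample(working_exchanges, size=5, seed=1)
for number in sample:
    print(number)
```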
Refusal Conversion:
An attempt to convince potential respondents to cooperate
in answering a survey after they refuse to do so in
an earlier contact.
Reliability:
The degree to which multiple measures of the
same behavior or attitude agree. These multiple measures
may be over time or at the same time (Bradburn
and Sudman, 1988).
Cronbach's alpha
assesses the reliability of a rating summarizing a group
of test or survey answers which measure some underlying
factor (e.g., some attribute of the test-taker). A score
is computed from each test item, and the overall rating,
called a "scale," is defined by the sum of
these scores over all the test items. Then reliability
is defined to be the square of the correlation between
the measured scale and the underlying factor the scale
was supposed to measure.
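Because the underlying factor cannot be observed directly, alpha is estimated from the item scores themselves. A minimal sketch using the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the scale), is shown below; the item scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` is a list of per-item score lists, each holding one score
    per respondent in the same respondent order.
    """
    k = len(items)
    # The scale is the sum of the item scores for each respondent.
    scale = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(scale))


# Hypothetical scores: two items answered by four respondents.
items = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
]
print(cronbach_alpha(items))  # 0.75
```

Items that move together push the variance of the scale well above the sum of the item variances, which drives alpha toward 1.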
Salience Effects: The tendency of people exposed
to news coverage to adjust their issue agendas in response
to that exposure. For instance, those who frequently
see or read stories about an issue or event such as
a war would be more likely to name war as an important
issue or problem.
Screening Questions: Questions used to determine who will
be included in and excluded from the sample. For instance,
a preelection survey might use a screening question
to exclude people who are not registered to vote, or
a survey about Medicare might screen by age in order
to get a sample of people ages 65 and over.
Social Capital:
The central premise of social capital is that social
networks have value. Social capital refers to the collective
value of all "social networks" (who people
know) and the inclinations that arise from these networks
to do things for each other ("norms of reciprocity").
The term social capital emphasizes not just warm and
cuddly feelings, but a wide variety of quite specific
benefits that flow from the trust, reciprocity, information,
and cooperation associated with social networks. Social
capital creates value for the people who are connected
and, at least sometimes, for bystanders as well (http://www.bowlingalone.com/index.php3).
Spearman's rho: A measure of the monotonic
relationship between two variables, computed as the
linear correlation between their ranks. It differs from
Pearson's
correlation only in that the computations are done
after the numbers are converted to ranks. When converting
to ranks, the smallest value on X becomes a rank of
1, the next smallest a rank of 2, and so on. Consider the following X-Y pairs:
X Y
7 4
5 7
8 9
9 8
Converting these to ranks would result in the following:
X Y
2 1
1 2
3 4
4 3
The first value of X (which was a 7)
is converted into a 2 because 7 is the second lowest
value of X. The X value of 5 is converted into a 1 since
it is the lowest. Spearman's rho can be computed with
the formula for Pearson's r using the ranked data. For
this example, Spearman's rho = 0.60. Spearman's rho is
an example of a "rank-randomization"
test.
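The worked example can be reproduced with a short sketch that converts each variable to ranks and then applies Pearson's formula to the ranks. It assumes there are no tied values, as in the example; ties would need averaged ranks.

```python
from math import sqrt


def to_ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def pearson_r(x, y):
    """Pearson's correlation coefficient."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Pearson's r computed on the ranked data."""
    return pearson_r(to_ranks(x), to_ranks(y))


x = [7, 5, 8, 9]
y = [4, 7, 9, 8]
print(to_ranks(x))                   # [2, 1, 3, 4]
print(to_ranks(y))                   # [1, 2, 4, 3]
print(round(spearman_rho(x, y), 2))  # 0.6
```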
Weighting:
The adjustment of sample results to account for sampling
procedures and possible sample biases caused by non-cooperation
and incomplete data. Weighting assumes that universe
estimates are available from the U.S. Census Bureau
or elsewhere.
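One simple form of weighting can be sketched as follows: each group's weight is its share of the population divided by its share of the sample. This is only an illustration of the idea, assuming a single grouping variable; the group names, population shares, and sample counts are hypothetical.

```python
def poststratification_weights(population_shares, sample_counts):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }


# Hypothetical universe estimates and completed interviews by group.
population_shares = {"under_65": 0.80, "65_and_over": 0.20}
sample_counts = {"under_65": 600, "65_and_over": 400}

weights = poststratification_weights(population_shares, sample_counts)
for group, weight in weights.items():
    print(f"{group}: weight {weight:.3f}")
```

In this hypothetical sample the oversampled group (65 and over) receives a weight of 0.5, and the weighted group totals match the population shares while the weighted sample size stays at 1,000.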