ANOVA (Analysis of
Variance): Like a t-test, which determines whether the
difference between two group means is statistically significant,
ANOVA is a test that allows comparison among multiple
groups. Multiple t-tests are less desirable here because,
as the number of groups increases, the number of needed
comparisons grows quickly. For instance, seven groups
require twenty-one pairwise t-tests. If we test twenty-one
pairs at the usual 95 percent confidence level, it would
not be surprising for at least one to appear significant
by chance alone, since that happens in about 5 percent
of tests even when no real difference exists.
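The arithmetic behind this entry can be sketched in a few lines of Python. The function names are illustrative, and the second calculation assumes the tests are independent of one another, which pairwise t-tests on shared groups are not, so the result is a rough guide rather than an exact figure.

```python
from math import comb

def pairwise_tests(groups: int) -> int:
    """Number of distinct pairs of groups, i.e. pairwise t-tests needed."""
    return comb(groups, 2)

def chance_of_false_positive(tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across `tests` tests.

    ASSUMPTION: the tests are treated as independent (a simplification).
    """
    return 1 - (1 - alpha) ** tests

n_tests = pairwise_tests(7)
print(n_tests)                                       # pairs among 7 groups
print(round(chance_of_false_positive(n_tests), 2))   # chance of >= 1 false alarm
```

Under that independence assumption, roughly two runs in three of a twenty-one-test comparison would produce at least one spurious "significant" result.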
Bivariate
Correlation: A measure of association (strength)
or relationship between two variables. Correlations
are useful when researchers want to compare the attitudes,
beliefs, or behavior of different groups.
Bonferroni Comparisons:
A method that allows many comparisons, similar to ANOVA.
The Bonferroni Correction is a statistical adjustment
for multiple comparisons to avoid false positives.
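A minimal sketch of the correction, assuming the common form in which the significance level is divided by the number of comparisons. The function name and the p-values are hypothetical and serve only as illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the adjusted threshold and, for each p-value, whether it
    remains significant after the Bonferroni correction."""
    m = len(p_values)            # number of comparisons being made
    threshold = alpha / m        # each test must clear this stricter level
    return threshold, [p <= threshold for p in p_values]

p_values = [0.001, 0.012, 0.030, 0.200]   # hypothetical results of four tests
threshold, significant = bonferroni(p_values)
print(threshold)
print(significant)
```

Here the third p-value (0.030) would pass an uncorrected 0.05 test but fails the corrected threshold.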
CATI (Computer-Assisted
Telephone Interviewing): An interviewing method
in which questions appear on a computer screen and answers
are entered directly into the computer. The method reduces
interviewers' clerical errors and speeds up data processing
(Bradburn and Sudman, 1988).
Chi-Squared Automatic Interaction Detection (CHAID): CHAID is a statistical procedure, computed using SPSS Answer Tree 3.1, which examines a defined set of either categorical or continuous independent variables and selects the single best predictor of a categorical dependent variable. It then uses this predictor to divide a population into two or more subsets characterized by significantly different scores on the dependent variable. The process then repeats, choosing the best predictor of the dependent variable among the elements of each subset defined on the preceding step. The result is a dendrogram that arranges the significant predictors in a hierarchy based on increasingly small subsets until no additional significant predictors of the dependent variable can be found among the elements of any of the subgroups.
Thanks to Kenneth R. Blake, Robert O. Wyatt, and Holly Warf, Middle Tennessee State University, for this description of chi-squared automatic interaction detection.
Closeness of Fit: Statistical models are typically evaluated in terms of how well their output matches data, that is, in terms of model accuracy. A model can match data in several ways, including precision, the absolute "closeness of fit" between model predictions and data.
Computer-Assisted Personal Interviewing (CAPI): An interviewing method in which the interviewer carries a lightweight portable computer into a household. The advantage is that the computer carries out branching and editing commands and reduces interviewer clerical errors (Bradburn and Sudman, Polls and Surveys).
Confidence
limit: The upper and lower values of a statistical
estimate. The 95 percent confidence limit is the most
widely used in polling. This means that the sampling
procedure used had a 95 percent chance of producing
a set of limits that encloses the proportion that would
be found if the entire population had been asked. For
instance, a poll might find that 65 percent of the public
favored a policy, and the confidence limit (at the 95
percent level) could be 62-68 percent (paraphrased from
Bradburn
and Sudman, 1988).
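The example in this entry can be approximated with the usual normal-approximation formula for a proportion. This is a sketch under stated assumptions: the sample size of 1,000 is not given in the entry and is assumed here, and the formula applies to a simple random sample with no design effect.

```python
from math import sqrt

def confidence_limits(p, n, z=1.96):
    """Normal-approximation confidence limits for a sample proportion.

    p: observed proportion, n: sample size, z: 1.96 for the 95 percent level.
    ASSUMPTION: simple random sampling.
    """
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

low, high = confidence_limits(0.65, 1000)   # n = 1000 is an assumed sample size
print(f"{low:.0%} to {high:.0%}")
```

With these assumed inputs the limits come out close to the 62-68 percent range quoted above.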
Contract
with America: Crafted by Representative Newt Gingrich
(R-Ga.) during the 1994 congressional campaign, the
"Contract with America" was a statement of
principles signed by Republican candidates who pledged
action on several pieces of legislation, all of which
were said to have majority support from the American
public. Many analysts believe that the Contract served
as a catalyst for a victory that gave Republicans control
of the U.S. House for the first time in decades.
Control: A variable within which a relationship can be analyzed.
"When only two variables are crosstabulated, we call the resulting table a two-way table. However, the general idea of crosstabulating values of variables can be generalized to more than just two variables. For example [in an analysis comparing preferences of men versus women for soda Brand A and Brand B], a third variable could be added to the data set. [In this example, the control is] information about the state in which the study was conducted (either Nebraska or New York )."
          GENDER    SODA    STATE
case 1    MALE      A       NEBRASKA
case 2    FEMALE    B       NEW YORK
case 3    FEMALE    B       NEBRASKA
case 4    FEMALE    A       NEBRASKA
case 5    MALE      B       NEW YORK
"The crosstabulation of these variables would result in a 3-way table:"
            STATE: NEW YORK               STATE: NEBRASKA
            SODA: A   SODA: B   Total     SODA: A   SODA: B   Total
G:MALE      20        30        50        5         45        50
G:FEMALE    30        20        50        45        5         50
Total       50        50        100       50        50        100
Source: Quoted from StatSoft Electronic Textbook.
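As a rough illustration of how a control variable splits one table into several, the five example cases listed in this entry can be tabulated with Python's standard library. The helper name two_way_table is hypothetical, and the counts cover only those five cases, not the full 200-case table.

```python
from collections import Counter

# The five example cases from this entry: (gender, soda, state).
cases = [
    ("MALE",   "A", "NEBRASKA"),
    ("FEMALE", "B", "NEW YORK"),
    ("FEMALE", "B", "NEBRASKA"),
    ("FEMALE", "A", "NEBRASKA"),
    ("MALE",   "B", "NEW YORK"),
]

counts = Counter(cases)   # a missing combination counts as 0

def two_way_table(state):
    """Gender-by-soda counts within one value of the control variable."""
    return {
        gender: {soda: counts[(gender, soda, state)] for soda in ("A", "B")}
        for gender in ("MALE", "FEMALE")
    }

# One two-way table per state gives the three-way crosstabulation.
for state in ("NEW YORK", "NEBRASKA"):
    print(state, two_way_table(state))
```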
Correlation
Coefficient: A correlation coefficient is a number
between -1 and 1 which measures the degree to which
two variables are linearly related. If there is perfect
linear relationship with positive slope between the
two variables, we have a correlation coefficient of
1; if there is positive correlation, whenever one variable
has a high (low) value, so does the other. If there
is a perfect linear relationship with negative slope
between the two variables, we have a correlation coefficient
of -1; if there is negative correlation, whenever one
variable has a high (low) value, the other has a low
(high) value. A correlation coefficient of 0 means that
there is no linear relationship between the variables.
(Valerie J. East and John H. McCall, Statistics Glossary,
http://www.stats.gla.ac.uk/steps/glossary/paired_data.html#corrcoeff)
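The three cases described in this entry can be checked with a short sketch of the Pearson correlation coefficient. The data are made up for illustration; the third series depends on x in a V-shaped (nonlinear) way, which shows that a coefficient of 0 rules out only a linear relationship.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))   # perfect line, positive slope
print(pearson_r(x, [10, 8, 6, 4, 2]))   # perfect line, negative slope
print(pearson_r(x, [2, 1, 0, 1, 2]))    # V-shaped: related, but not linearly
```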
Cronbach's
Alpha: Cronbach's alpha is a test for a model
or survey's internal consistency. Sometimes called a
"scale reliability coefficient."
Crosstabulation: A test to determine whether there is a relationship between two variables. For instance, we may hypothesize that support for abortion is higher among younger people than older people. One could run a crosstabulation (crosstab) that has an abortion question as the dependent variable and age as the independent variable to see if young people give significantly different responses from older people.
Disproportionate
Stratified Sampling Design: A method used to
reach a higher proportion of a target group efficiently
while still representing those in the target group who
live in areas with a lower density of that group.
The disproportionate stratified sample
provides a highly accurate sampling frame, thereby reducing
the cost per effective interview. Typically, all telephone
exchanges within a target area are listed in descending
order by concentration of the target population. Exchanges
are then divided into strata based on the incidence
of the target population. Each stratum generally contains
the same number of target population households. For
example, roughly 25 percent of households served by
telephone exchanges with the highest incidence are placed
in the first stratum, followed by those with the next
largest incidence, and so on, with a fourth stratum
containing the 25 percent with the lowest incidence.
At this point, most sampling designs employ
an optimal allocation scheme. This "textbook"
approach allocates interviews to a stratum proportionate
to the number of target population households, but inversely
proportionate to the square root of the relative cost,
the relative cost in this situation being a simple function
of the incidence. As such, the number of completed interviews
increases as you move from a lower incidence stratum
to higher incidence strata. This is a known, formulaic
approach to allocation that provides a starting point
for discussions of sample allocation and associated
costs.
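The allocation rule described above can be sketched as follows. The stratum sizes and incidence rates are hypothetical, and the sketch assumes the relative cost of a completed interview is simply 1 divided by the incidence, which is one possible reading of "a simple function of the incidence."

```python
from math import sqrt

def optimal_allocation(total_interviews, target_households, incidence):
    """Allocate interviews proportionally to N_h / sqrt(c_h).

    N_h: target-population households in stratum h.
    c_h: relative cost per interview, ASSUMED here to be 1 / incidence_h.
    """
    weights = [n_h / sqrt(1 / inc)
               for n_h, inc in zip(target_households, incidence)]
    total = sum(weights)
    return [total_interviews * w / total for w in weights]

households = [25_000] * 4                 # hypothetical: equal-sized strata
incidence = [0.40, 0.20, 0.10, 0.05]      # hypothetical incidence per stratum
allocation = optimal_allocation(1000, households, incidence)
print([round(a) for a in allocation])
```

With equal-sized strata, the allocation falls steadily from the highest-incidence stratum to the lowest, matching the pattern the paragraph describes.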
Thus, sample generation within each defined
stratum utilizes a strict EPSEM sampling
procedure, providing equal probability of selection
to every telephone number. However, at that point numbers
that reside in higher incidence strata are more likely
to be dialed, and telephone numbers in the lowest incidence
stratum are least likely to be interviewed. This procedure
can double, or even triple, the incidence of reaching
a target household as compared to the general, RDD
incidence of that target population. The disproportionality
of the sampling scheme is later taken into account with
weighting, balancing the population back to its true
parameters.
This process does have one principal cost,
and that is on the design effect of the study. Simply
stated, the design effect is the measure of the precision
that is lost in any complex probability design, compared
to what the precision would have been had the study
been conducted using simple RDD methodology. Any stratified
or other complex sampling design, "pound for pound,"
will increase the standard errors of all estimates,
which can also be represented by the number of effective
interviews, which is the number of unweighted interviews
divided by the design effect.
Thus, the larger the design effect, the
smaller the number of effective interviews, and therefore
the larger the standard errors associated with the study.
The size of the design effect in this type of study
is determined by the amount of disproportionality introduced
into the design. A design that roughly doubles the incidence
of the target population will typically create a design
effect of somewhere around 1.5, a small price to
pay considering the cost savings associated with a doubling
of survey incidence.
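The relationship between design effect, effective interviews, and standard errors can be sketched as below. The sample of 1,000 interviews is a hypothetical figure; the design effect of 1.5 is the one mentioned above.

```python
from math import sqrt

def effective_interviews(unweighted_n, design_effect):
    """Number of effective interviews: unweighted n divided by the design effect."""
    return unweighted_n / design_effect

def standard_error_inflation(design_effect):
    """Factor by which standard errors grow relative to a simple design."""
    return sqrt(design_effect)

n_eff = effective_interviews(1000, 1.5)      # 1000 interviews is hypothetical
inflation = standard_error_inflation(1.5)
print(round(n_eff), round(inflation, 2))
```

In this sketch, 1,000 interviews carry the precision of about 667 interviews from a simple design, and standard errors are about 22 percent larger.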
For an example of disproportionate
stratified sampling in action, see the methodology
page of the 2004 National Public Radio/Kaiser Family
Foundation/Kennedy School of Government Immigration
Survey. Thanks to David Dutwin and Melissa Herrmann,
ICR/International Communications Research, for this
discussion of disproportionate stratified sampling design.
Dummy Variable: "A variable that marks or encodes a particular attribute. A dummy variable has the value zero or one for each observation, e.g., 1 for male and 0 for female." (Source: http://economics.about.com/library/glossary/bldef-dummy-variables.htm )
EPSEM
Samples: EPSEM samples are probability
samples where each observation in the population
has the same known probability of being selected into
the sample (EPSEM stands for equal probability of selection
method sampling; see Kish
1965, for a comprehensive discussion of sampling
techniques). EPSEM samples have certain desirable properties;
for example, the simple formulas for computing means,
standard deviations, and so on can be applied to estimate
the respective parameters in the population.
Factor analysis: Factor analysis tells us what variables group or go together. Factor analysis boils down a correlation matrix into a few major pieces so that the variables within the pieces are more highly correlated with each other than with variables in the other pieces. Factor analysis is actually a causal model. We assume that observed variables are correlated or go together because they share one or more underlying causes, called factors.
General Social Survey
(GSS): The General
Social Survey, conducted since 1972 by the National
Opinion Research Center (NORC), consists of a large
variety of important social indicators. Many of the
questions have been asked for a number of years, which
makes the GSS useful for measuring trends. Moreover,
the large number of interviews in the cumulative dataset
makes it possible to learn about the attitudes and beliefs
of small demographic groups.
Guttman
Scale: A measurement scale that assumes that when
you agree with a scale item you will also agree with
items that are less extreme.
Independent
Samples T-Tests: An independent samples t-test
is used when you want to compare the means on a dependent
variable (e.g., SAT score) for two independent groups
(e.g., men and women).
Kish Grid: A table of numbers, named after the statistician who invented it, used to select one person to interview within a household. The interviewer first determines the number of people in the household, and a random number from the table is then chosen to select a particular person. http://www.sysurvey.com/tips/sampling.htm
Leaner: 1. A survey respondent who does not make a choice among alternatives in an initial question, but makes a choice once asked if he or she leans toward one of the alternatives. 2. A survey question that asks respondents who do not initially make a choice between alternatives if they lean toward one of them. "Leaners" occur most often in questions about election choices.
Leaner Question:
A follow-up question used to encourage initially undecided
respondents to choose between alternatives, usually
political candidates.
Likely Voter:
A survey respondent who is estimated, by a variety of
means, to be likely to vote in a coming election. Survey
firms use different methods of determining the likelihood
of voting, usually including a scale of several items
in a poll, such as current voter registration status,
past history of having voted, and self-described likelihood
of voting.
Linear Regression:
A method of estimating the conditional expected value of
one ("dependent") variable given the values
of some other ("independent") variable(s).
For instance, if we want to determine the relationship
between height and weight for a sample of people, linear
regression attempts to explain the relationship with
a straight line fit to the data.
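A minimal sketch of fitting such a line by ordinary least squares, using made-up height and weight figures. The function name fit_line and the data are hypothetical, and only one independent variable is handled.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

heights_cm = [150, 160, 170, 180, 190]   # hypothetical sample
weights_kg = [52, 58, 69, 77, 84]        # hypothetical sample

slope, intercept = fit_line(heights_cm, weights_kg)
predicted = intercept + slope * 175      # expected weight at 175 cm
print(round(slope, 2), round(intercept, 1), round(predicted, 2))
```

The predicted value is the estimated conditional expected weight for a person of the given height, which is what the definition above refers to.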
List
Samples: With list samples, potential respondent
names come from records or lists which are generally
supplied by the clients. For example, for a survey of
patrons of local libraries, the sample may begin with
a list of persons who have library cards or who have
used library services.
Samples drawn from such lists usually
are generated by a random selection process. Using lists
often makes it possible to link information from records
(e.g., employer records or service records) with information
in the survey. In addition, targeted respondents can
be reached efficiently when working from current, comprehensive
lists, thus keeping costs down.
Literary
Digest Disaster: A poll conducted by
the Literary Digest called the 1936
presidential election for Alf Landon, when in fact
Franklin D. Roosevelt won reelection in a landslide.
The survey ballot had been mailed to Literary Digest
subscribers and certain other listed groups, which resulted
in the poll being unrepresentative of the voting population.
Margin of
Error: A bound that we can confidently place on
the difference between an estimate of something and
the true value.
Mean Squared Error: The average of the squared differences between a desired response and an actual response. Squaring removes the positive and negative signs, so errors in opposite directions cannot cancel out. The mean of a set of values is both the point about which the average (unsquared) error is zero and the point at which the average squared error is as low as it can be.
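The following sketch computes the mean squared error and checks, on made-up numbers, that a constant guess equal to the mean gives a lower value than nearby guesses. The function name and data are illustrative only.

```python
def mean_squared_error(desired, actual):
    """Average of the squared differences between paired values."""
    return sum((d - a) ** 2 for d, a in zip(desired, actual)) / len(desired)

values = [2, 4, 4, 6, 9]                 # hypothetical observations
mean = sum(values) / len(values)

# Compare a constant guess at the mean with guesses on either side of it.
for guess in (4, mean, 6):
    print(guess, mean_squared_error(values, [guess] * len(values)))
```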
Method Effect: Differences in survey results related to the method by which the data are gathered. For instance, the same question may yield different responses when asked on a telephone versus an online survey.
Metropolitan
Statistical Area (MSA): An MSA is a county or
group of contiguous counties that contains at least
one city with a population of 50,000 or more or includes
a Census Bureau-defined urbanized area of at least 50,000,
with a metropolitan population of at least 100,000.
In addition to the county containing the main city or
urbanized area, an MSA may contain other counties that
are metropolitan in character and are economically and
socially integrated with the central counties. In New
England, cities and towns, rather than counties, are
used to define MSAs.
MSAs are defined by the U.S. Office of
Management and Budget (OMB). The MSA standards are revised
before each decennial census. When U.S.
Census data become available, the standards are
applied to define the actual MSAs.
Mode effect: A difference in response caused by the mode by which the data are collected. For instance, the same question asked on both a telephone survey and an online survey may yield different results because the respondent interprets or responds to the question differently when presented by one mode versus the other.
Most Recent
Birthday Method: A way to choose one respondent
randomly in a household by asking to interview the eligible
person who had the most recent birthday.
Multi-Stage Sample: A sample that is selected in stages, where the sampling units at each stage are subsamples from the previous stage.
Multivariate Analysis: A statistical analysis of the simultaneous relationships among three or more (some would say two or more) variables; the analysis of several variables simultaneously.
National Election
Pool (NEP): A consortium of ABC News, CBS News,
NBC News, Fox News, CNN, and the Associated Press. Edison
Media Research and Mitofsky International conducted
the 2004 national exit poll for NEP.
Null Hypothesis: "Because a research hypothesis cannot be proved but only disproved, scientists have developed the notion of the null hypothesis, usually the opposite of the research hypothesis. [For] example, 'Republicans are as likely as Democrats to vote for Democratic candidates.' Most often, the null hypothesis states that no relationship exists between two variables or that one variable does not affect another variable. After stating a null hypothesis, researchers try to disprove or reject it. Disproving a null hypothesis offers some support for the research hypothesis." (From Herbert F. Weisberg, Jon A. Krosnick, and Bruce D. Bowen, An Introduction to Survey Research, Polling, and Data Analysis.)
1936 Presidential
Election: In 1936, methods of polling pioneered
by George Gallup, Elmo Roper, and Archibald Crossley
were put to the test of predicting the outcome of the
Franklin D. Roosevelt-Alf Landon presidential election.
The new methods involved interviewing relatively small
samples in person and paying attention to the demographic
composition of the sample (in those days achieved
through quota sampling), in contrast to the Literary
Digest poll, which had used huge mail-in samples
to predict the outcomes correctly in every presidential
election since 1920. All three of the new polls correctly
predicted a substantial victory for Roosevelt, while
the Literary Digest,
whose methods had been criticized prior to the election
by George Gallup, wrongly forecast a Landon win. According
to Bradburn
and Sudman (1988), upon which this account is based,
"The 1936 election led... to the almost overnight
acceptance of public opinion polls by politicians and
the general public" (p. 19).
Oversample:
A sampling procedure designed to give a demographic
or geographic population a larger proportion of representation
in the sample than the population's proportion of representation
in the overall population. Oversamples are often used
to study the attitudes or behavior of groups that make
up a small proportion of the total population. For instance,
one might oversample African Americans for a study on
discrimination, or people ages 65 and over for a study
about Medicare.
Panel
Sampling: Panels represent
sample units who have agreed to answer questions again
and again over a period of time.
Positivity
Bias: The tendency of respondents who do not have
strong opinions to give a positive rather than a negative
response if pushed to make a choice.
Primacy effect: The tendency of respondents to remember and/or choose the first item on a list. This is contrasted to recency effect, the tendency of respondents to remember and/or choose the last item.
Probability
Sampling: Any method of sampling that utilizes some
form of random selection of participants from a population.
Each possible participant in the population has a known,
nonzero chance of being selected to be in the sample. Simple
random sampling, stratified random sampling, cluster
sampling, and systematic sampling are examples of probability
sampling methods. Drawing names from a hat is also an
example of probability sampling.
Projective
Tests: In psychology, examinations that commonly
employ ambiguous stimuli, notably inkblots (Rorschach
Test) and enigmatic pictures (Thematic Apperception
Test) to evoke responses that may reveal facets of the
subject's personality by projection of internal attitudes,
traits, and behavior patterns upon the external stimuli.
(Encyclopaedia Britannica Online)
Propensity Score Adjustment in Weighting Data: Although proper sampling is typically the first step in achieving representativeness, data can be adjusted to resemble a general population more closely through a technique known as weighting. Most often, simple demographic weighting is used to bring proportions of respondents in line with the proportions as they exist for age, gender, or region of country. This is known as simple demographic or rim weighting.
Additional weighting can sometimes be useful to take into account not only demographic factors but also attitudinal ones. Taking the attitudinal factors into account when weighting forms the basis of a propensity score adjustment approach to weighting.
Some people (in the United States and elsewhere) who are not online have characteristics that lead survey researchers to think they should be online. By the same token, some people who are online have characteristics that would lead researchers to believe that they should not be online. A propensity score is a single, summary measure of whether one is likely to be a participant in a telephone survey rather than an online survey, given their characteristics. Propensity score adjustment makes it possible to balance efficiently the characteristics, beyond demographics, that differentiate online respondents from telephone respondents. Propensity score adjustment is a statistical technique that minimizes error associated with internet-based panel samples and the learning effects associated with participating in multiple surveys. By taking into account attitudinal differences between online and phone respondents, it is used in conjunction with standard data weighting techniques in order to produce reliable, valid data that can be projected to populations of interest, whether they are large, general populations or smaller, more specific groups.
Thanks to George Terhanian and John Bremer, Harris Interactive, for this description of propensity score adjustment in weighting data.
Random
digit dialing (RDD): The selection of telephone
numbers for a telephone sample by computer generation
from the list of working telephone exchanges. RDD procedures
have the advantage of including unlisted numbers, which
would be missed if numbers were drawn from a telephone
book.
From Norman M. Bradburn and Seymour
Sudman, Polls and Surveys: Understanding What They Tell
Us. San Francisco: Jossey-Bass, 1988.
Random Route Method: In order to ensure random selection within a sampling unit for in-person surveys, a random route is chosen for the interviewer to take after finding the starting address. The route chosen gives every household in the cluster an equal chance of being selected for the survey.
Random
Sampling: A sampling technique where we select a
group of subjects (a sample) for study from a larger
group (a population). Each individual is chosen entirely
by chance, and each member of the population has a known,
but possibly unequal, chance of being included in the
sample. (Valerie J. East and John H. McCall, Statistics
Glossary, http://www.stats.gla.ac.uk/steps/glossary/paired_data.html#corrcoeff)
Ratings Battery: A series of questions used to evaluate institutions, businesses, people, products, advertisements, and so forth, in which respondents are asked to select one response from a scale to indicate the degree of their opinion.
Refusal Conversion: An attempt to convince potential respondents to cooperate in answering a survey after they refuse to do so in an earlier contact.
Reliability:
The degree to which multiple measures of the
same behavior or attitude agree. These multiple measures
may be over time or at the same time. (Bradburn
and Sudman, 1988).
Cronbach's alpha
assesses the reliability of a rating summarizing a group
of test or survey answers which measure some underlying
factor (e.g., some attribute of the test-taker). A score
is computed from each test item, and the overall rating,
called a "scale," is defined by the sum of
these scores over all the test items. Then reliability
is defined to be the square of the correlation between
the measured scale and the underlying factor the scale
was supposed to measure.
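As an illustration, Cronbach's alpha can be computed from item scores with the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the summed scale), which this entry does not spell out and is assumed here. The ratings below are hypothetical: five respondents answering three items.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item scores.

    rows: one list of item scores per respondent (all the same length).
    Uses the standard formula, an ASSUMPTION not stated in the glossary.
    """
    k = len(rows[0])                              # number of items
    items = list(zip(*rows))                      # scores grouped by item
    item_variances = sum(variance(item) for item in items)
    scale_variance = variance([sum(row) for row in rows])   # the summed "scale"
    return k / (k - 1) * (1 - item_variances / scale_variance)

ratings = [                                       # hypothetical survey answers
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Values near 1 indicate that the items move together, as they do in this made-up data.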
Reverse
Scoring or Reverse Coding: The process of rescoring
items (i.e., survey questions) in a scale that are negatively
worded in a positive direction, in order to match the
other items in the scale that are positively worded
(or vice versa).
Reweighting: Often used interchangeably with "weighting," reweighting can also mean applying a different weight than the one originally used.
Right Direction/Wrong Track: This question, first asked in the early 1970s and frequently asked since the 1980s by various polling organizations, is generally asked at the beginning of a survey to measure the public's general mood about the state and direction of the country (or state, or other political entity). The most common forms are, "Do you (think/feel) things in this country are generally (going/heading) in the right direction, or (are they/have they gotten) (pretty seriously) off on the wrong track?"
Salience
Effects: The tendency of people exposed
to news coverage to adjust their issue agendas in response
to that exposure. For instance, those who frequently
see or read stories about an issue or event such as
a war would be more likely to name war as an important
issue or problem.
Sampling
Error: An error arising from the fact that it is
not statistically possible, short of having a 100 percent
sample, to select a sample which corresponds perfectly
to the population from which it is selected. As the
size of a sample increases, the magnitude of the sampling
error decreases. Sampling errors differ from other kinds
of statistical errors in that they occur at random and
are unbiased. Nonsampling errors, on the other hand,
are errors that can be attributed to mistakes in data
collection, tabulation, analysis, and so forth. www.nhes.state.nh.us/elmi/s_glossary.htm
Screening
questions: Questions used to determine who will
be included in and excluded from the sample. For instance,
a preelection survey might use a screening question
to exclude people who are not registered to vote, or
a survey about Medicare might screen by age in order
to get a sample of people ages 65 and over.
Show Cards: A type of prompt material in the form of cards with images that are shown to participants in research studies.
Social Capital:
The central premise of social capital is that social
networks have value. Social capital refers to the collective
value of all "social networks" (who people
know) and the inclinations that arise from these networks
to do things for each other ("norms of reciprocity").
The term social capital emphasizes not just warm and
cuddly feelings, but a wide variety of quite specific
benefits that flow from the trust, reciprocity, information,
and cooperation associated with social networks. Social
capital creates value for the people who are connected
and, at least sometimes, for bystanders as well. http://www.bowlingalone.com/index.php3
Spearman's
rho: A measure of the linear
relationship between two variables. It differs from
Pearson's
correlation only in that the computations are done
after the numbers are converted to ranks. When converting
to ranks, the smallest value on X becomes a rank of
1, etc. Consider the following X-Y pairs:
X Y
7 4
5 7
8 9
9 8
Converting these to ranks would result in the following:
X Y
2 1
1 2
3 4
4 3
The first value of X (which was a 7)
is converted into a 2 because 7 is the second lowest
value of X. The X value of 5 is converted into a 1 since
it is the lowest. Spearman's rho can be computed with
the formula for Pearson's r using the ranked data. For
this example, Spearman's rho = 0.60. Spearman's rho is
an example of a "rank-randomization"
test.
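The worked example in this entry can be reproduced with a short sketch: convert each variable to ranks, then apply the Pearson formula to the ranks. The helper names are illustrative, and the ranking helper assumes there are no tied values.

```python
from math import sqrt

def ranks(values):
    """Rank values from 1 (smallest) upward. ASSUMES no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranked data."""
    return pearson_r(ranks(x), ranks(y))

x = [7, 5, 8, 9]
y = [4, 7, 9, 8]
print(ranks(x), ranks(y))
rho = spearman_rho(x, y)
print(round(rho, 2))
```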
Split Sample: Different parts of the sample are sometimes asked different questions in the same place in a survey. Generally this is done either to test the effect of some difference in question wording about the same topic, to avoid respondent fatigue in answering two long questions with multiple items, or simply to make it possible to ask more questions in the same survey.
Standard Deviation: A measure of variability (or dispersion) of a distribution equal to the square root of the variance. See Standard Error.
Standard Error: A measure of the variability of estimates due to sampling. It indicates the variability of a sample estimate that would be obtained from all possible samples of a given design and size. Standard errors are used as a measure of the precision expected from a particular sample. See Standard Deviation. http://nces.ed.gov/surveys/frss/publications/92130/7.asp
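The difference between the two measures can be seen in a small sketch. The data are hypothetical, and the standard error formula used (standard deviation divided by the square root of the sample size) assumes a simple random sample; complex designs need an adjustment for the design effect.

```python
from math import sqrt
from statistics import stdev

sample = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical observations

sd = stdev(sample)                       # spread of the individual values
se = sd / sqrt(len(sample))              # precision of the sample mean
                                         # (ASSUMES simple random sampling)
print(round(sd, 3), round(se, 3))
```

The standard deviation describes how much individual values vary; the standard error describes how much the sample mean would vary from one sample to the next, and it shrinks as the sample grows.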
Statistical
significance: Statistical measures are used to test
hypotheses that two (or more) estimates are really different
from one another or that the estimate is really different
from zero; that is, that the differences obtained in
the survey are not the result of chance variation. When
the outcome of a statistical test has statistical significance,
the investigator is willing to say that the estimated
differences between two groups (for example, in the percent
supporting some policy) are real and not chance differences.
Statistical significance is usually stated as being
at some level, for example, at the 95 or 99 percent level
(paraphrased from Bradburn
and Sudman, 1988).
Tocqueville, Alexis de (1805-59): French historian and author of Democracy in America, a penetrating study of the American polity.
Topline: A document showing the overall responses (frequencies) for each question in a survey.
Type I and Type II Errors: "When a researcher decides whether to reject a null hypothesis, two types of errors can be made. First, a true null hypothesis may be rejected by mistake. Falsely rejecting a true null hypothesis is known as a Type I error. Second, a null hypothesis may be accepted when it is false. Accepting a false null hypothesis is known as a Type II error." (From Herbert F. Weisberg, Jon A. Krosnick, and Bruce D. Bowen, An Introduction to Survey Research, Polling, and Data Analysis.)
Variance: A measure of variability (or dispersion) of a distribution equal to the mean of the squared deviations of all values from the mean. http://www.esomar.org/web/show/id=137136
Voter News Service
(VNS): A consortium of the national television networks
and major newspapers that conducted exit polls in state
and national elections between 1994 and 2002.
Weighting:
The adjustment of sample results to account for sampling
procedures and possible sample biases caused by non-cooperation
and incomplete data. Weighting assumes that universe
estimates are available from the U.S. Census Bureau
or elsewhere.
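A simple form of weighting can be sketched as follows: each group's weight is its share of the population divided by its share of the sample. The age groups, sample counts, and population shares are hypothetical, and real weighting schemes usually adjust on several variables at once.

```python
def group_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share."""
    n = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / n)
        for group, count in sample_counts.items()
    }

sample_counts = {"18-34": 200, "35-64": 500, "65+": 300}         # hypothetical
population_shares = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}  # hypothetical

weights = group_weights(sample_counts, population_shares)
weighted_total = sum(sample_counts[g] * weights[g] for g in sample_counts)
print({g: round(w, 2) for g, w in weights.items()}, round(weighted_total))
```

Underrepresented groups receive weights above 1 and overrepresented groups receive weights below 1, while the weighted total stays equal to the number of interviews.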