Emerging Trends in The Social and Behavioral Sciences · Repeated Cross‐Sections in Survey Data

Repeated Cross‐Sections in Survey Data

HENRY E. BRADY and RICHARD JOHNSTON

Abstract
Examples of repeated cross-sections (RCS) include daily tracking polls of political
opinions during campaigns, monthly Current Population Surveys of unemployment,
yearly national health interview surveys, and quadrennial election studies of presidential voting. Each iteration is a distinct sample, as opposed to panels in which the
same people are interviewed two or more times. By asking the same questions on
repeated survey samples from the same population, RCS studies allow us to track
trends and to establish causal inferences. One analytic challenge is to maintain both
the representativeness and the comparability of samples as fieldwork methods or
sources change. The longer the span covered by an RCS, the likelier it is that the
universe will change. For an RCS spanning decades, populations can change in fundamental ways. The universe of content also changes, as issues of one period are
redefined or even rendered irrelevant in another. Extracting trends from RCS data
typically requires smoothing to separate signal from noise, especially where samples
or subsamples are small, but this can lead to bias due to excessive smoothing or to
mistaking noise for signal because of sampling variability when there is not enough
smoothing. By deploying time, the RCS design enables certain kinds of causal inference, but many alternative micro-processes are observationally equivalent, and so
the RCS benefits from being combined with the panel design.

INTRODUCTION
Repeated cross-sections (RCS) have been with us for decades. They appeared
as soon as an initial sample survey was followed by a second one that used
the same questions for a sample of the same population. Only in recent years
have we come to recognize the importance and utility of these data. Knowing about trends has intrinsic value, and for many indicators this requires
broadly comparable mass survey data repeated over time. And time is absolutely critical for establishing causal inferences that are the gold standard for
good science.
For high-quality causal inferences from observational data, the starting
point is typically the panel survey, where the same persons are interviewed
repeatedly so that individual change can be tracked and individual differences controlled. However, panels bring costs as well as benefits and are rarely
available in key historical moments. In their stead, we commonly resort to
pseudo-panels, or RCS.

Emerging Trends in the Social and Behavioral Sciences. Edited by Robert Scott and Stephen Kosslyn.
© 2015 John Wiley & Sons, Inc. ISBN 978-1-118-90077-2.

At the microscopic extreme is daily sampling, increasingly common in the
study of electoral campaigns. Early studies of primary-election dynamics
(Bartels, 1988; Brady & Johnston, 1987) relied heavily on a 1984 weekly
RCS conducted as part of the American National Election Study (ANES).
Johnston, Blais, Brady, and Crête (1992) narrowed the focus to daily variation
in their pathbreaking study of a Canadian election campaign. This became
the model for the massive National Annenberg Election Study (Johnston,
Hagen, & Jamieson, 2004). New modes, especially the maturation of
Web-based surveys, and generally falling fieldwork costs are enabling more
researchers to get in on the action. The distinctive claim of daily sampling is
granularity. This is critical if, for example, shifts in preference or perception
are to be attributed to campaign events. But the logic of RCS extends well
beyond a daily or weekly time scale, as we show in the next section. Monthly
or quarterly surveys are a staple for government statistical agencies. Even
annual or quadrennial surveys are now accumulating enough waves for
dynamic analysis.
At the same time, we now recognize that initially unrelated surveys can
be lined up over time and analyzed together, especially as we think through
item equivalence and missing data problems. Brady and Kaplan (2012, 2012),
for example, assembled weekly and bi-weekly data from various sources for
a microscopic analysis of opinion in the collapse of the Soviet Union. Berinsky, Powell, Schickler, and Yohai (2011) portray opinion shifts in the 1930s
and 1940s in America, in some cases on a monthly time scale. Stimson (1999)
takes virtually the entire postwar period as his time span for annual readings
of Americans’ policy mood.
If the incorporation of time is a distinctive feature of the RCS, the scale of
time critically affects both the conduct of fieldwork and the analysis of data
after the fact. The shorter the time units, the greater the burden on design and
execution of the sample to ensure daily samples that are truly random. The
longer the overall temporal span, the greater the burden on analysis to deal
with changes in the composition of the population.
WHAT ARE REPEATED CROSS-SECTIONAL SURVEYS?
RCS surveys involve the repeated administration of the same (or similar)
questions to a sample from the same (or a similar) survey population. Unlike
panel surveys, the same people are not necessarily interviewed in every survey; instead, typically a new sample is drawn each time the survey is fielded.
This approach avoids the costs of tracking people, eliminates the priming and
learning that may carry over from previous interviews, and ensures the representativeness of samples by avoiding attrition. Although RCS do not have
the power of equivalently sized panels, this is usually more than compensated for by their larger size. Although it is easiest to think in terms of sample
surveys, some of the logic extends to firms and organizations and even to successive Congresses or Supreme Court terms (Lebo & Weber, 2014), although
as we move away from sample surveys we enter the territory of “unbalanced
panels” (Honaker & King, 2010) in which some entities fall in and out of the
population.
Table 1 describes some of the major RCS data sets that come from political
science, economics, sociology, demography, and public health—although
the emphasis is on political science data. In every case, there are enough
cross-sections that time-series analysis can be undertaken. Even the quadrennial ANES Presidential surveys now comprise 17 temporal observations
from 1948 to 2012, and the new American Community Survey already has
eight yearly observations. For many of the other surveys there are hundreds
of time-series observations, and often parallel time series for geographic
sub-aggregates such as states.
From Table 1 it emerges that RCS studies can differ in three fundamental
ways:
• Time between Cross-Sections. The time between repetitions of the survey
  can vary by more than three orders of magnitude, from individual days
  in some election-campaign tracking studies to 2–4 years (up to 1461 days) in
  social change and presidential election studies. This variation in the
  period can affect the appropriateness of the questions and the nature of
  the sample.
• Number of Cross-Sections. Although the number of cross-sections could
  be as few as two, we are generally interested in those cases where there
  are enough cross-sections so that temporal trends can be identified and
  analyzed. This means that we want at least 10–15 RCS, and preferably
  many more. All the surveys listed in Table 1 have seven or more time
  periods and the median number is about 80.
• Number of Interviews at Each Time Point. The number of interviews at
  each time point also varies by several orders of magnitude, from 75 in
  some daily or weekly tracking polls to 60,000 for the monthly Current
  Population Survey to 3,000,000 in the annual American Community
  Survey. The number of interviews affects the statistical accuracy of the
  data and the degree to which it can be broken down by region and
  subpopulation.
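To make the third point concrete, the following minimal sketch computes an approximate 95% margin of error for an estimated proportion at several of the sample sizes mentioned above. It assumes simple random sampling and a true proportion of 0.5 (the worst case); surveys with clustering or weighting will generally do worse. The function name is ours and purely illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1.0 - p) / n)

# Interviews per time point mentioned in the text
for n in (75, 2000, 60_000, 3_000_000):
    print(f"n = {n:>9,}: +/- {100 * margin_of_error(n):.2f} percentage points")
```

With 75 interviews per day the margin is roughly 11 percentage points, which is why small daily samples cannot be read day by day without some form of pooling or smoothing.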

Table 1
Examples of Time-Series Cross-Sectional Studies

| Time between Cross-Sections | Name of Study | Interviews at Each Time Point | # of Cross-Sections (Time Points) | Panel or Unusual Features? | Subject Matter |
|---|---|---|---|---|---|
| Daily | Canadian Election Studies, 1988 (and subsequent ones) | ≈75 | ≈50 | Post-election re-interview | Public opinion and campaign dynamics |
| Daily | Annenberg National Election Studies (2000, 2004, 2008) | ≈100–250 | ≈365 | Panels inside RCS plus some post-election re-interviews | Public opinion and campaign dynamics |
| Daily | German Longitudinal Election Studies (2005, 2009) | ≈85 in 2005; ≈100 in 2009 | 41 in 2005; 60 in 2009 | Post-election re-interview | Public opinion and campaign dynamics |
| Weekly or bi-weekly | American National Election 1984 Continuous Monitoring | ≈76 | 46 | - | Public opinion and campaign dynamics |
| Weekly or bi-weekly | Soviet Collapse Data Set (1989–1991) | ≈1000–4000 | ≈80 | Irregular spacing | Participation and opinion during collapse of SU |
| Monthly | Roper Social and Political Trends Dataset (1973–1994) | ≈2000 | 207 | - | Political and social participation over time |
| Monthly | American Mass Public in 1930s and 1940s | ≈3000 | 400 | Irregular spacing | Public opinion and voting over time |
| Monthly | Current Population Survey (1940; micro-data from 1962-present) | ≈60,000 | ≈600 | Rotating panels | Employment, poverty, welfare over time |
| Monthly | Survey of Consumer Sentiment (1952; micro-data from 1978-today) | ≈480 | ≈400 | - | Consumer sentiment over time |
| Monthly | California Field Poll Data (1956–2006) | ≈1000 | 232 | - | Public opinion and voting over time |
| Quarterly | Eurobarometer (2–5 times per year) | 1000+ per country | 39+ | - | Political and social trends, attitudes to EU |
| Quarterly | US Consumer Expenditure Survey (1980 to present) | ≈7000 | ≈120 | Diary component | Consumer expenditures over time |
| Yearly or biennially | General Social Survey (1972–2010) | ≈2000 | 27 | First annual, now 2-years | Social trends over time |
| Yearly or biennially | American National Election Studies (1956–2004), every 2 years | ≈1000–2000 | 25 | Some panels embedded | Political trends and elections |
| Yearly or biennially | Integrated National Health Interview Surveys (1963–2011) | ≈80,000 | 49 | Continuous sampling | Health of population |
| Yearly or biennially | American Community Survey (2006-present) | ≈3,000,000 | 7 | Continuous sampling | Demographics, poverty, institutionalization status, employment |
| Quadrennially | American National Election Presidential Studies (1948–2012) | ≈1000–2000 | 17 | Some panels embedded | Political trends and elections over time |

Depending on how these dimensions are combined, an RCS can be sensitive to different kinds of change. Larger samples with more interviews at
each time point make it possible to distinguish true change from apparent
change due to sampling variability. True change can come in two varieties.
One is where the composition of the population is stable over time, but the
units change their behaviors or attitudes. An obvious example is where
the members of an electorate decide that a candidate is better than they
had originally thought, perhaps based on a performance in a debate. The
other kind of change is compositional, where individuals do not change
their proclivities but are themselves replaced by those with different characteristics. Both kinds of change can coexist in the population, but their
relative prevalence and importance in samples depend on the interval
between cross-sections and the total length of the data collection period.
The shorter the time between cross-sections, the more that true change must
reflect change in individuals’ behavior or attitudes. Apart from sampling
variability, change in the sample’s propensities can have no other source
but the conversion of its component individuals. As the span of the series
increases, on the other hand, attention is forced to compositional change,
which typically takes years to register. Again, the passage of time may
make it possible for individuals to change their attitudes or characteristics,
and may even be necessary for such change, but the longer the span the more inevitable it
is that samples will also be drawn from populations comprising different
individuals.
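The distinction between the two kinds of true change can be illustrated with a small accounting exercise. The sketch below splits the change in a population mean between two cross-sections into a within-group part (groups change their attitudes) and a compositional part (group shares change). It uses a symmetric, Kitagawa-style decomposition, which is our choice for illustration and not a method proposed in the text; the groups and numbers are invented.

```python
def decompose_change(shares_t0, means_t0, shares_t1, means_t1):
    """Split the change in a population mean into within-group and compositional parts.

    Each argument maps a group label to its population share or its group mean.
    The symmetric decomposition used here adds up exactly to the total change.
    """
    within = 0.0
    composition = 0.0
    for g in shares_t0:
        avg_share = (shares_t0[g] + shares_t1[g]) / 2
        avg_mean = (means_t0[g] + means_t1[g]) / 2
        within += avg_share * (means_t1[g] - means_t0[g])        # attitudes change
        composition += avg_mean * (shares_t1[g] - shares_t0[g])  # population turns over
    return within, composition

# Hypothetical data: two cohorts, proportion agreeing with some policy
shares_1980 = {"older": 0.60, "younger": 0.40}
means_1980 = {"older": 0.30, "younger": 0.50}
shares_2010 = {"older": 0.40, "younger": 0.60}
means_2010 = {"older": 0.35, "younger": 0.55}

within, composition = decompose_change(shares_1980, means_1980, shares_2010, means_2010)
total = (sum(shares_2010[g] * means_2010[g] for g in shares_2010)
         - sum(shares_1980[g] * means_1980[g] for g in shares_1980))
print(f"total change {total:.3f} = within {within:.3f} + composition {composition:.3f}")
```

In this invented example a bit less than half of the overall shift comes from replacement of one cohort by another, even though both cohorts moved by the same amount.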
Depending on where a given RCS sits on the dimensions of time unit, overall duration, and sample size per time unit, various challenges must be faced.
Commonly, addressing one challenge may only exacerbate another.
COMPARABILITY VERSUS REPRESENTATIVENESS
COMPARABILITY IN SAMPLES
At the high-frequency extreme of data collection the objective is to compare
one day’s results with those from the next day. To this end, the data collection strategy for each day should be identical, such that differences between
the days are the product of something that has happened in the interval, not
of differences in, say, accessibility or availability of respondents. However,
if we require that the date of interview is the same as the date of release of
the potential respondent’s contact information to the field, the sample will be
very unrepresentative because many respondents will not be reached immediately. It takes time to “clear a sample,” and we know that the longer a poll is
in the field, the better it performs in providing predictions of elections (Lau,
1994). Even where prediction is not the objective it can be a factor in the credibility of the study. To combine granularity with representativeness, the key
is to work equally hard to contact respondents as they are released to the
field and to recognize that after a suitable number of days (perhaps a week),
those people who are interviewed on a given day will constitute a random
sample even though they come from different cohorts released on different
days. This approach was successfully used with the Canadian studies, the
National Annenberg Election Survey, and other studies. This implies careful
management of the release and clearance of the sample and acceptance that
the first days of fieldwork will not be useful for dynamic analysis (although
the cases will be perfectly usable in analyses not involving time). A similar
logic applies if the design involves changes in sampling fractions, although
here model-based weighting can be employed (Brady & Johnston, 2006; Johnston & Brady, 2002).
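The logic of clearing a sample can be sketched with a simple expected-value calculation. The code below assumes that an equal-sized random replicate is released every day, that each person not yet interviewed is reached on any given day with a fixed probability, and that no telephone number is ever retired. The two contact probabilities and the 50/50 population split are invented for illustration and are not taken from any of the studies cited.

```python
def expected_share_hard_to_reach(day, pop_share_hard=0.5, p_easy=0.6, p_hard=0.3):
    """Expected share of hard-to-reach people among interviews completed on `day`.

    Assumes one replicate is released per day starting on day 1 and that contact
    attempts continue indefinitely (illustrative assumptions only).
    """
    # A replicate released k days ago contributes p * (1 - p) ** k interviews today;
    # summing over replicates released on days 1..day gives 1 - (1 - p) ** day.
    easy = (1 - pop_share_hard) * (1 - (1 - p_easy) ** day)
    hard = pop_share_hard * (1 - (1 - p_hard) ** day)
    return hard / (easy + hard)

for day in (1, 3, 7, 14):
    share = expected_share_hard_to_reach(day)
    print(f"day {day:>2}: hard-to-reach share of that day's interviews = {share:.3f}")
```

Under these assumptions the first day's interviews badly under-represent hard-to-reach respondents, while after about a week the daily mix is close to the population share, which is why the opening days are set aside in dynamic analysis.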
Even when the intent is to provide a representative sample of some
well-defined population (e.g., the United States), samples may differ over
time because data come from different survey houses, interviewing is added
in another language (e.g., Spanish), one method of interviewing is replaced
by another (e.g., in-person by telephone), or sampling methods change (e.g.,
more sampling points, or shift from clustering to a simple random draw).
For example, in their efforts to put together public opinion datasets on the
American mass public in the 1930s and the 1940s, Berinsky and Schickler
(2010) faced problems stemming from collecting surveys from four different
survey organizations and the widespread use of quota sampling. They
employed several model-based post-stratification methods to make the data
comparable over time (Berinsky, 2006). Similarly, using a series of CBS/New
York Times national polls from the 1988 election campaign, Gelman (2007)
discussed the strengths and limitations of post-stratification weighting in
the context of regression analysis. As Gelman notes, much more work has
to be done to figure out the appropriate statistical methods for solving
these problems. These problems are only compounded by the explosion in
tracking polls and the salience of aggregators, such as FiveThirtyEight¹ or
Pollster².
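As a point of reference for the weighting methods discussed above, here is a minimal sketch of plain cell-based post-stratification: respondents are reweighted so that the sample shares of known cells match population shares. This is the simplest textbook version, not the model-based procedures of Berinsky (2006) or the regression framework discussed by Gelman (2007). The cells, shares, and responses are invented.

```python
from collections import Counter

def poststratification_weights(sample_cells, population_shares):
    """Return one weight per respondent so weighted cell shares match the population.

    sample_cells: list of cell labels, one per respondent.
    population_shares: dict mapping cell label -> known population share.
    """
    n = len(sample_cells)
    counts = Counter(sample_cells)
    return [population_shares[c] / (counts[c] / n) for c in sample_cells]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical quota-style sample that over-represents men
cells = ["men"] * 70 + ["women"] * 30
# 60% of sampled men and 40% of sampled women approve
approve = [1] * 42 + [0] * 28 + [1] * 12 + [0] * 18
weights = poststratification_weights(cells, {"men": 0.5, "women": 0.5})

print(f"unweighted approval:      {sum(approve) / len(approve):.2f}")
print(f"post-stratified approval: {weighted_mean(approve, weights):.2f}")
```

The correction only works for characteristics that define the cells; if survey houses differ in ways not captured by the cells, the weighted series can still shift for reasons unrelated to opinion.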
COMPARABILITY OF SURVEY QUESTIONS
The questions asked on RCS often change over time because better versions
are constructed, because times change and questions must be modified,
or simply because different investigators have different beliefs about good
questions. The problems created by these changes range from modifications
at one point in time (e.g., a new way to measure unemployment or a
new way to ask about liberal-conservative identification) to a mélange
of different ways of answering a similar question (e.g., the popularity of
some political figure asked with 3, 5, or 10 point scales and with different
words such as “approve,” “trust,” or “support”). Brady and Kaplan (2012,
2012) approached this problem by simultaneously modeling trends in the
data and the question formats using methods from test theory. Stimson
(1999, Appendix 1) considered an extreme version of this problem in which
he wanted to build factor scores (typically for a “liberal-conservative”
dimension) from dated items with only partially overlapping cases.

1. FiveThirtyEight (blog). http://fivethirtyeight.com/ (accessed 15 November 2014).
2. Huffington Post. Pollster. http://elections.huffingtonpost.com/pollster (accessed 10 September 2012).

BIAS VERSUS VARIABILITY
One of the major reasons for considering RCS is to analyze trends, but in
many cases samples are so small that simply “connecting the dots” risks
confusing sampling variability for real change. Even when there are large
samples, a focus on small sub-populations quickly leads to the same problem. Identifying real change requires smoothing.
But how much? The simplest approach is simply to estimate a linear trend.
This may have the appeal of visual clarity but typically it provides far too
much smoothing. The truth is, we commonly lack a theory of events that
would tell us the shape of their impact over time and so guide smoothing, especially for highly granular variants of the RCS. Shaw (1999) gives an
inventory of events that might populate a Presidential campaign and outlines alternative time paths of effect. However, this is a primer on shapes to
look for; no real theory distinguishes the various paths. Hill, Lo, Vavreck, and
Zaller (2013) stake a strong claim for a particular path for impact from campaign ads, and this may be a starting point for thinking about how discrete
events, such as debates, might play out.
For the most part, however, we lack a strong basis on which to stake
dynamic claims. In statistics, this problem has spawned an enormous
literature on nonparametric smoothing techniques, such as the roughness
penalty approach (Eubank, 1999; Green & Silverman, 1994), kernel smoothing (Bowman & Azzalini, 1997; Härdle, 1990), local polynomial modeling
(Fan & Gijbels, 1996), the least squares spline approach (Smith, 1979), and
the free-knot approach of Mao and Zhao (2003). Examples in political data
are still rare but already the field shows variety. Brady and Johnston (2006)
discuss this “bias versus sampling variability” problem in the context of
daily cross-sections. They recommend a kernel smoothing approach and an
optimum smoothing criterion based on variances in the variable of interest.
Brady and Kaplan (2012, 2012) use least squares regression splines (Smith,
1979) with pre-chosen knots to estimate changes in public opinion in RCS.
Matthews and Johnston (2010) propose a semi-parametric component for the
economy that uses cubic smoothing splines with fixed equivalent degrees
of freedom. For a similar problem, Johnston and Partheymüller (2012) use
thin-plate regression splines (Wood, 2003) which estimate knot locations
and flexible functional forms. Four papers, four approaches. We hope that
in the coming years experts in nonparametric regression turn their attention
to RCS data, to suggest better ways of choosing knot locations, equivalent
degrees of freedom, and, indeed, the smoothing method itself.
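The trade-off between bias and sampling variability can be seen in a small simulation. The sketch below applies a Gaussian-kernel (Nadaraya-Watson) smoother to simulated daily proportions and compares three bandwidths against the known simulated truth. It is a generic illustration of kernel smoothing, not the optimum smoothing criterion of Brady and Johnston (2006); the campaign scenario, sample size, and bandwidths are invented, and with real data the truth is of course not available for comparison.

```python
import math
import random

def kernel_smooth(days, values, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel, evaluated at each day."""
    smoothed = []
    for t in days:
        weights = [math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in days]
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / sum(weights))
    return smoothed

def rmse(estimate, truth):
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))

# Simulated campaign: true support jumps from 40% to 48% on day 30
random.seed(0)
days = list(range(60))
truth = [0.40 if d < 30 else 0.48 for d in days]
n_per_day = 75  # interviews per day, as in a small daily tracking poll
observed = [sum(random.random() < p for _ in range(n_per_day)) / n_per_day for p in truth]

for h in (0.5, 3.0, 30.0):
    error = rmse(kernel_smooth(days, observed, h), truth)
    print(f"bandwidth {h:>4}: RMSE against the simulated truth = {error:.4f}")
```

A very small bandwidth leaves most of the sampling noise in place, while a very large one flattens the jump on day 30; an intermediate bandwidth does better on both counts in this setup.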
Smoothing issues apply not just to the description of population
and subpopulation trends over time. They also enable estimation of
associations—and, critically, changes in the associations—among variables
over time. How, for example, are public attitudes and leadership approval
related to one another? How do events affect public attitudes? How do
changes in employment or marital status affect the up-take of social welfare
programs? How do changes in health status affect political participation,
employment, marriage, education, or a host of other things? What is
correlated with changes in party identification over time?
Before we can answer these questions we must get a better grip on a
number of issues: How can we build models which simultaneously include
covariates and smooth the data? How can we separate cross-sectional
association (e.g., party identification with policy attitudes) from time series
association (changes in party identification with changes in policy attitudes)?
Johnston and Brady (2002) propose a method for separating time-series
from cross-sectional effects that starts with a specific model of opinion
change. Earlier, these same authors (Johnston et al., 1992) proposed a simple
step function representation of varying parameters. Linear interactions of
factors with time have been ventured (e.g., Bartels, 2006a). Mebane and
Wand (1997) seem to be the first to try to extract patterns in individual
transitions from early US and Canadian RCS campaign data. Pelzer, Eisinga,
and Franses (2002) built on work by Moffitt (1993) and Franklin (1989) to
take this further. In truth, we do not have plausible theories for the exact
time path of varying coefficients. This points, again, to debates over how
smoothing methods help identify how true temporal change in one variable
relates to true change in another one. Eubank et al. (2004) use a smoothing
spline to estimate a varying coefficient model. The Brady and Kaplan (2012),
Johnston and Partheymüller (2012), and Matthews and Johnston (2010)
references cited earlier include covariates in their models as interaction terms
with time. All these approaches assume an absence of autocorrelation across
successive days, an assumption forced by the fact that the data are, after
all, not panels. Lebo and Weber (2014) propose a solution that involves
multi-level modeling. The strengths and weaknesses of these methods need
to be assessed, and new methods must be developed which will allow for
the flexible modeling of temporal trends and the incorporation of covariates
in models.
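The simplest of the varying-coefficient ideas mentioned above, a step function in time, can be sketched as follows: the campaign is cut into blocks at pre-chosen dates and a separate regression slope is estimated within each block. This is written in the spirit of the step-function representation attributed above to Johnston et al. (1992), not as a reproduction of their model; the breakpoint, effect sizes, and variable names are hypothetical.

```python
import random

def ols_slope(x, y):
    """Slope from a bivariate least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def slopes_by_block(days, x, y, breakpoints):
    """Estimate a separate slope within each block of days.

    breakpoints: sorted days at which a new block starts, e.g. [20] gives
    one block for day < 20 and one for day >= 20.
    """
    edges = [float("-inf")] + list(breakpoints) + [float("inf")]
    slopes = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = [i for i, d in enumerate(days) if lo <= d < hi]
        slopes.append(ols_slope([x[i] for i in idx], [y[i] for i in idx]))
    return slopes

# Simulated daily cross-sections: the effect of an issue position x on a
# candidate evaluation y rises from 0.2 to 0.8 after a debate on day 20.
random.seed(1)
days, x, y = [], [], []
for day in range(40):
    true_slope = 0.2 if day < 20 else 0.8
    for _ in range(50):  # a fresh sample of 50 respondents each day
        xi = random.uniform(-1, 1)
        days.append(day)
        x.append(xi)
        y.append(true_slope * xi + random.gauss(0, 1))

before, after = slopes_by_block(days, x, y, breakpoints=[20])
print(f"slope before debate: {before:.2f}, after: {after:.2f}")
```

The approach is transparent but requires the analyst to know where the steps fall, which is exactly the kind of prior theory the text notes is usually missing.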
CHANGES IN THE UNIVERSE
Most of the discussion so far assumes that comparisons are across basically
unchanging universes of persons and content. The issues are ones of error, in
sampling or in measurement. The longer the time span for the study, however, the more the analysis engages issues in the comparability of populations and of attitude domains.
UNIVERSE OF PEOPLE
If we can assume that RCS surveys draw their sample from the same population over time, then the impact of an explanatory variable on different groups
in the population (say the impact of a change in wages on female labor supply or the impact of a change in unemployment on presidential popularity
among men) can be determined by looking at what happens to the dependent
variables (female labor supply in the first instance and presidential popularity among men in the second) from one period to the next as changes occur
in wages or unemployment. In effect, the fixed characteristic of a person (in
this case his or her sex) is used to create a “pseudo-panel” of similar people
whose reactions can be measured by conditioning on these characteristics.
Tellingly, a pseudo-panel is sometimes called a synthetic cohort.
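Constructing a pseudo-panel amounts to collapsing independent cross-sections into group-by-period cells and then following the cell means over time. The sketch below does this for a fixed characteristic; the field names and the eight invented respondents are purely illustrative.

```python
from collections import defaultdict

def pseudo_panel(records, group_key, period_key, outcome_key):
    """Collapse individual records into group-by-period means.

    records: list of dicts, one per respondent, drawn from independent cross-sections.
    Returns {group: {period: mean outcome}}.
    """
    sums = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    for r in records:
        cell = sums[r[group_key]][r[period_key]]
        cell[0] += r[outcome_key]  # running total of the outcome
        cell[1] += 1               # number of respondents in the cell
    return {g: {t: s / n for t, (s, n) in sorted(periods.items())}
            for g, periods in sums.items()}

# Hypothetical respondents: different people in each year, grouped by sex
records = [
    {"year": 2000, "sex": "F", "employed": 1}, {"year": 2000, "sex": "F", "employed": 0},
    {"year": 2000, "sex": "M", "employed": 1}, {"year": 2000, "sex": "M", "employed": 1},
    {"year": 2001, "sex": "F", "employed": 1}, {"year": 2001, "sex": "F", "employed": 1},
    {"year": 2001, "sex": "M", "employed": 1}, {"year": 2001, "sex": "M", "employed": 0},
]
panel = pseudo_panel(records, "sex", "year", "employed")
for group, series in panel.items():
    print(group, series)
```

Each group's series can then be treated like a single panel unit, although every cell mean carries its own sampling error because it is estimated from a different set of people.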
However, this method only works if the populations of each group stay
the same from one period to the next, and this assumption is surely wrong,
since day by day people age so that new people enter into the eligible
population (by being born or by turning 18) and people die. In addition,
the way in which a factor affects the dependent variable may change from
one period to the next or its impact may depend upon the cohort of people
exposed to the factor (and these cohorts might change because of emigration
or immigration). These changes are probably negligible for relatively short
time periods, maybe even as long as several years, but at some point they
must be confronted. In sum, as time spans expand, we can no longer pretend
that an RCS is a single synthetic cohort; “Age-Period-Cohort” problems
must be confronted directly. A thought experiment illustrates the problem: in
the 1980s, one could give a quite complete account of US politics and yet say
nothing about Hispanics; could one imagine doing so today? Any attempt
to track issue evolution over this span must allow for this transformation in
the electorate’s coalition possibilities (Shafer & Spady, 2014).

Certainly any study of long-term trends must take into account these kinds
of changes that stem from changes in age, the impacts of the periods in which
people live, and the cohorts of which they are a member. Sociologists have
done the most to consider these factors. The classic work is Mason and Fienberg (1985), which considered ways to deal with the basic identification problem created by the fact that one’s age (say 50) in a given period (say the year
1965) equals period minus cohort as defined by year of birth (1915). More
recently, Yang Yang and Kenneth Land (2006) and Yang Yang (2006) have
developed mixed (fixed and random effects) models of trend data in RCS
surveys (e.g., the General Social Survey) which identify and estimate age,
period, and cohort components of change. Another example of this line of
research is Devereux’s (2007) analysis of biases in synthetic cohort models.
These papers suggest ways to deal with the non-fixity of the underlying population, and they should be further developed so that they can be combined
with analyses that go beyond Age-Period-Cohort analysis.
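The identification problem can be verified directly. Because age equals period minus cohort for every respondent, adding a constant k to the linear age and cohort effects and subtracting it from the period effect leaves every prediction unchanged, so the data cannot tell the two sets of effects apart. The sketch below checks this numerically; the effect sizes and the value of k are arbitrary.

```python
def apc_prediction(age, period, cohort, b_age, b_period, b_cohort, intercept=0.0):
    """Linear age-period-cohort prediction."""
    return intercept + b_age * age + b_period * period + b_cohort * cohort

# Respondents in repeated cross-sections: age = period - cohort by construction
cases = [(period - cohort, period, cohort)
         for period in (1965, 1985, 2005)
         for cohort in (1915, 1935, 1955)
         if period - cohort >= 18]  # adults only

# Two different sets of linear effects (arbitrary illustrative values)
k = 0.7
effects_a = dict(b_age=0.10, b_period=0.02, b_cohort=-0.03)
effects_b = dict(b_age=0.10 + k, b_period=0.02 - k, b_cohort=-0.03 + k)

max_gap = max(abs(apc_prediction(*c, **effects_a) - apc_prediction(*c, **effects_b))
              for c in cases)
print(f"largest difference in predictions: {max_gap:.2e}")
```

The difference is zero up to floating-point rounding, which is why additional constraints or modeling assumptions are needed before age, period, and cohort components can be separated.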
UNIVERSE OF CONTENT
The issues flagged above for the comparability of survey questions are compounded as time spans several decades. Some issue domains simply have no
counterparts across the periods. Even persistent domains—race relations in
the United States, to take an example—change their surface content. Questions from the 1950s about, say, school integration and rights to sit at lunch
counters, would be almost incomprehensible to current respondents. The
usual response is to go minimal and look for questions with maximum correspondence, often focusing on single items (Carmines & Stimson, 1989; Shafer
& Johnston, 2006). Attempts to model the landscape more broadly do not
have this luxury, and analysts are shifting to combinations of exploratory and
confirmatory factor analyses (Claggett & Shafer, 2010) and to item-response
theory (Shafer & Spady, 2014). Once again, better theories about how attitudes are affected by social and cultural change over time might be very
useful (MacKinnon & Luke, 2002).
CAUSAL INFERENCE
Describing trends and estimating associations between variables are important scientific tasks, but we often want to estimate causal impacts as well.
This takes us back to where we started. True panel data have the advantage
that we can control for fixed effects, add lagged endogenous variables, and
develop dynamic models of change (Bartels, 2006b; Hsiao, 2003), which can
eliminate many threats to making reliable inferences. The pseudo-panel
approach cannot do these things directly, but RCS surveys have some
advantages over panels. Their temporal granularity makes it possible to
detect changes that might be missed by panels spaced far apart (Brady &
Johnston, 2006). They are less likely to suffer from panel attrition which can
create selection bias problems for panel studies. Moreover, the sheer number
of repeated temporal data points (sometimes in the hundreds compared to
panel studies which rarely have more than five waves) makes it possible to
study the impact of putative explanatory variables both as they go up and
as they go down, which provides more leverage for causal inference.
Starting with Deaton (1985), econometricians have developed and
expanded on the pseudo-panel concept. After Franklin’s (1989) and Moffitt’s
(1993) papers made some conceptual breakthroughs, many others have
followed (Collado, 1997; Pelzer, Eisinga, & Franses, 2002, 2005; Ridder &
Moffitt, 2007; Verbeek, 2008; Verbeek and Nijman, 1993; Verbeek & Vella,
2005). Athey and Imbens (2006; see also Abadie, 2005; Manski & Pepper,
2012) used the Neyman-Rubin-Holland counterfactual outcomes approach
to develop a very general framework for thinking about “nonparametric
identification, estimation, and inference for the average effect of the treatment for settings where RCS of individuals are observed in a treatment and a
control group, before and after the treatment” (p. 432). Even if RCS designs can
be sensitive to fine temporal distinctions, the patterns they identify can be
consistent with quite different micro-mechanisms. Lenz (2009), for example,
notes that shifts in coefficients that Johnston et al. (1992, 2004) attribute to
priming are observationally equivalent to effects from learning and opinion
change. He deploys true panels to identify the mechanisms.
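The setting described in the quotation, with a treatment and a control group each observed before and after, can be illustrated by the standard linear difference-in-differences estimator computed from independent samples. This is the textbook estimator, not the nonlinear estimator developed by Athey and Imbens (2006), and it relies on the usual assumptions that the groups would have followed parallel trends and that group composition is stable. The data are invented.

```python
def mean(values):
    return sum(values) / len(values)

def did_repeated_cross_sections(records):
    """Difference-in-differences from independent samples.

    records: iterable of (treated, post, outcome) with treated and post in {0, 1}.
    Different people may appear in the pre and post samples.
    """
    cells = {(g, t): [] for g in (0, 1) for t in (0, 1)}
    for treated, post, outcome in records:
        cells[(treated, post)].append(outcome)
    change_treated = mean(cells[(1, 1)]) - mean(cells[(1, 0)])
    change_control = mean(cells[(0, 1)]) - mean(cells[(0, 0)])
    return change_treated - change_control

# Hypothetical outcomes; cell sizes differ because each wave is a fresh sample
records = (
    [(0, 0, v) for v in (0.40, 0.50, 0.60)] +   # control, before
    [(0, 1, v) for v in (0.55, 0.65)] +         # control, after
    [(1, 0, v) for v in (0.45, 0.55)] +         # treated, before
    [(1, 1, v) for v in (0.70, 0.80, 0.90)]     # treated, after
)
effect = did_repeated_cross_sections(records)
print(f"difference-in-differences estimate: {effect:.2f}")
```

Only the four cell means enter the estimate, which is why no respondent needs to be observed twice.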
For high-frequency RCS designs, campaign studies for example, it is
possible to combine repetition of cross-sections with true panels. Canadian
Election Studies are, in one sense, just pre-post panels, with the first wave
released in a rolling manner. Even this simple design gives causal leverage,
provided the key questions appear at each wave (Johnston & Partheymüller,
2012; Lenz, 2009). More elaborate designs, including a proper baseline,
would be more effective for inference, if at the cost of representativeness.
And such designs are starting to appear, thanks to the power and flexibility
of the online mode. Goldman (2012) uses the five-wave RCS-panel combination, part of the 2008 National Annenberg Election Survey (http://www.annenbergpublicpolicycenter.org/political-communication/naes/), for
an account of racial politics. Faas and Blumenberg (2013) apply the
design to election and referendum campaigns in the German state of
Baden-Württemberg.

CONCLUSIONS
As mentioned earlier, RCS have been with us for a long time. With hindsight,
we now see the commonalities—but also the distinctions—among such data
sets. The critical groundwork has been divided among economists, sociologists, political scientists, and others, with the effect that one disciplinary
group proceeds largely in ignorance of the others. Now we are poised to
learn from each other. The possibilities for future discoveries are exciting.
No less exciting are the opportunities to repurpose data and learn about our
past.
REFERENCES
Abadie, A. (2005). Semiparametric difference-in-differences estimators. The Review of
Economic Studies, 72, 1–19.
Athey, S., & Imbens, G. W. (2006). Identification and inference in nonlinear
difference-in-differences models. Econometrica, 74, 431–497.
Bartels, L. M. (1988). Presidential primaries and the dynamics of public choice. Princeton,
NJ: Princeton University Press.
Bartels, L. M. (2006a). Priming and persuasion in presidential campaigns. In H. E.
Brady & R. Johnston (Eds.), Capturing campaign effects (pp. 78–112). Ann Arbor:
University of Michigan Press.
Bartels, L. M. (2006b). Three virtues of panel data for the analysis of campaign effects.
In H. E. Brady & R. Johnston (Eds.), Capturing campaign effects (pp. 134–163). Ann
Arbor: University of Michigan Press.
Berinsky, A., Powell, E. N., Schickler, E., & Yohai, I. (2011). Revisiting Public Opinion
in the 1930s and 1940s. PS: Political Science and Politics, 44, 515–520.
Berinsky, A., & Schickler, E. (2010). Collaborative Research: The American Mass
Public in the 1930s and 1940s. National Science Foundation, Political Science Program Grant, http://web.mit.edu/berinsky/www/nsf.pdf (accessed 10 September 2012).
Berinsky, A. (2006). American public opinion in the 1930s and 1940s: The analysis of
quota-controlled sample survey data. The Public Opinion Quarterly, 70, 499–529.
Bowman, A., & Azzalini, A. (1997). Applied smoothing techniques for data analysis: The
kernel approach with S+ illustrations. Oxford, England: Clarendon Press.
Brady, H. E., & Johnston, R. (1987). What’s the primary message: Horse race or issue
journalism? In G. R. Orren & N. W. Polsby (Eds.), Media and momentum: The New
Hampshire primary and nomination politics (pp. 127–186). Chatham, NJ: Chatham House.
Brady, H. E., & Johnston, R. (2006). The rolling cross-section and causal attribution.
In H. E. Brady & R. Johnston (Eds.), Capturing campaign effects (pp. 164–195). Ann Arbor: University of Michigan Press.
Brady, H. E., & Kaplan, C. (2012). A Least-Squares Spline Method for Identifying Trends
in Participation in Informal Political Groups in the Soviet Union from January 1989 to
January 1992 Using a New Rolling Cross-Section Data Set. Paper presented at the
Midwest Political Science Association Annual Meeting, April 2012, Chicago, IL.

Brady, H. E., & Kaplan, C. (2012). Political Opinion in the Collapse of the USSR: A
Reassessment Twenty Years Later Using a New Consolidated and Linked Data Set. Paper
presented at the American Political Science Association Annual Meeting, August
2012, New Orleans, LA.
Carmines, E. G., & Stimson, J. A. (1989). Issue evolution: Race and the transformation of
American politics. Princeton, NJ: Princeton University Press.
Claggett, W. J. M., & Shafer, B. E. (2010). The American public mind: The issue structures
of mass politics in the postwar years. Cambridge, England: Cambridge University
Press.
Collado, D. M. (1997). Estimating dynamic models from time series of independent
cross-sections. Journal of Econometrics, 82, 37–62.
Deaton, A. (1985). Panel data from time series of cross-sections. Journal of Econometrics, 20, 109–126.
Devereux, P. J. (2007). Small-sample bias in synthetic cohort models of labor supply.
Journal of Applied Econometrics, 22, 839–848.
Eubank, R. L. (1999). Nonparametric regression and spline smoothing (2nd ed.). New
York, NY: Marcel Dekker.
Eubank, R. L., Huang, C., Maldonado, M., Wang, N., Wang, S., & Buchanan, R. J.
(2004). Smoothing spline estimation in varying-coefficient models. Journal of the
Royal Statistical Society, Series B: Statistical Methodology, 66, 653–667.
Faas, T., & Blumenberg, J. N. (2013). Measuring Dynamics: A Rolling Panel Study in the
run-up to the Baden-Wuerttemberg state election 2011. Presented to the 71st Annual
Conference of the Midwest Political Science Association, Chicago, IL.
Fan, J., & Gijbels, I. (1996). Local polynomial modelling and its applications. London, England: Chapman and Hall.
Franklin, C. H. (1989). Estimation across data sets: Two-stage auxiliary instrumental
variables estimation (2SAIV). Political Analysis, 1, 1–23.
Gelman, A. (2007). Struggles with survey weighting and regression modeling. Statistical Science, 22, 153–164.
Goldman, S. K. (2012). Effects of the 2008 Obama presidential campaign on white
racial prejudice. Public Opinion Quarterly, 76, 663–687.
Green, P. J., & Silverman, B. W. (1994). Nonparametric regression and generalized linear
models: A roughness penalty approach. London, England: Chapman and Hall.
Hardle, W. (1990). Applied nonparametric regression. Cambridge, England: Cambridge
University Press.
Hill, S., Lo, J., Vavreck, L., & Zaller, J. (2013). How quickly we forget: The duration of persuasion effects from mass communication. Political Communication, 30,
521–547.
Honaker, J., & King, G. (2010). What to do about missing values in time-series
cross-section data. American Journal of Political Science, 54, 561–581.
Hsiao, C. (2003). Analysis of panel data (2nd ed.). New York, NY: Cambridge University Press.
Johnston, R., Blais, A., Brady, H. E., & Crête, J. (1992). Letting the people decide: Dynamics of a Canadian election. Stanford, CA: Stanford University Press.

Johnston, R., & Brady, H. E. (2002). The rolling cross-section design. Electoral Studies,
21, 283–295. Reprinted in Mark N. Franklin and Christopher Wlezien (editors), The
Future of Election Studies. Oxford, England: Pergamon Press.
Johnston, R., Hagen, M. G., & Jamieson, K. H. (2004). The 2000 presidential election and
the foundations of party politics. Cambridge, England: Cambridge University Press.
Johnston, R., & Partheymüller, J. (2012). Campaign Activation in German Elections: Evidence from 2005 and 2009. Paper presented at the American Political Science Association Annual Meeting, August 2012, New Orleans, LA.
Lau, R. R. (1994). An analysis of the accuracy of “Trial Heat” polls during the 1992
presidential election. Public Opinion Quarterly, 58, 2–20.
Lebo, M., & Weber, C. (2014). An effective approach to the repeated cross sectional
design. American Journal of Political Science. doi:10.1111/ajps.12095.
Lenz, G. (2009). Learning and opinion change, not priming: Reconsidering the evidence for the priming hypothesis. American Journal of Political Science, 53, 821–837.
Manski, C. F., & Pepper, J. V. (2012). Partial Identification of the Treatment Response with
Data on Repeated Cross Sections. Working Paper, http://faculty.wcas.northwestern.edu/~cfm754/tr_rcs.pdf (accessed 10 September 2012).
MacKinnon, N. J., & Luke, A. (2002). Changes in identity attitudes as reflections of
social and cultural change. The Canadian Journal of Sociology, 27(3), 299–338.
Mao, W., & Zhao, L. H. (2003). Free-knot polynomial splines with confidence intervals. Journal of the Royal Statistical Society, 65(4), 901–919.
Mason, W., & Fienberg, S. (Eds.) (1985). Cohort analysis in social research: Beyond the
identification problem. New York, NY: Springer Verlag.
Matthews, J. S., & Johnston, R. (2010). The campaign dynamics of economic voting.
Electoral Studies, 29, 13–24.
Mebane, W. R., & Wand, J. (1997). Markov Chain Models for Rolling Cross-section Data:
How Campaign Events and Political Awareness Affect Vote Intentions and Partisanship in the United States and Canada. Presented to the Annual Meeting of the
Midwest Political Science Association, Chicago, IL. http://polmeth.wustl.edu/mediaDetail.php?docId=446 (accessed 10 September 2012).
Moffitt, R. (1993). Identification and estimation of dynamic models with a time series
of repeated cross-sections. Journal of Econometrics, 59, 99–124.
Pelzer, B., Eisinga, R., & Franses, P. H. (2002). Inferring transition probabilities from
repeated cross sections. Political Analysis, 10, 113–133.
Pelzer, B., Eisinga, R., & Franses, P. H. (2005). ‘Panelizing’ repeated cross sections:
Female labor force participation in the Netherlands and West Germany. Quality &
Quantity, 39, 155–174.
Ridder, G., & Moffitt, R. (2007). The econometrics of data combination. In J. J. Heckman & E. E. Leamer (Eds.), Handbook of econometrics (Vol. 6, 1st ed. Chapter 75).
Shafer, B. E., & Johnston, R. (2006). The end of southern exceptionalism: Class, race, and
partisan change in the postwar south. Cambridge, MA: Harvard University Press.
Shafer, B. E., & Spady, R. H. (2014). The American political landscape. Cambridge, MA:
Harvard University Press.
Shaw, D. R. (1999). A study of presidential campaign effects from 1952 to 1992. Journal
of Politics, 61, 387–422.

Smith, P. L. (1979). Splines as a useful and convenient statistical tool. American Statistician, 33, 57–62.
Stimson, J. A. (1999). Public opinion in America: Moods, cycles, and swings (2nd ed.).
Boulder, CO: Westview.
Verbeek, M. (2008). Pseudo panels and repeated cross-sections. In L. Mátyás & P.
Sevestre (Eds.), The econometrics of panel data: Fundamentals and recent developments
in theory and practice (3rd ed., pp. 369–383). Berlin, Germany: Springer.
Verbeek, M., & Nijman, T. (1993). Minimum MSE estimation of a regression model
with fixed effects from a series of cross-sections. Journal of Econometrics, 59,
125–136.
Verbeek, M., & Vella, F. (2005). Estimating dynamic models from repeated cross-sections. Journal of Econometrics, 127, 83–102.
Wood, S. N. (2003). Thin plate regression splines. Journal of the Royal Statistical Society,
Series B: Statistical Methodology, 65, 95–114.
Yang, Y. (2006). Bayesian inference for hierarchical age-period-cohort models of
repeated cross-section survey data. Sociological Methodology, 36, 39–74.
Yang, Y., & Land, K. C. (2006). A mixed models approach to the age-period-cohort
analysis of repeated cross-section surveys, with an application to data on trends
in verbal test scores. Sociological Methodology, 36, 75–97.

HENRY E. BRADY SHORT BIOGRAPHY
Henry E. Brady is Dean of the Goldman School of Public Policy and Class of
1941 Monroe Deutsch Professor of Political Science and Public Policy at the
University of California, Berkeley. He received his PhD in Economics and
Political Science from MIT in 1980, and he has written extensively on political methodology. He is coauthor or coeditor of nine books including coeditor
of Rethinking Social Inquiry (2004), Capturing Campaign Effects (2006), and the
Oxford Handbook of Political Methodology (2008). He has been president of the
American Political Science Association, president of the Political Methodology Society, and director of the University of California’s Survey Research
Center from 1998 to 2009. He was elected a Fellow of the American Academy
of Arts and Sciences in 2003, a Fellow of the American Association for the
Advancement of Science in 2006, and a Fellow of the Political Methodology
Society in 2008. He received the Career Achievement Award of the Political
Methodology Society in 2012.
RICHARD JOHNSTON SHORT BIOGRAPHY
Richard Johnston (PhD Stanford) is Professor of Political Science and Canada
Research Chair in Public Opinion, Elections, and Representation at the University of British Columbia. He has also taught at the University of Toronto,
the California Institute of Technology, Harvard University (Mackenzie King
chair, 1994–1995), and the University of Pennsylvania. He was an Associate
Member of Nuffield College, Oxford, a Marie Curie Research Fellow at the
European University Institute, and official visitor at MZES, Mannheim. He
is the author or coauthor of five books, three on Canadian politics and two
on US Politics. He has coedited three other books and has written numerous
articles and book chapters. In 2007–2008 he was President of the Canadian
Political Science Association. He was Principal Investigator of the 1988 and
1992–1993 Canadian Election Studies and Research Director for the National
Annenberg Election Survey (Penn), 2000–2008. Much of his work focuses on
elections and public opinion, with special reference to the role of mass communications and campaigns.
RELATED ESSAYS
Social Epigenetics: Incorporating Epigenetic Effects as Social Cause and
Consequence (Sociology), Douglas L. Anderton and Kathleen F. Arcaro
To Flop Is Human: Inventing Better Scientific Approaches to Anticipating
Failure (Methods), Robert Boruch and Alan Ruby
Ambulatory Assessment: Methods for Studying Everyday Life (Methods),
Tamlin S. Conner and Matthias R. Mehl
Models of Nonlinear Growth (Methods), Patrick Coulombe and James P.
Selig
Quantile Regression Methods (Methods), Bernd Fitzenberger and Ralf
Andreas Wilke
The Evidence-Based Practice Movement (Sociology), Edward W. Gondolf
Meta-Analysis (Methods), Larry V. Hedges and Martyna Citkowicz
The Use of Geophysical Survey in Archaeology (Methods), Timothy J.
Horsley
Network Research Experiments (Methods), Allen L. Linton and Betsy Sinclair
Longitudinal Data Analysis (Methods), Todd D. Little et al.
Structural Equation Modeling and Latent Variable Approaches (Methods),
Alex Liu
Data Mining (Methods), Gregg R. Murray and Anthony Scime
Remote Sensing with Satellite Technology (Archaeology), Sarah Parcak
Quasi-Experiments (Methods), Charles S. Reichard
Digital Methods for Web Research (Methods), Richard Rogers
Virtual Worlds as Laboratories (Methods), Travis L. Ross et al.
Modeling Life Course Structure: The Triple Helix (Sociology), Tom Schuller
Content Analysis (Methods), Steven E. Stemler
Person-Centered Analysis (Methods), Alexander von Eye and Wolfgang
Wiedermann
Translational Sociology (Sociology), Elaine Wethington

Repeated Cross-Sections in Survey Data
HENRY E. BRADY and RICHARD JOHNSTON

Abstract
Examples of repeated cross-sections (RCS) include daily tracking polls of political
opinions during campaigns, monthly Current Population Surveys of unemployment,
yearly national health interview surveys, and quadrennial election studies of presidential voting. Each iteration is a distinct sample, as opposed to panels in which the
same people are interviewed two or more times. By asking the same questions on
repeated survey samples from the same population, RCS studies allow us to track
trends and to establish causal inferences. One analytic challenge is to maintain both
the representativeness and the comparability of samples as fieldwork methods or
sources change. The longer the span covered by an RCS, the likelier it is that the
universe will change. For an RCS spanning decades, populations can change in fundamental ways. The universe of content also changes, as issues of one period are
redefined or even rendered irrelevant in another. Extracting trends from RCS data
typically requires smoothing to separate signal from noise, especially where samples
or subsamples are small, but this can lead to bias due to excessive smoothing or to
mistaking noise for signal because of sampling variability when there is not enough
smoothing. By deploying time the RCS design enables certain kinds of causal inference, but many alternative micro-processes are observationally equivalent, and so
the RCS benefits from being combined with the panel design.

INTRODUCTION
Repeated cross-sections (RCS) have been with us for decades. They appeared
as soon as an initial sample survey was followed by a second one that used
the same questions for a sample of the same population. Only in recent years
have we come to recognize the importance and utility of these data. Knowing about trends has intrinsic value, and for many indicators this requires
broadly comparable mass survey data repeated over time. And time is absolutely critical for establishing causal inferences that are the gold standard for
good science.
For high-quality causal inferences from observational data, the starting
point is typically the panel survey, where the same persons are interviewed
repeatedly so that individual change can be tracked and individual differences controlled. However, panels bring costs as well as benefits and are rarely
available in key historical moments. In their stead, we commonly resort to
pseudo-panels, or RCS.

Emerging Trends in the Social and Behavioral Sciences. Edited by Robert Scott and Stephen Kosslyn.
© 2015 John Wiley & Sons, Inc. ISBN 978-1-118-90077-2.

At the microscopic extreme is daily sampling, increasingly common in the
study of electoral campaigns. Early studies of primary-election dynamics
(Bartels, 1988; Brady & Johnston, 1987) relied heavily on a 1984 weekly
RCS conducted as part of the American National Election Study (ANES).
Johnston, Blais, Brady, and Crête (1992) narrowed the focus to daily variation
in their pathbreaking study of a Canadian election campaign. This became
the model for the massive National Annenberg Election Study (Johnston,
Hagen, & Jamieson, 2004). New modes, especially the maturation of
Web-based surveys, and generally falling fieldwork costs are enabling more
researchers to get in on the action. The distinctive claim of daily sampling is
granularity. This is critical if, for example, shifts in preference or perception
are to be attributed to campaign events. But the logic of RCS extends well
beyond a daily or weekly time scale, as we show in the next section. Monthly
or quarterly surveys are a staple for government statistical agencies. Even
annual or quadrennial surveys are now accumulating enough waves for
dynamic analysis.
At the same time, we now recognize that initially unrelated surveys can
be lined up over time and analyzed together, especially as we think through
item equivalence and missing data problems. Brady and Kaplan (2012, 2012),
for example, assembled weekly and bi-weekly data from various sources for
a microscopic analysis of opinion in the collapse of the Soviet Union. Berinsky, Powell, Schickler, and Yohai (2011) portray opinion shifts in the 1930s
and 1940s in America, in some cases on a monthly time scale. Stimson (1999)
takes virtually the entire postwar period as his time span for annual readings
of Americans’ policy mood.
If the incorporation of time is a distinctive feature of the RCS, the scale of
time critically affects both the conduct of fieldwork and the analysis of data
after the fact. The shorter the time units, the greater the burden on design and
execution of the sample to ensure daily samples that are truly random. The
longer the overall temporal span, the greater the burden on analysis to deal
with changes in the composition of the population.
WHAT ARE REPEATED CROSS-SECTIONAL SURVEYS?
RCS surveys involve the repeated administration of the same (or similar)
questions to a sample from the same (or a similar) survey population. Unlike
panel surveys, the same people are not necessarily interviewed in every survey; instead, typically a new sample is drawn each time the survey is fielded.
This approach avoids the costs of tracking people, eliminates the priming and
learning that may carry over from previous interviews, and ensures the representativeness of samples by avoiding attrition. Although RCS do not have
the power of equivalently sized panels, this is usually more than compensated for by their larger size. Although it is easiest to think in terms of sample
surveys, some of the logic extends to firms and organizations and even to successive Congresses or Supreme Court terms (Lebo & Weber, 2014), although
as we move away from sample surveys we enter the territory of “unbalanced
panels” (Honaker & King, 2010) in which some entities fall in and out of the
population.
Table 1 describes some of the major RCS data sets that come from political
science, economics, sociology, demography, and public health—although
the emphasis is on political science data. In every case, there are enough
cross-sections that time-series analysis can be undertaken. Even the quadrennial ANES Presidential surveys now comprise 17 temporal observations
from 1948 to 2012, and the new American Community Survey already has
eight yearly observations. For many of the other surveys there are hundreds
of time-series observations, and often parallel time series for geographic
sub-aggregates such as states.
From Table 1 it emerges that RCS studies can differ in three fundamental
ways:
• Time between Cross-Sections. The time between repetitions of the survey
can vary by more than three orders of magnitude, from individual days
in some election-campaign tracking studies to 2–4 years (up to 1461 days) in
social change and presidential election studies. This variation in the
period can affect the appropriateness of the questions and the nature of
the sample.
• Number of Cross-Sections. Although the number of cross-sections could
be as few as two, we are generally interested in those cases where there
are enough cross-sections so that temporal trends can be identified and
analyzed. This means that we want at least 10–15 RCS, and preferably
many more. All the surveys listed in Table 1 have seven or more time
periods and the median number is about 80.
• Number of Interviews at Each Time Point. The number of interviews at
each time point also varies by several orders of magnitude, from 75 in
some daily or weekly tracking polls to 60,000 for the monthly Current
Population Survey to 3,000,000 in the annual American Community
Survey. The number of interviews affects the statistical accuracy of the
data and the degree to which it can be broken down by region and
subpopulation.
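
To make the effect of sample size concrete, the following minimal sketch computes an approximate 95% margin of error for an estimated proportion at several of the sample sizes mentioned above. It assumes simple random sampling and a true proportion of 0.5, and it ignores design effects and weighting; the function name `margin_of_error` is ours and is used only for illustration.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion.

    Assumes simple random sampling; clustering and weighting are ignored.
    """
    return z * math.sqrt(p * (1.0 - p) / n)


# Sample sizes per time point taken from the text: a daily tracking poll,
# a typical national survey, the monthly CPS, and the annual ACS.
for n in (75, 2000, 60000, 3000000):
    points = 100 * margin_of_error(n)
    print(f"n = {n:>9,d}: +/- {points:.2f} percentage points")
```

At 75 interviews the margin is roughly 11 percentage points, which is why readings from a single day usually have to be pooled or smoothed before they can support claims about change.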

Table 1
Examples of Time-Series Cross-Sectional Studies

| Time between Cross-Sections | Name of Study | Interviews at Each Time Point | # of Cross-Sections (Time Points) | Panel or Unusual Features? | Subject Matter |
|---|---|---|---|---|---|
| Daily | Canadian Election Studies, 1988 (and subsequent ones) | ≈75 | ≈50 | Post-election re-interview | Public opinion and campaign dynamics |
| Daily | Annenberg National Election Studies (2000, 2004, 2008) | ≈100–250 | ≈365 | Panels inside RCS plus some post-election re-interviews | Public opinion and campaign dynamics |
| Daily | German Longitudinal Election Studies (2005, 2009) | ≈85 in 2005; ≈100 in 2009 | 41 in 2005; 60 in 2009 | Post-election re-interview | Public opinion and campaign dynamics |
| Weekly or bi-weekly | American National Election 1984 Continuous Monitoring | ≈76 | 46 | None | Public opinion and campaign dynamics |
| Weekly or bi-weekly | Soviet Collapse Data Set (1989–1991) | ≈1000–4000 | ≈80 | Irregular spacing | Participation and opinion during collapse of SU |
| Monthly | Roper Social and Political Trends Dataset (1973–1994) | ≈2000 | 207 | None | Political and social participation over time |
| Monthly | American Mass Public in 1930s and 1940s | ≈3000 | 400 | Irregular spacing | Public opinion and voting over time |
| Monthly | Current Population Survey (1940; micro-data from 1962-present) | ≈60,000 | ≈600 | Rotating panels | Employment, poverty, welfare over time |
| Monthly | Survey of Consumer Sentiment (1952; micro-data from 1978-today) | ≈480 | ≈400 | None | Consumer sentiment over time |
| Quarterly | California Field Poll Data (1956–2006) | ≈1000 | 232 | None | Public opinion and voting over time |
| Quarterly | Eurobarometer (2–5 times per year) | 1000+ per country | 39+ | None | Political and social trends, attitudes to EU |
| Quarterly | US Consumer Expenditure Survey (1980 to present) | ≈7000 | ≈120 | Diary component | Consumer expenditures over time |
| Yearly or biennially | General Social Survey (1972–2010) | ≈2000 | 27 | First annual, now 2-years | Social trends over time |
| Yearly or biennially | American National Election Studies (1956–2004), every 2 years | ≈1000–2000 | 25 | Some panels embedded | Political trends and elections |
| Yearly or biennially | Integrated National Health Interview Surveys (1963–2011) | ≈80,000 | 49 | Continuous sampling | Health of population |
| Yearly or biennially | American Community Survey (2006-present) | ≈3,000,000 | 7 | Continuous sampling | Demographics, poverty, institutionalization status, employment |
| Quadrennially | American National Election Presidential Studies (1948–2012) | ≈1000–2000 | 17 | Some panels embedded | Political trends and elections over time |

Depending on how these dimensions are combined, an RCS can be sensitive to different kinds of change. Larger samples with more interviews at
each time point make it possible to distinguish true change from apparent
change due to sampling variability. True change can come in two varieties.
One is where the composition of the population is stable over time, but the
units change their behaviors or attitudes. An obvious example is where
the members of an electorate decide that a candidate is better than they
had originally thought, perhaps based on a performance in a debate. The
other kind of change is compositional, where individuals do not change
their proclivities but are themselves replaced by those with different characteristics. Both kinds of change can coexist in the population, but their
relative prevalence and importance in samples depend on the interval
between cross-sections and the total length of the data collection period.
The shorter the time between cross-sections, the more that true change must
reflect change in individuals’ behavior or attitudes. Apart from sampling
variability, change in the sample’s propensities can have no other source
but the conversion of its component individuals. As the span of the series
increases, on the other hand, attention is forced to compositional change,
which typically takes years to register. Again, the passage of time may
make possible—may even be necessary for—individuals to change their
attitudes or characteristics, but the longer the span the more inevitable it
is that samples will also be drawn from populations comprising different
individuals.
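
The distinction between the two kinds of true change can be illustrated with a standard two-period decomposition. The sketch below is not a method proposed in this essay; it is a textbook shift-share (Kitagawa-style) identity, and the group labels, shares, and means are hypothetical numbers chosen for illustration. The overall change in a population mean is split into a part due to change within groups (conversion) and a part due to change in group shares (composition).

```python
def decompose_change(shares0, means0, shares1, means1):
    """Split the change in an overall mean into within-group and
    compositional components.

    The overall mean in each period is sum(share[g] * mean[g]). Weighting
    each difference by the two-period average of the other factor makes
    the two components add up exactly to the total change.
    """
    within = sum(
        0.5 * (shares0[g] + shares1[g]) * (means1[g] - means0[g])
        for g in shares0
    )
    composition = sum(
        0.5 * (means0[g] + means1[g]) * (shares1[g] - shares0[g])
        for g in shares0
    )
    return within, composition


# Hypothetical example: support for a policy in two birth cohorts,
# observed in two cross-sections taken decades apart.
shares0 = {"older": 0.6, "younger": 0.4}
means0 = {"older": 0.30, "younger": 0.50}
shares1 = {"older": 0.4, "younger": 0.6}
means1 = {"older": 0.35, "younger": 0.50}

within, composition = decompose_change(shares0, means0, shares1, means1)
total0 = sum(shares0[g] * means0[g] for g in shares0)
total1 = sum(shares1[g] * means1[g] for g in shares1)
print(f"total change:  {total1 - total0:.3f}")
print(f"within-group:  {within:.3f}")
print(f"compositional: {composition:.3f}")
```

In this made-up case more than half of the overall shift comes from cohort replacement rather than from anyone changing their mind. Over a span of days or weeks the shares would barely move, so nearly all true change would fall in the within-group term.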
Depending on where a given RCS sits on the dimensions of time unit, overall duration, and sample size per time unit, various challenges must be faced.
Commonly, addressing one challenge may only exacerbate another.
COMPARABILITY VERSUS REPRESENTATIVENESS
COMPARABILITY IN SAMPLES
At the high-frequency extreme of data collection the objective is to compare
one day’s results with those from the next day. To this end, the data collection strategy for each day should be identical, such that differences between
the days are the product of something that has happened in the interval, not
of differences in, say, accessibility or availability of respondents. However,
if we require that the date of interview is the same as the date of release of
the potential respondent’s contact information to the field, the sample will be
very unrepresentative because many respondents will not be reached immediately. It takes time to “clear a sample,” and we know that the longer a poll is
in the field, the better it performs in providing predictions of elections (Lau,
1994). Even where prediction is not the objective it can be a factor in the credibility of the study. To combine granularity with representativeness, the key
is to work equally hard to contact respondents as they are released to the
field and to recognize that after a suitable number of days (perhaps a week),
those people who are interviewed on a given day will constitute a random
sample even though they come from different cohorts released on different
days. This approach was successfully used with the Canadian studies, the
National Annenberg Election Survey, and other studies. This implies careful
management of the release and clearance of the sample and acceptance that
the first days of fieldwork will not be useful for dynamic analysis (although
the cases will be perfectly usable in analyses not involving time). A similar
logic applies if the design involves changes in sampling fractions, although
here model-based weighting can be employed (Brady & Johnston, 2006; Johnston & Brady, 2002).
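
The logic of releasing a fresh replicate each day and working every replicate equally hard can be sketched numerically. The example below is an illustration under simplifying assumptions, not a description of any actual fieldwork protocol: replicates are of equal size, one is released every day, and `contact_prob[k]` is a hypothetical probability that a released case is interviewed k days after release. It computes, for each day of fieldwork, how the interviews completed that day divide between easy-to-reach cases (interviewed soon after release) and harder-to-reach ones.

```python
def interview_mix(day, contact_prob):
    """Shares of the interviews completed on `day` (0-based), broken down by
    the number of days elapsed since each respondent's replicate was released.

    Assumes one equal-sized replicate is released every day from day 0 on.
    """
    max_lag = min(day, len(contact_prob) - 1)
    weights = contact_prob[: max_lag + 1]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical completion probabilities by days since release; here every
# replicate is cleared after five days in the field.
contact_prob = [0.30, 0.20, 0.10, 0.05, 0.05]

for day in range(7):
    mix = interview_mix(day, contact_prob)
    print(day, [round(share, 2) for share in mix])
```

On the first day every interview comes from a same-day contact, so the day's sample over-represents people who are easy to reach. Once the clearance window has passed, each day's interviews contain the same mix of easy and hard cases, which is what makes one day comparable with the next and why the earliest days are set aside in dynamic analyses.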
Even when the intent is to provide a representative sample of some
well-defined population (e.g., the United States), samples may differ over
time because data come from different survey houses, interviewing is added
in another language (e.g., Spanish), one method of interviewing is replaced
by another (e.g., in-person by telephone), or sampling methods change (e.g.,
more sampling points, or shift from clustering to a simple random draw).
For example, in their efforts to put together public opinion datasets on the
American mass public in the 1930s and the 1940s, Berinsky and Schickler
(2010) faced problems stemming from collecting surveys from four different
survey organizations and the widespread use of quota sampling. They
employed several model-based post-stratification methods to make the data
comparable over time (Berinsky, 2006). Similarly, using a series of CBS/New
York Times national polls from the 1988 election campaign, Gelman (2007)
discussed the strengths and limitations of post-stratification weighting in
the context of regression analysis. As Gelman notes, much more work has
to be done to figure out the appropriate statistical methods for solving
these problems. These problems are only compounded by the explosion in
tracking polls and the salience of aggregators, such as FiveThirtyEight (http://fivethirtyeight.com/, accessed 15 November 2014) or Pollster (Huffington Post, http://elections.huffingtonpost.com/pollster, accessed 10 September 2012).
COMPARABILITY OF SURVEY QUESTIONS
The questions asked on RCS often change over time because better versions
are constructed, because times change and questions must be modified,
or simply because different investigators have different beliefs about good
questions. The problems created by these changes range from modifications
at one point in time (e.g., a new way to measure unemployment or a
new way to ask about liberal-conservative identification) to a mélange
of different ways of answering a similar question (e.g., the popularity of
some political figure asked with 3, 5, or 10 point scales and with different
words such as “approve,” “trust,” or “support”). Brady and Kaplan (2012,
2012) approached this problem by simultaneously modeling trends in the
data and the question formats using methods from test theory. Stimson
(1999, Appendix 1) considered an extreme version of this problem in which
he wanted to build factor scores (typically for a “liberal-conservative”
dimension) from dated items with only partially overlapping cases.
BIAS VERSUS VARIABILITY
One of the major reasons for considering RCS is to analyze trends, but in
many cases samples are so small that simply “connecting the dots” risks
confusing sampling variability for real change. Even when there are large
samples, a focus on small sub-populations quickly leads to the same problem. Identifying real change requires smoothing.
But how much? The simplest approach is to estimate a linear trend.
This may have the appeal of visual clarity, but typically it provides far too
much smoothing. The truth is that we commonly lack a theory of events that
would tell us the shape of their impact over time and so guide smoothing, especially for highly granular variants of the RCS. Shaw (1999) gives an
inventory of events that might populate a Presidential campaign and outlines alternative time paths of effect. However, this is a primer on shapes to
look for; no real theory distinguishes the various paths. Hill, Lo, Vavreck, and
Zaller (2013) stake a strong claim for a particular path for impact from campaign ads, and this may be a starting point for thinking about how discrete
events, such as debates, might play out.
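
The trade-off between too little and too much smoothing can be seen in a small simulation. The sketch below uses a generic Gaussian-kernel (Nadaraya-Watson) smoother, which is one common choice and not necessarily the estimator used in any of the studies cited here. The daily sample size of 75, the step shift at day 30, and the three bandwidths are all hypothetical values chosen for illustration.

```python
import numpy as np


def kernel_smooth(days, values, bandwidth):
    """Nadaraya-Watson estimate at each day, using a Gaussian kernel."""
    days = np.asarray(days, dtype=float)
    values = np.asarray(values, dtype=float)
    # Pairwise scaled distances between every target day and every data day.
    scaled = (days[:, None] - days[None, :]) / bandwidth
    weights = np.exp(-0.5 * scaled**2)
    return (weights @ values) / weights.sum(axis=1)


rng = np.random.default_rng(0)
days = np.arange(60)
# Hypothetical truth: support jumps from 40% to 48% after an event on day 30.
true_share = np.where(days < 30, 0.40, 0.48)
n_per_day = 75
observed = rng.binomial(n_per_day, true_share) / n_per_day

roughness = {}
for bandwidth in (1.0, 5.0, 20.0):
    fitted = kernel_smooth(days, observed, bandwidth)
    # Mean absolute day-to-day movement in the smoothed series.
    roughness[bandwidth] = float(np.abs(np.diff(fitted)).mean())
    print(f"bandwidth {bandwidth:>4}: mean day-to-day change = "
          f"{roughness[bandwidth]:.4f}")
```

A small bandwidth leaves a jagged series in which sampling noise can pass for real movement, while a large bandwidth spreads the day-30 shift over many days and so biases the apparent timing and size of the effect.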
For the most part, however, we lack a strong basis on which to stake
dynamic claims. In statistics, this problem has spawned an enormous
literature on nonparametric smoothing techniques, such as the roughness
penalty approach (Eubank, 1999; Green & Silverman, 1994), kernel smoothing (Bowman & Azzalini, 1997; Hardle, 1990), local polynomial modeling
(Fan & Gijbels, 1996), the least squares spline approach (Smith, 1979), and
the free-knot approach of Mao and Zhao (2003). Examples in political data
are still rare but already the field shows variety. Brady and Johnston (2006)
discuss this “bias versus sampling variability” problem in the context of
daily cross-sections. They recommend a kernel smoothing approach and an
optimum smoothing criterion based on variances in the variable of interest.
Brady and Kaplan (2012, 2012) use least squares regression splines (Smith,
1979) with pre-chosen knots to estimate changes in public opinion in RCS.
Matthews and Johnston (2010) propose a semi-parametric component for the
economy that uses cubic smoothing splines with fixed equivalent degrees
of freedom. For a similar problem, Johnston and Partheymüller (2012) use
thin-plate regression splines (Wood, 2003) which estimate knot locations
and flexible functional forms. Four papers, four approaches. We hope that
in the coming years experts in nonparametric regression turn their attention
to RCS data, to suggest better ways of choosing knot locations, equivalent
degrees of freedom, and, indeed, the smoothing method itself.
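The "bias versus sampling variability" trade-off can be illustrated with a minimal sketch. The Python code below applies a Gaussian-kernel (Nadaraya-Watson) smoother to simulated daily proportions. The function name kernel_smooth, the simulated campaign, and both bandwidths are illustrative assumptions; the sketch does not reproduce the optimum smoothing criterion of Brady and Johnston (2006).

```python
import numpy as np

def kernel_smooth(days, daily_means, daily_n, bandwidth):
    """Gaussian-kernel (Nadaraya-Watson) smoother for daily RCS means.

    Each day's mean is weighted by its sample size and by its distance
    (in days, scaled by `bandwidth`) from the target day.
    """
    days = np.asarray(days, dtype=float)
    y = np.asarray(daily_means, dtype=float)
    n = np.asarray(daily_n, dtype=float)
    z = (days[:, None] - days[None, :]) / bandwidth   # pairwise scaled distances
    w = np.exp(-0.5 * z**2) * n[None, :]              # kernel weight times sample size
    return (w @ y) / w.sum(axis=1)

# Hypothetical campaign: slow upward drift plus an abrupt shift on day 30.
rng = np.random.default_rng(0)
days = np.arange(60)
truth = 0.45 + 0.001 * days + 0.05 * (days >= 30)
daily_n = np.full(60, 100)                            # 100 interviews per day
daily_means = rng.binomial(daily_n, truth) / daily_n  # observed daily proportions

light = kernel_smooth(days, daily_means, daily_n, bandwidth=1.0)   # little smoothing
heavy = kernel_smooth(days, daily_means, daily_n, bandwidth=10.0)  # heavy smoothing
```

With the small bandwidth the series still tracks much of the daily sampling noise; with the large one the shift on day 30 is smeared across neighboring days, which is the bias that excessive smoothing introduces.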
Smoothing issues apply not just to the description of population
and subpopulation trends over time. They also enable estimation of
associations—and, critically, changes in the associations—among variables
over time. How, for example, are public attitudes and leadership approval
related to one another? How do events affect public attitudes? How do
changes in employment or marital status affect the up-take of social welfare
programs? How do changes in health status affect political participation,
employment, marriage, education, or a host of other things? What is
correlated with changes in party identification over time?
Before we can answer these questions we must get a better grip on a
number of issues: How can we build models which simultaneously include
covariates and smooth the data? How can we separate cross-sectional
association (e.g., party identification with policy attitudes) from time series
association (changes in party identification with changes in policy attitudes)?
Johnston and Brady (2002) propose a method for separating time-series
from cross-sectional effects that starts with a specific model of opinion
change. Earlier, these same authors (Johnston et al., 1992) proposed a simple
step function representation of varying parameters. Linear interactions of
factors with time have been ventured (e.g., Bartels, 2006a). Mebane and
Wand (1997) seem to be the first to try to extract patterns in individual
transitions from early US and Canadian RCS campaign data. Pelzer, Eisinga,
and Franses (2002) built on work by Moffitt (1993) and Franklin (1989) to
take this further. In truth, we do not have plausible theories for the exact
time path of varying coefficients. This points, again, to debates over how
smoothing methods help identify how true temporal change in one variable
relates to true change in another one. Eubank et al. (2004) use a smoothing
spline to estimate a varying coefficient model. The Brady and Kaplan (2012),
Johnston and Partheymüller (2012), and Matthews and Johnston (2010)
references cited earlier include covariates in their models as interaction terms
with time. All these approaches assume an absence of autocorrelation across
successive days, an assumption forced by the fact that the data are, after
all, not panels. Lebo and Weber (2014) propose a solution that involves
multi-level modeling. The strengths and weaknesses of these methods need
to be assessed, and new methods must be developed which will allow for
the flexible modeling of temporal trends and the incorporation of covariates
in models.
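The simplest of the strategies mentioned above, a linear interaction of a covariate with time, can be sketched as follows. The simulated data, the variable names, and the assumed linear path for the coefficient are illustrative assumptions, not the specification of any paper cited here. The sketch pools independent daily samples and, like the approaches discussed, ignores autocorrelation across days.

```python
import numpy as np

# Hypothetical RCS: 60 days, a fresh sample of 200 respondents each day.
rng = np.random.default_rng(1)
n_days, n_per_day = 60, 200
day = np.repeat(np.arange(n_days), n_per_day).astype(float)
x = rng.normal(size=day.size)                  # covariate, e.g. an attitude scale

# Assumed data-generating process: the effect of x grows linearly over time.
slope_t = 0.2 + 0.01 * day
y = 1.0 + slope_t * x + rng.normal(size=day.size)

# Pooled least squares with a covariate-by-time interaction:
# y = b0 + b1*day + b2*x + b3*(x*day) + error
X = np.column_stack([np.ones_like(day), day, x, x * day])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

slope_day0 = beta[2]             # estimated effect of x on day 0
slope_change_per_day = beta[3]   # estimated daily change in that effect
```

Replacing the linear term in day with a step function or a spline basis would give the more flexible variants discussed above, at the cost of choosing breakpoints or knots.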
CHANGES IN THE UNIVERSE
Most of the discussion so far assumes that comparisons are across basically
unchanging universes of persons and content. The issues are ones of error, in
sampling or in measurement. The longer the time span for the study, however, the more the analysis engages issues in the comparability of populations and of attitude domains.
UNIVERSE OF PEOPLE
If we can assume that RCS surveys draw their sample from the same population over time, then the impact of an explanatory variable on different groups
in the population (say the impact of a change in wages on female labor supply or the impact of a change in unemployment on presidential popularity
among men) can be determined by looking at what happens to the dependent
variables (female labor supply in the first instance and presidential popularity among men in the second) from one period to the next as changes occur
in wages or unemployment. In effect, the fixed characteristic of a person (in
this case his or her sex) is used to create a “pseudo-panel” of similar people
whose reactions can be measured by conditioning on these characteristics.
Tellingly, a pseudo-panel is sometimes called a synthetic cohort.
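As a minimal sketch of the idea, the Python code below collapses individual records from successive cross-sections into group-by-period cell means and then takes period-to-period changes within each group. The record layout, the function names, and the toy data are hypothetical.

```python
from collections import defaultdict

def cell_means(records):
    """Average the outcome within each (group, period) cell.

    `records` is an iterable of (period, group, value) tuples, where
    `group` is a fixed characteristic such as sex or birth cohort.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for period, group, value in records:
        cell = sums[(group, period)]
        cell[0] += value
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

def changes(means):
    """Change in each group's mean from the previous period (integer periods)."""
    out = {}
    for (group, period), value in means.items():
        prev = means.get((group, period - 1))
        if prev is not None:
            out[(group, period)] = value - prev
    return out

# Toy data: different respondents in each period, same groups.
records = [
    (1, "F", 0.0), (1, "F", 1.0), (1, "M", 1.0), (1, "M", 1.0),
    (2, "F", 1.0), (2, "F", 1.0), (2, "M", 1.0), (2, "M", 0.0),
]
means = cell_means(records)
delta = changes(means)
```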
However, this method only works if the populations of each group stay
the same from one period to the next, and this assumption is surely wrong,
since day by day people age so that new people enter into the eligible
population (by being born or by turning 18) and people die. In addition,
the way in which a factor affects the dependent variable may change from
one period to the next or its impact may depend upon the cohort of people
exposed to the factor (and these cohorts might change because of emigration
or immigration). These changes are probably negligible for relatively short
time periods, maybe even as long as several years, but at some point they
must be confronted. In sum, as time spans expand, we can no longer pretend
that an RCS is a single synthetic cohort; “Age-Period-Cohort” problems
must be confronted directly. A thought experiment illustrates the problem: in
the 1980s, one could give a quite complete account of US politics and yet say
nothing about Hispanics; could one imagine doing so today? Any attempt
to track issue evolution over this span must allow for this transformation in
the electorate’s coalition possibilities (Shafer & Spady, 2014).
Certainly any study of long-term trends must take into account these kinds
of changes that stem from changes in age, the impacts of the periods in which
people live, and the cohorts of which they are a member. Sociologists have
done the most to consider these factors. The classic work is Mason and Fienberg (1985), which considered ways to deal with the basic identification problem created by the fact that one’s age (say 50) in a given period (say the year
1965) equals period minus cohort as defined by year of birth (1915). More
recently, Yang and Land (2006) and Yang (2006) have
developed mixed (fixed and random effects) models of trend data in RCS
surveys (e.g., the General Social Survey) which identify and estimate age,
period, and cohort components of change. Another example of this line of
research is Devereux’s (2007) analysis of biases in synthetic cohort models.
These papers suggest ways to deal with the non-fixity of the underlying population, and they should be further developed so that they can be combined
with analyses that go beyond Age-Period-Cohort analysis.
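The identification problem can be made concrete with a small numerical check. In the sketch below, which uses a hypothetical grid of survey years and ages, cohort is constructed as period minus age, so a design matrix with linear terms for all three components loses a rank.

```python
import numpy as np

# Hypothetical RCS grid: surveys every 5 years, respondents aged 20 to 70.
period = np.repeat(np.arange(1975, 2015, 5), 6)   # 8 survey years
age = np.tile(np.arange(20, 80, 10), 8)           # 6 ages per survey
cohort = period - age                             # year of birth

# Intercept plus linear age, period, and cohort terms.
X = np.column_stack([np.ones(period.size), age, period, cohort])

rank_full = np.linalg.matrix_rank(X)               # 3, not 4: exact collinearity
rank_no_cohort = np.linalg.matrix_rank(X[:, :3])   # 3: full rank once cohort is dropped
```

Because the four columns span only three dimensions, least squares cannot separate the three linear effects without an additional restriction or modeling assumption.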
UNIVERSE OF CONTENT
The issues flagged above for the comparability of survey questions are compounded as time spans several decades. Some issue domains simply have no
counterparts across the periods. Even persistent domains—race relations in
the United States, to take an example—change their surface content. Questions from the 1950s about, say, school integration and rights to sit at lunch
counters, would be almost incomprehensible to current respondents. The
usual response is to go minimal and look for questions with maximum correspondence, often focusing on single items (Carmines & Stimson, 1989; Shafer
& Johnston, 2006). Attempts to model the landscape more broadly do not
have this luxury, and analysts are shifting to combinations of exploratory and
confirmatory factor analyses (Claggett & Shafer, 2010) and to item-response
theory (Shafer & Spady, 2014). Once again, better theories about how attitudes are affected by social and cultural change over time might be very
useful (MacKinnon & Luke, 2002).
CAUSAL INFERENCE
Describing trends and estimating associations between variables are important scientific tasks, but we often want to estimate causal impacts as well.
This takes us back to where we started. True panel data have the advantage
that we can control for fixed effects, add lagged endogenous variables, and
develop dynamic models of change (Bartels, 2006b; Hsiao, 2003), which can
eliminate many threats to making reliable inferences. The pseudo-panel
approach cannot do these things directly, but RCS surveys have some
advantages over panels. Their temporal granularity makes it possible to
detect changes that might be missed by panels spaced far apart (Brady &
Johnston, 2006). They are less likely to suffer from panel attrition which can
create selection bias problems for panel studies. Moreover, the sheer number
of repeated temporal data points (sometimes in the hundreds compared to
panel studies which rarely have more than five waves) makes it possible to
study the impact of putative explanatory variables both as they go up and
as they go down, which provides more leverage for causal inference.
Starting with Deaton (1985), econometricians have developed and
expanded on the pseudo-panel concept. After Franklin’s (1989) and Moffitt’s
(1993) papers made some conceptual breakthroughs, many others have
followed (Collado, 1997; Pelzer, Eisinga, & Franses, 2002, 2005; Ridder &
Moffitt, 2007; Verbeek, 2008; Verbeek & Nijman, 1993; Verbeek & Vella,
2005). Athey and Imbens (2006; see also Abadie, 2005; Manski & Pepper,
2012) used the Neyman-Rubin-Holland counterfactual outcomes approach
to develop a very general framework for thinking about “nonparametric
identification, estimation, and inference for the average effect of the treatment for settings where RCS of individuals are observed in a treatment and a
control group, before and after the treatment” (p. 432). Even if RCS designs can
be sensitive to fine temporal distinctions, the patterns they identify can be
consistent with quite different micro-mechanisms. Lenz (2009), for example,
notes that shifts in coefficients that Johnston et al. (1992, 2004) attribute to
priming are observationally equivalent to effects from learning and opinion
change. He deploys true panels to identify the mechanisms.
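The treatment and control, before and after setting can be sketched with the ordinary linear difference-in-differences estimator computed from four cell means. This is a deliberately simple stand-in, not the nonlinear framework of Athey and Imbens (2006). The function name did_estimate and the simulated data are illustrative assumptions. Because each respondent is observed only once, the estimator compares group means rather than individual changes.

```python
import numpy as np

def did_estimate(y, treated, post):
    """Difference-in-differences from repeated cross-sections.

    Returns (treated after - treated before) - (control after - control before),
    using the mean outcome in each of the four group-by-period cells.
    """
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    post = np.asarray(post, dtype=bool)

    def mean(mask):
        return y[mask].mean()

    change_treated = mean(treated & post) - mean(treated & ~post)
    change_control = mean(~treated & post) - mean(~treated & ~post)
    return change_treated - change_control

# Simulated respondents, each interviewed once (assumed true effect: 0.2).
rng = np.random.default_rng(2)
n = 40_000
treated = rng.integers(0, 2, size=n).astype(bool)
post = rng.integers(0, 2, size=n).astype(bool)
y = 1.0 + 0.5 * treated + 0.3 * post + 0.2 * (treated & post) + rng.normal(size=n)

effect = did_estimate(y, treated, post)
```

The estimate is interpretable as a causal effect only under the usual assumption that the two groups would have followed parallel trends in the absence of treatment.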
For high-frequency RCS designs, campaign studies for example, it is
possible to combine repetition of cross-sections with true panels. Canadian
Election Studies are, in one sense, just pre-post panels, with the first wave
released in a rolling manner. Even this simple design gives causal leverage,
provided the key questions appear at each wave (Johnston & Partheymüller,
2012; Lenz, 2009). More elaborate designs, including a proper baseline,
would be more effective for inference, if at the cost of representativeness.
And such designs are starting to appear, thanks to the power and flexibility
of the online mode. Goldman (2012) uses the five-wave RCS-panel combination, part of the 2008 National Annenberg Election Survey (http://www.annenbergpublicpolicycenter.org/political-communication/naes/), for an account of racial politics. Faas and Blumenberg (2013) apply the design to election and referendum campaigns in the German state of Baden-Württemberg.
CONCLUSIONS
As mentioned earlier, RCS have been with us for a long time. With hindsight,
we now see the commonalities—but also the distinctions—among such data
sets. The critical groundwork has been divided among economists, sociologists, political scientists, and others, with the effect that each disciplinary
group proceeds largely in ignorance of the others. Now we are poised to
learn from each other. The possibilities for future discoveries are exciting.
No less exciting are the opportunities to repurpose data and learn about our
past.
REFERENCES
Abadie, A. (2005). Semiparametric difference-in-differences estimators. The Review of
Economic Studies, 72, 1–19.
Athey, S., & Imbens, G. W. (2006). Identification and inference in nonlinear
difference-in-differences models. Econometrica, 74, 431–497.
Bartels, L. M. (1988). Presidential primaries and the dynamics of public choice. Princeton,
NJ: Princeton University Press.
Bartels, L. M. (2006a). Priming and persuasion in presidential campaigns. In H. E.
Brady & R. Johnston (Eds.), Capturing campaign effects (pp. 78–112). Ann Arbor:
University of Michigan Press.
Bartels, L. M. (2006b). Three virtues of panel data for the analysis of campaign effects.
In H. E. Brady & R. Johnston (Eds.), Capturing campaign effects (pp. 134–163). Ann
Arbor: University of Michigan Press.
Berinsky, A., Powell, E. N., Schickler, E., & Yohai, I. (2011). Revisiting Public Opinion
in the 1930s and 1940s. PS: Political Science and Politics, 44, 515–520.
Berinsky, A., & Schickler, E. (2010). Collaborative Research: The American Mass
Public in the 1930s and 1940s. National Science Foundation, Political Science Program Grant, http://web.mit.edu/berinsky/www/nsf.pdf (accessed 10 September 2012).
Berinsky, A. (2006). American public opinion in the 1930s and 1940s: The analysis of
quota-controlled sample survey data. The Public Opinion Quarterly, 70, 499–529.
Bowman, A., & Azzalini, A. (1997). Applied smoothing techniques for data analysis: The
kernel approach with S+ illustrations. Oxford, England: Clarendon Press.
Brady, H. E., & Johnston, R. (1987). What’s the primary message: Horse race or issue
journalism? In G. R. Orren & N. W. Polsby (Eds.), Media and momentum: The New
Hampshire primary and nomination politics (pp. 127–186). Chatham, NJ: Chatham House.
Brady, H. E., & Johnston, R. (2006). The rolling cross-section and causal attribution.
In H. E. Brady & R. Johnston (Eds.), Capturing campaign effects (pp. 164–195). Ann Arbor: University of Michigan Press.
Brady, H. E., & Kaplan, C. (2012). A Least-Squares Spline Method for Identifying Trends
in Participation in Informal Political Groups in the Soviet Union from January 1989 to
January 1992 Using a New Rolling Cross-Section Data Set. Paper presented at the
Midwest Political Science Association Annual Meeting, April 2012, Chicago, IL.
Brady, H. E., & Kaplan, C. (2012). Political Opinion in the Collapse of the USSR: A
Reassessment Twenty Years Later Using a New Consolidated and Linked Data Set. Paper
presented at the American Political Science Association Annual Meeting, August
2012, New Orleans, LA.
Carmines, E. G., & Stimson, J. A. (1989). Issue evolution: Race and the transformation of
American politics. Princeton, NJ: Princeton University Press.
Claggett, W. J. M., & Shafer, B. E. (2010). The American public mind: The issue structures
of mass politics in the postwar years. Cambridge, England: Cambridge University
Press.
Collado, D. M. (1997). Estimating dynamic models from time series of independent
cross-sections. Journal of Econometrics, 82, 37–62.
Deaton, A. (1985). Panel data from time series of cross-sections. Journal of Econometrics, 30, 109–126.
Devereux, P. J. (2007). Small-sample bias in synthetic cohort models of labor supply.
Journal of Applied Econometrics, 22, 839–848.
Eubank, R. L. (1999). Nonparametric regression and spline smoothing (2nd ed.). New
York, NY: Marcel Dekker.
Eubank, R. L., Huang, C., Maldonado, M., Wang, N., Wang, S., & Buchanan, R. J.
(2004). Smoothing spline estimation in varying-coefficient models. Journal of the
Royal Statistical Society, Series B: Statistical Methodology, 66, 653–667.
Faas, T., & Blumenberg, J. N. (2013). Measuring Dynamics: A Rolling Panel Study in the
run-up to the Baden-Wuerttemberg state election 2011. Presented to the 71st Annual
Conference of the Midwest Political Science Association, Chicago, IL.
Fan, J., & Gijbels, I. (1996). Local polynomial modelling and its applications. London, England: Chapman and Hall.
Franklin, C. H. (1989). Estimation across data sets: Two-stage auxiliary instrumental
variables estimation (2SAIV). Political Analysis, 1, 1–23.
Gelman, A. (2007). Struggles with survey weighting and regression modeling. Statistical Science, 22, 153–164.
Goldman, S. K. (2012). Effects of the 2008 Obama presidential campaign on white
racial prejudice. Public Opinion Quarterly, 76, 663–687.
Green, P. J., & Silverman, B. W. (1994). Nonparametric regression and generalized linear
models: A roughness penalty approach. London, England: Chapman and Hall.
Hardle, W. (1990). Applied nonparametric regression. Cambridge, England: Cambridge
University Press.
Hill, S., Lo, J., Vavreck, L., & Zaller, J. (2013). How quickly we forget: The duration of persuasion effects from mass communication. Political Communication, 30,
521–547.
Honaker, J., & King, G. (2010). What to do about missing values in time-series
cross-section data. American Journal of Political Science, 54, 561–581.
Hsiao, C. (2003). Analysis of panel data (2nd ed.). New York, NY: Cambridge University Press.
Johnston, R., Blais, A., Brady, H. E., & Crête, J. (1992). Letting the people decide: Dynamics of a Canadian election. Stanford, CA: Stanford University Press.
Johnston, R., & Brady, H. E. (2002). The rolling cross-section design. Electoral Studies,
21, 283–295. Reprinted in Mark N. Franklin and Christopher Wlezien (editors), The
Future of Election Studies. Oxford, England: Pergamon Press.
Johnston, R., Hagen, M. G., & Jamieson, K. H. (2004). The 2000 presidential election and
the foundations of party politics. Cambridge, England: Cambridge University Press.
Johnston, R., & Partheymüller, J. (2012). Campaign Activation in German Elections: Evidence from 2005 and 2009. Paper presented at the American Political Science Association Annual Meeting, August 2012, New Orleans, LA.
Lau, R. R. (1994). An analysis of the accuracy of “Trial Heat” polls during the 1992
presidential election. Public Opinion Quarterly, 58, 2–20.
Lebo, M., & Weber, C. (2014). An effective approach to the repeated cross sectional
design. American Journal of Political Science. doi:10.1111/ajps.12095.
Lenz, G. (2009). Learning and opinion change, not priming: Reconsidering the evidence for the priming hypothesis. American Journal of Political Science, 53, 821–837.
Manski, C. F., & Pepper, J. V. (2012). Partial Identification of the Treatment Response with
Data on Repeated Cross Sections. Working Paper. http://faculty.wcas.northwestern.edu/~cfm754/tr_rcs.pdf (accessed 10 September 2012).
MacKinnon, N. J., & Luke, A. (2002). Changes in identity attitudes as reflections of
social and cultural change. The Canadian Journal of Sociology, 27(3), 299–338.
Mao, W., & Zhao, L. H. (2003). Free-knot polynomial splines with confidence intervals. Journal of the Royal Statistical Society, 65(4), 901–919.
Mason, W., & Fienberg, S. (Eds.) (1985). Cohort analysis in social research: Beyond the
identification problem. New York, NY: Springer Verlag.
Matthews, J. S., & Johnston, R. (2010). The campaign dynamics of economic voting.
Electoral Studies, 29, 13–24.
Mebane, W. R., & Wand, J. (1997). Markov Chain Models for Rolling Cross-section Data:
How Campaign Events and Political Awareness Affect Vote Intentions and Partisanship in the United States and Canada. Presented to the Annual meeting of the
Midwest Political Science Association, Chicago, IL. http://polmeth.wustl.edu/mediaDetail.php?docId=446 (accessed 10 September 2012).
Moffitt, R. (1993). Identification and estimation of dynamic models with a time series
of repeated cross-sections. Journal of Econometrics, 59, 99–124.
Pelzer, B., Eisinga, R., & Franses, P. H. (2002). Inferring transition probabilities from
repeated cross sections. Political Analysis, 10, 113–133.
Pelzer, B., Eisinga, R., & Franses, P. H. (2005). ‘Panelizing’ repeated cross sections:
Female labor force participation in the Netherlands and West Germany. Quality &
Quantity, 39, 155–174.
Ridder, G., & Moffitt, R. (2007). The econometrics of data combination. In J. J. Heckman & E. E. Leamer (Eds.), Handbook of econometrics (Vol. 6, 1st ed. Chapter 75).
Shafer, B. E., & Johnston, R. (2006). The end of southern exceptionalism: Class, race, and
partisan change in the postwar south. Cambridge, MA: Harvard University Press.
Shafer, B. E., & Spady, R. H. (2014). The American political landscape. Cambridge, MA:
Harvard University Press.
Shaw, D. R. (1999). A study of presidential campaign effects from 1952 to 1992. Journal
of Politics, 61, 387–422.
Smith, P. L. (1979). Splines as a useful and convenient statistical tool. American Statistician, 33, 57–62.
Stimson, J. A. (1999). Public opinion in America: Moods, cycles, and swings (2nd ed.).
Boulder, CO: Westview.
Verbeek, M. (2008). Pseudo panels and repeated cross-sections. In L. Mátyás & P.
Sevestre (Eds.), The econometrics of panel data: Fundamentals and recent developments
in theory and practice (3rd ed., pp. 369–383). Berlin, Germany: Springer.
Verbeek, M., & Nijman, T. (1993). Minimum MSE estimation of a regression model
with fixed effects from a series of cross-sections. Journal of Econometrics, 59,
125–136.
Verbeek, M., & Vella, F. (2005). Estimating dynamic models from repeated cross-sections. Journal of Econometrics, 127, 83–102.
Wood, S. N. (2003). Thin plate regression splines. Journal of the Royal Statistical Society,
Series B: Statistical Methodology, 65, 95–114.
Yang, Y. (2006). Bayesian inference for hierarchical age-period-cohort models of
repeated cross-section survey data. Sociological Methodology, 36, 39–74.
Yang, Y., & Land, K. C. (2006). A mixed models approach to the age-period-cohort
analysis of repeated cross-section surveys, with an application to data on trends
in verbal test scores. Sociological Methodology, 36, 75–97.

HENRY E. BRADY SHORT BIOGRAPHY
Henry E. Brady is Dean of the Goldman School of Public Policy and Class of
1941 Monroe Deutsch Professor of Political Science and Public Policy at the
University of California, Berkeley. He received his PhD in Economics and
Political Science from MIT in 1980, and he has written extensively on political methodology. He is coauthor or coeditor of nine books including coeditor
of Rethinking Social Inquiry (2004), Capturing Campaign Effects (2006), and the
Oxford Handbook of Political Methodology (2008). He has been president of the
American Political Science Association, president of the Political Methodology Society, and director of the University of California’s Survey Research
Center from 1998 to 2009. He was elected a Fellow of the American Academy
of Arts and Sciences in 2003, a Fellow of the American Association for the
Advancement of Science in 2006, and a Fellow of the Political Methodology
Society in 2008. He received the Career Achievement Award of the Political
Methodology Society in 2012.
RICHARD JOHNSTON SHORT BIOGRAPHY
Richard Johnston (PhD Stanford) is Professor of Political Science and Canada
Research Chair in Public Opinion, Elections, and Representation at the University of British Columbia. He has also taught at the University of Toronto,
the California Institute of Technology, Harvard University (Mackenzie King
chair, 1994–1995), and the University of Pennsylvania. He was an Associate
Member of Nuffield College, Oxford, a Marie Curie Research Fellow at the
European University Institute, and official visitor at MZES, Mannheim. He
is the author or coauthor of five books, three on Canadian politics and two
on US Politics. He has coedited three other books and has written numerous
articles and book chapters. In 2007–2008 he was President of the Canadian
Political Science Association. He was Principal Investigator of the 1988 and
1992–1993 Canadian Election Studies and Research Director for the National
Annenberg Election Survey (Penn), 2000–2008. Much of his work focuses on
elections and public opinion, with special reference to the role of mass communications and campaigns.
RELATED ESSAYS
Social Epigenetics: Incorporating Epigenetic Effects as Social Cause and
Consequence (Sociology), Douglas L. Anderton and Kathleen F. Arcaro
To Flop Is Human: Inventing Better Scientific Approaches to Anticipating
Failure (Methods), Robert Boruch and Alan Ruby
Ambulatory Assessment: Methods for Studying Everyday Life (Methods),
Tamlin S. Conner and Matthias R. Mehl
Models of Nonlinear Growth (Methods), Patrick Coulombe and James P.
Selig
Quantile Regression Methods (Methods), Bernd Fitzenberger and Ralf
Andreas Wilke
The Evidence-Based Practice Movement (Sociology), Edward W. Gondolf
Meta-Analysis (Methods), Larry V. Hedges and Martyna Citkowicz
The Use of Geophysical Survey in Archaeology (Methods), Timothy J.
Horsley
Network Research Experiments (Methods), Allen L. Linton and Betsy Sinclair
Longitudinal Data Analysis (Methods), Todd D. Little et al.
Structural Equation Modeling and Latent Variable Approaches (Methods),
Alex Liu
Data Mining (Methods), Gregg R. Murray and Anthony Scime
Remote Sensing with Satellite Technology (Archaeology), Sarah Parcak
Quasi-Experiments (Methods), Charles S. Reichardt
Digital Methods for Web Research (Methods), Richard Rogers
Virtual Worlds as Laboratories (Methods), Travis L. Ross et al.
Modeling Life Course Structure: The Triple Helix (Sociology), Tom Schuller
Content Analysis (Methods), Steven E. Stemler
Person-Centered Analysis (Methods), Alexander von Eye and Wolfgang
Wiedermann
Translational Sociology (Sociology), Elaine Wethington