Resource Limitations in Visual Cognition

Title
Resource Limitations in Visual Cognition
Author
Liverence, Brandon M.
Franconeri, Steven L.
Research Area
Cognition and Emotions
Topic
Cognitive Development
Identifier
etrds0287
Resource Limitations in Visual Cognition
BRANDON M. LIVERENCE and STEVEN L. FRANCONERI

Abstract
Visual attention and visual working memory are two of the core resources that support visual perception. Foundational research has demonstrated that these resources
are highly limited, but an active debate concerns exactly how they are limited. While
many classic studies suggested that these resources are fundamentally discrete, with
fixed capacity of 3–4 objects maximum, a number of recent studies have argued that
these resources are fundamentally continuous, with no fixed upper-bound to the
number of objects that can be attended or remembered. This entry reviews the state
of this debate, and shows how convergence between these (often separate) areas of
research is a major emerging trend in the field of visual cognition.

INTRODUCTION
The visual system is constantly overwhelmed with information. As the
amount of input registered in early vision far outstrips the capacity of more
computationally expensive later stages of visual processing, it is impossible
to fully process and perceive everything in view at any given moment.
Additionally, because low-level visual input is frequently in flux (due to
blinks, eye movements, and physical changes in the environment), the visual
system has to solve tricky correspondence problems in order to maintain
perceptual stability. To meet these challenges, vision relies on a pair of core
resources: visual attention, which serves as a filter to ensure that only relevant
objects are fully processed, and visual working memory (VWM), which
supports perceptual stability by providing a temporary storage for recent
visual input. Unfortunately, these resources are highly limited, and there are
often limits to the number of objects that can be simultaneously attended or
actively remembered. When these limits are exceeded, dramatic failures of
visual awareness can occur (e.g., inattentional blindness and change blindness).
This entry will explore the nature of these visual resources and address the
following questions along the way: How many objects can be attended or
remembered at one time? Are these resources fundamentally discrete (with
fixed precision) or continuous (with variable precision)? What explains these
limits, and are they fixed, or are there ways to increase one’s resources?
While drawing novel connections between parallel work on visual attention
and VWM, this essay will show that their convergence, both theoretical
and methodological, represents a major emerging trend in visual cognition.

Emerging Trends in the Social and Behavioral Sciences. Edited by Robert Scott and Stephen Kosslyn.
© 2015 John Wiley & Sons, Inc. ISBN 978-1-118-90077-2.

FOUNDATIONAL RESEARCH ON VISUAL WORKING MEMORY RESOURCES
VWM is a highly limited resource, as is clear from demonstrations of
“change blindness” in which observers fail to detect dramatic changes
occurring between glances of a scene (Rensink et al., 1997; Simons & Levin,
1997; cf. Scott-Brown et al., 2000, for an alternative discussion of how such
effects may reflect “comparison blindness” rather than memory limitations).
A simplified version of this change detection paradigm has become a common
way to measure VWM. In one seminal study (Luck & Vogel, 1997), observers
viewed a brief display of 1–12 objects that disappeared for a short interval and then reappeared,
with a change occurring to a single object on some trials. Observers’
performance at detecting these changes suggested that they could store
the features of only ∼4 objects per trial, confirming that VWM capacity is
quite low. Intriguingly, observers were just as good at noticing changes to
objects that had only one feature as to objects that could change along any
of four feature dimensions, which led the authors to conclude that VWM
is a fundamentally discrete resource constrained by the number of objects
stored rather than their complexity. While subsequent studies challenged the
strongest versions of this hypothesis (Wheeler & Treisman, 2002; Xu, 2002),
the basic finding of an upper limit in VWM capacity of 3–4 fairly simple
objects has been repeatedly verified (e.g., Awh et al., 2007; Vogel et al., 2001).
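Capacity estimates of this kind are usually computed from hit and false-alarm rates. The sketch below is illustrative only and relies on two estimators that are not spelled out in this entry and are brought in here as assumptions: Pashler's formula, commonly applied when the whole array reappears at test, and Cowan's formula, commonly applied when a single item is probed. The function names and the example rates are hypothetical.

```python
def pashler_k(set_size, hit_rate, false_alarm_rate):
    """Capacity estimate for whole-display change detection (assumed estimator)."""
    if false_alarm_rate >= 1.0:
        raise ValueError("false_alarm_rate must be below 1")
    return set_size * (hit_rate - false_alarm_rate) / (1.0 - false_alarm_rate)


def cowan_k(set_size, hit_rate, false_alarm_rate):
    """Capacity estimate for single-probe change detection (assumed estimator)."""
    return set_size * (hit_rate - false_alarm_rate)


# Hypothetical rates for an 8-item display, chosen only for illustration.
print(pashler_k(8, hit_rate=0.55, false_alarm_rate=0.10))
print(cowan_k(8, hit_rate=0.55, false_alarm_rate=0.10))
```

Both estimators scale the corrected hit rate by set size, so an observer whose accuracy falls as the display grows can still yield a roughly constant estimate, which is the pattern that motivates a fixed item limit.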
An influential study (Zhang & Luck, 2008) using a continuous report
paradigm provides even more powerful evidence for discrete VWM
resources. Observers briefly viewed 1–6 objects and then reported a test
object’s feature value from memory (e.g., color) by selecting it from a
continuous circular distribution (e.g., a color wheel). When the data were fit
to a mixture model with a normally-distributed component (reflecting trials
in which the probed item was noisily encoded) and a uniform component
(reflecting random guessing for unencoded items), they found that the
uniform component increased sharply from set size 3 to 6 but that the standard deviation of the normally-distributed component (a measure of the
precision of encoding) did not change. This suggests that once participants
had fully allocated their ∼3 fixed-capacity VWM “slots,” they failed to
encode any information from additional items and had to guess at random.
Precision also decreased from set size 1 to 3, which the authors explain in
terms of a “Slots + Averaging” model: for set sizes under 3, participants
allocate multiple slots (each containing some independent noise) to each
item, allowing them to improve their performance by averaging across
multiple noisy representations. Critically, however, the Slots + Averaging
model cannot account for recent findings that the typical drop in precision from
set size 1 to 2 is larger than would be predicted by averaging (Bays et al., 2009)
and that under some conditions there is no drop in precision in this range
whatsoever (Bae & Flombaum, 2013).
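The two-component mixture analysis described above can be sketched as follows. This is a minimal illustration, not the cited authors' fitting procedure: it assumes a von Mises distribution as the circular analogue of the normally-distributed component, uses a coarse grid search in place of a proper optimizer, and fits simulated data whose parameters are hypothetical. All function names are illustrative.

```python
import math
import random


def bessel_i0(x, terms=60):
    """Modified Bessel function I0(x) via its power series (adequate for x <= 50)."""
    total = 1.0
    term = 1.0
    for k in range(1, terms):
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def log_likelihood(errors, guess_rate, kappa):
    """Log-likelihood of response errors (radians) under a von Mises + uniform mixture."""
    von_mises_norm = 2.0 * math.pi * bessel_i0(kappa)
    uniform_part = guess_rate / (2.0 * math.pi)
    total = 0.0
    for error in errors:
        von_mises = math.exp(kappa * math.cos(error)) / von_mises_norm
        total += math.log((1.0 - guess_rate) * von_mises + uniform_part)
    return total


def fit_mixture(errors, guess_grid, kappa_grid):
    """Grid-search maximum likelihood; returns (guess_rate, kappa)."""
    candidates = ((log_likelihood(errors, g, k), g, k)
                  for g in guess_grid for k in kappa_grid)
    _, best_guess, best_kappa = max(candidates)
    return best_guess, best_kappa


def simulate_errors(n_trials, guess_rate, kappa, rng):
    """Simulated errors: uniform guesses mixed with von Mises noise around 0."""
    errors = []
    for _ in range(n_trials):
        if rng.random() < guess_rate:
            errors.append(rng.uniform(-math.pi, math.pi))
        else:
            errors.append(rng.vonmisesvariate(0.0, kappa))
    return errors


# Hypothetical parameters, chosen only to exercise the fit.
rng = random.Random(0)
errors = simulate_errors(2000, guess_rate=0.4, kappa=8.0, rng=rng)
guess_grid = [i / 20 for i in range(20)]  # 0.00, 0.05, ..., 0.95
kappa_grid = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
print(fit_mixture(errors, guess_grid, kappa_grid))
```

In this parameterization the guess rate plays the role of the uniform component and the concentration parameter kappa stands in for precision (larger kappa, narrower error distribution). A slot-like account predicts that, beyond capacity, the fitted guess rate rises with set size while kappa stays flat.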
Another source of evidence for discrete VWM resource limits comes
from investigations of the contralateral delay activity (CDA), a putative
electrophysiological index of the number of items stored in VWM that
increases with memory load but reaches a plateau at ∼3 objects in typical
observers (Anderson et al., 2011; Vogel & Machizawa, 2004). Notably, the
CDA also indexes set size in a multifocal attention task (as reviewed below;
Drew & Vogel, 2008; Drew et al., 2011), suggesting that it may reflect spatial
selection rather than memory, per se. Also, if the CDA fundamentally reflects
discrete memory “slots,” then following the logic of Slots + Averaging, all
slots should be used in parallel even at set sizes 1–2 to improve precision
via averaging, implying that there should not be differences in the CDA at
small versus large set sizes.
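The Slots + Averaging logic invoked in this argument can be made concrete with a short numerical sketch. It rests on simplifying assumptions that are labeled here rather than taken from the cited work: slots are shared as evenly as possible among items, averaging n independent noisy copies divides the standard deviation by the square root of n, and the slot count and single-slot standard deviation are hypothetical values.

```python
import math


def slots_averaging(set_size, n_slots=3, slot_sd=20.0):
    """Return (probability an item is stored, list of SDs for the stored items).

    Assumes every slot is always in use, whatever the set size.
    """
    if set_size >= n_slots:
        # One slot per stored item; remaining items receive nothing.
        return n_slots / set_size, [slot_sd] * n_slots
    base, extra = divmod(n_slots, set_size)
    slots_per_item = [base + 1] * extra + [base] * (set_size - extra)
    # Averaging n independent copies shrinks the SD by sqrt(n).
    return 1.0, [slot_sd / math.sqrt(n) for n in slots_per_item]


for n_items in (1, 2, 3, 6):
    p_stored, sds = slots_averaging(n_items)
    print(n_items, p_stored, [round(sd, 2) for sd in sds])
```

Under these assumptions the number of occupied slots is the same at every set size, which is why a neural signal that strictly counted occupied slots would not be expected to grow between one and three items.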
CUTTING-EDGE RESEARCH ON RESOURCE LIMITATIONS IN VISUAL WORKING MEMORY
One of the central debates in VWM research concerns whether VWM
resources are constituted by discrete fixed-capacity “slots,” or a flexible,
continuous resource that can be variably allocated to any number of objects.
Critically, continuous resource models predict tradeoffs between complexity
and capacity, as complex objects are assumed to require more resources
to encode. Strong evidence for such effects came from a study (Alvarez &
Cavanagh, 2004) that tested VWM capacity using stimuli ranging from very
simple color patches to highly complex objects, such as Chinese characters
or multi-shade cubes. The authors used both a change-detection task to
estimate memory capacity for these different stimulus types, and a separate
speeded-search task to quantify the complexity of each stimulus type. They
observed an almost perfect linear correlation between these measures,
verifying that memory capacity was much lower for the most complex
objects (e.g., only ∼1.5 cubes could be stored per trial).
Other work has shown that it is possible to store more than four representations in VWM, albeit at low precision. Bays and Husain (2008) found that in
a spatial memory task, VWM resources could be spread among up to at least
six objects, and that at increased set sizes there was a concomitant decrease
in precision that follows a power law, consistent with a continuous resource
being spread ever more thinly. There is also evidence that different objects in
a scene can receive differing amounts of memory resources, with numerous
studies finding that some objects can be prioritized (via cueing) over others
(Bays & Husain, 2008; Gorgoraptis et al., 2011). Even in the absence of cues,
memory precision seems to naturally vary between objects and across trials,
consistent with continuous resources (Fougnie et al., 2012).
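A power-law relation between precision and set size can be written as precision(N) = precision(1) × N^(−k). The sketch below generates values from that form and recovers the exponent by least squares on log-log axes. The exponent and baseline precision are hypothetical, and the function names are illustrative; nothing here reproduces the cited data.

```python
import math


def power_law_precision(set_size, precision_at_one, exponent):
    """Precision predicted when a fixed resource is divided among set_size items."""
    return precision_at_one * set_size ** (-exponent)


def fit_exponent(set_sizes, precisions):
    """Least-squares slope of log(precision) on log(set size), sign-flipped."""
    xs = [math.log(n) for n in set_sizes]
    ys = [math.log(p) for p in precisions]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope


# Hypothetical, noise-free values used only to show the shape of the analysis.
set_sizes = [1, 2, 4, 6]
precisions = [power_law_precision(n, 1.0, 0.7) for n in set_sizes]
print(fit_exponent(set_sizes, precisions))
```

A smooth decline of this kind, with no abrupt change at three or four items, is the signature that continuous-resource accounts predict.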
The upper threshold of ∼3–4 representations observed in classical VWM
tasks may reflect a tendency to represent a subset of items in high resolution
and a subset in low resolution, with these low resolution representations
being treated incorrectly by many models as “guesses” rather than as
low-precision “hits” (van den Berg et al., 2012), though the question of
whether participants ever truly guess at random (due to a complete failure
to encode any detail from the target) is a matter of debate (cf. Fougnie
et al., 2012). Alternatively, VWM resources may be continuous—even while
behavioral performance exhibits strict (and seemingly discrete) capacity
limits—because VWM relies on a fundamentally discrete indexing resource
(possibly attention) to link VWM representations to spatial locations (Xu &
Chun, 2009). Even if observers can store more than four representations,
they may be unable to accurately link all of these representations to items
in the test display, causing them to make accurate responses to the wrong
items (Bays et al., 2009; Emrich & Ferber, 2012).
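The idea of an accurate response to the wrong item can be illustrated by scoring a response against every item in the display rather than against the probed item alone. The sketch below is a simplified illustration with hypothetical feature values in radians; it is not the model-based analysis used in the cited studies.

```python
import math


def circular_error(response, value):
    """Signed angular difference in radians, wrapped to [-pi, pi)."""
    return (response - value + math.pi) % (2.0 * math.pi) - math.pi


def describe_response(response, target, non_targets):
    """Return (error relative to the target, error relative to the nearest non-target)."""
    target_error = circular_error(response, target)
    nearest_non_target_error = min(
        (circular_error(response, value) for value in non_targets), key=abs)
    return target_error, nearest_non_target_error


# Hypothetical trial: the response lands far from the target but close to
# one of the other remembered items.
print(describe_response(response=2.1, target=0.0, non_targets=[2.0, -1.5]))
```

Scored against the target alone, such a trial looks like a guess; scored against all items, it looks like a precise memory bound to the wrong location.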
These results are also compatible with a recent perspective that suggests
that the fundamental units of VWM are not objects, but rather hierarchical
feature bundles that encompass both object-based advantages in storing individual features and higher-order regularities (e.g., spatial and featural) that
emerge across collections of objects, and that can enhance VWM capacity via
compression and summary statistics (Brady et al., 2011). Thus, even if there is
a capacity limit to the number of objects that can be stored independently with
great precision, the actual capacity of VWM may be much higher because
higher-order regularities may be encoded across all objects in the scene.
The work reviewed thus far has focused on the capacity and nature of VWM
resources. However, to truly explain why and how these resources are limited, it is critical to consider computational and neural models of VWM. For
example, multi-object working memory can be implemented via local spatial
interactions between neurons following a “Mexican Hat” formation in which
inhibition is high near the peak of each representation’s activation and falls
off with increasing distance. Such interactions help keep neighboring representations separate, but also keep overall inhibition within the network low
enough to allow multiple items to be represented in parallel. Given certain
parameterizations of these interactions, such neural models mimic typical
human VWM capacity limits in change-detection (Johnson et al., 2008). Relatedly, a recent, biologically plausible model (Wei et al., 2012) illustrates how
modeling representations as “activity bumps” in a continuous pool of neurons with excitatory and inhibitory properties gives rise to properties typically associated with both discrete and continuous models of VWM.
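One common textbook way to express a “Mexican Hat” interaction profile is as a difference of two Gaussians, a narrow excitatory one and a broader inhibitory one. The sketch below uses that form as an assumption; it is not the specific equation of the cited models, and the gains and widths are hypothetical.

```python
import math


def mexican_hat(distance, excite_gain=1.0, excite_width=1.0,
                inhibit_gain=0.5, inhibit_width=3.0):
    """Net lateral interaction at a given distance (difference of Gaussians)."""
    excitation = excite_gain * math.exp(-distance ** 2 / (2.0 * excite_width ** 2))
    inhibition = inhibit_gain * math.exp(-distance ** 2 / (2.0 * inhibit_width ** 2))
    return excitation - inhibition


# Net effect at the peak, in the inhibitory surround, and far away.
for d in (0.0, 3.0, 12.0):
    print(d, mexican_hat(d))
```

With these illustrative settings the net interaction is positive at the peak, negative in the surround, and close to zero at large distances, so distant representations barely suppress one another and several can coexist.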
Instead of conceptualizing VWM resources as having a fixed upper bound
capacity, a similar recent proposal suggests that these resources are fundamentally unlimited, with observed limits in task performance arising from
spatial competition between representations in content maps (Franconeri,
2013; Franconeri et al., 2013). Content maps are extensions of functionally- and spatially-organized neural substrate, and representations occupy
physical locations within these neural maps. Objects that are physically
close in visual or feature space thus become represented in neighboring
regions of visual cortex, and must therefore compete for the same pool of
neural resources. Visual capacity limits are thus a byproduct of the physical
limitations of neural real estate and the competitive interactions that emerge
between representations in these maps. For example, many VWM errors can
be accounted for as spatial mismatches between sample and test items (Bays
et al., 2009; Emrich & Ferber, 2012), and decreasing similarity in integral feature dimensions (e.g., color in a brightness memory task) increases precision
in VWM (Bae & Flombaum, 2013), suggesting that representational capacity
may be much higher than spatial indexing capacity.
Alternatively, VWM resource limitations may arise due to purely temporal properties of the neural substrates of memory. According to the principle
of oscillatory multiplexing, memory representations are encoded in oscillating patterns of global brain activity. Lisman and Idiart (1995) suggest that
a working memory capacity of 7 ± 2 could be derived from the number
of high-frequency gamma (40 Hz) brain oscillations that fit within a single
low-frequency alpha–theta (5–12 Hz) oscillation, though more recent formulations (Raffone & Wolters, 2001; Siegel et al., 2009) have argued for values
closer to the classic “slots” capacity of 4 ± 1 items. A major advantage of these
theories is that they show how brains that are inherently continuous (having
pools of billions of neurons) can nonetheless behave in a way that is discrete and slot-like. The downside is that there is very little direct empirical
support for these theories thus far. Furthermore, while multiplexing theories
are typically associated with discrete resource theories, they could also be
consistent with continuous resources if variability in these oscillatory rates
is causally related to memory precision. For example, different patterns of
oscillation might result in higher capacity but lower precision because each
representation receives a smaller temporal share of memory resources.
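The arithmetic behind the multiplexing account is a simple ratio: the number of fast cycles that fit within one slow cycle is the fast frequency divided by the slow frequency. The sketch below applies that ratio to the frequency values quoted above; the function name is illustrative, and treating each nested cycle as one stored item is the assumption of the account rather than an established fact.

```python
def cycles_per_slow_wave(fast_hz, slow_hz):
    """Number of fast oscillation cycles that fit inside one slow cycle."""
    return fast_hz / slow_hz


# 40 Hz gamma nested in slow oscillations across the quoted 5-12 Hz range.
for slow_hz in (5.0, 8.0, 12.0):
    print(slow_hz, cycles_per_slow_wave(40.0, slow_hz))
```

Across the quoted range the ratio runs from 8 cycles at 5 Hz down to about 3.3 cycles at 12 Hz, which shows how the implied capacity depends on the frequencies one assumes.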
Of course, a computational model that can simulate characteristics of
human performance via careful tweaking of free parameters is not especially
compelling in and of itself. Thus, the challenge is to link such parameters
to actual neural measures and individual differences in human behavioral
performance (e.g., the amount and shape of “inhibition” observed between
neighboring representations on some VWM task), and then use these
parameters to predict individual differences in resource capacity.
FOUNDATIONAL RESEARCH ON RESOURCE LIMITATIONS IN VISUAL ATTENTION
When observers focus all of their attention on a challenging primary task,
even highly salient changes (e.g., the sudden appearance of a gorilla) can
go unnoticed (Mack & Rock, 1998; Simons & Chabris, 1999). This effect,
called inattentional blindness, is thought to reflect the limited capacity of
attention—when attention is “used up” by the primary task, there are insufficient resources to detect the salient changes. While such demonstrations
reveal that attentional resources are limited, other measures focus on quantifying these limitations. In multiple object tracking (MOT), observers see a
display filled with identical objects, some of which are cued as “targets.” All
objects then move independently (and often, unpredictably), and observers
must keep track of the targets’ positions throughout the movement and later
reidentify them. These studies showed that people could track at least four
objects (Pylyshyn & Storm, 1988; Yantis, 1992) suggesting an underlying
capacity limit in the ability to divide attentional resources (Pylyshyn &
Storm, 1988). This limit matched values obtained from broader literatures
on memory (Cowan, 2001), leading some to suggest that VWM and visual
attention rely (at least partially) upon a common pool of resources, with
VWM being constrained by a form of “inward-directed” attention (Chun,
2011; Gazzaley & Nobre, 2012).
CUTTING-EDGE RESEARCH ON RESOURCE LIMITATIONS IN VISUAL ATTENTION
Mirroring the debate in the VWM literature over whether capacity reflects a
fixed number of discrete object-based “slots” or a more continuous resource,
new research on MOT explores similar divisions. Like VWM, the upper limit
on capacity does not appear to be fixed—recent studies show that there are in
fact display conditions where tracking capacity can be raised to eight objects
at once (Alvarez & Franconeri, 2007). Like VWM, there are arguments that
objects are represented not as individuals in “slots,” but that higher-order
structures (hierarchical feature bundles for VWM, e.g., Brady et al., 2011)
might help compress position representations of objects. For example, there
is evidence that tracked objects might be organized by common fate or as
vertices of a rigid polygon (Yantis, 1992; see Scimeca & Franconeri, 2015, for
discussion).
There are also several demonstrations of performance limits that appear to
reflect continuous allocation of processing resources. For example, participants are capable of tracking up to eight objects at a time, but only when the
objects move very slowly (Alvarez & Franconeri, 2007). Relatedly, at very fast
speeds only a single object can be successfully tracked (Holcombe & Chen,
2012). These results may suggest a capacity-precision tradeoff in visual
attention, with faster moving objects demanding additional resources,
though such explanations are controversial (cf. Franconeri et al., 2010).
Given how instrumental continuous report measures have been in recent
studies of the nature of VWM resources, the development of continuous
report measures in MOT may similarly inform debate over the nature of
visual attention resources. For example, a recent study in which “targets” in
a simplified tracking task disappeared at the end of each trial and then had
to be localized from memory via mouse clicks found that with increasing
tracking load, response clicks lagged increasingly far behind each target’s
true position relative to its direction of motion (Howard & Holcombe,
2008), though these effects do not seem to generalize to more typical MOT
displays (Howard et al., 2011). Relatedly, Horowitz and Cohen (2010) asked
participants to judge the last-remembered trajectory angle of targets at set
sizes ranging from 1 to 6. When they fit these data to a two-component
mixture model (as in Zhang & Luck, 2008) to derive separate estimates of
the probability and precision of tracking, they found that angular error for
targets increased continuously up to set size 6, consistent with a continuous
resource.
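A lag of the kind described above can be quantified by projecting each localization error onto the target's direction of motion. The sketch below is one plausible way to do this and is offered as an assumption, not as the analysis used in the cited studies; the coordinates and velocity are hypothetical.

```python
import math


def signed_lag(click, true_position, velocity):
    """Localization error along the motion direction (negative = behind the target)."""
    speed = math.hypot(velocity[0], velocity[1])
    if speed == 0.0:
        raise ValueError("velocity must be non-zero to define a motion direction")
    error_x = click[0] - true_position[0]
    error_y = click[1] - true_position[1]
    # Dot product of the error with the unit vector of motion.
    return (error_x * velocity[0] + error_y * velocity[1]) / speed


# Hypothetical trial: target at (10, 0) moving rightward, click slightly behind it.
print(signed_lag(click=(8.5, 0.3), true_position=(10.0, 0.0), velocity=(2.0, 0.0)))
```

A continuous-resource account would predict that the average of this signed quantity becomes more negative as tracking load increases.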
There are also parallels to the proposed set of underlying mechanisms for
limits in VWM, which can be divided into spatial (cortical map limitations)
and temporal (oscillatory multiplexing) theories. Explanations of attention
resources as spatially limited are supported by work demonstrating that
target-distractor spacing influences tracking capacity. When spacing is
maximized, observers can successfully track at least six objects in parallel,
irrespective of object speed (Franconeri et al., 2010; cf. Tombu & Seiffert,
2011), suggesting that attention may be a fundamentally continuous resource
with no strict capacity limit. Critically, this proposal explains limits in tracking capacity as a consequence of competition within spatial attention maps
(e.g., in the frontal eye fields and related parietal areas) arising during
close interactions between targets and distractors, as such interactions
involve destructive interference between representations. Relatedly, speed
and spatial crowding may limit tracking performance by creating spatial
confusability between targets and distractors (Franconeri et al., 2010),
analogous to how spatial confusions seem to reduce effective VWM capacity
(Bays et al., 2009). In fast-moving displays, target objects have more frequent
close interactions with distractors than in slow-moving displays, increasing
the probability of selecting the wrong item, and thus, lowering capacity
estimates.
Spatial content maps may also be partly interactive with feature-based
content maps, such that objects that are similar along one dimension can
be separately indexed (and successfully tracked or stored) so long as they
are distinct along another dimension. This possibility is supported by
recent work showing that tracking performance improves when targets and
distractors are visually distinct during close encounters (Bae & Flombaum,
2012), suggesting that competition within one content map (e.g., spatial
position) can be avoided or alleviated via distinctiveness within another
(e.g., position in color space).
Support for a temporal basis for limitations of attention resources is
provided by recent work claiming to show tradeoffs between capacity and
the temporal precision of tracking. Holcombe and Chen (2013) show that
a single target can be tracked at temporal resolutions up to 7 Hz (i.e., 0.58
rev/s on a clock face with 12 positions), but this threshold drops to 4 Hz
for two targets and 2.6 Hz for three targets. This point is controversial, with
some researchers arguing that these data do not reveal temporal resource
limitations (Scimeca, Jonathan, & Franconeri, submitted). Moreover, because such
effects could also be accommodated within a discrete-resource framework
(via Slots + Averaging), a critical question is whether at even higher set sizes,
temporal precision continues to decrease gradually, or whether temporal
precision eventually bottoms out as guessing rate increases (as a discrete
model would predict).
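The correspondence between temporal frequency and rotation rate quoted above can be checked with a one-line conversion. The relation used here (rotation rate equals temporal frequency divided by the number of equally spaced positions) is inferred from the example in the text, 7 Hz on a 12-position display giving about 0.58 rev/s, and should be read as an assumption rather than as the cited authors' definition.

```python
def revolutions_per_second(temporal_hz, n_positions=12):
    """Rotation rate at which each position is occupied temporal_hz times per second."""
    return temporal_hz / n_positions


# Thresholds quoted in the text for one, two, and three targets.
for targets, threshold_hz in ((1, 7.0), (2, 4.0), (3, 2.6)):
    print(targets, round(revolutions_per_second(threshold_hz), 3))
```

On this reading, the thresholds for two and three targets correspond to roughly 0.33 and 0.22 rev/s on the same 12-position display.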
Another open question is whether attentional resources can be allocated unevenly during tracking, as appears possible for continuous VWM
resources (Bays & Husain, 2008). While there is no solid evidence to date
for such effects, a study design in which multiple targets per trial are
probed with a continuous response (e.g., location or trajectory angle), or in
which participants are asked to respond to the “best remembered” versus a
randomly-chosen item (as in Fougnie et al., 2012), or in which some targets
are designated as “high-priority” and others as “low-priority” (perhaps
reinforced by a monetary incentive structure), would be helpful in resolving
this question.
It seems clear that attention researchers have much to gain by following
developments in the study of VWM and vice versa. Even if visual attention and VWM involve distinct (though partially interactive) resources, both
resources may be subject to similar architectural constraints. Also at stake is
the question of whether attention and memory reflect the same resource, as
such an account would require that attention and VWM resources are limited
in the same way—either discretely or continuously.
CONCLUSIONS
While visual attention and VWM have often been studied as separate topics,
using distinct methodologies (and, more often than not, by different groups
of researchers), it seems clear that there is much to be gained from increasing
theoretical and methodological convergence between these areas of research.
For example, we have seen how major methodological advances in VWM
research (e.g., continuous report paradigms and neural signatures such as
the CDA) can inform fundamental theories of visual attention, in particular,
the question of whether attention resources are fundamentally discrete or
continuous, and of whether attention and VWM resources are the same or
are distinct.
There are still many open questions left to resolve about the nature of these
resources, but there is mounting evidence for continuous resource models in
visual attention and VWM. If these resources truly are continuous, it will be
crucial to understand why human performance sometimes appears to be discrete, and to synthesize this finding with neural measures (such as the CDA
and global oscillatory activity) that are inherently discrete. More generally,
a deeper understanding of resource limitations will require both continued
advances in behavioral and neural measures of visual attention and VWM
capacity and crosstalk between these fields. Eventually, visual cognition can
begin to move beyond questions of how these resources are limited to more
fundamental questions about what these resources are and how to maximize
them in everyday contexts.
KEY ISSUES FOR FUTURE RESEARCH
1. How much overlap is there between visual attention and VWM
resources? Do they have the same capacity? Are they both discrete or
both continuous?
2. Is there such a thing as a true “guess” in VWM? Or are all objects in the
scene always encoded with at least a minimal amount of resources?
3. Is there a true upper bound to the number of objects that can be remembered or tracked? Likewise, is there an upper bound to the resolution
with which a single object can be tracked or remembered?
4. To what extent are objects encoded independently versus hierarchically
in VWM and attention?
5. Can speed impair tracking performance independently of spacing?
6. Can visual attention resources be divided unevenly among objects, as
VWM resources seemingly can be?
7. What is the ultimate neural basis of visual resources? Can individual
differences in VWM and MOT performance be predicted based on differences in brain activity?
REFERENCES
Alvarez, G. A., & Cavanagh, P. (2004). The capacity of visual short-term memory is
set both by visual information load and by number of objects. Psychological Science,
15, 106–111.
Alvarez, G. A., & Franconeri, S. L. (2007). How many objects can you track? Evidence
for a resource-limited tracking mechanism. Journal of Vision, 7(13), 1–10.
Anderson, D. E., Vogel, E. K., & Awh, E. (2011). Precision in visual working memory reaches a stable plateau when individual item limits are exceeded. Journal of
Neuroscience, 31, 1128–1138.
Awh, E., Barton, B., & Vogel, E. K. (2007). Visual working memory represents a fixed
number of items regardless of complexity. Psychological Science, 18, 622–628.
Bae, G. Y., & Flombaum, J. I. (2012). Close encounters of the distracting kind: Identifying the cause of visual tracking errors. Attention, Perception, & Psychophysics, 74,
703–715.
Bae, G. Y., & Flombaum, J. I. (2013). Two items remembered as precisely as one: How
integral features can improve visual working memory. Psychological Science, 24,
2038–2047.
Bays, P. M., Catalao, R. F., & Husain, M. (2009). The precision of visual working memory is set by allocation of a shared resource. Journal of Vision, 9(10), 1–11.
Bays, P. M., & Husain, M. (2008). Dynamic shifts of limited working memory
resources in human vision. Science, 321, 851–854.
van den Berg, R., Shin, H., Chou, W. C., George, R., & Ma, W. J. (2012). Variability in
encoding precision accounts for visual short-term memory limitations. Proceedings
of the National Academy of Sciences, U.S.A., 109, 8780–8785.
Brady, T. F., Konkle, T., & Alvarez, G. A. (2011). A review of visual memory capacity:
Beyond individual items and toward structured representations. Journal of Vision,
11(5), 1–34.
Chun, M. M. (2011). Visual working memory as visual attention sustained internally
over time. Neuropsychologia, 49, 1407–1409.
Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration
of mental storage capacity. Behavioral & Brain Sciences, 24, 87–185.
Drew, T., Horowitz, T. S., Wolfe, J. M., & Vogel, E. K. (2011). Delineating the neural signatures of tracking spatial position and working memory during attentive
tracking. Journal of Neuroscience, 31, 659–668.
Drew, T., & Vogel, E. K. (2008). Neural measures of individual differences in selecting
and tracking multiple moving objects. Journal of Neuroscience, 28, 4183–4191.
Emrich, S. M., & Ferber, S. (2012). Competition increases binding errors in visual
working memory. Journal of Vision, 12(4), 1–16.
Fougnie, D., Suchow, J. W., & Alvarez, G. A. (2012). Variability in the quality of visual
working memory. Nature Communications, 3(1229), 1–8.
Franconeri, S. L. (2013). The nature and status of visual resources. In D. Reisberg
(Ed.), Oxford handbook of cognitive psychology (pp. 147–162). New York, NY: Oxford
University Press.
Franconeri, S. L., Alvarez, G. A., & Cavanagh, P. (2013). Flexible cognitive resources:
Competitive content maps for attention and memory. Trends in Cognitive Sciences,
17, 134–141.
Franconeri, S. L., Jonathan, S., & Scimeca, J. M. (2010). Tracking multiple objects is
limited only by object spacing, not speed, time, or capacity. Psychological Science,
21, 920–925.
Gazzaley, A., & Nobre, A. C. (2012). Top-down modulation: Bridging selective attention and working memory. Trends in Cognitive Sciences, 16, 129–135.
Gorgoraptis, N., Catalao, R. F., Bays, P. M., & Husain, M. (2011). Dynamic updating of working memory resources for visual objects. Journal of Neuroscience, 31,
8502–8511.
Holcombe, A. O., & Chen, W. Y. (2012). Exhausting attentional tracking resources
with a single fast-moving object. Cognition, 123, 218–228.
Holcombe, A. O., & Chen, W. Y. (2013). Splitting attention reduces temporal resolution from 7 Hz for tracking one object to <3 Hz when tracking three. Journal of
Vision, 13(1), 1–19.
Horowitz, T. S., & Cohen, M. A. (2010). Direction information in multiple object
tracking is limited by a graded resource. Attention, Perception, & Psychophysics, 72,
1765–1775.
Howard, C. J., & Holcombe, A. O. (2008). Tracking the changing features of multiple objects: Progressively poorer perceptual precision and progressively greater
perceptual lag. Vision Research, 48, 1164–1180.
Howard, C. J., Masom, D., & Holcombe, A. O. (2011). Position representations lag
behind targets in multiple object tracking. Vision Research, 51, 1907–1919.
Johnson, J. S., Spencer, J. P., & Schöner, G. (2008). Moving to higher ground: The
dynamic field theory and the dynamics of visual cognition. New Ideas in Psychology,
26, 227–251.
Lisman, J. E., & Idiart, M. A. (1995). Storage of 7 ± 2 short-term memories in oscillatory subcycles. Science, 267, 1512–1515.
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features
and conjunctions. Nature, 390, 279–281.
Mack, A., & Rock, I. (1998). Inattentional blindness. Cambridge, MA: MIT Press.
Pylyshyn, Z. W., & Storm, R. W. (1988). Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial Vision, 3, 179–197.
Raffone, A., & Wolters, G. (2001). A cortical mechanism for binding in visual working
memory. Journal of Cognitive Neuroscience, 13, 766–785.
Rensink, R. A., O’Regan, J. K., & Clark, J. J. (1997). To see or not to see:
The need for attention to perceive changes in scenes. Psychological Science, 8,
368–373.
Scimeca, J. M., & Franconeri, S. L. (2015). Selecting and tracking multiple objects.
Wiley Interdisciplinary Reviews: Cognitive Science. Advance online publication.
http://dx.doi.org/10.1002/wcs.1328.
Scimeca, J. M., Jonathan, S., & Franconeri, S. L. (submitted). Maintaining selection
of multiple objects. Retrieved from http://www.journalofvision.org/content/12/9/553.short.
Scott-Brown, K. C., Baker, M. R., & Orbach, H. S. (2000). Comparison blindness.
Visual Cognition, 7, 253–267.
Siegel, M., Warden, M. R., & Miller, E. K. (2009). Phase-dependent neuronal coding
of objects in short-term memory. Proceedings of the National Academy of Sciences,
U.S.A., 106, 21341–21346.
Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional
blindness for dynamic events. Perception, 28, 1059–1074.
Simons, D. J., & Levin, D. T. (1997). Change blindness. Trends in Cognitive Sciences, 1,
261–267.
Tombu, M., & Seiffert, A. E. (2011). Tracking planets and moons: Mechanisms
of object tracking revealed with a new paradigm. Attention, Perception, & Psychophysics, 73, 738–750.
Vogel, E. K., & Machizawa, M. G. (2004). Neural activity predicts individual differences in visual working memory capacity. Nature, 428, 748–751.
Vogel, E. K., Woodman, G. F., & Luck, S. J. (2001). Storage of features, conjunctions
and objects in visual working memory. Journal of Experimental Psychology: Human
Perception & Performance, 27, 92–114.
Wei, Z., Wang, X., & Wang, D. (2012). From distributed resources to limited slots in
multiple-item working memory: A spiking network model with normalization.
Journal of Neuroscience, 32, 11228–11240.
Wheeler, M. E., & Treisman, A. M. (2002). Binding in short-term visual memory. Journal of Experimental Psychology: General, 131, 48–64.
Xu, Y. (2002). Limitations in object-based feature encoding in visual short-term
memory. Journal of Experimental Psychology: Human Perception & Performance, 28,
458–468.
Xu, Y., & Chun, M. M. (2009). Selecting and perceiving multiple visual objects. Trends
in Cognitive Sciences, 13, 167–174.
Yantis, S. (1992). Multielement visual tracking: Attention and perceptual organization. Cognitive Psychology, 24, 295–340.
Zhang, W., & Luck, S. J. (2008). Discrete fixed-resolution representations in visual
working memory. Nature, 453, 233–235.

BRANDON M. LIVERENCE SHORT BIOGRAPHY
Brandon M. Liverence is currently a postdoctoral research fellow in the
Northwestern Visual Cognition Lab, and he received his PhD at Yale University working with Brian Scholl. His current projects explore the nature
of individuation and ensemble representation in visual working memory,
and connections between spatial navigation and object representation. He
is funded by a National Research Service Award from the National Eye
Institute. His work has been published in journals such as Psychological
Science and JEP: General.
STEVEN L. FRANCONERI SHORT BIOGRAPHY
Steven L. Franconeri is an associate professor of Psychology at Northwestern
University. His lab studies visual cognition, graph comprehension,
and data visualization. He completed his PhD in Experimental Psychology at
Harvard with a National Defense Science and Engineering Fellowship, and
did postdoctoral work at UBC with a Killam Fellowship. He has received
the Psychonomics Early Career Award and an NSF CAREER award, and his
work receives funding from the NSF, NIH, and the Department of Education.
His lab strives to explore fundamental questions that also have real-world
relevance, collaborating with researchers in education (e.g., graph comprehension) and computer science (e.g., comparison within information visualization). These collaborations allow basic research to impact students and
scientists, while their unsolved problems help us identify gaps in our theoretical knowledge.
RELATED ESSAYS
Spatial Attention (Psychology), Kyle R. Cave
Cultural Neuroscience: Connecting Culture, Brain, and Genes (Psychology),
Shinobu Kitayama and Sarah Huff
Models of Duality (Psychology), Anand Krishna et al.
Neural and Cognitive Plasticity (Psychology), Eduardo Mercado III
Embodied Knowledge (Psychology), Diane Pecher and René Zeelenberg
Attention and Perception (Psychology), Ronald A. Rensink
Event Processing as an Executive Enterprise (Psychology), Robbie A. Ross and
Dare A. Baldwin
The Intrinsic Dynamics of Development (Psychology), Paul van Geert and
Marijn van Dijk
How Form Constrains Function in the Human Brain (Psychology), Timothy
D. Verstynen
Speech Perception (Psychology), Athena Vouloumanos
Behavioral Heterochrony (Anthropology), Victoria Wobber and Brian Hare

Resource Limitations
in Visual Cognition
BRANDON M. LIVERENCE and STEVEN L. FRANCONERI

Abstract
Visual attention and visual working memory are two of the core resources that support visual perception. Foundational research has demonstrated that these resources
are highly limited, but an active debate concerns exactly how they are limited. While
many classic studies suggested that these resources are fundamentally discrete, with
fixed capacity of 3–4 objects maximum, a number of recent studies have argued that
these resources are fundamentally continuous, with no fixed upper-bound to the
number of objects that can be attended or remembered. This entry reviews the state
of this debate, and shows how convergence between these (often separate) areas of
research is a major emerging trend in the field of visual cognition.

INTRODUCTION
The visual system is constantly overwhelmed with information. As the
amount of input registered in early vision far outstrips the capacity of more
computationally expensive later stages of visual processing, it is impossible
to fully process and perceive everything in view at any given moment.
Additionally, because low-level visual input is frequently in flux (due to
blinks, eye movements, and physical changes in the environment), the visual
system has to solve tricky correspondence problems in order to maintain
perceptual stability. To meet these challenges, vision relies on a pair of core
resources: visual attention, which serves as a filter to ensure that only relevant
objects are fully processed, and visual working memory (VWM), which
supports perceptual stability by providing a temporary storage for recent
visual input. Unfortunately, these resources are highly limited, and there are
often limits to the number of objects that can be simultaneously attended or
actively remembered. When these limits are exceeded, dramatic failures of
visual awareness can occur (e.g., inattentional blindness and change blindness).
This entry will explore the nature of these visual resources and address the
following questions along the way: How many objects can be attended or
remembered at one time? Are these resources fundamentally discrete (with
fixed precision) or continuous (with variable precision)? What explains these
limits, and are they fixed, or are there ways to increase one’s resources?
While drawing novel connections between parallel work on visual attention
and VWM, this essay will show that their convergence, both theoretical
and methodological, represents a major emerging trend in visual cognition.

Emerging Trends in the Social and Behavioral Sciences. Edited by Robert Scott and Stephen Kosslyn.
© 2015 John Wiley & Sons, Inc. ISBN 978-1-118-90077-2.

FOUNDATIONAL RESEARCH ON VISUAL WORKING
MEMORY RESOURCES
VWM is a highly limited resource, as is clear from demonstrations of
“change blindness” in which observers fail to detect dramatic changes
occurring between glances of a scene (Rensink et al., 1997; Simons & Levin,
1997; cf. Scott-Brown et al., 2000, for an alternative discussion of how such
effects may reflect “comparison blindness” rather than memory limitations).
A simplified version of this change detection paradigm has become a common
way to measure VWM. In one seminal study (Luck & Vogel, 1997), observers
briefly viewed 1–12 objects that disappeared briefly and then reappeared,
with a change occurring to a single object on some trials. Observers’
performance at detecting these changes suggested that they could store
the features of only ∼4 objects per trial, confirming that VWM capacity is
quite low. Intriguingly, observers were just as good at noticing changes to
objects that had only one feature as to objects that could change along any
of four feature dimensions, which led the authors to conclude that VWM
is a fundamentally discrete resource constrained by the number of objects
stored rather than their complexity. While subsequent studies challenged the
strongest versions of this hypothesis (Wheeler & Treisman, 2002; Xu, 2002),
the basic finding of an upper limit in VWM capacity of 3–4 fairly simple
objects has been repeatedly verified (e.g., Awh et al., 2007; Vogel et al., 2001).
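As a rough illustration of how change-detection accuracy is commonly converted into a capacity estimate, the sketch below applies two widely used formulas that the text above does not spell out: Cowan's K for single-probe tests and Pashler's K for whole-display tests. The function names and the hit and false-alarm rates are hypothetical, chosen only to show the arithmetic; they are not data from Luck and Vogel (1997), and nothing here is meant to describe that study's own analysis.

```python
def cowan_k(set_size, hit_rate, false_alarm_rate):
    """Capacity estimate for single-probe change detection: K = N * (H - FA)."""
    return set_size * (hit_rate - false_alarm_rate)


def pashler_k(set_size, hit_rate, false_alarm_rate):
    """Capacity estimate for whole-display change detection: K = N * (H - FA) / (1 - FA)."""
    return set_size * (hit_rate - false_alarm_rate) / (1.0 - false_alarm_rate)


# Hypothetical rates for an 8-item display (illustrative only, not real data).
k_single = cowan_k(8, hit_rate=0.75, false_alarm_rate=0.25)
k_whole = pashler_k(8, hit_rate=0.75, false_alarm_rate=0.25)
print(k_single, round(k_whole, 2))
```

Under either formula, the estimate stops growing once the set size exceeds what observers can store, which is the pattern usually read as a capacity limit.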
An influential study (Zhang & Luck, 2008) using a continuous report
paradigm provides even more powerful evidence for discrete VWM
resources. Observers briefly viewed 1–6 objects and then reported a test
object’s feature value from memory (e.g., color) by selecting it from a
continuous circular distribution (e.g., a color wheel). When the data were fit
to a mixture model with a normally-distributed component (reflecting trials
in which the probed item was noisily encoded) and a uniform component
(reflecting random guessing for unencoded items), they found that the
uniform component sharply increased from set size 3 to 6 but that the standard deviation of the normally-distributed component (a measure of the
precision of encoding) did not change. This suggests that once participants
had fully allocated their ∼3 fixed-capacity VWM “slots,” they failed to
encode any information from additional items and had to guess at random.
Precision also decreased from set size 1 to 3, which the authors explain in
terms of a “Slots + Averaging” model: for set sizes under 3, participants
allocate multiple slots (each containing some independent noise) to each
item, allowing them to improve their performance by averaging across
multiple noisy representations. Critically, however, the Slots + Averaging
model cannot account for recent findings that the typical drop in precision from
set size 1 to 2 is larger than would be predicted by averaging (Bays et al., 2009)
and that under some conditions there is no drop in precision in this range
whatsoever (Bae & Flombaum, 2013).
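The two-component mixture analysis described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' fitting procedure: the memory component is modeled as normal noise wrapped onto the circle, the fit is a coarse grid search, the data are simulated, and all function and parameter names (`p_mem`, `sd`, and so on) are hypothetical.

```python
import math
import random

TWO_PI = 2.0 * math.pi


def mixture_density(error, p_mem, sd):
    """Density of a response error (radians) under memory-plus-guessing."""
    # Memory component: normal noise wrapped onto the circle (three wraps
    # are an adequate approximation for the sd values used here).
    wrapped = sum(
        math.exp(-0.5 * ((error + TWO_PI * k) / sd) ** 2) for k in (-1, 0, 1)
    ) / (sd * math.sqrt(TWO_PI))
    # Guessing component: every value on the wheel is equally likely.
    uniform = 1.0 / TWO_PI
    return p_mem * wrapped + (1.0 - p_mem) * uniform


def simulate_errors(n_trials, p_mem, sd, rng):
    """Draw response errors from the mixture (synthetic data, not real observers)."""
    errors = []
    for _ in range(n_trials):
        if rng.random() < p_mem:
            raw = rng.gauss(0.0, sd)
            errors.append((raw + math.pi) % TWO_PI - math.pi)  # wrap to [-pi, pi)
        else:
            errors.append(rng.uniform(-math.pi, math.pi))
    return errors


def fit_mixture(errors):
    """Grid-search maximum likelihood over p_mem and sd."""
    p_grid = [i / 20 for i in range(1, 20)]        # 0.05 .. 0.95
    sd_grid = [0.1 + 0.05 * i for i in range(19)]  # 0.10 .. 1.00 rad
    best = None
    for p_mem in p_grid:
        for sd in sd_grid:
            log_lik = sum(math.log(mixture_density(e, p_mem, sd)) for e in errors)
            if best is None or log_lik > best[0]:
                best = (log_lik, p_mem, sd)
    return best[1], best[2]


rng = random.Random(0)
errors = simulate_errors(1000, p_mem=0.6, sd=0.35, rng=rng)
p_hat, sd_hat = fit_mixture(errors)
print(f"P(memory) = {p_hat:.2f}, guess rate = {1 - p_hat:.2f}, sd = {sd_hat:.2f} rad")
```

In this framing, the discrete account predicts that the guess rate (one minus `p_mem`) climbs with set size beyond capacity while `sd` stays flat, which is the pattern the study reported between set sizes 3 and 6.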
Another source of evidence for discrete VWM resource limits comes
from investigations of the contralateral delay activity (CDA), a putative
electrophysiological index of the number of items stored in VWM that
increases with memory load but reaches a plateau at ∼3 objects in typical
observers (Anderson et al., 2011; Vogel & Machizawa, 2004). Notably, the
CDA also indexes set size in a multifocal attention task (as reviewed below;
Drew & Vogel, 2008; Drew et al., 2011), suggesting that it may reflect spatial
selection rather than memory, per se. Also, if the CDA fundamentally reflects
discrete memory “slots,” then following the logic of Slots + Averaging, all
slots should be used in parallel even at set sizes 1–2 to improve precision
via averaging, implying that there should not be differences in the CDA at
small versus large set sizes.
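The logic of this objection can be made concrete with a small sketch of the Slots + Averaging account. It rests on assumptions that go beyond the text: slots are shared evenly across items (so an item may receive a fractional number of slots), and the per-slot noise of 20 degrees is an arbitrary illustrative value. The function name and parameters are hypothetical.

```python
import math


def slots_plus_averaging(set_size, n_slots=3, slot_sd=20.0):
    """Return (P(item stored), predicted SD, slots in use) for one set size.

    Idealization: slots are shared evenly, so each item receives
    n_slots / set_size slots even when that is not a whole number.
    """
    if set_size <= n_slots:
        slots_per_item = n_slots / set_size
        # Averaging m independent noisy copies shrinks the SD by sqrt(m).
        return 1.0, slot_sd / math.sqrt(slots_per_item), n_slots
    # Beyond capacity: one slot per stored item, the remaining items are guessed.
    return n_slots / set_size, slot_sd, n_slots


for n in (1, 2, 3, 6):
    p_stored, sd, in_use = slots_plus_averaging(n)
    print(n, round(p_stored, 2), round(sd, 1), in_use)
```

Under these assumptions the number of slots in use is the same at every set size, so a signal that tracked slot usage alone would not be expected to differ between small and large displays.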
CUTTING-EDGE RESEARCH ON RESOURCE LIMITATIONS
IN VISUAL WORKING MEMORY
One of the central debates in VWM research concerns whether VWM
resources are constituted by discrete fixed-capacity “slots,” or a flexible,
continuous resource that can be variably allocated to any number of objects.
Critically, continuous resource models predict tradeoffs between complexity
and capacity, as complex objects are assumed to require more resources
to encode. Strong evidence for such effects came from a study (Alvarez &
Cavanagh, 2004) that tested VWM capacity using stimuli ranging from very
simple color patches to highly complex objects, such as Chinese characters
or multi-shade cubes. The authors used both a change-detection task to
estimate memory capacity for these different stimulus types, and a separate
speeded-search task to quantify the complexity of each stimulus type. They
observed an almost perfect linear correlation between these measures,
verifying that memory capacity was much lower for the most complex
objects (e.g., only ∼1.5 cubes could be stored per trial).
Other work has shown that it is possible to store more than four representations in VWM, albeit at low precision. Bays & Husain (2008) found that in
a spatial memory task, VWM resources could be spread among up to at least
six objects, and that at increased set sizes there was a concomitant decrease
in precision that follows a power law, consistent with a continuous resource
being spread ever more thinly. There is also evidence that different objects in
a scene can receive differing amounts of memory resources, with numerous
studies finding that some objects can be prioritized (via cueing) over others
(Bays & Husain, 2008; Gorgoraptis et al., 2011). Even in the absence of cues,
memory precision seems to naturally vary between objects and across trials,
consistent with continuous resources (Fougnie et al., 2012).
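A power-law decline in precision can be written as precision(N) = precision(1) × N^(−k), where N is the set size and k is a free exponent. The sketch below generates noise-free synthetic precisions from that formula and recovers k from the slope of a log-log regression. The exponent, the baseline precision, and the set sizes are arbitrary illustrative choices, not values reported by Bays and Husain (2008), and the function names are hypothetical.

```python
import math


def power_law_precision(set_size, precision_one, exponent):
    """Precision when a shared resource is split over set_size items."""
    return precision_one * set_size ** (-exponent)


def fit_exponent(set_sizes, precisions):
    """Least-squares slope of log(precision) on log(set size), sign-flipped."""
    xs = [math.log(n) for n in set_sizes]
    ys = [math.log(p) for p in precisions]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope


set_sizes = [1, 2, 4, 6]
# Synthetic, noise-free precisions generated with an arbitrary exponent of 0.75.
precisions = [power_law_precision(n, precision_one=2.0, exponent=0.75) for n in set_sizes]
print(round(fit_exponent(set_sizes, precisions), 3))
```

The contrast with a slot account is that the power law keeps declining smoothly at every set size, with no plateau once some fixed number of items is reached.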
The upper threshold of ∼3–4 representations observed in classical VWM
tasks may reflect a tendency to represent a subset of items in high resolution
and a subset in low resolution, with these low resolution representations
being treated incorrectly by many models as “guesses” rather than as
low-precision “hits” (van den Berg et al., 2012), though the question of
whether participants ever truly guess at random (due to a complete failure
to encode any detail from the target) is a matter of debate (cf. Fougnie
et al., 2012). Alternatively, VWM resources may be continuous—even while
behavioral performance exhibits strict (and seemingly discrete) capacity
limits—because VWM relies on a fundamentally discrete indexing resource
(possibly attention) to link VWM representations to spatial locations (Xu &
Chun, 2009). Even if observers can store more than four representations,
they may be unable to accurately link all of these representations to items
in the test display, causing them to make accurate responses to the wrong
items (Bays et al., 2009; Emrich & Ferber, 2012).
These results are also compatible with a recent perspective that suggests
that the fundamental units of VWM are not objects, but rather hierarchical
feature bundles that encompass both object-based advantages in storing individual features and higher-order regularities (e.g., spatial and featural) that
emerge across collections of objects, and that can enhance VWM capacity via
compression and summary statistics (Brady et al., 2011). Thus, even if there is
a capacity limit to the number of objects that can be stored independently with
great precision, the actual capacity of VWM may be much higher because
higher-order regularities may be encoded across all objects in the scene.
The work reviewed thus far has focused on the capacity and nature of VWM
resources. However, to truly explain why and how these resources are limited, it is critical to consider computational and neural models of VWM. For
example, multi-object working memory can be implemented via local spatial
interactions between neurons following a “Mexican Hat” formation in which
inhibition is high near the peak of each representation’s activation and falls
off with increasing distance. Such interactions help keep neighboring representations separate, but also keep overall inhibition within the network low
enough to allow multiple items to be represented in parallel. Given certain
parameterizations of these interactions, such neural models mimic typical
human VWM capacity limits in change-detection (Johnson et al., 2008). Relatedly, a recent, biologically plausible model (Wei et al., 2012) illustrates how
modeling representations as “activity bumps” in a continuous pool of neurons with excitatory and inhibitory properties gives rise to properties typically associated with both discrete and continuous models of VWM.
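One common way to write a “Mexican Hat” interaction profile is as a difference of two Gaussians: a narrow excitatory one and a broader, weaker inhibitory one. The sketch below uses that form as an assumption; it is not necessarily the kernel used in the models cited above, and the gains and widths are arbitrary illustrative values.

```python
import math


def mexican_hat(distance, excite_gain=1.0, excite_width=1.0,
                inhibit_gain=0.5, inhibit_width=3.0):
    """Net interaction strength between two units separated by `distance`."""
    excitation = excite_gain * math.exp(-distance ** 2 / (2 * excite_width ** 2))
    inhibition = inhibit_gain * math.exp(-distance ** 2 / (2 * inhibit_width ** 2))
    return excitation - inhibition


for d in (0, 1, 3, 6, 12):
    print(d, round(mexican_hat(d), 4))
```

With these parameters the net interaction is positive at the center, most strongly negative for near neighbors, and close to zero at large distances, so distant representations barely suppress one another.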
Instead of conceptualizing VWM resources as having a fixed upper bound
capacity, a similar recent proposal suggests that these resources are fundamentally unlimited, with observed limits in task performance arising from
spatial competition between representations in content maps (Franconeri,
2013; Franconeri et al., 2013). Content maps are extensions of functionally- and spatially-organized neural substrate, and representations occupy
physical locations within these neural maps. Objects that are physically
close in visual or feature space thus become represented in neighboring
regions of visual cortex, and must therefore compete for the same pool of
neural resources. Visual capacity limits are thus a byproduct of the physical
limitations of neural real estate and the competitive interactions that emerge
between representations in these maps. For example, many VWM errors can
be accounted for as spatial mismatches between sample and test items (Bays
et al., 2009; Emrich & Ferber, 2012), and decreasing similarity in integral feature dimensions (e.g., color in a brightness memory task) increases precision
in VWM (Bae & Flombaum, 2013), suggesting that representational capacity
may be much higher than spatial indexing capacity.
Alternatively, VWM resource limitations may arise due to purely temporal properties of the neural substrates of memory. According to the principle
of oscillatory multiplexing, memory representations are encoded in oscillating patterns of global brain activity. Lisman and Idiart (1995) suggest that
a working memory capacity of 7 ± 2 could be derived from the number
of high-frequency gamma (40 Hz) brain oscillations that fit within a single
low-frequency alpha–theta (5–12 Hz) oscillation, though more recent formulations (Raffone & Wolters, 2001; Siegel et al., 2009) have argued for values
closer to the classic “slots” capacity of 4 ± 1 items. A major advantage of these
theories is that they show how brains that are inherently continuous (having
pools of billions of neurons) can nonetheless behave in a way that is discrete and slot-like. The downside is that there is very little direct empirical
support for these theories thus far. Furthermore, while multiplexing theories
are typically associated with discrete resource theories, they could also be
consistent with continuous resources if variability in these oscillatory rates
is causally related to memory precision. For example, different patterns of
oscillation might result in higher capacity but lower precision because each
representation receives a smaller temporal share of memory resources.
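The arithmetic behind the multiplexing idea is simply a ratio of frequencies: the number of fast cycles that fit inside one slow cycle. The sketch below applies it to the frequencies given in the text (40 Hz gamma, 5–12 Hz alpha–theta); the function name is hypothetical, and treating each fast cycle as holding exactly one item is a simplifying assumption.

```python
def items_per_slow_cycle(fast_hz, slow_hz):
    """Number of fast cycles that fit inside one slow cycle."""
    return fast_hz / slow_hz


GAMMA_HZ = 40.0  # gamma frequency given in the text
for slow_hz in (5.0, 12.0):  # ends of the alpha-theta range given in the text
    print(slow_hz, round(items_per_slow_cycle(GAMMA_HZ, slow_hz), 2))
```

Across this range the ratio runs from roughly 3 to 8, so the predicted capacity depends heavily on which slow frequency is assumed. That is one way in which variability in oscillatory rates could map onto variability in capacity.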
Of course, a computational model that can simulate characteristics of
human performance via careful tweaking of free parameters is not especially
compelling in and of itself. Thus, the challenge is to link such parameters
to actual neural measures and individual differences in human behavioral
performance (e.g., the amount and shape of “inhibition” observed between
neighboring representations on some VWM task), and then use these
parameters to predict individual differences in resource capacity.
FOUNDATIONAL RESEARCH ON RESOURCE LIMITATIONS
IN VISUAL ATTENTION
When observers focus all of their attention on a challenging primary task,
even highly salient changes (e.g., the sudden appearance of a gorilla) can
go unnoticed (Mack & Rock, 1998; Simons & Chabris, 1999). This effect,
called inattentional blindness, is thought to reflect the limited capacity of
attention—when attention is “used up” by the primary task, there are insufficient resources to detect the salient changes. While such demonstrations
reveal that attentional resources are limited, other measures focus on quantifying these limitations. In multiple object tracking (MOT), observers see a
display filled with identical objects, some of which are cued as “targets.” All
objects then move independently (and often, unpredictably), and observers
must keep track of the targets’ positions throughout the movement and later
reidentify them. Early studies using this task showed that people could track at least four
objects (Pylyshyn & Storm, 1988; Yantis, 1992), suggesting an underlying
capacity limit in the ability to divide attentional resources. This limit matched values obtained from broader literatures
on memory (Cowan, 2001), leading some to suggest that VWM and visual
attention rely (at least partially) upon a common pool of resources, with
VWM being constrained by a form of “inward-directed” attention (Chun,
2011; Gazzaley & Nobre, 2012).
CUTTING-EDGE RESEARCH ON RESOURCE LIMITATIONS
IN VISUAL ATTENTION
Mirroring the debate in the VWM literature over whether capacity reflects a
fixed number of discrete object-based “slots” or a more continuous resource,
new research on MOT explores similar divisions. Like VWM, the upper limit
on capacity does not appear to be fixed—recent studies show that there are in
fact display conditions where tracking capacity can be raised to eight objects
at once (Alvarez & Franconeri, 2007). Like VWM, there are arguments that
objects are represented not as individuals in “slots,” but that higher-order
structures (hierarchical feature bundles for VWM, e.g., Brady et al., 2011)
might help compress position representations of objects. For example, there
is evidence that tracked objects might be organized by common fate or as
vertices of a rigid polygon (Yantis, 1992; see Scimeca & Franconeri, 2015, for
discussion).
There are also several demonstrations of performance limits that appear to
reflect continuous allocation of processing resources. For example, participants are capable of tracking up to eight objects at a time, but only when the
objects move very slowly (Alvarez & Franconeri, 2007). Relatedly, at very fast
speeds only a single object can be successfully tracked (Holcombe & Chen,
2012). These results may suggest a capacity-precision tradeoff in visual
attention, with faster moving objects demanding additional resources,
though such explanations are controversial (cf. Franconeri et al., 2010).
Given how instrumental continuous report measures have been in recent
studies of the nature of VWM resources, the development of continuous
report measures in MOT may similarly inform debate over the nature of
visual attention resources. For example, a recent study in which “targets” in
a simplified tracking task disappeared at the end of each trial and then had
to be localized from memory via mouseclicks found that with increasing
tracking load, response clicks lagged increasingly far behind each target’s
true position relative to its direction of motion (Howard & Holcombe,
2008), though these effects do not seem to generalize to more typical MOT
displays (Howard et al., 2011). Relatedly, Horowitz and Cohen (2010) asked
participants to judge the last-remembered trajectory angle of targets at set
sizes ranging from 1 to 6. When they fit these data to a two-component
mixture model (as in Zhang & Luck, 2008) to derive separate estimates of
the probability and precision of tracking, they found that angular error for
targets increased continuously up to set size 6, consistent with a continuous
resource.
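A continuous localization response of the kind described above can be scored as a signed lag: the component of the click error that lies along the target's direction of motion. The sketch below shows one plausible way to compute it. It is an illustration under assumptions, not the analysis used in the cited studies, and the function name, coordinates, and velocity are hypothetical.

```python
import math


def signed_lag(true_pos, click_pos, velocity):
    """Distance by which a click trails the target along its motion direction.

    Positive values mean the click fell behind the target; negative values
    mean it landed ahead.
    """
    speed = math.hypot(velocity[0], velocity[1])
    if speed == 0:
        raise ValueError("velocity must be non-zero to define a motion direction")
    unit = (velocity[0] / speed, velocity[1] / speed)
    offset = (click_pos[0] - true_pos[0], click_pos[1] - true_pos[1])
    # Project the click offset onto the motion direction, then flip the sign.
    return -(offset[0] * unit[0] + offset[1] * unit[1])


print(signed_lag(true_pos=(10.0, 0.0), click_pos=(8.0, 1.0), velocity=(3.0, 0.0)))
```

Averaging this quantity separately for each tracking load would show whether clicks fall further behind as more targets are tracked, without forcing responses into a correct/incorrect dichotomy.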
There are also parallels to the proposed set of underlying mechanisms for
limits in VWM, which can be divided into spatial (cortical map limitations)
and temporal (oscillatory multiplexing) theories. Explanations of attention
resources as spatially limited are supported by work demonstrating that
target-distractor spacing influences tracking capacity. When spacing is
maximized, observers can successfully track at least six objects in parallel,
irrespective of object speed (Franconeri et al., 2010; cf. Tombu & Seiffert,
2011), suggesting that attention may be a fundamentally continuous resource
with no strict capacity limit. Critically, this proposal explains limits in tracking capacity as a consequence of competition within spatial attention maps
(e.g., in the frontal eye fields and related parietal areas) arising during
close interactions between targets and distractors, as such interactions
involve destructive interference between representations. Relatedly, speed
and spatial crowding may limit tracking performance by creating spatial
confusability between targets and distractors (Franconeri et al., 2010),
analogous to how spatial confusions seem to reduce effective VWM capacity
(Bays et al., 2009). In fast-moving displays, target objects have more frequent
close interactions with distractors than in slow-moving displays, increasing
the probability of selecting the wrong item, and thus, lowering capacity
estimates.
Spatial content maps may also be partly interactive with feature-based
content maps, such that objects that are similar along one dimension can
be separately indexed (and successfully tracked or stored) so long as they
are distinct along another dimension. This possibility is supported by
recent work showing that tracking performance improves when targets and
distractors are visually distinct during close encounters (Bae & Flombaum,
2012), suggesting that competition within one content map (e.g., spatial
position) can be avoided or alleviated via distinctiveness within another
(e.g., position in color space).
Support for a temporal basis for limitations of attention resources is
provided by recent work claiming to show tradeoffs between capacity and
the temporal precision of tracking. Holcombe and Chen (2013) show that
a single target can be tracked at temporal resolutions up to 7 Hz (i.e., 0.58
rev/s on a clock face with 12 positions), but this threshold drops to 4 Hz
for two targets and 2.6 Hz for three targets. This point is controversial, with
some researchers arguing that these data do not reveal temporal resource
limitations (Scimeca, Jonathan, & Franconeri, submitted). Moreover, because such
effects could also be accommodated within a discrete-resource framework
(via Slots + Averaging), a critical question is whether at even higher set sizes,
temporal precision continues to decrease gradually, or whether temporal
precision eventually bottoms out as guessing rate increases (as a discrete
model would predict).
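The parenthetical conversion above (0.58 rev/s on a 12-position clock face corresponding to about 7 Hz) is consistent with reading temporal frequency as rotation rate multiplied by the number of positions. The sketch below applies that reading in both directions. The interpretation and the function names are assumptions made for illustration; only the 7, 4, and 2.6 Hz thresholds and the 0.58 rev/s figure come from the text.

```python
def temporal_frequency_hz(revs_per_second, n_positions):
    """Rate at which successive positions are visited, in Hz."""
    return revs_per_second * n_positions


def rotation_rate(temporal_hz, n_positions):
    """Inverse conversion: rotation rate (rev/s) for a given temporal frequency."""
    return temporal_hz / n_positions


print(round(temporal_frequency_hz(0.58, 12), 2))
for hz in (4.0, 2.6):
    print(hz, round(rotation_rate(hz, 12), 3))
```

On this reading, the drop from one to three targets corresponds to the tolerable rotation rate falling to well under half of its single-target value.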
Another open question is whether attentional resources can be allocated unevenly during tracking, as appears possible for continuous VWM
resources (Bays & Husain, 2008). While there is no solid evidence to date
for such effects, a study design in which multiple targets per trial are
probed with a continuous response (e.g., location or trajectory angle), or in
which participants are asked to respond to the “best remembered” versus a
randomly-chosen item (as in Fougnie et al., 2012), or in which some targets
are designated as “high-priority” and others as “low-priority” (perhaps
reinforced by a monetary incentive structure), would be helpful in resolving
this question.
It seems clear that attention researchers have much to gain by following
developments in the study of VWM and vice versa. Even if visual attention and VWM involve distinct (though partially interactive) resources, both
resources may be subject to similar architectural constraints. Also at stake is
the question of whether attention and memory reflect the same resource, as
such an account would require that attention and VWM resources are limited
in the same way—either discretely or continuously.
CONCLUSIONS
While visual attention and VWM have often been studied as separate topics,
using distinct methodologies (and, more often than not, by different groups
of researchers), it seems clear that there is much to be gained from increasing
theoretical and methodological convergence between these areas of research.
For example, we have seen how major methodological advances in VWM
research (e.g., continuous report paradigms and neural signatures such as
the CDA) can inform fundamental theories of visual attention, in particular,
the question of whether attention resources are fundamentally discrete or
continuous, and of whether attention and VWM resources are the same or
are distinct.
There are still many open questions left to resolve about the nature of these
resources, but there is mounting evidence for continuous resource models in
visual attention and VWM. If these resources truly are continuous, it will be
crucial to understand why human performance sometimes appears to be discrete, and to synthesize this finding with neural measures (such as the CDA
and global oscillatory activity) that are inherently discrete. More generally,
a deeper understanding of resource limitations will require both continued
advances in behavioral and neural measures of visual attention and VWM
capacity and crosstalk between these fields. Eventually, visual cognition can
begin to move beyond questions of how these resources are limited to more
fundamental questions about what these resources are and how to maximize
them in everyday contexts.
KEY ISSUES FOR FUTURE RESEARCH
1. How much overlap is there between visual attention and VWM
resources? Do they have the same capacity? Are they both discrete or
both continuous?
2. Is there such a thing as a true “guess” in VWM? Or are all objects in the
scene always encoded with at least a minimal amount of resources?
3. Is there a true upper bound to the number of objects that can be remembered or tracked? Likewise, is there an upper bound to the resolution
with which a single object can be tracked or remembered?
4. To what extent are objects encoded independently versus hierarchically
in VWM and attention?
5. Can speed impair tracking performance independently of spacing?
6. Can visual attention resources be divided unevenly among objects, as
VWM resources seemingly can be?
7. What is the ultimate neural basis of visual resources? Can individual
differences in VWM and MOT performance be predicted based on differences in brain activity?
REFERENCES
Alvarez, G. A., & Cavanagh, P. (2004). The capacity of visual short-term memory is
set both by visual information load and by number of objects. Psychological Science,
15, 106–111.
Alvarez, G. A., & Franconeri, S. L. (2007). How many objects can you track? Evidence
for a resource-limited tracking mechanism. Journal of Vision, 7(13), 1–10.
Anderson, D. E., Vogel, E. K., & Awh, E. (2011). Precision in visual working memory reaches a stable plateau when individual item limits are exceeded. Journal of
Neuroscience, 31, 1128–1138.
Awh, E., Barton, B., & Vogel, E. K. (2007). Visual working memory represents a fixed
number of items regardless of complexity. Psychological Science, 18, 622–628.
Bae, G. Y., & Flombaum, J. I. (2012). Close encounters of the distracting kind: Identifying the cause of visual tracking errors. Attention, Perception, & Psychophysics, 74,
703–715.
Bae, G. Y., & Flombaum, J. I. (2013). Two items remembered as precisely as one: How
integral features can improve visual working memory. Psychological Science, 24,
2038–2047.
Bays, P. M., Catalao, R. F., & Husain, M. (2009). The precision of visual working memory is set by allocation of a shared resource. Journal of Vision, 9(10), 1–11.
Bays, P. M., & Husain, M. (2008). Dynamic shifts of limited working memory
resources in human vision. Science, 321, 851–854.
van den Berg, R., Shin, H., Chou, W. C., George, R., & Ma, W. J. (2012). Variability in
encoding precision accounts for visual short-term memory limitations. Proceedings
of the National Academy of Sciences, U.S.A., 109, 8780–8785.
Brady, T. F., Konkle, T., & Alvarez, G. A. (2011). A review of visual memory capacity:
Beyond individual items and toward structured representations. Journal of Vision,
11(5), 1–34.
Chun, M. M. (2011). Visual working memory as visual attention sustained internally
over time. Neuropsychologia, 49, 1407–1409.
Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration
of mental storage capacity. Behavioral & Brain Sciences, 24, 87–185.
Drew, T., Horowitz, T. S., Wolfe, J. M., & Vogel, E. K. (2011). Delineating the neural signatures of tracking spatial position and working memory during attentive
tracking. Journal of Neuroscience, 31, 659–668.
Drew, T., & Vogel, E. K. (2008). Neural measures of individual differences in selecting
and tracking multiple moving objects. Journal of Neuroscience, 28, 4183–4191.
Emrich, S. M., & Ferber, S. (2012). Competition increases binding errors in visual
working memory. Journal of Vision, 12(4), 1–16.
Fougnie, D., Suchow, J. W., & Alvarez, G. A. (2012). Variability in the quality of visual
working memory. Nature Communications, 3(1229), 1–8.
Franconeri, S. L. (2013). The nature and status of visual resources. In D. Reisberg
(Ed.), Oxford handbook of cognitive psychology (pp. 147–162). New York, NY: Oxford
University Press.
Franconeri, S. L., Alvarez, G. A., & Cavanagh, P. (2013). Flexible cognitive resources:
Competitive content maps for attention and memory. Trends in Cognitive Sciences,
17, 134–141.
Franconeri, S. L., Jonathan, S., & Scimeca, J. M. (2010). Tracking multiple objects is
limited only by object spacing, not speed, time, or capacity. Psychological Science,
21, 920–925.
Gazzaley, A., & Nobre, A. C. (2012). Top-down modulation: Bridging selective attention and working memory. Trends in Cognitive Sciences, 16, 129–135.
Gorgoraptis, N., Catalao, R. F., Bays, P. M., & Husain, M. (2011). Dynamic updating of working memory resources for visual objects. Journal of Neuroscience, 31,
8502–8511.
Holcombe, A. O., & Chen, W. Y. (2012). Exhausting attentional tracking resources
with a single fast-moving object. Cognition, 123, 218–228.
Holcombe, A. O., & Chen, W. Y. (2013). Splitting attention reduces temporal resolution from 7 Hz for tracking one object to <3 Hz when tracking three. Journal of
Vision, 13(1), 1–19.
Horowitz, T. S., & Cohen, M. A. (2010). Direction information in multiple object
tracking is limited by a graded resource. Attention, Perception, & Psychophysics, 72,
1765–1775.
Howard, C. J., & Holcombe, A. O. (2008). Tracking the changing features of multiple objects: Progressively poorer perceptual precision and progressively greater
perceptual lag. Vision Research, 48, 1164–1180.
Howard, C. J., Masom, D., & Holcombe, A. O. (2011). Position representations lag
behind targets in multiple object tracking. Vision Research, 51, 1907–1919.
Johnson, J. S., Spencer, J. P., & Schöner, G. (2008). Moving to higher ground: The
dynamic field theory and the dynamics of visual cognition. New Ideas in Psychology,
26, 227–251.
Lisman, J. E., & Idiart, M. A. (1995). Storage of 7 ± 2 short-term memories in oscillatory subcycles. Science, 267, 1512–1515.
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features
and conjunctions. Nature, 390, 279–281.
Mack, A., & Rock, I. (1998). Inattentional blindness. Cambridge, MA: MIT Press.
Pylyshyn, Z. W., & Storm, R. W. (1988). Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial Vision, 3, 179–197.
Raffone, A., & Wolters, G. (2001). A cortical mechanism for binding in visual working
memory. Journal of Cognitive Neuroscience, 13, 766–785.
Rensink, R. A., O’Regan, J. K., & Clark, J. J. (1997). To see or not to see:
The need for attention to perceive changes in scenes. Psychological Science, 8,
368–373.
Scimeca, J. M., & Franconeri, S. L. (2015). Selecting and tracking multiple objects.
Wiley Interdisciplinary Reviews: Cognitive Science. Advance online publication.
http://dx.doi.org/10.1002/wcs.1328.
Scimeca, J. M., Jonathan, S., & Franconeri, S. L. (submitted). Maintaining selection
of multiple objects. Retrieved from http://www.journalofvision.org/content/12/9/553.short.
Scott-Brown, K. C., Baker, M. R., & Orbach, H. S. (2000). Comparison blindness.
Visual Cognition, 7, 253–267.
Siegel, M., Warden, M. R., & Miller, E. K. (2009). Phase-dependent neuronal coding
of objects in short-term memory. Proceedings of the National Academy of Sciences,
U.S.A., 106, 21341–21346.
Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional
blindness for dynamic events. Perception, 28, 1059–1074.
Simons, D. J., & Levin, D. T. (1997). Change blindness. Trends in Cognitive Sciences, 1,
261–267.
Tombu, M., & Seiffert, A. E. (2011). Tracking planets and moons: Mechanisms
of object tracking revealed with a new paradigm. Attention, Perception, & Psychophysics, 73, 738–750.
Vogel, E. K., & Machizawa, M. G. (2004). Neural activity predicts individual differences in visual working memory capacity. Nature, 428, 748–751.
Vogel, E. K., Woodman, G. F., & Luck, S. J. (2001). Storage of features, conjunctions
and objects in visual working memory. Journal of Experimental Psychology: Human
Perception & Performance, 27, 92–114.
Wei, Z., Wang, X., & Wang, D. (2012). From distributed resources to limited slots in
multiple-item working memory: A spiking network model with normalization.
Journal of Neuroscience, 32, 11228–11240.
Wheeler, M. E., & Treisman, A. M. (2002). Binding in short-term visual memory. Journal of Experimental Psychology: General, 131, 48–64.
Xu, Y. (2002). Limitations in object-based feature encoding in visual short-term
memory. Journal of Experimental Psychology: Human Perception & Performance, 28,
458–468.
Xu, Y., & Chun, M. M. (2009). Selecting and perceiving multiple visual objects. Trends
in Cognitive Sciences, 13, 167–174.
Yantis, S. (1992). Multielement visual tracking: Attention and perceptual organization. Cognitive Psychology, 24, 295–340.
Zhang, W., & Luck, S. J. (2008). Discrete fixed-resolution representations in visual
working memory. Nature, 453, 233–235.

BRANDON M. LIVERENCE SHORT BIOGRAPHY
Brandon M. Liverence is currently a postdoctoral research fellow in the
Northwestern Visual Cognition Lab, and he received his PhD at Yale University working with Brian Scholl. His current projects explore the nature
of individuation and ensemble representation in visual working memory,
and connections between spatial navigation and object representation. He
is funded by a National Research Service Award from the National Eye
Institute. His work has been published in journals such as Psychological
Science and JEP: General.
STEVEN L. FRANCONERI SHORT BIOGRAPHY
Steven L. Franconeri is an associate professor of Psychology at Northwestern
University. His lab studies visual cognition, graph comprehension,
and data visualization. He completed his PhD in Experimental Psychology at
Harvard with a National Defense Science and Engineering Fellowship, and
did postdoctoral work at UBC with a Killam Fellowship. He has received
the Psychonomics Early Career Award and an NSF CAREER award, and his
work receives funding from the NSF, NIH, and the Department of Education.
His lab strives to explore fundamental questions that also have real-world
relevance, collaborating with researchers in education (e.g., graph comprehension) and computer science (e.g., comparison within information visualization). These collaborations allow basic research to impact students and
scientists, while their unsolved problems help us identify gaps in our theoretical knowledge.
RELATED ESSAYS
Spatial Attention (Psychology), Kyle R. Cave
Cultural Neuroscience: Connecting Culture, Brain, and Genes (Psychology),
Shinobu Kitayama and Sarah Huff
Models of Duality (Psychology), Anand Krishna et al.
Neural and Cognitive Plasticity (Psychology), Eduardo Mercado III
Embodied Knowledge (Psychology), Diane Pecher and René Zeelenberg
Attention and Perception (Psychology), Ronald A. Rensink
Event Processing as an Executive Enterprise (Psychology), Robbie A. Ross and
Dare A. Baldwin
The Intrinsic Dynamics of Development (Psychology), Paul van Geert and
Marijn van Dijk
How Form Constrains Function in the Human Brain (Psychology), Timothy
D. Verstynen
Speech Perception (Psychology), Athena Vouloumanos
Behavioral Heterochrony (Anthropology), Victoria Wobber and Brian Hare


Resource Limitations
in Visual Cognition
BRANDON M. LIVERENCE and STEVEN L. FRANCONERI

Abstract
Visual attention and visual working memory are two of the core resources that support visual perception. Foundational research has demonstrated that these resources
are highly limited, but an active debate concerns exactly how they are limited. While
many classic studies suggested that these resources are fundamentally discrete, with
fixed capacity of 3–4 objects maximum, a number of recent studies have argued that
these resources are fundamentally continuous, with no fixed upper-bound to the
number of objects that can be attended or remembered. This entry reviews the state
of this debate, and shows how convergence between these (often separate) areas of
research is a major emerging trend in the field of visual cognition.

INTRODUCTION
The visual system is constantly overwhelmed with information. As the
amount of input registered in early vision far outstrips the capacity of more
computationally expensive later stages of visual processing, it is impossible
to fully process and perceive everything in view at any given moment.
Additionally, because low-level visual input is frequently in flux (due to
blinks, eye movements, and physical changes in the environment), the visual
system has to solve tricky correspondence problems in order to maintain
perceptual stability. To meet these challenges, vision relies on a pair of core
resources: visual attention, which serves as a filter to ensure that only relevant
objects are fully processed, and visual working memory (VWM), which
supports perceptual stability by providing a temporary storage for recent
visual input. Unfortunately, these resources are highly limited, and there are
often limits to the number of objects that can be simultaneously attended or
actively remembered. When these limits are exceeded, dramatic failures of
visual awareness can occur (e.g., inattentional blindness and change blindness).
This entry will explore the nature of these visual resources and address the
following questions along the way: How many objects can be attended or
Emerging Trends in the Social and Behavioral Sciences. Edited by Robert Scott and Stephen Kosslyn.
© 2015 John Wiley & Sons, Inc. ISBN 978-1-118-90077-2.

1

2

EMERGING TRENDS IN THE SOCIAL AND BEHAVIORAL SCIENCES

remembered at one time? Are these resources fundamentally discrete (with
fixed precision) or continuous (with variable precision)? What explains these
limits, and are they fixed, or are there ways to increase one’s resources?
While drawing novel connections between parallel work on visual attention
and VWM, this essay will show that their convergence—both theoretical
and methodological—represents a major emerging trend in visual cognition.
FOUNDATIONAL RESEARCH ON VISUAL WORKING
MEMORY RESOURCES
VWM is a highly limited resource, as is clear from demonstrations of
“change blindness” in which observers fail to detect dramatic changes
occurring between glances of a scene (Rensink et al., 1997; Simons & Levin,
1997; cf. Scott-Brown et al., 2000, for an alternative discussion of how such
effects may reflect “comparison blindness” rather than memory limitations).
A simplified version of this change detection paradigm has become a common
way to measure VWM. In one seminal study (Luck & Vogel, 1997), observers
briefly viewed 1–12 objects that disappeared briefly and then reappeared,
with a change occurring to a single object on some trials. Observers’
performance at detecting these changes suggested that they could store
the features of only ∼4 objects per trial, confirming that VWM capacity is
quite low. Intriguingly, observers were just as good at noticing changes to
objects that had only one feature as to objects that could change along any
of four feature dimensions, which led the authors to conclude that VWM
is a fundamentally discrete resource constrained by the number of objects
stored rather than their complexity. While subsequent studies challenged the
strongest versions of this hypothesis (Wheeler & Treisman, 2002; Xu, 2002),
the basic finding of an upper limit in VWM capacity of 3–4 fairly simple
objects has been repeatedly verified (e.g., Awh et al., 2007; Vogel et al., 2001).
An influential study (Zhang & Luck, 2008) using a continuous report
paradigm provides even more powerful evidence for discrete VWM
resources. Observers briefly viewed 1–6 objects and then reported a test
object’s feature value from memory (e.g., color) by selecting it from a
continuous circular distribution (e.g., a color wheel). When the data were fit
to a mixture model with a normally-distributed component (reflecting trials
in which the probed item was noisily encoded) and a uniform component
(reflecting random guessing for unencoded items), they found that the
uniform component sharply increased from set size 3–6 but that the standard deviation of the normally-distributed component (a measure of the
precision of encoding) did not change. This suggests that once participants
had fully allocated their ∼3 fixed-capacity VWM “slots,” they failed to
encode any information from additional items and had to guess at random.

Resource Limitations in Visual Cognition

3

Precision also decreased from set size 1–3, which the authors explain in
terms of a “Slots + Averaging” model: for set sizes under 3, participants
allocate multiple slots (each containing some independent noise) to each
item allowing them to improve their performance by averaging across
multiple noisy representations. Critically, however; the Slots + Averaging
cannot account for recent findings that the typical drop in precision from
set size 1–2 is larger that would be predicted by averaging (Bays et al., 2009)
and that under some conditions there is no drop in precision in this range
whatsoever (Bae & Flombaum, 2013).
Another source of evidence for discrete VWM resource limits comes
from investigations of the contralateral delay activity (CDA), a putative
electrophysiological index of the number of items stored in VWM that
increases with memory load but reaches plateau at ∼3 objects in typical
observers (Anderson et al., 2011; Vogel & Machizawa, 2004). Notably, the
CDA also indexes set size in a multifocal attention task (as reviewed below;
Drew & Vogel, 2008; Drew et al., 2011), suggesting that it may reflect spatial
selection rather than memory, per se. Also, if the CDA fundamentally reflects
discrete memory “slots,” then following the logic of Slots + Averaging, all
slots should be used in parallel even at set sizes 1–2 to improve precision
via averaging, implying that there should not be differences in the CDA at
small versus large set sizes.
CUTTING-EDGE RESEARCH ON RESOURCE LIMITATIONS
IN VISUAL WORKING MEMORY
One of the central debates in VWM research concerns whether VWM
resources are constituted by discrete fixed-capacity “slots,” or a flexible,
continuous resource that can be variably allocated to any number of objects.
Critically, continuous resource models predict tradeoffs between complexity
and capacity, as complex objects are assumed to require more resources
to encode. Strong evidence for such effects came from a study (Alvarez &
Cavanagh, 2004) that tested VWM capacity using stimuli ranging from very
simple color patches to highly complex objects, such as Chinese characters
or multi-shade cubes. The authors used both a change-detection task to
estimate memory capacity for these different stimulus types, and a separate
speeded-search task to quantify the complexity of each stimulus type. They
observed an almost perfect linear correlation between these measures,
verifying that memory capacity was much lower for the most complex
objects (e.g., only ∼1.5 cubes could be stored per trial).
Other work has shown that it is possible to store more than four representations in VWM, albeit at low precision. Bays & Husain (2008) found that in
a spatial memory task, VWM resources could be spread among up to at least

4

EMERGING TRENDS IN THE SOCIAL AND BEHAVIORAL SCIENCES

six objects, and that at increased set sizes there was a concomitant decrease
in precision that follows a power law, consistent with a continuous resource
being spread ever more thinly. There is also evidence that different objects in
a scene can receive differing amounts of memory resources, with numerous
studies finding that some objects can be prioritized (via cueing) over others
(Bays & Husain, 2008; Gorgoraptis et al., 2011). Even in the absence of cues,
memory precision seems to naturally vary between objects and across trials,
consistent with continuous resources (Fougnie et al., 2012).
The upper threshold of ∼3–4 representations observed in classical VWM
tasks may reflect a tendency to represent a subset of items in high resolution
and a subset in low resolution, with these low resolution representations
being treated incorrectly by many models as “guesses” rather than as
low-precision “hits” (van den Berg et al., 2012), though the question of
whether participants ever truly guess at random (due to a complete failure
to encode any detail from the target) is a matter of debate (cf. Fougnie
et al., 2012). Alternatively, VWM resources may be continuous—even while
behavioral performance exhibits strict (and seemingly discrete) capacity
limits—because VWM relies on a fundamentally discrete indexing resource
(possibly attention) to link VWM representations to spatial locations (Xu &
Chun, 2009). Even if observers can store more than four representations,
they may be unable to accurately link all of these representations to items
in the test display, causing them to make accurate responses to the wrong
items (Bays et al., 2009; Emrich & Ferber, 2012).
These results are also compatible with a recent perspective that suggests
that the fundamental units of VWM are not objects, but rather hierarchical
feature bundles that encompass both object-based advantages in storing individual features and higher-order regularities (e.g., spatial and featural) that
emerge across collections of objects, and that can enhance VWM capacity via
compression and summary statistics (Brady et al., 2011). Thus, even if there is
a capacity limit to the number of objects that can stored independently with
great precision, the actual capacity of VWM may be much higher because
higher-order regularities may be encoded across all objects in the scene.
The work reviewed thus far has focused on the capacity and nature of VWM
resources. However, to truly explain why and how these resources are limited, it is critical to consider computational and neural models of VWM. For
example, multi-object working memory can be implemented via local spatial
interactions between neurons following a “Mexican Hat” formation in which
inhibition is high near the peak of each representation’s activation and falls
off with increasing distance. Such interactions help keep neighboring representations separate, but also keep overall inhibition within the network low
enough to allow multiple items to be represented in parallel. Given certain
parameterizations of these interactions, such neural models mimic typical

Resource Limitations in Visual Cognition

5

human VWM capacity limits in change-detection (Johnson et al., 2008). Relatedly, a recent, biologically plausible model (Wei et al., 2012) illustrates how
modeling representations as “activity bumps” in a continuous pool of neurons with excitatory and inhibitory properties gives rise to properties typically associated with both discrete and continuous models of VWM.
Instead of conceptualizing VWM resources as having a fixed upper bound
capacity, a similar recent proposal suggests that these resources are fundamentally unlimited, with observed limits in task performance arising from
spatial competition between representations in content maps (Franconeri,
2013; Franconeri et al., 2013). Content maps are extensions of functionallyand spatially-organized neural substrate, and representations occupy
physical locations within these neural maps. Objects that are physically
close in visual or feature space thus become represented in neighboring
regions of visual cortex, and must therefore compete for the same pool of
neural resources. Visual capacity limits are thus a byproduct of the physical
limitations of neural real estate and the competitive interactions that emerge
between representations in these maps. For example, many VWM errors can
be accounted for as spatial mismatches between sample and test items (Bays
et al., 2009; Emrich & Ferber, 2012), and decreasing similarity in integral feature dimensions (e.g., color in a brightness memory task) increases precision
in VWM (Bae & Flombaum, 2013), suggesting that representational capacity
may be much higher than spatial indexing capacity.
Alternatively, VWM resource limitations may arise due to purely temporal properties of the neural substrates of memory. According to the principle
of oscillatory multiplexing, memory representations are encoded in oscillating patterns of global brain activity. Lisman and Idiart (1995) suggest that
a working memory capacity of 7 ± 1 could be derived from the number
of high-frequency gamma (40 Hz) brain oscillations that fit within a single
low-frequency alpha–theta (5–12 Hz) oscillation, though more recent formulations (Raffone & Wolters, 2001; Siegel et al., 2009) have argued for values
closer to the classic “slots” capacity of 4 ± 1 items. A major advantage of these
theories is that they show how brains that are inherently continuous (having
pools of billions of neurons) can nonetheless behave in a way that is discrete and slot-like. The downside is that there is very little direct empirical
support for these theories thus far. Furthermore, while multiplexing theories
are typically associated with discrete resource theories, they could also be
consistent with continuous resources if variability in these oscillatory rates
is causally related to memory precision. For example, different patterns of
oscillation might result in higher capacity but lower precision because each
representation receives a smaller temporal share of memory resources.
Of course, a computational model that can simulate characteristics of
human performance via careful tweaking of free parameters is not especially

6

EMERGING TRENDS IN THE SOCIAL AND BEHAVIORAL SCIENCES

compelling in and of itself. Thus, the challenge is to link such parameters
to actual neural measures and individual differences in human behavioral
performance (e.g., the amount and shape of “inhibition” observed between
neighboring representations on some VWM task), and then use these
parameters to predict individual differences in resource capacity.
FOUNDATIONAL RESEARCH ON RESOURCE LIMITATIONS
IN VISUAL ATTENTION
When observers focus all of their attention on a challenging primary task,
even highly salient changes (e.g., the sudden appearance of a gorilla) can
go unnoticed (Mack & Rock, 1998; Simons & Chabris, 1999). This effect,
called inattentional blindness, is thought to reflect the limited capacity of
attention—when attention is “used up” by the primary task, there are insufficient resources to detect the salient changes. While such demonstrations
reveal that attentional resources are limited, other measures focus on quantifying these limitations. In multiple object tracking (MOT), observers see a
display filled with identical objects, some of which are cued as “targets.” All
objects then move independently (and often, unpredictably), and observers
must keep track of the targets’ positions throughout the movement and later
reidentify them. These studies showed that people could track at least four
objects (Pylyshyn & Storm, 1988; Yantis, 1992) suggesting an underlying
capacity limit in the ability to divide attentional resources (Pylyshyn &
Storm, 1988). This limit matched values obtained from broader literatures
on memory (Cowan, 2001), leading some to suggest that VWM and visual
attention rely (at least partially) upon a common pool of resources, with
VWM being constrained by a form of “inward-directed” attention (Chun,
2011; Gazzaley & Nobre, 2012).
CUTTING-EDGE RESEARCH ON RESOURCE LIMITATIONS
IN VISUAL ATTENTION
Mirroring the debate in the VWM literature over whether capacity reflects a
fixed number of discrete object-based “slots” or a more continuous resource,
new research on MOT explores similar divisions. Like VWM, the upper limit
on capacity does not appear to be fixed—recent studies show that there are in
fact display conditions where tracking capacity can be raised to eight objects
at once (Alvarez & Franconeri, 2007). Like VWM, there are arguments that
objects are represented not as individuals in “slots,” but that higher-order
structures (hierarchical features bundles for VWM, e.g., Brady et al., 2011)
might help compress position representations of objects. For example, there
is evidence that tracked objects might be organized by common fate or as

Resource Limitations in Visual Cognition

7

vertices of a rigid polygon (Yantis, 1992, see Scimeca & Franconeri, 2015, for
discussion).
There are also several demonstrations of performance limits that appear to
reflect continuous allocation of processing resources. For example, participants are capable of tracking up to eight objects at a time, but only when the
objects move very slowly (Alvarez & Franconeri, 2007). Relatedly, at very fast
speeds only a single object can be successfully tracked (Holcombe & Chen,
2012). These results may suggest a capacity-precision tradeoff in visual
attention, with faster moving objects demanding additional resources,
though such explanations are controversial (cf. Franconeri et al., 2010).
Given how instrumental continuous report measures have been in recent
studies of the nature of VWM resources, the development of continuous
report measures in MOT may similarly inform debate over the nature of
visual attention resources. For example, a recent study in which “targets” in
a simplified tracking task disappeared at the end of each trial and then had
to be localized from memory via mouseclicks found that with increasing
tracking load, response clicks lagged increasingly far behind each target’s
true position relative to its direction of motion (Howard & Holcombe,
2008), though these effects do not seem to generalize to more typical MOT
displays (Howard et al., 2011). Relatedly, Horowitz and Cohen (2010) asked
participants to judge the last-remembered trajectory angle of targets at set
sizes ranging from 1 to 6. When they fit these data to a two-component
mixture model (as in Zhang & Luck, 2008) to derive separate estimates of
the probability and precision of tracking, they found that angular error for
targets increased continuously up to set size 6, consistent with a continuous
resource.
There are also parallels to the proposed set of underlying mechanisms for
limits in VWM, which can be divided into spatial (cortical map limitations)
and temporal (oscillatory multiplexing) theories. Explanations of attention
resources as spatially limited are supported by work demonstrating that
target-distractor spacing influences tracking capacity. When spacing is
maximized, observers can successfully track at least six objects in parallel,
irrespective of object speed (Franconeri et al., 2010; cf. Tombu & Seiffert,
2011), suggesting that attention may be a fundamentally continuous resource
with no strict capacity limit. Critically, this proposal explains limits in tracking capacity as a consequence of competition within spatial attention maps
(e.g., in the frontal eye fields and related parietal areas) arising during
close interactions between targets and distractors, as such interactions
involve destructive interference between representations. Relatedly, speed
and spatial crowding may limit tracking performance by creating spatial
confusability between targets and distractors (Franconeri et al., 2010),
analogous to how spatial confusions seem to reduce effective VWM capacity

8

EMERGING TRENDS IN THE SOCIAL AND BEHAVIORAL SCIENCES

(Bays et al., 2009). In fast-moving displays, target objects have more frequent
close interactions with distractors than in slow-moving displays, increasing
the probability of selecting the wrong item, and thus, lowering capacity
estimates.
Spatial content maps may also be partly interactive with feature-based
content maps, such that objects that are similar along one dimension can
be separately indexed (and successfully tracked or stored) so long as they
are distinct along another dimension. This possibility is supported by
recent work showing that tracking performance improves when targets and
distractors are visually distinct during close encounters (Bae & Flombaum,
2012), suggesting that competition within one content map (e.g., spatial
position) can be avoided or alleviated via distinctiveness within another
(e.g., position in color space).
Support for a temporal basis for limitations of attention resources is
provided by recent work claiming to show tradeoffs between capacity and
the temporal precision of tracking. Holcombe and Chen (2013) show that
a single target can be tracked at temporal resolutions up to 7 Hz (i.e., 0.58
rev/s on a clock face with 12 positions), but this threshold drops to 4 Hz
for two targets and 2.6 Hz for three targets. This point is controversial, with
some researchers arguing that these data do not reveal temporal resource
limitations (Scimeca, Jonathan, & Franconeri, submitted). Also, as such
effects could also be accommodated within a discrete-resource framework
(via Slots + Averaging), a critical question is whether at even higher set sizes,
temporal precision continues to decrease gradually, or whether temporal
precision eventually bottoms out as guessing rate increases (as a discrete
model would predict).
Another open question is whether attentional resources can be allocated unevenly during tracking, as appears possible for continuous VWM
resources (Bays & Husain, 2008). While there is no solid evidence to date
for such effects, a study design in which multiple targets per trial are
probed with a continuous response (e.g., location or trajectory angle), or in
which participants are asked to respond to the “best remembered” versus a
randomly-chosen item (as in Fougnie et al., 2012), or in which some targets
are designated as “high-priority” and others as “low-priority” (perhaps
reinforced by a monetary incentive structure), would be helpful in resolving
this question.
It seems clear that attention researchers have much to gain by following
developments in the study of VWM and vice versa. Even if visual attention and VWM involve distinct (though partially interactive) resources, both
resources may be subject to similar architectural constraints. Also at stake is
the question of whether attention and memory reflect the same resource, as

Resource Limitations in Visual Cognition

9

such an account would require that attention and VWM resources are limited
in the same way—either discretely or continuously.
CONCLUSIONS
While visual attention and VWM have often been studied as separate topics,
using distinct methodologies (and, more often than not, by different groups
of researchers), it seems clear that there is much to be gained from increasing
theoretical and methodological convergence between these areas of research.
For example, we have seen how major methodological advances in VWM
research (e.g., continuous report paradigms and neural signatures such as
the CDA) can inform fundamental theories of visual attention, in particular
the questions of whether attentional resources are fundamentally discrete or
continuous, and whether attention and VWM resources are the same or
distinct.
There are still many open questions left to resolve about the nature of these
resources, but there is mounting evidence for continuous resource models in
visual attention and VWM. If these resources truly are continuous, it will be
crucial to understand why human performance sometimes appears to be discrete, and to synthesize this finding with neural measures (such as the CDA
and global oscillatory activity) that are inherently discrete. More generally,
a deeper understanding of resource limitations will require both continued
advances in behavioral and neural measures of visual attention and VWM
capacity and crosstalk between these fields. Eventually, visual cognition can
begin to move beyond questions of how these resources are limited to more
fundamental questions about what these resources are and how to maximize
them in everyday contexts.
KEY ISSUES FOR FUTURE RESEARCH
1. How much overlap is there between visual attention and VWM
resources? Do they have the same capacity? Are they both discrete or
both continuous?
2. Is there such a thing as a true “guess” in VWM? Or are all objects in the
scene always encoded with at least a minimal amount of resources?
3. Is there a true upper bound to the number of objects that can be remembered or tracked? Likewise, is there an upper bound to the resolution
with which a single object can be tracked or remembered?
4. To what extent are objects encoded independently versus hierarchically
in VWM and attention?
5. Can speed impair tracking performance independently of spacing?
6. Can visual attention resources be divided unevenly among objects, as
VWM resources seemingly can be?
7. What is the ultimate neural basis of visual resources? Can individual
differences in VWM and MOT performance be predicted based on differences in brain activity?
REFERENCES
Alvarez, G. A., & Cavanagh, P. (2004). The capacity of visual short-term memory is
set both by visual information load and by number of objects. Psychological Science,
15, 106–111.
Alvarez, G. A., & Franconeri, S. L. (2007). How many objects can you track? Evidence
for a resource-limited tracking mechanism. Journal of Vision, 7(13), 1–10.
Anderson, D. E., Vogel, E. K., & Awh, E. (2011). Precision in visual working memory reaches a stable plateau when individual item limits are exceeded. Journal of
Neuroscience, 31, 1128–1138.
Awh, E., Barton, B., & Vogel, E. K. (2007). Visual working memory represents a fixed
number of items regardless of complexity. Psychological Science, 18, 622–628.
Bae, G. Y., & Flombaum, J. I. (2012). Close encounters of the distracting kind: Identifying the cause of visual tracking errors. Attention, Perception, & Psychophysics, 74,
703–715.
Bae, G. Y., & Flombaum, J. I. (2013). Two items remembered as precisely as one: How
integral features can improve visual working memory. Psychological Science, 24,
2038–2047.
Bays, P. M., Catalao, R. F., & Husain, M. (2009). The precision of visual working memory is set by allocation of a shared resource. Journal of Vision, 9(10), 1–11.
Bays, P. M., & Husain, M. (2008). Dynamic shifts of limited working memory
resources in human vision. Science, 321, 851–854.
van den Berg, R., Shin, H., Chou, W. C., George, R., & Ma, W. J. (2012). Variability in
encoding precision accounts for visual short-term memory limitations. Proceedings
of the National Academy of Sciences, U.S.A., 109, 8780–8785.
Brady, T. F., Konkle, T., & Alvarez, G. A. (2011). A review of visual memory capacity:
Beyond individual items and toward structured representations. Journal of Vision,
11(5), 1–34.
Chun, M. M. (2011). Visual working memory as visual attention sustained internally
over time. Neuropsychologia, 49, 1407–1409.
Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration
of mental storage capacity. Behavioral & Brain Sciences, 24, 87–185.
Drew, T., Horowitz, T. S., Wolfe, J. M., & Vogel, E. K. (2011). Delineating the neural signatures of tracking spatial position and working memory during attentive
tracking. Journal of Neuroscience, 31, 659–668.
Drew, T., & Vogel, E. K. (2008). Neural measures of individual differences in selecting
and tracking multiple moving objects. Journal of Neuroscience, 28, 4183–4191.
Emrich, S. M., & Ferber, S. (2012). Competition increases binding errors in visual
working memory. Journal of Vision, 12(4), 1–16.
Fougnie, D., Suchow, J. W., & Alvarez, G. A. (2012). Variability in the quality of visual
working memory. Nature Communications, 3(1229), 1–8.
Franconeri, S. L. (2013). The nature and status of visual resources. In D. Reisberg
(Ed.), Oxford handbook of cognitive psychology (pp. 147–162). New York, NY: Oxford
University Press.
Franconeri, S. L., Alvarez, G. A., & Cavanagh, P. (2013). Flexible cognitive resources:
Competitive content maps for attention and memory. Trends in Cognitive Sciences,
17, 134–141.
Franconeri, S. L., Jonathan, S., & Scimeca, J. M. (2010). Tracking multiple objects is
limited only by object spacing, not speed, time, or capacity. Psychological Science,
21, 920–925.
Gazzaley, A., & Nobre, A. C. (2012). Top-down modulation: Bridging selective attention and working memory. Trends in Cognitive Sciences, 16, 129–135.
Gorgoraptis, N., Catalao, R. F., Bays, P. M., & Husain, M. (2011). Dynamic updating of working memory resources for visual objects. Journal of Neuroscience, 31,
8502–8511.
Holcombe, A. O., & Chen, W. Y. (2012). Exhausting attentional tracking resources
with a single fast-moving object. Cognition, 123, 218–228.
Holcombe, A. O., & Chen, W. Y. (2013). Splitting attention reduces temporal resolution from 7 Hz for tracking one object to <3 Hz when tracking three. Journal of
Vision, 13(1), 1–19.
Horowitz, T. S., & Cohen, M. A. (2010). Direction information in multiple object
tracking is limited by a graded resource. Attention, Perception, & Psychophysics, 72,
1765–1775.
Howard, C. J., & Holcombe, A. O. (2008). Tracking the changing features of multiple objects: Progressively poorer perceptual precision and progressively greater
perceptual lag. Vision Research, 48, 1164–1180.
Howard, C. J., Masom, D., & Holcombe, A. O. (2011). Position representations lag
behind targets in multiple object tracking. Vision Research, 51, 1907–1919.
Johnson, J. S., Spencer, J. P., & Schöner, G. (2008). Moving to higher ground: The
dynamic field theory and the dynamics of visual cognition. New Ideas in Psychology,
26, 227–251.
Lisman, J. E., & Idiart, M. A. (1995). Storage of 7 ± 2 short-term memories in oscillatory subcycles. Science, 267, 1512–1515.
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features
and conjunctions. Nature, 390, 279–281.
Mack, A., & Rock, I. (1998). Inattentional blindness. Cambridge, MA: MIT Press.
Pylyshyn, Z. W., & Storm, R. W. (1988). Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial Vision, 3, 179–197.
Raffone, A., & Wolters, G. (2001). A cortical mechanism for binding in visual working
memory. Journal of Cognitive Neuroscience, 13, 766–785.
Rensink, R. A., O’Regan, J. K., & Clark, J. J. (1997). To see or not to see:
The need for attention to perceive changes in scenes. Psychological Science, 8,
368–373.
Scimeca, J. M., & Franconeri, S. L. (2015). Selecting and tracking multiple objects.
Wiley Interdisciplinary Reviews: Cognitive Science. Advance online publication.
http://dx.doi.org/10.1002/wcs.1328.
Scimeca, J. M., Jonathan, S., & Franconeri, S. L. (submitted). Maintaining selection
of multiple objects. Retrieved from http://www.journalofvision.org/content/12/9/553.short.
Scott-Brown, K. C., Baker, M. R., & Orbach, H. S. (2000). Comparison blindness.
Visual Cognition, 7, 253–267.
Siegel, M., Warden, M. R., & Miller, E. K. (2009). Phase-dependent neuronal coding
of objects in short-term memory. Proceedings of the National Academy of Sciences,
U.S.A., 106, 21341–21346.
Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional
blindness for dynamic events. Perception, 28, 1059–1074.
Simons, D. J., & Levin, D. T. (1997). Change blindness. Trends in Cognitive Sciences, 1,
261–267.
Tombu, M., & Seiffert, A. E. (2011). Tracking planets and moons: Mechanisms
of object tracking revealed with a new paradigm. Attention, Perception, & Psychophysics, 73, 738–750.
Vogel, E. K., & Machizawa, M. G. (2004). Neural activity predicts individual differences in visual working memory capacity. Nature, 428, 748–751.
Vogel, E. K., Woodman, G. F., & Luck, S. J. (2001). Storage of features, conjunctions
and objects in visual working memory. Journal of Experimental Psychology: Human
Perception & Performance, 27, 92–114.
Wei, Z., Wang, X., & Wang, D. (2012). From distributed resources to limited slots in
multiple-item working memory: A spiking network model with normalization.
Journal of Neuroscience, 32, 11228–11240.
Wheeler, M. E., & Treisman, A. M. (2002). Binding in short-term visual memory. Journal of Experimental Psychology: General, 131, 48–64.
Xu, Y. (2002). Limitations in object-based feature encoding in visual short-term
memory. Journal of Experimental Psychology: Human Perception & Performance, 28,
458–468.
Xu, Y., & Chun, M. M. (2009). Selecting and perceiving multiple visual objects. Trends
in Cognitive Sciences, 13, 167–174.
Yantis, S. (1992). Multielement visual tracking: Attention and perceptual organization. Cognitive Psychology, 24, 295–340.
Zhang, W., & Luck, S. J. (2008). Discrete fixed-resolution representations in visual
working memory. Nature, 453, 233–235.

BRANDON M. LIVERENCE SHORT BIOGRAPHY
Brandon M. Liverence is currently a postdoctoral research fellow in the
Northwestern Visual Cognition Lab, and he received his PhD at Yale University working with Brian Scholl. His current projects explore the nature
of individuation and ensemble representation in visual working memory,
and connections between spatial navigation and object representation. He
is funded by a National Research Service Award from the National Eye
Institute. His work has been published in journals such as Psychological
Science and JEP: General.
STEVEN L. FRANCONERI SHORT BIOGRAPHY
Steven L. Franconeri is an associate professor of Psychology at Northwestern
University. His lab studies visual cognition, graph comprehension,
and data visualization. He completed his PhD in Experimental Psychology at
Harvard with a National Defense Science and Engineering Fellowship, and
did postdoctoral work at UBC with a Killam Fellowship. He has received
the Psychonomics Early Career Award and an NSF CAREER award, and his
work receives funding from the NSF, NIH, and the Department of Education.
His lab strives to explore fundamental questions that also have real-world
relevance, collaborating with researchers in education (e.g., graph comprehension) and computer science (e.g., comparison within information visualization). These collaborations allow basic research to impact students and
scientists, while their unsolved problems help us identify gaps in our theoretical knowledge.
RELATED ESSAYS
Spatial Attention (Psychology), Kyle R. Cave
Cultural Neuroscience: Connecting Culture, Brain, and Genes (Psychology),
Shinobu Kitayama and Sarah Huff
Models of Duality (Psychology), Anand Krishna et al.
Neural and Cognitive Plasticity (Psychology), Eduardo Mercado III
Embodied Knowledge (Psychology), Diane Pecher and René Zeelenberg
Attention and Perception (Psychology), Ronald A. Rensink
Event Processing as an Executive Enterprise (Psychology), Robbie A. Ross and
Dare A. Baldwin
The Intrinsic Dynamics of Development (Psychology), Paul van Geert and
Marijn van Dijk
How Form Constrains Function in the Human Brain (Psychology), Timothy
D. Verstynen
Speech Perception (Psychology), Athena Vouloumanos
Behavioral Heterochrony (Anthropology), Victoria Wobber and Brian Hare