Attention and Perception
RONALD A. RENSINK

Abstract
This essay discusses several key issues concerning the study of attention and its relation to visual perception, with an emphasis on behavioral and experiential aspects.
It begins with an overview of several classical works carried out in the latter half
of the twentieth century, such as the development of early filter and spotlight models of attention. This is followed by a survey of subsequent research that extended
or modified these results in significant ways. It includes work on various forms of
induced blindness and on the capabilities of nonattentional processes. It also covers
proposals about how a “just-in-time” allocation of attention can create the impression that we see our surroundings in coherent detail everywhere, as well as how the
failure of such allocation can result in various perceptual deficits. The final section
examines issues that have not received much consideration to date, but that may be
important for new lines of research in the near future. These include the prospects
for a better characterization of attention, the possibility of more systematic computational explanations, factors that may significantly modulate attentional operation,
and the possibility of several kinds of visual attention and visual experience.

INTRODUCTION
Whenever we open our eyes, we experience an ever-changing world of colors, shapes, and movements. This experience is so vivid and so compelling
that we rarely stop to consider whether the underlying mechanisms may
have limitations. Instead, we simply have a strong impression that we always
perceive everything in front of us. Although we may need to scrutinize something on occasion, for the most part our visual system appears to operate in
an automatic and seamless way, providing us with a complete and detailed
representation of whatever is in our field of view.
But however appealing it may be, this impression cannot be correct.
Suppose someone wants to keep track of various players in a sports game. A
single player can usually be tracked without problem. Three or four can also
be tracked, although with some effort. But as the number increases further,
simultaneous tracking of all the selected players becomes impossible.
Performance evidently depends on a factor which enables certain kinds of
perception to occur, but which has a clear limit to its capacity. This factor is
generally referred to as attention.

Emerging Trends in the Social and Behavioral Sciences. Edited by Robert Scott and Stephen Kosslyn.
© 2015 John Wiley & Sons, Inc. ISBN 978-1-118-90077-2.

Work on human vision is providing increasing evidence that visual perception is the result of several interacting processes, most of which are quite
sophisticated, and many of which have definite limits to their abilities. And
rather than the outputs of these processes accumulating in a detailed construction, much of our perception results instead from the coordination of these
processes. In particular, much of our visual experience appears to depend on
managing attention so that it is sent to the right item at the right time. As
such, attention is more than something that simply modifies or assists our
perception on occasion—it is instead a factor central to our awareness of the
world around us.
FOUNDATIONAL RESEARCH
It was recognized long ago that we need to pay attention to adequately perceive our surroundings. But only recently have we obtained a better understanding of what attention is and how it relates to perception. Building upon
the proposals of philosophers of the seventeenth and eighteenth centuries,
researchers in the nineteenth century began to map out several of its main
characteristics. For example, Hermann von Helmholtz discovered that an
observer could attend to (in the sense of recognizing) letters at locations outside of where the eyes were aimed (or “fixated”), showing that attention is not
equivalent to eye fixation. Meanwhile, William James distinguished “sensorial” from “intellectual” attention—the former concerned with concrete objects
such as particular sports players, the latter with more abstract structures such
as the quality of the game. James also associated sensorial attention—in particular, visual attention, the focus of this review—with clarity of perception,
intensity of perception, and visual memory. Many of these concerns became
an enduring backdrop for subsequent work.
FILTER MODELS
A more rigorous approach to understanding attention was developed during
the middle decades of the twentieth century, when researchers began focusing more on its selective aspects, and—in line with the “cognitive revolution”
of that time—replaced the original emphasis on subjective experience with
an emphasis on objective models. Donald Broadbent proposed an influential “filter” model, in which perception was carried out via a sequence of
processes in a single pathway, with an attentional filter that gated selected
aspects of a stimulus through to later processes. An important issue was the
locus of this filter: whether it was early (selection affecting the initial stages,
which measured simple properties such as color and motion) or late (selection appearing only at the highest levels, gating properties such as semantic
category).
The work undertaken to settle this issue resulted in a great deal of information about the ways various operations were affected by attention. However,
the complete resolution of this issue eluded researchers, and continues to do
so to this day. This strongly suggests that some of the original assumptions
were incorrect: there may be, for example, more than one filter in the pathway
(not to mention more than one pathway), making questions concerning a single filter somewhat ill-founded. To get further insights, a different approach
was needed.
SPOTLIGHT MODELS
Despite the failure to determine whether selection was early or late, investigations into this issue resulted in a variety of new methodologies and new
frameworks. Over time, concerns about the nature of filters receded, and
were replaced by an emphasis on how attention affected the representations
themselves.
An example of such a methodology is visual search, where observers are
asked to report on a prespecified target item in a visual display. It was found,
for example, that some items can be detected immediately and without much
attention (e.g., a blue dot among a set of yellow dots), whereas others cannot
(e.g., a “T” among a set of “L”s). Among the more prominent frameworks to
account for such findings was Anne Treisman’s feature integration theory. This
framework modeled visual processing in terms of two stages. The first is a
preattentive stage that determines simple properties (features) such as color
or motion rapidly and in parallel at each point of the visual field, resulting
in a “map” describing the spatial distribution of each feature. The second
involves a limited-capacity “spotlight” of attention that travels from item to
item at a rate of about 50 ms per item, not only filtering but also binding features that correspond to the same item (e.g., integrating the representations of
the “blue” and “vertical” properties at a location into a single representation
of both). Later refinements included the guided search model of Jeremy Wolfe
and colleagues, in which items in a feature map could be selectively inhibited
or excited to improve the efficiency of search. Other variants examined issues
such as the extent to which attention might be allocated in parallel rather than
in a serial fashion. All these models had natural connections to other areas
of vision science: for example, the features found in visual search could be
related in a fairly straightforward way to many of the elements underlying
texture perception.
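The contrast between the two stages can be made concrete with a toy timing model. The sketch below is only an illustration, not Treisman's own formulation. It assumes that preattentive detection takes a fixed time regardless of display size, and that the attentional spotlight inspects items one at a time and stops as soon as it finds the target (a serial, self-terminating search). The 50 ms per item figure comes from the text; the function name and the 400 ms baseline are hypothetical.

```python
def expected_search_time_ms(set_size, target_present=True, preattentive=False,
                            ms_per_item=50.0, base_ms=400.0):
    """Toy estimate of mean search time for a display of `set_size` items.

    base_ms is a hypothetical constant for everything other than the search
    itself (e.g., making the response); it is not a value from the text.
    """
    if set_size < 1:
        raise ValueError("set_size must be at least 1")
    if preattentive:
        # Feature maps are computed in parallel, so display size has no effect.
        return base_ms
    if target_present:
        # On average the spotlight reaches the target halfway through the items.
        items_inspected = (set_size + 1) / 2
    else:
        # Every item must be inspected before the target can be declared absent.
        items_inspected = set_size
    return base_ms + ms_per_item * items_inspected


for n in (1, 9, 17):
    print(n, expected_search_time_ms(n, preattentive=True), expected_search_time_ms(n))
```

Under these assumptions, search time is flat for a target defined by a single feature but grows linearly with display size when attention must visit items in turn.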
Other approaches yielded similar results. Michael Posner and colleagues
did seminal work on cuing, showing that if a cue (such as a dot) was shown
at the location of a target just before the target appeared, detection could be
sped up by several hundred milliseconds. This speedup diminishes as the
separation between target and cue increases, something readily accounted
for by a model in which the edges of the spotlight are smooth. Meanwhile,
Charles Eriksen and colleagues showed that a spotlight mechanism could
also account for the ability of nearby items (or “flankers”) to interfere with
detection; results also suggested that only one spotlight operates at any time,
and that it can rapidly adjust its size, “zooming” in or out as required by
the task. Owing to its ability to account for a variety of effects, therefore, the
spotlight model has become the “classical” explanation of visual attention,
forming the basis of much of our current understanding of how it operates.
MULTIPLE-OBJECT TRACKING
A rather different approach to studying attention was developed by Zenon
Pylyshyn and colleagues, based on multiple-object tracking. Here, a set of identical items—dots on a screen, say—is initially displayed. A subset of these
is marked (e.g., some of the dots flash), and the marked items are then tracked
as they randomly move around the display. The ability to track is severely
limited: Under most conditions, no more than three or four can be handled.
The extent to which multiple-object tracking can be explained by a spotlight
mechanism remains unclear. However, there is considerable—although not
universal—belief that this tracking does involve a form of attention, if only
because of the limited capacity found.
UNDERLYING MECHANISMS
One of the more successful quantitative models of attentional filtering and
binding was the Theory of Visual Attention of Claus Bundesen, which could
account for a considerable variety of experimental data. It was also compatible with later suggestions that filtering and binding could be implemented
via neural assemblies that inhibit their neighbors when activated. Another
(possibly complementary) proposal about implementation was neural synchrony, which posited that an attended item could be represented by the synchronized firing of a group of neurons. More generally, many of the classical
results could be explained by models based on the dynamics of neural interactions, along with the selective routing of information from various areas of
the brain.
In parallel with this, other work focused on understanding attentional control. Michael Posner suggested that the movement of attention involved three
distinct components: (i) the disengagement of attention from the current item
being attended, (ii) the shifting of its location (e.g., the center of the spotlight)
over space, and (iii) the reengagement of attention on a new item. Among other
things, this model successfully accounted for several perceptual problems
encountered in developmental disorders and degenerative diseases. Subsequent work placed an increased emphasis on the extent to which control was
affected by properties of the image—for example, the extent to which the size
or color of an item differed from that of its neighbors.
CUTTING-EDGE RESEARCH
The late twentieth and early twenty-first centuries saw the development of
several new research directions. Some were direct continuations of classical
work, and led to further refinement of earlier results. But others involved new
perspectives, and sometimes caused a reconsideration of previous assumptions. Although these investigations have not yet resulted in a coherent,
generally accepted account of attention, they have provided a better understanding of its operation, including how it relates to other mechanisms
involved in visual perception, and how its limitations can intrude into
everyday life.
INDUCED BLINDNESS
Much recent work has returned to the issue of how attention relates to conscious visual experience, in particular the way that an absence of attention
can cause a failure to see an item in clear view of the observer. One example
is inattentional blindness, where an observer fails to see an unexpected object
or event, even when these are large and quite visible. This has been taken
to indicate that attention is needed to see an object or event. There is some
uncertainty as to the extent of its implications at the theoretical level: Does
the observer fail to see all aspects of the object, or do they still see its basic
features but are blind to its structure or meaning? Either way, inattentional
blindness is increasingly recognized as being important at the practical level.
For example, many traffic accidents are likely due to a driver failing to see a
pedestrian (or another car) because attention was focused on something else.
A variant of this is continuous flash suppression. Here, a set of random images
is continually flashed into one eye at a rate of about 10 Hz, suppressing
the experience of the image shown to the other eye. This can be sustained
for several minutes. Various explanations have been put forward for this
phenomenon. The predominant hypothesis is that it occurs because attention cannot be sent to the suppressed image, and that no other effects are
responsible—that is, that continuous flash suppression is a form of inattentional blindness. If so, it could be a powerful way to study the extent to which
perception can occur in the absence of conscious visual experience.
Another phenomenon that has received a great deal of interest is change
blindness. Here, the observer fails to notice a change that occurs in an object,
even if the change is large and can easily be seen once the observer knows
what it is. This phenomenon strongly suggests that attention is needed to see
change. It appears that attention engages visual short-term memory (vSTM)
to create a representation that is coherent—that is, is integrated over some
extent of space and has continuity over some duration of time. The number
of items that can be monitored simultaneously for change is about three or
four, a limit similar to the capacity of vSTM. Unlike inattentional blindness,
change blindness can occur even when a change is expected. This can lead
to severe problems in everyday life, in that people can miss even a large,
obvious event if they are not attending to it the moment it occurs.
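The capacity limit has a simple quantitative consequence. As a rough sketch, assume that exactly one item in the display changes, that every item is equally likely to be the one that changes, and that a change is noticed only if the changed item is among those currently being monitored. The function name and these assumptions are illustrative; the default capacity of four is the upper end of the range given above.

```python
def change_detection_probability(n_items, capacity=4):
    """Chance of noticing a single change when only `capacity` items are monitored."""
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    if capacity < 0:
        raise ValueError("capacity cannot be negative")
    # If everything fits within capacity, the changed item is always monitored;
    # otherwise it is monitored with probability capacity / n_items.
    return min(1.0, capacity / n_items)


for n in (2, 4, 8, 16):
    print(n, change_detection_probability(n))
```

Under these assumptions, detection is reliable for displays of up to four items and falls off in proportion to display size beyond that.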
Other types of induced blindness are also of interest. One of these is the
attentional blink. This occurs when two different (prespecified) targets in a
stream of rapidly presented stimuli appear at slightly different times; under
some conditions, the first target will be seen but not the second. This has been
explained in terms of attention not being allocated to the second item in time,
possibly because the representation for the first has not yet been completed.
A related phenomenon is repetition blindness, where the observer can miss the
occurrence of a repeated item in a stream of rapidly presented images. This
is likewise believed to be due to the failure of attention to create a
representation of the repeated item quickly enough.
NONATTENTIONAL PROCESSING
The earliest stages of visual processing are generally thought to be concerned with simple properties such as color, motion, and orientation. It was
originally assumed that attention acted directly on such properties: that
they were the preattentive features uncovered in visual search. But later
work showed that search can be influenced by relatively complex localized
structures—proto-objects—created by processes acting before attention. These
processes can group line segments, bind features, interpret dark regions
as shadows, and perhaps even recover three-dimensional orientation at
each location in the image, essentially creating a “quick and dirty” map of
scene structure. The strength of cuing and speed of search can be similarly
influenced by the inferred structure of the background, being enhanced
for items on the same surface and diminished for items on different ones.
All these results point to a considerable amount of processing that occurs
rapidly (and likely in parallel across the visual field), before attention has
had much of a chance to operate.
Recent work has also shown that observers can accurately estimate summary statistics, such as the average size of the disks in an image, even if this
image is presented for only 100 ms or so; more sophisticated properties
(e.g., Pearson correlation) can also be estimated this way. Observers can even
determine the appropriate category (or gist) of a scene under such conditions,
possibly based on these statistics. In all of these, there is no time to filter or
bind more than a few items, suggesting the existence of processes that operate
before—or perhaps in tandem with—visual attention.
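For readers less familiar with the statistics mentioned here, the sketch below computes the two examples named in the text (average size and Pearson correlation) for a small made-up display. It shows only what the quantities are, not how the visual system might estimate them; the variable names and data values are hypothetical.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mx, my = mean(xs), mean(ys)
    covariation = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sum((x - mx) ** 2 for x in xs)
    spread_y = sum((y - my) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical display: four disks, each with a diameter and a vertical position.
disk_diameters = [1.0, 2.0, 3.0, 4.0]
disk_heights = [2.0, 4.0, 6.0, 8.0]

print(mean(disk_diameters))
print(pearson_correlation(disk_diameters, disk_heights))
```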
The “intelligence” of such nonattentional processes is an open issue.
Observers show little inattentional blindness to words and pictures with a
strong emotional impact (e.g., the observer’s name), indicating that some
degree of recognition exists before attention is sent to the item. In general,
then, all these results imply that nonattentional processes are capable of
more than previously believed. And attention may correspondingly do
less: although attention can be used on occasion to bind visual features, for
example, it may not be necessary for all aspects of binding.
CONNECTIONS WITH SCENE PERCEPTION
Phenomena such as inattentional blindness and change blindness suggest
that attention is necessary for visual experience. And most studies concur
that attention is severely limited. Why then do we not experience such
limits when viewing a scene? One possibility is that attention can create a
representation—a visual object—possessing detail and coherence, but only
as long as attention is maintained. If this can be done on a “just-in-time”
basis—that is, attention is sent to the right item at the right time—the
result would be a virtual representation that would appear to higher level
processes as if it were “real,” that is, as if it contained detailed and coherent
representations everywhere. An important goal of current work is therefore
to understand the nature of the mechanisms underlying such coordination.
One suggestion begins with nonattentional processes providing a
constantly regenerating array of proto-objects, which represent simple
properties of the scene that are visible to the observer. Attention can select a
subset of these, “knitting” them into a coherent visual object. In tandem with
this, the statistics of the (unattended) proto-object array could determine
gist; this could help access high-level knowledge about the scene, and so
guide attention to appropriate parts of the image. In this characterization,
then, scene representations are no longer long-lasting structures built up
from eye movements and attentional shifts, but are relatively temporary
structures that guide such activities. Among other things, this implies that
different observers—with different knowledge, different goals, and therefore
different attentional strategies—can literally see the same scene differently.
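One way to picture a virtual representation is by analogy with on-demand (lazy) computation in software. The sketch below is an analogy only, not a model proposed in the literature: all class and method names are hypothetical, and holding a single visual object at a time is a deliberate simplification. Proto-objects are always available, but a coherent object exists only for the item that is currently attended.

```python
class VirtualScene:
    """Toy analogy: detailed structure exists only for the attended item."""

    def __init__(self, proto_objects):
        # Mapping from item name to its simple (nonattentional) properties.
        self.proto_objects = dict(proto_objects)
        # The single coherent visual object currently held, if any.
        self.visual_object = None

    def attend(self, name):
        """Shift attention to `name`, building its coherent object on demand."""
        properties = self.proto_objects[name]  # raises KeyError for unknown items
        # Coherence lasts only while attention is maintained, so the previously
        # attended object is replaced rather than kept.
        self.visual_object = {**properties, "name": name, "coherent": True}
        return self.visual_object

    def detailed_items(self):
        """Names of items that currently have a detailed representation."""
        return [] if self.visual_object is None else [self.visual_object["name"]]


scene = VirtualScene({"cup": {"color": "blue"}, "book": {"color": "red"}})
for item in ("cup", "book"):
    print(scene.attend(item)["color"], scene.detailed_items())
```

Any query about an item succeeds, because attention arrives just in time to answer it, even though at most one item is ever represented in detail.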
CONNECTIONS WITH PERCEPTUAL DEFICITS
Given that attention is needed for visual experience, problems with its allocation may explain various perceptual deficits. In unilateral neglect, for example,
patients with damage to the right posterior parietal cortex (at the top and
back of the head) can fail to visually experience whatever is in the left half of
the visual field, even if this is directly in front of them. (Oddly, a corresponding deficit does not result from damage to the left side.) A related condition
is extinction, where such a failure also occurs, but only when an object exists
in the right half of the visual field. Such deficits may result from problems in
shifting attention to the relevant location (or at least, keeping it there), possibly because of damage to the parietal circuits that control it. Interestingly,
words and pictures in the neglected—and presumably unattended—part of
the visual field can still affect the observer, consistent with the proposal of
intelligent nonattentional processes.
Another condition likely related to these is simultanagnosia. Patients with
this deficit cannot see more than one coherent object (or coherent part of an
object) at a time; the rest of the scene is experienced only in a fragmented
way, or not experienced at all. This has been associated with damage to the
parieto-occipital areas (at the upper part of the back of the head), which may
cause problems in allocating attention to particular objects.

KEY ISSUES FOR FUTURE RESEARCH
Most issues in attention research—both classical and subsequent—are still far
from being resolved. For example, what is the relation between attention and
vSTM? How many nonattentional processes exist, and how intelligent is each?
How exactly do the knowledge and goals of the observer determine how
attention is allocated? The answers to all of these are necessary for a complete
understanding of attention. Finding them will take many more years of work.
Meanwhile, other issues are also beginning to emerge. Part of the reason
they have not received much consideration to date is sociological: Given the
work still to be done on current issues, little incentive exists to embark upon
riskier ventures elsewhere. Part is methodological: It is not clear how some
of these issues could be addressed in a productive way. And part is simple
ignorance: We did not know enough until recently to realize that some of
these issues even existed. But whatever the reason for their previous obscurity, many of these issues are becoming increasingly prominent, and may well
form a critical part of future research.
CHARACTERIZATION
One of the most basic (and oldest) issues in the study of attention concerns its
nature: What exactly is it? Over the years, attention has been characterized in
various ways, such as the quality of visual experience, or a limited “resource”
that enables particular operations to be carried out. But the greatest increase
in our understanding seems to have been achieved by focusing on the idea
of selection. Could this idea be developed further, ideally in a way consistent
with most of the other characterizations that have been applied?
One possibility would be to define an attentional process as one that is
contingently selective, with that selectivity controlled via global considerations
(e.g., tracking a particular person of interest). From this perspective, “attention” is more an adjective than a noun. Any globally controlled process
of limited capacity—such as binding visual features, or placing them into
vSTM—would be “attentional,” because limited capacity implies selectivity
of one form or other. This would also be the case for any process that
selectively improves the quality of visual experience, provided only that this
is done on the basis of some global consideration (e.g., not done reflexively).
COMPUTATIONAL EXPLANATION
Even if attention could be described in terms of a particular function or mechanism, our understanding of it would be incomplete: We might know how it
operates, but not why. For example, if some capacity were limited to three
items, why should this be? Why not four? Why not one? Of course, such a
limit may simply be an accident of history. But it may also reflect the influence
of deeper principles.
One possible way of investigating this is to apply the computational framework of David Marr. This framework posits that any (visual) process can be
analyzed from three interlocking perspectives: (i) function (both description
and justification), (ii) mechanism (algorithm and representation), and (iii) neural implementation. Such explanations have led to deep insights into the nature
of processes at early levels of human vision, and have helped develop their
equivalents in machine vision. A few studies, such as those of John Tsotsos,
have begun applying this approach to attention as well. Such analyses could
eventually provide considerable insights into the nature of attention and the
exact role it plays in perception.
MODULATORY FACTORS
It is often assumed that attention is governed entirely by the demands of the
task and the knowledge of the observer. However, evidence is emerging that
other factors also play an important role:
Stress. Stress can cause tunneling, where the observer loses awareness of
anything beyond the center of the visual field. It can also speed up visual
search for simple features (e.g., a particular orientation, such as “vertical”), although apparently not for their combination (e.g., “blue” and
“vertical”). Such effects suggest that stress causes attention to improve
its selectivity by reducing the range of the properties allowed through.
However, it may be that such improvement is obtained at the cost of a
slower switching of the underlying mechanisms.
Aging. Another important perspective is how attention changes over the
lifespan. Different aspects of attention appear to be differently
affected: Filtering and binding appear to be largely unaffected, while
top-down control (e.g., disregard of irrelevant stimuli, switching
speed) deteriorates noticeably with age. More investigation would be
of great practical importance, and could provide new perspectives on
underlying mechanisms.
Cultural/Visual Environment. Recent work suggests that observers from
Western countries (e.g., the United States) generally attend to individual objects in a scene, whereas observers from East Asian countries
(e.g., Japan) generally attend to the scene as a whole. Western observers
show a search asymmetry: They can detect a long line among short
lines more quickly than vice versa. Meanwhile, East Asian observers
are equally slow for both. Preliminary work suggests that some of
these differences disappear when significant time is spent in the other
culture. If these results hold, they would indicate a strong effect of
culture—or at least, visual environment—on the way attention is used.
Interesting issues would then arise as to which (visual) characteristics
are relevant, and why.
Mental Set. Attentional control—including the speed of visual search—can
be influenced by explicit instruction to the observer. Such results suggest
that an observer may have available several processing modes, each corresponding to a particular “mental set.” (Some of these may account for
the cultural differences mentioned above.) If so, interesting questions
arise as to the nature of these modes, and the conditions that trigger
them.
KINDS OF ATTENTION
Another important issue is whether there exists one kind of attention or several. Occasional conflicts have occurred in claims regarding the speed, sensitivity, and even function of attention; it is not even clear to what extent it
travels along perceptual structures or “raw space.” The existence of multiple
kinds of attention could help resolve some of these issues. It would also create new ones, such as determining the taxonomy that would best describe
these kinds, and establishing the various ways in which a process could be
“preattentive” or “nonattentional.”
On the basis of function, speed, and structures operated upon, several
groupings of attentional processes can be delineated. An important question
is the extent to which these groupings correspond to distinct aspects—or
even kinds—of attention (or, perhaps, more precisely, attentional processing):
Attentional Sampling. This is the selective pickup of information by the eye.
The eye has high acuity and color perception only in the few degrees
around the point of fixation. It must therefore—together with the head
and body—move around to pick up the right information from the environment. Sampling has traditionally been referred to as overt attention.
It has long been known to differ from operations carried out internally,
which are often collectively referred to as covert attention.
Attentional Filtering (Gating). Irrelevant information can degrade performance, and must be removed as soon as possible. Ways of doing so
include spatial filtering (selection only from a particular region of space)
and feature filtering (selection of items containing a particular feature);
these are largely the focus of classical approaches. Selection can be diffuse
(over a wide range) or focused (over a restricted range). It appears that
the mechanisms involved can be switched quickly (typically, within 50
ms) and operate on the basis of simple properties, such as color, motion,
or spatial position.
Attentional Binding. This is the selective linking of properties so as to capture the structure of the world at any given moment. This can be done
in various ways, such as feature binding (e.g., linking the color and orientation of an item) and position binding (e.g., linking an item to a precise
position in space). Binding differs from filtering, being concerned not
with access, but construction. The mechanisms involved also appear to
differ, being slower (completing within about 150 ms) and involving
organized structures rather than simple properties.
Attentional Holding. When a physical object changes over time (e.g., a
bird takes flight), it is useful to perceive an underlying structure that
remains the same. The associated representation must be “held” across
time, likely via vSTM; such “holding” therefore differs from binding.
The mechanisms involved also appear to differ, being even slower
(completing within about 300 ms) and operating on no more than three
to four items at a time.
Attentional Individuating. It is often useful to perceive not just an object,
but a particular object (e.g., when determining if one item is to the left
of another). Such “individuating” (or “indexing”) may also be the basis
of tracking. The mechanisms involved can act quickly (about 50 ms per
item) and involve up to seven to eight structures at a time.
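The groupings above can be collected into a small table for comparison. The sketch below records only the approximate time scales and upper capacities stated in the list; the field names are hypothetical, and treating the stated times as hard thresholds in `processes_within` is a simplifying assumption rather than a claim from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttentionalProcess:
    name: str
    role: str
    time_ms: Optional[int]     # approximate time scale stated in the text, if any
    time_is_per_item: bool     # True if the time scale applies to each item
    max_items: Optional[int]   # upper capacity stated in the text, if any


PROCESSES = [
    AttentionalProcess("sampling", "selective pickup of information by the eye", None, False, None),
    AttentionalProcess("filtering", "removing irrelevant information", 50, False, None),
    AttentionalProcess("binding", "linking properties into structure", 150, False, None),
    AttentionalProcess("holding", "maintaining a representation across time", 300, False, 4),
    AttentionalProcess("individuating", "picking out particular objects", 50, True, 8),
]


def processes_within(duration_ms, n_items=1):
    """Names of processes whose stated time scale fits within `duration_ms`."""
    names = []
    for process in PROCESSES:
        if process.time_ms is None:
            continue  # the text gives no time scale for this process
        needed = process.time_ms * n_items if process.time_is_per_item else process.time_ms
        if needed <= duration_ms:
            names.append(process.name)
    return names


print(processes_within(150))
print(processes_within(100, n_items=3))
```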
KINDS OF VISUAL EXPERIENCE
A parallel set of concerns involves conscious visual experience. As in the
case of attention, it has been widely assumed that there exists only one kind
of visual experience. But just as color and motion are distinct aspects—or
even kinds—of experience concerned with distinct physical properties of the
world, so might there be other kinds of experience concerned with distinct
structural properties:
Fragmented Experience. This is the experience of simple features with little
structure and poor localization; in some ways, it is what is experienced
when viewing an Impressionist painting. It can be encountered in
brief displays, where the experience is one of a fleeting array of simple
colors and shapes with relatively little structure. This has sometimes been termed background consciousness—the experience of the
background when attention (binding) is focused on foreground objects.
Assembled Experience. This is the experience of unstructured properties
(fragmented experience) along with a degree of superimposed static
structure. It can be encountered in displays presented for at least 150
ms, the time needed for binding; it is essentially what is experienced
under stroboscopic conditions. Although no new sensory (physical)
properties are present, more complex kinds of structure are. Among
other things, this distinction allows two kinds of inattentional blindness
to be distinguished: Type 1, the absence of fragmented experience (i.e.,
the absence of sensory qualities, perhaps caused by an absence of
attentional gating), and Type 2, the absence of assembled experience,
with simple sensory qualities still present but no higher level structure
(perhaps caused by an absence of attentional binding).
Coherent Experience. This is the “standard” experience encountered when
giving complete attention to a physical object: Not only is the static
structure of assembled experience present but also movement—or
more generally, change—along with the impression of an underlying
substrate that persists over time. The absence of coherent experience
(change blindness) might be regarded as Type 3 inattentional blindness,
caused by an absence of attentional holding.
Sensing. Observers in change detection experiments occasionally report
that they “sense” or “feel” a change without having any visual experience of it. The status of this “sensing” is controversial. It has been
suggested that it is simply a “weakened” form of seeing (i.e., coherent
experience). However, it differs qualitatively from the other kinds of
visual experience, and appears to involve different mechanisms as well.
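The layering described above can be sketched in code. This is only an illustrative reading, not a model proposed in the essay: it assumes that the three structured kinds of experience are strictly nested, and that each depends on exactly one attentional operation (gating, binding, holding), whereas the text offers these causes only as possibilities. The names Experience, richest_experience, and blindness_type are hypothetical. Sensing is left out because it is described as qualitatively different from the other kinds.

```python
from enum import Enum
from typing import Optional


class Experience(Enum):
    NONE = 0        # no sensory qualities experienced
    FRAGMENTED = 1  # simple features, little structure
    ASSEMBLED = 2   # features plus superimposed static structure
    COHERENT = 3    # static structure plus change and persistence


def richest_experience(gating: bool, binding: bool, holding: bool) -> Experience:
    """Return the richest kind of experience supported by the operations present.

    Assumption (for illustration only): each level requires every level below it.
    """
    if not gating:
        return Experience.NONE
    if not binding:
        return Experience.FRAGMENTED
    if not holding:
        return Experience.ASSEMBLED
    return Experience.COHERENT


def blindness_type(gating: bool, binding: bool, holding: bool) -> Optional[int]:
    """Return 1, 2, or 3 for the type of inattentional blindness, or None."""
    level = richest_experience(gating, binding, holding)
    labels = {
        Experience.NONE: 1,        # Type 1: fragmented experience absent
        Experience.FRAGMENTED: 2,  # Type 2: assembled experience absent
        Experience.ASSEMBLED: 3,   # Type 3: coherent experience absent (change blindness)
    }
    return labels.get(level)
```

Under these assumptions, an observer with gating and binding but no holding would have assembled experience and show Type 3 blindness, that is, change blindness.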
An important challenge for future work is to determine the extent to which
these really are distinct kinds of visual experience, and how they may relate
to various kinds of attention. There are also important issues concerning what
might be called dark structure: structure that is never experienced at all, yet still
affects visual perception.
CONCLUSION
The nature of attention and its relation to perception have long been issues
cloaked in mystery, involving matters that are highly subjective and poorly
defined. But a great deal of progress has been made, particularly over the
past century. A considerable amount of understanding now exists as to how
attention operates, and the role it plays in our conscious experience. And,
importantly, this understanding has suggested new questions, concerning
issues that researchers of earlier times had not even imagined. Investigating
these issues will no doubt require much time and effort. But the results
are likely to shed interesting new light on the way we experience our
world.

FURTHER READING
Bundesen, C., & Habekost, T. (2008). Principles of visual attention: Linking mind and
brain. Oxford, England: Oxford University Press.
Itti, L., Rees, G., & Tsotsos, J. K. (2005). The neurobiology of attention. San Diego, CA:
Academic Press.
Mack, A., & Rock, I. (1998). Inattentional blindness. Cambridge, MA: MIT Press.
Pashler, H. E. (1999). The psychology of attention. Cambridge, MA: MIT Press.
Rensink, R. A. (2013). Perception and attention. In D. Reisberg (Ed.), Oxford handbook
of cognitive psychology (pp. 97–116). Oxford, England: Oxford University Press.
Simons, D. J. (Ed.) (2000). Change blindness and visual memory. New York, NY: Psychology Press.
Styles, E. A. (2006). The psychology of attention (2nd ed.). New York, NY: Psychology
Press.
Tsotsos, J. K. (2011). A computational perspective on visual attention. Cambridge, MA:
MIT Press.
Wolfe, J. M. (2000). Visual attention. In K. K. De Valois (Ed.), Seeing (2nd ed.,
pp. 335–386). San Diego, CA: Academic Press.
Wright, R. D. (Ed.) (1998). Visual attention. Oxford, England: Oxford University Press.

RONALD A. RENSINK SHORT BIOGRAPHY
Ronald A. Rensink is an Associate Professor in the departments of Computer Science and Psychology at the University of British Columbia (UBC) in
Vancouver, Canada. His interests include human vision (particularly visual
attention and consciousness), computer vision, visual design, and the perceptual mechanisms used in visual analysis. He obtained a PhD in Computer
Science from UBC in 1992, followed by a postdoctoral fellowship for 2 years
in the Psychology Department at Harvard University. This was followed by 6
years as a research scientist at Cambridge Basic Research, a laboratory sponsored by the Nissan Motor Company. He returned to UBC in 2000. He is
currently part of the UBC Cognitive Systems Program, an interdisciplinary
program that combines Computer Science, Linguistics, Philosophy, and Psychology. Among other things, he is a cofounder of the Vancouver Institute
for Visual Analytics (VIVA), an institute dedicated to facilitating the development of systems that can combine human and machine intelligence in
optimal ways. Webpage:
http://www.psych.ubc.ca/~rensink; http://www.cs.ubc.ca/~rensink
RELATED ESSAYS
Mental Models (Psychology), Ruth M.J. Byrne
Spatial Attention (Psychology), Kyle R. Cave
Misinformation and How to Correct It (Psychology), John Cook et al.
Construal Level Theory and Regulatory Scope (Psychology), Alison Ledgerwood et al.
Resource Limitations in Visual Cognition (Psychology), Brandon M. Liverence
and Steven L. Franconeri
Neural and Cognitive Plasticity (Psychology), Eduardo Mercado III
Speech Perception (Psychology), Athena Vouloumanos; Attention and Perception
RONALD A. RENSINK

Abstract
This essay discusses several key issues concerning the study of attention and its relation to visual perception, with an emphasis on behavioral and experiential aspects.
It begins with an overview of several classical works carried out in the latter half
of the twentieth century, such as the development of early filter and spotlight models of attention. This is followed by a survey of subsequent research that extended
or modified these results in significant ways. It includes work on various forms of
induced blindness and on the capabilities of nonattentional processes. It also covers
proposals about how a “just-in-time” allocation of attention can create the impression that we see our surroundings in coherent detail everywhere, as well as how the
failure of such allocation can result in various perceptual deficits. The final section
examines issues that have not received much consideration to date, but that may be
important for new lines of research in the near future. These include the prospects
for a better characterization of attention, the possibility of more systematic computational explanations, factors that may significantly modulate attentional operation,
and the possibility of several kinds of visual attention and visual experience.

INTRODUCTION
Whenever we open our eyes, we experience an ever-changing world of colors, shapes, and movements. This experience is so vivid and so compelling
that we rarely stop to consider whether the underlying mechanisms may
have limitations. Instead, we simply have a strong impression that we always
perceive everything in front of us. Although we may need to scrutinize something on occasion, for the most part our visual system appears to operate in
an automatic and seamless way, providing us with a complete and detailed
representation of whatever is in our field of view.
But however appealing it may be, this impression cannot be correct.
Suppose someone wants to keep track of various players in a sports game. A
single player can usually be tracked without problem. Three or four can also
be tracked, although with some effort. But as the number increases further,
simultaneous tracking of all the selected players becomes impossible.
Performance evidently depends on a factor which enables certain kinds of
Emerging Trends in the Social and Behavioral Sciences. Edited by Robert Scott and Stephen Kosslyn.
© 2015 John Wiley & Sons, Inc. ISBN 978-1-118-90077-2.

1

2

EMERGING TRENDS IN THE SOCIAL AND BEHAVIORAL SCIENCES

perception to occur, but which has a clear limit to its capacity. This factor is
generally referred to as attention.
Work on human vision is providing increasing evidence that visual perception is the result of several interacting processes, most of which are quite
sophisticated, and many of which have definite limits to their abilities. And
rather than the outputs of these processes accumulating in a detailed construction, much of our perception results instead from the coordination of these
processes. In particular, much of our visual experience appears to depend on
managing attention so that it is sent to the right item at the right time. As
such, attention is more than something that simply modifies or assists our
perception on occasion—it is instead a factor central to our awareness of the
world around us.
FOUNDATIONAL RESEARCH
It was recognized long ago that we need to pay attention to adequately perceive our surroundings. But only recently have we obtained a better understanding of what attention is and how it relates to perception. Building upon
the proposals of philosophers of the seventeenth and eighteenth centuries,
researchers in the nineteenth century began to map out several of its main
characteristics. For example, Hermann von Helmholtz discovered that an
observer could attend to (in the sense of recognizing) letters at locations outside of where the eyes were aimed (or “fixated”), showing that attention is not
equivalent to eye fixation. Meanwhile, William James distinguished “sensorial” from “intellectual” attention—the former concerned with concrete objects
such as particular sports players, the latter with more abstract structures such
as the quality of the game. James also associated sensorial attention—in particular, visual attention, the focus of this review—with clarity of perception,
intensity of perception, and visual memory. Many of these concerns became
an enduring backdrop for subsequent work.
FILTER MODELS
A more rigorous approach to understanding attention was developed during
the middle decades of the twentieth century, when researchers began focusing more on its selective aspects, and—in line with the “cognitive revolution”
of that time—replaced the original emphasis on subjective experience with
an emphasis on objective models. Donald Broadbent proposed an influential “filter” model, in which perception was carried out via a sequence of
processes in a single pathway, with an attentional filter that gated selected
aspects of a stimulus through to later processes. An important issue was the
locus of this filter: whether it was early (selection affecting the initial stages,

Attention and Perception

3

which measured simple properties such as color and motion) or late (selection appearing only at the highest levels, gating properties such as semantic
category).
The work undertaken to settle this issue resulted in a great deal of information about the ways various operations were affected by attention. However,
the complete resolution of this issue eluded researchers, and continues to do
so to this day. This strongly suggests that some of the original assumptions
were incorrect: there may be, for example, more than one filter in the pathway
(not to mention more than one pathway), making questions concerning a single filter somewhat ill-founded. To get further insights, a different approach
was needed.
SPOTLIGHT MODELS
Despite the failure to determine whether selection was early or late, investigations into this issue resulted in a variety of new methodologies and new
frameworks. Over time, concerns about the nature of filters receded, and
were replaced by an emphasis on how attention affected the representations
themselves.
An example of such a methodology is visual search, where observers are
asked to report on a prespecified target item in a visual display. It was found,
for example, that some items can be detected immediately and without much
attention (e.g., a blue dot among a set of yellow dots), whereas others cannot
(e.g., a “T” among a set of “L”s). Among the more prominent frameworks to
account for such findings was Anne Treisman’s feature integration theory. This
framework modeled visual processing in terms of two stages. The first is a
preattentive stage that determines simple properties (features) such as color
or motion rapidly and in parallel at each point of the visual field, resulting
in a “map” describing the spatial distribution of each feature. The second
involves a limited-capacity “spotlight” of attention that travels from item to
item at a rate of about 50 ms per item, not only filtering but also binding features that correspond to the same item (e.g., integrating the representations of
the “blue” and “vertical” properties at a location into a single representation
of both). Later refinements included the guided search model of Jeremy Wolfe
and colleagues, in which items in a feature map could be selectively inhibited
or excited to improve the efficiency of search. Other variants examined issues
such as the extent to which attention might be allocated in parallel rather than
in a serial fashion. All these models had natural connections to other areas
of vision science: for example, the features found in visual search could be
related in a fairly straightforward way to many of the elements underlying
texture perception.

4

EMERGING TRENDS IN THE SOCIAL AND BEHAVIORAL SCIENCES

Other approaches yielded similar results. Michael Posner and colleagues
did seminal work on cuing, showing that if a cue (such as a dot) was shown
at the location of a target just before the target appeared, detection could be
sped up by several hundred milliseconds. This speedup diminishes as the
separation between target and cue increases, something readily accounted
for by a model in which the edges of the spotlight are smooth. Meanwhile,
Charles Eriksen and colleagues showed that a spotlight mechanism could
also account for the ability of nearby items (or “flankers”) to interfere with
detection; results also suggested that only one spotlight operates at any time,
and that it can rapidly adjust its size, “zooming” in or out as required by
the task. Owing to its ability to account for a variety of effects, therefore, the
spotlight model has become the “classical” explanation of visual attention,
forming the basis of much of our current understanding of how it operates.
MULTIPLE-OBJECT TRACKING
A rather different approach to studying attention was developed by Zenon
Pylyshyn and colleagues, based on multiple-object tracking. Here, a set of identical items—dots on a screen, say—is initially displayed. A subset of these
is marked (e.g., some of the dots flash) and the marked items then tracked
as they randomly move around the display. The ability to track is severely
limited: Under most conditions, no more than three or four can be handled.
The extent to which multiple-object tracking can be explained by a spotlight
mechanism remains unclear. However, there is considerable—although not
universal—belief that this tracking does involve a form of attention, if only
because of the limited capacity found.
UNDERLYING MECHANISMS
One of the more successful quantitative models of attentional filtering and
binding was the Theory of Visual Attention of Claus Bundesen, which could
account for a considerable variety of experimental data. It was also compatible with later suggestions that filtering and binding could be implemented
via neural assemblies that inhibit their neighbors when activated. Another
(possibly complementary) proposal about implementation was neural synchrony, which posited that an attended item could be represented by the synchronized firing of a group of neurons. More generally, many of the classical
results could be explained by models based on the dynamics of neural interactions, along with the selective routing of information from various areas of
the brain.
In parallel with this, other work focused on understanding attentional control. Michael Posner suggested that the movement of attention involved three

Attention and Perception

5

distinct components: (i) the disengagement of attention from the current item
being attended, (ii) the shifting of its location (e.g., the center of the spotlight)
over space, and (iii) the reengagement of attention on a new item. Among other
things, this model successfully accounted for several perceptual problems
encountered in developmental disorders and degenerative diseases. Subsequent work placed an increased emphasis on the extent to which control was
affected by properties of the image—for example, the extent to which the size
or color of an item differed from that of its neighbors.
CUTTING-EDGE RESEARCH
The late twentieth and early twenty-first century saw the development of
several new research directions. Some were direct continuations of classical
work, and led to further refinement of earlier results. But others involved new
perspectives, and sometimes caused a reconsideration of previous assumptions. Although these investigations have not yet resulted in a coherent,
generally accepted account of attention, they have provided a better understanding of its operation, including how it relates to other mechanisms
involved in visual perception, and how its limitations can intrude into
everyday life.
INDUCED BLINDNESS
Much of recent work has returned to the issue of how attention relates to conscious visual experience—in particular, the way that an absence of attention
can cause a failure to see an item in clear view of the observer. One example
is inattentional blindness, where an observer fails to see an unexpected object
or event, even when these are large and quite visible. This has been taken
to indicate that attention is needed to see an object or event. There is some
uncertainty as to the extent of its implications at the theoretical level: Does
the observer fail to see all aspects of the object, or do they still see its basic
features but are blind to its structure or meaning? Either way, inattentional
blindness is increasingly recognized as being important at the practical level.
For example, many traffic accidents are likely due to a driver failing to see a
pedestrian (or another car) because attention was focused on something else.
A variant of this is continuous flash suppression. Here, a set of random images
is continually flashed into one eye at a rate of about 10 Hz, suppressing
the experience of the image shown to the other eye. This can be sustained
for several minutes. Various explanations have been put forward for this
phenomenon. The predominant hypothesis is that it occurs because attention cannot be sent to the suppressed image, and that no other effects are

6

EMERGING TRENDS IN THE SOCIAL AND BEHAVIORAL SCIENCES

responsible—that is, that continuous flash suppression is a form of inattentional blindness. If so, it could be a powerful way to study the extent to which
perception can occur in the absence of conscious visual experience.
Another phenomenon that has received a great deal of interest is change
blindness. Here, the observer fails to notice a change that occurs in an object,
even if the change is large and can easily be seen once the observer knows
what it is. This phenomenon strongly suggests that attention is needed to see
change. It appears that attention engages visual short-term memory (vSTM)
to create a representation that is coherent—that is, is integrated over some
extent of space and has continuity over some duration of time. The number
of items that can be monitored simultaneously for change is about three or
four, a limit similar to the capacity of vSTM. Unlike inattentional blindness,
change blindness can occur even when a change is expected. This can lead
to severe problems in everyday life, in that people can miss even a large,
obvious event if they are not attending to it the moment it occurs.
Other types of induced blindness are also of interest. One of these is the
attentional blink. This occurs when two different (prespecified) targets in a
stream of rapidly presented stimuli appear at slightly different times; under
some conditions, the first target will be seen but not the second. This has been
explained in terms of attention not being allocated to the second item in time,
possibly because the representation for the first has not yet been completed.
A related phenomenon is repetition blindness, where the observer can miss the
occurrence of a repeated item in a stream of rapidly presented images. This
is likewise believed to be due to the failure of attention to create sufficiently
quickly a representation of the repeated item.
NONATTENTIONAL PROCESSING
The earliest stages of visual processing are generally thought to be concerned with simple properties such as color, motion, and orientation. It was
originally assumed that attention acted directly on such properties: that
they were the preattentive features uncovered in visual search. But later
work showed that search can be influenced by relatively complex localized
structures—proto-objects—created by processes acting before attention. These
processes can group line segments, bind features, interpret dark regions
as shadows, and perhaps even recover three-dimensional orientation at
each location in the image, essentially creating a “quick and dirty” map of
scene structure. The strength of cuing and speed of search can be similarly
influenced by the inferred structure of the background, being enhanced
for items on the same surface and diminished for items on different ones.
All these results point to a considerable amount of processing that occurs
rapidly (and likely in parallel across the visual field), before attention has
had much of a chance to operate.

Attention and Perception

7

Recent work has also shown that observers can accurately estimate summary statistics, such as the average size of the disks in an image, even if this
image is presented for only a 100 ms or so; more sophisticated properties
(e.g., Pearson correlation) can also be estimated this way. Observers can even
determine the appropriate category (or gist) of a scene under such conditions,
possibly based on these statistics. In all of these, there is no time to filter or
bind more than a few items, suggesting the existence of processes that operate
before—or perhaps in tandem with—visual attention.
The “intelligence” of such nonattentional processes is an open issue.
Observers show little inattentional blindness to words and pictures with a
strong emotional impact (e.g., the observer’s name), indicating that some
degree of recognition exists before attention is sent to the item. In general,
then, all these results imply that nonattentional processes are capable of
more than previously believed. And attention may correspondingly do
less: although attention can be used on occasion to bind visual features, for
example, it may not be necessary for all aspects of binding.
CONNECTIONS WITH SCENE PERCEPTION
Phenomena such as inattentional blindness and change blindness suggest
that attention is necessary for visual experience. And most studies concur
that attention is severely limited. Why then do we not experience such
limits when viewing a scene? One possibility is that attention can create a
representation—a visual object—possessing detail and coherence, but only
as long as attention is maintained. If this can be done on a “just-in-time”
basis—that is, attention is sent to the right item at the right time—the
result would be a virtual representation that would appear to higher level
processes as if it were “real,” that is, as if it contained detailed and coherent
representations everywhere. An important goal of current work is therefore
to understand the nature of the mechanisms underlying such coordination.
One suggestion begins with nonattentional processes providing a
constantly regenerating array of proto-objects, which represent simple
properties of the scene that are visible to the observer. Attention can select a
subset of these, “knitting” them into a coherent visual object. In tandem with
this, the statistics of the (unattended) proto-object array could determine
gist; this could help access high-level knowledge about the scene, and so
guide attention to appropriate parts of the image. In this characterization,
then, scene representations are no longer long-lasting structures built up
from eye movements and attentional shifts, but are relatively temporary
structures that guide such activities. Among other things, this implies that
different observers—with different knowledge, different goals, and therefore
different attentional strategies—can literally see the same scene differently.

8

EMERGING TRENDS IN THE SOCIAL AND BEHAVIORAL SCIENCES

CONNECTIONS WITH PERCEPTUAL DEFICITS
Given that attention is needed for visual experience, problems with its allocation may explain various perceptual deficits. In unilateral neglect, for example,
patients with damage to the right posterior parietal cortex (at the top and
back of the head) can fail to visually experience whatever is in the left half of
the visual field, even if this is directly in front of them. (Oddly, a corresponding deficit does not result from damage to the left side.) A related condition
is extinction, where such a failure also occurs, but only when an object exists
in the right half of the visual field. Such deficits may result from problems in
shifting attention to the relevant location (or at least, keeping it there), possibly because of damage to the parietal circuits that control it. Interestingly,
words and pictures in the neglected—and presumably unattended—part of
the visual field can still affect the observer, consistent with the proposal of
intelligent nonattentional processes.
Another condition likely related to these is simultanagnosia. Patients with
this deficit cannot see more than one coherent object (or coherent part of an
object) at a time; the rest of the scene is experienced only in a fragmented
way, or not experienced at all. This has been associated with damage to the
parieto-occipital areas (at the upper part of the back of the head), which may
cause problems in allocating attention to particular objects.

KEY ISSUES FOR FUTURE RESEARCH
Most issues in attention research—both classical and subsequent—are still far
from being resolved. For example, what is the relation between attention and
vSTM? How many nonattentional process exist, and how intelligent is each?
How exactly do the knowledge and goals of the observer determine how
attention is allocated? The answers to all of these are necessary for a complete
understanding of attention. Finding them will take many more years of work.
Meanwhile, other issues are also beginning to emerge. Part of the reason
they have not received much consideration to date is sociological: Given the
work still to be done on current issues, little incentive exists to embark upon
riskier ventures elsewhere. Part is methodological: It is not clear how some
of these issues could be addressed in a productive way. And part is simple
ignorance: We did not know enough until recently to realize that some of
these issues even existed. But whatever the reason for their previous obscurity, many of these issues are becoming increasingly prominent, and may well
form a critical part of future research.

Attention and Perception

9

CHARACTERIZATION
One of the most basic—and oldest—issues concerning attention concerns its
nature: What exactly is it? Over the years, attention has been characterized in
various ways, such as the quality of visual experience, or a limited “resource”
that enables particular operations to be carried out. But the greatest increase
in our understanding seems to have been achieved by focusing on the idea
of selection. Could this idea be developed further, ideally in a way consistent
with most of the other characterizations that have been applied?
One possibility would be to define an attentional process as one that is
contingently selective, with that selectivity controlled via global considerations
(e.g., tracking a particular person of interest). From this perspective, “attention” is more an adjective than a noun. Any globally controlled process
of limited capacity—such as binding visual features, or placing them into
vSTM—would be “attentional,” because limited capacity implies selectivity
of one form or other. This would also be the case for any process that
selectively improves the quality of visual experience, provided only that this
is done on the basis of some global consideration (e.g., not done reflexively).
COMPUTATIONAL EXPLANATION
Even if attention could be described in terms of a particular function or mechanism, our understanding of it would be incomplete: We might know how it
operates, but not why. For example, if some capacity were limited to three
items, why should this be? Why not four? Why not one? Of course, such a
limit may simply be an accident of history. But it may also reflect the influence
of deeper principles.
One possible way of investigating this is to apply the computational framework of David Marr. This framework posits that any (visual) process can be
analyzed from three interlocking perspectives: (i) function (both description
and justification), (ii) mechanism (algorithm and representation), and (iii) neural implementation. Such explanations have led to deep insights into the nature
of processes at early levels of human vision, and have helped develop their
equivalents in machine vision. A few studies, such as those of John Tsotsos,
have begun applying this approach to attention as well. Such analyses could
eventually provide considerable insights into the nature of attention and the
exact role it plays in perception.
MODULATORY FACTORS
It is often assumed that attention is governed entirely by the demands of the
task and the knowledge of the observer. However, evidence is emerging that
other factors also play an important role:

10

EMERGING TRENDS IN THE SOCIAL AND BEHAVIORAL SCIENCES

Stress. Stress can cause tunneling, where the observer loses awareness of
anything beyond the center of the visual field. It can also speed up visual
search for simple features (e.g., a particular orientation, such as “vertical”), although apparently not for their combination (e.g., “blue” and
“vertical”). Such effects suggest that stress causes attention to improve
its selectivity by reducing the range of the properties allowed through.
However, it may be that such improvement is obtained at the cost of a
slower switching of the underlying mechanisms.
Aging. Another important perspective is how attention changes over
lifespan. Different aspects of attention appear to be differently
affected: Filtering and binding appear to be largely unaffected, while
top-down control (e.g., disregard of irrelevant stimuli, switching
speed) deteriorates noticeably with age. More investigation would be
of great practical importance, and could provide new perspectives on
underlying mechanisms.
Cultural/Visual Environment. Recent work suggests that observers from
Western countries (e.g., the United States) generally attend to individual objects in a scene, whereas observers from East Asian countries
(e.g., Japan) generally attend to the scene as a whole. Western observers
show a search asymmetry: They can detect a long line among short
lines more quickly than vice versa. Meanwhile, East Asian observers
are equally slow for both. Preliminary work suggests that some of
these differences disappear when significant time is spent in the other
culture. If these results hold, they would indicate a strong effect of
culture—or at least, visual environment—on the way attention is used.
Interesting issues would then arise as to which (visual) characteristics
are relevant, and why.
Mental Set. Attentional control—including the speed of visual search—can
be influenced by explicit instruction to the observer. Such results suggest
that an observer may have available several processing modes, each corresponding to a particular “mental set.” (Some of these may account for
the cultural differences mentioned above.) If so, interesting questions
arise as to the nature of these modes, and the conditions that trigger
them.
KINDS OF ATTENTION
Another important issue is whether there exists one kind of attention or several. Occasional conflicts have occurred in claims regarding the speed, sensitivity, and even function of attention; it is not even clear as to what extent it
travels along perceptual structures or “raw space.” The existence of multiple
kinds of attention could help resolve some of these issues. It would also create new ones, such as determining the taxonomy that would best describe
these kinds, and establishing the various ways in which a process could be
“preattentive” or “nonattentional.”
On the basis of function, speed, and structures operated upon, several
groupings of attentional processes can be delineated. An important question
is the extent to which these groupings correspond to distinct aspects—or
even kinds—of attention (or, perhaps, more precisely, attentional processing):
Attentional Sampling. This is the selective pickup of information by the eye.
The eye has high acuity and color perception only in the few degrees
around the point of fixation. It must therefore—together with the head
and body—move around to pick up the right information from the environment. Sampling has traditionally been referred to as overt attention.
It has long been known to differ from operations carried out internally,
which are often collectively referred to as covert attention.
Attentional Filtering (Gating). Irrelevant information can degrade performance, and must be removed as soon as possible. Ways of doing so
include spatial filtering (selection only from a particular region of space)
and feature filtering (selection of items containing a particular feature);
these are largely the focus of classical approaches. Selection can be diffuse
(over a wide range) or focused (over a restricted range). It appears that
the mechanisms involved can be switched quickly (typically, within 50
ms) and operate on the basis of simple properties, such as color, motion,
or spatial position.
Attentional Binding. This is the selective linking of properties so as to capture the structure of the world at any given moment. This can be done
in various ways, such as feature binding (e.g., linking the color and orientation of an item) and position binding (e.g., linking an item to a precise
position in space). Binding differs from filtering, being concerned not
with access, but construction. The mechanisms involved also appear to
differ, being slower (completing within about 150 ms) and involving
organized structures rather than simple properties.
Attentional Holding. When a physical object changes over time (e.g., a
bird takes flight), it is useful to perceive an underlying structure that
remains the same. The associated representation must be “held” across
time, likely via vSTM; such “holding” therefore differs from binding.
The mechanisms involved also appear to differ, being even slower
(completing within about 300 ms) and operating on no more than three
to four items at a time.
Attentional Individuating. It is often useful to perceive not just an object,
but a particular object (e.g., when determining if one item is to the left
of another). Such “individuating” (or “indexing”) may also be the basis
of tracking. The mechanisms involved can act quickly (about 50 ms per
item) and involve up to seven to eight structures at a time.
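The groupings above can be collected into a small lookup table. The following is a minimal sketch, not a model proposed in this essay: the names AttentionalProcess, PROCESSES, and could_complete_within are hypothetical, the time and capacity figures are the approximate values quoted above (using the upper end of each quoted range), and treating the switching time for filtering as a completion time is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AttentionalProcess:
    name: str
    # Approximate time figure quoted in the text, in ms (None if none is stated).
    time_ms: Optional[int]
    # True when the figure is a per-item cost rather than a total time.
    per_item: bool
    # Upper end of the quoted capacity range (None if none is stated).
    max_items: Optional[int]


# Illustrative values only, taken from the descriptions above.
PROCESSES = [
    AttentionalProcess("sampling", None, False, None),
    AttentionalProcess("filtering", 50, False, None),
    AttentionalProcess("binding", 150, False, None),
    AttentionalProcess("holding", 300, False, 4),
    AttentionalProcess("individuating", 50, True, 8),
]


def could_complete_within(duration_ms: int, n_items: int = 1) -> List[str]:
    """Names of processes whose quoted figures fit duration_ms for n_items."""
    names = []
    for p in PROCESSES:
        if p.time_ms is None:
            continue  # no timing stated in the text
        if p.max_items is not None and n_items > p.max_items:
            continue  # exceeds the quoted capacity limit
        cost = p.time_ms * n_items if p.per_item else p.time_ms
        if cost <= duration_ms:
            names.append(p.name)
    return names
```

For example, could_complete_within(150) lists filtering, binding, and individuating but not holding, which reflects the ordering of speeds described above. The table makes no claim about whether these groupings are truly distinct kinds of attention.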
KINDS OF VISUAL EXPERIENCE
A parallel set of concerns involves conscious visual experience. As in the
case of attention, it has been widely assumed that there exists only one kind
of visual experience. But just as color and motion are distinct aspects—or
even kinds—of experience concerned with distinct physical properties of the
world, so might there be other kinds of experience concerned with distinct
structural properties:
Fragmented Experience. This is the experience of simple features with little
structure and poor localization; in some ways, it is what is experienced
when viewing an Impressionist painting. It can be encountered in
brief displays, where the experience is one of a fleeting array of simple
colors and shapes with relatively little structure. This has sometimes been termed background consciousness—the experience of the
background when attention (binding) is focused on foreground objects.
Assembled Experience. This is the experience of unstructured properties
(fragmented experience) along with a degree of superimposed static
structure. It can be encountered in displays presented for at least 150
ms, the time needed for binding; it is essentially what is experienced
under stroboscopic conditions. Although no new sensory (physical)
properties are present, more complex kinds of structure are. Among
other things, this distinction allows two kinds of inattentional blindness
to be distinguished: Type 1, the absence of fragmented experience (i.e.,
the absence of sensory qualities, perhaps caused by an absence of
attentional gating), and Type 2, the absence of assembled experience,
with simple sensory qualities still present but no higher level structure
(perhaps caused by an absence of attentional binding).
Coherent Experience. This is the “standard” experience encountered when
giving complete attention to a physical object: Not only is the static
structure of assembled experience present but also movement—or
more generally, change—along with the impression of an underlying
substrate that persists over time. The absence of coherent experience
(change blindness) might be regarded as Type 3 inattentional blindness,
caused by an absence of attentional holding.
Sensing. Observers in change detection experiments occasionally report
that they “sense” or “feel” a change without having any visual experience of it. The status of this “sensing” is controversial. It has been
suggested that it is simply a “weakened” form of seeing (i.e., coherent
experience). However, it differs qualitatively from the other kinds of
visual experience, and appears to involve different mechanisms as well.
An important challenge for future work is to determine the extent to which
these really are distinct kinds of visual experience, and how they may relate
to various kinds of attention. There are also important issues concerning what
might be called dark structure: structure that is never experienced at all, yet
still affects visual perception.
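The suggested correspondence between fragmented, assembled, and coherent experience and the three types of inattentional blindness can be written out as a simple ordered mapping. This is only a sketch under stated assumptions: the names STAGES and classify are hypothetical, and the code assumes a strict hierarchy in which each kind of experience requires every earlier attentional process, which the text offers only tentatively (“perhaps caused by”). Sensing and dark structure are deliberately left out.

```python
from typing import Iterable, Optional, Tuple

# Ordered stages: (attentional process, kind of experience it is assumed to enable).
# Assumption: each kind of experience requires all earlier processes as well.
STAGES = [
    ("gating", "fragmented"),
    ("binding", "assembled"),
    ("holding", "coherent"),
]


def classify(available: Iterable[str]) -> Tuple[Optional[str], Optional[int]]:
    """Return (richest experience reached, inattentional blindness type or None)."""
    present = set(available)
    experience = None
    for blindness_type, (process, kind) in enumerate(STAGES, start=1):
        if process not in present:
            # The first missing process determines the type of blindness.
            return experience, blindness_type
        experience = kind
    return experience, None  # all processes present: no blindness
```

Under these assumptions, classify({"gating"}) yields fragmented experience with Type 2 blindness, and classify({"gating", "binding"}) yields assembled experience with Type 3 blindness (change blindness).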
CONCLUSION
The nature of attention and its relation to perception have long been issues
cloaked in mystery, involving matters that are highly subjective and poorly
defined. But a great deal of progress has been made, particularly over the
past century. A considerable amount of understanding now exists as to how
attention operates, and the role it plays in our conscious experience. And,
importantly, this understanding has suggested new questions, concerning
issues that researchers of earlier times had not even imagined. Investigating
these issues will no doubt require much time and effort. But the results
are likely to shed interesting new light on the way we experience our
world.

RONALD A. RENSINK SHORT BIOGRAPHY
Ronald A. Rensink is an Associate Professor in the departments of Computer Science and Psychology at the University of British Columbia (UBC) in
Vancouver, Canada. His interests include human vision (particularly visual
attention and consciousness), computer vision, visual design, and the perceptual mechanisms used in visual analysis. He obtained a PhD in Computer
Science from UBC in 1992, then held a 2-year postdoctoral fellowship
in the Psychology Department at Harvard University. He then spent 6
years as a research scientist at Cambridge Basic Research, a laboratory sponsored by the Nissan Motor Company. He returned to UBC in 2000. He is
currently part of the UBC Cognitive Systems Program, an interdisciplinary
program that combines Computer Science, Linguistics, Philosophy, and Psychology. Among other things, he is a cofounder of the Vancouver Institute
for Visual Analytics (VIVA), an institute dedicated to facilitating the development of systems that can combine human and machine intelligence in
optimal ways. Webpage:
http://www.psych.ubc.ca/~rensink; http://www.cs.ubc.ca/~rensink
RELATED ESSAYS
Mental Models (Psychology), Ruth M.J. Byrne
Spatial Attention (Psychology), Kyle R. Cave
Misinformation and How to Correct It (Psychology), John Cook et al.
Construal Level Theory and Regulatory Scope (Psychology), Alison Ledgerwood et al.
Resource Limitations in Visual Cognition (Psychology), Brandon M. Liverence
and Steven L. Franconeri
Neural and Cognitive Plasticity (Psychology), Eduardo Mercado III
Speech Perception (Psychology), Athena Vouloumanos