Understanding Biological Motion
JEROEN J. A. VAN BOXTEL and HONGJING LU
Abstract
The ultimate goal of biological motion perception is to be able to understand actions
so as to provide an answer to the question, “Who did what to whom and why?” This
inference capacity enables humans to go beyond the surface appearance of behavior
in order to successfully interact with others and with the environment. In addition
to its functional importance, understanding biological motion bridges several major
fields, including perception, reasoning, and social cognition. However, despite its
paramount role in human perception and cognition, only limited progress has so far
been made in understanding biological motion. After reviewing the relevant literature, this essay argues that future research needs to identify the contributions of
three basic processes involved in understanding biological motion: perception of
animacy, causality, and intention. The involvement of these basic processes needs
to be investigated both in the typical healthy population and in populations
with mental disorders, such as autism spectrum disorders and schizophrenia. We
also suggest that a productive research approach should focus on more interactive
actions of the sort often observed in the natural social environment, rather than solely
using the single-actor displays that have been typical in previous work. It is further emphasized that there is a need for a theoretical and computational framework
within which these different types of processing can be united. We propose that the
predictive coding framework provides a good candidate.
INTRODUCTION
In 1872, in his seminal work demonstrating parallels in the way humans
and animals express emotions, Darwin noted that “actions speak louder
than pictures when it comes to understanding what others are doing and feeling”
(Darwin, 1872). Darwin’s claim is supported by the fact that many animal
species are sensitive to motion patterns generated by other living organisms,
presumably due to the ecological importance of biological motion. Superior
perception for biological motion manifests itself in two pervasive behavioral
characteristics: the robustness of recognizing actions, and sophisticated
inference in understanding them, that is, grasping the intentions of actors.
In support of the first characteristic, numerous psychophysical studies have
Emerging Trends in the Social and Behavioral Sciences. Edited by Robert Scott and Stephen Kosslyn.
© 2015 John Wiley & Sons, Inc. ISBN 978-1-118-90077-2.
demonstrated that human observers show an exquisite ability to accurately
identify attributes of an actor, such as identity (Cutting & Kozlowski, 1977),
emotional state (Dittrich, Troscianko, Lea, & Morgan, 1996), and gender
(Kozlowski & Cutting, 1977), even when the stimulus lacks a detailed human
body form (e.g., a point-light display consisting of only a few discrete dots
representing joint movements). The remarkably rapid, accurate, and robust
perception of biological motion has inspired a great deal of research directed
at understanding how the visual system achieves this perceptual feat in
recognizing actions.
However, the second characteristic—sophisticated inference in understanding biological motion—is arguably more essential for human
perception and cognition. In order to successfully interact with others and
with the environment, the human mind is equipped with the ability to make
inferences that go beyond the surface appearance of behavior. For example,
with a brief glance at the crowd in Times Square in New York, you can not
only recognize which pedestrians are walking at a relaxed pace and which are
running in a hurry but also readily identify people who are interacting with
others (e.g., walking together while having a conversation; shaking hands).
Furthermore, you can predict other people’s actions in the near future (e.g.,
expecting someone to extend a hand to wave goodbye to a friend). The
ultimate goal of biological motion perception, it seems, is to understand and
predict actions in which multiple individuals interact, and to make inferences
about other individuals’ intentions and goals by evaluating their actions.
For several reasons, systematic research on biological motion understanding needs to be pursued with greater vigor. First, the inference capacity of
the human visual system exceeds that of the most advanced machine vision.
For example, in the investigation of the Boston Marathon bombing case in
2013, extensive video from surveillance camera systems was available, but
it was the trained human eye that led to arrests. Hence, understanding how
humans make inferences and predictions about actions will doubtlessly play
an important role in guiding the development of more advanced machine
vision systems.
Second, within the human population, the ability to understand biological
motion varies among individuals. Within the first days of life, human
newborns show selective preference for biological motion (Simion, Regolin,
& Bulf, 2008), supporting the hypothesis that detection of biological motion
is an intrinsic capability of the human visual system. However, this evolutionarily basic ability is impaired in people with disorders such as autism
(Blake, Turner, Smoski, Pozdol, & Stone, 2003; Klin, Lin, Gorrindo, Ramsay,
& Jones, 2009). One of the core symptoms distinguishing autism from
other disorders is a lack of ability to infer the meaning of observed actions,
which makes it difficult to carry on effective interactions with others. This
impairment is generally believed to contribute to the severe cognitive and
social consequences of autism in later life (Kaiser & Shiffrar, 2009). Hence,
investigation of the key mechanisms underlying action understanding
may potentially guide the development of behavioral interventions to help
individuals with autism to adopt compensatory strategies.
Third, understanding biological motion plays an essential role in bridging
several important fields, including perception, reasoning, and social cognition. A precursor of biological motion processing is the extraction of motion
information using general motion detectors; hence, biological motion perception offers a window to study the interactions between high-level visual processing and low-level motion processing. The outcome of biological motion
perception must feed into a reasoning system that infers the intentions and
goals of other individuals, yielding social understanding. Currently, relatively little research has addressed the connections between perception and
reasoning, and even less has investigated the further connection to social cognition. Hence, theoretical investigations and empirical tests are both needed
to advance understanding of the perceptual and cognitive architecture that
supports the social human mind.
In the past decades, the vast majority of studies of biological motion have
focused on simple and stereotyped actions, such as walking and running,
which involve a single agent. Little progress has been made in quantifying
what visual information is used in predicting other people’s (inter)actions,
determining how action representations are utilized in these inferential tasks,
and assessing how perception and reasoning operate synergistically to infer
hidden goals and intentions. This essay will focus on how biological motion
enables human agents to effectively interact with objects and other agents in
the environment. We will first review classical and recent work relevant to
understanding biological motion. We then propose a unified computational
framework, and point to future research directions for work on this fundamental issue in perception and cognition.
BUILDING BLOCKS FOR UNDERSTANDING BIOLOGICAL MOTION
In order to carry out effective social interactions, human minds need to
address the fundamental question of “who did what to whom and why?” by
identifying causal relationships between individuals’ actions and inferring
the intentions of individuals. For example, when we observe an interaction
between an individual and an object (e.g., a person throws a ball), what
we really see is a living actor that causes a change in the states of an object
(e.g., positions, moving directions of the ball), in order to achieve certain
subjective goals (e.g., to hit a basketball net). Although the brain performs
this complex analysis with little effort, the process is sophisticated, involving
three distinct, but related, types of analyses: animacy (i.e., perceiving the
moving agent as a living being, and the moving ball as not), causality (i.e.,
inferring that the person is the cause that makes the ball fly), and intention
(i.e., understanding what the actor wants to achieve by his or her action).
ANIMACY AND CAUSAL PERCEPTION
A classical investigation into animacy judgments was performed in the context of the perception of causality (Michotte, 1946/1963). Michotte showed
that moving geometric shapes give rise to a vivid perception of animacy, and
even of meaningful interactions between an animate object and an inanimate
object. Around the same time, Heider and Simmel (1944) showed that such
simple stimuli even yield perceived intentions (e.g., one shape would seem to
“want to catch” another shape). These pioneering studies highlight human
sensitivity to animacy, causality, and intentions, even in stimuli devoid of
regular social cues (such as body movements and facial expressions).
The work by Michotte and others (Scholl & Tremoulet, 2000) has provided
evidence that animacy can be directly perceived in a stimulus, rather
than inferred from associations between stimulus elements. Specifically,
the perception of animacy is fairly fast, automatic, irresistible, and highly
stimulus-driven (Scholl & Tremoulet, 2000). For example, small changes
in the speed of a moving shape, which are unlikely to change humans’
high-level cognitive inferences about the movement, can nonetheless induce
or abolish a perception of animacy (Michotte, 1946/1963). Accordingly,
Michotte (1946/1963) proposed that some special and automatic mechanisms for analyzing perceptual input are responsible for giving rise to a
“genuine causal impression.” This hypothesis is supported by evidence
showing that processing of such causal interactions is not influenced by
attention (Blakemore et al., 2003).
Hence, two of the building blocks for the understanding of biological
motion—animacy and causal perception—depend on perceptual quantities.
Any change in the stimulus, or in the mechanisms with which the brain
processes these stimuli (e.g., due to brain diseases), could affect the understanding of intentions, and hence the social understanding of the actions.
THE INFERENCE OF INTENTION
The third building block of understanding biological motion is to infer
other people’s intentions, which requires making a connection between the
observed behavior (e.g., body movements) and the inferred mental states
(including intentions) of the actor. This process involves taking what has
been called an intentional stance (Dennett, 1987), and requires the observer to
have extensive knowledge of how actions may arise from a combination of
internal factors. Such a mentalizing process (Frith & Frith, 2006), also termed
theory of mind (Premack & Woodruff, 1978), is largely built on conscious and
unconscious cognitive inferences about intentions of the perceived actor.
Different types of inferential processes are involved in understanding intentions from biological motion. Recently, it has been suggested that intentions
can be divided into goal-directed intentions, and those that require multistep
movements, with an orthogonal dimension of social to nonsocial intentions
(Chambon et al., 2011). It is possible that simple goal-directed intentions (e.g.,
grabbing a mug to drink something) may be directly perceivable or may be
retrieved by internal motor simulation (Brass, Schmitt, Spengler, & Gergely,
2007), without the need for explicit inferences. However, interpretation of
more complex intentions—those that are essential in social contexts—most
likely depends on cognitive inferences (Keysers & Gazzola, 2007; de Lange,
Spronk, Willems, Toni, & Bekkering, 2008). These latter, more complex, inferential processes are classically thought to be at the root of some of the disorders affecting social cognition, such as autism.
In order to make correct inferences about intentions, humans need to
combine directly perceivable information, such as body movements, as
well as animacy and even causality, with prior knowledge about intentions
underlying similar dynamic instances observed previously in social contexts. According to Marr’s three levels of analysis in vision (Marr, 1982), we
need to examine biological motion understanding at the computational, representational, and neural implementation levels. To achieve the balance between
incoming “sensory-driven” information and prior knowledge, Bayesian
inference provides a mathematical framework at the computational level
to understand the function and purposes of biological motion (Knill &
Richards, 1996). The predictive coding framework provides a testable theory
at the representation and implementation levels to make connections to
neuroscience and behavioral studies (Friston, 2010), which will be discussed
in detail as one of the emerging trends.
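As a minimal illustration of the Bayesian idea at the computational level, the following Python sketch combines prior knowledge about intentions with the likelihood of an observed movement. The intention labels, the probabilities, and the function name `posterior_over_intentions` are hypothetical choices made for illustration only; they are not taken from the studies cited above.

```python
def posterior_over_intentions(prior, likelihood):
    """Apply Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalized = {k: prior[k] * likelihood[k] for k in prior}
    total = sum(unnormalized.values())
    return {k: v / total for k, v in unnormalized.items()}

# Hypothetical prior: in this context, people who reach for a mug usually drink.
prior = {"drink": 0.7, "clear_table": 0.3}

# Hypothetical likelihood of the observed grip under each intention:
# the observed grip is assumed to be more typical of clearing the table.
likelihood = {"drink": 0.2, "clear_table": 0.6}

posterior = posterior_over_intentions(prior, likelihood)
for intention, probability in posterior.items():
    print(f"{intention}: {probability:.2f}")
```

In this toy case the sensory evidence outweighs the prior, so the less expected intention becomes the more probable one. With a stronger prior or weaker evidence, the balance would tip the other way.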
PREDICTIVE CODING FRAMEWORK FOR UNDERSTANDING
BIOLOGICAL MOTION
In a predictive coding framework, the brain aims to explain incoming
sensory data by making inferences about the potential causes of the sensory
data. These predictions (in computational terms, “empirical priors”) are
sent down to a processing level primarily driven by sensory information,
where they are subtracted from the incoming information. If the prediction
is inaccurate, a large residual of the sensory data (i.e., the prediction error)
remains unexplained. This information about the discrepancy between
prediction and observation is sent back to the higher (less sensory) levels in
order to adjust the prediction. If the prediction is reasonably accurate, and
only a small error signal (prediction error) remains, the system can conclude
that the predicted (i.e., inferred) causes were indeed present. Such loops
are present in both early sensory levels and high-level brain areas, where
intentions are processed. Depending on the stage within the visual
system, the inferred causes are intentions at higher levels of the processing
hierarchy, whereas at lower levels they are more mechanical causes (e.g., a ball
hitting another ball), or simply the presence of a stimulus. Therefore, within
the predictive coding framework, there is no major difference between
perceptual and cognitive inferences. The “perception” of the presence of a
stimulus relies on an identical computational schema as the “inference” of
the presence of an intention.
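The loop described above can be sketched in a few lines of Python. This is a deliberately simplified, single-level illustration and not a model from the cited literature: the prediction is a single number, the mapping from inferred cause to predicted input is assumed to be the identity, and the function name `infer_cause` and the `learning_rate` parameter are hypothetical.

```python
def infer_cause(observation, initial_prediction, learning_rate=0.1, steps=200):
    """Iteratively adjust a prediction using the prediction error."""
    prediction = initial_prediction
    errors = []
    for _ in range(steps):
        # The top-down prediction is subtracted from the incoming signal.
        error = observation - prediction
        errors.append(error)
        # The residual is passed upward and used to adjust the prediction.
        prediction += learning_rate * error
    return prediction, errors

# Hypothetical sensory input and an initially inaccurate prediction.
estimate, errors = infer_cause(observation=2.0, initial_prediction=0.0)
print(f"first error: {errors[0]:.3f}, last error: {errors[-1]:.2e}")
print(f"final estimate: {estimate:.6f}")
```

The prediction error starts large and shrinks as the prediction is adjusted, at which point the inferred cause can be taken as present. The same update rule applies whether the quantity being inferred stands for a stimulus or for an intention, which is the sense in which the framework treats perception and inference alike.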
Biological motion understanding relies on a hierarchical system comprising
different stages from low-level processing to higher level processing, that is,
local motion processing, biological motion processing, animacy and causal
perception, and intention inference. When a problem occurs at higher level
processing, this could lead to difficulties in inferring intentions (Kilner, Friston, & Frith, 2007), which in turn may cause an incorrect prediction of what
the observed actor may do in the future. These wrong predictions will differ greatly from the input and create a large prediction error, which requires
additional processing at this lower level to resolve the disagreement between
predictions and observations. Hence, malfunction at a higher level could
cause an increased demand for attention to lower level elements (i.e.,
details), which could account for some behavioral atypicalities observed in
autism (van Boxtel & Lu, 2013b; Friston, Lawson, & Frith, 2013; Van de Cruys,
de-Wit, Evers, Boets, & Wagemans, 2013). Similarly, when lower level processors do not function in a typical way, this could cause inaccurate perception
of animacy, which could in turn affect how well an observer can infer the
intentions of the observed actor.
FUTURE DIRECTIONS
In this section, we identify three main themes that may guide future research
investigating how humans achieve deep understanding of biological motion
for effective social interactions.
RELATIONSHIP BETWEEN PERCEPTUAL AND COGNITIVE INFERENTIAL PROCESSES
To achieve deep understanding via biological motion, the human mind relies
on a sophisticated interplay between three tightly interrelated basic processes:
perception of animacy, causality, and inference of intention. In future studies, it will therefore be important to separate the different contributions of these
individual processes to biological motion understanding. This type of work is
important, both for determining the constraints on the inference of intentions
and for quantifying the contributions of each process in accounting for individual differences. Without recognizing the distinction between these three
basic processes, it would be difficult to understand the origins and determinants of certain psychological disorders that involve problems in social
cognition and to develop principled intervention programs.
Isolating these distinct factors is not easy. For example, in certain experimental paradigms, a stimulus manipulation intended to change the intention
inference process may instead change the perception of animacy, resulting
(indirectly) in a change in inferred intentions. This potential confound was
recently addressed by Gao and colleagues (Gao, Scholl, & McCarthy, 2012) in
a psychophysical study in which moving geometric shapes were perceived
to be animate and intentional. By manipulating the stimulus carefully, the
researchers were able to maintain the same level of perceived animacy, while
varying the perceived intentions of the shape elements. They showed that
some areas [such as the posterior superior temporal sulcus (pSTS)], which
were previously thought to be involved in the perception of animacy, are
actually more related to the detection of intentionality. Their study demonstrates the promise of developing visual stimuli to discriminate inferential
processes of intention from perceptual processes involving animacy and
causality. We consider studies that demarcate the influence of animacy,
causality, and intention to represent an important future research direction.
Meanwhile, realistic action stimuli will also be important because natural
social stimuli are more complex than the combined movements of simple
geometric shapes. For example, several studies have demonstrated that the
perception of animacy depends on the correct relationship between internal
and external movements (Michotte, 1946/1963; Thurman & Lu, 2013): an actor
will look less animate when he or she moves too fast in
relation to how fast his or her extremities move. Such complex interactions between
internal and external cues in biological motion stimuli will need to be
researched more thoroughly.
Finally, there is an emerging trend to investigate how social scenarios are
perceived using controlled biological motion stimuli. Although inferences
about intentions are especially important in social/interactive contexts,
research on this topic is challenging due to the complexity of this inferential
task. Identifying interactions between actors is obviously more complicated
than perceiving the action of a single individual, and recent imaging work
has indeed shown that additional brain structures (including the STS, and
more frontal areas) are recruited when an interactive scene is analyzed
(Backasch et al., 2013; Iacoboni et al., 2004). Systematic investigations of
intention inference are needed to deepen our understanding of this important problem in human perception and cognition. Indeed, it is one of the
stated goals of the Human Brain Project (2012, p. 34), a major scientific
research project sponsored by the European Union. The initial research
forays in this direction have yielded interesting data. For example, psychophysical experiments have provided evidence of top-down influences
of interactive information at very early levels of processing in the visual
system. When observers had to detect a point-light actor in a noisy background,
detection was easier when the display showed individuals engaged in physical
interactions (i.e., dancing and boxing partners) (Neri, Luu, & Levi, 2006).
A similar finding was reported (Manera, Del Giudice, Bara, Verfaillie, & Becchio, 2011) for communicative interactions (e.g., hand gestures). Therefore,
interactive information is able to impact perception at very early stages
within the visual hierarchy. In addition, more research is needed to examine
humans’ ability to predict future actions in complex social environments. In the
literature, researchers have employed predictive tasks to address this question
using single-actor stimuli (Graf et al., 2007). Future research will likely focus
on systematically examining the ability of humans to predict biological
motion in social/interactive scenes involving more than one actor.
INVESTIGATING THE PREDICTIVE CODING FRAMEWORK
The previous section focused on experimental investigations into perceptual
and inferential processes involved in understanding biological motion. However, to arrive at a complete account of human action understanding,
these separate influences need to be integrated within a unifying computational framework (van Boxtel & Lu, 2013b). The predictive coding framework
has recently garnered a lot of attention, because it provides a parsimonious
explanation of various problems in autism, including both altered social perception and visual perception (van Boxtel & Lu, 2013b; Friston et al., 2013; Van
de Cruys et al., 2013). This explanation focuses on two important elements
in the predictive coding framework. The first is that there is an imbalance
between bottom-up “sensory” inputs and top-down prediction-driven priors. Different theories point to different causes of the imbalance, emphasizing
either increased/altered sensory processing or decreased high-level (prior)
information, but it should be emphasized that these theories are not mutually
exclusive. In fact, different types of autism may depend on different causes.
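The point that the two proposed causes of the imbalance are not mutually exclusive can be illustrated with a standard Gaussian combination of a prior and a sensory estimate, in which each source is weighted by its precision (inverse variance). The sketch below is an illustration under simplifying assumptions, not a model from the cited papers; the function name `combine` and all numerical values are hypothetical.

```python
def combine(prior_mean, prior_precision, sensory_mean, sensory_precision):
    """Precision-weighted combination of a Gaussian prior and a Gaussian sensory estimate."""
    total_precision = prior_precision + sensory_precision
    sensory_weight = sensory_precision / total_precision
    posterior_mean = prior_mean + sensory_weight * (sensory_mean - prior_mean)
    return posterior_mean, sensory_weight

# Balanced case: prior and sensory input are trusted equally.
balanced = combine(prior_mean=0.0, prior_precision=1.0,
                   sensory_mean=1.0, sensory_precision=1.0)

# Increased sensory precision with an unchanged prior.
strong_sensory = combine(prior_mean=0.0, prior_precision=1.0,
                         sensory_mean=1.0, sensory_precision=4.0)

# Decreased prior precision with unchanged sensory input.
weak_prior = combine(prior_mean=0.0, prior_precision=0.25,
                     sensory_mean=1.0, sensory_precision=1.0)

for label, (mean, weight) in [("balanced", balanced),
                              ("strong sensory", strong_sensory),
                              ("weak prior", weak_prior)]:
    print(f"{label}: posterior mean {mean:.2f}, sensory weight {weight:.2f}")
```

With these particular values, strengthening the sensory signal and weakening the prior shift the estimate toward the sensory input by the same amount, because only the ratio of the two precisions matters here. Telling the two accounts apart therefore requires measurements beyond the final percept.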
The second element in the predictive coding framework is its circular
architecture, where a change in low-level processes is forwarded to a
high-level process, which then produces a different prediction (empirical
prior), followed by subsequent alteration of low-level processing. The
predictive coding framework, because of its circular architecture, is more
complicated than previous models that attribute perceptual deficiencies to either low-level or high-level processing; however, the
predictive coding framework has the advantage of being consistent with
known brain architecture (Mumford, 1992), thus yielding interpretations of
empirical findings that may more closely link social cognition to its neural
substrate. Future work will need to investigate how prior knowledge is
updated with experience, and how it is applied by the brain depending on
the expected precision of the inferred causes (Feldman & Friston, 2010).
There have been recent efforts to explain autism within the framework
of predictive coding (van Boxtel & Lu, 2013b; Friston et al., 2013; Van de
Cruys et al., 2013). These largely theoretical advances need to be put to
empirical tests, and future research should be aimed at providing evidence
for the influence of predictive coding mechanisms in autism. Importantly,
the predictive coding framework can be used to guide research in new
directions, leading to potential insights regarding how the balance between
priors driven by predictions and likelihoods driven by sensory information
affects perception, and perhaps social cognition in general, in mental disease,
and in the general population (e.g., van Boxtel & Lu, 2013a; Rhodes, Jeffery,
Taylor, & Ewing, 2013).
Thus, the predictive coding framework (especially when developed into a
computational model) will allow very specific predictions to be tested and may
allow future research to determine which parts of the framework are related
to perceptual and cognitive deficits, and how such deficits may be counteracted. Experimental and computational work directed at both the perceptual
and inferential levels, as well as their interaction, will be a very fruitful future
endeavor.
CONNECTION TO MENTAL DISORDERS
Apart from the focus on the theoretical predictive coding framework, we
expect more research at the interface between perception and cognition in
the context of mental disorders that affect cognition, especially research that
dissociates the problems related to animacy, causal perception, and intention
inference. As noted earlier, deficits in social perception may result from a
problem in any of these three (or other) mechanisms. Although the emphasis
of the field has been on social deficits in autism, there is evidence that social
cognition in general is very much based on the interplay between all three
processes. For example, the pSTS is involved in the perception of biological
motion (Grossman et al., 2000), while at the same time being an important
hub in the understanding of intention (Frith & Frith, 2003; Gao et al., 2012).
It is also sensitive to other social cues, such as where someone is looking
(Pelphrey, Morris, & McCarthy, 2004).
In fact, the STS may be at the crossroads of perception and the inference
of intention and social cognition (Allison, Puce, & McCarthy, 2000; Castelli,
Happe, Frith, & Frith, 2000; Frith & Frith, 2003; Gao et al., 2012), being
hypoactive in autism (Zilbovicius et al., 2006) and hyperactive in schizophrenia (Backasch et al., 2013). Perhaps the pSTS is important in connecting
the inferred intentions to a certain stimulus in the visual array. Problems
in the attribution of intention are potentially central to the understanding
of autism and schizophrenia. For example, hallucinations in schizophrenia
can be viewed as an “over-attribution” of causation/intentionality. Patients
attribute a cause to a certain percept, or an intention to a certain action, that
did not actually exist (see, e.g., Backasch et al., 2013). In contrast,
people with autism may suffer from a weaker attribution of intentions. For
example, children with autism spectrum disorder (ASD) show deficits in
understanding social intentions in biological motion displays relative to
typically developing children (Centelles, Assaiante, Etchegoyhen, Bouvard,
& Schmitz, 2013). However, they may not necessarily have a deficit in
identifying the observed action (e.g., Saygin, Cook, & Blakemore, 2010).
This type of research, at the crossroads of perception and intention inference, will be a fruitful contribution. With these future directions in mind, we
can look forward to an increased understanding of what separate processes
are essential to the understanding of biological motion stimuli, and how they
work together, based on detailed computational models.
ACKNOWLEDGMENT
This work was supported by NSF grant BCS-0843880 to H. L.
REFERENCES
Allison, T., Puce, A., & McCarthy, G. (2000). Social perception from visual cues: Role
of the STS region. Trends in Cognitive Sciences, 4(7), 267–278.
Backasch, B., Straube, B., Pyka, M., Klohn-Saghatolislam, F., Muller, M. J., Kircher,
T. T., & Leube, D. T. (2013). Hyperintentionality during automatic perception of
naturalistic cooperative behavior in patients with schizophrenia. Social Neuroscience, 8(5), 489–504.
Blake, R., Turner, L. M., Smoski, M. J., Pozdol, S. L., & Stone, W. L. (2003). Visual
recognition of biological motion is impaired in children with autism. Psychological
Science, 14(2), 151–157.
Blakemore, S. J., Boyer, P., Pachot-Clouard, M., Meltzoff, A., Segebarth, C., &
Decety, J. (2003). The detection of contingency and animacy from simple animations in the human brain. Cerebral Cortex, 13(8), 837–844.
van Boxtel, J. J. A., & Lu, H. (2013a). Impaired global, and compensatory local, biological motion processing in people with high levels of autistic traits. Frontiers in
Psychology, 4(209), 1–10.
van Boxtel, J. J. A., & Lu, H. (2013b). A predictive coding perspective on autism spectrum disorders. Frontiers in Psychology, 4(19), 1–3.
Brass, M., Schmitt, R. M., Spengler, S., & Gergely, G. (2007). Investigating action
understanding: Inferential processes versus action simulation. Current Biology,
17(24), 2117–2121.
Castelli, F., Happe, F., Frith, U., & Frith, C. (2000). Movement and mind: A functional
imaging study of perception and interpretation of complex intentional movement
patterns. NeuroImage, 12(3), 314–325.
Centelles, L., Assaiante, C., Etchegoyhen, K., Bouvard, M., & Schmitz, C. (2013). From
action to interaction: Exploring the contribution of body motion cues to social
understanding in typical development and in autism spectrum disorders. Journal
of Autism and Developmental Disorders, 43(5), 1140–1150.
Chambon, V., Domenech, P., Pacherie, E., Koechlin, E., Baraduc, P., & Farrer, C. (2011).
What are they up to? The role of sensory evidence and prior knowledge in action
understanding. PLoS ONE, 6(2), e17133.
Cutting, J. E., & Kozlowski, L. (1977). Recognizing friends by their walk: Gait perception without familiarity cues. Bulletin of the Psychonomic Society, 9, 353–356.
Darwin, C. (1872). The expression of the emotions in man and animals. London, England:
John Murray.
Dennett, D. C. (1987). The intentional stance. Cambridge, MA: MIT Press.
Dittrich, W. H., Troscianko, T., Lea, S. E., & Morgan, D. (1996). Perception of
emotion from dynamic point-light displays represented in dance. Perception, 25(6),
727–738.
Feldman, H., & Friston, K. J. (2010). Attention, uncertainty, and free-energy. Frontiers
in Human Neuroscience, 4, 215.
Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews
Neuroscience, 11(2), 127–138.
Friston, K. J., Lawson, R., & Frith, C. D. (2013). On hyperpriors and hypopriors: Comment on Pellicano and Burr. Trends in Cognitive Sciences, 17(1), 1.
Frith, U., & Frith, C. D. (2003). Development and neurophysiology of mentalising.
Philosophical Transactions of the Royal Society B, 358, 685–694.
Frith, C. D., & Frith, U. (2006). The neural basis of mentalizing. Neuron, 50(4), 531–534.
Gao, T., Scholl, B. J., & McCarthy, G. (2012). Dissociating the detection of intentionality from animacy in the right posterior superior temporal sulcus. The
Journal of Neuroscience, 32(41),
14276–14280.
Graf, M., Reitzner, B., Corves, C., Casile, A., Giese, M., & Prinz, W. (2007). Predicting
point-light actions in real-time. NeuroImage, 36(Suppl 2), T22–T32.
Grossman, E., Donnelly, M., Price, R., Pickens, D., Morgan, V., Neighbor, G., &
Blake, R. (2000). Brain areas involved in perception of biological motion. Journal of
Cognitive Neuroscience, 12(5), 711–720.
Heider, F., & Simmel, M. (1944). An experimental study of apparent behavior. The
American Journal of Psychology, 57, 243–249.
Iacoboni, M., Lieberman, M. D., Knowlton, B. J., Molnar-Szakacs, I., Moritz, M.,
Throop, C. J., & Fiske, A. P. (2004). Watching social interactions produces dorsomedial prefrontal and medial parietal BOLD fMRI signal increases compared to a
resting baseline. NeuroImage, 21(3), 1167–1173.
Kaiser, M. D., & Shiffrar, M. (2009). The visual perception of motion by observers
with autism spectrum disorders: A review and synthesis. Psychonomic Bulletin &
Review, 16(5), 761–777.
Keysers, C., & Gazzola, V. (2007). Integrating simulation and theory of mind: From
self to social cognition. Trends in Cognitive Sciences, 11(5), 194–196.
Kilner, J. M., Friston, K. J., & Frith, C. D. (2007). Predictive coding: An account of the
mirror neuron system. Cognitive Processing, 8(3), 159–166.
Klin, A., Lin, D. J., Gorrindo, P., Ramsay, G., & Jones, W. (2009). Two-year-olds with
autism orient to non-social contingencies rather than biological motion. Nature,
459(7244), 257–261.
Knill, D. C., & Richards, W. (1996). Perception as Bayesian inference. Cambridge, England: Cambridge University Press.
Kozlowski, L., & Cutting, J. E. (1977). Recognizing the sex of a walker from a dynamic
point-light display. Perception & Psychophysics, 21(6), 575–580.
de Lange, F. P., Spronk, M., Willems, R. M., Toni, I., & Bekkering, H. (2008). Complementary systems for understanding action intentions. Current Biology: CB, 18(6),
454–457.
Manera, V., Del Giudice, M., Bara, B. G., Verfaillie, K., & Becchio, C. (2011). The
second-agent effect: Communicative gestures increase the likelihood of perceiving
a second agent. PLoS ONE, 6(7), e22650.
Marr, D. (1982). Vision: A computational approach. San Francisco, CA: Freeman & Co.
Michotte, A. (1946/1963). The perception of causality (T. R. Miles & E. Miles, Trans.). New
York, NY: Basic Books (Original work published 1946).
Mumford, D. (1992). On the computational architecture of the neocortex II. The role
of cortico-cortical loops. Biological Cybernetics, 66(3), 241–251.
Neri, P., Luu, J. Y., & Levi, D. M. (2006). Meaningful interactions can enhance visual
discrimination of human agents. Nature Neuroscience, 9(9), 1186–1192.
Pelphrey, K. A., Morris, J. P., & McCarthy, G. (2004). Grasping the intentions of others: The perceived intentionality of an action influences activity in the superior
temporal sulcus during social perception. Journal of Cognitive Neuroscience, 16(10),
1706–1716.
Premack, D., & Woodruff, G. (1978). Does the chimpanzee have a theory of mind?
Behavioral and Brain Sciences, 1(4), 515–526.
Rhodes, G., Jeffery, L., Taylor, L., & Ewing, L. (2013). Autistic traits are linked to
reduced adaptive coding of face identity and selectively poorer face recognition
in men but not women. Neuropsychologia, 51(13), 2702–2708.
Saygin, A. P., Cook, J., & Blakemore, S. J. (2010). Unaffected perceptual thresholds for
biological and non-biological form-from-motion perception in autism spectrum
conditions. PLoS ONE, 5(10), e13491.
Scholl, B. J., & Tremoulet, P. D. (2000). Perceptual causality and animacy. Trends in
Cognitive Sciences, 4(8), 299–309.
Simion, F., Regolin, L., & Bulf, H. (2008). A predisposition for biological motion in
the newborn baby. Proceedings of the National Academy of Sciences of the United States
of America, 105(2), 809–813.
The HBP-PS Consortium (2012). The human brain project. A Report to the European Commission. Lausanne, Switzerland.
Thurman, S. M., & Lu, H. (2013). Physical and biological constraints govern perceived animacy of scrambled human forms. Psychological Science, 24(7), 1133–1141.
Van de Cruys, S., de-Wit, L., Evers, K., Boets, B., & Wagemans, J. (2013). Weak
priors versus overfitting of predictions in autism: Reply to Pellicano and Burr
(TICS, 2012). i-Perception, 4(2), 95–97.
Zilbovicius, M., Meresse, I., Chabane, N., Brunelle, F., Samson, Y., & Boddaert, N.
(2006). Autism, the superior temporal sulcus and social perception. Trends in Neurosciences, 29(7), 359–366.
FURTHER READING
Blake, R., & Shiffrar, M. (2007). Perception of human motion. Annual Review of Psychology, 58, 47–73.
Feldman, H., & Friston, K. J. (2010). Attention, uncertainty, and free-energy. Frontiers
in Human Neuroscience, 4, 215.
Frith, C. D., & Frith, U. (2006). The neural basis of mentalizing. Neuron, 50(4), 531–534.
Michotte, A. (1946/1963). The perception of causality.
Scholl, B. J., & Tremoulet, P. D. (2000). Perceptual causality and animacy. Trends in
Cognitive Sciences, 4(8), 299–309.
JEROEN J. A. VAN BOXTEL SHORT BIOGRAPHY
Jeroen J. A. van Boxtel (www.jeroenvanboxtel.com) is an Associate Professor
in Cognitive Neuroscience at Monash University in Australia. Previously,
he worked as a postdoctoral fellow at UCLA and at the California Institute
of Technology. He studies biological motion perception and the interaction between attention and consciousness. He showed that attention
and consciousness can have opposite effects on visual perception. Before
working in Australia and the USA, van Boxtel obtained his PhD at Utrecht
University, the Netherlands, working on the topics of binocular rivalry
and motion perception. During his graduate and undergraduate years,
he studied both in the Netherlands and in France. He obtained a Master's
degree in Biology at Utrecht University and a Master's degree in Cognitive
Sciences at the Université Pierre et Marie Curie and the Collège de France
in Paris.
HONGJING LU SHORT BIOGRAPHY
Hongjing Lu (cvl.psych.ucla.edu) is an Associate Professor in the Departments
of Psychology and Statistics at the University of California, Los Angeles (UCLA).
After completing her PhD in Psychology at UCLA, Dr. Lu was a postdoctoral fellow in the Department of Statistics, UCLA, and then an Assistant
Professor at the University of Hong Kong. Dr. Lu joined the UCLA faculty in
2008. Her research integrates computational and psychophysical approaches
to the study of human visual perception and cognition. The basic goal of her
research is to investigate how humans learn and reason, and how intelligent
machines might emulate them. Dr. Lu has a broad background in psychology and statistics, and expertise in designing psychophysical experiments
and developing computational models. She has been the recipient of an NSF
CAREER award and has been PI or coinvestigator on several grants funded
by NSF, ONR, AFOSR, and UCLA.
RELATED ESSAYS
Agency as an Explanatory Key: Theoretical Issues (Sociology), Richard
Biernacki and Tad Skotnicki
Theory of Mind and Behavior (Psychology), Amanda C. Brandone
Mental Models (Psychology), Ruth M. J. Byrne
Spatial Attention (Psychology), Kyle R. Cave
The Inherence Heuristic: Generating Everyday Explanations (Psychology),
Andrei Cimpian
An Evolutionary Perspective on Developmental Plasticity (Psychology),
Sarah Hartman and Jay Belsky
From Individual Rationality to Socially Embedded Self-Regulation (Sociology), Siegwart Lindenberg
Two-Systems View of Children’s Theory-of-Mind Understanding (Psychology), Jason Low
Concepts and Semantic Memory (Psychology), Barbara C. Malt
A Bio-Social-Cultural Approach to Early Cognitive Development: Entering
the Community of Minds (Psychology), Katherine Nelson
Emerging Trends in Culture and Concepts (Psychology), Bethany Ojalehto
and Douglas Medin
Theory of Mind (Psychology), Henry Wellman
-
Understanding Biological Motion
JEROEN J. A. VAN BOXTEL and HONGJING LU
Abstract
The ultimate goal of biological motion perception is to be able to understand actions
so as to provide an answer to the question, “Who did what to whom and why?” This
inference capacity enables humans to go beyond the surface appearance of behavior
in order to successfully interact with others and with the environment. In addition
to its functional importance, understanding biological motion bridges several major
fields, including perception, reasoning, and social cognition. However, despite its
paramount role in human perception and cognition, only limited progress has so far
been made in understanding biological motion. After reviewing the relevant literature, this essay argues that future research needs to identify the contributions of
three basic processes involved in understanding biological motion: perception of
animacy, causality, and intention. The involvement of these basic processes needs
to be investigated both in the typical healthy population as well as in populations
with mental disorders, such as autism spectrum disorders and schizophrenia. We
also suggest that a productive research approach should focus on more interactive
actions of the sort often observed in the natural social environment, rather than solely
using the single-actor displays that have been typical in previous work. It is further emphasized that there is a need for a theoretical and computational framework
within which these different types of processing can be united. We propose that the
predictive coding framework provides a good candidate.
INTRODUCTION
In 1872, in his seminal work demonstrating parallels in the way humans
and animals express emotions, Darwin noted that “actions speak louder
than pictures when it comes to understanding what others are doing and feeling”
(Darwin, 1872). Darwin’s claim is supported by the fact that many animal
species are sensitive to motion patterns generated by other living organisms,
presumably due to the ecological importance of biological motion. Superior
perception for biological motion manifests itself in two pervasive behavioral
characteristics: the robustness of recognizing actions, and sophisticated
inference in understanding them, that is, grasping the intentions of actors.
In support of the first characteristic, numerous psychophysical studies have
demonstrated that human observers show an exquisite ability to accurately
identify attributes of an actor, such as identity (Cutting & Kozlowski, 1977),
emotional state (Dittrich, Troscianko, Lea, & Morgan, 1996), and gender
(Kozlowski & Cutting, 1977), even when the stimulus lacks a detailed human
body form (e.g., a point-light display consisting of only a few discrete dots
representing joint movements). The remarkably rapid, accurate, and robust
perception of biological motion has inspired a great deal of research directed
at understanding how the visual system achieves this perceptual feat in
recognizing actions.
Emerging Trends in the Social and Behavioral Sciences. Edited by Robert Scott and Stephen Kosslyn.
© 2015 John Wiley & Sons, Inc. ISBN 978-1-118-90077-2.
However, the second characteristic—sophisticated inference in understanding biological motion—is arguably more essential for human
perception and cognition. In order to successfully interact with others and
with the environment, the human mind is equipped with the ability to make
inferences that go beyond the surface appearance of behavior. For example,
with a brief glance at the crowd in Times Square in New York, you can not
only recognize which pedestrians are walking at a relaxed pace and which are
running in a hurry but also readily identify people who are interacting with
others (e.g., walking together while having a conversation; shaking hands).
Furthermore, you can predict other people’s actions in the near future (e.g.,
expecting someone to extend a hand to wave goodbye to a friend). The
ultimate goal of biological motion perception, it seems, is to understand and
predict actions in which multiple individuals interact, and to make inferences
about other individuals’ intentions and goals by evaluating their actions.
For several reasons, systematic research on biological motion understanding needs to be pursued with greater vigor. First, the inference capacity of
the human visual system exceeds that of the most advanced machine vision.
For example, in the investigation of the Boston Marathon bombing case in
2013, extensive video from surveillance camera systems was available, but
it was the trained human eye that led to arrests. Hence, understanding how
humans make inferences and predictions about actions will doubtlessly play
an important role in guiding the development of more advanced machine
vision systems.
Second, within the human population, the ability to understand biological
motion varies among individuals. Within the first days of life, human
newborns show selective preference for biological motion (Simion, Regolin,
& Bulf, 2008), supporting the hypothesis that detection of biological motion
is an intrinsic capability of the human visual system. However, this evolutionarily basic ability is impaired for people with disorders such as autism
(Blake, Turner, Smoski, Pozdol, & Stone, 2003; Klin, Lin, Gorrindo, Ramsay,
& Jones, 2009). One of the core symptoms distinguishing autism from
other disorders is lack of ability to infer the meaning of observed actions,
which makes it difficult to carry on effective interactions with others. This
impairment is generally believed to contribute to the severe cognitive and
social consequences of autism in later life (Kaiser & Shiffrar, 2009). Hence,
investigation of the key mechanisms underlying action understanding
may potentially guide the development of behavioral interventions to help
individuals with autism to adopt compensatory strategies.
Third, understanding biological motion plays an essential role in bridging
several important fields, including perception, reasoning, and social cognition. A precursor of biological motion processing is the extraction of motion
information using general motion detectors; hence, biological motion perception offers a window to study the interactions between high-level visual processing and low-level motion processing. The outcome of biological motion
perception must feed into a reasoning system that infers the intentions and
goals of other individuals, yielding social understanding. Currently, relatively little research has addressed the connections between perception and
reasoning, and even less has investigated the further connection to social cognition. Hence, theoretical investigations and empirical tests are both needed
to advance understanding of the perceptual and cognitive architecture that
supports the social human mind.
In the past decades, the vast majority of studies of biological motion have
focused on simple and stereotyped actions, such as walking and running,
which involve a single agent. Little progress has been made in quantifying
what visual information is used in predicting other people’s (inter)actions,
determining how action representations are utilized in these inferential tasks,
and assessing how perception and reasoning operate synergistically to infer
hidden goals and intentions. This essay will focus on how biological motion
enables human agents to effectively interact with objects and other agents in
the environment. We will first review classical and recent work relevant to
understanding biological motion. We then propose a unified computational
framework, and point to future research directions for work on this fundamental issue in perception and cognition.
BUILDING BLOCKS FOR UNDERSTANDING BIOLOGICAL MOTION
In order to carry out effective social interactions, human minds need to
address the fundamental question of “who did what to whom and why?” by
identifying causal relationships between individuals’ actions and inferring
the intentions of individuals. For example, when we observe an interaction
between an individual and an object (e.g., a person throws a ball), what
we really see is a living actor that causes a change in the states of an object
(e.g., positions, moving directions of the ball), in order to achieve certain
subjective goals (e.g., to hit a basketball net). Although the brain performs
this complex analysis with little effort, the process is sophisticated, involving
three distinct, but related, types of analyses: animacy (i.e., perceiving the
moving agent as a living being, and the moving ball as not), causality (i.e.,
inferring that the person is the cause that makes the ball fly), and intention
(i.e., understanding what the actor wants to achieve by his or her action).
ANIMACY AND CAUSAL PERCEPTION
A classical investigation into animacy judgments was performed in the context of the perception of causality (Michotte, 1946/1963). Michotte showed
that moving geometric shapes give rise to a vivid perception of animacy, and
even of meaningful interactions between an animate object and an inanimate
object. Around the same time, Heider and Simmel (1944) showed that such
simple stimuli even yield perceived intentions (e.g., one shape would seem to
“want to catch” another shape). These pioneering studies highlight human
sensitivity to animacy, causality, and intentions, even in stimuli devoid of
regular social cues (such as body movements and facial expressions).
The work by Michotte, and others (Scholl & Tremoulet, 2000), has provided
evidence that animacy can be directly perceived in a stimulus, rather
than inferred from associations between stimulus elements. Specifically,
the perception of animacy is fairly fast, automatic, irresistible, and highly
stimulus-driven (Scholl & Tremoulet, 2000). For example, small changes
in the speed of a moving shape, which are unlikely to change humans’
high-level cognitive inferences about the movement, can nonetheless induce
or abolish a perception of animacy (Michotte, 1946/1963). Accordingly,
Michotte (1946/1963) proposed that some special and automatic mechanisms for analyzing perceptual input are responsible for giving rise to a
“genuine causal impression.” This hypothesis is supported by evidence
showing that processing of such causal interactions is not influenced by
attention (Blakemore et al., 2003).
Hence, two of the building blocks for the understanding of biological
motion—animacy and causal perception—depend on perceptual quantities.
Any change in the stimulus, or in the mechanisms with which the brain
processes these stimuli (e.g., due to brain diseases), could affect the understanding of intentions, and hence the social understanding of the actions.
THE INFERENCE OF INTENTION
The third building block of understanding biological motion is to infer
other people’s intentions, which requires making a connection between the
observed behavior (e.g., body movements) and the inferred mental states
(including intentions) of the actor. This process involves taking what has
been called an intentional stance (Dennett, 1987), and requires the observer to
have extensive knowledge of how actions may arise from a combination of
internal factors. Such a mentalizing process (Frith & Frith, 2006), also termed
theory of mind (Premack & Woodruff, 1978), is largely built on conscious and
unconscious cognitive inferences about intentions of the perceived actor.
Different types of inferential processes are involved in understanding intentions from biological motion. Recently, it has been suggested that intentions
can be divided into goal-directed intentions, and those that require multistep
movements, with an orthogonal dimension of social to nonsocial intentions
(Chambon et al., 2011). It is possible that simple goal-directed intentions (e.g.,
grabbing a mug to drink something) may be directly perceivable or may be
retrieved by internal motor simulation (Brass, Schmitt, Spengler, & Gergely,
2007), without the need for explicit inferences. However, interpretation of
more complex intentions—those that are essential in social contexts—most
likely depends on cognitive inferences (Keysers & Gazzola, 2007; de Lange,
Spronk, Willems, Toni, & Bekkering, 2008). These latter, more complex, inferential processes are classically thought to be at the root of some of the disorders affecting social cognition, such as autism.
In order to make correct inferences about intentions, humans need to
combine directly perceivable information, such as body movements, as
well as animacy and even causality, with prior knowledge about intentions
underlying similar dynamic instances observed previously in social contexts. According to Marr’s three levels of analysis in vision (Marr, 1982), we
need to examine biological motion understanding at the computational, representational, and neural implementation levels. To achieve the balance between
incoming “sensory-driven” information and prior knowledge, Bayesian
inference provides a mathematical framework at the computational level
to understand the function and purposes of biological motion (Knill &
Richards, 1996). The predictive coding framework provides a testable theory
at the representation and implementation levels to make connections to
neuroscience and behavioral studies (Friston, 2010), which will be discussed
in detail as one of the emerging trends.
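As a minimal sketch of the computational-level idea, the snippet below applies Bayes' rule to two candidate intentions behind an observed grasp of a mug. The hypotheses, the prior and likelihood values, and the function name `posterior` are illustrative assumptions, not quantities reported in the literature reviewed here.

```python
def posterior(prior, likelihood):
    """Combine prior beliefs with sensory evidence using Bayes' rule.

    prior:      P(intention), prior knowledge from earlier social contexts
    likelihood: P(observed movement | intention), the sensory-driven term
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(observed movement)
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical numbers, chosen only for illustration.
prior = {"drink": 0.7, "clear table": 0.3}
likelihood = {"drink": 0.2, "clear table": 0.6}

beliefs = posterior(prior, likelihood)
for intention, probability in beliefs.items():
    print(f"{intention}: {probability:.4f}")
```

With these made-up numbers, the movement evidence outweighs the prior expectation, so the intention that was less expected beforehand ends up as the more probable one.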
PREDICTIVE CODING FRAMEWORK FOR UNDERSTANDING
BIOLOGICAL MOTION
In a predictive coding framework, the brain aims to explain incoming
sensory data by making inferences about the potential causes of the sensory
data. These predictions (in computational terms, “empirical priors”) are
sent down to a processing level primarily driven by sensory information,
where they are subtracted from the incoming information. If the prediction
is inaccurate, a large residual of the sensory data (i.e., the prediction error)
remains unexplained. This information about the discrepancy between
prediction and observation is sent back to the higher (less sensory) levels in
order to adjust the prediction. If the prediction is reasonably accurate, and
only a small error signal (prediction error) remains, the system can conclude
that the predicted (i.e., inferred) causes were indeed present. Such loops
are present in both early sensory levels and high-level brain areas, where
intentions are processed. Therefore, depending on the stage within the visual
system, inferred causes consist of intentions at higher levels of hierarchical
processes; while at lower levels, they are more mechanical causes (e.g., a ball
hitting another ball), or simply the presence of a stimulus. Therefore, within
the predictive coding framework, there is no major difference between
perceptual and cognitive inferences. The “perception” of the presence of a
stimulus relies on an identical computational schema as the “inference” of
the presence of an intention.
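The loop described above can be sketched for a single processing level as follows. This is a deliberately simplified, error-driven update under assumed values; it is not a full free-energy or hierarchical implementation, and the function `settle`, the linear mapping, and the learning rate are hypothetical choices made for illustration.

```python
def settle(observation, generate, estimate=0.0, rate=0.1, steps=200):
    """Adjust an inferred cause until its prediction explains the input.

    generate: assumed top-down mapping from an inferred cause to the
              sensory value that cause would be expected to produce.
    """
    errors = []
    for _ in range(steps):
        prediction = generate(estimate)   # prediction sent down to the sensory level
        error = observation - prediction  # residual that the prediction leaves unexplained
        estimate += rate * error          # error sent back up to adjust the inferred cause
        errors.append(abs(error))
    return estimate, errors


# Illustrative generative mapping: a cause of size c predicts an input of 2 * c.
estimate, errors = settle(observation=4.0, generate=lambda c: 2.0 * c)
print(f"inferred cause: {estimate:.3f}")
print(f"first error: {errors[0]:.3f}, final error: {errors[-1]:.2e}")
```

Once the error signal has shrunk toward zero, the estimate can be read as the inferred cause; the same schema would apply whether the cause stands for the presence of a stimulus or for an intention.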
Biological motion understanding involves a hierarchical system involving
different stages from low-level processing to higher level processing, that is,
local motion processing, biological motion processing, animacy and causal
perception, and intention inference. When a problem occurs at higher level
processing, this could lead to difficulties in inferring intentions (Kilner, Friston, & Frith, 2007), which in turn may cause an incorrect prediction of what
the observed actor may do in the future. These wrong predictions will differ greatly from the input and create a large prediction error, which requires
additional processing at the lower level to resolve the disagreement between
predictions and observations. Hence, malfunction at a higher level could
cause an increased demand for attention to lower-level elements (i.e.,
details), which could account for some behavioral atypicalities observed in
autism (van Boxtel & Lu, 2013b; Friston, Lawson, & Frith, 2013; Van de Cruys,
de-Wit, Evers, Boets, & Wagemans, 2013). Similarly, when lower level processors do not function in a typical way, this could cause inaccurate perception
of animacy, which could in turn affect how well an observer can infer the
intentions of the observed actor.
FUTURE DIRECTIONS
In this section, we identify three main themes that may guide future research
investigating how humans achieve deep understanding of biological motion
for effective social interactions.
RELATIONSHIP BETWEEN PERCEPTUAL AND COGNITIVE INFERENTIAL PROCESSES
To achieve deep understanding via biological motion, the human mind relies
on a sophisticated interplay between three tightly interrelated basic processes:
perception of animacy, causality, and inference of intention. In future studies, it will therefore be important to separate different contributions of these
individual processes to biological motion understanding. This type of work is
important, both for determining the constraints on the inference of intentions
and for quantifying the contributions of each process in accounting for individual differences. Without recognizing the distinction between these three
basic processes, it would be difficult to understand the origins and determinants of certain psychological disorders that involve problems in social
cognition and to develop principled intervention programs.
Isolating these distinct factors is not easy. For example, in certain experimental paradigms, a stimulus manipulation intended to change the intention
inference process may instead change the perception of animacy, resulting
(indirectly) in a change in inferred intentions. This potential confound was
recently addressed by Gao and colleagues (Gao, Scholl, & McCarthy, 2012) in
a psychophysical study in which moving geometric shapes were perceived
to be animate and intentional. By manipulating the stimulus carefully, the
researchers were able to maintain the same level of perceived animacy, while
varying the perceived intentions of the shape elements. They showed that
some areas [such as the posterior superior temporal sulcus (pSTS)], which
were previously thought to be involved in the perception of animacy, are
actually more related to the detection of intentionality. Their study demonstrates the promise of developing visual stimuli to discriminate inferential
processes of intention from perceptual processes involving animacy and
causality. We consider studies that demarcate the influence of animacy,
causality, and intention to represent an important future research direction.
Meanwhile, realistic action stimuli will also be important because natural
social stimuli are more complex than the combined movements of simple
geometric shapes. Several studies have demonstrated that the
perception of animacy depends on the correct relationship between internal
and external movements (Michotte, 1946/1963; Thurman & Lu, 2013). For
example, an actor will look less animate when he or she moves too fast in
relation to how fast his or her extremities move. Such complex interactions between
internal and external cues in biological motion stimuli will need to be
researched more thoroughly.
Finally, there is an emerging trend to investigate how social scenarios are
perceived using controlled biological motion stimuli. Although inferences
about intentions are especially important in social/interactive contexts,
research on this topic is challenging due to the complexity of this inferential
task. Identifying interactions between actors is obviously more complicated
than perceiving the action of a single individual, and recent imaging work
has indeed shown that additional brain structures (including the STS, and
more frontal areas) are recruited when an interactive scene is analyzed
(Backasch et al., 2013; Iacoboni et al., 2004). Systematic investigations of
intention inference are needed to deepen our understanding of this important problem in human perception and cognition. Indeed, it is one of the
stated goals of the Human Brain Project (2012, p. 34), a major scientific
research project sponsored by the European Union. The initial research
forays in this direction have yielded interesting data. For example, psychophysical experiments have provided evidence of top-down influences
of interactive information at very early levels of processing in the visual
system. When observers had to detect a point-light actor in a noisy background,
detection was easier when the actor was engaged in a physical interaction with
a second actor (i.e., a dancing or boxing partner) (Neri, Luu, & Levi, 2006).
A similar finding is reported (Manera, Del Giudice, Bara, Verfaillie, & Becchio, 2011) for communicative interactions (e.g., hand gestures). Therefore,
interactive information is able to impact perception at very early stages
within the visual hierarchy. In addition, more research is needed to examine
how well humans can predict future actions in complex social environments. So
far, researchers have employed predictive tasks to address this question
using single-actor stimuli (Graf et al., 2007). Future research will likely focus
on systematically examining the ability of humans to predict biological
motion in social/interactive scenes involving more than one actor.
INVESTIGATING THE PREDICTIVE CODING FRAMEWORK
The previous section focused on experimental investigations into perceptual
and inferential processes involved in understanding biological motion. However, to arrive at a complete account of human action understanding,
these separate influences need to be integrated within a unifying computational framework (van Boxtel & Lu, 2013b). The predictive coding framework
has recently garnered a lot of attention, because it provides a parsimonious
explanation of various problems in autism, including both altered social perception and visual perception (van Boxtel & Lu, 2013b; Friston et al., 2013; Van
de Cruys et al., 2013). This explanation focuses on two important elements
in the predictive coding framework. The first is that there is an imbalance
between bottom-up “sensory” inputs and top-down prediction-driven priors. Different theories point to different causes of the imbalance, emphasizing
either increased/altered sensory processing or decreased high-level (prior)
information, but it should be emphasized that these theories are not mutually
exclusive. In fact, different types of autism may depend on different causes.
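The two proposed sources of imbalance can be illustrated with a standard precision-weighted combination of a Gaussian prior and a Gaussian sensory likelihood. The sketch below assumes this textbook formulation; the function name `combine` and all numerical values are hypothetical and are not taken from the cited theories.

```python
def combine(prior_mean, prior_precision, sensory_mean, sensory_precision):
    """Posterior mean and precision for a Gaussian prior and Gaussian likelihood.

    Precision is the inverse of variance; the source with the higher
    precision pulls the posterior mean more strongly toward itself.
    """
    total = prior_precision + sensory_precision
    mean = (prior_precision * prior_mean + sensory_precision * sensory_mean) / total
    return mean, total


# Prior expectation at 0.0, sensory evidence at 1.0 (arbitrary units).
typical, _ = combine(0.0, 4.0, 1.0, 1.0)       # strong prior, ordinary input
weak_prior, _ = combine(0.0, 0.5, 1.0, 1.0)    # decreased high-level (prior) information
sharp_input, _ = combine(0.0, 4.0, 1.0, 16.0)  # increased sensory precision

print(f"typical: {typical:.3f}")
print(f"weak prior: {weak_prior:.3f}")
print(f"sharp input: {sharp_input:.3f}")
```

In both altered cases the estimate moves toward the sensory evidence, which is one way to see why the two kinds of account are hard to separate from behavior alone and need not be mutually exclusive.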
The second element in the predictive coding framework is its circular
architecture, where a change in low-level processes is forwarded to a
high-level process, which then produces a different prediction (empirical
prior), followed by subsequent alteration of low-level processing. The
predictive coding framework, because of its circular architecture, is more
complicated than previous models that attribute perceptual deficiencies to either low-level or high-level processing; however, the
predictive coding framework has the advantage of being consistent with
known brain architecture (Mumford, 1992), thus yielding interpretations of
empirical findings that may more closely link social cognition to its neural
substrate. Future work will need to investigate how prior knowledge is
updated with experience, and how it is applied by the brain depending on
the expected precision of the inferred causes (Feldman & Friston, 2010).
There have been recent efforts to explain autism within the framework
of predictive coding (van Boxtel & Lu, 2013b; Friston et al., 2013; Van de
Cruys et al., 2013). These largely theoretical advances need to be put to
empirical tests, and future research should be aimed at providing evidence
for the influence of predictive coding mechanisms in autism. Importantly,
the predictive coding framework can be used to guide research in new
directions, leading to potential insights regarding how the balance between
priors driven by predictions and likelihoods driven by sensory information
affects perception, and perhaps social cognition in general, in mental disease,
and in the general population (e.g., van Boxtel & Lu, 2013a; Rhodes, Jeffery,
Taylor, & Ewing, 2013).
Thus, the predictive coding framework (especially when developed into a
computational model) will allow very specific predictions to be tested and may
allow future research to determine which parts of the framework are related
to perceptual and cognitive deficits, and how such deficits may be counteracted. Experimental and computational work directed at both the perceptual
and inferential levels, as well as their interaction, will be a very fruitful future
endeavor.
CONNECTION TO MENTAL DISORDERS
Apart from the focus on the theoretical predictive coding framework, we
expect more research at the interface between perception and cognition in
the context of mental disorders that affect cognition, especially research that
dissociates the problems related to animacy, causal perception, and intention
inference. As noted earlier, deficits in social perception may result from a
problem in any of these three (or other) mechanisms. Although the emphasis
of the field has been on social deficits in autism, there is evidence that social
cognition in general is very much based on the interplay between all three
processes. For example, the pSTS is involved in the perception of biological
motion (Grossman et al., 2000), while at the same time being an important
hub in the understanding of intention (Frith & Frith, 2003; Gao et al., 2012).
It is also sensitive to other social cues, such as where someone is looking
(Pelphrey, Morris, & McCarthy, 2004).
In fact, the STS may be at the crossroads of perception and the inference
of intention and social cognition (Allison, Puce, & McCarthy, 2000; Castelli,
Happe, Frith, & Frith, 2000; Frith & Frith, 2003; Gao et al., 2012), being
hypoactive in autism (Zilbovicius et al., 2006) and hyperactive in schizophrenia (Backasch et al., 2013). Perhaps, the pSTS is important in connecting
the inferred intentions to a certain stimulus in the visual array. Problems
in the attribution of intention are potentially central to the understanding
of autism and schizophrenia. For example, hallucinations in schizophrenia
can be viewed as an “over-attribution” of causation/intentionality. Patients
attribute a cause to a certain percept, or an intention to a certain action, that
did not actually exist (see, e.g., Backasch et al., 2013). In contrast,
people with autism may suffer from a weaker attribution of intentions. For
example, children with autism spectrum disorder (ASD) show deficits in
understanding social intentions in biological motion displays relative to
typically developing children (Centelles, Assaiante, Etchegoyhen, Bouvard,
& Schmitz, 2013). However, they may not necessarily have a deficit in
identifying the observed action (e.g., Saygin, Cook, & Blakemore, 2010).
This type of research, at the crossroads of perception and intention inference, will be a fruitful contribution. With these future directions in mind, we
can look forward to an increased understanding of which separate processes
are essential to the understanding of biological motion stimuli, and how they
work together, based on detailed computational models.
ACKNOWLEDGMENT
This work was supported by NSF grant BCS-0843880 to H. L.
REFERENCES
Allison, T., Puce, A., & McCarthy, G. (2000). Social perception from visual cues: Role
of the STS region. Trends in Cognitive Sciences, 4(7), 267–278.
Backasch, B., Straube, B., Pyka, M., Klohn-Saghatolislam, F., Muller, M. J., Kircher,
T. T., & Leube, D. T. (2013). Hyperintentionality during automatic perception of
naturalistic cooperative behavior in patients with schizophrenia. Social Neuroscience, 8(5), 489–504.
Blake, R., Turner, L. M., Smoski, M. J., Pozdol, S. L., & Stone, W. L. (2003). Visual
recognition of biological motion is impaired in children with autism. Psychological
Science, 14(2), 151–157.
Blakemore, S. J., Boyer, P., Pachot-Clouard, M., Meltzoff, A., Segebarth, C., &
Decety, J. (2003). The detection of contingency and animacy from simple animations in the human brain. Cerebral Cortex, 13(8), 837–844.
van Boxtel, J. J. A., & Lu, H. (2013a). Impaired global, and compensatory local, biological motion processing in people with high levels of autistic traits. Frontiers in
Psychology, 4(209), 1–10.
van Boxtel, J. J. A., & Lu, H. (2013b). A predictive coding perspective on autism spectrum disorders. Frontiers in Psychology, 4(19), 1–3.
Brass, M., Schmitt, R. M., Spengler, S., & Gergely, G. (2007). Investigating action
understanding: Inferential processes versus action simulation. Current Biology,
17(24), 2117–2121.
Castelli, F., Happe, F., Frith, U., & Frith, C. (2000). Movement and mind: A functional
imaging study of perception and interpretation of complex intentional movement
patterns. NeuroImage, 12(3), 314–325.
Centelles, L., Assaiante, C., Etchegoyhen, K., Bouvard, M., & Schmitz, C. (2013). From
action to interaction: Exploring the contribution of body motion cues to social
understanding in typical development and in autism spectrum disorders. Journal
of Autism and Developmental Disorders, 43(5), 1140–1150.
Chambon, V., Domenech, P., Pacherie, E., Koechlin, E., Baraduc, P., & Farrer, C. (2011).
What are they up to? The role of sensory evidence and prior knowledge in action
understanding. PLoS ONE, 6(2), e17133.
Cutting, J. E., & Kozlowski, L. (1977). Recognizing friends by their walk: Gait perception without familiarity cues. Bulletin of the Psychonomic Society, 9, 353–356.
Darwin, C. (1872). The expression of the emotions in man and animals. London, England:
John Murray.
Dennett, D. C. (1987). The intentional stance. Cambridge, MA: MIT Press.
Dittrich, W. H., Troscianko, T., Lea, S. E., & Morgan, D. (1996). Perception of
emotion from dynamic point-light displays represented in dance. Perception, 25(6),
727–738.
Feldman, H., & Friston, K. J. (2010). Attention, uncertainty, and free-energy. Frontiers
in Human Neuroscience, 4, 215.
Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews
Neuroscience, 11(2), 127–138.
Friston, K. J., Lawson, R., & Frith, C. D. (2013). On hyperpriors and hypopriors: Comment on Pellicano and Burr. Trends in Cognitive Sciences, 17(1), 1.
Frith, U., & Frith, C. D. (2003). Development and neurophysiology of mentalising.
Philosophical Transactions of the Royal Society B, 358, 685–694.
Frith, C. D., & Frith, U. (2006). The neural basis of mentalizing. Neuron, 50(4), 531–534.
Gao, T., Scholl, B. J., & McCarthy, G. (2012). Dissociating the detection of intentionality from animacy in the right posterior superior temporal sulcus. The
Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 32(41),
14276–14280.
Graf, M., Reitzner, B., Corves, C., Casile, A., Giese, M., & Prinz, W. (2007). Predicting
point-light actions in real-time. NeuroImage, 36(Suppl 2), T22–T32.
Grossman, E., Donnelly, M., Price, R., Pickens, D., Morgan, V., Neighbor, G., &
Blake, R. (2000). Brain areas involved in perception of biological motion. Journal of
Cognitive Neuroscience, 12(5), 711–720.
Heider, F., & Simmel, M. (1944). An experimental study of apparent behavior. The
American Journal of Psychology, 57, 243–249.
Iacoboni, M., Lieberman, M. D., Knowlton, B. J., Molnar-Szakacs, I., Moritz, M.,
Throop, C. J., & Fiske, A. P. (2004). Watching social interactions produces dorsomedial prefrontal and medial parietal BOLD fMRI signal increases compared to a
resting baseline. NeuroImage, 21(3), 1167–1173.
Kaiser, M. D., & Shiffrar, M. (2009). The visual perception of motion by observers
with autism spectrum disorders: A review and synthesis. Psychonomic Bulletin &
Review, 16(5), 761–777.
Keysers, C., & Gazzola, V. (2007). Integrating simulation and theory of mind: From
self to social cognition. Trends in Cognitive Sciences, 11(5), 194–196.
Kilner, J. M., Friston, K. J., & Frith, C. D. (2007). Predictive coding: An account of the
mirror neuron system. Cognitive Processing, 8(3), 159–166.
Klin, A., Lin, D. J., Gorrindo, P., Ramsay, G., & Jones, W. (2009). Two-year-olds with
autism orient to non-social contingencies rather than biological motion. Nature,
459(7244), 257–261.
Knill, D. C., & Richards, W. (1996). Perception as Bayesian inference. Cambridge, England: Cambridge University Press.
Kozlowski, L., & Cutting, J. E. (1977). Recognizing the sex of a walker from a dynamic
point-light display. Perception & Psychophysics, 21(6), 575–580.
de Lange, F. P., Spronk, M., Willems, R. M., Toni, I., & Bekkering, H. (2008). Complementary systems for understanding action intentions. Current Biology: CB, 18(6),
454–457.
Manera, V., Del Giudice, M., Bara, B. G., Verfaillie, K., & Becchio, C. (2011). The
second-agent effect: Communicative gestures increase the likelihood of perceiving
a second agent. PLoS ONE, 6(7), e22650.
Marr, D. (1982). Vision: A computational approach. San Francisco, CA: Freeman & Co.
Michotte, A. (1946/1963). The perception of causality (T. R. Miles & E. Miles, Trans.). New
York, NY: Basic Books (Original work published 1946).
Mumford, D. (1992). On the computational architecture of the neocortex II. The role
of cortico-cortical loops. Biological Cybernetics, 66(3), 241–251.
Neri, P., Luu, J. Y., & Levi, D. M. (2006). Meaningful interactions can enhance visual
discrimination of human agents. Nature Neuroscience, 9(9), 1186–1192.
Pelphrey, K. A., Morris, J. P., & McCarthy, G. (2004). Grasping the intentions of others: The perceived intentionality of an action influences activity in the superior
temporal sulcus during social perception. Journal of Cognitive Neuroscience, 16(10),
1706–1716.
Premack, D., & Woodruff, G. (1978). Does the chimpanzee have a theory of mind?
Behavioral and Brain Sciences, 1(4), 515–526.
Rhodes, G., Jeffery, L., Taylor, L., & Ewing, L. (2013). Autistic traits are linked to
reduced adaptive coding of face identity and selectively poorer face recognition
in men but not women. Neuropsychologia, 51(13), 2702–2708.
Saygin, A. P., Cook, J., & Blakemore, S. J. (2010). Unaffected perceptual thresholds for
biological and non-biological form-from-motion perception in autism spectrum
conditions. PLoS ONE, 5(10), e13491.
Scholl, B. J., & Tremoulet, P. D. (2000). Perceptual causality and animacy. Trends in
Cognitive Sciences, 4(8), 299–309.
Simion, F., Regolin, L., & Bulf, H. (2008). A predisposition for biological motion in
the newborn baby. Proceedings of the National Academy of Sciences of the United States
of America, 105(2), 809–813.
The HBP-PS Consortium (2012). The human brain project. A Report to the European Commission. Lausanne, Switzerland.
Thurman, S. M., & Lu, H. (2013). Physical and biological constraints govern perceived animacy of scrambled human forms. Psychological Science, 24(7), 1133–1141.
Van de Cruys, S., de-Wit, L., Evers, K., Boets, B., & Wagemans, J. (2013). Weak
priors versus overfitting of predictions in autism: Reply to Pellicano and Burr
(TICS, 2012). i-Perception, 4(2), 95–97.
Zilbovicius, M., Meresse, I., Chabane, N., Brunelle, F., Samson, Y., & Boddaert, N.
(2006). Autism, the superior temporal sulcus and social perception. Trends in Neurosciences, 29(7), 359–366.
FURTHER READING
Blake, R., & Shiffrar, M. (2007). Perception of human motion. Annual Review of Psychology, 58, 47–73.
Feldman, H., & Friston, K. J. (2010). Attention, uncertainty, and free-energy. Frontiers
in Human Neuroscience, 4, 215.
Frith, C. D., & Frith, U. (2006). The neural basis of mentalizing. Neuron, 50(4), 531–534.
Michotte, A. (1946/1963). The perception of causality (T. R. Miles & E. Miles, Trans.). New York, NY: Basic Books (Original work published 1946).
Scholl, B. J., & Tremoulet, P. D. (2000). Perceptual causality and animacy. Trends in
Cognitive Sciences, 4(8), 299–309.
JEROEN J. A. VAN BOXTEL SHORT BIOGRAPHY
Jeroen J. A. van Boxtel (www.jeroenvanboxtel.com) is an Associate Professor
in Cognitive Neuroscience at Monash University in Australia. Previously,
he worked as a postdoctoral fellow at UCLA and at the California Institute
of Technology. He studies biological motion perception and the interaction between attention and consciousness. He showed that attention
and consciousness can have opposite effects on visual perception. Before
working in Australia and the USA, van Boxtel obtained his PhD at Utrecht
University, the Netherlands, working on the topics of binocular rivalry
and motion perception. During his graduate and undergraduate years,
he studied both in the Netherlands and in France. He obtained a Master's
degree in Biology at Utrecht University, and a Master's degree in Cognitive
Sciences at the Université Pierre et Marie Curie and the Collège de France,
in Paris.
HONGJING LU SHORT BIOGRAPHY
Hongjing Lu (cvl.psych.ucla.edu) is an Associate Professor in the Departments
of Psychology and Statistics at the University of California, Los Angeles (UCLA).
After completing her PhD in Psychology at UCLA, Dr. Lu was a postdoctoral fellow in the Department of Statistics, UCLA, and then an Assistant
Professor at the University of Hong Kong. Dr. Lu joined the UCLA faculty in
2008. Her research integrates computational and psychophysical approaches
to the study of human visual perception and cognition. The basic goal of her
research is to investigate how humans learn and reason, and how intelligent
machines might emulate them. Dr. Lu has a broad background in psychology and statistics, and expertise in designing psychophysical experiments
and developing computational models. She has been the recipient of an NSF
CAREER award and has been PI or coinvestigator on several grants funded
by NSF, ONR, AFOSR, and UCLA.
RELATED ESSAYS
Agency as an Explanatory Key: Theoretical Issues (Sociology), Richard
Biernacki and Tad Skotnicki
Theory of Mind and Behavior (Psychology), Amanda C. Brandone
Mental Models (Psychology), Ruth M. J. Byrne
Spatial Attention (Psychology), Kyle R. Cave
The Inherence Heuristic: Generating Everyday Explanations (Psychology),
Andrei Cimpian
An Evolutionary Perspective on Developmental Plasticity (Psychology),
Sarah Hartman and Jay Belsky
From Individual Rationality to Socially Embedded Self-Regulation (Sociology), Siegwart Lindenberg
Two-Systems View of Children’s Theory-of-Mind Understanding (Psychology), Jason Low
Concepts and Semantic Memory (Psychology), Barbara C. Malt
A Bio-Social-Cultural Approach to Early Cognitive Development: Entering
the Community of Minds (Psychology), Katherine Nelson
Emerging Trends in Culture and Concepts (Psychology), Bethany Ojalehto
and Douglas Medin
Theory of Mind (Psychology), Henry Wellman
