What is learning?

Most psychologists (indeed, most people in general) would agree that learning is the acquisition of new knowledge, or new behaviors, or new skills. Hungarian psychologists Gergely and Csibra offer a deceptively simple description: "Learning involves acquiring new information and using it later when necessary." What this means is that learning requires the generalization of information to new situations - new people, objects, locations, or events. The problem is that any particular piece of information that a human or animal receives is situated within a particular context. Learning theorists refer to this as the problem of induction. Most learning theories invoke statistical learning mechanisms to account for this: as infants or animals have experiences in the world, they can identify correlations among events or encounters, and use those statistical correlations to form the basis of generalizations for novel events or encounters. However, this does not explain the situations in which infants rapidly learn information after only one or a few instances - certainly not enough time for any statistical learning mechanism to provide reliable information. Human communication might provide a shortcut.

Gergely and Csibra offer the following examples:

If I point at two aeroplanes and tell you that 'aeroplanes fly', what you learn is not restricted to the particular aeroplanes you see or to the present context, but will provide you generic knowledge about the kind of artefact these planes belong to that is generalizable to other members of the category and to variable contexts... If I show you by manual demonstration how to open a milk carton, what you will learn is how to open that kind of container (i.e. you acquire kind-generalizable knowledge from a single manifestation). In such cases, the observer does not need to rely on statistical procedures to extract the relevant information to be generalized because this is selectively manifested to her by the communicative demonstration.

The key here is that the learner does not need to statistically infer the generalizable information. Rather, the generalizability of the information is indicated within the communicative interaction itself. You don't tell the child "that airplane is flying"; you say "airplanes fly." Nor is this sort of teaching restricted to linguistic communication, as the milk carton demonstration shows.

What Gergely and Csibra are hypothesizing is that human communication is an evolutionary adaptation designed to aid in the transmission of generic knowledge between individuals. Specifically, they speculate that, during hominin evolution, the emergence of tool-making created selection pressure for the capacity to communicate generic knowledge. The argument is that observational learning mechanisms would not be sufficient for the cognitively opaque process of making and using tools.

What does this mean?

Chimpanzees use tools. This was once a somewhat surprising revelation, but it is not so surprising anymore. Their tool use, however, is limited in important ways. They choose suitable tools for a given task from the immediate surroundings, sometimes modifying them, and then they generally discard the tool after they're done with it. In a sense, they're using tools as answers to the question "what object could I use to achieve this specific goal?" One common example of tool use in chimps involves using two objects as hammer and anvil to break apart nuts. Watch the juvenile chimp in this video learn about this process from her mother (that segment begins around 2:15):

Early humans may have had a slight shift in the way they thought about tools. Tools were kept rather than discarded, and often stored in particular locations. Tools could be made at one place, and carried to another place to be used. Rather than asking "what object can I use to achieve this specific goal," as a chimpanzee would, the human might ask, "for what purpose might I use this object?"

The problem is that any new member of a given culture (such as a child) would have to learn the function of tools. Trial-and-error is a slow and somewhat clunky process, and it might lead to various useful ways of interacting with tools, but probably not the intended use of a given tool. Trial-and-error is also unlikely to reveal the function of tools on other tools (such as a screwdriver and a screw, unless you have both tools in front of you), or the future function of a given tool in a different place or context.

A social learning mechanism such as imitation can get you part of the way there - and, indeed, in chimpanzees and other non-human animals it does. One could observe another individual use a tool and infer the function of the tool from the outcome. But this sort of learning mechanism is also limited: you need to observe an immediately obvious outcome in order to determine the goal of a given set of behaviors.

But even simple observation and imitation won't entirely solve the problem. For example, imagine someone using a tool to carve a piece of wood. What is the goal of this behavior? To take a big piece of wood and turn it into smaller pieces of wood? To make sounds? To make a carving? Without some prior knowledge of the tool, it is difficult to figure out what it is used for.

Or for another example, what is this?

When I asked on Twitter, I got responses ranging from "bottle opener" to something with which to "beat the snot out of someone" else (twice). One person thought it could be used to measure something, and another thought it was a strange cookie cutter. One guess confused even me. Surely anybody could come up with a dozen potential uses for the item, but there is only one function that it was designed to fulfill: it's an antique pot cover lifter, designed to remove hot lids from their bases. You thread your fingers through it, and use it as a hook (if the lid has a handle), or you wedge the cover in there to lift it away.

Observing someone's behavior isn't as straightforward as you might think. Behavior can always be explained by an infinite combination of mental states, goals, and background knowledge, and is rarely (if ever) transparent with respect to the goals of a given action or the background knowledge that informs that action. This problem could be solved, however, if the tool user makes some of this information explicit. Some aspects of a behavior can be emphasized and others can be ignored, and products can be distinguished from by-products. But the learner must be receptive to this information for learning to take place at all.

Is it possible that evolution has prepared humans to learn generalizable information? Gergely and Csibra think so. They hypothesize that a specialized innate pedagogy mechanism (the pedagogical learning stance) is in place that allows an individual to remember generic information, which becomes generalizable to other contexts. A cognitive system like this requires three things. First, the learner must understand the communicative intent of the teacher via ostensive cues. Second, the teacher and learner must be able to use referential signals (things like eyegaze and pointing) to facilitate joint attention on a given object or location. Third, the learner must be able to comprehend the information content of the interaction; they must assume they are getting relevant information.

For their hypothesis to hold, infants should be sensitive to ostensive cues. In other words, they need to know that they are being addressed. The developmental psychology literature is rife with evidence that infants indeed possess this ability. For example, infants prefer to look at faces with directed gaze over faces with averted gaze. Further, the infant brain responds to a smile from another individual only if there is mutual eye-contact, and not if the smiler is looking elsewhere. Another ostensive cue is infant-directed speech, also known as baby-talk or "motherese." Newborn infants prefer listening to infant-directed speech over adult-directed speech. One particularly fascinating line of research has demonstrated that parents also modify their actions, not just their speech, when engaged in a pedagogical interaction with their children, and infants prefer this "motionese" to adult-directed motion (this has also been found in macaques!). This fulfills the first requirement: that the learner must identify the communicative intent of the teacher.

The second requirement, that learners must understand the referential signals provided by their teachers, is also fairly straightforward. Preverbal infants aren't able to use linguistic information (or really, any symbolic system) in a robust way, but they are able to use actions such as pointing or the shifting of eyegaze towards an object in order to facilitate shared attention. Infants follow the gaze of social partners from very early on in development (as early as three months), and moreover, they are more likely to do so if the gaze-shift is preceded by an ostensive signal such as eye-contact or infant-directed speech. In other words, first the teacher must get the attention of the learner using ostensive cues, and then the teacher must redirect the learner's attention to a particular object or location using referential signals. This fulfills the second requirement: that the learner must be able to interpret the referential signals provided by the teacher.

The third requirement is that learners must understand that they are going to learn generic information. In other words, children would expect to learn something generalizable when in the context of ostensive-referential communication, rather than simply gaining episodic facts that pertain only to the specific context in which the social interaction occurs. Gergely and Csibra point out that this is what separates their hypothesis from other competing hypotheses (such as that of Michael Tomasello), which suggest that human communication derives from the desire to cooperate with others in order to achieve shared goals. If the natural pedagogy hypothesis is correct, then infants should treat generic information about an object (such as an object's visual appearance) differently than episodic information about that same object (such as an object's location). One recent study provided evidence to support this. In a non-communicative context, infants are more likely to notice a change in an object's location than in its appearance. That is, they are giving preferential attention to episodic here-and-now information. However, when provided ostensive-referential communication, they are more likely to notice a change in an object's identity than in its location - they are attending to generic information rather than episodic information. This fulfills the third requirement: that the ostensive cues and referential signals prepare the infant to learn generalizable information from the teacher - they put the infant into "learning mode."

This is all very good evidence that humans do have a form of natural pedagogy, and that it is innate. But in order to make the case for pedagogy to be an evolutionary adaptation in the hominin lineage, as Gergely and Csibra are claiming, three additional types of support are necessary: (1) that natural pedagogy is human-specific, (2) that natural pedagogy is universal among human cultures, and (3) that this sort of human social communication was explicitly selected for in evolution, rather than having emerged as a by-product of some other selection.

The next set of posts in this series will address these questions.

See Part 1: Perseverating on Perseverative Error: What Does The "A-not-B Error" Really Tell Us About Infant Cognition?

For more on social learning:

How Do You Figure Out How Chimps Learn? Peanuts. and More on Chimpanzees and Peanuts

Ed Tronick and the "Still-Face Experiment"

Csibra, G., & Gergely, G. (2009). Natural pedagogy. Trends in Cognitive Sciences, 13(4), 148-153. DOI: 10.1016/j.tics.2009.01.005

Image via Flickr/19melissa68