Social-cognitive basis of language development


  • corresponding author Michael Tomasello - Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany

Published: September 7, 2009

Linguistic communication is a form of social interaction. It is a special form because it is mediated by linguistic symbols. Linguistic symbols are social conventions used by human beings to direct one another’s attention to various aspects of their shared environment. Different societies have created different sets of linguistic symbols – at the moment there are approximately 6000 languages in the world – and indeed the linguistic symbols of each society evolve fairly rapidly over historical time as speakers modify existing ways of talking to meet changing communicative circumstances.

Children are exposed to the linguistic conventions of their speech community as they interact socially with mature language users. To acquire productive use of linguistic conventions children must have the social-cognitive skills to:

  • participate with others in the joint attentional interactions that provide a common referential ground for symbolic communication (joint attention);
  • understand the communicative intentions of others as embodied in linguistic symbols of various types (intention reading);
  • take the perspective of others in tailoring the choice of linguistic expression to particular communicative contexts (perspective taking);
  • collaborate with others in co-constructing a conversation with a shared topic and ‘relevant’ contributions of new information (communicative collaboration).

Other animal species do not possess any of these social-cognitive skills – at least not in their human form – and that is the main reason why they are unable to create or to learn a human language [17], [13].

Joint attention, intention reading, and word learning

The social-cognitive foundations for language acquisition begin to emerge in human ontogeny near the end of the first year of life in a variety of infants’ nonlinguistic activities. Most importantly, at around 9 months of age infants begin to interact with other people triadically around a ‘topic’ of mutual interest (‘the primordial sharing situation’). Thus, they begin to follow others’ attention to external events and entities [15], to direct others’ attention to external events and entities using nonlinguistic gestures for both imperative and declarative purposes [3], and to engage with others in relatively extended bouts of joint attentional activity [1]. Other animal species engage rarely, if at all, in triadic communicative interaction.

Children acquire their earliest linguistic conventions within the context of routine joint attentional activities such as having a meal together, taking a bath together, going for a ride together in the car, and so forth [4]. These routines serve as necessary scaffolding because linguistic symbols are associated with their communicative functions only conventionally – which presents a unique learning situation. That is, no matter how intelligent a child is, she cannot on her own figure out what some sound means, but rather she must experience a speaker using that sound for the purposes of communication and then determine why, toward what communicative end, he is using it. This determination requires the child to have some nonlinguistic access to the kinds of thing the adult may be communicating about. This is provided by the joint attentional activity, which basically defines ‘what we are doing’ and so determines a kind of domain of relevance. For example, if a child hears the new word gavagai from an adult out of the blue, so to speak, she will not learn it because she has no idea why, for what communicative purpose, the adult is using this word. But if she and the adult are giving her a bath together and the adult looks at the rising water level and says gavagai while turning the faucet off, there are only a limited number of plausible meanings for this expression – assuming it is relevant to the activity. It is thus not surprising that Carpenter et al. [6] found that the size of children’s early vocabularies is highly correlated with the amount of time they spend in joint attentional engagement with others.

For the child to be able to zero in on a single one of the plausible meanings of a new expression used within a joint attentional activity, she must be able to read the communicative intentions of other people more specifically; that is, she must be able to determine specifically what the adult intends for her to attend to. Initially, it is very helpful if adults do most of the work and follow into the child’s already existing focus of attention when using a new expression (see [18] for a review). But as development proceeds, children become ever more skillful at determining adult communicative intentions in novel contexts in which the two of them are focused on different things. Thus, if an adult uses a novel word when he, the adult, is attending to one object and the child is attending to a different object, the 18-month-old child will follow the adult’s attention to the object on which he is focused in determining the intended referent [2]. Further, if the adult is picking his way through a toy box full of objects in search of a ‘modi,’ 18-month-old children will be able to pick out his intended referent by the fact that he rejects some objects and finally accepts one – despite the fact that he looks equally at them all. And if the adult uses a novel verb to announce an intended action (‘‘Now, let’s dack Ernie’’) and then performs one act on purpose and another by accident on Ernie – with different orders of these two across children – 24-month-old children know that the intended referent of dack is the one done on purpose (see [16] for a review of these and similar studies).

Children’s acquisition of linguistic symbols thus depends most fundamentally on their ability to participate in the joint attentional activities that create a common communicative ground with their interlocutor and on their ability to read the intentions of others in particular communicative contexts.

Perspective taking and construction learning

The broader social purposes for which people use language are such things as greeting others, informing others, asking questions, requesting favors, and so forth. In many cases the speakers of a language have conventionalized entire grammatical constructions for the most frequently occurring of these functions, for example, English wh-questions for requesting information and imperatives for requesting favors. In addition, speakers use different conventionalized grammatical constructions depending on who the listener is and what she knows and expects. For example, if we are conversing about my sister and I want to tell you something that happened to her, I might very well use an English passive construction such as She was sued by her landlord. Or perhaps I learn that you erroneously believe that my sister’s therapist sued her, in which case I might correct you by using a cleft construction such as It was her landlord that sued her. If we have been talking about something else I might introduce the topic by using some kind of presentational construction like You know my sister’s landlord, he . . . . In theories of language such as cognitive linguistics and construction grammar these kinds of constructions are, like words, linguistic symbols that pair a linguistic form and communicative function (i.e., the communicative intentions the symbol embodies). Children thus learn these constructions in a way that is not totally different from the way they learn words, that is, by observing people use them for particular communicative functions – which they determine by intention reading, including the subintentions of any subconstructions involved (e.g., noun phrases, locative phrases; see [10], [7]).

From a social-cognitive point of view, the most interesting thing about the acquisition of constructions is that children must learn when to use which one, and this often involves a subtle assessment of the knowledge and expectations of the listener. For instance, in their acquisition of noun phrases 2-year-old children learn to do such things as use pronouns to refer to an entity when they are jointly attending to that entity with their listener, but use a lexical noun with a determiner (or even a relative clause) when the two of them are not jointly attending (see [13]: Chap. 6 for a review). Similarly, to learn to use a cleft construction (e.g., It was Jeffrey that took the cookies) children must be able to tell that their listener is currently under the mistaken impression that someone else took the cookies and then choose a construction that makes contact both with the mistaken impression and its correction. Saying The vase broke enters the event from the perspective of the vase whereas saying I broke the vase enters it from the perspective of me and my activity [5].

In all these cases, then, children must learn the perspective, in the broad sense of that term, that a construction embodies, and the communicative situations in which it is appropriately used. They do this initially in a very local and item-based way. For example, they might initially be able to alternate between the object’s and the actor’s perspectives only with the verb break, with extension to other verbs coming only very gradually [14].

Communicative collaboration in conversation

As children are acquiring the various symbols and constructions of their language they are at the same time learning how to use these symbols and constructions to communicate more effectively in extended conversational interactions with other people. Conversational and discourse skills are concerned not so much with the mastery of the grammaticized and conventional aspects of a language, but more with the mastery of strategies for using those constructions effectively to manage the flow of information across turns in a developing conversational interaction. Skill at conversation involves such things as taking turns appropriately, managing the conversational topic effectively, and repairing a conversational interaction when it breaks down.

Becoming a skilled conversationalist requires children to read the intentions and take the perspective and role of their listener in many complex ways in novel contexts, and so many conversational skills do not fully develop until late in the preschool period. Thus, at 2 years of age only a minority of children’s utterances are semantically contingent on the adult’s previous utterance, and very few of their turns include reference to both the preceding topic and also some new information – the prototype of a mature conversational turn. When they do engage in a relatively extended conversation on a single topic, 2-year-olds typically take only one or two turns per conversation, with this value doubling by about 4 years of age [11].

Conversation is a form of collaboration in the sense that from a very early age there is a ‘negotiation of meaning’ between the two conversationalists [8], and there are requests for clarification back and forth when one person does not understand something in the other’s formulation. Almost from the beginning of the language acquisition process, young children both respond to requests for clarification and, less frequently, request clarification of others in their conversations (see [13]: Chap. 7 for a review). This again requires sophisticated skills of intention reading and perspective taking.

Collaborative activities, either nonlinguistic or linguistic, thus require participants to have a common goal and to work together in coordinated ways to achieve it. This coordination is not a foregone conclusion, and indeed animals that communicate with one another in fairly sophisticated nonsymbolic ways still do not engage in the back and forth of conversational interaction (and often engage very little in collaborative, turn-taking interactions of any kind).


Children thus employ their species-unique social-cognitive skills to acquire and use a natural language – skills of joint attention, intention reading, perspective taking, and communicative collaboration. But in the process of acquiring a language, children are led to consider various particular intentions and perspectives that they almost certainly would not have created on their own: the very same animal is referred to on different occasions as a dog, an animal, a pet, or a pest; the very same action is referred to as running, fleeing, chasing, or exercising. Acquiring a natural language in interaction with other persons thus leads young children to create some species-unique forms of cognitive representation, namely, perspectival cognitive representations [17].

A number of theorists have also proposed that conversation and discourse might play an important role in children coming to have a ‘theory of mind,’ that is, in coming to view other persons as mental agents who can have beliefs (including false beliefs) about the world [9]. Thus, to engage in true conversation children must in some sense simulate the perspective of other people as they express themselves linguistically, and thus the back and forth of discourse involves the child in a constant shifting of perspectives from her own to that of others and back again – which helps her to construct both social norms and individual attitudes and beliefs [12].

And so the relationship between children’s social cognition and language is not one way. They must have certain social-cognitive skills to acquire a language, but then engaging in linguistic communication leads them to create new social-cognitive skills. Acquiring a language may thus be seen as another manifestation of the basic dialectic in which children are biologically prepared for culture, but then participation in culture – whose artifacts embody the skills, attitudes, and perspectives of other individuals – takes their cognitive skills to new places.

