A Bird's-Eye View of Human Language and Evolution

It’s National Bird Day! We are celebrating with a passage by Robert Berwick and Noam Chomsky from Birdsong, Speech, and Language, which considers the cognitive and neural similarities between birdsong and human speech and language.

Scholars have long been captivated by the parallels between birdsong and human speech and language. Over two thousand years ago, Aristotle had already observed in his Historia Animalium (about 350 BCE) that some songbirds, like children, acquire sophisticated, patterned vocalizations, “articulated voice,” in part from listening to adult “tutors” but also in part via prior predisposition: “Some of the small birds do not utter the same voice as their parents when they sing, if they are reared away from home and hear other birds singing. A nightingale has already been observed teaching its chick, suggesting that [birdsong] . . . is receptive to training” (Hist. Anim. 1970, 504a35–504b3; 536b, 14–20). Here Aristotle uses the Greek word dialektos to refer to song variation, paralleling human speech, and even anticipates recent work on how the songs of isolated juvenile vocal learning birds might “drift” from those of their parents over successive generations. Given two millennia of progress from neuroscience to genomics, we might expect that our insights regarding the parallels between birdsong and human language have advanced since Aristotle’s day. But how much have we learned? That is the aim of this book: What can birdsong tell us today about the biology of human speech and language?

From an evolutionary standpoint, birds are particularly well placed to probe certain biolinguistic questions. The last common ancestor of birds and mammals (the clade Amniotes) lived about 310–330 million years ago, so roughly 600 million years of evolutionary time in all separates humans from Aves: some 300 million years from this common ancestor to humans, plus some 300 million years from this ancestor to birds. This gulf of more than half a billion years provides an opportunity to resolve certain vexing questions about the adaptive significance of particular biological traits, because given such a large gap of evolutionary time, analogous “solutions” are more likely to have arisen as a result of independent, convergent evolution than by shared descent from a common ancestor. The classic example is the independent development of wings in bats and birds (Stearns & Hoekstra, 2005). Since the last common ancestor of birds and bats did not have wings, we can more readily conclude that these distinct “solutions” arose independently as adaptive solutions to the same common functional problem of flying. Paradoxically, if two species are extremely closely related (humans and chimpanzees, for example), it can be much more challenging to sort out which traits are due to shared ancestry (i.e., homology) and which are true functional adaptations. It is thus crucial to explore in depth the extent to which the many parallels between human speech and birdsong, ranging from vocal learning, to vocal imitation and vocal production, to analogous brain regions and neural pathways in both songbirds and humans, might best be thought of as the result of converging mechanisms. From this vantage point, on balance it would seem that birdsong is most comparable to the mechanisms of human speech, not language in the broad sense, with both solving the common problem of “externalizing” some internal representation as a set of serially ordered motor commands to distinct vocal “output machines.”

On the other hand, one should not be too hasty in dismissing the possibility of shared ancestry and the insights it might provide into language. For example, though bird wings and bat wings may have arisen independently, both feathers and hair share keratin genes derived from some common ancestor of both, and so the “solution” to flying remains a more nuanced interplay between shared ancestry and independent, convergent evolution (Eckharta et al., 2008). Indeed, since the rise of the “evo-devo” revolution, over the past several decades biologists have grown to appreciate that there has been a surprising amount of conservation across species in the tree of life, sometimes revealed only by a deeper look at shared traits at the cellular and molecular levels, including regulatory and ontogenetic effects, sometimes called “deep homology.” On this account, it would be no surprise to find much common ground between birdsong and human speech, even down to the level of corresponding brain regions. If this commonality turns out to be correct, it would also be a favorable state of affairs, since it would reinforce the possibility of using songbirds as animal models of language, especially speech in certain respects. Perhaps the most famous current example of such a case centers on the gene encoding forkhead-box protein P2 (FoxP2), a highly conserved DNA regulatory factor, which apparently plays a role in guiding normal neuronal development involving both vocal learning and production in humans and songbirds (Fisher & Scharff, 2009; Vernes et al., 2011). How far one can drive this genomic work upward into neuronal assemblies (ultimately, the dissection of the underlying circuitry responsible for vocal production) remains to be seen, but the current “state of play” in this area is covered by several chapters that follow.

In any case, the bridge between birdsong research and speech and language dovetails extremely well with recent developments in certain strands of current linguistic thinking, which aim to identify the assumed species-specific biological substrate for language, thereby reducing to a minimum any language-specific cognitive traits (Berwick & Chomsky, 2011). This stands in sharp contrast to the earliest attempts at developing explicit rule systems that even began to approach descriptive adequacy in terms of accounting for the properties of human language. The complexity of such rule systems poses a seemingly insurmountable biolinguistic puzzle, because it requires that one assume substantial, language-particular machinery without any clear path as to how this highly specific cognitive capability might have arisen. Now, however, according to some linguists, one can strip away all this complexity, arriving at a system that requires much less in the way of language-particular rules. This system contains just a single operation that combines hierarchical structure into larger representations, together with a storehouse of conceptual “atoms,” roughly corresponding to individual words, and two interface systems: one an external “input-output” system mapping internal representations to speech or manual signs, and the second an internal mapping between these internal representations and the cognitive systems of thought (Hornstein, 2009). If this reduction is on the right track, and some of the chapters in this book address this very point, it would go a long way to resolving what some have called “Darwin’s problem,” the biolinguistic question as to the origin of language. Such a “bare-bones” linguistic account would also accord with the view that the capacity for language apparently emerged relatively late and rapidly in evolutionary terms and has not changed substantially since then.

Biologically, this points to a common evolutionary scenario: most of the substrate for language must have already been in place, and what we see in the case of language is evolutionary opportunism, the assembly of already-existing abilities into a novel phenotype. For example, during the past few years alone, at least two “input-output” system abilities long thought to be the sole province of humans have been claimed to be attested in other vocal-learning animals: (1) perception of synthetic “auditory caricatures” of spoken words in chimpanzees (Heimbauer, Beran, & Owren, 2011), and (2) rhythmic entrainment to music in birds (Patel, Iversen, Bregman, & Schulz, 2009). To be sure, these abilities focus only on acoustic input, and we do not yet know what role, if any, these abilities play in human speech and language; nothing comes close to human language in other animal species. But by understanding the scope of what other animals can do, the continued exploration of birdsong as pursued in this book can only boost our understanding of how the interface between language and the external world evolved and works, thus improving our focus on that part of language that remains uniquely human.