A data-driven exploration of children's early language learning across different languages, providing an empirical reference and a new theoretical framework.
This book examines variability and consistency in children's language learning across different languages and cultures, drawing on Wordbank, an open database with data from more than 75,000 children and twenty-nine languages or dialects. This big-data approach makes the book the most comprehensive cross-linguistic analysis of early language learning to date. Moreover, its data-driven picture of which aspects of language learning are consistent across languages suggests constraints on the nature of children's language learning mechanisms. The book provides both a theoretical framework for scholars of language learning, language, and human cognition and a resource for future research.
Wordbank archives parent-report data on children's language learning, collected with instruments in the MacArthur-Bates Communicative Development Inventory (CDI); its goal is to make CDI data available for study and analysis. After an overview of practical and theoretical issues, each of the book's empirical chapters applies a particular analysis to the Wordbank dataset, considering such topics as vocabulary size, demographic variation, syntactic and semantic categories, and the relationship between vocabulary growth and grammar. The final three chapters draw on the preceding chapters to quantify variability and consistency, consider the bird's-eye view of language acquisition afforded by the data, and reflect on methodology.