Best practices for addressing the bias and inequality that may result from the automated collection, analysis, and distribution of large datasets.
Human-centered data science is a new interdisciplinary field that draws from human-computer interaction, social science, statistics, and computational techniques. This book, written by founders of the field, introduces best practices for addressing the bias and inequality that may result from the automated collection, analysis, and distribution of very large datasets. It offers a brief and accessible overview of many common statistical and algorithmic data science techniques, explains human-centered approaches to data science problems, and presents practical guidelines and real-world case studies to help readers apply these methods.
The authors explain how data scientists' choices are involved at every stage of the data science workflow, and show how a human-centered approach can enhance each one by making the process more transparent, asking questions, and considering the social context of the data. They describe how tools from social science might be incorporated into data science practices, discuss different types of collaboration, and consider data storytelling through visualization. The book shows that data science practitioners can build rigorous and ethical algorithms and design projects that use cutting-edge computational tools and address social concerns.
Cecilia Aragon is Professor in the Department of Human Centered Design and Engineering at the University of Washington.
Shion Guha is Assistant Professor in the Faculty of Information at the University of Toronto.
Marina Kogan is Assistant Professor in the School of Computing at the University of Utah.
Michael Muller is a Research Staff Member at IBM Research.
Gina Neff is Director of the Minderoo Centre for Technology and Democracy at the University of Cambridge and Professor of Technology and Society at the Oxford Internet Institute and the Department of Sociology at the University of Oxford. She is the author of Venture Labor: Work and the Burden of Risk in Innovative Industries and coauthor of Self-Tracking and Human-Centered Data Science (both published by the MIT Press).
“We cannot engage in data science that doesn't account for power. Histories and systems of race and gender must be taught to data scientists, because we know terrible wrongs can occur in the making and use of data. This book is a must-read to expose the next generation of data scientists to the consequences of their work.”
Safiya Umoja Noble, author of Algorithms of Oppression
“By centering both the human-centric perspective and the data scientific perspective equally, the authors craft a comprehensive approach to human-centered data science. This book is a useful handbook to developing forward-thinking data science teams who will be prepared for the next iteration of machine learning.”
Rumman Chowdhury, Director of ML Ethics, Transparency, and Accountability (META), Twitter
“This book's unique approach recognizes that data science is a craft spun with subjective design choices. It will be invaluable to students and practitioners alike.”
Chris Wiggins, Chief Data Scientist, New York Times