The MIT Press | Cambridge, Massachusetts | London, England
Series Foreword
The MIT Press Essential Knowledge series offers accessible, concise, beautifully produced pocket-size books on topics of current interest. Written by leading thinkers, the books in this series deliver expert overviews of subjects that range from the cultural and the historical to the scientific and the technical.
In today's era of instant information gratification, we have ready access to opinions, rationalizations, and superficial descriptions. Much harder to come by is the foundational knowledge that informs a principled understanding of the world. Essential Knowledge books fill that need. Synthesizing specialized subject matter for nonspecialists and engaging critical topics through fundamentals, each of these compact volumes offers readers a point of access to complex ideas.
Bruce Tidor
Professor of Biological Engineering and Computer Science
Massachusetts Institute of Technology
Preface
This book was born largely out of a massive open online course (MOOC) that I taught for the University of North Carolina at Chapel Hill on the Coursera platform, in the fall of 2013 and again in the spring of 2014, titled Metadata: Organizing and Discovering Information. Online teaching and learning is not a new idea by any means, but MOOCs focused a great deal of attention on this form of pedagogy, both inside and outside the academy. I had been teaching online for many years when MOOCs hit the news in 2011, but the sheer scale of a MOOC captured my attention. I got to thinking about what teaching and learning in Information Science might look like, if it were entirely online. I believed then, and still do, that the first course in any Information Science curriculum should be a course on metadata: almost everything else in the field depends on metadata, and the subject provides a hook into most of the issues in the field. So when Carolina decided to launch its MOOC initiative, I was very excited to have the opportunity to launch a course on metadata, to put my ideas to the test.
I'm very pleased about how well the metadata MOOC was received. And I'm equally pleased that the course caused metadata to come to the attention of the editors of the MIT Press, as a topic worthy of being included in the Essential Knowledge series. So my first thank you must be to Margy Avery, for first suggesting the idea of this book.
Naturally I also must thank the University of North Carolina at Chapel Hill, for launching its MOOC initiative in the first place, and for supporting us MOOC instructors during the production process. I must also express a great deal of thanks to my teaching assistant for the MOOC, Meredith Lewis.
I would like to thank the nearly 50,000 students who registered for the course, and especially the 17,464 students who actually participated in the course across both sessions.
I recorded several interviews for the MOOC, with people who are doing interesting and cutting-edge things with metadata. This provided (I hope) useful supplementary material for the course, and saved the students from having to watch my ugly mug all the time. I learned a great deal in conducting these interviews, and that inevitably made it into this book as well. So let me thank my interviewees: Murtha Baca, of the Getty Research Institute; Robert Glushko, Adjunct Full Professor in the School of Information at the University of California at Berkeley; Steve Hogan, Music Analyst at Pandora; Hunter Janes, Data Analyst at Red Storm Entertainment; Clifford Lynch, Director of the Coalition for Networked Information; and Jason Scott, of the Internet Archive.
The interviews for the MOOC went so well that I decided to do some more, specifically for this book. Thanks to Mary Forster, Joel Steinpreis, and Joel Summerlin of Getty Images for a fascinating conversation about image metadata.
Thanks to Clifford Lynch, again, for bringing pen registers to my attention, and for pointing me in the right direction while researching the history of the word metadata.
Thanks to Ted Johnson, of Studio 713, for helping me to understand music metadata.
Thanks to Jessamyn West, for helping me find images of catalog cards.
This book is dedicated to my daughters, Charlotte and Eleanor, who thought it was cool that I was writing a book.
Introduction
Metadata is all around us, all the time. In the modern era of ubiquitous electronics, nearly every device you use relies on metadata or generates it, or both. But when metadata is doing its job well, it just fades into the background, unnoticed and nearly invisible. And this is partly how, in the summer of 2013, metadata came to be a cause célèbre.
Edward Snowden, a subcontractor to the United States National Security Agency, flew to Hong Kong in May of 2013 to meet with journalists from The Guardian. There, Snowden handed over a large number of classified documents about the NSA's surveillance programs within the United States. One of these programs, PRISM, included collecting data on telephone calls directly from telecommunications companies. Needless to say, this was very big news when The Guardian published the story.
Reactions in the US media to the Snowden revelations were varied, and their evolution was significant. The immediate reaction was anger that the NSA was collecting data on US citizens. This was quickly tempered by relief, when it became clear that the NSA was only collecting metadata about calls, and not the calls themselves: in other words, the NSA was not engaging in wiretapping. After that came punditry, as the media explored just how much information about individuals could be inferred from only metadata.
The MetaPhone study, conducted by researchers at the Stanford Law School Center for Internet and Society in late 2013, attempted to replicate the NSA's data collection of phone metadata. What they discovered was that a truly incredible amount of information can be inferred from only metadata. One example that the MetaPhone researchers report is of a study participant who called a home improvement store, locksmiths, a hydroponics dealer, and a head shop. Perhaps this individual had perfectly innocent reasons for placing all of these calls, and perhaps these calls were entirely unrelated, but that's not the inference that most of us are likely to make.
A lot of metadata is associated with phone calls, particularly cell phone calls. Probably the most obvious pieces of metadata about a call are the phone numbers of the caller and the recipient. Then, of course, there's the time and duration of the call. And for calls made from smartphones, most of which have GPS functionality, there are the locations of the caller and the recipient, at least to the level of precision of the range of the cell phone towers in which the phones are located. There's more metadata than this associated with cell phone calls, but even this small amount is enough to give privacy advocates pause, because your phone exchanges data with local cell towers even when you're not on a call. And, of course, your phone is presumably being carried by you. A record of your location at any given moment, and your movements over time, may therefore be collected by your cell phone service provider, and is in fact collected, as the Snowden revelations showed.
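For readers who think in code, the pieces of call metadata described above can be sketched as a simple record. This is only an illustration: the CallRecord type, its field names, the tower identifiers, and the phone numbers are all invented for this example, and do not reflect the format that any phone company or agency actually uses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CallRecord:
    """Metadata about one call; the conversation itself is not stored."""
    caller: str           # phone number placing the call
    recipient: str        # phone number receiving the call
    start: datetime       # when the call began
    duration: timedelta   # how long the call lasted
    caller_tower: str     # hypothetical ID of the tower serving the caller


def movement_trace(records, number):
    """Return (time, tower) pairs for calls placed by `number`, oldest first."""
    own = [r for r in records if r.caller == number]
    return [(r.start, r.caller_tower) for r in sorted(own, key=lambda r: r.start)]


# Made-up sample data: two calls placed by one number, one call received by it.
records = [
    CallRecord("555-0100", "555-0199", datetime(2013, 6, 1, 18, 30),
               timedelta(minutes=4), "tower-B"),
    CallRecord("555-0100", "555-0142", datetime(2013, 6, 1, 9, 5),
               timedelta(minutes=2), "tower-A"),
    CallRecord("555-0177", "555-0100", datetime(2013, 6, 1, 12, 0),
               timedelta(minutes=1), "tower-C"),
]

for when, tower in movement_trace(records, "555-0100"):
    print(when.isoformat(), tower)
```

Even in this toy version, sorting one caller's records by time yields a rough trace of where that person was over the course of a day, without a single word of any conversation being recorded.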
Thus did the word metadata enter the public conversation. Though, given how pervasive metadata is, a public conversation about it is probably overdue; it deserves to be better understood. In the modern era of ubiquitous computing, metadata has become infrastructural, like the electrical grid or the highway system. These pieces of modern infrastructure are indispensable but are also only the tip of the iceberg: when you flick on a light switch, for example, you are the end user of a large set of technologies and policies. Individually, these technologies and policies may be minor, and may seem trivial, but in the aggregate, they have far-reaching cultural and economic implications. And it's the same with metadata. Metadata, like the electrical grid and the highway system, fades into the background of everyday life, taken for granted as just part of what makes modern life run smoothly.