97 Things About Ethics Everyone in Data Science Should Know
by Bill Franks
Copyright 2020 O'Reilly Media, Inc. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.
- Acquisitions Editors: Jonathan Hassell, Andy Kwan
- Development Editor: Nicole Taché
- Production Editor: Christopher Faucher
- Copyeditor: Arthur Johnson
- Proofreader: Penelope Perkins
- Indexer: WordCo Indexing Services, Inc.
- Interior Designer: David Futato
- Cover Designer: Karen Montgomery
- Illustrator: O'Reilly Media, Inc.
- August 2020: First Edition
Revision History for the First Edition
- 2020-08-06: First Release
See http://oreilly.com/catalog/errata.csp?isbn=9781492072669 for release details.
The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. 97 Things About Ethics Everyone in Data Science Should Know, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.
The views expressed in this work are those of the author, and do not represent the publisher's views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
978-1-492-07266-9
[LSI]
Preface
The intersection of ethics with the world of analytics and data science is a topic that I have become passionate about in recent years. I've written a variety of blogs and papers on the topic. I've also spoken about the need for attention to ethics at numerous public conferences and at many private meetings with corporate clients. What I discuss is based upon my concerted and ongoing effort to learn what others are thinking and saying about the ethics of analytics. I also receive feedback during these interactions that enables me to continue to evolve my own viewpoints as I recognize gaps in my thinking.
What I have consistently found in my interactions is that people are very receptive to giving ethics more attention once their eyes have been opened to the fact that the need for ethical consideration is much broader and more important than they realized. The vast majority of the examples I've seen where something unethical occurred with analytics and data science were not driven by anyone operating with bad intent. Rather, it is usually the case that the ethics of the situation simply weren't thought through well enough, if at all.
When O'Reilly approached me about partnering on this project, I knew it was something I had to do. I was excited about the opportunity to see what hundreds of other people had to say about ethics. I firmly believe that as more of these types of conversations about ethics occur among members of the analytics and data science community, we can continue to make progress toward ensuring that analytics and data science are done in as ethical a manner as possible. The key is to get people's attention so that they are awakened to the need to give proper focus to ethical considerations. The goal of this book is to be a catalyst for this awakening: to help readers fully understand the importance of applying proper ethics to analytics and data science initiatives. Curating the final submissions that made it into the book was a tremendous learning experience for me, and I hope that readers will find the final outcome to be of value as well.
As you read the book, you will find a wide range of opinions and writing styles. That was intentional. To the extent that two entries have conflicting views, it provides an opportunity for you to ponder which view you find more compelling and why. My colleagues and I did not write this book to tell you precisely what is and isn't ethical. Rather, the book provides perspectives from others in the community so that you can continue to refine your own ethical guidelines.
The book's title is 97 Things About Ethics Everyone in Data Science Should Know. Who, exactly, is "everyone in data science"? That description should be interpreted broadly. Certainly, anyone involved in the definition, creation, or usage of analytics and data science processes will benefit from the book. This includes people in both technical and business-facing roles. Students or people considering a career change into the field will also benefit. However, the content is not deeply technical or hard to understand. As a result, people who simply have an interest in understanding how ethics intersects with data science will find this book of value as well, regardless of their job role or educational background.
Why Now?
While the need for ethics in analytics and data science has always been present, a couple of recent trends have helped to finally push the topic to the forefront. The first trend is what I focused on in my book The Analytics Revolution (Wiley). Namely, we have entered an era in which analytical processes are being fully automated and embedded into decision processes. Humans are now often relegated to creating analytics and data science processes and then monitoring their performance, while the important decisions are automated. This automation has led people to be more concerned and suspicious about what is actually happening within those processes, and it quickly leads to a discussion about ethics. This is especially true when models are applied to sensitive areas such as credit scores, health care, or risk assessment.
The second trend driving focus on ethics is the rise of artificial intelligence (AI). Not only are myriad AI processes being embedded and automated as part of the first trend, but AI processes are also quite opaque by nature. This opaqueness makes people uncomfortable and forces discussion around what is happening within the AI algorithms and why. This again quickly turns into an ethics discussion. As AI becomes more sophisticated and continues to impact our lives on a daily basis, people want to know that it is being used in an appropriate manner.
Ethics Are Fuzzy
Ethics are, unfortunately, much fuzzier than we'd like to think. If you ask a hundred people, "Is the ethical choice typically clear?" most will quickly respond with a firm "yes." However, once we are challenged to think more deeply about the question, it soon becomes obvious that ethical decisions are not as clear cut as we allow ourselves to believe. While it is often easy to identify the rule that should be followed for a given situation, it is also just as easy to identify one or more exceptions to that rule.
Let's take a relevant example from the analytics and data science space. One of the central points of the European Union's General Data Protection Regulation (GDPR) is the "right to be forgotten." This means that I can tell organizations that I no longer want them to keep any data they may have about me, and they have to delete that data. It sounds very unambiguous, doesn't it? If I ask to have my data deleted, then companies must comply under penalty of law.