front matter
foreword
Over the past decade, I have chaired or co-chaired more than 40 premier data and AI conferences internationally. It has been amazing to witness the evolution and impact of analytics, data science, and machine learning worldwide. Data science continues to be one of the fastest-growing job functions in the industry today. When I was the chief data scientist of O'Reilly Media, study after study we conducted confirmed that companies continue to invest in data infrastructure, data science, and machine learning. We also found that the companies that excel in using data science and machine learning were the ones that invested in foundational technologies and used those tools to expand their capabilities gradually, one use case at a time.
While much of what we read about pertains to tools or breakthroughs in models, the reality is that organizational issues pose some of the major bottlenecks within most companies. The critical ingredient is recognizing organizational excellence in people, culture, and structure. If you don't have the right people and organizational structure in place, you will still underperform competitors that do.
As demand for data scientists continues to grow and training programs proliferate, I am frequently asked for advice. Novices ask how they can join the ranks of data scientists, and more experienced data scientists ask for pointers on how they can take their careers to the next level.
Unfortunately, information and advice on how to remain relevant and impactful throughout a data science career are hard to come by. Most of the career-related literature focuses on embarking on the journey: where to study, what skills to learn, and how to interview for and land your first job. There is very little guidance for how employed data scientists can continue to succeed and excel in this career.
How to Lead in Data Science is an essential field guide for data scientists at different stages of their careers as an individual leader, such as a tech lead, staff, principal, or distinguished data scientist, or as a management leader, such as a manager, director, or executive of data science. The book is for data scientists who want to take their careers to the next level. It also provides guidance on tools and techniques in the context of helping data scientists increase their positive impact in business and in society.
I've known the authors, Jike and Cathy, for many years. Together, they bring a diverse set of operating experiences from a broad range of organizations, including public and private companies, as well as consultancy practices. I have seen them teach the material in this book in training courses for data scientists from diverse backgrounds and industries. Their courses are always among the most popular and well received at the conferences I've chaired.
This book is the missing field guide for data scientists looking to advance their careers. Readers at various stages of their careers will find it worthwhile to come back and revisit the book as they grow. It is a book I plan to recommend to data scientists from here on. I hope it inspires more discussions and literature on this topic. Data scientists and those who work with them will need this book in the years to come!
Ben Lorica
Ben Lorica is principal writer at GradientFlow.com; co-chair of the NLP Summit and Ray Summit; the former chief data scientist and program chair at O'Reilly Media; the host and organizer of thedataexchange.media podcast; and has been an advisor at many startups and organizations, including Databricks, Anyscale, and Faculty.ai.
preface
As a leader in the practice of data science, you can scale your data, algorithms, and team, but are you scaling you? What is leadership? How are you amplifying your capabilities to produce a more significant impact than what can be achieved as an individual? Are you influencing, nurturing, directing, and inspiring projects and people around you?
These are questions many data science practitioners grapple with as they struggle to advance their careers in this high-growth, fast-evolving field. Most practitioners work in companies with fewer than 10 data scientists, holding broad responsibilities to lead projects, interface with cross-functional partners, craft roadmaps, and influence executives. Their roles are often not clearly defined and come with unrealistic expectations.