Hollywood loves the myth of a lone scientist working late nights in a dark laboratory on a mysterious island, but the truth is far less melodramatic. Real science is almost always a team sport. Groups of people, collaborating with other groups of people, are the norm in science, and data science is no exception to the rule.
When large groups of people work together for extended periods of time, a culture begins to emerge. This paper, written in the spring of 2013, was an early attempt at describing the people and processes of the emerging culture of data science.
It's Not Just About Numbers
Today's conversational buzz around big data analytics tends to hover around three general themes: technology, techniques, and the imagined future (either bright or dystopian) of a society in which big data plays a significant role in everyday life.
Typically missing from the buzz are in-depth discussions about the people and processes (the cultural bedrock) required to build viable frameworks and infrastructures supporting big data initiatives in ordinary organizations.
Thoughtful questions must be asked and thoroughly considered. Who is responsible for launching and leading big data initiatives? Is it the CFO, the CMO, the CIO, or someone else? Who determines the success or failure of a big data project? Does big data require corporate governance? What does a big data project team look like? Is it a mixed group of people with overlapping skills or a hand-picked squad of highly trained data scientists? What exactly is a data scientist?
Those types of questions skim the surface of the emerging cultural landscape of big data. They remind us that big data, like other so-called technology revolutions of the recent past, is also a cultural phenomenon and has a social dimension. It's vitally important to remember that most people have not considered the immense difference between a world seen through the lens of a traditional relational database system and a world seen through the lens of a Hadoop Distributed File System.
This paper broadly describes the cultural challenges that invariably accompany efforts to create and sustain big data initiatives in a global economy that is increasingly evolving toward the Hadoop perspective, but whose data-management processes and capabilities are still rooted firmly in the traditional architecture of the data warehouse.
The cultural component of big data is neither trivial nor free. It is not a list of feel-good or fluffy attributes that are posted on a corporate website. Culture (that is, people and processes) is integral and critical to the success of any new technology deployment or implementation. That fact has been demonstrated repeatedly over the past six decades of technology evolution. Here is a brief and incomplete list of recent technology revolutions that have radically transformed our social and commercial worlds:
The shift from vacuum tubes to transistors
The shift from mainframes to client-server systems and then to PCs
The shift from written command lines to clickable icons
The introduction and rapid adoption of enterprise resource planning (ERP), e-commerce, sales-force automation, and customer relationship management (CRM) systems
The convergence of cloud, mobile, and social networking systems
Each of those revolutions was followed by a period of intense cultural adjustment as individuals and organizations struggled to capitalize on the many benefits created by the newer technologies. It seems unlikely that big data will follow a different trajectory. Technology does not exist in a vacuum. In the same way that a plant needs water and nourishment to grow, technology needs people and processes to thrive and succeed.
According to Gartner, 4.4 million big data jobs will be created by 2014, and only a third of them will be filled. Gartner's prediction evokes images of a gold rush for big data talent, with legions of hardcore quants converting their advanced degrees into lucrative employment deals. That scenario promises high times for data analysts in the short term, but it obscures the longer-term challenges facing organizations that hope to benefit from big data strategies.
Hiring data scientists will be the easy part. The real challenge will be integrating that newly acquired talent into existing organizational structures and inventing new structures that will enable data scientists to generate real value for their organizations.
Playing by the Rules
Misha Ghosh is a global solutions leader at MasterCard Advisors, the professional services arm of MasterCard Worldwide. It provides real-time transaction data and proprietary analysis, as well as consulting and marketing services. It's fair to say that MasterCard Advisors is a leader in applied data science. Before joining MasterCard, Ghosh was a senior executive at Bank of America, where he led a variety of data analytics teams and projects. As an experienced practitioner, he knows his way around the obstacles that can slow or undermine big data projects.
"One of the main cultural challenges is securing executive sponsorships," says Ghosh. "You need executive-level partners and champions early on. You also need to make sure that the business folks, the analytics folks, and the technology folks are marching to the same drumbeat."
Instead of trying to stay under the radar, Ghosh advises big data leaders to play by the rules. "I've seen rogue big data projects pop up, but they tend to fizzle out very quickly," he says. "The old adage that it's better to seek forgiveness afterward than to beg for permission doesn't really hold for big data projects. They are simply too expensive and they require too much collaboration across various parts of the enterprise. So you cannot run them as rogue projects. You need executive buy-in and support."
After making the case to the executive team, you need to keep the spark of enthusiasm alive among all the players involved in supporting or implementing the project. According to Ghosh, "It's critical to maintain the interest and attention of your constituency. After you've laid out a roadmap of the project so everyone knows where they are going, you need to provide them with regular updates. You need to communicate. If you stumble, you need to let them know why you stumbled and what you will do to overcome the barriers you are facing. Remember, there's no clear path for big data projects. It's like Star Trek: you're going where no one has gone before."
At present, there is no standard set of best practices for managing big data teams and projects. But an ad hoc set of practices is emerging. "First, you must create transparency," says Ghosh. "Lay out the objectives. State explicitly what you intend to accomplish and which problems you intend to solve. That's absolutely critical. Your big data teams must be use case-centric. In other words, find a problem first and then solve it. That seems intuitive, but I've seen many teams do exactly the opposite: first they create a solution and then they look for a problem to solve."
Marcia Tal pioneered the application of advanced data analytics to real-world business problems. She is best known in the analytics industry for creating and building Citigroup's Decision Management function. Its charter was seeking significant industry breakthroughs for growth across Citigroup's retail and wholesale banking businesses. Starting with three people in 2001, Tal grew the function into a scalable organization with more than 1,000 people working in 30 countries. She left Citi in 2011 and formed her own consulting company, Tal Solutions, LLC.
"Right now, everyone focuses on the technology of big data," says Tal. "But we need to refocus our attention on the people, the processes, the business partnerships, revenue generation, P&L impact, and business results. Most of the conversation has been about generating insights from big data. Instead, we should be talking about how to translate those insights into tangible business results."