Why Data Science Projects Fail

As seen in Analytics Magazine

Executive Edge: Why data science projects fail

Delivering brilliant ideas is the easy part; execution is often where things go awry.

By David Maman

Thousands of companies all over the world are competing for a finite number of data scientists, paying them big bucks to join their organizations – and setting them up for failure.

For most organizations, data science is not the be-all and end-all. It cannot be the answer if company leadership has not formulated the strategy and the questions that data science needs to answer, or paved the route to production.

Even organizations that have already invested millions in integrating their complete corporate data infrastructure into the ultimate data repository for integration and analysis (data lakes, data pools, common data sources or whatever pleasant name your organization has chosen) will not see results at the rate and accuracy they demand once the system is in production.

As with the long-term adoption and implementation of any business process or procedure, fully benefiting from data science takes time. You cannot just jump in headfirst.

At most companies, data science fails for very specific reasons:

1. Undefined business goals/processes. Just because your data science team can map and predict customer satisfaction rates for the next six months doesn’t mean that information is significant. If the results demonstrate broad-scale customer dissatisfaction, simply predicting it is woefully inadequate unless you also resolve the roots of that dissatisfaction. You can predict an untold number of data points, but without clearly defined goals accompanied by the will to fix your problems, those data points are meaningless. Many organizations cling to the misconception that data scientists (no matter how smart they are) can correctly define the business goals. Most of the time they can’t, because it’s not their job.

2. Inability to build and apply a uniform data set across the organization. While the politics in your organization may have inspired silos and fiefdoms, each department in your organization is ultimately driving toward the same goal – increasing ROI. To maximize the value of data science, you need to create a uniform data set that will deliver actionable results that every part of the organization can implement.

Just because your data science team can map and predict customer satisfaction rates for the next six months doesn’t mean that information is significant. Photo Courtesy of ThinkStock.com

3. Investment in “sexy” algorithms instead of useful ones. You have just hired the data scientist who graduated at the top of her class. Having just joined your company, she’s extremely excited to get started. Because of her in-depth mathematical and academic background, she has some great algorithms that can get some really cool results. However, are those algorithms going to deliver the results that can focus your organization and drive it forward? Just because something is old, tired, tried and true doesn’t mean it won’t work.

Your team needs to examine the full range of algorithms you are using for your modeling and preprocessing with an objective eye. Is the information those algorithms are delivering actually useful? Do they deliver the actionable intelligence that will increase ROI? Often not.

In most cases, the challenge you are facing isn’t new. A simple Google search will lead you to some great research and case studies to see how others tackled the challenge and how they tailored the solution. Be open. Start thinking from a problem perspective and not just from a solution perspective.

4. Inappropriate data or infrastructure. Most organizations are built in silos. Large organizations have an ERP system, a CRM system and many others, but those systems may not be in sync. They may not even be touching the inbound marketing and sales software. Until everyone in the organization can work with and analyze the same data, siloed data science results will be less applicable and less effective. Sometimes it also takes much longer than expected to determine whether the specific data, or the volume of data, is sufficient for the specific task.

5. Failure to differentiate between academia or research and the real world. Data scientists generally go from the ivory tower directly into the data silos. They generally deliver great results on most properly defined challenges. Remember, though, that when the data scientist’s job ends, the hard work may only be beginning. The engineers need to implement that work in production – many times from scratch. Don’t expect your data science team to learn how businesses actually work. It’s up to the engineers and systems analysts to determine your specific system requirements to get the data scientists’ work into production.

The most critical result of any data science activity is relevancy. Every “answer” generated by your data science team needs to address specific questions derived from the overall corporate strategy. While investment in the “bells and whistles” of AI is fun and exciting, organizations need to make sure that they have a defined strategy, the right idea and a well-planned roadmap before the first dollar of data science investment is spent.

David Maman is CEO, CTO and co-founder of Binah.ai, whose out-of-the-box data science solutions leverage signal processing combined with machine learning and AI to create models and accelerate delivery of the right answers to critical business questions.