Last year, the business world was predicted to spend $187 billion on data analytics. Companies show no signs of slowing down or stopping. Indeed, projections indicate spending will skyrocket to $260 billion per year within the next two years.
Yet a lot of that money isn’t doing any good for the companies that invest it. Even Fortune 500 companies are wasting money on projects that don’t deliver results. Some have even launched business “intelligence” programs that actively hurt their operations and brand reputations.
Indeed, according to InfoWorld, Gartner analyst Nick Heudecker believes that nearly 85% of big data projects fail.
So how do you ensure your company ends up in the small minority that achieves big wins with data analytics projects? If you manage to avoid these five mistakes, you’ll be in great shape.
1. Failing To Identify Specific Problems You Want To Solve.
All the big wins in big data come from using the data to solve a specific business problem.
For example, insurance companies use their data to prevent insurance fraud. Other companies have used a data analytics tool to increase sales, manage their inventory, reduce waste, or increase productivity.
Yet all too often, company leaders jump into the world of data analytics simply because they know their competitors are doing it, or someone convinced them that they have to. Consultants and salespeople drive the narrative, pushing companies to adopt data technology they don’t need yet, or aren’t ready to make the most of.
Keep your company in control of its own data narrative. Begin your data analytics journey by identifying a single, specific problem you’d like to solve with actionable insights.
Create an analysis project designed to address that single problem. As the project progresses, you might find you need a lot less of everything than you thought: less data, fewer people, and a less expensive data analytics tool than the ones those pricey consultants pitched to you last month.
You always have the option of expanding your efforts later. Starting small prevents you from overspending on projects that aren’t going to deliver results. Get your insights, put them into practice, and move on to the next issue you’d like to address.
2. Skipping Your Exploratory Analysis Work.
Conducting exploratory data analysis (EDA) at the beginning of your project helps you prevent skewed conclusions at the end. This requires creating some visual models.
Your models can help you understand the distribution of the data and spot outliers. They can show you which variables you’re working with and give you a preliminary understanding of certain patterns within the data. You might start to see some correlations you can explore further.
This phase of the work may even show you data that you’re missing, so that you can decide how to solve that issue.
Of course, if these preliminary correlations don’t make a whole lot of sense, then the EDA process has given you the gift of that information, too, which keeps you from wasting a lot of resources. You can then ask yourself why the correlations don’t make sense. There could be garbage information that’s throwing everything else off.
Later, when you’re running your in-depth analyses, you can see if the information that you’re getting makes sense within the framework of the understanding you’ve already achieved through EDA. That’s a pretty important gut check to make before you start acting on the insights.
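To make the idea concrete, here is a minimal sketch of a first EDA pass in Python. It is only an illustration: the sales figures are invented, the function names are hypothetical, and the 1.5 x IQR rule is one common convention for flagging outliers, not the only one. In practice you would usually pair numbers like these with charts from a plotting library.

```python
import statistics

# Hypothetical daily sales figures; the 950 is a deliberately planted oddity.
daily_sales = [120, 135, 128, 142, 131, 950, 125, 138, 133, 129]


def summarize(values):
    """Return basic distribution statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": median,
        "q1": q1,
        "q3": q3,
    }


def find_outliers(values, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


summary = summarize(daily_sales)
outliers = find_outliers(daily_sales)

print(summary)
print("Possible outliers:", outliers)
```

A flagged value is a question, not a verdict. It might be a data entry error, or it might be the one day a big promotion ran, so check before you delete it.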
3. Not Scrubbing Your Data Before Analyzing It.
The EDA process may show you that some of your data is old, inaccurate, incomplete or otherwise unhelpful. Now you need to get in there and fix it, either correcting mistakes or deleting data that doesn’t make sense because it’s corrupt, imprecise or from a non-representative sample.
You should also locate and eliminate duplicate entries and other hygiene errors that can skew the results.
Finally, you’ll need to make sure your data is in a standardized format, with a uniform structure.
There’s a reason data scientists are often said to spend as much as 80% of their time cleaning the information that they input. It’s arguably the most important step in the process. Dirty data isn’t capable of delivering actionable intelligence.
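The scrubbing steps above (standardizing formats, dropping incomplete rows, removing duplicates) can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions: the records, field names, and accepted date formats are all made up, and your own rules for what counts as a duplicate or an unusable row may differ.

```python
from datetime import datetime

# Hypothetical raw customer records with typical hygiene problems.
raw_records = [
    {"email": "Ana@Example.com ", "signup": "2023-01-15", "spend": "120.50"},
    {"email": "ana@example.com", "signup": "15/01/2023", "spend": "120.50"},
    {"email": "bo@example.com", "signup": "2023-02-03", "spend": ""},
    {"email": "cy@example.com", "signup": "03/03/2023", "spend": "80"},
]

# Assumption: dates arrive only as ISO (YYYY-MM-DD) or day/month/year.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Try each accepted format; return a date, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def clean(records):
    """Standardize fields, drop incomplete rows, and remove duplicates."""
    seen = set()
    cleaned = []
    for row in records:
        email = row["email"].strip().lower()
        signup = parse_date(row["signup"])
        spend_text = row["spend"].strip()
        if not email or signup is None or not spend_text:
            continue  # incomplete or unparseable row
        try:
            spend = float(spend_text)
        except ValueError:
            continue  # spend is not a number
        key = (email, signup)
        if key in seen:
            continue  # same customer and date already kept
        seen.add(key)
        cleaned.append(
            {"email": email, "signup": signup.isoformat(), "spend": spend}
        )
    return cleaned


cleaned = clean(raw_records)
for row in cleaned:
    print(row)
```

Notice that the second record only shows up as a duplicate after the email and date have been standardized, which is why formatting and deduplication belong in the same pass. This sketch simply drops rows it can’t use; in a real project you may prefer to set them aside for correction.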
4. Failing To Automate Tasks.
Data science isn’t a job that’s likely to be fully replaced by AI anytime soon. Sure, AI can augment and assist your data science efforts, but it can’t do everything that people can in this space just yet.
Automation and AI can also open the doors for “citizen data analysts,” business analysts who may not be able to handle complex statistical concepts or formulate their own query code, but who can understand how to use a tool designed by a data scientist to generate insights for their own businesses.
This means you might not need to dedicate the resources to employ an entire team of data scientists. You might not have to hire an expensive consultant. You just need an analyst who can use the tools effectively and point them at what matters. The right person may already be working for your company.
You can always hire a PhD later if you have the budget and the need to do so – maybe after your citizen analyst has used readily available tools to increase your profits by a significant margin.
5. Neglecting Data Pipelines And Storage Infrastructure.
As your organization learns to extract meaning from your data, it will have to start giving some thought to how that data is collected and stored. Here are some key questions you should be asking yourself to help determine the best infrastructure for your needs:
1. Is your business collecting all the data it needs to collect? Your work on your first project might give you some hints here. If you’re not collecting all the data you need, how can you make it happen?
2. Do you have a workable method for sharing the data with all stakeholders? For security reasons, you don’t just want to email spreadsheets all over the office, or worse, outside the office.
3. Speaking of security, what are you doing to secure your data? Some of it is likely to be highly sensitive. Some of it came from your customers, who definitely don’t want their personal details to end up in just anybody’s hands. Improper handling of data security can become a liability issue, so invest some time, thought, and money into data protection.
4. Are you gathering trustworthy external data? You might need it. For example, Tesco, a leading grocery chain in the UK, uses weather data to predict its stock requirements. The exact nature of the data you need to procure depends on what you’re trying to do. Identify the needs, look for trustworthy sources, and find ways to debug and clean the data up enough to be useful to your team.
5. Will you be investing in a data management platform? This system will help you collect and manage your data. Right now your data may be scattered across multiple systems. For example, some may be in your point-of-sale system while other data points are in your inventory system. A data management platform helps you bring it all together.
Answering these questions will set your organization up for long-term success, and will widen the range of experiments you can run and insights you can achieve.
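As a small illustration of what “bringing it all together” means, the sketch below combines exports from a point-of-sale system and an inventory system into one view per product. Everything here is hypothetical: the SKU key, the field names, and the sample rows. A real data management platform does far more, but the core idea of joining scattered records on a shared identifier is the same.

```python
# Hypothetical exports from two separate systems, both keyed by SKU.
pos_sales = [
    {"sku": "A100", "units_sold": 30},
    {"sku": "B200", "units_sold": 12},
    {"sku": "A100", "units_sold": 5},
]
inventory = [
    {"sku": "A100", "on_hand": 40},
    {"sku": "C300", "on_hand": 15},
]


def combine(sales_rows, inventory_rows):
    """Build one record per SKU with total units sold and stock on hand."""
    combined = {}
    for row in sales_rows:
        entry = combined.setdefault(
            row["sku"], {"units_sold": 0, "on_hand": None}
        )
        entry["units_sold"] += row["units_sold"]
    for row in inventory_rows:
        entry = combined.setdefault(
            row["sku"], {"units_sold": 0, "on_hand": None}
        )
        entry["on_hand"] = row["on_hand"]
    return combined


combined = combine(pos_sales, inventory)

# SKUs that were sold but have no inventory record point to a collection gap.
missing_inventory = sorted(
    sku for sku, entry in combined.items() if entry["on_hand"] is None
)

print(combined)
print("Sold but not in inventory system:", missing_inventory)
```

Even a toy join like this surfaces the first question on the list: a product that appears in one system but not the other is a sign you aren’t collecting all the data you need.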
Data Isn’t Magic.
Data is like a type of raw material. And in the same way that having a stack of bricks is not the same thing as having a building, you need to use your data correctly if you’re going to realize its potential value. Acknowledging that truth will help you adopt a mindset that will ultimately lead to success with your data – one that will shield you from all the expensive hype.