10 Questions to Answer Before Starting a Big Data Project

Data is the most valuable asset organizations have at their disposal. It can provide you with critical insights into your customers’ behavior and also into your business operations. It can improve efficiencies, drive sales, optimize logistics and accurately predict future product and service needs. The most exciting new technologies, such as artificial intelligence (AI), the Internet of Things (IoT) and machine learning, are growing in their appeal and usability thanks to the great quantity of data that exists today. But the availability of data doesn’t by any means guarantee the success of a big data project.

Recent studies suggest that even with all that data, the majority of projects still fail. In November of 2017, Gartner said that 60% of big data projects fail, then later revised that number to 85%. Another study from McKinsey showed that companies have secured only about 10% to 40% of the value that’s available in their data. So what’s the deal?

There are many obstacles and challenges to achieving success with big data projects. There is a need for a change in the organization’s culture, there are high and sometimes unrealistic goals and expectations, and there is a lack of skilled professionals, just to name a few.

To help you avoid as many roadblocks as possible and to help you overcome the various challenges that exist at the beginning of a big data project, here are ten critical questions you should answer before you start:

1. What is the objective, or purpose, of the project?

From the beginning, any big data project must have a clear objective or business goal.  Whether it’s to drive sales for certain products or services, reduce customer turnover, improve operational efficiencies, or any number of others, an objective must be established.

Oftentimes, organizations attempt to begin big data projects by implementing a technical solution. A project created just for the sake of using big data technologies is rarely useful and will frequently leave you spinning your wheels. So instead of focusing on using technology, your big data project should start by focusing on how it will help your organization achieve its business objectives.

When presented with an opportunity to start a big data project, you should look for the key objectives, issues or problems that your organization needs to solve, such as:

  • Improve poor sales to a particular demographic
  • Determine what value-added products or services to offer
  • Discover how to lower the spend on logistics
  • Uncover how to increase employee productivity
  • Achieve more effective results from marketing efforts

In order to provide the best results or deliverables you can from your big data project, the objective or purpose must be clearly defined and well-known throughout your organization.

2. Do you have executive buy-in?

Big data projects can only be successful if all parts of an organization understand, and buy into, the potential value they will provide. Everyone must understand what the organization is trying to achieve, and that buy-in must start at the very top.

Having the support of the C-suite is vital to the success of any big data project. Projects often compete for funding, along with use of resources. If management executives don’t understand the value of big data, they may shy away from committing large amounts of resources to such projects.

In order to effectively get the C-suite involved in big data operations, you must work to educate them about what can be achieved.  If executive awareness of how big data can be used to improve business is low, it can be an enormous obstacle to getting a project off the ground.

One of the best ways to close this knowledge gap and get the necessary support is by illustrating to executives what returns they can expect to see on their investments.  This can be shown in terms of:

  • Increased sales
  • Better customer retention
  • Better margins
  • Improved productivity
  • And many other potential benefits

Executives typically will not be interested in the technical side of the solutions, so you need to have details and use cases that will speak directly to them. They will want to know things like how big data can be monetized and how operations, service and marketing can be improved.

Having high-level executives supporting big data projects greatly helps with wider acceptance of the initiative and plays a key role in successful organizational adoption.

3. Do you understand your data problem?

Once you’ve defined the business objective for your big data project, you must determine what this means from a data perspective.

You may have large datasets, but they may not be in a ready or usable state when it comes to alignment with the project’s objectives. In other words, you need to understand what your data isn’t currently telling you about what you need to know, so the parameters of your big data project can be adapted to ensure that it does.

In order to minimize the negative impact of potential underlying data issues or errors, you’ll want to explore and prepare your data for cleaning, while also identifying key variables to help categorize your data according to the objective.  These errors and issues need to be corrected prior to cleaning and could include:

  • Duplicate data
  • Omitted data
  • Data that simply doesn’t make sense
  • Spelling errors
  • Inaccurate data
  • And many more

If not corrected, these issues could cause inaccurate analysis.  But once they are fixed and the data is cleaned appropriately, your big data project should be utilizing good, clean data in its processes and therefore able to provide trusted results.
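As a rough illustration of what exploring your data for these issues can look like, here is a minimal Python sketch that counts duplicates, omitted values, values that don’t make sense and inconsistent spellings in a handful of records. The field names, the sample records, the 0 to 120 age range and the VALID_STATES list are all assumptions made up for this example, not part of any particular tool.

```python
from collections import Counter

# Hypothetical customer records; field names and values are illustrative only.
records = [
    {"id": 1, "name": "Alice", "age": 34, "state": "OH"},
    {"id": 2, "name": "Bob", "age": None, "state": "OH"},     # omitted data
    {"id": 3, "name": "Carol", "age": 212, "state": "Ohio"},  # implausible age, inconsistent spelling
    {"id": 1, "name": "Alice", "age": 34, "state": "OH"},     # duplicate of the first record
]

VALID_STATES = {"OH", "MI", "IN"}  # assumed reference list of accepted codes


def profile(rows):
    """Count common data issues without modifying the rows."""
    # Turn each row into a hashable tuple so identical rows can be counted.
    seen = Counter(tuple(sorted(r.items(), key=lambda kv: kv[0])) for r in rows)
    duplicates = sum(count - 1 for count in seen.values())
    # A row counts as "missing" if any of its fields is None.
    missing = sum(1 for r in rows if any(v is None for v in r.values()))
    # Assumed plausibility rule: ages outside 0..120 don't make sense.
    implausible = sum(
        1 for r in rows if r["age"] is not None and not 0 <= r["age"] <= 120
    )
    # Values that are not in the reference list (for example, spelling variants).
    bad_codes = sum(1 for r in rows if r["state"] not in VALID_STATES)
    return {
        "duplicates": duplicates,
        "missing": missing,
        "implausible": implausible,
        "bad_codes": bad_codes,
    }


report = profile(records)
print(report)
```

A report like this only tells you how much cleanup is ahead of you. Deciding how to correct each issue still depends on the objective of your project.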

4. Do you know where your data comes from?

As part of planning any big data project, you must also identify the data sources you will need to access.

All organizations have some type of database, and many have additionally built data warehouses or are utilizing data lakes. But the number of potential data sources is vast, and some of the information you need to capture may come from sources that require complex integration. This could be data from:

  • Social media
  • Weather information
  • Video streams
  • Consumer opinions
  • Public datasets
  • And more

Furthermore, because databases, data warehouses and data lakes are typically created to address existing storage issues, they may not be very applicable to the specific big data project you’re about to start.

Consequently, it may be necessary to explore utilizing data that lives outside of your environment, or consider combining several databases together.  You may potentially even need to look at obtaining data from third-party sources or from your organization’s partner network.

5. How is your data captured?

As your organization captures data from external sources such as social media, RFID and wireless technology, and combines it with the internal data you’re storing, you must ensure that it is blended properly and being utilized appropriately so that it advances the objectives.

Aggregating data from mobile phones, your organization’s software applications, IoT sensors, partner streams, social media streams and so on, can be incredibly difficult.  So it’s not surprising that siloed data is often listed as a major contributor to the inability to gain meaningful insights and the failure of big data projects.

There’s plenty of technology available to support the integration of data between silos, so the problem doesn’t lie in the technology. Most often, the biggest challenge in capturing data effectively is a lack of understanding of the critical role that data integration plays in achieving the project’s objective.

For instance:

While adoption of an AI tool certainly sounds appealing, and may soon become a necessity for many businesses to remain competitive, it won’t be nearly as effective if it’s working from siloed, or incomplete data.  In this way, it’s much more important for your organization to ensure data is integrated appropriately and captured effectively first, before exploring AI adoption.

In order to maximize the potential of data and its analysis, you need to ensure a solid foundation by making certain you’re working with up-to-date data that’s been captured appropriately from every relevant source.
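To make the silo problem concrete, here is a minimal Python sketch that combines two sources on a shared key and reports which records appear in only one of them. The two sources (a CRM extract and an orders extract), the customer IDs and the integrate helper are hypothetical and exist only for this example. Real integration work usually involves far more sources and messier keys.

```python
# Hypothetical extracts from two silos, both keyed by customer ID.
crm = {"c1": {"name": "Alice"}, "c2": {"name": "Bob"}, "c3": {"name": "Carol"}}
orders = {"c1": {"total": 120.0}, "c3": {"total": 75.5}, "c4": {"total": 20.0}}


def integrate(left, right):
    """Merge two keyed sources and list the keys found in only one of them."""
    # Keys present in both sources get a single combined record.
    merged = {
        key: {**left[key], **right[key]} for key in left.keys() & right.keys()
    }
    # Keys present in just one source point to gaps between the silos.
    only_left = sorted(left.keys() - right.keys())
    only_right = sorted(right.keys() - left.keys())
    return merged, only_left, only_right


merged, crm_only, orders_only = integrate(crm, orders)
print("combined records:", merged)
print("in CRM only:", crm_only)
print("in orders only:", orders_only)
```

The unmatched keys are often the more useful output. They show where one silo knows something the other doesn’t, which is exactly the kind of incomplete picture an AI tool would otherwise be working from.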

6. Do you have the right data?

Having the ability to collect data doesn’t necessarily make it usable or right for your big data project.  The parameters of the project’s purpose are what determine the data’s usability, so on the whole, the data you utilize must be relevant to the objective of your project.

Factors such as the volume and velocity of data must be taken into consideration, and depending on what you need to know, you may need to look at different types of data in structured, unstructured or semi-structured forms to get the right information. These could be gathered from a variety of sources and used in combination with one another.

It’s also critical to understand that the data your organization currently has stored may not be the right data to achieve the goals of the project. If this data has been sitting unused for any length of time, it may provide you with insights into past operations, successes or failures, which could be irrelevant. While this outdated information can be helpful for identifying historical trends and patterns, fresh data is typically needed for starting a big data project, and real-time data would most likely be preferred.

If you’re starting a big data project to address a brand new problem or issue for which there is no currently consistent dataset, you’ll need to identify the correlating datasets that will provide you with the necessary information and then roadmap how to get your hands on them.

Aligning a big data project’s objectives with the right data, or identifying the additional data that needs to be collected before it begins, is a crucial step in helping to achieve the project’s goals and ensuring its success.

7. Is the project feasible?

If the objective has been identified and validated and the available data is determined to be relevant and useful, then a big data project is feasible.  But even if it is, that doesn’t mean you should rush to implement it.

Before you start, you must ensure you have the tools, technologies and infrastructure in place to support the objective. If your current setup is lacking in any way and isn’t sufficiently positioned to help you achieve your project’s goals, you must look to develop a more apt solution.

Additionally, you’ll want to determine what techniques and methods will best align with your objective and update or revise yours as necessary.  By researching similar projects, you may be able to clearly identify what’s previously been used effectively for your type of issue and avoid a more difficult path to completion.  This variety of techniques and methods could include versions of:

  • Machine learning
  • Data science
  • Statistical methods
  • Data analysis
  • Programming
  • And more

You may also consider the feasibility of utilizing the cloud.  Important factors such as technical experience, flexibility, capacity, and cost will have an impact on this consideration, as will other cloud benefits such as:

  • No budget needed to gain the required hardware and software
  • No time wasted waiting for complete purchasing and deployment processes
  • No need for technically skilled professionals to deploy, integrate and oversee numerous elements of the system

8. What is it going to cost?

To state the obvious, cost is a major deciding factor in every aspect of business and this is no different in the big data arena.

Before you start a project, you need to understand what resources you need, and what they will cost.

The cost of developing and implementing big data management and analytics infrastructure to store colossal amounts of data should be accounted for and could be vastly different depending on your requirements.  There are many options on the market, ranging from platforms that store and distribute large datasets across hundreds of servers that operate in parallel, to traditional data warehouses where memory requirements and data types are dynamic.

Additionally, no matter if your company outsources or performs data science in house, you also need skilled professional big data engineers and data scientists to perform analysis and construct data pipelines for your big data project to be successful.  Assembling a data science team is an extensive process and these skilled professionals are not only expensive, but also in very high demand.

You could also adopt AI or machine learning technology to alleviate much of the manual burden, but again, the cost of implementation of these technologies must be captured and considered from the start.

9. How will this benefit my organization?

As early as possible, define the value your organization will get from the big data project.  Decision-makers faced with rising costs and dwindling budgets simply aren’t going to finance a big data project without stable business justification and a firm return on investment (ROI).

Depending on your unique objectives, value can be identified and defined in many different ways.  Here are some examples of how organizations in a few industries are benefiting from big data initiatives:

  • Marketing:
    Leveraging the big data obtained from social media sites to track user reviews, Tweets and Likes for customer insight and real-time analytics.
  • Retail:
    Capturing data within their stores, such as tracking physical shopping carts and using customer rewards cards to monitor shopping patterns.
  • Healthcare:
    Utilizing data stored in electronic health record systems, along with wearable medical devices and mobile health applications to generate a continuous flow of data that can be used to improve patient care.
  • Logistics:
    Utilizing big data to significantly improve overall insight and support with information collected from Global Positioning System (GPS) technology, customer data, electronic messages from suppliers and shippers, pallets and cases of goods, mobile devices, resource planning systems and social media sources.
  • Financial:
    Reducing fraud and making sure that the patterns ordinarily hidden within datasets can be uncovered, along with ensuring compliance and exposing nefarious activity, such as money laundering.

Of course, these aren’t the only industries that can take advantage of big data to help drive change and improvement.  Nearly every organization in nearly every industry can benefit from a big data initiative.

10. Will this work?

There is a lot to consider before starting a big data project and while there is an element of continual evaluation, the stakes can be high with relation to cost and benefits.  So it’s best to have some assurance that the choices you make will produce the desired results upon completion.

For a project to have worked, it must have:

  • Achieved the defined goals of the business objective
  • Provided measurable ROI

In the beginning, you typically would tie a cost-benefit analysis to the proposed project to measure ROI.  In the case of a big data project, however, there are usually non-measurable features at play.  As you sift through large volumes of data, the goal is to uncover insights that will allow you to make better business decisions, so it can be difficult to predict the value of what can be discovered.
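Because the benefit side is so hard to pin down, one option is to calculate ROI for several benefit scenarios rather than a single figure. Here is a minimal Python sketch using the common definition of ROI as (benefit - cost) / cost. The roi helper, the cost figure and the three benefit scenarios are illustrative assumptions, not numbers from any real project.

```python
def roi(total_benefit, total_cost):
    """Return ROI as a fraction: (benefit - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_benefit - total_cost) / total_cost


# Illustrative figures only; replace them with your own estimates.
estimated_cost = 200_000
benefit_scenarios = {"low": 150_000, "expected": 260_000, "high": 400_000}

# Compute ROI once per scenario so the range of outcomes is visible.
roi_by_scenario = {
    name: roi(benefit, estimated_cost)
    for name, benefit in benefit_scenarios.items()
}

for name, value in roi_by_scenario.items():
    print(f"{name}: {value:.0%}")
```

A spread of scenarios won’t remove the uncertainty, but it gives decision-makers a range to weigh instead of one number that implies more precision than you really have.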

With that said, here are a couple of things to keep in mind when attempting to predict whether or not the project will work:

  • Big data technologies tend to be highly scalable, so the ROI of your big data project shouldn’t be impacted by increased data volume.
  • Newer data management solutions and platforms can run on commodity hardware running open-source software, which makes them more cost effective than traditional database management systems.

What also can’t be overlooked is the process by which you will evaluate your methods and results.  You need to know up front how you will evaluate what you’ve done, along with the results you achieved, upon completion of the project.  This could include identifying:

  • The numbers, results and insights to be evaluated.
  • If the project was completed within scope of the parameters as originally identified.
  • If the analysis and project were done correctly, and what the definition of “correctly” is.
  • If there are key pieces of the project, or sequential checkpoints, that will tell you when and where achievements or failures occurred.

Guidance for Your Big Data Journey

With the immense growth of data that is being captured through various sources, many organizations in many industries are searching for tangible benefits that they can harvest from big data. Their interest is not surprising, given that when utilized appropriately, big data can help improve and streamline operations and provide insights into consumer purchasing habits, as well as other benefits that are simply too important to ignore.

Big data analytics projects aren’t one-size-fits-all solutions. Each organization will have its own set of unique goals, objectives and use cases and should accordingly expect to have its own unique results.

If you’re on the brink of kicking off a big data project, but still have some questions, Contact Us today and we’ll show you how Enterprise Integration’s Big Data Consulting services can provide you with the guidance you need to achieve the objectives of your big data project.

We’ll provide you with on-demand access to industry-leading experience and expertise that will help you identify exactly what you need to gain the critical insights that are currently eluding you.
