This is the second part in an ongoing series, Assessing Analytical Maturity – Are Your Analytical Efforts Viable? If you are new to this series, we recommend reading the first part here.
Picking up from where we left off, it is time to get our hands dirty and dive deep into the dimensions that must be examined for an effective assessment of analytical maturity. Let’s start with Data, shall we?
The phrase “Data is the new oil” is credited to Clive Humby, who was instrumental in what is possibly the industry-defining move of launching the Tesco Clubcard back in 1994, and is now shorthand for expressing the power of data. You can read the case study here.
Noted author and keynote speaker Bernard Marr would argue otherwise. We will not contest the phrase, as both sides agree that data is precious and forms the foundational brick on which the massive structures of analytical organizations are built.
So, what is data? Why does it play a role in assessing the analytical maturity of an organization? And how do we understand its significance better?
Data is everything. Every fact, figure, or piece of information that is stored for reference or analysis is data. Every site you visit, every loaf of bread you buy, every contact you add to your phone, every drop of rain and every ray of sunshine that is recorded is data. Whether it is structured, stored neatly in rows and columns that are easily accessible and searchable, or unstructured, in the form of tweets, comments, images, and videos, data is everywhere. As analysts and data scientists, this is where it all begins.
The Seagate Data Age 2025 study predicts that worldwide data creation will snowball to 163 zettabytes (ZB) by 2025. Examples of this growth abound: the 150 billion hours of heart-rate data collected through Fitbit and the nearly 20 billion IoT devices expected to exist by 2020 will only fuel the Big Data boom.
It is, however, the challenges that come with this growth that interest us. Whether you are an analyst solving problems or a senior executive who consumes the analytics, you would agree that a large share of time goes into discovering and preparing data for analysis; it would not be a stretch to say that about 80% of an analyst’s time is spent cleaning and preparing data. An additional challenge is data management: with most organizations still relying on a traditional data strategy and infrastructure, the requisite tools and technology are not in place to scale quickly and meet ever-increasing demand. An inadequate data strategy framework leads, in turn, to the critical problem of data security. When employees have access to data they should not, or lack access to data they need, the ramifications can be severe; the recent Cambridge Analytica incident comes to mind. Read the case study here. And with the issues above comes the final problem of data usage itself: conservative estimates suggest that 90% of data is unstructured, with only 1% of it ever used, and the remaining 10% (structured data) does not improve the picture much, with only about 50% of it used for analytics.
The one overarching response to these challenges is a good data strategy framework. Whether the emphasis is offensive (using data to drive better decisions, build relationships, and market better) or defensive (enabling systems for better governance, security, and theft prevention), the focus will vary across organizations and across individual functions within them.
It is the components of the data strategy framework, and an organization’s efforts towards them, that separate the laggards from the leaders.
With an executive sponsor, a well-defined governance framework, and a single accepted source of truth documenting each data element, a good data strategy ensures that the right data elements, reliable and of good quality, are available for developing and implementing analytical solutions to business problems. Business metadata (Where does the data fit in with the business? What does this data describe?), technical metadata (What is the format and structure of the data? What is its granularity?), and operational metadata (Where did the data come from? Who is the data owner?) ensure that all data sources and repositories are connected to each other and that employees are granted access in accordance with their roles, thereby maintaining data integrity and complying with regulatory laws.
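To make the three metadata layers concrete, here is a minimal sketch of what a single data-catalogue entry might record, along with a role-based access check. All field names, the example dataset, and the roles are illustrative assumptions, not part of any specific catalogue tool.

```python
from dataclasses import dataclass, field

# A minimal sketch of a data-catalogue entry capturing the three metadata
# layers discussed above (business, technical, operational). All names and
# values below are hypothetical, for illustration only.

@dataclass
class DataElement:
    name: str
    # Business metadata: where the data fits in and what it describes
    business_context: str
    description: str
    # Technical metadata: format, structure, and granularity
    data_format: str
    granularity: str
    # Operational metadata: lineage and ownership
    source_system: str
    owner: str
    # Role-based access list, used to maintain integrity and compliance
    allowed_roles: list = field(default_factory=list)

    def accessible_by(self, role: str) -> bool:
        """Check whether a given role may access this element."""
        return role in self.allowed_roles

# Hypothetical entry for a retail transactions table
txn = DataElement(
    name="store_transactions",
    business_context="Retail sales reporting",
    description="One row per item scanned at checkout",
    data_format="Tabular, rows and columns",
    granularity="Line item per transaction",
    source_system="Point-of-sale system",
    owner="Sales Operations",
    allowed_roles=["analyst", "data_engineer"],
)

print(txn.accessible_by("analyst"))  # role present in the access list
print(txn.accessible_by("intern"))   # role not granted access
```

Documenting each element this way gives the "single accepted source of truth" a concrete shape: every question in the parentheses above maps to a field, and access checks become a simple lookup.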
Clean input data fit for analytical purposes, with a defined scope and process for modifying ingestion and clean-up, ensures that data preparation is not a limiting factor in harnessing the power of analytics. A constant view, and a process to acquire second- and third-party data elements in addition to the first-party data on hand, yields far greater insights. A robust data strategy framework involves starting with the business problem, understanding the data requirements, identifying possible sources of data, using data catalogues to identify data assets and analyze gaps, and finally proposing a strategy to bridge data gaps and sustain data assets.
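The gap-analysis step in that sequence can be sketched in a few lines: compare the data elements a business problem requires against the assets the catalogue already holds, and surface what must be acquired from second- or third-party sources. The function name and the sample inputs are illustrative assumptions.

```python
# A minimal sketch of the gap-analysis step described above: start from the
# data a business problem requires, check the catalogue of known data
# assets, and surface what is missing. Names and inputs are hypothetical.

def analyze_gaps(required_elements, catalogued_assets):
    """Return (available, missing) data elements for a business problem."""
    catalogue = set(catalogued_assets)
    available = [e for e in required_elements if e in catalogue]
    missing = [e for e in required_elements if e not in catalogue]
    return available, missing

# Hypothetical requirements for a customer-churn problem
required = ["transactions", "customer_profile", "support_tickets"]
# Hypothetical first-party assets already in the data catalogue
catalogue = ["transactions", "customer_profile", "web_clickstream"]

available, missing = analyze_gaps(required, catalogue)
print("Available in catalogue:", available)
print("Bridge via second/third-party data:", missing)
```

In practice this comparison happens inside a data-catalogue tool rather than ad hoc code, but the logic is the same: the "strategy to bridge data gaps" is built from the missing list.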
Data infrastructure and its implications will be covered in more detail in the upcoming parts. To provide a bird’s-eye view, however, it covers the full lifecycle of data, from the point at which it originates to the point where it is consumed by the business.
To summarize, data forms the basis for all that we wish to achieve in the analytics space. A good data strategy framework, one that ensures data availability and builds in the security measures needed to maintain its integrity, requires nothing less than a dedicated effort from the executive sponsor and the organization itself.
Data is only the first dimension we intend to cover in this series. Stay tuned for the upcoming parts wherein we will look to strengthen this foundational brick.