Top 4 Real World Data Challenges (And Fixes) in the CPG Industry

Predictive and prescriptive CPG analytics are helping executives fish in vast data lakes and make the best catch: insights that truly move the needle of their business. However, is it as simple as it sounds? In one word: no. Apart from having to swerve around multiple organizational potholes, such as inadequate data governance, inadequate infrastructure and fragmented teams, a series of data challenges continues to disrupt the implementation of data analytics in the CPG industry.

Let’s take a look at common questions around data challenges that leave business leaders in a fix and directions to mitigate such challenges, in the context of the CPG industry.

#1 “What data should I collect?”

There is no doubt that in the search for data, we have vast data lakes to fish in, but in the quest for the “right data”, a majority of data-driven organizations still falter. Manually sifting through data makes it nearly impossible to gain real-time insights into current business scenarios, while relying on outdated data can have a significant negative impact on decision-making.

Corrective approach: The resolution to this challenge starts with an upside-down approach in which an enterprise first sets an outcome and defines a comprehensive set of criteria for the success of that outcome. Next, as Ronald Coase observed, it is time to “Torture the data, and it will confess to anything” by subjecting the available data to hypothesis testing and factor analysis. For instance, TheMathCompany helped an international Beverage & Brewing company optimize marketing spend by leveraging factor analysis to filter out attributes such as price effect, quantity effect, merchandise effect, mixed effect, and discount bucket, among others. These data were used to isolate overlaps in the type and timeline of promotions. This helped the CPG enterprise optimize its marketing spend and achieve annual savings of almost $300 million. Give this case study a read to get more insights on how the problem was solved.

Based on the relevant data identified by the aforementioned processes, analysts must then create an elaborate roadmap for gathering that data. This roadmap will generally list initiatives such as rolling out surveys, exploring ways to extract and store data from PoS systems, or providing forms for sales personnel to fill out while making a sale, among others.
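To make the hypothesis-testing step of the corrective approach above more concrete, here is a minimal sketch in Python. It uses a simple permutation test, chosen here only because it needs nothing beyond the standard library; it is not the method used in the case study. The function name `permutation_test` and the weekly sales figures are hypothetical and invented purely for illustration.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the share of random relabelings whose absolute mean difference
    is at least as large as the observed one (an approximate p-value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical weekly unit sales, invented for illustration only.
promo_weeks = [120, 135, 128, 140, 132, 138]
regular_weeks = [100, 98, 105, 110, 102, 99]

p_value = permutation_test(promo_weeks, regular_weeks)
print(f"approximate p-value: {p_value:.4f}")
```

A small p-value suggests that the attribute being tested (here, whether a promotion ran) is worth keeping on the list of data to collect; a large one suggests it may not separate outcomes well enough to justify the collection effort.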

#2 “We have some data but it is spread across systems that are not interconnected and owned by disconnected teams”

As is typical of the CPG industry, data is gathered from internal and external sources such as subsidiaries, channel partners, and data aggregators. Besides overwhelming the data repository, this creates a major issue around data contextualization, which continues to be an inherent pain point for the CPG industry and naturally contributes to the prevalence of this data challenge.

Corrective approach: A potential resolution to this data challenge starts with identifying and delineating the problems characteristically faced by each team that can use the data. This is followed by understanding the data requirements for those problems, in order of their priority.

The next step, and one of the most critical from an analytics standpoint, is to conduct a data maturity assessment on the available repository to filter datasets based on their availability, quality, and ease of extraction and transfer. The outcome of this assessment lays the foundation for mapping out data requirements and confirming their feasibility. The business implications of the problem and the convenience of extracting data are two factors that must be prioritized while chalking out data requirements. Above all, however, data infrastructure can make or break the success of this approach.
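The data maturity assessment described above can be sketched as a simple scoring exercise. The snippet below is only an illustration under stated assumptions: the 1-to-5 rating scale, the weights, the threshold, the dataset names, and the helpers `maturity_score` and `shortlist` are all hypothetical, and a real assessment would set them with the teams that own and use the data.

```python
from dataclasses import dataclass


@dataclass
class DatasetAssessment:
    name: str
    availability: int        # 1 (poor) to 5 (excellent)
    quality: int             # 1 (poor) to 5 (excellent)
    ease_of_extraction: int  # 1 (poor) to 5 (excellent)


# Hypothetical weights; they sum to 1 so scores stay on the 1-5 scale.
WEIGHTS = {"availability": 0.4, "quality": 0.4, "ease_of_extraction": 0.2}


def maturity_score(dataset):
    """Weighted average of the three assessment criteria."""
    return (
        WEIGHTS["availability"] * dataset.availability
        + WEIGHTS["quality"] * dataset.quality
        + WEIGHTS["ease_of_extraction"] * dataset.ease_of_extraction
    )


def shortlist(datasets, threshold=3.5):
    """Names of datasets scoring at or above the threshold, best first."""
    scored = sorted(
        ((maturity_score(d), d.name) for d in datasets), reverse=True
    )
    return [name for score, name in scored if score >= threshold]


# Hypothetical datasets and ratings, invented for illustration only.
datasets = [
    DatasetAssessment("pos_sales", 5, 4, 4),
    DatasetAssessment("distributor_reports", 3, 2, 2),
    DatasetAssessment("syndicated_panel", 4, 4, 3),
]

print(shortlist(datasets))
```

Keeping the weights in one place makes the prioritization explicit and easy to revisit when business implications or extraction costs change.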

To understand how this challenge is tackled, check out this case study, which documents how TheMathCompany helped a large beverage company optimize its plant performance and centralize its tracking process to ease access to the right data for different stakeholders.

#3 “We’ve invested 3 years and millions of dollars in building this data warehouse and I still don’t see any RoI coming from this”

A common yet critical data challenge in the CPG industry is the failure to rake in the desired ROI from building a data warehouse. Despite an exhaustive inventory of data, the efficiency and speed of the analytics process plummet. This leads enterprises to believe it is prudent to hold less data than the warehouse’s storage capacity allows and to limit the number of people with data access. On the other hand, enterprises that can afford it continue to invest in purchasing more databases. This makes the data analysis process more expensive, delivering a direct hit to the enterprise’s ROI from the data warehouse.

Corrective Approach: A Focus on the Why, by Who, and How, Lays the Podium of a Fortified Data Warehouse

Building and implementing a data warehouse should be rooted in two principles:

- One, a data warehouse cannot function like a typical appliance that performs its designated function at the flick of a switch. Rather, it is a process that continuously scales up to dynamic data requirements from the perspectives of both quality and quantity.

- Two, cracking the enigma of which came first, the chicken or the egg. In this case, the question is which should come first: the data or the data requirements? Put simply, data should be procured based on the present requirements of the business.


The second principle is not far off from the necessity to focus on the “why” to build and utilize a data warehouse. Typically, a data warehouse is driven by the information it is fed. It is essential for an enterprise to align this information with its business need. Mapping out key business concepts is essential before the data warehouse is deployed.


It is prudent for enterprises to rope in business stakeholders who are already struggling to get data together from different sources and owners. Trusting such stakeholders to map out the above-mentioned business concepts in an efficient and timely manner is not a pure gamble. It is also recommended to partner with veterans in the enterprise, who often hold the keys to things that have important business value.


The data realm is a motley of components that inflate data warehouse structures with every passing day. This is a cue for enterprises to bid goodbye to a traditional waterfall process and embrace agility in their data warehouse structure, literally.

Dimensional modeling allows agile development of a data warehouse structure that is well aligned to business needs. One way to practice it is model storming, which, in a nutshell, involves brainstorming a possible model with data engineers and creating a prototype that is then tested by the analytics team. The model is then implemented by data engineers, after which the analytics team issues a final acceptance upon testing the data.
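As a rough illustration of what a dimensional model looks like once prototyped, the sketch below builds a tiny star schema (one fact table surrounded by dimension tables) in an in-memory SQLite database. The table names, columns, brands, and figures are hypothetical and chosen only for illustration; a real model would come out of the brainstorming sessions with business stakeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table (measurable events) keyed to two dimension tables (context).
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    brand       TEXT,
    category    TEXT
);
CREATE TABLE dim_date (
    date_key      INTEGER PRIMARY KEY,
    calendar_date TEXT,
    month         TEXT
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    units_sold  INTEGER,
    revenue     REAL
);
""")

# Hypothetical rows, invented for illustration only.
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "BrandA", "Beverage"), (2, "BrandB", "Snacks")],
)
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(20240101, "2024-01-01", "2024-01"), (20240201, "2024-02-01", "2024-02")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(1, 20240101, 10, 25.0), (1, 20240201, 12, 30.0), (2, 20240101, 5, 10.0)],
)

# A business question answered by joining the fact table to a dimension.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(revenue_by_category)
```

Because each business concept lives in its own dimension table, a prototype like this can be extended one dimension at a time as new requirements surface, which is what makes the approach a good fit for agile delivery.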

#4 “We can give you a data extract once but doing it repeatedly is difficult”

This data challenge traces its origin to two problem scenarios. First, data is scattered across the organization, which makes it almost impossible to get quick access to data when required. Despite the immense prospects of analytics, this continues to be the Achilles heel of the CPG industry. Second, access to consolidated data stores, and the mechanisms to extract data from them, are limited. This extraction problem can occur in numerous scenarios. For instance, inadequate data warehousing infrastructure can render the tasks of filtering and exporting data extremely difficult. Similarly, periodic UI upgrades change the structure of websites frequently, which causes further difficulties in exporting data from them.

Corrective Approach: The first scenario closely resembles the second data challenge discussed in this article. As discussed earlier, the resolution begins with aligning data procurement to present and future business requirements. To execute this thought process, it is imperative that an organization places its trust in the dark horses of analytics. In this digital era, they are known as analytics translators, who bridge the gap between the expectations set by the business team and the feasibility projected by the analytics team. To know more about their role in today’s business world, read this article.

For the second scenario, identifying and procuring an appropriate ETL tool for the data pipeline is key to resolving the data challenge. The tool is expected to have the capacity to extract, transform and load data from different marketing platforms into different destinations, such as a BI tool or a data warehouse.
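To show what the extract, transform and load steps amount to, here is a minimal sketch using only the Python standard library. It is not a substitute for a dedicated ETL tool; the CSV export, the `campaign_spend` table, and the function names are hypothetical and invented for illustration. The load step overwrites rows with the same campaign and date, so the pipeline can be re-run without duplicating data.

```python
import csv
import io
import sqlite3


def extract(csv_text):
    """Read raw rows from a CSV export (here passed in as text)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Normalize campaign names and drop rows with no spend value."""
    cleaned = []
    for row in rows:
        spend = row["spend"].strip()
        if not spend:
            continue  # skip incomplete records
        cleaned.append(
            (row["campaign"].strip().lower(), row["date"].strip(), float(spend))
        )
    return cleaned


def load(conn, records):
    """Write records to the destination, replacing rows with the same key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS campaign_spend ("
        "campaign TEXT, date TEXT, spend REAL, "
        "PRIMARY KEY (campaign, date))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO campaign_spend VALUES (?, ?, ?)", records
    )
    conn.commit()


def run_pipeline(conn, csv_text):
    load(conn, transform(extract(csv_text)))


# Hypothetical export from a marketing platform, invented for illustration.
RAW_EXPORT = """campaign,date,spend
Summer Promo,2024-06-01,1200.50
summer promo ,2024-06-02,980
Winter Promo,2024-06-01,
"""

conn = sqlite3.connect(":memory:")
run_pipeline(conn, RAW_EXPORT)
run_pipeline(conn, RAW_EXPORT)  # a second run does not duplicate rows

row_count = conn.execute("SELECT COUNT(*) FROM campaign_spend").fetchone()[0]
total_spend = conn.execute("SELECT SUM(spend) FROM campaign_spend").fetchone()[0]
print(row_count, total_spend)
```

Keeping the three stages as separate functions means the source or the destination can be swapped independently, which is the same separation a commercial ETL tool provides at scale.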

Taking things a step further, in the current era time is more valuable than money, and this necessitates tools that remove the need to code and maintain data pipelines for extracting and loading data. This is particularly relevant for enterprises that use third-party products such as Zendesk and Stripe, among others, to collect data.

At the ground level, it is essential for analytics teams to document the required scripts or set up a process so that data can be fed to the solution on a regular cadence in an automated manner, and so that a model or set of analysis steps can be run as part of the same automated process. This is relevant in view of customers’ shifting preference for complete control and ownership over procured data solutions.

This is, in fact, the basic ideology that drives hybrid consulting firms that develop platforms to build custom AI assets, or what we call contextual AI. As the name suggests, contextual AI tools can be aligned to a CPG enterprise’s business needs to tackle roadblocks ranging from data challenges to driving analytical transformation, across data extraction, feature engineering, algorithm development, and performance monitoring, among others, all contextualized down to the little details that collectively move the needle of your business.

Partner, TheMathCompany

Sandeep K

