A revealing report by McKinsey last fall pinpointed two major hurdles—data quality and availability—that significantly restrict AI implementation in organizations.
The Data Dilemma: Labeled Data and Data Scarcity
Many companies struggle to source high-quality, properly labeled data, which is essential for training effective machine learning models. When raw data arrives uncategorized, labeling it becomes a painstaking bottleneck that delays projects or derails them altogether. As Anand Rao, a global AI leader at PricewaterhouseCoopers, explains, “Companies often lack the labeled data needed for model building, which hampers their efforts from the start.”
Consider the National Audubon Society, which employs AI to protect endangered bird species. In a recent climate impact study, AI helped predict how 38 grassland bird species might fare under climate change, revealing that if no action is taken, 42% of them (about 16 species) could become highly vulnerable. Yet not all of the society's AI initiatives have run smoothly, as its attempt to count pelicans along the coast demonstrated. The team relied on drone images but lacked enough labeled data to train accurate models, especially for species such as black skimmers, which are seldom photographed from above.
The One-Sided Nature of Training Data
Another challenge is skewed training data. Fritz Labs’ effort to develop a real-time hair color-changing feature uncovered a serious bias: the dataset lacked images representing diverse ethnic groups. Jameson Toole, the company's CTO, notes, “Building a comprehensive, representative dataset takes time and effort; skipping this step introduces harmful biases.”
A recent PwC survey found that more than half of companies lack a formal process for detecting bias in their AI data, and only about a quarter prioritize ethical considerations. Together, the figures point to a widespread gap in responsible AI development.
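One simple form a bias check can take is a representation audit: count how often each group appears in the training labels and flag any group that falls below a minimum share. The sketch below is only an illustration of that idea, not the process used by Fritz Labs or described by PwC. The function name, the group names, the sample counts, and the 10% threshold are all assumptions made up for the example.

```python
from collections import Counter


def underrepresented_groups(labels, expected_groups, min_share=0.10):
    """Return {group: share} for every expected group below min_share.

    Groups that never appear in `labels` get a share of 0.0, so a group
    that is missing entirely is flagged as well.
    """
    counts = Counter(labels)
    total = len(labels)
    shares = {
        group: (counts[group] / total if total else 0.0)
        for group in expected_groups
    }
    return {group: share for group, share in shares.items() if share < min_share}


# Hypothetical annotation labels, invented for illustration only.
sample_labels = ["group_a"] * 80 + ["group_b"] * 15 + ["group_c"] * 5
expected = ["group_a", "group_b", "group_c", "group_d"]

flagged = underrepresented_groups(sample_labels, expected, min_share=0.10)
print(flagged)
```

A check like this catches only imbalance in the attributes that are actually annotated. Deciding which groups belong in the expected list still requires human judgment.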
Overcoming Data Integration and Overload
Sometimes the issue isn’t too little data but too much of it, scattered across disparate silos that prevent a holistic view. One global bank regrets not integrating its customer data earlier, an omission that left it with fragmented customer views and subpar marketing strategies. It now aims to unify its online, mobile, and in-branch data sources, although many challenges remain. As the bank’s data lead admits, “Siloed data is one of our biggest hurdles, yet full integration is still a work in progress.”
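To make the idea of a unified customer view concrete, the sketch below merges records from several channels into one profile per customer. It assumes every silo shares a common `customer_id` key, which is often the hard part in practice. The channel names, field names, and sample records are hypothetical and are not taken from the bank described above.

```python
from collections import defaultdict


def unify_customer_views(**channels):
    """Merge per-channel record lists into one profile per customer.

    Each keyword argument is a channel name mapped to a list of dicts,
    and every dict must carry a "customer_id" key (an assumption here).
    """
    unified = defaultdict(dict)
    for channel, records in channels.items():
        for record in records:
            customer_id = record["customer_id"]
            # Keep each channel's fields under its own key so that
            # overlapping field names from different silos never collide.
            unified[customer_id][channel] = {
                key: value for key, value in record.items() if key != "customer_id"
            }
    return dict(unified)


# Hypothetical sample data, invented for illustration only.
profiles = unify_customer_views(
    online=[{"customer_id": 1, "visits": 3}],
    mobile=[{"customer_id": 1, "logins": 7}, {"customer_id": 2, "logins": 1}],
    branch=[{"customer_id": 2, "appointments": 1}],
)
print(profiles[1])
```

Customers who appear in only some channels still get a profile, so gaps in coverage stay visible rather than being silently dropped.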
The Drift of Data in Real-Time AI Systems
Training models on static historical data often leads to underwhelming real-world results. Andreas Braun of Accenture emphasizes that models trained solely on past data struggle to adapt to live, dynamic environments, such as fraud detection, where behavioral patterns evolve rapidly. “Incorporating live data into models is a game changer,” he says, enabling more accurate and timely insights.
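One common way to notice that live data has drifted away from the training data is the Population Stability Index (PSI), which compares how a feature is distributed across bins in the two datasets. The sketch below is a minimal standard-library version under stated assumptions; it is not the method Accenture describes. The function name, the bin count, the small floor used to avoid taking the logarithm of zero, and the sample values are all illustrative choices.

```python
import math


def population_stability_index(baseline, live, bins=10):
    """Compute PSI between a baseline sample and a live sample.

    Bin edges come from the baseline range; live values outside that
    range are clipped into the first or last bin.
    """
    low, high = min(baseline), max(baseline)
    width = (high - low) / bins or 1.0  # fall back if all values are equal

    def bin_shares(values):
        counts = [0] * bins
        for value in values:
            index = int((value - low) / width)
            counts[min(max(index, 0), bins - 1)] += 1
        # Floor each share at a tiny value so log() never sees zero.
        return [max(count / len(values), 1e-6) for count in counts]

    expected = bin_shares(baseline)
    actual = bin_shares(live)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


# Hypothetical feature values, invented for illustration only.
training_values = [i % 100 for i in range(1000)]    # spread over 0..99
shifted_values = [50 + i % 50 for i in range(1000)]  # only 50..99

psi_same = population_stability_index(training_values, training_values)
psi_shifted = population_stability_index(training_values, shifted_values)
print(psi_same, psi_shifted)
```

A widely cited rule of thumb treats a PSI above roughly 0.25 as a significant shift. That threshold is a convention rather than a guarantee, and a drift alert is a prompt to investigate or retrain, not proof that the model has failed.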
The Untapped Goldmine: Unstructured Data
A Deloitte survey highlighted that a staggering 62% of companies mostly rely on spreadsheets, with only 18% harnessing unstructured data—think images, audio, or social media comments—for analytics. Ben Stiller notes that leveraging unstructured data can boost goal attainment by 24%, yet most organizations leave this rich resource untouched.
For example, Mr. Cooper manages around 1.5 billion unstructured customer documents. By applying machine learning to analyze the most frequently accessed files, they streamlined customer support, cut down inquiry times, and paved the way for smarter future AI solutions.
Cultural and Organizational Barriers
Technical hurdles aside, company culture and leadership play crucial roles. Bridging the gap between business teams and AI developers ensures context-rich, relevant solutions. Sridhar Sharma advocates involving subject matter experts early to avoid misaligned efforts. Moreover, senior management support correlates strongly with success—Deloitte reports projects sponsored by CEOs are 77% more likely to meet their goals.
Main resources used:
McKinsey & Company, 2024 report on AI data challenges
PricewaterhouseCoopers, survey on AI bias and ethics
National Audubon Society climate impact analysis, July 2024
Fritz Labs, internal project reports on facial feature detection
Deloitte Consulting, 2024 survey on unstructured data
Accenture Europe, insights on data drift and real-time AI challenges
Industry interviews and case studies from leading organizations
The Biggest Data Obstacles Holding Back AI Breakthroughs
Vilislava Dimbareva