Why Unstructured Data Is Holding Back GenAI Adoption
Discover how unstructured data challenges GenAI adoption in logistics and supply chain, and learn strategies to overcome these obstacles.

According to a recent survey sponsored by AWS and MIT, 80% of the 334 Chief Data Officers (CDOs) and data leaders interviewed believe that generative AI (GenAI) will eventually transform their organization's business environment.
Why, then, have only 6% of the CDOs surveyed adopted generative AI applications in production deployment? Why are so many still circling the GenAI runway? Are they unable or unwilling to bridge this gap between recognizing GenAI's potential and actually implementing it?
For many CDOs, the reason is a frustrating four-letter word: DATA. Specifically, unstructured data.
What is Unstructured Data?
Unstructured data is information that lacks a predefined format, and it presents a significant challenge for the freight, logistics, and supply chain industry. Unlike a spreadsheet's neatly organized rows and columns, 85% of the information flowing through the global supply chain every day is free-form.
Traditional databases struggle to process information that lacks a rigid format, putting the handbrakes on valuable insights and a holistic view of the supply chain.
Unstructured data in the supply chain includes:
- Textual Data: Emails, internal communication platforms, customer feedback forms, and even handwritten notes on physical documents
- Multimedia Content: Images of damaged goods, audio recordings of dispatch calls, and even videos captured by security cameras at warehouses
Unlike structured data, which is neatly organized in rows and columns, unstructured data is more complex and harder for databases to process. Understanding the difference is the bedrock of a modern data management strategy.
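To make the difference concrete, here is a minimal Python sketch. It is illustrative only: the email text, the `SH-` shipment ID format, and the `ShipmentRow` fields are all hypothetical and not taken from any real system. It shows the same facts once as a structured row and once buried in free text, where they have to be extracted before a database can use them.

```python
import re
from dataclasses import dataclass
from typing import Optional


# Structured: every field has a fixed name and type, so a database can query it directly.
@dataclass
class ShipmentRow:
    shipment_id: str
    status: str
    eta: str  # ISO date, e.g. "2024-07-19"


# Unstructured: the same facts inside a free-text email (hypothetical example).
EMAIL = (
    "Hi team, quick update: shipment SH-10423 is delayed at the port. "
    "New ETA is 2024-07-19. Photos of the damaged pallet attached."
)


def extract_shipment(text: str) -> Optional[ShipmentRow]:
    """Pull a structured row out of free text; return None if a field is missing."""
    shipment = re.search(r"\bSH-\d+\b", text)       # assumed ID format
    eta = re.search(r"\b\d{4}-\d{2}-\d{2}\b", text)  # ISO-style date
    if not (shipment and eta):
        return None
    status = "delayed" if "delayed" in text.lower() else "unknown"
    return ShipmentRow(shipment.group(), status, eta.group())


row = extract_shipment(EMAIL)
print(row)
```

Simple patterns like these break as soon as the wording or ID format changes, which is one reason free-form text is harder to process than rows and columns.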
What's Putting the Handbrakes on GenAI Adoption?
While excitement for GenAI runs high, many companies have yet to start preparing their data for it. Jeff McMillan, Chief Data Officer at Morgan Stanley Wealth Management, articulates this challenge succinctly:
"These large language models [GenAI] do not solve the problem of disparate data sources. Companies need to address data integration and mastering before attempting to access data with generative AI."
The AWS/MIT survey mirrors this concern, with over 46% of respondents identifying "data quality" as the biggest barrier to realizing GenAI's potential.
Preparing the Data Foundation for GenAI
The gap between data-prepared and unprepared companies will only widen as GenAI advances, creating a chasm in operational efficiency and innovation. Here's how some companies are bridging the gap:
- Data Integration/Cleaning: 25% of surveyed organizations are cleaning or integrating datasets for GenAI
- Data Surveying: 18% are surveying their data to identify potential GenAI use cases
- Document/Text Curation: 17% are curating documents or text for domain-specific GenAI models
Build Your GenAI Strategy on Data Concrete, not Quicksand
Across industries, a common theme emerges: establishing a robust data foundation before implementing GenAI. Walid Mehanna, Group Chief Data and AI Officer at Merck Group, captures this approach perfectly: "If we want to do AI, we need to build it on concrete, not quicksand. We are getting the process and data supply in good shape."
This aligns with the views of data leaders. A survey reveals near-unanimous agreement (93%) that a modern data management strategy is crucial to unlocking GenAI's value.
Yet, despite its potential as an organization's most valuable asset, unstructured data faces an investment gap. An IDC report highlights this disconnect: Only 44% of companies can readily justify spending on managing unstructured data. Why is that?
The Challenge of Unstructured Data
This disconnect between potential and implementation stems from the inherent challenges of managing and mining unstructured data:
- Quantifying ROI: The indirect and long-term benefits of unstructured data management make ROI calculations tricky
- Volume and Complexity: The sheer amount and complexity of unstructured data can be overwhelming, leading companies to prioritize structured data initiatives
- Investment Gap: Unlike structured data stored in organized databases, unstructured data remains underfunded, particularly when it comes to data management strategies
Unlike the well-managed world of spreadsheet data, unstructured data remains largely neglected by traditional data management strategies. What's causing this underinvestment?
Here's Why Structured Data Gets Prioritized
- Organization: Structured data fits neatly into databases, while unstructured data lacks a clear format. (Think of filing spreadsheets vs. a free-text email containing voice notes, scans, and PDF attachments.)
- Analysis: Structured data is analysis-ready, while unstructured data requires specialized tools to unlock its value
- Storage: Structured data thrives in data warehouses, while unstructured data needs flexible solutions like data lakes
- Decision-Making: Structured data facilitates quick decisions, while unstructured data requires processing but offers potentially hidden insights
Overcoming the Challenge of Unstructured Data
As generative AI continues to evolve and demonstrate its potential, forward-thinking companies recognize the need to overcome these hurdles and unlock the full value of their unstructured data assets. For freight, logistics, and supply chain companies, this means focusing on four key areas:
- Data quality: Ensuring accuracy, recency, and relevance of your unstructured data
- Data integration: Combining disparate data sources into a cohesive whole
- Domain-specific curation: Preparing data for industry-specific AI models
- Robust data governance: Implementing standards and processes for data management
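As a rough illustration of the first three areas, the sketch below merges documents from several sources, drops near-verbatim duplicates, and keeps only recent, relevant text. Everything in it is an assumption made for the example: the `Document` fields, the source labels, the 90-day recency window, and the keyword list are hypothetical, and keyword matching is only a stand-in for real relevance scoring. Governance (who owns these rules and how they change) is out of scope here.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List


@dataclass
class Document:
    doc_id: str
    source: str   # hypothetical source label, e.g. "email" or "tms_export"
    updated: date
    text: str


def is_recent(doc: Document, today: date, max_age_days: int = 90) -> bool:
    """Recency check: the document was updated within the allowed window."""
    return today - doc.updated <= timedelta(days=max_age_days)


def is_relevant(doc: Document, keywords: Iterable[str]) -> bool:
    """Crude relevance check: at least one domain keyword appears in the text."""
    lowered = doc.text.lower()
    return any(keyword in lowered for keyword in keywords)


def curate(docs: Iterable[Document], today: date, keywords: List[str],
           max_age_days: int = 90) -> List[Document]:
    """Combine documents from all sources, skip duplicates, keep recent and relevant ones."""
    seen = set()
    kept = []
    for doc in docs:
        key = " ".join(doc.text.lower().split())  # normalize case and whitespace
        if key in seen:
            continue  # same content already seen from another source
        seen.add(key)
        if is_recent(doc, today, max_age_days) and is_relevant(doc, keywords):
            kept.append(doc)
    return kept


docs = [
    Document("d1", "email", date(2024, 6, 20), "Container delayed at port of Rotterdam"),
    Document("d2", "tms_export", date(2024, 6, 21), "container delayed at  port of rotterdam"),
    Document("d3", "email", date(2023, 1, 5), "Container released from customs"),
    Document("d4", "email", date(2024, 6, 25), "Lunch menu for Friday"),
]
curated = curate(docs, today=date(2024, 7, 1), keywords=["container", "shipment"])
print([doc.doc_id for doc in curated])
```

In this toy run only `d1` survives: `d2` duplicates it, `d3` is too old, and `d4` is off-topic.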
Stargo: Building a Concrete Foundation for Your GenAI Journey
At Stargo, we understand the unique challenges facing the freight, logistics, and supply chain industries. Our domain-specific generative AI solutions are designed to help you unlock the full potential of your unstructured data.
By leveraging an advanced proprietary architecture that combines large language models (LLMs), natural language processing (NLP), and machine learning (ML), Stargo can help you:
- Extract valuable insights from emails, documents, and customer communications
- Optimize routing and inventory management based on real-time data
- Identify inefficiencies and bottlenecks in your operations
According to the AWS/MIT survey, the time to transform data management strategies for supply chain management is now. Schedule a demo today. Our solutions can deliver ROI within 12 weeks.