According to a new report by nRoad, analysts predict the global datasphere will grow to 163 zettabytes by 2025, and about 80% of that will be unstructured. In regulated industries, such as financial services, the challenges posed by unstructured data are exponentially higher. It is estimated that two-thirds of financial data is hidden in content sources that are not readily transparent. With unstructured data growing at an unprecedented rate, financial services firms are finding it difficult to harness data and derive actionable insights.
Through extensive research, nRoad discovered that volume, velocity, variability and variety exacerbate the challenge. Unstructured data that lack metadata, such as field names, proliferate at increasing rates every year. Most of an organization’s unstructured data is in the form of documents, including customer communication, and the content of those documents differs substantially, not just from domain to domain but between specific use cases within a field.
Current approaches, from Robotic Process Automation (RPA) to Natural Language Processing (NLP) models that use deep learning to produce human-like text, remain unfeasibly resource-intensive and too generalized to address the totality of niche problems in the enterprise. These generic, one-size-fits-all solutions lack domain knowledge and industry-specific terminology, which diminishes their value. Even if they can successfully process 90% of a document in many real-world scenarios, a critical 10% is not correctly extracted.
The landscape that emerges to tackle unstructured data will not consist of a single winner-takes-all platform. Instead, the ecosystem will be far more fragmented and specialized, with solutions providers responding to specific enterprise needs and generating business outcomes based on their demonstrated abilities to solve a handful of challenges relating to unstructured data rather than their abilities to solve all of them.
First and foremost, reliable unstructured data processing for enterprises requires incorporating domain knowledge as more than a mere adjunct to a larger platform. Instead, it is an inextricable component of any foundation for extracting and summarizing documents. Financial services firms cannot leave behind 85% of their data. With the approach outlined here, they have an opportunity to incorporate valuable information and insights from unstructured sources into mission-critical business flows.
Read the full report by nRoad.