Particularly in dynamic business areas, a knowledge advantage is often decisive. However, it can be technically demanding to unlock this “intellectual capital” and make it usable. IDP faces a fundamental challenge here, which primarily concerns the handling of complex and loosely structured documents. How providers meet it will also determine the future winners and losers in the industry.
On the face of it, these are golden times for Intelligent Document Processing (IDP): the number of applications, further developments and analyzed documents is growing rapidly. The market volume is set to increase more than sevenfold by 2030, if market researcher Fortune Business Insights is to be believed. As usual, such a forecast rests on a number of basic assumptions. What has to happen to reach such high figures? After all, we are talking about growth of over 11 billion dollars in seven years. Let’s take a look at the status quo:
The market is currently crowded with solutions, often advertised as OCR or IDP software, that focus on extracting amounts and address data from optically recognized text. Simple AI implementations, at least, are now commonplace, but “AI” is also often degenerating into a widely used marketing label. At their core, these processes amount to data organization, extraction and analysis. This works well for invoices, delivery bills or receipts and leads to considerable savings. Those savings could soon be exhausted, however, because this is basic digitization, which according to Gartner 91% of companies are already working on. To avoid being sidelined afterwards, companies need a knowledge advantage, which can only be achieved through a technological expansion of the IDP concept.
Document processing becomes interactive
The document types mentioned have one essential thing in common: they contain transactional data as it is generated in the course of normal business transactions. To increase the potential for hyperautomation, another type of document has to be considered: formats with highly individualized content whose informative substance cannot be captured by simple text extraction. These include reports, contracts, emails, reviews and presentations. Each one conveys an individual message, implicit knowledge that needs to be translated. That is why we speak of narrative documents.
The resulting procedure is fundamentally different from that for transactional documents. A higher-level interface is required that has access to all uploaded information. This enables basic processing using specialized multimodal AI models. The users themselves also play a key role: when creating new documents in Word or PowerPoint, they would ideally like to draw on the combined knowledge of all archives. This can be implemented with an integrated chat plugin that sends user queries to a large language model, which interprets and answers them. A subsequent feedback loop can improve the accuracy of further results and information.
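To make the flow above more concrete, here is a minimal sketch of such an interface, not a description of any particular product. Everything in it is an assumption made for illustration: the names `KnowledgeInterface`, `upload`, `ask` and `feedback` are hypothetical, retrieval is plain keyword overlap rather than a multimodal model, and `echo_model` is a stand-in where a real large language model call would go.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    doc_id: str
    text: str
    weight: float = 1.0  # raised or lowered by user feedback


def tokenize(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    words = {w.strip(".,;:!?()\"'").lower() for w in text.split()}
    return words - {""}


@dataclass
class KnowledgeInterface:
    """Hypothetical higher-level interface over all uploaded documents."""
    answer_fn: Callable[[str, list[str]], str]  # stand-in for the LLM call
    documents: dict[str, Document] = field(default_factory=dict)

    def upload(self, doc_id: str, text: str) -> None:
        self.documents[doc_id] = Document(doc_id, text)

    def retrieve(self, query: str, top_k: int = 2) -> list[Document]:
        # Score = number of shared words, scaled by the feedback weight.
        query_words = tokenize(query)
        scored = []
        for doc in self.documents.values():
            overlap = len(query_words & tokenize(doc.text))
            if overlap:
                scored.append((overlap * doc.weight, doc))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:top_k]]

    def ask(self, query: str) -> tuple[str, list[str]]:
        # Pass the query plus retrieved passages to the model and
        # return the answer together with the source document ids.
        sources = self.retrieve(query)
        answer = self.answer_fn(query, [d.text for d in sources])
        return answer, [d.doc_id for d in sources]

    def feedback(self, doc_id: str, helpful: bool) -> None:
        # Feedback loop: helpful sources rank higher next time.
        self.documents[doc_id].weight *= 1.5 if helpful else 0.5


def echo_model(query: str, context: list[str]) -> str:
    """Placeholder model: only reports how much context it received."""
    return f"{len(context)} passage(s) found for: {query}"


kb = KnowledgeInterface(answer_fn=echo_model)
kb.upload("contract", "The supplier contract renews every January.")
kb.upload("report", "The annual report covers supplier risk and contract terms.")
kb.upload("email", "Lunch is at noon.")

question = "When does the supplier contract renew?"
answer, sources = kb.ask(question)

# The user marks the report as the more useful source.
kb.feedback("report", helpful=True)
_, reranked = kb.ask(question)
```

In this sketch the feedback only reweights documents for later queries. A production system would more plausibly feed such signals into model fine-tuning or a learned ranker, which is where the development effort discussed in the next section comes in.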
Disruption only through increased development effort
Approaches such as these are by no means technically easy to implement. Every simplification for the user ultimately shifts the necessary effort and expertise to development. Only cutting-edge, specialized IT expertise can disrupt existing solutions: for narrative document processing, a much more extensive and targeted use of AI is unavoidable. Language models need to be specially trained and modified. Narrative documents also often contain visual elements that can only be captured using techniques such as computer vision and multimodal models. Added to this is the visual integration of data into a user interface, which also plays a key role in the validation process.
IDP providers often try to circumvent these difficulties with the help of external models and services. At best, the user is unaware of this, but is still directly affected. Every additional transmission path can weaken data security and thus increases the potential for damage. Providers who work this way also relinquish responsibility for ongoing further development. Yet that ongoing development is becoming the most important factor in the question of who can hold their own in the IDP jungle.
Survival of the Fittest
The focus of current IDP developments is ultimately on meeting a new requirement that arises from the transformation towards knowledge-creating companies. Information that enables this adaptation is not only contained in transactional documents. It can be found wherever companies communicate, even between the lines. Interpreting and unlocking these resources also requires greater competence on the part of IDP providers. The future division of the industry depends on this alone. So while a few fit market leaders will clearly set themselves apart, some candidates are likely to fall victim to Digital Darwinism.
About the Author
Christopher Helm is CEO of Helm & Nagel GmbH, an international provider of artificial intelligence (AI) and new enabling technologies. Among those is the software Konfuzio, an AI-based platform for process optimization and data-based knowledge gain.