
Blog | Nov 27, 2023

How To Use Generative AI For Document Extraction & Processing


A Demo of Unstructured AI Document Processing

Document extraction: it’s repetitive, time-consuming, error-prone and can be very boring when done manually. So, how do we fix that? We’ll walk you through a demo of how to use generative AI with intelligent automation (IA) to automate document processing so it’s more accurate and efficient – saving you time and money.

A brief note: IA uses robotic process automation (RPA), business process management (BPM) and artificial intelligence (AI) to automate end-to-end business processes. This is sometimes referred to as AI automation.

Generative AI For Document Processing

Generative artificial intelligence (AI) uses algorithms to create new content, such as videos, pictures, code and text, based on its training data. Generative AI models are trained to recognize patterns in data and use those patterns to generate new but similar data. That’s why the quality of training data is crucial to gen AI’s performance.

Okay, so how do we apply that so we can process documents automatically?

AI document processing

Also called intelligent document processing (IDP), document AI or document automation, AI document processing transforms unstructured and semi-structured data into a usable, machine-readable format. IDP captures and extracts data from documents in various formats before processing it.

AI document understanding

Using natural language processing (NLP), deep learning and machine learning (ML) among other AI technologies, IDP helps you classify, categorize, validate and extract information – transforming it into structured data that’s easier to understand, analyze and use.


AI Document Extraction Examples

We’re going to look at two examples of how you can use intelligent automation and generative AI together to turn unstructured, high-volume documents into a usable format.

An unstructured tenancy agreement

Most of us know how text-heavy a legal contract like a tenancy agreement can be, and it’s easy to miss important information amidst all that paperwork. With scanned documents, generative AI can rapidly read, comprehend and extract data to classify documents quickly and accurately. So, let’s take these lengthy, non-standardized contracts that are difficult (and tedious) to parse manually and simplify them to ensure no details are missed.

Sara has unstructured data in the form of a 24-page tenancy agreement. Rather than reading through everything, let’s use AI to pull the information Sara needs.

Sara starts by creating a prompt asking for tenant names, maximum occupancy, included bills and the break clause summary.


She can then specify the response format: each response should contain only the answer text, with no additional text. Where there are multiple questions, each answer should begin with its question number.
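The prompt and response rules described above could be assembled along these lines. This is a minimal sketch, not the prompt used in the demo: the question wording, the `build_prompt` helper and the placeholder document text are all assumptions made for illustration.

```python
from typing import Sequence

# Hypothetical questions covering the four details Sara asks for.
QUESTIONS = [
    "Who are the tenants named in the agreement?",
    "What is the maximum occupancy?",
    "Which bills are included in the rent?",
    "Summarize the break clause.",
]


def build_prompt(document_text: str, questions: Sequence[str]) -> str:
    """Combine the response rules, numbered questions and document text."""
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, start=1))
    rules = (
        "Answer using only the document below.\n"
        "Return the answer text only, with no additional text.\n"
        "Begin each answer with its question number."
    )
    return f"{rules}\n\nQuestions:\n{numbered}\n\nDocument:\n{document_text}"


# Placeholder standing in for the OCR text of the 24-page agreement.
prompt = build_prompt("(OCR text of the tenancy agreement)", QUESTIONS)
```

Numbering the questions in the prompt makes it straightforward to match each numbered answer back to the field it belongs to later in the workflow.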


Now, Sara can execute her document processing workflows in this order:

  1. Retrieving the document.
  2. Populating the queue with the agreement.
  3. Extracting the text.
  4. Sending it for data extraction.
  5. Formatting the data.
  6. Entering the extracted data.
  7. Reviewing and marking as complete.
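The seven steps above can be sketched as a small queue-driven pipeline. This is an illustrative outline only, not how a Blue Prism digital worker is actually built: `WorkItem`, `run_workflow`, `mark_complete` and the stub functions passed in at the bottom are hypothetical names, and real OCR and generative AI calls would replace the stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class WorkItem:
    """One queued document and everything the workflow learns about it."""
    path: str
    text: str = ""
    raw_response: str = ""
    fields: Dict[str, str] = field(default_factory=dict)
    status: str = "pending"


def run_workflow(
    paths: List[str],
    ocr: Callable[[str], str],
    extract: Callable[[str], str],
    fmt: Callable[[str], Dict[str, str]],
    enter: Callable[[Dict[str, str]], None],
) -> List[WorkItem]:
    # Steps 1-2: retrieve the documents and populate the queue.
    queue = [WorkItem(path=p) for p in paths]
    for item in queue:
        item.text = ocr(item.path)              # Step 3: extract the text.
        item.raw_response = extract(item.text)  # Step 4: send for data extraction.
        item.fields = fmt(item.raw_response)    # Step 5: format the data.
        enter(item.fields)                      # Step 6: enter the extracted data.
        item.status = "awaiting review"         # Step 7 begins with a human review.
    return queue


def mark_complete(item: WorkItem) -> None:
    """Step 7: called once a reviewer has checked the entry."""
    item.status = "complete"


# Stub implementations, for illustration only.
entered: List[Dict[str, str]] = []
items = run_workflow(
    ["tenancy_agreement.pdf"],
    ocr=lambda path: f"text of {path}",
    extract=lambda text: "1. placeholder answer",
    fmt=lambda raw: {"1": raw.split(". ", 1)[1]},
    enter=entered.append,
)
mark_complete(items[0])
```

Passing each stage in as a function keeps the order of the steps fixed while leaving the choice of OCR engine, model and target system open.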

In this process, optical character recognition (OCR) converts the document to machine-readable text for the AI. The AI locates the requested details spread across the document in seconds and instantly extracts and structures the relevant data, avoiding lengthy manual review.

The output is returned as JSON, parsed and written into a notepad, ready for downstream automation. Now, Sara can open the notepad and see the answers to the questions posed in her prompt. She can then compare the data with the original tenancy agreement, confirming the AI has indeed extracted the correct list of tenants, occupancy limit, bills and break clause.
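As a rough sketch of that last stage, the snippet below parses a JSON response, maps the numbered answers to field names, formats them as notepad-style lines and flags any answer that cannot be found verbatim in the source text. The `completion` key, the field names and the sample data are all assumptions for illustration; the actual response shape depends on the model and service used.

```python
import json
import re

# Hypothetical field names, in the same order as the prompt's questions.
FIELD_NAMES = ["tenants", "max_occupancy", "included_bills", "break_clause"]


def parse_answers(response_json: str) -> dict:
    """Map numbered answers in the assumed 'completion' field to field names."""
    completion = json.loads(response_json)["completion"]
    answers = {}
    # Each answer starts a line with its question number, e.g. "2. 4".
    for number, text in re.findall(r"^(\d+)[.)]\s*(.+)$", completion, flags=re.MULTILINE):
        index = int(number) - 1
        if 0 <= index < len(FIELD_NAMES):
            answers[FIELD_NAMES[index]] = text.strip()
    return answers


def to_note(answers: dict) -> str:
    """Format the answers as plain 'name: value' lines for a text file."""
    return "\n".join(f"{name}: {value}" for name, value in answers.items())


def unverified_fields(answers: dict, source_text: str) -> list:
    """Return fields whose answer does not appear verbatim in the source text."""
    haystack = source_text.lower()
    return [name for name, value in answers.items() if value.lower() not in haystack]


# Invented sample data, for illustration only.
sample_response = json.dumps({
    "completion": (
        "1. Alex Doe and Sam Roe\n"
        "2. 4\n"
        "3. Water and broadband\n"
        "4. Either party may end the tenancy after six months"
    )
})
sample_source = (
    "This agreement is between the Landlord and the Tenants, Alex Doe and Sam Roe. "
    "No more than 4 people may live at the property. "
    "Water and broadband are included in the rent."
)

answers = parse_answers(sample_response)
note = to_note(answers)
needs_review = unverified_fields(answers, sample_source)
```

The verbatim check is deliberately crude: summaries such as the break clause will rarely match the source word for word, so they land in `needs_review` for a person to compare against the agreement, and a match on a short value such as a single number is only weak evidence.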


In this hypothetical, Sara used generative AI as an easy way to extract key information from lengthy, unstructured documents in seconds rather than hours.

The gen AI-powered digital workers eliminated tedious document analysis while capturing key details with high accuracy and flexibility, enabling automation for any intelligent document processing work, from lease management to business documents and more. 


The Future of Intelligent Automation

So, given this scenario, how do we think bigger about generative AI and automation functioning in tandem?

First, to prepare for generative AI, look at how you can securely and effectively introduce this duo into your business processes. Starting from the document extraction example, build further use cases for gen AI and automation that fit your business needs, and calculate the potential return on investment (ROI) for each. Check that you have the right infrastructure in place. Next, review your AI compliance requirements and establish best practices for AI governance. Finally, get your people on board with your plans and brainstorm new ideas for how automation and gen AI can support them and their productivity.

Once you’re prepared for digital transformation, find the AI tools that suit your goals, such as the gen AI/intelligent automation collaboration between AWS and SS&C Blue Prism.

IDP solutions

SS&C | Blue Prism® Decipher IDP gives you intelligent automation at scale. Decipher is our intelligent document processing software that combines OCR, artificial intelligence (AI) and machine learning (ML) to extract the information you need from various structured and semi-structured document types at speed. With Decipher, you get:

  • Customizable form templates.
  • Digital workers that accurately process documents to reduce errors and save time.
  • Human-in-the-loop (HITL) validation to ensure accuracy levels are maintained.
  • Easy-to-use UI with no complex coding or setup required.


Digital workers allow you to automate data collection from almost all types of documents, saving people time and ensuring no critical information gets missed. The goal of automation is to remove the need for tedious, repetitive tasks such as data entry, and get people focused on what matters.

Want to learn more about automating document processing? Contact us.
