Blog | Jun 8, 2018

The Future of the Digital Worker: Visual Perception


Blue Prism has been at the cutting edge of innovation in intelligent automation for over 12 years. We are privileged to hold this position but, far from being complacent, we continue to increase our investment in the area. We are working on a range of developments in other technologies that are already helping early adopters derive even more value from Visual Perception as an Intelligent Automation Skill.

Visual perception is the ability to see and interpret (analyze and give meaning to) the visual information that surrounds us.

An effective Digital Worker needs the ability to interact with screens and interpret the data within them, in the same way as a human operator. This is not as simple as it sounds. Whilst some argue that “screen scraping” technology meets that brief, the rudimentary approach of blindly navigating systems without perception produces a brittle, fragile outcome that is prone to error. This is why the notion that Blue Prism is merely an elaborate screen scraping tool is fundamentally wrong: if you think about the processing that occurs as you work through a process on your own computer, there is significantly more complexity involved than simple screen scraping can provide. A Digital Worker must exhibit far more advanced abilities if it is to work within any human-operated process. Consider the following as the minimum requirements:

  • Your Digital Worker needs to be able to work with and adapt to environmental changes, such as screen resolution, network and application performance, as well as changing elements within the application. Ignoring these factors can have catastrophic implications: the failure of an automation or, worse, a robot taking down a target system due to insufficient throttling of its operations.
  • It also needs to be able to work within any type of application – web applications, mainframe, Java, thick and thin client, and more – in a way that is both resilient and performant.
  • Lastly, your Digital Worker needs to be able to interpret the data within the application. This may be achieved by a simple set of rules, or it may be more complex and need to be linked to other Digital Skills, such as Knowledge and Insight, in order to interpret unstructured data in Natural Language.
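To make the first requirement concrete, here is a minimal sketch of throttled, retrying interaction with a target system. This is purely illustrative (the function name and parameters are our own, not Blue Prism's actual implementation), but it shows the two behaviours the requirement demands: never hammering the target faster than it can cope, and surviving transient slowdowns rather than failing outright.

```python
import time

def call_with_throttle(action, max_attempts=3, min_interval=0.5, backoff=2.0):
    """Invoke `action` with rate limiting and exponential backoff.

    Without a minimum interval between calls, a robot replaying steps at
    machine speed can overwhelm a target system; without retries, any
    transient slowdown fails the automation outright.
    """
    delay = min_interval
    last_error = None
    for attempt in range(max_attempts):
        try:
            result = action()
            time.sleep(min_interval)   # throttle: pause before the next call
            return result
        except Exception as exc:       # e.g. a timeout while a screen loads
            last_error = exc
            time.sleep(delay)          # back off before retrying
            delay *= backoff
    raise RuntimeError(f"action failed after {max_attempts} attempts") from last_error
```

A real Digital Worker would layer much more on top of this (element-level wait conditions, application health checks), but even this small pattern is the difference between an automation that degrades gracefully and one that fails the moment the environment deviates.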

Why is Blue Prism Different to Screen Scraping?

There are several fundamental differences between Blue Prism and other approaches that are based on Screen Scraping technologies.

  • The Blue Prism platform was designed to integrate with any type of application at a much deeper level than the screen landscape. We now have more than 10 different technology interface mechanisms that allow you to directly integrate with and control an application and we are building more. All of these are designed with adaptability and scalability in mind and are designed (with business users in mind) to require no coding. That doesn’t mean creating interfaces to hundreds of different applications (a recipe for endless maintenance) – but having a focus on different presentation technologies.
  • Screen scraping implies a brittle form of recording actions that does not meet the standards we set for Visual Perception – it is incapable of adapting to changes in the environment. You will sometimes hear this referred to as “happy path” automation: it demos well and works great when everything runs exactly as when you recorded it, but as soon as there is a deviation from those parameters, the automation is doomed to fail. Creating a mess of isolated recorded macros is not a route to sustainable automation. Indeed, it is because of our focus on good design and re-use that Blue Prism does not tempt fate by providing a record button.
  • Even when it is necessary to revert to using the visual layout of the screen (using Surface Automation – a term we invented and another much-copied Blue Prism innovation) our approach is designed to work around the pitfalls of screen scraping. We have built significant intelligence into the technology to enable the Digital Worker to adapt to changes in the environment. We don’t stand still – Surface Automation is a feature that we significantly enhanced in Blue Prism Version 6 and it continues to develop. Our customers can use Adaptive Positioning Technology to search for elements on a screen, whether in a fixed position or positioned relative to another element. We’ve also given full control over tolerances to enable the Digital Worker to cope with moving elements, resolution changes, and RGB colour changes. We also significantly reduced the number of steps required to automate actions, making it much faster to develop robustly using Surface Automation.
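To illustrate what a colour tolerance buys you, here is a minimal sketch of locating an element by pixel colour within an RGB tolerance. The function and its parameters are hypothetical for illustration only, not Blue Prism's Adaptive Positioning code, but the idea is the same: matching within a tolerance rather than exactly lets the search survive small rendering differences (anti-aliasing, colour-profile shifts) between environments.

```python
def find_color_region(pixels, target_rgb, tolerance=10):
    """Return (row, col) of the first pixel whose R, G and B components
    are all within `tolerance` of `target_rgb`, or None if none match.

    `pixels` is a 2-D list of (r, g, b) tuples, e.g. a screenshot decoded
    into rows of pixel values. With tolerance=0 this degrades to the exact
    matching of naive screen scraping, which breaks on any rendering change.
    """
    for row, line in enumerate(pixels):
        for col, (r, g, b) in enumerate(line):
            if (abs(r - target_rgb[0]) <= tolerance and
                    abs(g - target_rgb[1]) <= tolerance and
                    abs(b - target_rgb[2]) <= tolerance):
                return (row, col)
    return None
```

Once an anchor element is found this way, relative positioning is just arithmetic on the returned coordinates – which is why a relative search keeps working when the whole dialog shifts on screen.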

Augmenting Your Workforce with Visually Perceptive Digital Workers

Indeed, one of our favorite case studies may not be the biggest deployment of robots or the highest ROI, but really demonstrates the power of the technology. Lighthouse Works is a nationwide non-profit organization that partners with the American Council for the Blind (ACB) to provide career advancement opportunities to the visually challenged. They use the Blue Prism platform in a really innovative way – typical accessibility software is expensive and precludes visually challenged workers from performing certain task components. By using the Blue Prism platform, these workers can execute end-to-end processes – such as scheduling, billing and claims processing – that otherwise would have been impossible or, at best, extremely challenging.

Final Thoughts – The Impact of AI on the Visually Perceptive Digital Worker

Our ecosystem is unparalleled in this area too. We have integrations with three of the industry’s leading AI cloud platform Computer Vision APIs, which enable Blue Prism’s Digital Workers to process and analyse documents and images, classify them, and then intelligently extract the content. We also have a strong partnership with Captricity, who combine OCR/ICR technology with Machine Learning. Our ecosystem includes integrations with Abbyy cloud OCR, and we are working with IBM to bring you integrations with Datacap very soon.

This is a crowded space: you need the flexibility to adapt your Digital Workforce with the latest innovations, and an RPA provider with its finger on the pulse and a strong vision and philosophy. Computer vision is an area that has seen huge advances in the last year. Through the latest techniques – such as Google’s AutoML – the accuracy of some of the models in this space has moved on in leaps and bounds. Indeed, in this article you can see that the application of AutoML to image classification immediately surpassed the prediction accuracy of all previous models.

I recently had the pleasure of working with some visionaries at Google, and they explained how AutoML works (in words that mere mortals like myself can understand!). The concept of AI that trains AI is truly exciting (and perhaps a little scary?). Through our strong partnership with Google, we are lucky enough to be able to access some of these capabilities at an early stage, and our Research team is experimenting with the AutoML technology right now. Imagine a world where training a Blue Prism robot is as simple as “showing it” a screen and all the elements are automatically discovered and mapped in an extensible and meaningful way – this could be the future of the Digital Worker, and it is not as far off as you might think!