Screen Scraping Software Automation for Desktop Apps

Updated: May 10, 2024

Published: October 12, 2017

Demand for screen scraping software automation dates back to the days when only a small number of software solutions were designed with integrations in mind. Legacy enterprise solutions often do not provide a quick and reliable native way to transfer data.

Table of Contents

Options for screen scraping
Screen scraping automation benefits
Why choose screen scraping software automation for desktop apps
Screen scraping software use case
Conclusion

OPTIONS FOR SCREEN SCRAPING SOFTWARE AUTOMATION

Modern enterprises require a wide variety of business applications to support their operations (ERP, CRM, and HRMS apps like SAP or Microsoft Dynamics). When deciding what application/product to use, all the current business needs are taken into consideration. However, businesses change over time, and so too do their needs. So at some point, the chosen application no longer supports all the company’s needs. When this happens, there are a few options:

Extend the current application so that it supports all the current needs of the business. This option is available if the application offers some extensibility via an API, custom modules, or something similar. This is not always the case, and if not, only the following two options remain.
Implement a helper application that will extract and manipulate data in your current application and support new usage scenarios. This is the easiest option as well as the most cost- and time-effective.
Migrate to an entirely new system. This scenario means all the data will have to be moved to the new system. Unfortunately, the transfer isn’t always easy if there are no existing APIs to support it.

Luckily, even if an application does not offer any good APIs, there are still some options for scraping screen data (in most cases, screen scraping software captures text).

OCR (Optical Character Recognition)

Even though we discuss this option first, in most cases it is used only as a last (fallback) method if nothing else works. The reason is that OCR cannot guarantee 100% accuracy even with advanced training, which is not acceptable if you are transferring sensitive information, e.g., accounting or medical data. Unfortunately, there are some applications out there for which nothing else works.
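One way to see why OCR is treated as a fallback is to measure how much of a known string survives recognition. The sketch below is only an illustration: `char_accuracy` is a hypothetical helper, and the sample strings are invented to show a typical confusion between the digit 0 and the letter O.

```python
from difflib import SequenceMatcher


def char_accuracy(expected: str, recognized: str) -> float:
    """Return the share of characters that match between two strings (0.0 to 1.0)."""
    if not expected and not recognized:
        return 1.0
    matcher = SequenceMatcher(None, expected, recognized, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / max(len(expected), len(recognized))


# Invented sample: the OCR engine read two zeros as the letter "O".
expected = "Invoice 10050"
recognized = "Invoice 1O05O"

accuracy = char_accuracy(expected, recognized)
needs_review = accuracy < 1.0  # any mismatch sends the field to manual review
```

For sensitive records, even a single mismatched character in an identifier or amount is a defect, which is why a check like this can only flag problems after the fact rather than prevent them.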

System APIs Interception

This option is better than OCR because it can guarantee that you capture text from a desktop screen with 100% accuracy. The idea is that the scraping application injects its code into all running applications and intercepts system API calls. Whatever UI framework the target application uses (WPF, WinForms, Qt, or, let’s say, MFC) and whatever code one writes to add a text label to a window, under the hood a small set of common system functions is called in all of these cases (e.g., TextOut, DrawText, some GDI+ methods, etc.). So by intercepting these functions, you can get all the text that is shown on the screen regardless of which UI framework or font is used.

The only disadvantage to this approach is that it uses some undocumented Windows APIs, and with each security update, it becomes harder and harder to do it. However, it still works well on legacy systems.
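The interception idea can be shown in miniature with plain Python. This is only an analogy: real interception on Windows requires native code injected into the target process, while the sketch wraps a function inside a single Python process. The `Canvas` class, `install_hook`, and the two "framework" functions are all hypothetical names invented for illustration.

```python
from typing import Callable, List, Tuple

Captured = Tuple[int, int, str]


class Canvas:
    """Hypothetical stand-in for the low-level drawing layer (think TextOut)."""

    def text_out(self, x: int, y: int, text: str) -> None:
        pass  # a real drawing layer would render the text here


def install_hook(sink: List[Captured]) -> Callable[[], None]:
    """Wrap Canvas.text_out so every call is recorded; return an uninstall function."""
    original = Canvas.text_out

    def hooked(self: Canvas, x: int, y: int, text: str) -> None:
        sink.append((x, y, text))   # record what is about to be drawn
        original(self, x, y, text)  # then let the original call proceed

    Canvas.text_out = hooked

    def uninstall() -> None:
        Canvas.text_out = original

    return uninstall


# Two hypothetical "UI frameworks" that both end up in the same drawing call.
def framework_a_label(canvas: Canvas, caption: str) -> None:
    canvas.text_out(10, 10, caption)


def framework_b_grid_cell(canvas: Canvas, row: int, value: str) -> None:
    canvas.text_out(200, 20 * row, value)


captured: List[Captured] = []
uninstall = install_hook(captured)
canvas = Canvas()
framework_a_label(canvas, "Patient name")
framework_b_grid_cell(canvas, 3, "John Doe")
uninstall()
framework_a_label(canvas, "not captured")  # hook removed, nothing is recorded
```

Because the hook sits below both "frameworks", it receives the exact strings passed to the drawing call, with no recognition step involved.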

Custom Mirror Driver or Accessibility Driver

The so-called Mirror Drivers (Accessibility drivers, starting from Windows 8) are virtual device drivers that mirror all the drawing operations that happen on your screen (including the text drawing operations we are interested in) onto a virtual screen. This is a pretty good option that gives a high level of accuracy for the screen scraping software but is also the most complicated (and therefore the most time- and cost-consuming) solution.

Using Standard APIs

For a normal desktop application, it is possible to use a set of standard APIs to scrape the text out of it. For edit and rich edit controls, there are window messages that can be sent to a desktop app to get cursor, selection, and text information. For other applications, different accessibility APIs can be used to enable your business application integration. Examples of such APIs are MSAA (IAccessible), IA2 (IAccessible2), UIA (UI Automation), and so on. Which APIs are applicable depends on how each specific application is implemented (it might be compatible with some of the mentioned APIs but incompatible with others).
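For edit controls, the window-message route can be sketched with Python’s `ctypes`. `WM_GETTEXT` and `WM_GETTEXTLENGTH` are standard Windows messages; the function name `read_control_text` and the way the window handle is obtained are assumptions made for illustration. The sketch works only on Windows and only for controls that answer these messages.

```python
import ctypes
import sys

WM_GETTEXT = 0x000D        # copy the control's text into a caller-supplied buffer
WM_GETTEXTLENGTH = 0x000E  # ask for the text length in characters


def read_control_text(hwnd: int) -> str:
    """Read the text of a control via window messages (Windows only)."""
    if sys.platform != "win32":
        raise OSError("window messages are only available on Windows")

    user32 = ctypes.WinDLL("user32", use_last_error=True)
    send = user32.SendMessageW
    # HWND, UINT, WPARAM, LPARAM; LPARAM is declared as a pointer so a buffer can be passed.
    send.argtypes = [ctypes.c_void_p, ctypes.c_uint, ctypes.c_size_t, ctypes.c_void_p]
    send.restype = ctypes.c_ssize_t

    length = send(hwnd, WM_GETTEXTLENGTH, 0, None)
    buffer = ctypes.create_unicode_buffer(length + 1)  # +1 for the terminating null
    send(hwnd, WM_GETTEXT, length + 1, buffer)
    return buffer.value
```

The handle passed in has to be found first, for example with the `FindWindowW` and `FindWindowExW` functions from the same library. Controls that do not respond to these messages need one of the accessibility APIs listed above instead.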

When it comes to screen scraping software automation for office applications (Microsoft Office, LibreOffice, OpenOffice, etc.), these products offer their own APIs (Microsoft Office Interop, UNO, etc.), which are pretty advanced and allow you to do screen scraping in the way that is most convenient for you. Also, they support extensions and macros, so integrating with them is pretty straightforward.

Browser Extensions/Plugins for the Screen Scraping

Even though some browsers support some of the standard APIs – this support is quite limited. So if you want to scrape text or any other data from an advanced web application (let’s say Salesforce) – the best option is to do screen scraping through a browser extension. Such an extension acts as a “proxy.” On the one hand, it injects JavaScript code into your application pages, which allows for the capture and manipulation of data in the application. On the other hand, it communicates either with your desktop or web application and feeds data back to it. Since the extension is only a proxy, developing the extension itself is pretty easy; the complicated part is interaction with a web application itself, especially if it uses lots of non-standard controls and complicated object hierarchy that can affect the screen scraping software performance.
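One possible channel between the extension and a desktop application is the browser’s native messaging mechanism, in which each message is UTF-8 JSON preceded by a 32-bit length in native byte order. The article does not say which channel a given project should use, so treat this choice, the function names, and the message fields below as assumptions.

```python
import io
import json
import struct
from typing import Any, BinaryIO, Optional


def write_message(stream: BinaryIO, payload: Any) -> None:
    """Send one message: 4-byte length prefix, then the JSON body."""
    body = json.dumps(payload).encode("utf-8")
    stream.write(struct.pack("=I", len(body)))
    stream.write(body)


def read_message(stream: BinaryIO) -> Optional[Any]:
    """Read one message, or return None when the stream has ended."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack("=I", header)
    return json.loads(stream.read(length).decode("utf-8"))


# Round trip through an in-memory stream standing in for the real pipe.
pipe = io.BytesIO()
write_message(pipe, {"field": "account_name", "value": "Acme Corp"})
pipe.seek(0)
message = read_message(pipe)
```

In a real native messaging host, the streams would be the process’s binary standard input and output (`sys.stdin.buffer` and `sys.stdout.buffer`) rather than an in-memory buffer.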

Java Applications

Java is not very popular these days. Still, many applications, for instance Oracle E-Business Suite (Oracle EBS), are Java-based. What if you need to capture text data from such an application? In this case, not many options for screen scraping are available. Luckily, there is Java Access Bridge, a custom accessibility API that allows data extraction and manipulation in Java applications. It is quite limited, but in combination with, for example, OCR, it can give pretty good results.
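The combination of an accessibility API with OCR can be sketched as a simple preference rule: take the value reported by the accessibility layer when there is one, and fall back to OCR only when there is not, marking such values for review. Everything in this sketch is hypothetical (the `FieldValue` type, `merge_field`, and the sample data); it does not call Java Access Bridge itself.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FieldValue:
    text: str
    source: str         # "accessibility" or "ocr"
    needs_review: bool  # OCR results are not guaranteed to be exact


def merge_field(accessible: Optional[str], ocr: Optional[str]) -> Optional[FieldValue]:
    """Prefer the exact accessibility text; use OCR only as a flagged fallback."""
    if accessible:
        return FieldValue(accessible, "accessibility", needs_review=False)
    if ocr:
        return FieldValue(ocr, "ocr", needs_review=True)
    return None


# Invented sample: one field is exposed by the accessibility layer, one is not.
accessible_texts: Dict[str, Optional[str]] = {"order_id": "SO-1042", "customer": None}
ocr_texts: Dict[str, Optional[str]] = {"order_id": "S0-1042", "customer": "Acme Corp"}

merged = {
    name: merge_field(accessible_texts.get(name), ocr_texts.get(name))
    for name in accessible_texts
}
```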

How to choose the right option for your project?

Feel free to contact Existek. Our expert team will be happy to answer your additional questions to address your project-specific

BENEFITS OF SCREEN SCRAPING SOFTWARE AUTOMATION FOR DESKTOP APPS

Screen scraping software automation for desktop apps can offer several benefits, including:

Time savings: Automating the screen scraping process can save significant time for businesses that rely on desktop applications for data processing. Instead of manually copying and pasting data from the UI of an application, screen scraping can automatically extract the necessary data in a matter of seconds. That saves time and reduces the risk of errors that might occur during manual data entry.

Increased accuracy: Manual data entry can be prone to errors, particularly when dealing with large data volumes. Screen scraping software can extract data accurately, reducing the risk of the errors that occur during manual data entry. This is particularly important for businesses that rely on accurate data for decision-making.

Scalability: Screen scraping automation can scale to handle large volumes of data more efficiently than manual data entry. This is particularly important for businesses that deal with large amounts of data regularly, as it can significantly reduce the time and resources required for data processing.

Customization: It can be customized to extract specific types of data from the UI of different desktop apps. This allows businesses to tailor their data extraction processes to their specific needs and requirements.

Integration: Screen scraping automation can be integrated with other software applications to streamline data processing and improve overall workflow efficiency. For example, businesses can integrate screen scraping software with their CRM or ERP systems to automatically update customer or inventory data.

Reduced costs: Automation can reduce the costs associated with manual data entry and reduce the need for additional staff. It could lead to significant cost savings for businesses that rely heavily on data processing and analysis.

Businesses can streamline their data processing and analysis by automating the screen scraping process, improving productivity and profitability.

WHY DO TEAMS CHOOSE SCREEN SCRAPING SOFTWARE AUTOMATION FOR DESKTOP APPS

Screen scraping software automation can be a useful tool for automating tasks in desktop applications, especially when APIs or other direct integrations are unavailable. Here are a few reasons why it can be a good choice for desktop apps:

Automating repetitive tasks

It can help automate repetitive tasks such as data entry, copying and pasting information between applications, and navigating through menus and forms. With screen scraping, you can create scripts that can simulate user actions, such as mouse clicks and keystrokes, to perform these tasks automatically.

By automating these tasks, you can save a significant amount of time and reduce the chances of errors due to human mistakes. Screen scraping software can also run these tasks in the background, allowing you to work on other tasks while the automation runs.

No API or integration available

Some desktop applications may not have APIs or other direct integrations available to automate tasks. In such cases, screen scraping can be a viable alternative. This software can read and extract data from the user interface of the desktop application, just like a human would. This makes it possible to automate tasks that would otherwise be impossible or difficult to automate using other methods.

Legacy applications

Legacy applications are often no longer being actively developed or maintained, which means they may not have modern integration capabilities. Screen scraping can be used to automate tasks within these applications, allowing you to continue using them without needing to manually perform repetitive tasks.

Screen scraping can also help you modernize legacy applications by automating tasks that were previously performed manually. This can reduce the burden of maintaining these applications and allow you to focus on other areas of your business.

Customization

Screen scraping software can often be customized to meet specific needs and requirements. For example, you can modify scripts or workflows to automate specific tasks or integrate with other software and systems.

You can also create custom data extraction and processing rules to extract only the relevant data from the desktop application’s user interface. This can help you get the information you need quickly and accurately without manually sifting through large amounts of data.

Screen scraping automation can be a powerful tool for automating tasks in desktop applications. It’s a great opportunity to save time, reduce errors, and improve efficiency.

SCREEN SCRAPING SOFTWARE USE CASE

Now, let’s have a look at an example of screen scraping automation developed by Existek for one of our clients operating in the healthcare field. In this case, we had an ordinary data transfer from a legacy desktop CRM to a web-based CRM solution. Considering that all healthcare records are extremely sensitive, we had to develop a screen scraping software automation sequence that ensures one hundred percent accuracy.

As has been mentioned before, OCR works pretty well but can’t ensure that the data from the text and number fields will be scraped without any flaws. So, we had to rule out this option to secure the project’s success.

The legacy CRM does not provide any standard API for operations like screen scraping, and it is not a web application, so we weren’t able to use either API integration or a browser extension or plugin. The same can be said about custom mirror drivers or accessibility drivers: despite its high accuracy, this approach would have been far too time-consuming. Also, the host CRM application wasn’t Java-based, so we weren’t able to use the standard Java Access Bridge approach in this case. The only option left was System APIs Interception. Luckily, it also provides 100% screen scraping accuracy, which is vital for any software project in healthcare. The only disadvantage of this method is that it relies on plenty of undocumented APIs, which requires highly qualified and experienced software engineers who understand these systems deeply enough to work without proper documentation and can improvise when needed. Since the Existek team has extensive experience with Microsoft technologies, this obstacle was barely noticeable.

The host CRM was built on the fairly standard WPF framework, and the engineering team built a screen scraping application that was able to inject its code into the CRM and read all the API calls. As a result, we got screen scraping software that accurately recognized the CRM fields and the text in these fields, unaffected by distortions caused, for example, by different text fonts. In short, unlike OCR, it didn’t require training a recognition engine separately for each language that might appear in the text data. Both our tests and, later, the client reported absolute accuracy for the selected method.

Interested in more details on this project?

You can visit our case study page to check all the additional details on the project or contact our team for any additional help.

CONCLUSION

Screen scraping for desktop apps typically works by interacting with the user interface of the target application, reading and interpreting the display screen, and using various techniques such as optical character recognition (OCR) and image processing to extract the desired data.

Some common examples of tasks that can be automated using screen scraping software for desktop apps include data entry, report generation, and data extraction for analysis or reporting purposes. Organizations can save time and reduce manual data entry and manipulation errors by automating these tasks.

There is always a way to integrate with any application. Integration options need to be chosen based on a wide range of factors to achieve the best possible speed and accuracy.

What’s your experience with the approaches to screen scraping mentioned in this article? What challenges did you meet during data transfer from applications that do not provide an API? Just leave your thoughts in the comment section below, and let’s start the discussion!

Struggling to extract data, including text, from an application, or want to automate some processes there?

The Existek team will be happy to find the best possible solution for you. Drop us a line, and we will get back to you as soon as possible, or visit our services page to learn more.

by Oleksandr Kovalchuk, Solution Architect

Frequently asked questions

What is screen scraping software automation for desktop applications?

Screen scraping software automation for desktop apps involves using specialized software tools to automate and streamline the process of extracting data from desktop applications or other software running on a local machine. This is often used to extract data from legacy or custom applications that do not have built-in APIs or other data exchange protocols.

What are the main benefits of using screen scraping?

It offers such distinct advantages as:

Time savings
Increased accuracy
Scalability
Customization
Integration
Reduced costs

How to approach screen scraping automation for desktop apps?

The team needs to identify the target application, choose a suitable screen scraping software tool, develop and test the automation workflow, implement error handling and data validation checks, and ensure compliance with applicable laws and regulations.