Estimated read time: 20 min
What you’ll learn: How to automate Win32 applications using step’s .NET agent.
Ideal profile(s): Application owner, Automation specialist, Tester
Author: Dorian Cransac (exense GmbH)
Despite alternative options appearing on the market over the last decade, Microsoft desktop software, and Win32 applications in particular, still play a big role in the IT landscape of most corporations. These applications are notoriously difficult to automate end to end, so most automation teams don’t bother scripting user workflows that involve them. For people who do try to test them, measure their performance or include them in their RPA toolchains, the road is usually pretty bumpy.
In this case study, we’d like to demonstrate how we’ve been able to break these barriers and automate software such as Outlook and other Win32 applications on the Windows platform. We’ll explain how one can leverage COM interfaces and step’s .NET agent to build large-scale Win32 automation projects. As a last resort, we will also show you how to package AutoIt as a Keyword in step in order to simulate keystrokes. This is particularly useful for automating Win32 applications which do not provide a COM-level API, or for handling forms and popups.
The Windows runtime is pretty much unavoidable in IT. Whether it’s the Office Suite or some obscure program used to open a 3270 session and communicate with a mainframe host, Win32 applications are most likely part of your IT ecosystem.
In many cases, these applications are directly managed by operations engineers and basic single-user tests are deemed “enough” for them to ensure that your email works and that other proprietary desktop applications are rolled out properly. Additionally, people rely a lot on Microsoft’s own testing efforts and assume these IT components will always work once installed in their environment.
One could argue that end-to-end testing should be a mandatory step before a roll-out, regardless of the type of application. But even if you’re taking a hands-off approach to desktop applications, once in a while you’ll come across a critical application workflow that requires E2E automation and involves a Win32 desktop component. Should you just give up, or try to broaden your skills and the scope of your toolkit?
As part of this study, we will first take a look at a real-world scenario that one of our clients - a public Swiss administration - needed to automate. In a subsequent section, we will then take a look at the integrated stack that supports the implementation and execution of the scripts. We’ll see that our generic approach allows us to interact with just about any Windows program, and that several different strategies are at our disposal, depending on the context.
A quick digression: at the time, our client was already using step with Selenium Keywords in order to reuse E2E scripts provided by their third-party software vendor based in Austria. They were focused on two activities: massively concurrent regression testing and load testing (both against their new document management system). One day, this client identified a new critical workflow which would absolutely have to be tested.
This workflow involved:
* Microsoft Outlook itself
* a custom add-in built into Outlook
* the web application (the document management system), which they were already thoroughly testing using the Selenium Keywords
* a third-party Win32 client used as glue between Outlook and the web app
Let’s take a closer look at what this workflow means in terms of testing.
The user workflow
The Document Management System, which was the main target of the automation effort, was essentially a web application. The vendor had provided their own suite of page objects for integration with step via Selenium Keywords. This happened to already be a cornerstone of their automation platform.
One day, a new, custom add-in for Outlook was released. This add-in would allow a user to import an email or an attachment straight into the central system by invoking a local custom process on the user’s desktop, which would in turn initiate a browser session in which the user was already logged in based on their Windows credentials. The user would enter the attachment’s metadata in the window of the Win32 process and eventually land back in the web application (Internet Explorer) once the document’s import was successful. The virtual web folder in which the document had been imported would then be displayed to the user in their browser session.
The figure above summarizes the sequence of interactions required to upload the document into the DMS.
This scenario was tricky to automate because multiple types of technology were involved and because it required different automation approaches. On top of that, coordination and synchronization needed to be achieved between the different processes to reenact the user’s workflow in a meaningful way. In other words, this is a highly hybrid scenario requiring a lot of flexibility and maturity from the automation platform, not only to implement and run the scripts, but also to scale up the executions later.
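The synchronization problem mentioned above usually boils down to waiting until another process has reached a known state (a window has appeared, a browser page has loaded) before sending the next action. Here is a minimal sketch of such a polling helper, written in Python for brevity rather than in the C# used for the actual project. The name `wait_until` and its parameters are our own illustrative choices and are not part of step.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    probe: Callable[[], Optional[T]],
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
) -> T:
    """Call `probe` repeatedly until it returns a truthy value.

    `probe` is any function that checks the state of another process,
    for example "is the import dialog open yet?". It is a hypothetical
    placeholder supplied by the caller.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = probe()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout_s} s")
        # Sleep between attempts so the polling does not hog the CPU.
        time.sleep(interval_s)
```

A script would wrap each cross-process hand-over in such a call, for instance waiting for the Win32 import window before typing the metadata, then waiting for the browser to display the target folder. Failing with an explicit timeout rather than blocking forever keeps a stuck desktop session from stalling a whole test run.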
We tackled the Win32 side of the workflow with two strategies:
* leveraging vendor-provided DLLs and the COM API to invoke functions specific to the application
* when no such API existed, using AutoIt to interact directly with the window via basic events such as key presses
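The "API first, keystrokes as a last resort" ordering described above can be sketched as a small fallback chain. The example below is in Python for brevity and uses stand-in functions only: `import_via_com`, `import_via_keystrokes` and `StrategyUnavailable` are hypothetical names, and no real COM or AutoIt call is made.

```python
from typing import Callable, List, Sequence, Tuple


class StrategyUnavailable(Exception):
    """Raised by a strategy that cannot drive the target application."""


def run_with_fallback(
    strategies: Sequence[Tuple[str, Callable[[], str]]],
) -> Tuple[str, str]:
    """Try each (name, action) pair in order and return the first success."""
    errors: List[str] = []
    for name, action in strategies:
        try:
            return name, action()
        except StrategyUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("no strategy succeeded: " + "; ".join(errors))


# Hypothetical stand-ins for the two approaches described in the text.
def import_via_com() -> str:
    # A real implementation would call the application's COM-level API here.
    raise StrategyUnavailable("application exposes no COM API")


def import_via_keystrokes() -> str:
    # A real implementation would send key presses to the window here.
    return "imported"


used, result = run_with_fallback(
    [("com", import_via_com), ("keystrokes", import_via_keystrokes)]
)
print(used, result)
```

Keeping the keystroke path last reflects the trade-off in the text: an API call is more robust than simulated input, so simulated input is only used when nothing better is available.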
Many tools developed on top of the Win32 platform provide DLLs, and you’ll often find them in their installation folders. As our tutorials demonstrate, integrating such DLLs in a .NET project is pretty easy.
Please note that AutoIt itself offers a DLL which exposes its functionality. If you’re not familiar with AutoIt, you can check out the official web page here.
We were then able to deploy our project in step, run it and scale up our executions using step’s .NET agent. The figure above illustrates how step leverages the .NET Framework to interact with Win32 applications via the COM interface. If you’re interested in scalability concerns, you can check out this section of our Getting Started guide as well as part two of our Selenium tutorial videos, which illustrates the execution of a workload across 100 agents.
Using the .NET agent, testers can write scripts in C# against Win32 applications and simply register them in step as Keywords to build complex scenarios and execute them on step’s agent grid.
Our client was able to add this critical workflow to their test suite, allowing for a smooth roll-out of the functionality in production. With unique integration capabilities and new integrations coming out with almost every release, step keeps on expanding the coverage of tested application workflows for its users.
Demo & Tutorials
The following short video demonstrates how even applications such as Microsoft Outlook or the Windows Notepad can be manipulated automatically:
The following tutorials will guide a technician through the necessary steps to create such a project and then execute it on top of step’s .NET agent: