The End-to-End Testing of Asynchronous Systems

This article analyses different situations involving end-to-end testing and proposes automation and the Step platform as a viable solution.

Written by Dorian Cransac


Context

The purpose of this article is to take an in-depth look at different asynchronous testing situations and make a case for a standard approach and an API that address them more effectively. If you’re not familiar with asynchronism, callbacks, or events, you may want to review this glossary.

In our glossary, we explain how asynchronous processing introduces a set of issues which make building end-to-end simulations difficult.

When attempting to replicate the client’s behavior, end-to-end testers will want to actively poll some criterion or exit condition in order to detect the end of the scenario. For example, testers may wait for a file to appear in a folder or for a new record to be present in a database before verifying hypotheses. In cases involving callbacks, some code on the server’s side will actively push information back to the client, so that server and client roles are temporarily “switched”.
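The polling pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not part of any testing tool: the helper name wait_until, the timeout and interval values, and the simulated file writer are all assumptions made for the example.

```python
import os
import tempfile
import threading
import time


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll `condition` until it returns a truthy value or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


def demo():
    """Simulate a system under test that delivers a result file after a delay."""
    workdir = tempfile.mkdtemp()
    result_file = os.path.join(workdir, "result.txt")

    def slow_writer():
        time.sleep(0.2)
        # Write to a temporary name, then rename atomically, so that the
        # poller never observes a half-written file.
        temp_file = result_file + ".tmp"
        with open(temp_file, "w") as handle:
            handle.write("done")
        os.replace(temp_file, result_file)

    threading.Thread(target=slow_writer, daemon=True).start()

    # Exit condition of the scenario: the file has appeared in the folder.
    wait_until(lambda: os.path.exists(result_file), timeout=5.0)
    with open(result_file) as handle:
        return handle.read()
```

The same helper works for a database check by passing a function that runs the query and returns the record once it exists. A bounded timeout is used so that a missing result fails the test instead of blocking it forever.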

Because testing tools impersonate the client, they do not generally provide semantics for receiving requests or events. Implementing service mocks and the integration layer between the mock and the test plan can be difficult.

Case Examples

Let’s look at a series of asynchronous scenarios posing non-trivial challenges:

Server Callback

In this scenario, client and server temporarily trade roles, so that at some point the initiator of the workflow hangs and waits for an incoming request from the server.

The figure above illustrates a couple of synchronous calls made to a service bus (ESB). These could be HTTP calls or any type of request-response protocol. After a response to synch_request_2 is sent, client and server roles are switched. The application server becomes a client and calls an endpoint on the client’s side; refer to the “Server endpoint” box on the figure.

This step of the scenario can be problematic for most E2E testing solutions as it requires the following for the simulated client:

Waiting indefinitely for the callback to take place.
Being able to receive the call; deploying a server context and waiting for incoming requests.
Mocking a specific business endpoint in order to return a meaningful response to the server.
Resuming the scenario upon reception of the request.
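The four requirements above can be illustrated with a small Python sketch using only the standard library. It is an assumption-laden example, not the approach of any particular tool: the /callback path, the JSON payload, the ACK response, and the fake_server_callback function that stands in for the application server are all hypothetical.

```python
import http.client
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def start_callback_endpoint(received):
    """Deploy a local server context that hands callbacks over via `received`."""

    class CallbackHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length))
            received.put(payload)  # wake up the waiting scenario thread
            # Mocked business endpoint: return a meaningful response.
            body = json.dumps({"status": "ACK"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the test output quiet

    server = HTTPServer(("127.0.0.1", 0), CallbackHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fake_server_callback(port, result):
    """Hypothetical stand-in for the application server calling the client back."""
    connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    body = json.dumps({"order_id": "42", "state": "PROCESSED"})
    connection.request("POST", "/callback", body=body,
                       headers={"Content-Type": "application/json"})
    result["ack"] = json.loads(connection.getresponse().read())
    connection.close()


def run_scenario():
    received = queue.Queue()
    server = start_callback_endpoint(received)
    try:
        port = server.server_address[1]
        # The synchronous requests of the scenario would be sent here.
        result = {}
        caller = threading.Thread(target=fake_server_callback, args=(port, result))
        caller.start()
        callback = received.get(timeout=5)  # the scenario hangs until the callback
        caller.join(timeout=5)
        return callback, result["ack"]      # the scenario resumes with the payload
    finally:
        server.shutdown()
        server.server_close()
```

The wait is bounded by a timeout here, so that a callback that never arrives is reported as a failure rather than leaving the simulated client hanging.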

Push Notification

In this next scenario, asynchronism is caused by the creation and reception of an event in the form of an Android Push Notification. Notifications are pushed from the internal IT infrastructure to a third party in the Internet public zone, Google Firebase servers, and then received by emulated Android devices in a third party cloud system that is accessible over the public zone.

In order to monitor the propagation time of notifications in real time during the testing, building synchronicity back into the test plan is most convenient. This is accomplished by forwarding the reception of the notification by the device back to the initiator, Push Client, for validation and end-to-end performance measurements.

In addition to forwarding the notification message, a device identifier or device token needs to be sent from each device during the initialization phase of the test in order for the device to join the pool.

This scenario poses several technical challenges because the simulation needs to:

Implement an Android mock or a Firebase mock to receive the actual notification.
Forward the information that a notification has been received along with data.
Allow for the arbitrary measurement of the entire propagation time.
Provide semantics for passing uniquely identifiable messages and pools of data.
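To make the measurement requirement more concrete, here is a minimal Python sketch of how an initiator could correlate uniquely identified messages with their forwarded receipts. The class PropagationTracker, its method names, and the device token value are hypothetical and chosen for illustration only. Both timestamps are taken on the initiator’s clock, which is one reason why forwarding the receipt back to the initiator is convenient.

```python
import threading
import time
import uuid


class PropagationTracker:
    """Correlates sent notifications with forwarded receipts by message id."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so the logic can be tested
        self._lock = threading.Lock()
        self._devices = set()        # pool of registered device tokens
        self._sent_at = {}           # message id -> send timestamp
        self.latencies = {}          # message id -> measured propagation time

    def register_device(self, token):
        """Called during initialization so that the device joins the pool."""
        with self._lock:
            self._devices.add(token)

    def send(self, token):
        """Record the send time of a new, uniquely identifiable message."""
        with self._lock:
            if token not in self._devices:
                raise KeyError(f"unknown device token: {token}")
            message_id = str(uuid.uuid4())
            self._sent_at[message_id] = self._clock()
        return message_id

    def on_forwarded(self, message_id):
        """Called when a device forwards the receipt back to the initiator."""
        now = self._clock()
        with self._lock:
            latency = now - self._sent_at.pop(message_id)
            self.latencies[message_id] = latency
        return latency


def demo():
    # A fake clock makes the example deterministic: sent at 10.0, received at 10.75.
    ticks = iter([10.0, 10.75])
    tracker = PropagationTracker(clock=lambda: next(ticks))
    tracker.register_device("device-token-1")
    message_id = tracker.send("device-token-1")
    return tracker.on_forwarded(message_id)
```

Note that a latency measured this way includes the forwarding leg from the device back to the initiator, not only the propagation of the notification itself.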

If you’re interested in learning more on Android emulation in the cloud, FIDO2 authentication, and Firebase, a comprehensive study of this case has been published separately here.

Two-Way Async Protocol

This third scenario focuses on bidirectional asynchronous communication using the FIX protocol. In the implementation encountered, messages were received on both sides using distinct event listeners. Once a channel is created between the Initiator and the Acceptor and a message has been sent, you must wait asynchronously in order to receive a response or inbound message. Therefore, the reception of subsequent inbound messages will take place in a different thread and method than the ones responsible for sending the outbound message.

The number of inbound messages received in response to a single emitted outbound message may vary and the order in which events are received is important and requires validation. This makes building plans and measuring response times difficult. Deploying the code for testing is also a challenge.

The figure above illustrates the way two entities communicate using FIX. Ideally, an encompassing synthetic transaction would provide us with response times that make sense from a business intent perspective.

New problems arise in this scenario:

Messages need to be potentially queued upon reception to avoid stalling the system.
Message order information needs to be kept for validation.
Deploying and operating both the Initiator’s and the Acceptor’s code can be difficult.
Clear semantics matching the expected response messages and a synthetic transaction are required.
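The following Python sketch illustrates the first, second, and fourth points under simplifying assumptions. Messages are plain dictionaries rather than real FIX messages, and the names InboundListener, synthetic_transaction, and fake_acceptor are hypothetical. Inbound messages are queued by the listener thread with a sequence number, and the test thread wraps the send and the expected responses into one timed transaction.

```python
import queue
import threading
import time


class InboundListener:
    """Queues inbound messages in arrival order without blocking the sender."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._sequence = 0
        self._lock = threading.Lock()

    def on_message(self, message):
        # Called from the receiving thread; numbering and queuing happen under
        # one lock so that sequence numbers match the queue order.
        with self._lock:
            self._sequence += 1
            self._inbox.put((self._sequence, message))

    def collect(self, count, timeout=5.0):
        """Return `count` (sequence, message) pairs or raise TimeoutError."""
        deadline = time.monotonic() + timeout
        messages = []
        while len(messages) < count:
            remaining = deadline - time.monotonic()
            try:
                if remaining <= 0:
                    raise queue.Empty
                messages.append(self._inbox.get(timeout=remaining))
            except queue.Empty:
                raise TimeoutError(
                    f"expected {count} messages, got {len(messages)}"
                ) from None
        return messages


def synthetic_transaction(send, listener, expected_statuses, timeout=5.0):
    """Time one outbound message together with all of its expected responses."""
    started = time.monotonic()
    send()
    received = listener.collect(len(expected_statuses), timeout)
    elapsed = time.monotonic() - started
    actual = [message["status"] for _, message in received]
    return actual == expected_statuses, elapsed, actual


def demo():
    listener = InboundListener()

    def fake_acceptor():
        # Hypothetical counterpart answering one order with three reports.
        for status in ("New", "PartiallyFilled", "Filled"):
            time.sleep(0.01)
            listener.on_message({"type": "ExecutionReport", "status": status})

    def send_order():
        threading.Thread(target=fake_acceptor, daemon=True).start()

    return synthetic_transaction(
        send_order, listener, ["New", "PartiallyFilled", "Filled"]
    )
```

The elapsed time returned by synthetic_transaction covers the whole exchange, which is closer to a business-level response time than the duration of any single message.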

Solutions

Key Functionality

The key requirement is a separation of concerns between three responsibilities: the management of mocks, the implementation of business-specific service endpoints, and the management, coordination, and synchronization of events and threads. Central event management would occur in a dedicated entity called the EventBroker.

For service mocks to interact with the EventBroker, semantics and controls need to be created and made available to the tester for use in test plans.
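To give an idea of what such semantics could look like, here is a deliberately tiny, in-memory Python sketch. It is not Step’s EventBroker API: the class name TinyEventBroker and its put and get methods are assumptions made purely for illustration. A mock publishes an event under an identifier, and the test plan blocks until the event with that identifier is available.

```python
import threading


class TinyEventBroker:
    """Hypothetical in-memory broker: mocks put events, test plans get them."""

    def __init__(self):
        self._events = {}
        self._condition = threading.Condition()

    def put(self, event_id, payload):
        """Called by a service mock when it receives a request or notification."""
        with self._condition:
            self._events[event_id] = payload
            self._condition.notify_all()

    def get(self, event_id, timeout=5.0):
        """Called by the test plan; blocks until the event arrives or times out."""
        with self._condition:
            arrived = self._condition.wait_for(
                lambda: event_id in self._events, timeout=timeout
            )
            if not arrived:
                raise TimeoutError(f"event {event_id!r} not received")
            return self._events.pop(event_id)


def demo():
    broker = TinyEventBroker()
    # Simulated mock: publishes the event shortly after the scenario starts.
    threading.Timer(
        0.05, broker.put, args=("order-42", {"state": "PROCESSED"})
    ).start()
    # Test plan side: waits for exactly the event it is interested in.
    return broker.get("order-42", timeout=5.0)
```

Keeping the waiting logic in the broker means that mocks only need to publish, and test plans only need to name the event they expect.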

Functionality Overview

Below is a summary of the requirements retained in the design of the new Async packages:

Implementation

If you’re interested in how the different requirements have been taken into account and implemented as functionality in Step, take a look at the following documentation for our Async plugin.

Server Callback

A clear separation of concerns between the mock and the management of events can be achieved by using the standard EventBroker API to convert incoming requests into Step Events, then developing and deploying an Adapter as a proxy between the server and the test platform.

Events can be managed and monitored centrally through Step’s controller. This eases the operational workload and simplifies analysis.

Push Notification

The end-to-end measurement of the notification’s propagation time was made possible by two actions:

Implementing the forwarding of the notification in the Android mock
Adding a proxy instance to enable micro-polling from within the client’s internal network towards our cloud

Accounting for the time elapsed between the receipt of the actual notification and the receipt of the forwarded message proved to be difficult due to host time synchronization. However, it paid off in the end, as real-time monitoring was achieved.

More details on the results of this test campaign are provided here.

Two-Way Async Protocol

Using the EventBroker as a facade between Initiator and Acceptor and the new Async controls resulted in:

Accurate tracking of communications.
Design of clear concurrent test plans for validating the sequence of messages received.

No standalone adapter was used in this scenario because the server code could be run in short, repeated stints via Step’s Keyword API.

It is easy to scale out sessions on Step’s agent grid and implement multiple concurrent sessions when both the listener’s and the adapter’s code is packaged and deployed as Keywords:

SSE & WebSockets

A protocol-oriented approach is possible in this scenario, but we recommend the use of browser-based automation instead. This white paper on browser-based vs. HTTP-based automation explains how scenarios involving such a complex stack are easier to manage when running an actual browser instead of a mock.

This approach allows testers to deal with asynchronous patterns through a “black box”, the actual browser, and helps reduce the additional complexity and errors that a protocol-oriented solution would cause. Below is what the simulation’s architecture looks like in Step:

There is no need to use Step’s EventBroker because a real browser is handling the reception of server events.

This scenario demonstrates the flexibility provided by Step and highlights that no single approach solves all problems. We believe it is up to users to decide what the best fit for their project is.

Conclusion

We hope the information covered will be helpful to you and others. We are currently evaluating the introduction of new functionalities as part of our Async & Event packages:

Continuous Keywords for distributed mock management and server style code deployment.
Persistency option for the EventBroker’s queue.
Out-of-the-box monitoring of functionality.
Information on the queue’s state and the event’s lifecycle.

Check out our Knowledge Base for future announcements as we are planning on releasing more information, tutorials, and demos.
