Human-Centered Testing
For complex systems, unit and integration tests miss many of the issues that prevent users from successfully using the system. Human-centered testing fills that gap by simulating real user interactions and behaviors to ensure a better user experience.
Overview
Human-centered testing uses automated tests to simulate real user interactions and behaviors to ensure a better user experience. This approach goes beyond traditional technical testing by considering how users actually use the system, including their goals, workflows, and potential pain points.
Important
Human-centered testing does not replace traditional testing methods but complements them. It aims to fill the gaps left by unit and integration tests, particularly in complex systems where user interactions span multiple technical systems.
Context
In software testing, we often talk about the test pyramid, which defines a hierarchy of tests from unit tests at the base to end-to-end tests at the top. While this model is useful, it can sometimes lead to a focus on technical aspects of testing, overlooking the human element.
As the systems we work on get larger and user interactions span multiple components, thinking of testing purely in terms of technical layers misses important interactions. Real users do not interact with individual components in isolation; they engage with the entire system, and their experiences are shaped by the interplay of various components. This means that many issues that affect user satisfaction and usability may not be caught by traditional unit or integration tests.
In this practice, we explore the concept of human-centered testing, which emphasizes understanding and simulating real user interactions and behaviors as a way to ensure a better user experience. By focusing on how users actually use the system, we can design tests that are more effective at catching issues that matter most to our users.
The tools we use for human-centered testing are typically browser-based, like Playwright or Cypress, and allow us to simulate user interactions in a realistic way.
Why do we test?
Quality assurance ensures that the software delivers value to users. QA has two primary concerns:
- Regression: Preventing breakage of existing functionality.
- Acceptance: Validating that new features deliver the value they were intended to provide.
To be successful, testing needs to deliver results quickly - both as feedback to the team and as confidence for stakeholders. A slow feedback loop hinders development velocity and delays releases: the impact of a change shrinks as "the moment" passes, and the chances of defects reaching users grow as the team loses context for the change.
Why do we automate?
The holy grail of software delivery is Continuous Deployment, a practice where changes can be made and deployed to production in hours or even minutes. Few teams achieve this level of agility, but agile delivery is a spectrum, and every team should aim to move as far along it as their use case allows.
As we move along the spectrum toward continuous deployment, automation becomes increasingly important. Manual testing becomes a bottleneck as releases grow more frequent, so automated test coverage is essential to maintaining speed and confidence.
Side note: Manual Testing and "Shifting Left"
It's also important to note that not all testing activities can or should be automated. Particularly with acceptance testing, where the goal is to validate that the software meets user needs, human judgment is often required to assess whether the experience is satisfactory. However, manual testing activities should be "shifted left" in the delivery cycle.
Manual testing is traditionally performed in a staging phase just before release, but doing it earlier produces better outcomes and reduces the time spent in staging. Some ways of shifting left include:
- Design Reviews: Involving testers in design discussions to identify potential usability issues early.
- Prototyping: Creating prototypes that can be tested with real users before development begins.
- Preview Environments: Setting up environments where testers can interact with features prior to leaving the "Develop Solution" phase.
Key Principles of Human-Centered Testing
Below, you'll find some key principles to guide the design and implementation of human-centered tests. In this section, we'll use a hypothetical event registration system as an example to illustrate these principles.
Our event registration system allows users to sign up for events, receive confirmation emails, and view their registered events. The system consists of a web frontend, a backend API, and an email service. It also integrates with a third-party payment processor for paid events, and sends event registrations to a spreadsheet for event organizers to manage attendance.
Realistic Scenarios
Create test scenarios that mimic real user workflows. These scenarios should start by covering the tasks that are most common or important to your application. For example, if you're testing an unemployment system, you will want to create scenarios around filing claims, checking claim status, and updating personal information. As you progress, expand your scenarios to cover less common tasks.
In practice, scenario development often starts with manual acceptance testing scenarios. It can be helpful to identify existing manual test cases that reflect real user workflows and use them as a starting point for automated human-centered tests.
An example: In our event registration system, a realistic scenario might involve a first-time user registering for an event. This scenario reflects a common user journey.
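Below is a minimal Playwright sketch of this scenario. The route, field labels, and confirmation copy are assumptions for illustration; adapt them to your application.

```ts
import { test, expect } from '@playwright/test';

// A sketch of the first-time registration journey. The route, labels,
// and confirmation text are illustrative assumptions, not real selectors.
test('first-time user registers for an event', async ({ page }) => {
  await page.goto('/events/spring-gala');

  // A new user fills the form from scratch, with no saved profile.
  await page.getByLabel('Full name').fill('Mei Lin');
  await page.getByLabel('Email').fill('mei.lin@example.com');
  await page.getByRole('button', { name: 'Register' }).click();

  // The outcome the user cares about: a visible confirmation.
  await expect(page.getByText('You are registered')).toBeVisible();
});
```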
Realistic Mock Data
Tests will be executed with mock data. It's important that this data reflects real-world usage as closely as possible. This includes using realistic names, addresses, and other personal information, as well as ensuring that the data covers a variety of cases. If you have access to anonymized production data, this can be a great source for creating realistic test data.
Start by looking at any existing data you have from real users. Look for patterns in the data that reflect different user types and scenarios. You will want to identify common user personas based on the scope of the system you are testing.
An example: Examining our event registration system might show two primary types of users: first-time attendees, who are registering for their first event, and repeat attendees, who have a long history of previous attendance and saved information. Additionally, because our system operates in a predominantly Chinese-American region, we might find that a significant portion of our users have short first and last names and would prefer to see the application in Chinese. Our mock data and scenarios should reflect these findings.
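These personas can be captured directly as test fixtures. Here is one possible sketch in TypeScript; the field names and sample values are illustrative assumptions, not prescribed structure.

```ts
// A sketch of persona-driven mock data as an in-repo fixture.
// Field names and sample values are illustrative assumptions.
interface TestUser {
  name: string;
  email: string;
  preferredLocale: 'en' | 'zh';
  previousEventCount: number;
}

export const personas: Record<string, TestUser> = {
  // First-time attendee: no history, no saved information.
  firstTimeAttendee: {
    name: 'Li Wu', // short first and last name, reflecting our user research
    email: 'li.wu@example.com',
    preferredLocale: 'zh',
    previousEventCount: 0,
  },
  // Repeat attendee: long attendance history and saved information.
  repeatAttendee: {
    name: 'Grace Chen',
    email: 'grace.chen@example.com',
    preferredLocale: 'en',
    previousEventCount: 12,
  },
};
```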
Test across system boundaries
The scenarios you create will usually span multiple components or services. This is part of the value proposition of this kind of testing - it helps to identify issues that arise from the interaction between different parts of the system. You will want to ensure that your tests cover these interactions, validating that data flows correctly and that user experiences are seamless across system boundaries.
An example: In our event registration system, the user submits a form to register for an event. When they register, they expect to receive a confirmation email, and they expect that when they attend the event, their name will be on the printed list of registered attendees. A human-centered test for this scenario should cover all of these interactions to ensure that the user experience is smooth and that all parts of the system work together as expected. Thus, the test should verify:
- The user can submit the registration form.
- The user receives a confirmation email.
- The registration appears in the spreadsheet for the event organizer.
- Bonus: The registration appears in the list when the event organizer goes to print it out.
All of these activities could be in-scope for a human-centered test, as they represent the end-to-end experience of the user attending the event.
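Here is a sketch of how such a test might cross those boundaries with Playwright. It assumes a local test inbox with an HTTP API (in the style of Mailpit or MailHog) and a hypothetical test-only endpoint exposing the organizer spreadsheet; both URLs are assumptions for illustration.

```ts
import { test, expect } from '@playwright/test';

test('registration reaches the email inbox and the organizer sheet', async ({ page, request }) => {
  // 1. The user submits the registration form.
  await page.goto('/events/spring-gala');
  await page.getByLabel('Full name').fill('Li Wu');
  await page.getByLabel('Email').fill('li.wu@example.com');
  await page.getByRole('button', { name: 'Register' }).click();
  await expect(page.getByText('You are registered')).toBeVisible();

  // 2. The confirmation email arrives. Poll a local test inbox
  //    (Mailpit-style HTTP API; the URL is an assumption).
  await expect.poll(async () => {
    const res = await request.get('http://localhost:8025/api/v1/messages');
    return JSON.stringify(await res.json());
  }).toContain('li.wu@example.com');

  // 3. The registration lands in the organizer's spreadsheet, read here
  //    through a hypothetical test-only endpoint.
  await expect.poll(async () => {
    const res = await request.get('/test-api/organizer-sheet/spring-gala');
    const rows: Array<{ name: string }> = await res.json();
    return rows.map((row) => row.name);
  }).toContain('Li Wu');
});
```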
Narrow focus to what breaks the user's journey
A single test can cover a lot of ground. You can imagine the assertions that would pile up if you tried to validate every aspect of a user's journey through a system. Instead of writing hundreds of assertions per test, focus on the key outcomes that matter to the user.
An example: In our event registration system, the key outcomes for a user registering for an event are that they receive a confirmation email and that their registration is made available to organizers. While it might be tempting to also validate that the email contains the correct formatting, that the event details are accurate, and that the spreadsheet is updated in real-time, these details can be covered in other types of tests (like unit or integration tests).
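Those delegated details can live in fast unit tests instead. Below is a sketch using Node's built-in test runner; buildConfirmationEmail is a hypothetical helper named purely for illustration.

```ts
import { test } from 'node:test';
import assert from 'node:assert/strict';
// Hypothetical helper; substitute whatever builds your emails.
import { buildConfirmationEmail } from '../src/email';

// Formatting details are pinned down here at the unit level, leaving the
// human-centered test free to assert only that the email arrived at all.
test('confirmation email names the event', () => {
  const email = buildConfirmationEmail({
    name: 'Li Wu',
    event: { title: 'Spring Gala', date: '2025-05-01' },
  });
  assert.match(email.subject, /Spring Gala/);
  assert.match(email.body, /Spring Gala/);
});
```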
Limited but Meaningful Coverage
Human-centered testing is not about achieving 100% test coverage. Instead, focus on covering the most critical user journeys and interactions. Aim for a balance between breadth and depth, ensuring that you cover key functionalities without getting bogged down in exhaustive testing of every possible interaction. For exhaustive and edge-case testing, rely on unit and integration tests.
An example: Our event registration system might have fewer than five human-centered tests, focused primarily on the experience of attending an event, because that's what users care most about. The tests might indirectly hit all the components of the system, but we don't need a test for every feature. That's the role of unit testing.
Implementing Human-Centered Testing
Once you've identified what you want to test, you can start implementing your human-centered tests. You will need to select a tool for simulating user interactions using a web browser. It should also allow you to script various external interactions, such as email receipt or third-party service calls.
We prefer Playwright for this purpose because it provides both a generic browser automation framework and a test runner with built-in support for parallelization, retries, and reporting. However, other tools like Cypress or Selenium can also be used effectively.
We're not going to provide a full tutorial on test development here; there are plenty of resources for learning these tools. The key is to stay focused on user outcomes, validating system behavior from the user's perspective rather than getting bogged down in technical assertions.
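As a starting point, the runner features mentioned above (parallelization, retries, reporting) live in Playwright's config file. A sketch with illustrative defaults, not recommendations:

```ts
// playwright.config.ts - illustrative defaults, not recommendations.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './tests/human-centered',
  fullyParallel: true, // run scenario files in parallel
  retries: 2,          // absorb real-world timing flakiness
  reporter: [['html'], ['list']],
  use: {
    baseURL: 'http://localhost:3000', // assumed local dev server
    trace: 'on-first-retry',          // record a trace when a test retries
  },
});
```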