BIRD UX - Beyond Interfaces, Real delight

Which usability test do I need?

25 August 2022 | Research and Evaluation

In usability testing, two broad categories of tests can be distinguished: summative usability tests (which summarise results) and formative usability tests (which "shape" the design). Which type of test you need depends on what you want to find out. Let's take a look at the two categories.

Summative tests: Is our product efficient?

With summative tests, the focus is usually on statistical key figures. Roughly speaking, the aim is to measure whether the designed solution is efficient.

Typical questions that such a test answers are, for example, whether the design fulfils a certain standard or criterion. These can be questions about the time required to complete a task, such as:

  • How long do test subjects need for task X?
  • Are participants able to complete task X in (e.g.) less than one minute?

This is particularly relevant in an industrial context, in medicine and whenever devices or vehicles (cars, aeroplanes) need to be controlled - in other words, wherever response times play a role. Summative tests are also used in benchmarking, i.e. to make comparisons. Typical questions here would be, for example:

  • Is our product performing better than the competition?
  • Does Design A perform better than Design B?
  • Do more people click on the button for design A than for design B?

A/B tests are a typical representative of this category.
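To make the A/B question "Do more people click on the button for design A than for design B?" more concrete, here is a minimal sketch of one possible analysis, a two-proportion z-test. The choice of test, the function name and the click numbers are our assumptions for illustration; which statistical method is appropriate depends on the study design.

```python
import math

def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Two-sided z-test for the difference between two click rates.

    Hypothetical helper for illustration; assumes independent visitors
    and samples large enough for the normal approximation.
    """
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    # Pooled click rate under the assumption that A and B do not differ
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Nobody or everybody clicked in both groups: no measurable difference
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up example: 120 of 1000 visitors clicked in A, 100 of 1000 in B
z, p_value = two_proportion_z_test(120, 1000, 100, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests that the difference in click rates is unlikely to be pure chance. It still says nothing about why one design performs better.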

Typical results of summative tests are based on numbers:

  • 40% of our users were able to complete task X in less than 30 seconds.
  • Design A has a 40% higher error rate than Design B.
  • In Design A, 20% more people click on the button than in Design B.
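Key figures like the ones above can be derived directly from the measured task times. The following is a minimal sketch, assuming that one completion time in seconds is recorded per participant and that `None` marks a task that was not completed. The function name, the data format and the numbers are hypothetical.

```python
from statistics import median

def summarise_task_times(times_s, threshold_s):
    """Summarise task completion times (seconds) for one task.

    times_s: one entry per participant, None if the task was not completed.
    threshold_s: time limit of interest, e.g. 30 seconds.
    """
    if not times_s:
        raise ValueError("at least one participant is required")
    completed = [t for t in times_s if t is not None]
    under_threshold = [t for t in completed if t < threshold_s]
    return {
        "participants": len(times_s),
        "completion_rate": len(completed) / len(times_s),
        "share_under_threshold": len(under_threshold) / len(times_s),
        "median_time_s": median(completed) if completed else None,
    }

# Made-up measurements for ten participants; one did not finish the task
times = [22.0, 28.5, 41.0, None, 35.2, 19.8, 55.1, 29.9, 47.3, 61.0]
print(summarise_task_times(times, threshold_s=30.0))
```

The median is used here instead of the mean because task times are often skewed by a few very slow sessions. This is a design choice of the sketch, not a requirement.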

The results therefore relate to "how much" or "how long", but usually do not answer the "why" behind a specific behaviour. The test is called "summative" because it aims to summarise ("sum up") the results.

Requirements for summative tests

As a rule, summative tests require a finished product or at least a fully functional prototype: the system must function correctly for the test to yield a valid statement about efficiency.

Summative tests also require at least around 20-30 participants, depending on the statistical methods used. We therefore need someone who is familiar with statistics and with the requirements of the respective statistical methods.
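Why the number of participants matters can be illustrated with a confidence interval for a completion rate. The sketch below uses the Wilson score interval at a 95% level (z = 1.96). Both the method and the example numbers are our assumptions for illustration, not a prescribed procedure.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (default: 95% level)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# The same observed rate of 40% with 10 and with 30 participants
for successes, n in [(4, 10), (12, 30)]:
    low, high = wilson_interval(successes, n)
    print(f"n = {n}: {low:.2f} to {high:.2f}")
```

With the same observed rate, the interval is noticeably narrower for 30 participants than for 10. Statements such as "40% of our users" only become reasonably precise with larger samples.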

Formative tests: How do users experience the product / design?

Formative tests are more common within UX design processes, as they can be used as part of an iterative design process. The aim of formative tests is to find out something about the participants' experience and behaviour - for example, participants should tell us directly if they find something confusing or odd. Unlike summative tests, these tests often answer the "why" behind a specific behaviour.

Formative tests answer questions such as:

  • How do people experience our design?
  • Where do they get stuck, and above all: Why do they get stuck there?
  • What are the biggest problems/challenges with our design that we should fix next?

Formative tests can be carried out early in the design process and help to identify potential for optimisation. Ideally, they should be carried out very early, e.g. with click dummies (simple clickable prototypes), in order to recognise initial problems and rectify them quickly.

A typical result of such a test is qualitative rather than quantitative in nature, e.g.: "Participants had difficulty completing task X because the buttons labelled OK / Cancel were confusing."

Formative tests are therefore carried out when the aim is to uncover problems in order to identify further UX potential. They help us to "mould" the design for a product or service, hence the name "formative". In contrast to summative tests, where we need more test subjects due to the requirements of the statistical methods, formative tests with approx. 7-10 users can already identify some of the main problems, which can then be fixed.
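Why a handful of participants can already surface many problems can be sketched with a simple binomial model that is often used in usability research: if each participant runs into a given problem with probability p, the chance that at least one of n participants encounters it is 1 - (1 - p)^n. The value of p below is a purely hypothetical assumption. Real detection probabilities vary by product, task and problem.

```python
def share_of_problems_found(n_participants, detection_probability):
    """Expected share of problems seen at least once by n participants.

    Simplified model: every problem has the same detection probability
    and participants are independent of each other.
    """
    if n_participants < 0:
        raise ValueError("n_participants must not be negative")
    if not 0.0 <= detection_probability <= 1.0:
        raise ValueError("detection_probability must be between 0 and 1")
    return 1.0 - (1.0 - detection_probability) ** n_participants

# Hypothetical detection probability of 0.31 per participant
for n in (3, 7, 10):
    print(n, round(share_of_problems_found(n, 0.31), 2))
```

Under this assumption, the curve flattens quickly: additional participants beyond a small group mainly rediscover problems that have already been seen.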

Gain additional information with the thinking-aloud method

Experience has shown that the reasons for drop-offs often lie in the fact that users cannot find their way around or feel poorly informed. This can be uncovered very well with formative, moderated usability tests and the "thinking aloud" method (Ericsson & Simon, 1984). In "thinking aloud", participants continuously comment on their thought processes while interacting with the system. The aim is to gain additional information about the participants' cognitive processes while they operate the system being tested: What is going through their minds while they are operating it? What questions do they have at the moment? In which knowledge structures do they categorise the information presented? What irritates or confuses them?

Limitations of the method of thinking aloud

Of course, we can only verbalise what we are conscious of. However, some - in fact, many - of our mental processes take place below the threshold of consciousness and therefore cannot be verbalised (Wilson, 1994).

This is important to understand and is the reason why, for this form of usability testing, we tend to recommend moderated tests. Moderated usability tests are an excellent tool for finding out the WHY behind drop-offs and poor conversion rates.

React to subtle behavioural cues with moderated formative tests

In a moderated test - and this is the important thing - there is real-time interaction between the usability experts who moderate the test, i.e. guide the participants through it, and the test subjects. This means that we as usability experts sit together with the test participants, either remotely via video conference or on site, and guide them through the test. This is not possible with unmoderated usability tests, as they involve no real-time interaction with the test subjects. More on this below.

Through constant observation during a moderated usability test, we can, independently of what is being thought aloud (i.e. verbalised), additionally identify subtle cues in behaviour, so-called behavioural cues, such as facial expressions, squinting or frowning. We make a note of them and come back to these moments after the actual test. These subtle behavioural cues are often an indicator that users feel insecure but do not, or cannot, verbalise it because, as explained above, they are not necessarily aware of it. The aim is therefore to return after the actual test not only to the obvious problem points but also to exactly those points where such cues were observable, and to investigate in more depth whether something was not understood, whether uncertainty prevailed and what might have been going on. We can often obtain further valuable information by returning to the relevant points, having the participants repeat steps and asking specific questions.

Unmoderated formative tests

Unfortunately, this targeted enquiry is not possible with unmoderated usability tests, as unmoderated sessions are conducted by the participant alone: the test subjects usually carry out the test remotely from home using special online tools. These sessions are recorded in video and audio so that we as usability experts can view and analyse them afterwards. No real-time interaction with the participants takes place. Nevertheless, it is possible to build questions into the study, which can be asked either after each task (e.g. "How difficult did you find that?") or at the end of the session. However, these questions are usually standardised, i.e. the same for all participants. In unmoderated sessions, there is no opportunity to ask detailed questions tailored specifically to the behaviour of the respective participants or to engage with the participants in detail.

Another disadvantage can be that people think aloud less in unmoderated sessions, simply because there is no one there to remind them. We have observed in unmoderated sessions that participants become increasingly silent over time. That's a shame, because we then do not know what is going through the participants' minds while they are working on the task.

In addition, test subjects may drop out, skip tasks or generally be rather unmotivated to complete the tasks. We rarely find out what caused them to drop out, for example. Did the technology not work? Did they no longer feel like it? Were they interrupted, or was the task too difficult? This can mean that some sessions cannot be analysed. In a moderated test, by contrast, the social pressure of direct observation creates a little more motivation to carry out the tasks and to engage with them.

The lack of detailed follow-up questions about the specific problems of individual test subjects is a major disadvantage of unmoderated tests, especially for tests carried out in an early design phase. Unmoderated tests are often used because of their alleged time savings. Of course, you save the time in which the moderators interact 1:1 with the participants. In our opinion, however, this is often accompanied by a considerable loss of insight, as described above. In addition, an unmoderated usability test requires just as much planning as a moderated test, if not more. If an unmoderated test is to be carried out despite all this, we recommend it only for functional systems such as live websites, because the non-functional parts of a click dummy, for example, could raise too many questions. If in doubt, we always recommend a moderated session over an unmoderated one, as moderated sessions generally provide more insight.

So we can see that the type of usability test to be carried out - summative or formative, moderated or unmoderated - depends on what exactly you want to find out and how. Summative tests require a functional prototype or a finished product and provide information about the efficiency of a product. Formative tests help, either very early in the design process or with finished products, to identify problematic areas and thus further UX potential. Within formative tests, moderated sessions offer the great advantage of specific follow-up questions and thus significantly increase the chances of gaining detailed insights into the user experience, and with them valuable knowledge for improving the UX of the system.

Literature

  • Ericsson, K. A., & Simon, H. A. (1984). Protocol analysis: Verbal reports as data (p. 426). The MIT Press.
  • Wilson, T. D. (1994). The Proper Protocol: Validity and Completeness of Verbal Reports. Psychological Science, 5(5), 249-252. https://doi.org/10.1111/j.1467-9280.1994.tb00621.x

Illustration

Web illustrations by Storyset
