
Unit 9 Research types

Learning outcomes

By the end of this unit you should:

  • know the three types of evaluation: accuracy, usability and efficacy
  • know the advantages and disadvantages of each type
  • be able to describe the similarities and differences among the three types

Activity 1 Software development and evaluation

Read

Most research in the TNT lab involves the design, development and refinement of software. The software is likely to incorporate natural language processing or natural language generation. Although most of the time spent on your research will be on the design and development phases, the evaluation phase is a central part of your thesis. There are three common types of evaluation: accuracy, usability and efficacy. Each of these types is considered below.

Activity 2 Evaluating accuracy

Read.

If a calculator is 99% accurate, there will be, on average, one error in every 100 calculations. If the calculator is 100% accurate for all simple operations but only 50% accurate for square root calculations, the software developer can focus on improving the square root function.

In the context of identifying tenses, we need to determine the overall accuracy. This could be done by using a set of sentences with known tenses and comparing the results of the software to the known tenses.
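The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `accuracy_report`, the tense labels and the sample data are all invented for the example, and the predicted labels would in practice come from your own software. The per-tense breakdown plays the same role as the per-operation figures for the calculator: it shows where the errors are concentrated.

```python
from collections import defaultdict


def accuracy_report(gold, predicted):
    """Return (overall accuracy, accuracy per tense).

    gold and predicted are equal-length lists of tense labels:
    gold holds the known tenses, predicted holds the software's output.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one sentence is required")

    correct = defaultdict(int)  # correct results per known tense
    total = defaultdict(int)    # number of sentences per known tense
    for known, guess in zip(gold, predicted):
        total[known] += 1
        if known == guess:
            correct[known] += 1

    overall = sum(correct.values()) / len(gold)
    per_tense = {tense: correct[tense] / total[tense] for tense in total}
    return overall, per_tense


# Illustrative data only: these labels are made up for the example.
gold = ["past simple", "past simple", "present perfect",
        "present perfect", "future simple"]
predicted = ["past simple", "past simple", "past simple",
             "present perfect", "future simple"]

overall, per_tense = accuracy_report(gold, predicted)
print(f"Overall accuracy: {overall:.0%}")
for tense, value in per_tense.items():
    print(f"  {tense}: {value:.0%}")
```

With this sample data the overall accuracy is 80%, but the breakdown shows that all of the errors fall on one tense, which is where development effort would be best spent.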

If the tense is identified correctly, the result is a true positive. If the tense is identified incorrectly, or a tense is identified when there is no tense, the result is a false positive. If no tense is identified when there is no tense, the result is a true negative. Finally, if no tense is identified when there is a tense, the result is a false negative. The aim for the developer is to increase the true results and minimize the false results. Of the false results, the priority should be on minimizing false positives, as those will cause users the most problems.
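Counting the four kinds of result can be sketched as follows. The sketch makes several assumptions that are not part of this unit: `None` stands for "no tense", the function names and sample data are invented, and a missed tense (nothing reported although the sentence has a tense) is counted as a false negative, while any wrongly reported tense is counted as a false positive.

```python
from collections import Counter


def classify_outcome(gold, predicted):
    """Label one result. None means 'no tense'."""
    if predicted is None:
        # The software reported no tense.
        return "true negative" if gold is None else "false negative"
    # The software reported a tense.
    return "true positive" if predicted == gold else "false positive"


def count_outcomes(gold_labels, predicted_labels):
    """Count each outcome type over a whole test set."""
    return Counter(
        classify_outcome(g, p) for g, p in zip(gold_labels, predicted_labels)
    )


# Illustrative data only: None marks an item with no tense.
gold = ["past simple", "present perfect", None, "future simple", None]
predicted = ["past simple", "past simple", None, None, "present simple"]

counts = count_outcomes(gold, predicted)
for outcome in ("true positive", "false positive",
                "true negative", "false negative"):
    print(f"{outcome}: {counts[outcome]}")
```

Tracking the counts separately, rather than a single accuracy figure, makes it possible to see whether a change to the software reduced the false positives in particular.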

accuracy vs. precision

High precision, but low accuracy

Activity 3 Accuracy studies

Watch this explanation of the difference between accuracy and precision (2 min 17 sec).

Activity 4 Evaluating usability

Decide which (if any) of these usability tests is appropriate for your prototype.

Usability studies can be conducted at very different levels, from simply asking your best friend what he or she thinks about your software through to sophisticated, well-planned, extensive testing. There are three common ways of implementing tests:

  • Hallway test: Set up your device in a place where many people walk past (i.e. a high foot traffic area) and ask them to try out your software.
  • Remote test: Recruit people to try out your software online from the comfort of their home (or cafe, etc.) in their own time without any interaction with you. This more impersonal testing can sometimes result in more direct honest feedback.
  • Moderated test: Conduct your study either face-to-face or via video conferencing or screen-sharing software. In this type of test you have the opportunity to ask users follow-up questions about their feedback and actions.

Usability studies are often designed as a sequence of instructions, giving users tasks to achieve. For tense visualization software, a task could be "Find the tense of a short sentence using this software." The task does not tell users how to do it or provide steps. If the software is user-friendly, the user should be able to work out where to go and what to do.

More important than the usability test are the improvements you make to the software based on problems and feedback in the usability test. Frequently a series of tests will be used during the creation of a piece of software.

hallway in Corvinus University of Budapest
I need 5 people
no volunteers
Source: UX Planet

Activity 5 Usability studies

Watch this explanation of the differences between qualitative and quantitative usability studies (4 min 02 sec).

Activity 6 Evaluating efficacy

Read.

Efficacy is the ability to get the desired result. The adjective "effective" is used to describe the same concept. This means that the outcome must be the desired result. If the desired result for the user is to find the name of a tense and your prototype produces that, it is effective. Measuring effectiveness (efficacy) in this case means (1) identifying the desired outcome, and (2) checking that the outcome is accurate.

For example, if your software is designed to help people learn, the efficacy evaluation will need to assess whether users learnt. This is a non-trivial undertaking, i.e. it is very difficult to do. To establish learning, you need to show that (1) users did not know something, then (2) they learnt something, and finally (3) the learning was the result of the software. Measuring learning is very complex, and many issues make this task challenging. We can measure the scores obtained on tests, but do the tests measure what we want them to measure? We can ask whether students know something, but can we rely on their answers?
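One common way to approach the three points is a pre-test and a post-test, with a comparison group that did not use the software. The sketch below assumes that design; it is not prescribed by this unit. The function name `mean_gain`, the group names and all scores are invented for illustration, and comparing mean gains is only a first look at the data, not a substitute for a proper statistical test or for checking that the tests themselves are valid.

```python
from statistics import mean


def mean_gain(pre_scores, post_scores):
    """Average improvement per participant (post-test minus pre-test)."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("each participant needs a pre and a post score")
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))


# Invented scores out of 10, one pair per participant.
software_pre = [3, 4, 2, 5]    # (1) what users knew before
software_post = [7, 8, 6, 8]   # (2) what they knew afterwards
control_pre = [3, 5, 2, 4]     # comparison group without the software
control_post = [4, 6, 3, 5]

software_gain = mean_gain(software_pre, software_post)
control_gain = mean_gain(control_pre, control_post)

# (3) the part of the gain that the comparison group does not show
difference = software_gain - control_gain

print(f"Gain with software:    {software_gain}")
print(f"Gain without software: {control_gain}")
print(f"Difference:            {difference}")
```

The comparison group matters because participants often improve between two tests simply through practice, so the gain in the software group alone does not show that the software caused the learning.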

effective vs. efficient

Activity 7 Efficacy evaluation

Watch this explanation of the triple E framework of engagement, enhancement and extension, which can be used to evaluate learning from software (11 min 05 sec).

Activity 8 Performance metrics

Watch this explanation of the different types of metrics used in software engineering. These may also be used depending on the aims of the project (8 min 16 sec).

Knowledge and application

Activity 9 Decisions, decisions

Decide what type of evaluation would be most appropriate for your research project. Share your ideas with your tutor via Slack when you have decided.

mindmap

Source: Medium

Review

Make sure you know the meaning of the following terms:

  1. accuracy
  2. precision
  3. usability
  4. efficacy
  5. engagement
  6. enhancement
  7. extension

Make sure you have completed your needs analysis and started to follow your action plan.