Unit 9: Prototype evaluation

Learning outcomes

By the end of this unit you should:

know how to evaluate the usability, accuracy and efficacy of a prototype
have practised at least one form of evaluation

Activity 1: Importance of evaluation

Read.

Software developers and engineers create software, and each piece of software is designed for a particular purpose. Users may have high expectations: if the software exceeds them, they will be delighted; if it cannot meet them, they will be disappointed. How accurate do you expect the following software programs to be?

clock
online calculator
spell checker
grammatical error checker
tense identifier

Consider why you might have different levels of expectations for different types of software.

Commentary

In the days of analogue clocks, people did not expect clocks to be accurate to the second or even minute. Nowadays with atomic clocks, our expectations are much higher.

Calculators should be accurate, but is that the case? What does your calculator show for 2 divided by 3? Calculators have to shorten long results to fit the display, and depending on whether they truncate or round, different results may be obtained (for example, 0.66666666 or 0.66666667 on a display showing eight decimal places). For more information check here.
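The difference between truncating and rounding can be sketched in a few lines of Python. This is only an illustration: the eight-decimal display width and the helper name `display` are assumptions, not a description of how any particular calculator works.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def display(value: Decimal, digits: int, mode: str) -> str:
    """Shorten value to `digits` decimal places using the given rounding mode."""
    quantum = Decimal(1).scaleb(-digits)  # digits=8 gives Decimal('1E-8')
    return str(value.quantize(quantum, rounding=mode))


two_thirds = Decimal(2) / Decimal(3)

truncated = display(two_thirds, 8, ROUND_DOWN)    # extra digits are cut off
rounded = display(two_thirds, 8, ROUND_HALF_UP)   # last digit is rounded

print(truncated)  # 0.66666666
print(rounded)    # 0.66666667
```

Both answers are "correct" for their method, which is why two calculators can disagree in the last digit.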

Spell checkers are reasonably accurate, but are limited by their dictionary size and choice of language standard (e.g. British, American, etc.). Analogue is spelt with "ue" at the end in British English, but this difference means it will be highlighted as a spelling error by software using a dictionary of American English.
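A minimal sketch of why the choice of dictionary matters is shown below. The two word lists are tiny, made-up examples and `misspelt` is a hypothetical helper; real spell checkers use far larger dictionaries and more sophisticated matching.

```python
# Tiny hypothetical word lists standing in for full dictionaries.
BRITISH = {"analogue", "clock", "colour", "the"}
AMERICAN = {"analog", "clock", "color", "the"}


def misspelt(text: str, dictionary: set[str]) -> list[str]:
    """Return the words in text that are not found in the dictionary."""
    return [word for word in text.lower().split() if word not in dictionary]


sentence = "The analogue clock"

print(misspelt(sentence, BRITISH))   # []
print(misspelt(sentence, AMERICAN))  # ['analogue']
```

The sentence is unchanged, but whether it contains an "error" depends entirely on the language standard chosen.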

Grammatical error checkers and tense identifiers are even less accurate.

Activity 2: Understanding accuracy and precision

Watch and listen to a short explanation of accuracy and precision.

Activity 3: Evaluating accuracy

Read.

If a calculator is 99% accurate, there will be, on average, one error in every 100 calculations. If the calculator is 100% accurate for all simple operations but only 50% accurate for square root calculations, the software developer can focus on improving the square root function.
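Breaking accuracy down by operation could be sketched as follows. The test results here are invented purely for illustration, and `accuracy_by_operation` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical test log: (operation, whether the calculator's answer was correct).
results = [
    ("add", True),
    ("add", True),
    ("multiply", True),
    ("sqrt", True),
    ("sqrt", False),
    ("sqrt", False),
    ("sqrt", True),
]


def accuracy_by_operation(results):
    """Return the fraction of correct answers for each operation."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for operation, was_correct in results:
        totals[operation] += 1
        correct[operation] += int(was_correct)
    return {op: correct[op] / totals[op] for op in totals}


per_operation = accuracy_by_operation(results)
print(per_operation)  # {'add': 1.0, 'multiply': 1.0, 'sqrt': 0.5}
```

A single overall figure would hide the fact that every error in this log comes from one function.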

In the context of identifying tenses, we need to determine the overall accuracy. This could be done by using a set of sentences with known tenses and comparing the results of the software to the known tenses.
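As a rough sketch, the comparison might look like this in Python. The sentences' known tenses and the software's output are made-up values, and `overall_accuracy` is a hypothetical helper.

```python
# Known (gold) tenses for four test sentences, and what the software reported.
# Both lists are invented for illustration.
known = ["past simple", "present perfect", "future simple", "present simple"]
reported = ["past simple", "present simple", "future simple", "present simple"]


def overall_accuracy(known, reported):
    """Return the fraction of sentences whose reported tense matches the known tense."""
    if len(known) != len(reported):
        raise ValueError("Both lists must cover the same sentences.")
    matches = sum(k == r for k, r in zip(known, reported))
    return matches / len(known)


accuracy = overall_accuracy(known, reported)
print(accuracy)  # 0.75
```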

If the software reports a tense and it is the correct one, the result is a true positive. If the software reports a tense that is incorrect, or reports a tense when there is none, the result is a false positive. If no tense is reported when there is no tense, this is a true negative. Finally, if no tense is reported when there is one, the result is a false negative. The aim for the developer is to increase the true results and minimize the false results. Within those, the priority should be on minimizing false positives, as those will cause users the most problems.
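The four outcomes can be counted with a short script. This sketch uses the standard definitions: a false positive is a tense reported when it is wrong or absent, and a false negative is a tense that was missed. The test pairs are invented, `None` is an assumed way of marking "no tense", and `classify` is a hypothetical helper.

```python
from collections import Counter


def classify(known, reported):
    """Label one result as TP, FP, TN or FN. None means 'no tense'."""
    if reported is not None:
        return "TP" if reported == known else "FP"
    return "TN" if known is None else "FN"


# Hypothetical (known tense, reported tense) pairs.
pairs = [
    ("past simple", "past simple"),        # TP: correct tense reported
    ("present perfect", "past simple"),    # FP: wrong tense reported
    (None, "present simple"),              # FP: tense reported where there is none
    (None, None),                          # TN: correctly reported nothing
    ("future simple", None),               # FN: tense missed
    ("present simple", "present simple"),  # TP
]

counts = Counter(classify(known, reported) for known, reported in pairs)

# Precision: how often a reported tense can be trusted.
precision = counts["TP"] / (counts["TP"] + counts["FP"])
# Recall: how many of the positive and missed cases were caught.
recall = counts["TP"] / (counts["TP"] + counts["FN"])

print(dict(counts))  # {'TP': 2, 'FP': 2, 'TN': 1, 'FN': 1}
print(precision)     # 0.5
```

Reducing false positives raises precision, which is the figure users notice most, because every false positive is a wrong answer shown to them.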

High precision, but low accuracy

Activity 4: Evaluating usability

Decide which (if any) of these usability tests is appropriate for your prototype.

Usability studies can be conducted at very different levels, from simply asking your best friend what he or she thinks about your software through to sophisticated, well-planned, extensive testing. There are three common ways of implementing tests:

Hallway test: Set up your device in a place where many people walk past (i.e. a high foot traffic area) and ask them to try out your software.
Remote test: Recruit people to try out your software online from the comfort of their home (or cafe, etc.) in their own time without any interaction with you. This more impersonal testing can sometimes result in more direct honest feedback.
Moderated test: Conduct your study either face-to-face or via video conferencing or screen-sharing software. In this type of test you have the opportunity to ask users follow-up questions about their feedback and actions.

Usability studies are often designed as a sequence of instructions, giving users tasks to achieve. For tense visualization software, a task could be "Find the tense of a short sentence using this software." The task does not tell users how to do it or provide steps. If the software is user-friendly, the user should be able to work out where to go and what to do.

More important than the usability test are the improvements you make to the software based on problems and feedback in the usability test. Frequently a series of tests will be used during the creation of a piece of software.

hallway in Corvinus University of Budapest

Source: UX Planet

Activity 5: Evaluating efficacy

Read.

Efficacy is the ability to get the desired result. The adjective "effective" is used to describe the same concept. This means that the outcome must be the desired result. If the desired result for the user is to find the name of a tense and your prototype produces that, it is effective. Measuring effectiveness (efficacy) in this case means (1) identifying the desired outcome, and (2) checking the outcome is accurate.

For example, if your software is designed to help people learn, the efficacy evaluation will need to assess whether users learnt. This is a non-trivial undertaking, i.e. it is very difficult to do. To establish learning, you need to show that (1) users did not know something, then (2) they learnt something, and finally (3) the learning was the result of the software. Measuring learning is very complex, and many issues make this task challenging. We can measure the scores generated on tests, but do the tests measure what we want them to measure? We can ask whether students know something, but can we rely on their answers?
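One common way to approach the three steps is a pre-test and post-test with a comparison group that did not use the software. The sketch below assumes that design; it is not something this unit prescribes. All scores are invented, `mean_gain` is a hypothetical helper, and a real study would also need a statistical test and a check that the test itself is valid.

```python
from statistics import mean


def mean_gain(pre_scores, post_scores):
    """Return the average improvement from pre-test to post-test."""
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))


# Hypothetical test scores (out of 100) for the same learners before and after.
tool_pre, tool_post = [40, 55, 50], [70, 75, 75]        # used the software
control_pre, control_post = [45, 50, 52], [55, 60, 62]  # did not use it

tool_gain = mean_gain(tool_pre, tool_post)            # steps (1) and (2)
control_gain = mean_gain(control_pre, control_post)
gain_linked_to_tool = tool_gain - control_gain        # rough indication for step (3)

print(tool_gain, control_gain, gain_linked_to_tool)  # 25 10 15
```

Even with a comparison group, the difference only suggests that the software helped; it does not show that the test measured the right thing.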

During this course, however, there is not enough time to evaluate efficacy.

Activity 6: Purpose, outcome and measurement

Decide the exact purpose of your prototype, the desired outcome for the users, and how you can test or measure whether that outcome can be achieved.

Discuss your ideas with your teammates (if any).

Activity 7: Practice activity

Evaluate an online tool designed to help language learners on its accuracy and usability. Your tutor may assign you a specific tool to evaluate.

Activity 8: Reciprocal peer evaluation

Evaluate a prototype created by another team and share the evaluation with them. Then change over and have your prototype evaluated and receive feedback. For any actionable feedback, implement the changes.

Knowledge and application

Activity 9: Prototype submission and evaluation

Follow the detailed instructions on ELMS and submit the required items for your selected project type.

Review

Rather than reviewing concepts from this course, check your code is clean. Here are some tips from "Clean code" by Robert C. Martin, 2017.

Follow standard conventions.
Reduce complexity.
Be consistent.
Avoid negative conditionals.
Choose descriptive and unambiguous names.
Add comments to clarify code if necessary.
Keep lines short.
Make sure the code does not smell!
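Two of these tips, descriptive names and avoiding negative conditionals, can be illustrated with a small before-and-after sketch. Both functions are made-up examples and behave identically; only the readability differs.

```python
# Harder to read: a vague name and a negated condition.
def chk(s):
    if not len(s) == 0:
        return True
    return False


# Clearer: a descriptive name and a positive condition.
def has_text(sentence: str) -> bool:
    """Return True when the sentence contains at least one character."""
    return len(sentence) > 0


print(has_text("I ran."))  # True
print(has_text(""))        # False
```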
