Toward a Reliable and Valid Usability Testing Methodology discusses the usefulness of design testing:
There is no known usability testing method that is both reliable and valid. End of discussion. Why do people continue to ignore this, when instead we should be doing something about it?
We’ve struggled with this very problem ourselves. Do we even need this “science” when the art of humans observing humans could suffice? How can we improve user testing so that it’s more reliable, valid, and useful?
How do you take fundamentally qualitative data (comfort with navigation, understanding of a checkout process, etc.) and mix that with quantitative data (number of clicks, load speed, time to search for a product, etc.) to get something that’s reproducible to the point where the true user interface problems poke through?
[via WebWord]
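To make the qualitative/quantitative question above a bit more concrete, here is a minimal sketch of one way to line the two kinds of data up per task so that problem areas "poke through" across sessions. The session records, field names, comfort scale, and composite weighting are all hypothetical, invented for illustration; none of them come from the linked article, and the weighting itself is exactly the sort of thing that would need validation.

```python
# Hypothetical usability-test records: objective measures plus a subjective
# comfort rating (1 = lost, 5 = confident) collected per participant per task.
from collections import defaultdict
from statistics import mean

sessions = [
    {"task": "checkout", "completed": True,  "seconds": 210, "errors": 3, "comfort": 2},
    {"task": "checkout", "completed": False, "seconds": 340, "errors": 5, "comfort": 1},
    {"task": "search",   "completed": True,  "seconds": 45,  "errors": 0, "comfort": 4},
    {"task": "search",   "completed": True,  "seconds": 60,  "errors": 1, "comfort": 5},
]

by_task = defaultdict(list)
for s in sessions:
    by_task[s["task"]].append(s)

for task, records in by_task.items():
    completion_rate = mean(1.0 if r["completed"] else 0.0 for r in records)
    avg_errors = mean(r["errors"] for r in records)
    avg_comfort = mean(r["comfort"] for r in records)
    avg_seconds = mean(r["seconds"] for r in records)
    # Arbitrary composite score: the point is only that mixed observations
    # get reduced to something comparable and repeatable across tests.
    score = completion_rate * (avg_comfort / 5) - 0.1 * avg_errors
    flag = "  <-- investigate" if score < 0.5 else ""
    print(f"{task}: completion={completion_rate:.0%} errors={avg_errors:.1f} "
          f"time={avg_seconds:.0f}s comfort={avg_comfort:.1f} score={score:.2f}{flag}")
```

Run over enough sessions, a summary like this at least makes the same interface problems show up the same way twice, which is the reproducibility part of the question, even if the weighting is still art rather than science.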
I design for the least common denominator and then sprinkle in some razzle dazzle for some flair.
It's like trying to create a science to dictate "elegance".
It is very difficult to put numbers to such tests, but usually there are some basic guidelines or rules of thumb that can help. I wouldn't go as far as calling it "science" though.
As a QA Manager that doesn't always have the means of a proper usability testing lab, I can only say that some of these guidelines have proved to be very helpful to me in the past.
We need at least part of usability to be a science so we can create a standard by which we can avoid the "erroneous, fraudulent, unconscious cheating, and/or self-deception." I'm not hoping to measure what cannot be measured, only for a way to measure the part of usability that can be, in a way that is meaningful and consistent.
Thanks for the comments. Keep them coming.
Part of the problem is that usability people don't share anything like best practices and internal guidelines. Yes, people publish research, but nothing is collected and then stated as a valid guideline. As in "For users 18-24 years of age, 8 point font is acceptable for reading passages that are 200 words or less when the objective is comprehension."
And everyone agrees that, given this case, 8 point font is what will be used. But we are too segregated by our companies to do this (so far). usability.gov is doing it, and Robert Bailey (referenced in ronz's blog and my own) is part of that effort.
The best thing for it is usability objectives and design guidelines.
Probably the underlying question here is how do we make everyone take seriously what we do? We tend to throw around numbers and point to research. But overall I think it is just going to take companies taking chances that usability does work, then seeing a rise in sales, a lowering of clicks, or whatever.
And another rambling comment... stop heuristic evaluations.
Okay. Not sure why 37signals was part of the link I typed (since I only typed my URL).
It should be here.
But it has been a long day, so perhaps it is me. :S
Okay. Not sure why 37signals was part of the link I typed (since I only typed my URL).
Fixed it... You'd swapped the colon and the forward slashes around.
Fixed it... You'd swapped the colon and the forward slashes around.
Cause I'm a loser, baby. Thanks.
I also want to give props (hey, I've never given props before... I see the appeal) to Sanjay Koyani, who is also involved with usability.gov. Interestingly enough, usability.gov is part of the National Cancer Institute's communications department.
NCI's comm dept will be doing an article for User Experience magazine this year. The focus will be on their guideline research.
How can we improve user testing so that it's more reliable, valid, and useful?
Scott, what did you mean specifically by useful? By definition, valid includes useful. Do you mean practical, or something else?
How do you take fundamentally qualitative data...
I don't think that there is anything wrong with the classic measures of a usability test: errors & time, especially when augmented with a measure of preference (e.g. a survey). Currently, there are probably problems with how people subjectively identify errors, but I don't think the problems are insurmountable.
I don't have much hope for many of the measures that the automated testing tools gather (number of clicks, load speed, etc.) except to augment the classic measures. But who knows? Once we have a reliable, valid testing method, we can look for correlations.
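As a rough sketch of those classic measures, the snippet below tallies errors, time, and a preference score per participant, and then checks whether an automated measure (click count) even tracks one of the classic ones. The participants, numbers, and field names are made up for illustration, and the correlation is hand-rolled Pearson so the example has no dependencies beyond the standard library.

```python
# Hypothetical per-participant results from one test: the classic measures
# (errors, task time) plus a preference score from a short survey, alongside
# an automatically gathered click count.
from statistics import mean, stdev

participants = [
    {"errors": 1, "seconds": 95,  "preference": 6, "clicks": 14},
    {"errors": 4, "seconds": 180, "preference": 3, "clicks": 31},
    {"errors": 0, "seconds": 70,  "preference": 7, "clicks": 12},
    {"errors": 2, "seconds": 120, "preference": 5, "clicks": 20},
]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

print("mean errors:    ", mean(p["errors"] for p in participants))
print("mean time (s):  ", mean(p["seconds"] for p in participants))
print("mean preference:", mean(p["preference"] for p in participants))

# If click counts correlate strongly with task time, the automated measure
# may be a useful supplement to the classic ones; if not, it is probably
# noise for this purpose.
r = pearson([p["clicks"] for p in participants],
            [p["seconds"] for p in participants])
print(f"clicks vs. time correlation: r = {r:.2f}")
```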
By getting companies to invest in "humans observing humans" as part of the budget for a project. How often have big companies dumped thousands into new campaigns to "test run" them before they're introduced, only to find out they miss the mark and have to be changed? Every day. Nothing beats observing groups or individual humans. But it's expensive and takes time - and the internet still isn't seen as something that's important enough to put "real" money into, like print advertising campaigns or TV spots.