Software Testing Club - An Online Software Testing Community

James Christie

Testing - an exercise of skill, not a game of chance?

Yesterday evening Michael Bolton posted the sort of Tweet that makes Twitter worthwhile. It made me think.

"Q. What's wrong with this post? http://bit.ly/dDAL41 A. It treats testing as a game of chance, not an exercise of skill."

Normally I'd agree with Michael. Testing is an exercise of skill. In this context, however, I can't agree. The linked article is talking about the number of users required to detect usability defects. It's not about how many testers are required to detect defects in the functionality.

Testers do have to use their skill. Throwing more testers at an application without consideration of their skill levels is a ridiculous way to test. In fact, it wouldn't really be testing at all, but that's another debate.

Usability testing is about how the users will interact with the application. Testers can, and should, try to put themselves into the minds of the users, but they cannot be an adequate substitute for real users. 

Real users may approach an application with unrealistic expectations, or with attitudes that developers and testers consider irrational. They will usually have no existing knowledge of the application. In the case of internet applications they will often lack a knowledge of the conventions, and the culture, of these applications.

In short, users are liable to be ignorant or uninformed. Testers will be knowledgeable. It is extremely difficult to think oneself back into a state of ignorance. To do so would require us to consciously choose what knowledge we are going to dispense with. Genuine users aren't even necessarily aware of what they don't know.

I am not denigrating users. Why should they know as much as we do about the applications under test, or the culture of web applications? We have to adapt to them. If we expect them to adapt to us they will leave us for a competitor who is more flexible.

We can anticipate many of the problems they might have, by use of heuristics, inspections, prototyping, testing on wireframes. But this doesn't tell us what real users might do when they get their hands on the application. Real users surprise us. 

Testing the functionality requires a high level of skill, and Michael is quite right that this is not a matter of chance. Testing the usability requires real users if it is to be effective. The decision about how many are required to give us an acceptable level of confidence in the application, at a cost we consider acceptable, becomes an important question. Probability is then relevant, and that is what Jeff Sauro was talking about in the article Michael referred to.
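To make the probability question concrete, here is a minimal sketch. It assumes the simple binomial model commonly used for this kind of estimate: each user independently runs into a given usability problem with the same probability p, so the chance that at least one of n users hits it is 1 - (1 - p)^n. The function names and the 0.3 figure are illustrative assumptions, not values taken from Jeff Sauro's article.

```python
import math


def detection_probability(p: float, n: int) -> float:
    """Chance that at least one of n independent users hits a problem
    that each user encounters with probability p."""
    return 1.0 - (1.0 - p) ** n


def users_needed(p: float, target: float) -> int:
    """Smallest number of users giving at least `target` chance of
    seeing the problem once, under the same independence assumption."""
    if not 0.0 < p < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("p and target must be strictly between 0 and 1")
    n = max(1, math.ceil(math.log(1.0 - target) / math.log(1.0 - p)))
    # Guard against floating-point rounding in the logarithm ratio.
    while detection_probability(p, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical problem that affects 30% of users.
    print(round(detection_probability(0.3, 5), 4))
    print(users_needed(0.3, 0.85))
```

The model says nothing about which problems matter or how skilled the observer is. It only relates the number of users to the chance of seeing a problem of a given frequency, which is the narrow question of cost against confidence.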

Am I right? Obviously I think so, but this isn't a matter of blind faith. My stance is just a working hypothesis I'm currently comfortable with. If someone wants to try and convince me that skilled, professional testers can do a good enough job impersonating real users, in all their baffling complexity, then I'd love to see the argument.

Replies to This Discussion

Thanks James!

- Rob

I think it would be a bit better if he had turned the logic on its head and stated the real usefulness of the binomial distribution - with n tests completed and k defects found, how confident can we be that the software is releasable? I think this also gets around all of Michael's "problems".

No, that would exacerbate them. The decision to release a product is not, to me, a function of this kind. To the extent that a function like this might be involved in the decision at all, the first thing I would want to do is invalidate its conclusion that the software is releasable.

Apropos of that, I happen to have been re-reading Quality Software Management Vol. 2 -- (Weinberg) last evening. A brief excerpt:

"Why is the attempt to invalidate interpretations so important? We all tend to see only those things that support our favourite hypotheses, and to be blind to others. What we think we're seeing is not always what we're seeing. So we always need to check it out, to avoid the following pitfalls:
  • There is no verification of a situation that everyone accepts as fact.
  • Data may have been unconsciously selected.
  • Data may have been consciously selected.
  • Measurements may be wrong because of misunderstanding.
  • Measurements may be wrong because of falsification."

So you could use the binomial distribution to tell you not to release the software, but using it to tell you to release the software would be a risky business indeed.
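As a rough illustration of the calculation under dispute, the sketch below computes an exact one-sided (Clopper-Pearson) upper bound on the per-test failure probability after n tests with k failures. It assumes every test is an independent trial with the same failure probability, which real testing seldom satisfies. The function names are hypothetical and the 95% level is an arbitrary choice.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def failure_rate_upper_bound(n: int, k: int, confidence: float = 0.95) -> float:
    """Largest failure probability still consistent, at the given one-sided
    confidence level, with observing at most k failures in n trials."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    if k == n:
        return 1.0
    alpha = 1.0 - confidence
    lo, hi = 0.0, 1.0
    # The CDF decreases as p grows, so bisect for the p where it equals alpha.
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if binom_cdf(k, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # Twenty tests, no failures observed.
    print(round(failure_rate_upper_bound(20, 0), 3))
```

Even with no failures in twenty tests, the bound is roughly 0.14. Under these assumptions a clean run rules out far less than it appears to, and the number is only as trustworthy as the independence assumption behind it.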

I thought my job as a skilled professional tester was to impersonate a real user?

Consider that carefully. If that were true, it would be a silly idea to train and retain testers. Why hire someone to impersonate a real user? Just give the product to a real user, thank them for their time, and (perhaps) pay them.

Instead, your job as a skilled professional is to identify the problems that a user might perceive and problems that a user might not perceive. There are certain things that users can bring to the table, and to the extent that you can't do that, you need to engage with users directly and collaboratively.

To the table you bring not only what a user might think and do, but also technical skill, analytical skill, facility with tools, and scientific thinking: all things that users might have as personal capabilities, but not as professional skills focused on the task of questioning, exploring, investigating, learning, and reporting about the product. Note that some users might be better at some or all of these things than some testers are. That should be a cause of serious concern for those testers.

---Michael B.

James and Michael,

Thank you for sharing your thought-provoking views. No "I disagree with this" or "I agree with that" in my comment, just a simple "thank you". It's this kind of dialogue, between people whose opinions I respect and who thoughtfully analyze a question from different perspectives, that makes coming back to softwaretestingclub worthwhile.

---- Justin Hunter

Thank you for that Justin. If I disagree with someone with whom I normally agree then I probably learn more than if I'm simply agreeing all down the line. I'm forced to think more deeply and question my assumptions. I suppose the same applies if I find myself agreeing with someone who's usually wrong! However, that's something I find happens more in political discussions than testing ones!

Hi Michael,

I take your points and disagree.

First, on the idea that using statistics to make decisions just isn't the scientific method. In science there are no true facts, just theories that stand because they haven't been disproved yet. To me testing is the same. I can never really say (except in a few limiting cases) that something is truly working. I can just say that for what I have tried it is definitely not not-working and by extension probably working. No one really knows if Einstein was right, but with the evidence we currently have, we can say that he probably wasn't wrong. I agree with Weinberg: heuristics such as Anchoring and Adjustment are too prevalent, and a cold, objective, scientific view is needed. Using the binomial properly is sensible and useful.

I think your idea of giving the product to a real user is hilarious! Give it to them, pay them and then wait for the lawsuits to pour in (see Toyota, who seem to have tried something similar). Your point about the extra skills (exploring, reporting etc.) is valid, but at the end of the day each of my tests is something that the user could intentionally or accidentally do. I create scenarios that replicate those a user is likely to encounter simply to see what will happen. My mate does really good impersonations of Matt Damon. We could of course just ask Matt Damon to say those things and pay him royally for his service. I suspect he wouldn't be as humorous, or as good at football, or as available to go to the cinema as my mate though. I amn't a commercial user of my company's software, and the users aren't testers; I do impersonate their (likely) behaviour on a daily basis just to protect them from possible problems.

Thanks,

- Rob

That's not what scientific theories are. Scientific theories are based on and contain facts. Consider the theory of gravity. I don't think anyone here is waiting for someone to disprove gravity. The (often intentional) confusion of "scientific theory" with the more common usage of "theory" is a fallacy used to fight against empiricism more often than for it. Example: evolution denial. Believe it or not, science is based on facts. But don't take my word for it.

Also, in response to a couple of people on this thread, the end users are always testing the software. We never know whether or not there are bugs. We only know that they didn't exist in the specific paths and configurations we tested, assuming we would have noticed them. How much of the testing the end users do is determined by the amount of risk that management is willing to accept and our ability to try to determine that risk. The "continuous deployment" camp is probably the edgiest of the "make the users test for us" faction today. It's an interesting experiment and I look forward to seeing how it works longer term.

© 2010   Created by Rosie Sherry
