Cayne's QA Blog

“Someday, someone may experience these bizarre events. Hopefully, they will find my notes useful.” – Harry Mason


Is It Trustworthy? Pt.2

“There is no spoon.” – Spoon Boy

Welcome Back To Our Unit Trust Discussion!

I’ve been deep in QA for a while now and the wisdom of Spoon Boy perfectly sums up what I had to build in order to track an application’s health. I knew what I wanted from a system — but the existing tools didn’t allow it, so I bent them.

I will be super interested to know if any other QA people have built something similar or if I’m the only one on the Nebuchadnezzar!

Scores, Factors and Ranks

Now let’s talk about the structure that makes the Unit Trust system tick — a structure I believe is especially well-suited for game dev.

Sure, you can count bugs. Track open vs. fixed. Plot that on a graph and watch the lovely burndown chart go down over time.

But… that’s just a bit naff isn’t it?

You could show different lines or bars for the number of bugs, broken down by impact/severity?

Better but still a bit naff.

Breaking it down by every feature in a game? Now we’re getting somewhere.

Make it an area chart that shows each and every bug, for each feature, where the area is equal to the risk over time?

Now we’re talking — but at that point it becomes totally unreadable.

Picture every bug in your project visualised on a massive risk-area chart. It would look like a stained-glass window of chaos. The data’s valuable — but we need a better way to interpret it.

Bug Scores

The first step I took in bringing this data under control was resurrecting a simple system that I hadn’t seen used in over 20 years: bug scores! Convert Impact (the Impact type I use) into numbers, and not something like 1 to 5; give each Impact a score that is an order of magnitude greater than the previous one. My “Trivial” bug is worth 1/50 of my “Showstopper” bugs. Why? Because these baseline numbers are much easier to work with than words, and exponential scores give us better resolution when calculating overall risk.
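If it helps to see that in code, here’s a rough Python sketch of what a mapping like this could look like. The Impact names and the exact numbers below are placeholders I’ve made up for illustration; the only fixed point taken from above is that a Trivial bug sits at 1/50 of a Showstopper.

```python
# Illustrative baseline scores for each Impact level.
# These specific names and values are placeholders, not the real scale;
# the only constraint honoured here is Trivial = 1/50 of Showstopper.
BASELINE_SCORES = {
    "Trivial":      1,
    "Minor":        5,
    "Major":       10,
    "Showstopper": 50,
}

def baseline_score(impact: str) -> int:
    """Look up the numeric baseline score for a bug's Impact."""
    return BASELINE_SCORES[impact]
```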

Bug Factors

What factors do we have? Well, we’ve got:

🔹Bug Severity
🔹Bug Status
🔹Time in Status

Each of these factors alters the baseline bug score each time a round of testing is completed. Severity is the simplest to work out; I have it so that:

B class bug = baseline x 1
A class bug = baseline x 1.5
And so on..
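As a quick sketch, the severity factor could be applied like this. Only the B and A multipliers come from the list above; the S value is an assumption thrown in to show the shape.

```python
# Severity multipliers applied on top of the baseline score.
# B and A match the values above; S is an assumed placeholder.
SEVERITY_MULTIPLIERS = {
    "B": 1.0,
    "A": 1.5,
    "S": 2.0,  # assumption for illustration only
}

def severity_adjusted(baseline: float, severity: str) -> float:
    """Scale a bug's baseline score by its severity class."""
    return baseline * SEVERITY_MULTIPLIERS[severity]
```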

With the other two factors I get a little fancier, and this is where we really start to differentiate ourselves from a bog-standard burndown chart. In those charts, once a bug is fixed, it’s gone. In my system, bugs cast shadows.

Warm-Ups and Cooldowns

In my experience, fixing a bug does not guarantee a feature is fixed, and any experienced QA person will know that bugs like to nest! In this system, when a bug’s status changes from Open to Fixed, a cooldown period starts. Similarly, when a bug is first added to the backlog, a warm-up period starts. Both of these periods are designed to represent the increasing and diminishing risk associated with a particular feature.

The longer a bug is left unaddressed, the more its baseline score increases. This is intended to reflect the fact that working on features associated with the bug without actually addressing it really does increase risk, and the later it is addressed, the higher the chance of unintended consequences.

These warm-up and cooldown periods are also the reason we use orders of magnitude in our baseline bug scores to begin with. You don’t want a handful of Trivial bugs slowly creeping up in risk and overshadowing Showstoppers, but if there’s a whole swarm of them, this system keeps edging them up in rank when they’re left unaddressed for LONG periods of time.
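To make the warm-up and cooldown idea a bit more concrete, here’s a rough sketch of how a score might be adjusted each round of testing. The growth and decay rates below are made-up placeholders, not the real tuning; the point is the shape of it: Open bugs creep up round by round, and Fixed bugs fade away rather than vanishing instantly.

```python
# Assumed per-round rates, for illustration only.
WARMUP_GROWTH = 0.10    # +10% per round while a bug stays Open
COOLDOWN_DECAY = 0.25   # -25% per round once a bug is Fixed

def round_adjusted(score: float, status: str, rounds_in_status: int) -> float:
    """Apply the warm-up or cooldown factor for the current round of testing."""
    if status == "Open":
        # Warm-up: an unaddressed bug gains a little more risk each round.
        return score * (1 + WARMUP_GROWTH) ** rounds_in_status
    if status == "Fixed":
        # Cooldown: a fixed bug still casts a shadow, but it fades over rounds.
        return score * (1 - COOLDOWN_DECAY) ** rounds_in_status
    return score

# With these assumed rates and the placeholder baselines from earlier,
# a single Trivial bug (baseline 1) left Open for ten rounds is still worth
# far less than a fresh Showstopper (baseline 50) - but a swarm of them adds up.
```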


Ranks Next Week!

This post is getting a bit long so we’ll save the Ranks for next week.

Hopefully this hasn’t been too nebulous — and you’re starting to see how the system works. If it’s still a bit abstract, hang in there. Once the final piece clicks into place, it should pop like an epiphany.

Take it easy, QA travellers!
