The risks of the "Golden" sample

It's very tempting to use a "Golden Sample" instead of a detailed specification, and if you've ever worked with overseas contract manufacturers (CMs), you'll know they almost prefer it.

Coyau / Wikimedia Commons / CC BY-SA 3.0
In hardware manufacturing, it's easy to understand why. I freely admit I've written incoming parts inspection instructions that don't capture every detail of the part: every dimension, the depth of any indentations, the gauge of any round protuberances, the color of all the materials, the hardness, even the chemical composition (important for fasteners and structural parts).

If you simply take a prototype or early unit from the manufacturing line, give it a thorough inspection, and declare it to be a "Golden Sample," it's a heck of a shortcut, saving a lot of *your* time as the person who would otherwise have to write the documentation!

There are risks, however, and you need to think about them, minimize them, and try to mitigate the harm.

The first risk is what I refer to as the "bad baseline." Whenever a test measures one sample against another, sooner or later you're going to encounter a case where the baseline is not what you want it to be. If you sat down to write the spec, you might say the height of the front and rear panels should be 10cm ±0.75% (that is, 99.25mm to 100.75mm) for them to screw together properly.

But imagine that your golden sample (which you inspected!) had front and rear panels on the small side at 99.3mm, but the next batch of rear panels was on the large side of the spec at 100.7mm. Both are within tolerance, yet the new rear panels are 1.4mm taller than the golden sample.

If you had documented this as a spec, it's immediately clear what's going on, and you can move on to decide what to do about it (Go ahead and use it? Change the spec moving forward? Warehouse the parts and wait for front panels that are a better match?).

But if your only spec is the golden sample, you're now going to have to do the research and documentation that you skipped earlier on to figure out the right corrective action.
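To make the panel example concrete, here is a minimal Python sketch of the two kinds of check. The function names and the 0.75mm golden-sample comparison limit are my own assumptions for illustration; only the 10cm ±0.75% spec and the 99.3mm and 100.7mm measurements come from the example above.

```python
NOMINAL_MM = 100.0     # 10cm, from the example spec
TOLERANCE_PCT = 0.75   # ±0.75%, from the example spec


def within_spec(height_mm, nominal_mm=NOMINAL_MM, tol_pct=TOLERANCE_PCT):
    """Return True if the measured height lies inside the spec's tolerance band."""
    allowed_mm = nominal_mm * tol_pct / 100.0
    return abs(height_mm - nominal_mm) <= allowed_mm


def matches_golden(height_mm, golden_mm, max_delta_mm):
    """Return True if the part is close enough to the golden sample.

    max_delta_mm is a hypothetical comparison limit; a golden sample by
    itself does not tell you what this number should be.
    """
    return abs(height_mm - golden_mm) <= max_delta_mm


golden_mm = 99.3      # golden sample, on the small side
new_rear_mm = 100.7   # next batch of rear panels, on the large side

print(within_spec(golden_mm))    # True
print(within_spec(new_rear_mm))  # True
print(matches_golden(new_rear_mm, golden_mm, max_delta_mm=0.75))  # False
```

The spec check tells you both parts are legitimate and that the real question is how they fit together. The golden-sample check only tells you that something is different.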

But before you think "Well, that's hardware, thank goodness I work on software!", consider that software testing has golden samples too. They may not always be called "golden samples," but they are pretty common. Very often they are called things like "performance baselines" or "reference logs."

As in manufacturing, the false pass (or false fail!) test result is only part of the problem; it gets thornier when you try to work out the best corrective action. Re-create the golden log? You're likely to fix the immediate problem, but introduce another!

You'll also need to think about the consequences of a spurious test result. In the case of a false positive (a test that reports a failure that doesn't exist), will you be yanking the chain of your design engineers for no good reason? Or does your reporting process have checks to make sure the bug is real? What about the case of a test that appears to pass while a real problem exists? You need to consider the context of your specific product, and decide if it's worth the risk, or if additional tests need to be put in place.

To fix the problem and prevent recurrence, you're almost certainly going to have to dig deeper and analyze the output of the diff between your log files, rather than just looking for identical output.
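As one way to analyze a diff rather than demand identical output, here is a rough Python sketch that masks fields expected to change on every run before comparing. The log format, the timestamp and `pid=` patterns, and the function names are all hypothetical; you would substitute whatever is volatile in your own logs.

```python
import difflib
import re

# Hypothetical volatile fields: timestamps and process IDs change every run.
VOLATILE = [
    (re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}"), "<TIME>"),
    (re.compile(r"pid=\d+"), "pid=<PID>"),
]


def normalize(line):
    """Replace volatile fields with fixed placeholders."""
    for pattern, placeholder in VOLATILE:
        line = pattern.sub(placeholder, line)
    return line.rstrip()


def meaningful_diff(golden_lines, actual_lines):
    """Return only the added/removed lines that remain after normalization."""
    golden = [normalize(line) for line in golden_lines]
    actual = [normalize(line) for line in actual_lines]
    diff = list(difflib.unified_diff(golden, actual, "golden", "actual",
                                     n=0, lineterm=""))
    # Skip the two file-header lines, then drop the "@@" hunk markers.
    return [d for d in diff[2:] if not d.startswith("@@")]


golden_log = [
    "2024-01-01 10:00:00 pid=123 start",
    "2024-01-01 10:00:01 pid=123 loaded 5 records",
    "2024-01-01 10:00:02 pid=123 done",
]
actual_log = [
    "2024-03-05 08:30:00 pid=987 start",
    "2024-03-05 08:30:01 pid=987 loaded 4 records",
    "2024-03-05 08:30:02 pid=987 done",
]

for line in meaningful_diff(golden_log, actual_log):
    print(line)
```

With the timestamps and process IDs masked, the only lines left to investigate are the ones where the record count changed from 5 to 4.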

As for performance tests, I've had the best results looking for deviations from the historical data rather than selecting a single golden baseline. See my other post, "Pass/Fail; a Terrible Way to Test!"
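A simple version of "deviation from the historical data" is to express a new measurement as a number of standard deviations from the mean of earlier runs. The sketch below is an assumption about how one might do this, not a description of any particular tool; the function name and the timing figures are made up.

```python
import statistics


def sigmas_from_history(history, new_value):
    """Return how many sample standard deviations new_value is from the mean."""
    if len(history) < 2:
        raise ValueError("need at least two historical measurements")
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # No spread in the history: any change at all is a deviation.
        return 0.0 if new_value == mean else float("inf")
    return abs(new_value - mean) / stdev


# Made-up response times in milliseconds from earlier runs.
history_ms = [102.0, 98.5, 101.2, 99.8, 100.5]

print(sigmas_from_history(history_ms, 101.0))  # well under 1 sigma
print(sigmas_from_history(history_ms, 110.0))  # several sigmas out
```

Reporting the distance rather than a bare pass or fail leaves the judgment call with a person who knows the product, and every new run becomes part of the history instead of being compared against one frozen number.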

In general, I'd suggest working towards a spec, even if you've got golden samples now. If you're in software testing, regular expressions can be a huge help in filtering through big log files to find the important parts!
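As a small illustration of that last point, here is a Python sketch that pulls only the warning and error lines out of a log. The `LEVEL component: message` layout is a hypothetical format chosen for the example, so the pattern would need adjusting to match your real logs.

```python
import re

# Hypothetical log layout: "<LEVEL> <component>: <message>"
IMPORTANT = re.compile(
    r"^(?P<level>ERROR|WARN)\s+(?P<component>\w+):\s+(?P<message>.*)$"
)


def important_lines(log_text):
    """Return (level, component, message) for each WARN or ERROR line."""
    hits = []
    for line in log_text.splitlines():
        match = IMPORTANT.match(line)
        if match:
            hits.append((match.group("level"),
                         match.group("component"),
                         match.group("message")))
    return hits


sample_log = """INFO loader: read 5 records
WARN cache: miss ratio 0.42
INFO loader: done
ERROR db: connection reset"""

for hit in important_lines(sample_log):
    print(hit)
```

Once the important parts are extracted like this, it becomes much easier to write down what they are supposed to say, which is a first step from a golden log towards a spec.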

Thanks to Vernon Lee of Synopsys, Inc. for some of the ideas in this post, which have come from our discussions!
