Many years ago I was contacted by someone who worked for a very large defense contractor. The gentleman (Mr. X) had the responsibility for helping ensure that the electronics modules used by his company met stringent reliability requirements, one of which was a minimum allowable Mean Time Between Failures (MTBF). He had read one of our DACI newsletters that mentioned such reliability predictions, and gave me a call.
“My problem,” he said (I paraphrase), “is that MTBF predictions per Military Handbook 217 don’t make any sense.” He subsequently provided detailed backup studies, including a data collection — using real fielded hardware — that showed the predicted times to failure for the hardware did not match the field experience. The predicted numbers were not just too low (as some folks claim for MIL-HDBK-217), they were also too high, or sometimes about right. In other words, they were pretty random, indicating that MIL-HDBK-217 had no more predictive value than you would get by using a Magic 8 Ball.
But that’s not all. “These reliability predictions,” he continued, “are worse than useless, because engineering managers are cramming in heavy heat sinks, or using other cooling techniques, to drive down the MTBF numbers. The result is a potential decrease of overall system reliability, as well as increased weight and cost, based on this MTBF nonsense.”
Until I heard from Mr. X, I had prepared numerous MTBF reports using MIL-HDBK-217, assuming (what a horrible word, I’ve learned) that the methodology was science-based. After reviewing the data, however, I agreed with Mr. X that MTBFs were indeed nonsense, and said so in the DACI newsletter. This sparked a minor controversy, including being threatened by a representative of a reliability firm (one that did a lot of business with the government) that DACI would be “out of business” because of our stance on the issue.
Well, DACI survived. Today, though, and sadly, my impression is that lots of folks still use MIL-HDBK-217-type cookbook calculations for MTBFs, which are essentially a waste of money, other than the important side benefit (that has nothing to do with MTBF predictions) of examining components for potential overstress. But that task can be done as part of a good WCA, skipping all of the costly and misleading MTBF pseudoscience.
Instead of trying to predict reliability, it’s better to ensure reliability by employing “physics of failure,” the scientific process of studying the chemistry, mechanics, and physics of specific materials and assemblies.
Bottom line: Skip the handbook-style MTBF nonsense, and use those dollars instead to keep abreast of materials science, as applicable to your specific products. (If for some reason you absolutely must prepare an MTBF report, use a Magic 8 Ball: it will be much quicker and just as accurate.)
p.s. Prior to my education by Mr. X, I had been deeply involved with the electronics design for a very ambitious spacecraft project. Thinking MTBF to be an important metric, I asked the project manager what the preliminary MTBF was for the system. He smiled and asked me to meet him privately.
Later, alone in his office, I was furtively told that the MTBF calculations indicated that the system was doomed to failure, so it had been decreed that the project was not going to use MTBFs. The rationale was that each system component would be examined on a case-by-case basis to ensure that its materials and assembly were suitable for its intended task. In essence, this can be viewed as an early example of the physics of failure approach. And yes, the mission was a complete success.