Category Archives: Testing
Van Brollini’s new book is an essential addition to the test engineer’s library, as well as the library of any product manager.
The Handbook contains practical advice that is based on Mr. Brollini’s extensive experience with test development, including unique insights that I have not seen elsewhere and that will provide the test engineer with a quantum leap in productivity.
The test engineer will also appreciate the fact that Brollini’s methods — clearly presented as a series of rules, tips, and straightforward equations — are practical and cost-effective, illustrated by real-world examples throughout.
The Handbook’s teachings can be applied with basic math and spreadsheet tools, although Brollini does recommend Design Master™ for best efficiency, particularly for more advanced applications.
(I have known Van for many years, as he was one of the first engineers to adopt our Design Master software. From time to time he has offered suggestions for improvements, which were incorporated into the software.)
The Test Engineer’s Measurement Handbook is available through the DACI website.
(Photo from “Oklahoma Jury Finds Toyota Liable For Sudden Acceleration Fault; Awards $3M In Damages,” by Arjun Kashyap, 25 Oct 2013)
Back in 2009 this newsletter started to express skepticism about Toyota’s insistence that unintended acceleration in their vehicles — in some cases resulting in fatalities — was due to a floor mat that could cause the accelerator pedal to stick. (Those earlier posts are listed at the bottom of this post.)
The reasons for the skepticism were (a) the number of reported cases was relatively high compared to non-Toyota models, (b) there was at least one case where a stuck pedal clearly could not have been the cause, and (c) the investigation by the National Highway Traffic Safety Administration (NHTSA) appeared to have some major flaws. Also, DACI’s own experience in investigating several low-probability events has convinced us that customers who report such problems tend to be dismissed too easily, rather than receiving the respectful assumption that they are honest, observant, and careful in their reporting; judged by that standard, the sudden acceleration stories related by some Toyota owners were simply not consistent with the floor mat hypothesis. Finally, it didn’t help Toyota’s credibility when Toyota employees were caught congratulating themselves on how they had slowed and limited the accident investigation.
Despite the above reservations, the NHTSA’s official finding was that the root cause was the floor mat. Well, after several years it turns out that a detailed analysis of the electronics, brought out during a recent trial, has confirmed that it was not just the floor mat. Key trial results are detailed in “Toyota Case: Vehicle Testing Confirms Fatal Flaws,” by Junko Yoshida in the 31 October 2013 EETimes. Here’s an excerpted summary of the problems identified:
• Software bugs that specifically can cause memory corruption
• Unmaintainable code complexity in Toyota’s software
• A multifunction kitchen-sink Task X designed to execute everything from throttle control to cruise control and many of the fail-safes
• That all Task X functions, including fail-safes, are designed to run on the main CPU in the Camry’s electronic control module
• That the brake override that is supposed to save the day when there is an unintended acceleration is also in Task X
• The use of an operating system in which there is no protection against hardware or software faults
• A number of other problems
The deficiencies in the throttle design are shocking, because good rules exist for the design of safety-critical electronics (e.g., Chapter 4, “Safety Analyses,” in The Design Analysis Handbook).
The Toyota case makes one wonder how many other possibly-catastrophic flaws are lurking within the cars we drive, or in other electronics-guided machinery, due to poorly-designed safety-related systems.
Prior Toyota Unintended Acceleration Posts:
3 Feb 2010
Stop Driving Recalled Toyotas
21 Feb 2010
Toyota Joins The Gallery Of Shame
“Sounds good,” I cautiously replied. “Send me a list of problems you’ve solved and I’ll consider it.”
It may sound like fun to be paid to solve problems, but the client is not at all interested in paying us to have fun. They are interested in paying us to solve problems, quickly. Usually these problems have been expensively festering, and the client’s design team is fatigued and understaffed. Bringing in a qualified outsider at such times makes sense, since a fresh perspective — unclouded by burnout (and sometimes politics) — can do wonders. In my own experience, oftentimes the solution to a stubborn problem lies literally within an inch or two on the schematic of where the team has previously trod, missed only because of the disruptive pressures and distractions that major problems generate.
Therefore, if you want to be a consultant who helps quickly solve challenging technical problems, there are three qualities that you must be able to offer your client:
1. You must be experienced — with a successful track record — in solving technical problems under high-stress conditions.
2. You must be organized and methodical. The client has already had enough of hit-or-miss scrambling, and is looking for a calm disciplined approach. This means that you will have requested all relevant data on the problem and be ready to provide productive input on day one. (If you are a shoot-from-the-hip type of person, your tenure will last about 24 hours or less.)
3. You must be diplomatic and willing to work closely with the client’s team, simply because you need them as much as they need you. Hotshots or other big ego types will get nowhere fast.
MicroViews: Electric Vehicles Are Not Greener and Cleaner / Dreamliner Batteries Still Misbehaving? / Robot Boogie Time
Recommended Reading: “Unclean at Any Speed”
“Electric cars don’t solve the automobile’s environmental problems,” by Ozzie Zehner, 30 June 2013 IEEE Spectrum. A standout example of scientific journalism. Mr. Zehner provides a remarkably thorough and balanced review of the overall relative pollution impact of electric vehicles.
Is The Boeing Dreamliner Lithium Battery Issue Really Solved?
From “Technical glitches delay two Dreamliner flights from Poland,” 4 July 2013, Reuters:
“A flight from Warsaw to Chicago that was scheduled to fly on Wednesday was canceled because the aircraft had ‘problems with the power supply…’ The spokeswoman would not say if the latest technical problems were related to over-heating batteries which forced the grounding of all Dreamliners for over three months.”
The Robots Are Coming! The Robots Are Coming!
And wow, can they dance!
“Boeing’s fix includes more insulation between each of the eight cells in the batteries. The batteries will also be encased in a new steel box designed to contain any fire and vent possible smoke or hazardous gases out of the planes.
“…both the F.A.A. administrator, Michael P. Huerta, and Transportation Secretary Ray LaHood said they were satisfied that the proposed changes would eliminate concerns that the plane’s two lithium-ion batteries could erupt in smoke or fire.”
-“F.A.A. Endorses Boeing Remedy for 787 Battery” by C. Drew and J. Mouawad, 19 April 2013 New York Times
Conspicuously absent from this pronouncement is a definitive identification of the root cause of the lithium battery fires. Therefore Boeing, the FAA, and the Department of Transportation are all guessing that the stated modifications will fix the problem. I hope they are correct. But if they are, it will be a matter of luck, not engineering diligence. The dissembling of the FAA and the Department of Transportation is clearly evident in their own words: they say that they are “…satisfied that the proposed changes would eliminate concerns that the plane’s two lithium-ion batteries could erupt in smoke or fire.” If they are so satisfied, then why is it necessary to have a steel box to contain a fire? If they are so satisfied, then why did they not provide evidence supporting their conclusions?
Also, Boeing and these government agencies have touted a few test flights as being of particular significance in proving the safety of the batteries. This is nonsense. The battery fires are low-probability events, occurring only once in many thousands of hours of operation. This implies that there are subtle variables in the battery construction, chemistry, and/or operation which, when combined worst case, will cause the batteries to overheat. This combination may occur for only a small number of manufactured batteries, and fires may occur only when those particular batteries are exposed to a worst-case combination of stresses (temperature, charge currents, etc.).
Therefore a handful of test flights, of a few dozen hours or so total, is not nearly sufficient to empirically identify a low-probability event. Identifying such an event empirically would require hundreds or even thousands of test flights, which is obviously not practical. Therefore the only alternative is an investigation that drills down and positively identifies the true underlying failure mechanism (as recommended here: “Flying the Flaming Skies: Should You Trust the Boeing Dreamliner?”). It is my opinion that this has not been done, because if it had, this knowledge would be trumpeted by Boeing.
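The arithmetic behind this point can be sketched with a simple Poisson model. The fault rate below is purely a hypothetical number chosen for illustration, not a figure from the investigation:

```python
import math

def p_at_least_one(rate_per_hour, test_hours):
    """Poisson probability of observing at least one event in test_hours."""
    expected = rate_per_hour * test_hours
    return 1.0 - math.exp(-expected)

# Hypothetical fault rate: one event per 50,000 operating hours
rate = 1.0 / 50_000

p_short = p_at_least_one(rate, 100)      # a few dozen test-flight hours
p_long = p_at_least_one(rate, 100_000)   # fleet-scale exposure
print(p_short)  # ~0.002: the test flights almost certainly show nothing
print(p_long)   # ~0.86: only at fleet scale does the fault become likely
```

In other words, under this assumed rate a clean test-flight campaign is the expected outcome whether or not the flaw exists, which is why such flights prove nothing about a low-probability failure.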
I’m not flying the Boeing Dreamliner until I see the evidence that supports the optimistic conclusions of Boeing, the FAA, and the Department of Transportation.
Michael Sinnett, Boeing’s chief project engineer, said in a recent briefing that “Boeing is redesigning its batteries to ensure a fire isn’t possible. Among the new features will be a fire-resistant stainless steel case that will prevent oxygen from reaching the cells so fire can’t erupt.” (from “NTSB Contradicts Boeing Claim of No Fire in 787 Battery,” by Alan Levin, 15 Mar 2013 Bloomberg).
The problem with that statement is that once a lithium battery is heated sufficiently, it releases its own oxygen to fuel continued burning/explosion. That’s why lithium fires are extremely difficult to extinguish, and why an outer case, although it may keep a fire from spreading, will not prevent a fire from erupting.
In DACI’s 1st Quarter 2012 newsletter I predicted that a catastrophic safety event would eventually occur due to lithium batteries (please see “Li-Ion Battery Pack Hazards and our Psychic Prediction“). The recent fires in the initial flights of the new Boeing Dreamliner have come close to fulfilling that prophecy.
From “Detecting Lithium-Ion Cell Internal Faults In Real Time” (Celina Mikolajczak, John Harmon, Kevin White, Quinn Horn, and Ming Wu, in the Mar 1, 2010 issue of Power Electronics Technology) it is known that internal cell faults in lithium batteries can lead to thermal runaway, subsequently resulting in fires and/or explosions. Therefore the question arises: do the Boeing lithium batteries have an advanced internal construction that prevents cell faults, or mitigates thermal runaway in the event of a fault? If not, the Boeing team or vendor responsible for the battery system design is in big, big trouble.
Although deficiencies in basic battery chemistry and/or construction appear to offer the best root cause hypothesis for the fires, there are also other possible factors. For example, it has been reported that perhaps the charging system malfunctioned, causing the batteries to overheat. However, a properly designed charger for an aircraft application would have fail-safe protection, preventing an overcharge. Plus, it was also reported that charging sensors did not detect an overvoltage. Although these factors sound reassuring, they are not sufficient to eliminate the charger from consideration. For example, one can hypothesize a charging waveform that contains spurious high-frequency oscillations that create high RMS charging currents. This would not necessarily result in an overvoltage, but could result in overheating.
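That last hypothesis is easy to quantify, because resistive heating follows the RMS current, not the average. The sketch below uses invented current values purely for illustration; the point is that a superimposed ripple adds heating without changing the DC level that a simple monitor would see:

```python
import math

def rms_with_ripple(i_dc, i_ripple_peak):
    """RMS of a DC current with a superimposed sinusoidal ripple:
    I_rms = sqrt(I_dc^2 + (I_pk / sqrt(2))^2)."""
    return math.sqrt(i_dc**2 + (i_ripple_peak / math.sqrt(2))**2)

i_dc = 10.0   # amps, nominal charge current (hypothetical)
i_pk = 10.0   # amps, spurious oscillation peak (hypothetical)

i_rms = rms_with_ripple(i_dc, i_pk)
heating_ratio = (i_rms / i_dc) ** 2   # I^2*R heating relative to DC alone
print(i_rms)          # ~12.25 A
print(heating_ratio)  # 1.5: 50% more heating, same average current
```

So a ripple with a peak equal to the DC charge current raises I²R dissipation by half, while the average current, and any monitor watching it, reads unchanged.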
It is also possible that battery “cell defects” are nothing more than cell imbalances that vary according to production tolerances. In other words, the lithium battery, by its very nature, tends towards thermal runaway unless the internal cells are very tightly matched. This sensitivity would become more pronounced with a higher number of cells and higher mass, which would explain why no explosions have occurred in small button-style batteries, but do occur in the larger batteries.
There are other scenarios, including the thorny possibility that some combination of conditions conspired to create the failure. And, of course, the root cause may be highly intermittent, making detection extremely difficult. Such hypotheses are undoubtedly being examined by the Boeing engineers. I wish them well, and hope that they are allowed to perform their work calmly, methodically, and thoroughly.
Note: Because it may take quite a long time to conclusively establish a root cause, I would suggest that Boeing immediately begin planning to retrofit the lithium system with one containing battery types that have not shown the proclivity to explode; e.g. nickel metal-hydride, or sealed lead acid gel. Heavier, yes, but in this case safety and the economic timeline indicate that it would be wise to be prepared with a retrofit design.
(For some brief guidelines on design failure crisis management, please see Scenario #6: “Coping with Design Panic,” in The Design Analysis Handbook, Appendix A, “How to Survive an Engineering Project.”)
There are actually a few different types of Worst Case Analysis (WCA), primarily:
Extreme Value Analysis (EVA)
Statistical Analysis (Monte Carlo)
WCA+ is safer than Monte Carlo and more practical than EVA. Monte Carlo can miss small but important extreme values, and EVA can result in costly overdesign. WCA+ identifies extreme values that statistical methods can miss, and then estimates the probability that the extreme value will exceed specification limits, thereby providing the designer with a practical risk-assessment metric. WCA+ also generates normalized sensitivities and optimization, which can be used for design centering. (Ref. http://daci-wca.com/products_005.htm)
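The Monte Carlo shortfall is easy to demonstrate on a trivial voltage divider (part values and tolerances invented for illustration). Extreme Value Analysis drives every tolerance to its worst corner; a modest Monte Carlo run, sampling the same tolerances, will land short of that corner:

```python
import random

VIN = 10.0
R1_NOM, R2_NOM, TOL = 10_000.0, 10_000.0, 0.05  # hypothetical 5% resistors

def vout(r1, r2):
    return VIN * r2 / (r1 + r2)

# Extreme Value Analysis: the corner that maximizes Vout
eva_max = vout(R1_NOM * (1 - TOL), R2_NOM * (1 + TOL))

# Monte Carlo: 1000 random samples within tolerance (uniform, for simplicity)
random.seed(1)
mc_max = max(
    vout(R1_NOM * random.uniform(1 - TOL, 1 + TOL),
         R2_NOM * random.uniform(1 - TOL, 1 + TOL))
    for _ in range(1000)
)

print(eva_max)             # 5.25 V, the true extreme
print(mc_max)              # slightly less: the run never hits the corner
assert mc_max < eva_max
```

A thousand random samples get close to the extreme but never reach it; a design margined against the Monte Carlo maximum would be margined against the wrong number.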
Myth #2: Worst Case Analysis is optional if you do a lot of testing
To maintain happy customers and minimize liability exposure, the effects of environmental and component variances on performance must be thoroughly understood. Testing alone cannot achieve this understanding, because testing — for economic reasons — is usually performed on a very small number of samples. Also, since testing typically has a short time schedule, the effects of long-term aging will not be detected.
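A rough calculation shows why a small test sample cannot stand in for the tolerance extremes. Assuming, for illustration only, a normally distributed parameter, the chance that any unit in a small build lies beyond (say) 2.5 sigma is slim, while a full production run will certainly contain such units:

```python
import math

def p_any_beyond(n_samples, sigma_limit):
    """Probability that at least one of n samples falls beyond
    +/- sigma_limit, assuming a normal distribution."""
    p_tail = math.erfc(sigma_limit / math.sqrt(2))  # two-sided tail
    return 1.0 - (1.0 - p_tail) ** n_samples

print(p_any_beyond(5, 2.5))       # ~0.06: five prototypes, near-extreme part unlikely
print(p_any_beyond(10_000, 2.5))  # ~1.0: a production run will include such parts
```

The handful of units on the bench is almost certainly built from near-nominal parts, so the test results say little about the worst-case units the customer will eventually receive.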
Myth #3: Worst Case Analysis is optional if we vary worst case parameters during testing
Initial tolerances typically play a substantial role in determining worst case performance. Such tolerances, however, are not affected by heating/cooling the samples, varying the supply voltages, varying the loads, etc.
For example, a design might have a dozen functional specs and a dozen stress specs (in practice these numbers are usually much, much higher). To expose worst case performance, some tolerances may need to be at their low values for some of the specs, but at their high or intermediate values for other specs. First, it’s not even likely that a tolerance will be at the worst case value for a single spec. Second, it’s impossible for a tolerance to simultaneously be at the different values required to expose worst case performance for all the specs. Therefore it’s not valid to expect a test sample to serve as a worst case performance predictor, regardless of the number of temperature cycles, voltage variations, etc. that are applied to the sample.
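The conflict shows up even in a minimal sketch with invented values: in a simple divider, the tolerance corner that maximizes output voltage is a different corner from the one that maximizes supply current, so no single physical sample can be worst case for both specs at once:

```python
from itertools import product

VIN = 10.0
R1_NOM, R2_NOM, TOL = 1_000.0, 1_000.0, 0.10  # hypothetical 10% parts

corners = [1 - TOL, 1 + TOL]
results = []
for f1, f2 in product(corners, corners):
    r1, r2 = R1_NOM * f1, R2_NOM * f2
    v_out = VIN * r2 / (r1 + r2)   # spec 1: output voltage
    i_in = VIN / (r1 + r2)         # spec 2: supply current
    results.append(((f1, f2), v_out, i_in))

worst_vout = max(results, key=lambda r: r[1])  # needs R1 low, R2 high
worst_i = max(results, key=lambda r: r[2])     # needs R1 low, R2 low
print(worst_vout[0], worst_i[0])
assert worst_vout[0] != worst_i[0]  # different corners: one sample can't be both
```

With two specs the corners already disagree on R2; with dozens of specs and hundreds of tolerances, expecting one test unit to embody every worst case is hopeless, which is exactly why analysis is not optional.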
Myth #4: Worst Case Analysis is best done by statistics experts
No, it is far better to have WCA performed — or at least supervised — by experts in the design being analyzed, using a practical tool like WCA+ that employs minimal statistical mumbo-jumbo. Analyses (particularly cook-book statistical ones), when applied by those without expertise in the design being analyzed, often yield hilariously incorrect results.