[Please note that the following article — while it has been updated from our newsletter archives — may not reflect the latest software interface and plot graphics, but the original methodology and analysis steps remain applicable.]
In order to track and improve the reliability of their products, manufacturing organizations must utilize an accurate and concise method to specify and measure that reliability. Although useful to some degree, the mean life function (often denoted as "MTTF" or "MTBF") is not a good measurement when used as the sole reliability metric. Instead, the specification of a reliability value with an associated time, along with an associated confidence requirement when possible, is a more versatile and powerful metric for describing a product's reliability.
The Mean Life Function
The mean life function, such as the mean time to failure (MTTF), is widely used as the measurement of a product's reliability and performance. This value is often calculated by dividing the total operating time of the units tested by the total number of failures encountered. That calculation is valid only when the data are exponentially distributed, which is often a poor assumption because it implies that the failure rate is constant. Nevertheless, the result is then used as the sole measure of a product's reliability.
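As a minimal sketch of the calculation just described, the following Python snippet divides total operating time by the number of failures. The function name and the operating hours are hypothetical, chosen only for illustration, and the result is meaningful only under the constant failure rate assumption.

```python
def mttf_point_estimate(operating_times, n_failures):
    """Total operating time of all tested units divided by the number of failures."""
    if n_failures <= 0:
        raise ValueError("at least one failure is needed for this estimate")
    return sum(operating_times) / n_failures


# Hypothetical operating hours for five units, all run to failure.
hours = [120.0, 340.0, 560.0, 910.0, 1570.0]
mttf = mttf_point_estimate(hours, n_failures=len(hours))
print(mttf)  # 3500 total hours / 5 failures = 700.0
```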
One problem that arises from the use of this metric stems from confusion between the "mean" and the "median" values of a data set. The mean is what would normally be called the average, or the expected value of the data (which is not necessarily the most likely value). The mathematical definition is:

$$\bar{T} = \int_{-\infty}^{\infty} T \, f(T) \, dT$$

where f(T) is the probability density function of the data. The median, on the other hand, is the value that splits the data: half of the data will be greater than the median and half will be less than the median. The median, $\breve{T}$, is found by solving the equation:

$$\int_{-\infty}^{\breve{T}} f(T) \, dT = 0.5$$

where f(T) is again the probability density function of the data. If the data in question is from a symmetrical distribution, such as the normal or Gaussian distribution, the values for the mean and median are equal. However, when dealing with an asymmetrical distribution, such as the exponential or Weibull, there can be a large difference between the mean and median.
To give a discrete example, suppose we have a data set consisting of five values: (1, 2, 3, 4, 100). The mean value of this data set is (1+2+3+4+100)/5 = 110/5 = 22. However, the median value for this data set is 3, as it is the "middle number" in the set of five. Clearly, there is a sizable difference between the mean and the median for this data set.
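The same arithmetic can be checked with Python's standard `statistics` module. This is only a quick verification of the five-value example; the variable names are illustrative.

```python
import statistics

data = [1, 2, 3, 4, 100]

mean_value = statistics.mean(data)      # (1 + 2 + 3 + 4 + 100) / 5
median_value = statistics.median(data)  # middle value of the sorted data

print(mean_value)    # 22
print(median_value)  # 3
```

The single large value (100) pulls the mean far above the median, which is the behavior to expect from skewed lifetime data as well.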
Reliability is a Function of Time
Because reliability is a function of time, in order to properly define a reliability goal or test result, the reliability value should be associated with a time. For example, "the reliability at 50,000 cycles should be 50%" is a more meaningful reliability goal than "the MTTF should be 50,000 cycles."
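To see why these two goals are not interchangeable, consider a short sketch that assumes an exponential lifetime distribution. That assumption is made here purely for illustration, and the function name is hypothetical. Under it, R(t) = exp(-t/MTTF), so the reliability at the MTTF is about 37%, not 50%.

```python
import math


def exponential_reliability(t, mttf):
    """Reliability at time t, assuming an exponential lifetime distribution."""
    return math.exp(-t / mttf)


mttf_cycles = 50_000

# Reliability at t = MTTF is exp(-1), regardless of the MTTF value.
r_at_mttf = exponential_reliability(mttf_cycles, mttf_cycles)

# The time at which reliability equals 50% (the median life) is MTTF * ln(2).
median_cycles = mttf_cycles * math.log(2)

print(f"R(MTTF) = {r_at_mttf:.3f}")                   # R(MTTF) = 0.368
print(f"Median life = {median_cycles:.0f} cycles")    # Median life = 34657 cycles
```

In other words, under this assumption a product with an MTTF of 50,000 cycles reaches 50% reliability well before 50,000 cycles.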
The MTTF is not an appropriate metric because the reliability value associated with the MTTF is not always 50% and can vary widely. The following example illustrates how the actual reliability can vary with a given MTTF. Suppose we are testing the reliability of products from three suppliers. We obtain eight samples from each supplier and test them until they fail. The next table gives the time-to-failure results for the three test lots.
| Test Lot 1 | Test Lot 2 | Test Lot 3 |
We determine that these three data sets follow the Weibull distribution. The probability plot of the data, generated with ReliaSoft's Weibull++ software, is displayed in Figure 1. For the Weibull distribution, the MTTF is calculated with the equation:

$$\mathrm{MTTF} = \eta \cdot \Gamma\left(\frac{1}{\beta} + 1\right)$$

where $\eta$ and $\beta$ are the Weibull scale and shape parameters and $\Gamma(\cdot)$ is the Gamma function.
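The two-parameter Weibull MTTF (the scale parameter multiplied by the Gamma function of 1/shape + 1) can be evaluated with Python's standard-library `math.gamma`. The sketch below uses hypothetical parameter values; they are not the fitted parameters of the three test lots, and the function name is an assumption.

```python
import math


def weibull_mttf(eta, beta):
    """Mean life of a two-parameter Weibull: eta * Gamma(1/beta + 1)."""
    if eta <= 0 or beta <= 0:
        raise ValueError("eta and beta must be positive")
    return eta * math.gamma(1.0 / beta + 1.0)


# beta = 1 reduces to the exponential case, where the MTTF equals eta.
print(weibull_mttf(100_000, 1.0))

# beta = 0.5 gives Gamma(3) = 2, so the MTTF is twice eta.
print(weibull_mttf(100_000, 0.5))
```

Because the Gamma term depends on the shape parameter, very different combinations of scale and shape can produce the same MTTF.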
Figure 1: Probability Plot
Based on this analysis (using rank regression on X, RRX), we determine that even though the three Weibull-distributed data sets are quite different, they have the same MTTF of 100,000. The actual reliability values, however, differ considerably over time, as can be seen in the reliability vs. time plot in Figure 2. By the MTTF of 100,000, over 85% of the units in data set 2 are expected to have failed, compared with 63% and 49% of the units in data sets 1 and 3, respectively.
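The spread in these percentages follows from the Weibull shape parameter alone. For a two-parameter Weibull distribution, the fraction failed at t = MTTF is 1 - exp(-(Γ(1/β + 1))^β), which does not depend on the scale parameter. The sketch below evaluates that expression for three hypothetical shape values. They are assumptions chosen to land near the percentages quoted above, not the fitted parameters of the three data sets.

```python
import math


def fraction_failed_at_mttf(beta):
    """Weibull unreliability evaluated at t = MTTF.

    MTTF / eta = Gamma(1/beta + 1), so the scale parameter cancels out.
    """
    ratio = math.gamma(1.0 / beta + 1.0)
    return 1.0 - math.exp(-(ratio ** beta))


# Hypothetical shape parameters (not taken from the article's fitted results).
for beta in (0.3, 1.0, 4.0):
    failed = fraction_failed_at_mttf(beta)
    print(f"beta = {beta}: {failed:.1%} expected to fail by the MTTF")
```

Only the β = 1 (exponential) case gives the familiar 63.2%. Smaller shape values push the fraction failed at the MTTF higher, and larger values pull it toward and below 50%.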
Figure 2: Reliability vs. Time Plot
This example illustrates the potential pitfalls of using the MTTF as the sole reliability metric. Attempting to use a single number to describe an entire lifetime distribution can be misleading and may lead to poor business decisions, particularly when the underlying lifetime distribution is not exponential. The reliability of a product should be specified as a percentage value with an associated time. Ideally, a confidence level should also be associated, which allows for consideration of the variability of the data being compared to the specification.