Another factor that some consider is the time between failures, and whether it is increasing, decreasing, or remaining stable.

For example, Shuttle first failed on its 25th launch, but it took 87 more flights before it failed a second time. While those are only two data points, they still suggest that the program's reliability was improving over time.

More data points would help reveal more useful trends in this regard.

Ross.
Two data points suggest no such thing. You could just as easily argue that the reliability was about to go over a cliff. Or using the bathtub curve, that Shuttle was nearing the time where failures were about to be more common.
Quote from: Lars-J on 05/25/2018 04:29 pm
Two data points suggest no such thing. You could just as easily argue that the reliability was about to go over a cliff. Or using the bathtub curve, that Shuttle was nearing the time where failures were about to be more common.

I don't believe that the bathtub curve applies to this problem. The bathtub curve applies to product or part reliability as the product or part ages in use, as I understand things. Neither of the STS failures was due to part aging.

Historically, most launch vehicles have become more reliable over time, as the bugs are worked out of their designs and processes. With Shuttle, we aren't just looking at two data points; we are looking at 135 mission "samples", spread over time, that include two outright destructive failures.

Here's a graph that shows a view of a Laplace point reliability estimate over the life of the STS program.

 - Ed Kyle
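For anyone who wants to reproduce the general shape of such a curve, here is a minimal Python sketch of a running Laplace point estimate, (s + 1)/(n + 2), recomputed after every flight. It is not the code behind the graph. The flight record is an assumption built from the figures in this thread: 135 missions, with the two failures placed on flights 25 and 113 (the 25th launch, then 87 successful flights, then the second failure). The names `running_laplace` and `FAILED_FLIGHTS` are made up for illustration.

```python
def running_laplace(outcomes):
    """Return the Laplace estimate (s + 1) / (n + 2) after each flight.

    outcomes is a sequence of 1 (success) or 0 (failure), oldest first.
    """
    estimates = []
    successes = 0
    for n, ok in enumerate(outcomes, start=1):
        successes += ok
        estimates.append((successes + 1) / (n + 2))
    return estimates


# Assumed record: 135 flights, failures on flights 25 and 113.
FAILED_FLIGHTS = {25, 113}
outcomes = [0 if flight in FAILED_FLIGHTS else 1 for flight in range(1, 136)]
estimates = running_laplace(outcomes)

# Print the estimate just before and after each failure, and at the end.
for flight in (24, 25, 112, 113, 135):
    print(f"after flight {flight:3d}: {estimates[flight - 1]:.3f}")
```

The estimate drops at each failure and then climbs again as successes accumulate, which is the saw-tooth pattern a plot of this kind would show.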
Can you point me toward a comprehensible (comprehensive is not necessary) explanation of the LaPlace point statistical analysis?
Quote from: S.Paulissen on 05/25/2018 11:18 pm
Can you point me toward a comprehensible (comprehensive is not necessary) explanation of the LaPlace point statistical analysis?

This page includes some concise descriptions of several methods. The Laplace point estimate represents the peak of the probability distribution within the confidence interval. The two should be used together, because a "confidence interval ... is much more informative than a point estimate ..."

https://measuringu.com/wald/

 - Ed Kyle
But I don't see both figures in your "launch vehicle reliability stats" on your site. Or am I missing it?
Chasing this a bit, I've made three tables comparing several methods to rank reliabilities.

The first table is the Laplace point estimate. The second table uses the lower bound of the 95% Confidence Interval using the Adjusted Wald method. The third table uses Wilson's point estimate, which is the midpoint of the Adjusted Wald 95% Confidence Interval.

The latter two methods put more weight on total number of launches, moving rockets like Proton M/Briz M up the list versus the Laplace ranking.

Launch Vehicles with 20 or More Orbital Attempts

Ranked by Laplace point estimate
===================================================================
Vehicle         Successes/Tries Realzd  Pred Consc.  Last     Dates
                                 Rate  Rate* Succes  Fail
===================================================================
Soyuz-FG              53 53      1.00    .98     53  None     2001-
Delta 2              152 154      .99    .98     99  1/17/97  1989-
Atlas 5               77 78       .99    .98     68  6/15/07  2002-
Falcon 9 v1.2         35 35(D)   1.00    .97     35  None     2015-
Delta 4M(+)           27 27      1.00    .97     27  None     2002-
Ariane 5-ECA          64 66       .97    .96      1  01/25/18 2002-
CZ-2D                 38 39       .97    .95      7  12/28/16 1992-
H-2A                  37 38       .97    .95     32  11/29/03 2001-
CZ-4(A/B/C)           54 56       .96    .95      6  08/31/16 1988-
CZ-2(C)(/SD/SM)       47 49       .96    .94     13  08/18/11 1974-
CZ-3B/3C              58 61       .95    .94      6  06/18/17 1996-
PSLV                  40 43       .93    .91      2  08/31/17 1993-
CZ-3/3A               35 38       .92    .90     25  8/18/96  1984-
Rokot/Briz/K(M)       27 29       .93    .90     13  02/01/11 1994-
Proton-M/Briz-M       85 94       .90    .90     12  10/21/14 2001-
Soyuz 2-1b/Fregat     27 30       .90    .88      1  11/28/17 2006-
Pegasus (H/XL)        38 43       .88    .87     29  11/4/96  1991-
===================================================================

Ranked by Adjusted Wald 95% Confidence Interval Lower Limit
===================================================================
Vehicle         Successes/Tries Realzd Lower Consc.  Last     Dates
                                 Rate Limit* Succes  Fail
===================================================================
Delta 2              152 154      .99    .95     99  1/17/97  1989-
Atlas 5               77 78       .99    .92     68  6/15/07  2002-
Soyuz-FG              53 53      1.00    .92     53  None     2001-
Ariane 5-ECA          64 66       .97    .89      1  01/25/18 2002-
Falcon 9 v1.2         35 35(D)   1.00    .88     35  None     2015-
CZ-4(A/B/C)           54 56       .96    .87      6  08/31/16 1988-
CZ-3B/3C              58 61       .95    .86      6  06/18/17 1996-
CZ-2D                 38 39       .97    .86      7  12/28/16 1992-
CZ-2(C)(/SD/SM)       47 49       .96    .86     13  08/18/11 1974-
H-2A                  37 38       .97    .85     32  11/29/03 2001-
Delta 4M(+)           27 27      1.00    .85     27  None     2002-
Proton-M/Briz-M       85 94       .90    .83     12  10/21/14 2001-
PSLV                  40 43       .93    .81      2  08/31/17 1993-
CZ-3/3A               35 38       .92    .78     25  8/18/96  1984-
Rokot/Briz/K(M)       27 29       .93    .77     13  02/01/11 1994-
Pegasus (H/XL)        38 43       .88    .75     29  11/4/96  1991-
Soyuz 2-1b/Fregat     27 30       .90    .74      1  11/28/17 2006-
===================================================================

Ranked by Wilson's Point Estimate
===================================================================
Vehicle         Successes/Tries Realzd Point Consc.  Last     Dates
                                 Rate   Est* Succes  Fail
===================================================================
Delta 2              152 154      .99    .98     99  1/17/97  1989-
Soyuz-FG              53 53      1.00    .97     53  None     2001-
Atlas 5               77 78       .99    .96     68  6/15/07  2002-
Falcon 9 v1.2         35 35(D)   1.00    .95     35  None     2015-
Ariane 5-ECA          64 66       .97    .94      1  01/25/18 2002-
Delta 4M(+)           27 27      1.00    .94     27  None     2002-
CZ-4(A/B/C)           54 56       .96    .93      6  08/31/16 1988-
CZ-2D                 38 39       .97    .93      7  12/28/16 1992-
CZ-2(C)(/SD/SM)       47 49       .96    .93     13  08/18/11 1974-
H-2A                  37 38       .97    .93     32  11/29/03 2001-
CZ-3B/3C              58 61       .95    .92      6  06/18/17 1996-
Proton-M/Briz-M       85 94       .90    .89     12  10/21/14 2001-
PSLV                  40 43       .93    .89      2  08/31/17 1993-
CZ-3/3A               35 38       .92    .88     25  8/18/96  1984-
Rokot/Briz/K(M)       27 29       .93    .88     13  02/01/11 1994-
Pegasus (H/XL)        38 43       .88    .85     29  11/4/96  1991-
Soyuz 2-1b/Fregat     27 30       .90    .85      1  11/28/17 2006-
===================================================================
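As a rough illustration of the two interval-based rankings, here is a short Python sketch of one common form of the adjusted Wald calculation. The assumptions are mine, not confirmed as the procedure used for the tables: z = 1.96 for a 95% interval, and the adjustment adds z²/2 successes and z² trials before applying the ordinary Wald formula. The midpoint of that interval is what the third table calls Wilson's point estimate. The function name `adjusted_wald` is hypothetical.

```python
import math


def adjusted_wald(successes, tries, z=1.96):
    """Return (centre, lower, upper) of an adjusted Wald interval.

    Assumption: add z**2 / 2 successes and z**2 trials, then apply the
    ordinary Wald formula. z = 1.96 corresponds to roughly 95%.
    """
    if tries < 0 or not 0 <= successes <= tries:
        raise ValueError("need 0 <= successes <= tries")
    n_adj = tries + z * z
    p_adj = (successes + z * z / 2) / n_adj  # midpoint of the interval
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clip to [0, 1], since a success rate outside that range is meaningless.
    return p_adj, max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# Soyuz-FG counts from the tables: 53 successes in 53 tries.
centre, lower, upper = adjusted_wald(53, 53)
print(f"centre={centre:.2f} lower={lower:.2f} upper={upper:.2f}")
```

Under these assumptions the Soyuz-FG counts give a midpoint near .97 and a lower limit near .92, in line with the corresponding rows above. Because the margin shrinks as the number of tries grows, vehicles with long records rank higher on the lower limit than on the point estimate.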
Launch Vehicles with 20 or More Orbital Attempts

Ranked by Laplace point estimate
===================================================================
Vehicle         Successes/Tries Realzd  Pred Consc.  Last     Dates
                                 Rate  Rate* Succes  Fail
===================================================================
Soyuz-FG              53 53      1.00    .98     53  None     2001-
===================================================================
I'm not a statistician, but something strikes me as strange about this method of estimating reliability. If all the launches were successful, why would any statistical method ever predict a nonzero failure rate? There must be some kind of assumption built in that rockets sometimes fail, even in the absence of any data to that effect.
Are these assumptions made explicit somewhere, e.g. that in the absence of data 50% reliability is assumed or some such?
Do these models assume that the reliability of a rocket is constant over time?
Quote from: hplan on 05/30/2018 02:27 pm
I'm not a statistician, but something strikes me as strange about this method of estimating reliability. If all the launches were successful, why would any statistical method ever predict a nonzero failure rate? There must be some kind of assumption built in that rockets sometimes fail, even in the absence of any data to that effect.

Just because a rocket has not failed yet doesn't mean it will never fail in the future, right? Consider the Shuttle or Falcon 9, for example, which enjoyed multiple successes before their first failures.

Quote
Are these assumptions made explicit somewhere, e.g. that in the absence of data 50% reliability is assumed or some such?

The Laplace point estimate makes precisely that assumption. The estimated success rate after n launches of which s are successful is (s + 1)/(n + 2), so with no launches at all (n = s = 0) it gives 1/2.

Quote
Do these models assume that the reliability of a rocket is constant over time?

Yes, in that only the total numbers of trials and successes, regardless of how long ago they occurred, matter.
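To make the built-in assumption concrete, here is a small Python sketch of the formula just quoted, (s + 1)/(n + 2). It uses exact fractions so the 50% starting point is visible. The function name `laplace_estimate` is made up for illustration; the 53-for-53 counts are the Soyuz-FG figures from the tables.

```python
from fractions import Fraction


def laplace_estimate(successes, tries):
    """Laplace point estimate (s + 1) / (n + 2) as an exact fraction."""
    if tries < 0 or not 0 <= successes <= tries:
        raise ValueError("need 0 <= successes <= tries")
    return Fraction(successes + 1, tries + 2)


no_data = laplace_estimate(0, 0)      # no launches yet
perfect = laplace_estimate(53, 53)    # 53 successes in 53 tries

print(no_data, float(no_data))
print(perfect, round(float(perfect), 2))
```

The formula behaves as if one success and one failure had been observed before the first real launch. That is why a perfect record never reaches 1.00, and why only the totals enter the result, not the order in which successes and failures occurred.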
Allow me to introduce a proposal for updating my ranking tables. ...

Launch Vehicle Reliability Ranked by Lewis Point Estimate
=====================================================================
Vehicle         Successes/Tries Lewis  AdjWald Consc.  Last     Dates
                                 Est*  95%CI* Succes  Fail
=====================================================================
Soyuz-FG              53 53       .98  .92-1.00    53  None     2001-
Delta 2              152 154      .98  .95-1.00    99  01/17/97 1989-
Atlas 5               77 78       .98  .92-1.00    68  06/15/07 2002-
Falcon 9 v1.2         35 35(D)    .97  .88-1.00    35  None     2015-
Delta 4M(+)           27 27       .97  .85-1.00    27  None     2002-
Ariane 5-ECA          64 66       .96  .89-1.00     1  01/25/18 2002-
CZ-2D                 38 39       .95  .86-1.00     7  12/28/16 1992-
H-2A                  37 38       .95  .85-1.00    32  11/29/03 2001-
CZ-4(A/B/C)           54 56       .95  .87-1.00     6  08/31/16 1988-
CZ-2(C)(/SD/SM)       47 49       .94  .86-1.00    13  08/18/11 1974-
CZ-3B/3C              58 61       .94  .86-0.99     6  06/18/17 1996-
CZ-2F(T/Y)            13 13       .93  .73-1.00    13  None     1999-
Minotaur 1            11 11       .92  .70-1.00    11  None     2000-
Vega                  10 10xx     .92  .68-1.00    10  None     2012-
PSLV                  40 43       .91  .81-0.98     2  08/31/17 1993-
CZ-3/3A               35 38       .90  .78-0.98    25  08/18/96 1984-
Soyuz 2-1a/Fregat     18 19#      .90  .74-1.00    16  05/21/09 2006-
Rokot/Briz/K(M)       27 29       .90  .77-0.99    13  02/01/11 1994-
Soyuz 2-1b             8 8        .90  .63-1.00     8  None     2008-
Proton-M/Briz-M       85 94       .90  .83-0.95    12  10/21/14 2001-
Ariane 5ES             7 7        .89  .60-1.00     7  None     2008-
Delta 4 Heavy          8 9        .89  .54-1.00     8  12/21/04 2004-
H-2B                   6 6        .88  .56-1.00     6  None     2009-
Soyuz 2-1b/Fregat     27 30       .88  .74-0.97     1  11/28/17 2006-
Pegasus (H/XL)        38 43       .88  .75-0.95    29  11/04/96 1991-
Soyuz 2-1a            12 13#      .87  .65-1.00     6  04/28/15 2004-
Minotaur 4/5           5 5++      .86  .51-1.00     5  None     2010-
Zenit 3F/FregSB        4 4        .83  .45-1.00     4  None     2011-
CZ-11                  4 4        .83  .45-1.00     4  None     2015-
GSLV Mk2               5 6        .83  .42-0.99     5  04/15/10 2001-
Strela                 3 3        .80  .38-1.00     3  None     2003-
Kuaizhou 1(A)          3 3        .80  .38-1.00     3  None     2013-
Epsilon                3 3        .80  .38-1.00     3  None     2013-
Antares 2xx            3 3        .80  .38-1.00     3  None     2016-
CZ-6                   2 2        .75  .29-1.00     2  None     2015-
Shtil'                 2 2        .75  .29-1.00     2  None     1998-
CZ-7                   2 2        .75  .29-1.00     2  None     2016-
Shavit(-1,-2)          8 11       .73  .43-0.91     4  9/6/04   1988-
Taurus (XL)            7 10       .70  .39-0.90     1  3/4/11   1994-
Soyuz 2-1v/Volga       2 3        .67  .20-0.94     1  12/05/15 2013-
Falcon Heavy           1 1        .67  .17-1.00     1  None     2018-
Soyuz 2-1a/Volga       1 1        .67  .17-1.00     1  None     2016-
Angara A5              1 1        .67  .17-1.00     1  None     2014-
GSLV Mk3               1 1z       .67  .17-1.00     1  None     2017-
KT-2                   1 1        .67  .17-1.00     1  None     2017-
Soyuz 2-1v             1 1        .67  .17-1.00     1  None     2018-
Safir                  5 8(C)     .63  .30-0.87     1  09/02/12 2008-
Electron               1 2        .50  .09-0.91     1  05/25/17 2017-
SS-520                 1 2        .50  .09-0.91     1  01/14/17 2017-
CZ-5                   1 2        .50  .09-0.91     0  07/02/17 2016-
Unha (TD-2)            2 5%       .44  .12-0.77     2  04/12/12 2006-
Proton-M/DM-03         1 3        .43  .06-0.80     1  07/02/13 2010-
=====================================================================
* Lewis Point Estimate determined as follows:
  Maximum Likelihood Estimate (MLE) = x/n, where x = successes, n = tries
  If MLE <= 0.5, use Wilson Method = (x+2)/(n+4)
  If 0.5 < MLE < 0.9, use MLE = x/n
  If MLE >= 0.9, use Laplace Method = (x+1)/(n+2)
  Lewis, J. & Sauro, J., "Improving the Accuracy of Small-Sample Estimates of Completion Rates", Journal of Usability Studies, Issue 3, Vol. 1, May 2006, pp. 136-150.
  Adjusted-Wald 95% Confidence Interval range also provided.

 - Ed Kyle
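The three-way rule in the table footnote can be written out in a few lines. This is only a sketch of the rule as stated there, not the script used to build the table. The function name `lewis_point_estimate` is hypothetical. The thresholds are compared with integer arithmetic (2x <= n for MLE <= 0.5, 10x < 9n for MLE < 0.9), which is my own choice so that boundary cases such as 27/30 are not affected by floating-point rounding.

```python
def lewis_point_estimate(x, n):
    """Point estimate chosen by the footnote rule (x successes, n tries)."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    if 2 * x <= n:                 # MLE <= 0.5: Wilson method
        return (x + 2) / (n + 4)
    if 10 * x < 9 * n:             # 0.5 < MLE < 0.9: plain MLE
        return x / n
    return (x + 1) / (n + 2)       # MLE >= 0.9: Laplace method


# A few (successes, tries) pairs taken from the table rows.
for name, x, n in [
    ("Soyuz-FG", 53, 53),
    ("Soyuz 2-1b/Fregat", 27, 30),
    ("Safir", 5, 8),
    ("Proton-M/DM-03", 1, 3),
]:
    print(f"{name:<18} {x:>3}/{n:<3} {lewis_point_estimate(x, n):.3f}")
```

Note that the rule is not continuous at the thresholds: a vehicle at exactly 0.9 (27 of 30) switches to the Laplace method and drops to 28/32, while one just under 0.9 keeps its raw rate.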
And yet from the data, both of these assumptions would appear to be false.

This situation reminds me of the clash of cultures between statisticians and practitioners of machine learning (ML). Statisticians sometimes criticize the ML crowd as "not doing science," because they are trying to get the best possible results for a particular dataset, instead of doing what theoretical statisticians do: proving that a certain method has optimal results when certain assumptions about the distribution of input data are met.

Sadly, real data never meets the statisticians' assumptions.