I'm pretty sure Shuttle had occasional issues with one of its triple redundant flight computers... Was the same "sky is falling" mentality expressed about that?
Just a datapoint for everyone. I used to work at a company that used the RAD750 (for GLAST, now the Fermi Gamma-ray Space Telescope). It is a rad-hard version of the PowerPC 750. Apple called it the PowerPC G3. It was used in the multi-color iMacs. I think it is still at or near the top of the heap for rad-hard CPUs. It runs at 200 MHz. It is used on Curiosity, Juno and many others. Cost per board was ~$200,000.
Quote from: Lars_J on 11/16/2012 04:34 pm
I'm pretty sure Shuttle had occasional issues with one of its triple redundant flight computers... Was the same "sky is falling" mentality expressed about that?
True, but Shuttle had five GPCs, all of which were rad hardened - making total loss much less likely.
Attacking strawmen is not "bringing balance." Identifying true mistakes and errors /does/ bring balance.
Quote from: Space Pete on 11/16/2012 04:42 pm
Quote from: Lars_J on 11/16/2012 04:34 pm
I'm pretty sure Shuttle had occasional issues with one of its triple redundant flight computers... Was the same "sky is falling" mentality expressed about that?
True, but Shuttle had five GPCs, all of which were rad hardened - making total loss much less likely.
Yes, but they had a very low MTBF (where low is bad) rating compared to modern electronics: http://www.atarimagazines.com/compute/issue132/92_Space_shuttle_techno.php
Quote from: Robotbeat on 11/16/2012 04:55 pm
Quote from: Space Pete on 11/16/2012 04:42 pm
Quote from: Lars_J on 11/16/2012 04:34 pm
I'm pretty sure Shuttle had occasional issues with one of its triple redundant flight computers... Was the same "sky is falling" mentality expressed about that?
True, but Shuttle had five GPCs, all of which were rad hardened - making total loss much less likely.
Yes, but they had a very low MTBF (where low is bad) rating compared to modern electronics: http://www.atarimagazines.com/compute/issue132/92_Space_shuttle_techno.php
That article is old and incorrect. The AP-101S (not B) was specified at 10000 hrs MTBF and had reached 80000 hrs MTBF at the end of the program. 80000 hrs is over 9 years without a failure and much better than generic commercial processors/computers. It is what is needed when someone's life depends upon the avionics working.
Quote from: alk3997 on 11/16/2012 07:32 pm
Quote from: Robotbeat on 11/16/2012 04:55 pm
Quote from: Space Pete on 11/16/2012 04:42 pm
Quote from: Lars_J on 11/16/2012 04:34 pm
I'm pretty sure Shuttle had occasional issues with one of its triple redundant flight computers... Was the same "sky is falling" mentality expressed about that?
True, but Shuttle had five GPCs, all of which were rad hardened - making total loss much less likely.
Yes, but they had a very low MTBF (where low is bad) rating compared to modern electronics: http://www.atarimagazines.com/compute/issue132/92_Space_shuttle_techno.php
That article is old and incorrect. The AP-101S (not B) was specified at 10000 hrs MTBF and had reached 80000 hrs MTBF at the end of the program. 80000 hrs is over 9 years without a failure and much better than generic commercial processors/computers. It is what is needed when someone's life depends upon the avionics working.
It did indeed experience failure: http://www.slashgear.com/space-shuttle-crew-gets-midnight-wakeup-over-computer-failure-15165274/
Not a huge, scary problem. There were five redundant ones. But it wasn't error-free, either.
Let me take this viewpoint: NASA purchased launch services from SpaceX, and in the process of doing so issued a number of requirements for SpaceX to conform to. Was the use of rad-hardened computer equipment among those requirements?
If NO, then NASA apparently was short-sighted OR NASA expected other means of compensating for the increased radiation-induced malfunctions in orbit to be sufficient (such as flying with added redundancy).
If YES, then SpaceX didn't perform as required OR the radiation environment in orbit exceeds the requirement levels. If the former turns out to be correct, one could wonder why NASA signed off on a flight that did not meet requirements.
But the above is all big IFs.
Quote from: Robotbeat on 11/16/2012 08:37 pm
Quote from: alk3997 on 11/16/2012 07:32 pm
Quote from: Robotbeat on 11/16/2012 04:55 pm
Quote from: Space Pete on 11/16/2012 04:42 pm
Quote from: Lars_J on 11/16/2012 04:34 pm
I'm pretty sure Shuttle had occasional issues with one of its triple redundant flight computers... Was the same "sky is falling" mentality expressed about that?
True, but Shuttle had five GPCs, all of which were rad hardened - making total loss much less likely.
Yes, but they had a very low MTBF (where low is bad) rating compared to modern electronics: http://www.atarimagazines.com/compute/issue132/92_Space_shuttle_techno.php
That article is old and incorrect. The AP-101S (not B) was specified at 10000 hrs MTBF and had reached 80000 hrs MTBF at the end of the program. 80000 hrs is over 9 years without a failure and much better than generic commercial processors/computers. It is what is needed when someone's life depends upon the avionics working.
It did indeed experience failure: http://www.slashgear.com/space-shuttle-crew-gets-midnight-wakeup-over-computer-failure-15165274/
Not a huge, scary problem. There were five redundant ones. But it wasn't error-free, either.
Come on. As I said earlier, problems do come up. It is how they are dealt with that is important. Yes, there was this issue. However, what is said above about the general failure rate and their overall reliability is quite correct. With this experience on Dragon, are you attempting to claim that reliability does not matter as long as there is sufficient redundancy?
You seem to be arguing both sides of the fence here. If you are going to point to this issue on orbiter: if approximately the same situation as on Dragon had occurred on orbiter, would it still be "no big deal"?
In both cases, it's no big deal. Sure, it's better not to have any, but we live in an imperfect world.
And yeah, sufficient redundancy can indeed compensate (in fact, usually MORE than compensates) for individual non-perfect reliability. Aggregate reliability is increased. That is why Shuttle went for 5 computers... the original design target was an MTBF of 5000 hours per computer. But having 5 computers meant they could still do their missions safely.
I do systems engineering for redundant computer systems as a part-time job.
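To put rough numbers on the redundancy argument, here is a minimal Python sketch. It is only an illustration, not how Shuttle or Dragon reliability was actually analysed: it assumes a constant failure rate (exponential model), fully independent failures, no repair or reboot, and a two-week mission length that I picked arbitrarily. The function names are made up for the example. The 5000-hour MTBF is the figure quoted above.

```python
import math

def unit_failure_prob(mtbf_hours, mission_hours):
    """P(a single computer fails at least once during the mission),
    assuming a constant failure rate (exponential model)."""
    return 1.0 - math.exp(-mission_hours / mtbf_hours)

def all_fail_prob(n_units, mtbf_hours, mission_hours):
    """P(every one of n computers fails during the mission),
    assuming independent failures and no repair or reboot."""
    return unit_failure_prob(mtbf_hours, mission_hours) ** n_units

MISSION_HOURS = 14 * 24  # assumed two-week mission, not a figure from the thread
MTBF_HOURS = 5000.0      # per-computer MTBF quoted in the post above

for n in (1, 3, 5):
    print(n, "computers -> P(total loss) =", all_fail_prob(n, MTBF_HOURS, MISSION_HOURS))
```

The independence assumption is the weak point: a common-mode event (shared software bug, one radiation burst hitting several boxes) breaks it, so treat the multi-computer numbers as a best case.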
And yeah, sufficient redundancy can indeed compensate (in fact, usually MORE than compensates) for individual non-perfect reliability. Aggregate reliability is increased.
http://www.aviationweek.com/Blogs.aspx?plckBlogId=Blog:04ce340e-4b63-4d23-9695-d49ab661f385&plckPostId=Blog%3A04ce340e-4b63-4d23-9695-d49ab661f385Post%3Aa8b87703-93f9-4cdf-885f-9429605e14df
Dragon uses the same design principles as the Shuttle and Hubble.
In other words, they're going to fly to Mars without any rad-hardened parts. Awesome!
Quote from: mlindner on 11/19/2012 02:50 am
In other words, they're going to fly to Mars without any rad-hardened parts. Awesome!
No, that was not said or implied.
Come on. As I said earlier, problems do come up. It is how they are dealt with that is important. Yes, there was this issue. However, what is said above about the general failure rate and their overall reliability is quite correct. With this experience on Dragon, are you attempting to claim that reliability does not matter as long as there is sufficient redundancy?
You seem to be arguing both sides of the fence here. If you are going to point to this issue on orbiter: if approximately the same situation as on Dragon had occurred on orbiter, would it still be "no big deal"?
Quote from: Go4TLI on 11/16/2012 08:50 pm
Come on. As I said earlier, problems do come up. It is how they are dealt with that is important. Yes, there was this issue. However, what is said above about the general failure rate and their overall reliability is quite correct. With this experience on Dragon, are you attempting to claim that reliability does not matter as long as there is sufficient redundancy?
You seem to be arguing both sides of the fence here. If you are going to point to this issue on orbiter: if approximately the same situation as on Dragon had occurred on orbiter, would it still be "no big deal"?
Yes, no big deal. They had a similar experience on Shuttle, during the first Hubble repair mission, and it was no big deal, exactly because of the level of redundancy and the way the redundancy logic is applied.
Read this: http://www.aviationweek.com/Blogs.aspx?plckBlogId=Blog:04ce340e-4b63-4d23-9695-d49ab661f385&plckPostId=Blog%3A04ce340e-4b63-4d23-9695-d49ab661f385Post%3Aa8b87703-93f9-4cdf-885f-9429605e14df
Quote from: Space Pete on 11/16/2012 04:42 pm
Quote from: Lars_J on 11/16/2012 04:34 pm
I'm pretty sure Shuttle had occasional issues with one of its triple redundant flight computers... Was the same "sky is falling" mentality expressed about that?
True, but Shuttle had five GPCs, all of which were rad hardened - making total loss much less likely.
The following paper describes the improved GPCs and single-event upsets on their non-radiation-hardened SRAM chips. Note that software was included to constantly check for upsets, to do error checking and correction, etc. I wonder if SpaceX does all of that.
http://klabs.org/DEI/Processor/shuttle/oneill_94.pdf
- Ed Kyle
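For anyone wondering what "constantly check for upsets and correct them" looks like in practice, here is a toy Python sketch of a memory scrubber. To be clear, this is not the GPC's actual scheme (I haven't reproduced anything from the paper); I am swapping in a textbook Hamming(7,4) code, which stores 4 data bits in 7 and can repair any single flipped bit per word. All names are invented for the example.

```python
DATA_POSITIONS = (3, 5, 6, 7)    # 1-based positions holding data bits
PARITY_POSITIONS = (1, 2, 4)     # 1-based positions holding parity bits

def syndrome(word):
    """XOR of the 1-based positions of all set bits.
    0 for a valid codeword; otherwise the position of a single flipped bit."""
    s = 0
    for pos in range(1, 8):
        if word[pos - 1]:
            s ^= pos
    return s

def encode(nibble):
    """Encode a 4-bit value as a 7-bit Hamming(7,4) codeword (list of bits)."""
    word = [0] * 7
    for i, pos in enumerate(DATA_POSITIONS):
        word[pos - 1] = (nibble >> i) & 1
    s = syndrome(word)
    for pos in PARITY_POSITIONS:
        word[pos - 1] = 1 if s & pos else 0   # parity bits cancel the syndrome
    return word

def decode(word):
    """Extract the 4-bit value from a codeword (assumes it is already clean)."""
    return sum(word[pos - 1] << i for i, pos in enumerate(DATA_POSITIONS))

def scrub(memory):
    """Walk all words, repair single-bit upsets in place, return the repair count."""
    repaired = 0
    for word in memory:
        s = syndrome(word)
        if s:
            word[s - 1] ^= 1
            repaired += 1
    return repaired

memory = [encode(n) for n in range(16)]
memory[5][2] ^= 1                 # simulate a single-event upset in word 5
print("repaired:", scrub(memory))
print("contents intact:", [decode(w) for w in memory] == list(range(16)))
```

The catch is that plain Hamming(7,4) miscorrects if two bits flip in the same word before the scrubber gets to it, which is one reason to scrub continuously rather than occasionally.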
Think of a computer as lots of white marbles that are arranged in a specific pattern on a table, and a black marble comes in and knocks one of the white marbles out of place. Now, the memories of our computers are constantly checking for that happening. So if we take a hit in our most dense part of our computer – the memory – the computer detects it and repairs it and there's no harm done. But our other circuits in the computer, places like where we're bringing information in and out of the processor, if we take a hit there it can cause basically a bit to flip from a zero to a one. And that instruction can be wrong, and that is where the two processors in a single computer element voting on each other can detect that, and it can force a reboot. And that's what happened, we rebooted the computer.
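The "two processors in a single computer element voting on each other" part can be sketched like this in Python. This is my own toy model, not SpaceX's flight logic: the update formula, the injected bit flip, and the function names are all made up. The point it shows is that a pair can detect a disagreement but cannot tell which side is wrong, so the only safe response is to restart from a known state.

```python
def run_step(state, value, flip_bit=None):
    """Hypothetical control computation (a toy 16-bit update).
    flip_bit simulates a single-event upset corrupting the result."""
    result = (state * 31 + value) & 0xFFFF
    if flip_bit is not None:
        result ^= 1 << flip_bit
    return result

def element_step(state, value, upset_on_b=None):
    """Run the same step on processors A and B and compare the results.
    Returns the agreed result, or None if the pair disagrees."""
    a = run_step(state, value)
    b = run_step(state, value, flip_bit=upset_on_b)
    return a if a == b else None

def fly(inputs, upset_at=None):
    """Feed inputs through the element; on disagreement, reboot to state 0."""
    state, reboots = 0, 0
    for step, value in enumerate(inputs):
        flip = 3 if step == upset_at else None
        out = element_step(state, value, upset_on_b=flip)
        if out is None:
            reboots += 1
            state = 0          # reboot: restart from a known-good state
        else:
            state = out
    return state, reboots

print(fly([1, 2, 3]))              # no upset
print(fly([1, 2, 3], upset_at=1))  # upset on the second step forces a reboot
```

With more than one such element running in parallel, the others carry on while the rebooted one comes back, which is where the redundancy discussed earlier in the thread comes in.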