Author Topic: SpaceX CRS-1 Software/Computer Design Discussion Thread  (Read 36943 times)

Offline Robotbeat

  • Senior Member
  • *****
  • Posts: 39359
  • Minnesota
  • Liked: 25388
  • Likes Given: 12164
SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #20 on: 11/20/2012 05:34 pm »
Another point is that while throwing redundancy at the problem can indeed solve almost any reliability issue, the cost of such a decision may be an order of magnitude increase in complexity and software development costs.

And there you go.  So redundancy alone is not the fix-all you were suggesting.
Where did I say it will solve all issues ever? It does solve reliability issues, not necessarily in the most cost-effective way.
Quote
Also redundancy does NOT solve reliability issues.  It is a mitigation to the fact that one has poor reliability, that is a major difference. 
...
Yes, it does, and it isn't actually the major difference you make it out to be.

Since you're speaking in absolutes, let me use an example from my experience: It's actually better for reliability to have two consumer drives in a RAID 1 (especially if you are doing filesystem-level check-summing, another form of redundancy) than a single Enterprise drive by itself. There's more complication, sure, but reliability is absolutely dealt with using redundancy.

SSDs and hard drives also use tons of redundancy at the hardware/firmware level (ESPECIALLY SSDs) to improve the reliability of each block. On top of that, you use RAID. On top of that, you use redundant mirrored systems and backups.

Defense-in-depth with redundancy (even if you're using crappy consumer drives) beats using a single drive with extreme tolerances every single time when it comes to reliability and data integrity. The only exception to this (besides software glitches or common design errors, which can affect you in either case but have less of an impact if you have backups) would be when rebuild time approaches the same order of magnitude as MTBF (for that unit). But there are ways around that, too.
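A rough back-of-the-envelope sketch of the mirror-versus-single-drive argument, in Python. The two annual failure rates are made-up illustrative numbers, not measured figures, and the model assumes the drives fail independently (so it ignores the common design errors mentioned above):

```python
# Hypothetical annual failure rates (AFR); illustrative assumptions only.
CONSUMER_AFR = 0.05     # one cheap consumer drive
ENTERPRISE_AFR = 0.02   # one enterprise drive

# Pessimistic for the mirror: this counts any year in which both drives
# fail, even if the first one would have been replaced and rebuilt in time.
mirror_loss = CONSUMER_AFR ** 2

# A single drive loses the data as soon as it fails.
single_loss = ENTERPRISE_AFR

print(f"RAID 1 of two consumer drives: {mirror_loss:.4f}")
print(f"single enterprise drive:       {single_loss:.4f}")
```

Even with the mirror modelled pessimistically, the pair comes out well ahead under these assumed numbers. The picture changes if the failures are correlated or the rebuild window is long, which is the exception noted above.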

You don't even realize the extent to which redundancy in data increases reliability in even our communication right here because it is essentially completely transparent. Checksumming occurs all over the place. You think all the cables that make up the Internet have such high S/N ratios that they are designed to never produce a flipped bit? Absolutely not. There is check-summing all over the place. Even in your computer's PCI-E bus, there are cyclic redundancy checks occurring with each transaction.
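To make the checksumming point concrete, here is a minimal Python sketch of a CRC-32 catching a single flipped bit. The payload and the flipped bit position are invented for illustration, and this uses the standard-library `zlib.crc32`, not the actual CRC used on PCI-E or any particular network link:

```python
import zlib

def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (simulates a transmission error)."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)

payload = b"hypothetical telemetry frame"
sent_crc = zlib.crc32(payload)       # sender transmits this alongside the payload

received = flip_bit(payload, 13)     # one bit flipped in transit
crc_ok = zlib.crc32(received) == sent_crc

# crc_ok is False here, so the receiver knows to discard the frame or ask for a resend.
print("frame accepted" if crc_ok else "corruption detected")
```

The few extra bytes of CRC are the redundancy: they carry no new information, they only let the receiver notice that something changed.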
« Last Edit: 11/20/2012 05:44 pm by Robotbeat »
Chris  Whoever loves correction loves knowledge, but he who hates reproof is stupid.

To the maximum extent practicable, the Federal Government shall plan missions to accommodate the space transportation services capabilities of United States commercial providers. US law http://goo.gl/YZYNt0

Offline Chris Bergin

Hopefully I've not messed this up, but it seems we can have a standalone for the software rad issues. So a split thread.

Offline mmeijeri

  • Senior Member
  • *****
  • Posts: 7772
  • Martijn Meijering
  • NL
  • Liked: 397
  • Likes Given: 822
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #22 on: 11/20/2012 07:11 pm »
I'd love to hear more details about why SpaceX went with non-hardened components. So far I've heard C++ and Linux as an explanation, which I don't find very convincing since both work just fine on several rad-hard processors. There must be more to it.
Pro-tip: you don't have to be a jerk if someone doesn't agree with your theories

Offline A_M_Swallow

  • Elite Veteran
  • Senior Member
  • *****
  • Posts: 8906
  • South coast of England
  • Liked: 500
  • Likes Given: 223
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #23 on: 11/20/2012 07:47 pm »
I'd love to hear more details about why SpaceX went with non-hardened components. So far I've heard C++ and Linux as an explanation, which I don't find very convincing since both work just fine on several rad-hard processors. There must be more to it.

I can understand why SpaceX did it in their control room: the computers there have keyboards and displays.  However, most Dragon computers are embedded.  A rocket engine looks nothing like a display.

Offline guckyfan

  • Senior Member
  • *****
  • Posts: 7442
  • Germany
  • Liked: 2336
  • Likes Given: 2900
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #24 on: 11/20/2012 07:57 pm »
I'd love to hear more details about why SpaceX went with non-hardened components. So far I've heard C++ and Linux as an explanation, which I don't find very convincing since both work just fine on several rad-hard processors. There must be more to it.

An uneducated guess: on a slow system the realtime requirements may not be met with Linux and C++.

Offline MikeAtkinson

  • Full Member
  • ****
  • Posts: 1980
  • Bracknell, England
  • Liked: 784
  • Likes Given: 120
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #25 on: 11/20/2012 08:02 pm »
I'd love to hear more details about why SpaceX went with non-hardened components. So far I've heard C++ and Linux as an explanation, which I don't find very convincing since both work just fine on several rad-hard processors. There must be more to it.

A major reason must be cost. A quick search came up with $23,000 for a processor, so say $50,000 for a processor board. Six of these in a processing unit = $300,000; 18 processing units per Dragon = $5.4M.
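The same estimate written out as a small Python sketch, so the assumptions are easy to change. All three inputs are the rough guesses from this post, not vendor quotes or confirmed SpaceX numbers:

```python
# Rough guesses from the post above; adjust to taste.
BOARD_COST_USD = 50_000     # assumed cost of one rad-hard processor board
BOARDS_PER_UNIT = 6         # boards in one processing unit
UNITS_PER_DRAGON = 18       # processing units per Dragon

unit_cost = BOARD_COST_USD * BOARDS_PER_UNIT
dragon_cost = unit_cost * UNITS_PER_DRAGON

print(f"per processing unit: ${unit_cost:,}")
print(f"per Dragon:          ${dragon_cost:,}")
```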

They also probably take more power, so larger solar panels and radiators.

Then there is the limited selection of chips available, which may lead to compromises in design.

If they are slower than non-rad hard parts, the software may require more optimisation, which can be very costly both in development and maintenance.

Offline mlindner

  • Software Engineer
  • Senior Member
  • *****
  • Posts: 2928
  • Space Capitalist
  • Silicon Valley, CA
  • Liked: 2240
  • Likes Given: 827
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #26 on: 11/20/2012 08:04 pm »
I'd love to hear more details about why SpaceX went with non-hardened components. So far I've heard C++ and Linux as an explanation, which I don't find very convincing since both work just fine on several rad-hard processors. There must be more to it.

So I've been trying to make this point several times now, but apparently people are thinking about this differently than I. I don't know if it's a generational issue (I'm 23) or being a student in computer engineering or what. I think SpaceX is just trying to follow Amdahl's Law in that you shouldn't optimize a small part of the problem.

Why would you want to spend (relatively) large amounts of money on rad-hardened parts when you can just use multiple processors in parallel, checking each other? Going by stated prices for the fastest rad-hardened parts, the current top of the line gets you about 3 orders of magnitude less speed at about 3 orders of magnitude more cost, for a net 6 orders of magnitude worse price/performance.
LEO is the ocean, not an island (let alone a continent). We create cruise liners to ride the oceans, not artificial islands in the middle of them. We need a physical place, which has physical resources, to make our future out there.

Offline Robotbeat

  • Senior Member
  • *****
  • Posts: 39359
  • Minnesota
  • Liked: 25388
  • Likes Given: 12164
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #27 on: 11/20/2012 08:06 pm »
Plus, don't forget that radiation is just one cause of failure. A redundant design can guard against several.

They calculated (apparently correctly) that the error rate would be low enough that a triply redundant computer system with ability to reboot and resync the computers should allow for high enough reliability. In light of the successful mission and lower-than-expected error rate (if we believe SpaceX), that view was justified, with the caveat that they didn't automate resyncing and NASA told them not to resync manually, which can be classed as an oversight that should be (and appears to be) corrected.
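For anyone unfamiliar with how triple redundancy with voting works in principle, here is a minimal Python sketch. It is purely illustrative: the function names and the example command words are hypothetical, and nothing here reflects SpaceX's actual flight software:

```python
from collections import Counter
from typing import List, Optional, Sequence

def majority_vote(outputs: Sequence[int]) -> Optional[int]:
    """Return the value a strict majority of computers agree on, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None

def out_of_sync(outputs: Sequence[int]) -> List[int]:
    """Indices of computers that disagree with the majority (reboot/resync candidates)."""
    agreed = majority_vote(outputs)
    if agreed is None:
        # No majority at all: every unit is suspect.
        return list(range(len(outputs)))
    return [i for i, v in enumerate(outputs) if v != agreed]

# Hypothetical command words from three flight computers; unit 1 took a bit flip.
outputs = [0x2A, 0x2B, 0x2A]
agreed = majority_vote(outputs)   # the two healthy units outvote the bad one
faulty = out_of_sync(outputs)     # unit 1 is flagged for reboot and resync

print(f"agreed output: {agreed:#x}, units to resync: {faulty}")
```

The vote masks the fault immediately, but the system is down to two good units until the flagged one is rebooted and resynced, which is why automating the resync step matters.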
« Last Edit: 11/20/2012 08:09 pm by Robotbeat »

Offline mlindner

  • Software Engineer
  • Senior Member
  • *****
  • Posts: 2928
  • Space Capitalist
  • Silicon Valley, CA
  • Liked: 2240
  • Likes Given: 827
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #28 on: 11/20/2012 08:07 pm »
And operationally speaking, if one takes a step back and looks at the big picture of why these systems and vehicles exist, there are phases of potential mission scenarios where it is not optimal to have to assume one has poor reliability and then rely solely on redundancies that may require crew/ground input at less than ideal times and/or circumstances.

Which is exactly why SpaceX is making re-syncing automatic in future software. A possible reason they didn't make it auto-resync initially is to not make the system overly complex right away and get some flight heritage on the existing system before they added that feature.
« Last Edit: 11/20/2012 08:07 pm by mlindner »

Offline john smith 19

  • Senior Member
  • *****
  • Posts: 10444
  • Everyplaceelse
  • Liked: 2492
  • Likes Given: 13762
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #29 on: 11/20/2012 08:24 pm »
I'd love to hear more details about why SpaceX went with non-hardened components. So far I've heard C++ and Linux as an explanation, which I don't find very convincing since both work just fine on several rad-hard processors. There must be more to it.
AFAIK the going price for the BA 750 board (PowerPC architecture) is in the $400-800k range. I'd expect that's "price competitive" in this market with similar products running lesser-known instruction sets like the USAF 1750A and the USN ANsomething-or-other. IIRC this is about the capability of a mid-90s Apple Mac. Aside from the *eyewatering* price I think you'll find these boards are *mostly* instruction set compatible with other PowerPCs, but not *exactly*, much as the European equivalent (Mongoose?) is based on the SPARC 7 architecture.

So on the upside the hardware is manufactured in a rad-hard process (SOS/SOI substrates are only the *start*). From the transistor up, *all* registers are likely to have 3-way voting, as are all I/O and the watchdog timer, so you get defense in depth (*provided* your software makes appropriate use of the features).

*But* you've got not-quite compatibility with less popular instruction sets (possibly with *substantial* limitations, like the 1MB address space on the 1750A, still used by ULA IIRC, or the Shuttle's 4Pi architecture), probably favoring military standard 1553b bus protocols (with mil-spec pricing), and a clock frequency at most in the low 100s of MHz. You get *no* control over the form factor, and any additional peripherals will be available at the same "competitive" pricing.

I note in all this talk I've not seen any comment on what SpaceX actually *uses*. My instinct is x86 compatibles or ARMs (which have enjoyed *much* better power consumption).
MCT ITS BFR SS. The worlds first Methane fueled FFSC engined CFRP SS structure A380 sized aerospaceplane tail sitter capable of Earth & Mars atmospheric flight.First flight to Mars by end of 2022 2027?. T&C apply. Trust nothing. Run your own #s "Extraordinary claims require extraordinary proof" R. Simberg."Competitve" means cheaper ¬cheap SCramjet proposed 1956. First +ve thrust 2004. US R&D spend to date > $10Bn. #deployed designs. Zero.

Offline schaban

  • Full Member
  • *
  • Posts: 180
  • Liked: 53
  • Likes Given: 132
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #30 on: 11/20/2012 08:36 pm »
Could ITAR or other limitations be one of the reasons not to choose rad-hardened hardware?

Especially since Musk has mentioned that ultimately he could try to sell Dragons to third parties, possibly outside the US...

Offline guckyfan

  • Senior Member
  • *****
  • Posts: 7442
  • Germany
  • Liked: 2336
  • Likes Given: 2900
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #31 on: 11/20/2012 08:39 pm »
Could ITAR or other limitations be one of the reasons not to choose rad-hardened hardware?

Especially since Musk has mentioned that ultimately he could try to sell Dragons to third parties, possibly outside the US...

You could also consider the possibility that SpaceX was not lying when they stated their reasons for not using rad-hardened parts in that article.


Offline Robotbeat

  • Senior Member
  • *****
  • Posts: 39359
  • Minnesota
  • Liked: 25388
  • Likes Given: 12164
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #32 on: 11/20/2012 08:43 pm »
I'd love to hear more details about why SpaceX went with non-hardened components. So far I've heard C++ and Linux as an explanation, which I don't find very convincing since both work just fine on several rad-hard processors. There must be more to it.
AFAIK the going price for the BA 750 board (PowerPC architecture) is in the $400-800k range. I'd expect that's "price competitive" in this market with similar products running lesser-known instruction sets like the USAF 1750A and the USN ANsomething-or-other. IIRC this is about the capability of a mid-90s Apple Mac. Aside from the *eyewatering* price I think you'll find these boards are *mostly* instruction set compatible with other PowerPCs, but not *exactly*, much as the European equivalent (Mongoose?) is based on the SPARC 7 architecture.

So on the upside the hardware is manufactured in a rad-hard process (SOS/SOI substrates are only the *start*). From the transistor up, *all* registers are likely to have 3-way voting, as are all I/O and the watchdog timer, so you get defense in depth (*provided* your software makes appropriate use of the features).

*But* you've got not-quite compatibility with less popular instruction sets (possibly with *substantial* limitations, like the 1MB address space on the 1750A, still used by ULA IIRC, or the Shuttle's 4Pi architecture), probably favoring military standard 1553b bus protocols (with mil-spec pricing), and a clock frequency at most in the low 100s of MHz. You get *no* control over the form factor, and any additional peripherals will be available at the same "competitive" pricing.

I note in all this talk I've not seen any comment on what SpaceX actually *uses*. My instinct is x86 compatibles or ARMs (which have enjoyed *much* better power consumption).
Quite informative...

So, four or five of those puppies would get to be in the millions of dollars, not counting peripherals. That becomes a significant portion of the spacecraft's cost... SpaceX is a company that likes to spend as little as possible on outside components. And presumably, they would want similarity to their rocket's avionics as well. That would mean millions for each Falcon 9 or even Falcon 1 (back when they were still pursuing it) or the extra overhead of having two very different platforms.

Offline cleonard

  • Full Member
  • **
  • Posts: 212
  • Liked: 34
  • Likes Given: 0
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #33 on: 11/20/2012 08:51 pm »
From that Aviation Week link it's obvious that SpaceX spent a good deal of time engineering a computing solution.  They did a lot of analysis and even a good amount of testing.  The result is the current set of computing resources used in the SpaceX vehicles.  So far it's worked out.

Please remember that Radiation Hardened means a lot of different things.  There are the transient effects of particles hitting computer components and there is the total dose over time.  Even hardened components suffer from SEU and you have to deal with that no matter what type of parts you use.

The total dose that a Dragon computer might see in a LEO mission is low.  Just guessing, I'd say 1 rad or so.  The Curiosity rover has a RAD750 computer that is specified for 100k rads.  To get that 100k rad rating you get to pay a reported $400k or so for it.

Now SpaceX had said that the Dragon could land on any solid surface in the solar system.  Good luck landing on Io with the current computer setup.  At the surface of Io you get about 2 rads per minute.   The current computer system would not survive the radiation environment for long.  What about Mars?  I'd say that is a maybe or maybe not. 
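To put those numbers side by side, here is a quick Python calculation using only the figures quoted in this post (the 100k rad RAD750 rating, the roughly 2 rads per minute at Io, and the guessed 1 rad for a LEO mission). It treats the rated total dose as a hard lifetime limit and ignores shielding and transient upsets, so take it as a rough sketch only:

```python
RATED_TOTAL_DOSE_RAD = 100_000    # RAD750 rating quoted above
IO_DOSE_RATE_RAD_PER_MIN = 2      # Io surface figure quoted above
LEO_MISSION_DOSE_RAD = 1          # the guess above for one LEO mission

# How long a part rated for 100k rad would last on Io's surface.
minutes_on_io = RATED_TOTAL_DOSE_RAD / IO_DOSE_RATE_RAD_PER_MIN
days_on_io = minutes_on_io / (60 * 24)

# How quickly Io delivers the dose of an entire LEO mission.
seconds_for_leo_dose = LEO_MISSION_DOSE_RAD / IO_DOSE_RATE_RAD_PER_MIN * 60

print(f"rated part on Io: {minutes_on_io:,.0f} min (~{days_on_io:.1f} days)")
print(f"one LEO mission's dose on Io: {seconds_for_leo_dose:.0f} s")
```

So by these figures even the rad-hard part reaches its rated dose in about five weeks on Io, while an unhardened computer would soak up a whole LEO mission's worth of dose every 30 seconds.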
« Last Edit: 11/20/2012 08:51 pm by cleonard »

Offline Robotbeat

  • Senior Member
  • *****
  • Posts: 39359
  • Minnesota
  • Liked: 25388
  • Likes Given: 12164
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #34 on: 11/20/2012 09:15 pm »
From that Aviation Week link it's obvious that SpaceX spent a good deal of time engineering a computing solution.  They did a lot of analysis and even a good amount of testing.  The result is the current set of computing resources used in the SpaceX vehicles.  So far it's worked out.

Please remember that Radiation Hardened means a lot of different things.  There are the transient effects of particles hitting computer components and there is the total dose over time.  Even hardened components suffer from SEU and you have to deal with that no matter what type of parts you use.

The total dose that a Dragon computer might see in a LEO mission is low.  Just guessing, I'd say 1 rad or so.  The Curiosity rover has a RAD750 computer that is specified for 100k rads.  To get that 100k rad rating you get to pay a reported $400k or so for it.

Now SpaceX had said that the Dragon could land on any solid surface in the solar system.  Good luck landing on Io with the current computer setup.  At the surface of Io you get about 2 rads per minute.   The current computer system would not survive the radiation environment for long.  What about Mars?  I'd say that is a maybe or maybe not. 
Good one about Io... Not much could survive there! It'd be a challenge for even a very good rad-hard computer. You'd need additional shielding.

Offline john smith 19

  • Senior Member
  • *****
  • Posts: 10444
  • Everyplaceelse
  • Liked: 2492
  • Likes Given: 13762
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #35 on: 11/20/2012 09:55 pm »
So, four or five of those puppies would get to be in the millions of dollars, not counting peripherals.
I guess that's more or less the going rate for this kind of hardware.
Quote
That becomes a significant portion of the spacecraft's cost... SpaceX is a company that likes to spend as little as possible on outside components.
It does mount up. The AvWeek article said they have about 54 processors on the whole LV/capsule doing various things. Commonality seems to be a *very* big SpaceX trait. Why support 2 (or 3?) architectures when you can standardize on 1?
Quote
And presumably, they would want similarity to their rocket's avionics as well. That would mean millions for each Falcon 9 or even Falcon 1 (back when they were still pursuing it) or the extra overhead of having two very different platforms.
Exactly.
Note a classic issue with redundancy management is what happens if the SEU happens inside the *voting* logic. This *could* be done with off-the-shelf rad-hard logic, acting as a "gatekeeper" on the processors' I/O.

Offline IRobot

  • Full Member
  • ****
  • Posts: 1311
  • Portugal & Germany
  • Liked: 310
  • Likes Given: 272
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #36 on: 11/20/2012 11:13 pm »
This is from some of the reports on what went wrong:

•One of three flight computers failed while Dragon was docked at ISS due to a suspected radiation hit. The computer was restarted but was not resynchronized with the other two units. SpaceX says that NASA felt resynchronization was not necessary to continue the mission.
•One of three GPS units, the Propulsion and Trunk computers and Ethernet switch also experienced suspected radiation hits, but they were recovered during a power cycle.

This is for about a 2-week-long flight where the majority of the time it was not doing anything and just attached to ISS.
Unsure if someone already mentioned it, but these could all be caused by EMI or static electricity. That would make sense as they were attached to ISS.

AFAIK, they had some EMI issues some months ago...

Radiation is not the only thing that causes these symptoms.

Offline jimvela

  • Member
  • Full Member
  • ****
  • Posts: 1672
  • Liked: 921
  • Likes Given: 75
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #37 on: 11/21/2012 04:44 am »
Replying to two items from this thread:

This is very off topic, but as someone who designs PCBs, no one except the very largest manufacturers etch their own PCBs; and unless you are manufacturing at least several hundred assemblies a month it's generally not worth it to do your own component placement either.

I watch flight boards get populated/placed and assembled in unit quantities down to qty=1 in the lab next to mine, on a regular basis.

Space rated PWA and PWB assembly is a different game than nearly anything commercial and absolutely everything high-volume.

Aside from the *eyewatering* price I think you'll find these boards are *mostly* instruction set compatible with other PowerPCs, but not *exactly*
Mostly is... mostly correct.  :)

I have testbench hardware that sometimes substitutes COTS PPC hardware in place of a rad hard board. 

There's another option in that you can buy non-flight boards from the same vendors as the rad hard boards that save quite a bit of money for applications like a testbed because most of the flight assembly processing and some flight packaging/finishing is omitted.

Quote
, much as the European equivalent (Mongoose?) is based on the SPARC 7 architecture.

The ESA SPARC is a LEON. It's commercially available and very widely used.

Quote
probably favoring military standard 1553b bus protocols (with mil spec pricing)

1553 is indeed common, but newer buses like SpaceWire are becoming commonplace alongside it.  1553 isn't terribly expensive, at least not rad-hard flight processor board expensive.

Quote
I note in all this talk I've not seen any comment on what SpaceX actually *uses*. My instinct is x86 compatibles or ARMs (which have enjoyed *much* better power consumption).

I have no knowledge about SpaceX avionics architecture, but I'd be shocked beyond words if it were ARM based today.

Offline mlindner

  • Software Engineer
  • Senior Member
  • *****
  • Posts: 2928
  • Space Capitalist
  • Silicon Valley, CA
  • Liked: 2240
  • Likes Given: 827
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #38 on: 11/21/2012 04:58 am »
That becomes a significant portion of the spacecraft's cost... SpaceX is a company that likes to spend as little as possible on outside components.
It does mount up. The AvWeek article said they have about 54 processors on the whole LV/capsule doing various things. Commonality seems to be a *very* big SpaceX trait. Why support 2 (or 3?) architectures when you can standardize on 1?

From the AvWeek article:
Quote
We've got 54 in a Dragon – and they're all different kinds of computers, different kinds of processors.
They don't standardize on one.

This is one of the reasons they use Linux and C++. If the processor has minimal elements like an MMU (memory management unit) and timer interrupts, then you can run Linux on it and get the same POSIX environment on every one of them.
« Last Edit: 11/21/2012 05:01 am by mlindner »

Offline mlindner

  • Software Engineer
  • Senior Member
  • *****
  • Posts: 2928
  • Space Capitalist
  • Silicon Valley, CA
  • Liked: 2240
  • Likes Given: 827
Re: SpaceX CRS-1 Software/Computer Design Discussion Thread
« Reply #39 on: 11/21/2012 05:30 am »
I note in all this talk I've not seen any comment on what SpaceX actually *uses*. My instinct is x86 compatibles or ARMs (which have enjoyed *much* better power consumption).
I have no knowledge about SpaceX avionics architecture, but I'd be shocked beyond words if it were ARM based today.
Why do you say that? ARM is the leader in full-function (32-bit) low-power applications. (You can go lower power but it generally means going below 32-bit processors.) The main reason new applications use x86 is for binary/assembly compatibility with previous x86 code. In today's era of cross-compilation, though, this is a non-issue.
