Dragon/Falcon Software

Author Topic: Dragon/Falcon Software  (Read 11412 times)
krytek
Full Member
*****
Offline

Posts: 527


« on: 04/12/2012 12:19 AM »

Just occurred to me that software is probably one of the least discussed and obscure topics around here.

Here's an interesting tidbit -
"Our Flight Software Group currently has opportunities developing software for embedded flight hardware using Linux and VxWorks as well as ground simulation software using Linux"
source - http://www.startuphire.com/job/software-engineer-embedded-linux-159520
So I guess there's no danger of Dragon getting a BSOD  ::)

Anyway, can someone explain what a real-time OS like VxWorks actually does differently, and how it combines with Linux in the flight hardware?


goretexguy
Full Member
**
Offline

Posts: 2


« Reply #1 on: 04/12/2012 12:44 AM »

An RTOS is designed to provide consistently reliable response times. In other words, programmers don't need to worry that their flight control program will be interrupted or delayed by something else. An RTOS guarantees that information will flow back and forth within certain time limits, as opposed to Windows or Mac OS, which may have great variability in response times. Humans may not care about 30 milliseconds, but rockets and nuclear reactors do. An RTOS like VxWorks is also extremely reliable... Visit Wikipedia for more info.
Linux will be used to host the apps which provide flight simulation data, pretending to be rockets, pumps, navigation, etc. and feeding all that data into the actual flight control systems.
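
If you want to see the response-time variability for yourself, here is a rough sketch (not anything from SpaceX or Wind River) that asks a general-purpose OS to wake a loop up once per millisecond and records how late each wake-up actually is. The function name and the 1 ms period are just my picks for illustration. On a desktop OS the minimum and maximum lateness usually differ a lot, and that spread is what an RTOS is built to bound.

```python
import time


def measure_wakeup_lateness(period_s=0.001, iterations=200):
    """Return how late (in microseconds) each periodic wake-up was."""
    lateness_us = []
    next_deadline = time.monotonic() + period_s
    for _ in range(iterations):
        remaining = next_deadline - time.monotonic()
        if remaining > 0:
            # Ask the OS to wake us at the deadline; it may wake us later.
            time.sleep(remaining)
        now = time.monotonic()
        lateness_us.append((now - next_deadline) * 1e6)
        # Use absolute deadlines so one late wake-up does not shift the rest.
        next_deadline += period_s
    return lateness_us


if __name__ == "__main__":
    samples = measure_wakeup_lateness()
    print(f"min lateness {min(samples):.0f} us, max lateness {max(samples):.0f} us")
```

Run it once on an idle machine and once while something heavy is running (a big compile, a video encode) and compare the maximum values.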
butters
Full Member
*****
Offline

Posts: 1228


« Reply #2 on: 04/12/2012 03:19 AM »

The Linux kernel is primarily designed so that lots of different processes running on behalf of many different users can safely and fairly share the system resources. Each process runs in its own virtual address space, isolated from all other processes. User threads may only enter the kernel via one hardware exception handler which enforces strict protection of kernel resources. For example, there is a dedicated stack reserved for each thread when it is running in kernel. User processes are not to be trusted.

VxWorks is a single-user embedded OS designed to run a known set of real-time threads with static priority levels in a single kernel mode address space. The scheduling is fully-preemptive, but while they are scheduled to run, threads are trusted to play nice and to not barf all over resources belonging to other threads or the core kernel. This simple design permits very fast context switches between threads or between schedulable threads and interrupt service routines.

Over the years, Linux has made great big leaps in minimizing scheduling latency, which is perhaps the single most important measure of real-time capability. In Linux 2.0, the kernel could only service one thread at a time, and it always completed each system call before scheduling a different thread. Since Linux 2.6, threads can run in the kernel on all processors simultaneously, and threads can be preempted in favor of a higher-priority thread even while running in the kernel. The scheduler can also run concurrently on all processors, and no matter how many threads are queued, it always takes the same amount of time to find the highest-priority thread to run.
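
The constant-time lookup is easier to picture with a toy version. This is a minimal sketch of the idea behind the early 2.6 "O(1)" scheduler: one FIFO queue per priority level plus a bitmap with one bit per level, so finding the best runnable thread is a find-first-set-bit operation rather than a walk over every queued thread. The class and method names are made up for illustration, and the real kernel does this in C with a hardware bit-scan instruction. (Later 2.6 kernels replaced the default scheduler with CFS, but the bitmap approach is still a nice illustration of the claim.)

```python
from collections import deque

NUM_PRIORITIES = 140  # the 2.6-era scheduler used 140 levels; lower number = higher priority


class PriorityRunQueue:
    """Toy run queue: per-priority FIFOs plus a bitmap of non-empty levels."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]
        self.bitmap = 0  # bit p is set exactly when queues[p] is non-empty

    def enqueue(self, priority, task):
        self.queues[priority].append(task)
        self.bitmap |= 1 << priority

    def pick_next(self):
        """Return the highest-priority task, or None if nothing is runnable."""
        if self.bitmap == 0:
            return None
        lowest_set_bit = self.bitmap & -self.bitmap  # isolate the lowest set bit
        priority = lowest_set_bit.bit_length() - 1
        task = self.queues[priority].popleft()
        if not self.queues[priority]:
            self.bitmap &= ~(1 << priority)
        return task


if __name__ == "__main__":
    rq = PriorityRunQueue()
    rq.enqueue(120, "shell")
    rq.enqueue(5, "control_loop")
    print(rq.pick_next())  # the priority-5 task is picked before the priority-120 one
```

The cost of pick_next depends on the fixed number of priority levels, not on how many tasks are waiting.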

But no matter how small they squeeze the critical sections where kernel preemption or interrupts are disabled, the Linux kernel will still have non-deterministic response times because it makes some compromises to prevent device drivers from monopolizing the system and preventing user processes from running.

For example, Linux runs any pending "bottom halves" of device drivers whenever the kernel returns from any interrupt handler. But if a device driver reschedules its bottom half to run again while the processor is still executing bottom halves, then it is queued to special low-priority kernel threads (ksoftirqd) which wait to run until the processor is otherwise idle. This helps prevent device drivers from hogging the system (e.g. denial of service attacks which flood the network card with requests), but it also results in non-deterministic response time under heavy I/O loads.
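
Here is a toy model of the deferral rule described above, just to show why latency depends on load. It is not kernel code: the class, the method names and the "packet" handler are all hypothetical, and a real kernel is more nuanced about how often it retries before handing work to ksoftirqd. In the model, bottom halves raised before an interrupt returns run right away, while anything re-raised during that pass is parked until the processor is idle.

```python
from collections import deque


class SoftirqModel:
    """Toy model: run pending bottom halves once, defer anything re-raised."""

    def __init__(self):
        self.pending = deque()   # bottom halves waiting for the next interrupt exit
        self.deferred = deque()  # work handed to the low-priority "ksoftirqd" role
        self.log = []

    def raise_softirq(self, handler):
        self.pending.append(handler)

    def on_interrupt_exit(self):
        batch, self.pending = self.pending, deque()
        for handler in batch:
            handler(self)
        # Anything raised while the batch ran is pushed to the idle-time queue.
        self.deferred.extend(self.pending)
        self.pending.clear()

    def on_idle(self):
        batch, self.deferred = self.deferred, deque()
        for handler in batch:
            handler(self)
        self.deferred.extend(self.pending)
        self.pending.clear()


def make_rx_handler(packets):
    """Hypothetical network handler that re-raises itself while packets remain."""
    def rx(model):
        model.log.append(f"rx packet {packets.pop(0)}")
        if packets:
            model.raise_softirq(rx)
    return rx
```

With three queued packets, the first is handled at interrupt exit and the other two only get handled on later idle passes. How long those take to arrive depends on what else the processor is doing, which is the non-determinism in question.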

In short, Linux is designed to withstand hostile multi-user networked environments while delivering excellent performance in client-server applications. But although response times have gotten very fast for soft real-time, there are no hard real-time guarantees.
starsilk
Full Member
****
Offline

Posts: 336
Location: Denver



« Reply #3 on: 04/12/2012 05:21 AM »

realtime OS like vxworks guarantee (absolutely, positively) that hardware interrupts (requests for attention) will be serviced within a certain time. they also tend to impose hard limits on processor time utilization by processes.

in a nutshell: they are simple OS which provide cast iron guarantees about timing. you can see how that would be useful for flight software.

linux is a multitasking OS designed for interacting with humans, who tend to care little if things are delayed by half a second once in a while. as a result, it can be far more complex and offer many more features. (same applies to Windows, OS X etc). needless to say a half second delay in rocket engine control would result in disaster...

because realtime OS are simple, and often impose draconian programming requirements to meet their guarantees, you often see a combination approach - realtime OS where it is absolutely needed, and a non-realtime OS where it can be safely used.

an example: tesla is also using vxworks for engine management, but linux for the giant touchscreen display in the Model S.

in reality, the line is being blurred somewhat. there are various patches or third party products to make linux a 'realtime' OS - either by providing a small subset of features and offering guarantees to 'realtime' processes, or by running the entire linux kernel as a low priority process in a small realtime OS wrapper.

for example, the linux 'realtime' patches do the former, but are generally claimed to be 'soft realtime' - ie: we *think* we guarantee this, but once in a blue moon we may not.. they are approaching true realtime though, for example: http://www.h-online.com/open/news/item/OSADL-experimentally-analyses-Linux-s-real-time-capabilities-1500366.html shows many, many thousands of hours with no deviations... but to be truly considered realtime the algorithms generally have to be mathematically proven, not shown by experiment.

baldusi
Full Member
*****
Offline

Posts: 3122
Location: Buenos Aires, Argentina


« Reply #4 on: 04/12/2012 05:34 AM »

Let's not forget the L4-based Linux versions. Those are not only RT, but the L4 kernel is demonstrated as bug free. Or my beloved QNX, which gets its deterministic performance by using a microkernel, so even driver problems won't bring down the rest of the system. It does take a performance hit on the context switch front, though.
LegendCJS
Full Member
****
Online

Posts: 469
Location: Boston, MA


« Reply #5 on: 04/12/2012 05:40 AM »

Sometimes there is nothing a real-time kernel can do if it's running on bad hardware.  I've been using the MathWorks Real Time Target (formerly Real Time Windows Workshop) for real-time control in Simulink, and I don't always get hard real time.  Apparently the newer your hardware is, the more likely it has some hardware-level energy management feature (read: BUG) that automatically takes over everything for a few moments every now and then to make sure you aren't wasting energy.  This is just one example of the problems with trying to go real time on a store-bought computer.

I'm certain that the pros know how to take care of this with careful hardware selection.
IRobot
Full Member
****
Offline

Posts: 462
Location: Portugal


« Reply #6 on: 04/12/2012 10:49 AM »

This is the new approach, pass it on to software. The old approach for realtime control would be to use Xilinx. 10 years ago I would never think that software engineers would have so much power...
grr
Full Member
***
Offline

Posts: 162
Location: Highlands Ranch, Colorado


« Reply #7 on: 04/12/2012 11:14 AM »

To add to this, VxWorks (like any real-time OS) allows you to deterministically control the scheduling of a thread/process via set numbers. That is, you can state that a particular thread gets 4 units of time out of, say, every 100 units.  This scheduling is important for making calculations on, well, real-time issues.
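
The "4 units out of every 100" idea can be sketched as a static slot table that repeats forever. This is only an illustration of fixed time budgets in general, not the VxWorks API, and the thread names and helper functions are invented for the example.

```python
def build_major_frame(budgets, frame_length=100):
    """Build a repeating slot table from {thread_name: slots_per_frame}."""
    used = sum(budgets.values())
    if used > frame_length:
        raise ValueError(f"budgets need {used} slots but the frame has {frame_length}")
    frame = []
    for name, slots in budgets.items():
        frame.extend([name] * slots)  # each thread gets a fixed, contiguous window
    frame.extend(["idle"] * (frame_length - used))
    return frame


def owner_at(frame, tick):
    """Return which thread owns a given tick; the table repeats every frame."""
    return frame[tick % len(frame)]


if __name__ == "__main__":
    frame = build_major_frame({"attitude_control": 4, "telemetry": 10})
    print(owner_at(frame, 0), owner_at(frame, 4), owner_at(frame, 50))
```

Because the table is fixed ahead of time, you can answer "when will this thread next run, and for how long" with arithmetic instead of measurement. An oversubscribed table is rejected before anything runs.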

OTOH, regular Linux uses non-deterministic scheduling. You set priorities, etc. and HOPE to have a fair response time for all thread/processes. There are real-time extensions for Linux, but to be honest, these are considered soft real time vs. hard. Many of the RTOS companies have to deal with the popularity of Linux.

The 3 big RTOSes out there would be QNX, VxWorks, and Green Hills. However, in the aviation and space arena, you need a DO-178B OS. Basically, a hard RTOS, generally with tight security. That would be VxWorks, Green Hills and Linux with RT extension.  Besides, QNX is falling out of favor, esp. since RIM bought them. You will see it in their BlackBerry tablets, but to be honest, RIM is done for. Most likely QNX will fall to the side.

One of the biggest differences between Green Hills and VxWorks is that VxWorks is working closely with the Linux community, while Green Hills is fighting them tooth and nail.  Because the VxWorks API and development env. are being ported to Linux, a coder can come up to speed relatively quickly in a well-known env and develop some useful code. Later, if needed, you can move that code to VxWorks quickly.  As such, Green Hills is also dying a slow death.

However, what is also interesting is that they picked WindRiver's vxworks.  Says a lot about them.
Antares
ABO
Full Member
*****
Offline

Posts: 4608
Location: Done arguing with amateurs


« Reply #8 on: 05/03/2012 10:17 PM »

Quote
(Not a fan boy but just suspicious whenever anybody demands perfect software. Made doubly so by experience in cellular when poor quality systems trumped 5-9s when 5-9 infrastructure sold at 20x the cost of one nine....)
Flawed analogy. Cell phones don't kill people when the software fails. (Dragon may be unmanned, but it's flying an approach to a manned international space station that cost over $50B to build). No one expects perfect software, but mission-critical software is held to higher standards and undergoes more rigorous testing. The space shuttle software was considered the "gold standard" in this category:

http://www.fastcompany.com/magazine/06/writestuff.html

No one expects SpaceX (or any other commercial space company) to ever reach that level again, but there are still better analogies than cell phones:
Aircraft flight software
Air traffic control software
Nuclear powerplant control software

All of these must be more-or-less bug-free in normal operation, or people die.

Add in medical devices and high-volume passenger trains and subways.
Jorge
Full Member
*****
Offline

Posts: 6763


« Reply #9 on: 05/03/2012 10:20 PM »

Quote
(Not a fan boy but just suspicious whenever anybody demands perfect software. Made doubly so by experience in cellular when poor quality systems trumped 5-9s when 5-9 infrastructure sold at 20x the cost of one nine....)
Flawed analogy. Cell phones don't kill people when the software fails. (Dragon may be unmanned, but it's flying an approach to a manned international space station that cost over $50B to build). No one expects perfect software, but mission-critical software is held to higher standards and undergoes more rigorous testing. The space shuttle software was considered the "gold standard" in this category:

http://www.fastcompany.com/magazine/06/writestuff.html

No one expects SpaceX (or any other commercial space company) to ever reach that level again, but there are still better analogies than cell phones:
Aircraft flight software
Air traffic control software
Nuclear powerplant control software

All of these must be more-or-less bug-free in normal operation, or people die.

Add in medical devices and high-volume passenger trains and subways.

Good additions. Good thread move, too.
Jorge
Full Member
*****
Offline

Posts: 6763


« Reply #10 on: 05/04/2012 01:24 AM »

1) Don't forget about the dedicated SpaceX software thread http://forum.nasaspaceflight.com/index.php?topic=28611.msg893373#msg893373

2) The hard part in most software writing is bridging the gap between the hardware engineers who own the parts that the software is controlling and the coders.  The coder can build perfect code that doesn't do what the valve engineer really wanted.

Especially when the hardware is changing. Configuration management is critical. Shuttle software at least had the advantage of a (relatively) stable hardware platform, once initial development was complete.
watermod
Full Member
***
Offline

Posts: 56


« Reply #11 on: 05/04/2012 02:51 AM »


Quote
(Not a fan boy but just suspicious whenever anybody demands perfect software. Made doubly so by experience in cellular when poor quality systems trumped 5-9s when 5-9 infrastructure sold at 20x the cost of one nine....)
Flawed analogy. Cell phones don't kill people when the software fails
I beg to differ. I believe our customer  had some base stations bombed several times when some Yakuza calls were dropped.  (Nuff said and slinking off on that - Recent memory brings up the bus contention in a  famous brand of autos (with hundreds of embedded procs) was claimed to have led to some deaths and almost took down that company)

On another note, read the errata for compilers, OSes and processors.  They are usually long and huge. For practical purposes errata are bugs.    There is no such thing as BUG FREE CODE.

As to the Linux kernel and RealTime...   There are many optional kernel schedulers and scheduling techniques.  They are usually compile time options for the Kernel but some are modules.   It's possible to radically change the scheduling behavior of a Linux kernel in several "Real Time" directions.   Hardware permitting it's possible to radically change the minimal clocking too.   If you look in the kernel code you see that ticks are macro-defined at compile time dependent on target hardware.  (obviously the std pc hardware clock is not going to cut fine grained time).  I should point out that syncing clocking among multiple machines below 500 ns is non-trivial. One should bear that in mind if you ever require tightly synchronized real time computers and commission a thesis level study first.
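
To put numbers on the tick point: the compile-time tick rate (HZ) sets the granularity of any timer that is driven purely by the periodic tick. A quick sketch of the arithmetic, with function names of my own choosing. 100, 250 and 1000 are common HZ settings; kernels built with high-resolution timers are not limited this way.

```python
def tick_period_us(hz):
    """Length of one timer tick in microseconds for a given HZ setting."""
    return 1_000_000 // hz


def earliest_expiry_us(requested_us, hz):
    """Round a requested timeout up to the next tick boundary."""
    period = tick_period_us(hz)
    ticks = -(-requested_us // period)  # ceiling division on integers
    return ticks * period


if __name__ == "__main__":
    for hz in (100, 250, 1000):
        print(hz, tick_period_us(hz), earliest_expiry_us(1500, hz))
```

So a 1.5 ms timeout on a tick-driven HZ=100 kernel cannot fire before 10 ms, which is why fine-grained timing needs both a suitable kernel configuration and a clock source that supports it.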
Jorge
Full Member
*****
Offline

Posts: 6763


« Reply #12 on: 05/04/2012 03:38 AM »


Quote
(Not a fan boy but just suspicious whenever anybody demands perfect software. Made doubly so by experience in cellular when poor quality systems trumped 5-9s when 5-9 infrastructure sold at 20x the cost of one nine....)
Flawed analogy. Cell phones don't kill people when the software fails
I beg to differ. I believe our customer  had some base stations bombed several times when some Yakuza calls were dropped.  (Nuff said and slinking off on that

s'alright. I needed the laugh!

Quote
- Recent memory brings up the bus contention in a  famous brand of autos (with hundreds of embedded procs) was claimed to have led to some deaths and almost took down that company)

I think it's quite wise of SpaceX to compare themselves with the auto company (especially concerning capital) and consider (internally) whether they would survive a similar issue. They've shown considerable integrity in standing down to fix issues. For a company with their burn rate, that's a real sacrifice.

Quote
On another note read the Errata on both compilers, OS's and processors.  They are usually long and huge. For practical purposes errata are bugs.    There is no such thing as BUG FREE CODE.

And that is why I was quite careful to say "more-or-less bug-free in normal operation" and "No one expects perfect software". The space shuttle demonstrates one can come very close to perfection, but at considerable cost ($83/SLOC/year in the mid-90s, if I did the math right). But other high-reliability applications such as the ones Antares and I listed show that one can still approach reasonably close to perfection at lower cost.
mnagy
Full Member
**
Offline

Posts: 13


« Reply #13 on: 05/04/2012 06:21 AM »

Let's not forget the L4 based linux versions. Those are not only RT, but the L4 kernel is demonstrated as bug free.
What do you mean by that? Surely no one is making a claim that a non-trivial software is free from bugs..
butters
Full Member
*****
Offline

Posts: 1228


« Reply #14 on: 05/04/2012 02:04 PM »

Let's not forget the L4 based linux versions. Those are not only RT, but the L4 kernel is demonstrated as bug free.
What do you mean by that? Surely no one is making a claim that a non-trivial software is free from bugs..

L4 is a true microkernel. It does interrupt routing, thread scheduling, (physical) memory allocation, and interprocess communication. That's it. It doesn't have device drivers or filesystems or network protocols. It's actually more like a hypervisor than an OS, and its feature set is small enough that it can be exhaustively tested.

But unless your application can run on bare metal, implementing its own I/O stacks and so forth, you need to run an OS such as Linux in usermode on top of L4. In this setup, the Linux kernel is a process running alongside your application on L4, and when your application makes system calls to the Linux kernel, they are implemented via L4's very lean and clever message-passing primitives.

There are certain advantages to this setup. For example, multiple Linux instances can run on top of the single L4 instance, and because they run in usermode, if there is a kernel panic or other failure in a Linux instance, it can be restarted without affecting the other instances. So flight-critical processes can be isolated from the rest of the system or even replicated in a redundant set.
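
One common way a redundant set gets used is majority voting on the replicas' outputs. To be clear, this is my assumption about how such a set might be consumed. Nothing in this thread says how (or whether) any particular vehicle compares replica outputs, and the function below is purely illustrative.

```python
from collections import Counter


def majority_vote(outputs):
    """Return (agreed_value, indices_of_dissenting_replicas).

    Raises RuntimeError when no value has a strict majority.
    """
    counts = Counter(outputs)
    value, votes = counts.most_common(1)[0]
    if votes * 2 <= len(outputs):
        raise RuntimeError("no majority among replicas")
    dissenters = [i for i, out in enumerate(outputs) if out != value]
    return value, dissenters


if __name__ == "__main__":
    value, dissenters = majority_vote(["valve_open", "valve_open", "valve_closed"])
    print(value, dissenters)
```

The dissenting replica is the one you would restart, and the isolation described above is what keeps that restart from disturbing the others.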
JohnFornaro
Not an expert
Full Member
*****
Offline

Posts: 6905


« Reply #15 on: 05/04/2012 03:03 PM »

So, does anybody have any idea what the software bug is that's responsible for the latest launch delay?
dcporter
Full Member
*****
Offline

Posts: 506


« Reply #16 on: 05/04/2012 03:07 PM »

It sounded like they're working on tweaking the tolerances - maybe like editing a settings file rather than changing the code itself - so that Dragon doesn't think things are bad when they're well within safe margins. Could be PR spin of course, but I usually trust that Elon believes what he's saying out loud.
oldAtlas_Eguy
Full Member
*****
Offline

Posts: 1074
Location: Florida



« Reply #17 on: 05/04/2012 03:49 PM »

Let's not forget the L4 based linux versions. Those are not only RT, but the L4 kernel is demonstrated as bug free.
What do you mean by that? Surely no one is making a claim that a non-trivial software is free from bugs..

L4 is a true microkernel. It does interrupt routing, thread scheduling, (physical) memory allocation, and interprocess communication. That's it. It doesn't have device drivers or filesystems or network protocols. It's actually more like a hypervisor than an OS, and its feature set is small enough that it can be exhaustively tested.

But unless your application can run on bare metal, implementing its own I/O stacks and so forth, then you need to run an OS such a Linux in usermode on top of L4. In this setup, the Linux kernel is a process running alongside your application on L4, and when your application makes system calls to the Linux kernel, they are implemented via L4's very lean and clever message-passing primitives.

There are certain advantages to this setup. For example, multiple Linux instances can run on top of the single L4 instance, and because they run in usermode, if there is a kernel panic or other failure in a Linux instance, it can be restarted without affecting the other instances. So flight-critical processes can be isolated from the rest of the system or even replicated in a redundant set.

Yes, the Linux I/O routines are needed to support the Ethernet multibus architecture that connects the computers, sensor groups and actuator controllers. This wiring architecture is simpler and lighter, and it is at least as robust against hardware faults as other wiring architectures. It may even be better, since it has fewer components and a higher proportion of identical components, which lowers the manufacturing/design defect rate.

See the NASA computer architecture reliability study for SLS.
http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20110014793_2011015526.pdf
And there was a thread started for this as well.
http://forum.nasaspaceflight.com/index.php?topic=26552.0
Danderman
Extreme Veteran
Full Member
*****
Offline

Posts: 6984



WWW
« Reply #18 on: 05/04/2012 03:53 PM »

It sounded like they're working on tweaking the tolerances - maybe like editing a settings file rather than changing the code itself - so that Dragon doesn't think things are bad when they're well within safe margins. Could be PR spin of course, but I usually trust that Elon believes what he's saying out loud.

Without going into great detail about operating systems that SpaceX doesn't use, the fact that Elon talked about still writing code for rendezvous just 2 weeks ago does not sound good.

Robotbeat
Full Member
*****
Offline

Posts: 14559
Location: Minnesota



« Reply #19 on: 05/04/2012 04:02 PM »

It sounded like they're working on tweaking the tolerances - maybe like editing a settings file rather than changing the code itself - so that Dragon doesn't think things are bad when they're well within safe margins. Could be PR spin of course, but I usually trust that Elon believes what he's saying out loud.

Without going into great detail about operating systems that SpaceX doesn't use, the fact that Elon talked about still writing code for rendezvous just 2 weeks ago does not sound good.


[citation needed]
oldAtlas_Eguy
Full Member
*****
Offline

Posts: 1074
Location: Florida



« Reply #20 on: 05/04/2012 04:05 PM »

It sounded like they're working on tweaking the tolerances - maybe like editing a settings file rather than changing the code itself - so that Dragon doesn't think things are bad when they're well within safe margins. Could be PR spin of course, but I usually trust that Elon believes what he's saying out loud.

This one reminded me of a software design technique that requires heavy up-front design and coding but is highly robust in operations: most of the software parameters are kept in a load file so that they can be tweaked during testing without a code change. It enables fast turnaround during testing because only the parameter load file changes, not the code.

If their software designers are really good they designed the software this way.
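
A minimal sketch of the parameter-load-file idea, for anyone who hasn't worked this way. Every name and number here is invented for illustration (I have no idea what the real parameters or file format look like). The point is that the limit-checking code never changes, only the data it loads.

```python
import json

# Hypothetical parameter file contents; in practice this would live on disk.
DEFAULT_LIMITS_JSON = """
{
  "tank_pressure_kpa": {"min": 300.0, "max": 350.0},
  "valve_temp_c": {"min": -20.0, "max": 60.0}
}
"""


def load_limits(text):
    """Parse the parameter file into {name: (min, max)}."""
    raw = json.loads(text)
    return {name: (float(lim["min"]), float(lim["max"])) for name, lim in raw.items()}


def out_of_limits(readings, limits):
    """Return the sorted names of readings that fall outside their limits."""
    return sorted(
        name
        for name, value in readings.items()
        if not (limits[name][0] <= value <= limits[name][1])
    )


if __name__ == "__main__":
    limits = load_limits(DEFAULT_LIMITS_JSON)
    readings = {"tank_pressure_kpa": 355.0, "valve_temp_c": 20.0}
    print(out_of_limits(readings, limits))
```

If testing shows that 355 kPa is actually well within safe margins, you edit the "max" value in the file and reload. No recompile, and the code that was already verified stays exactly as it was.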
Chris-A
Member
Full Member
*****
Offline

Posts: 531


« Reply #21 on: 05/04/2012 04:20 PM »

Falcon uses parameter files, but Dragon is unknown.
(information from aborts)
oldAtlas_Eguy
Full Member
*****
Offline

Posts: 1074
Location: Florida



« Reply #22 on: 05/04/2012 05:26 PM »

Falcon uses parameter files, but Dragon is unknown.
(information from aborts)

One thing I know about software designers is that they tend to repeat designs that are successful. With F9 and Dragon having nearly identical hardware, except that in Dragon’s case there are 4 computers instead of F9’s 3, it is highly likely that such is the case. It would allow them to share a great deal of debugged subroutines that exist in the F9 software and would have the same exact usage in the Dragon software.

The primary difference in the software would be the top-level scheduler/AI that makes decisions on how to handle anomalies, since the 2 vehicles would have almost completely different anomaly sets. The other difference is the algorithms and parameter sets used for 6-degree-of-freedom maneuvering and control: different vehicle, different control problem.
QuantumG
Full Member
*****
Offline

Posts: 3407
Location: Australia



WWW
« Reply #23 on: 05/05/2012 01:13 AM »

Has anyone mentioned on this thread how HMXHMX was saying years ago that software is the reason why automated commercial cargo delivery is a *lot* harder than commercial crew? How many years could have been saved if the focus had been on commercial crew from the beginning?
Jorge
Full Member
*****
Offline

Posts: 6763


« Reply #24 on: 05/05/2012 01:18 AM »

Has anyone mentioned on this thread how HMXHMX was saying years ago that software is the reason why automated commercial cargo delivery is a *lot* harder than commercial crew? How many years could have been saved if the focus had been on commercial crew from the beginning?


I've noticed it, haven't mentioned it. Should now.

Yes, I agree with HMXHMX, manual prox ops and docking is much easier than an automated system, IMO. Yes, I know the Soviets developed automated systems first, with manual backup. Motivations were different.
Lurker Steve
Full Member
*****
Offline

Posts: 841


« Reply #25 on: 05/05/2012 01:53 AM »

Falcon uses parameter files, but Dragon is unknown.
(information from aborts)

One thing I know about software designers is that they tend to repeat designs that are successful. With F9 and Dragon having nearly identical hardware, except in Dragons case there is 4 computers instead of F9’s 3, it is highly likely such is the case. It would allow them to share a greate deal of debugged subroutines that exist in the F9 software and would have the same exact usage in the Dragon software.

The primary difference in the software would be the top level scheduler/AI that made decisions on how to handle anomalies since the 2 vehicles would have nearly completely different anomaly sets. The other item different is the algorithms and parameter sets used for 6 degree maneuvering and control: different vehicle - different control problem.


I would imagine that guidance / control of a vehicle that is being pushed from behind by engines that gimbal is completely different than for a capsule with small maneuvering thrusters on the side.

Plus this Dragon has a whole bunch of hardware that has never flown before. Notice that the initial concern was with the docking system code. Modularity and code reuse only get you so far, and simulations contain all sorts of assumptions about how the hardware should work. Now it's time to make sure the code works with the capsule as built. Especially when they have never built one like this before.
Chris-A
Member
Full Member
*****
Offline

Posts: 531


« Reply #26 on: 05/05/2012 01:54 AM »

I guess we have an answer, C++ on GNU/Linux.
http://twitter.com/elonmusk/status/198579161382649857
Jorge
Full Member
*****
Offline

Posts: 6763


« Reply #27 on: 05/05/2012 02:00 AM »

Falcon uses parameter files, but Dragon is unknown.
(information from aborts)

One thing I know about software designers is that they tend to repeat designs that are successful. With F9 and Dragon having nearly identical hardware, except in Dragons case there is 4 computers instead of F9’s 3, it is highly likely such is the case. It would allow them to share a greate deal of debugged subroutines that exist in the F9 software and would have the same exact usage in the Dragon software.

The primary difference in the software would be the top level scheduler/AI that made decisions on how to handle anomalies since the 2 vehicles would have nearly completely different anomaly sets. The other item different is the algorithms and parameter sets used for 6 degree maneuvering and control: different vehicle - different control problem.


I would imagine that guidance / control of a vehicle that is being pushed from behind by engines that gimbal is completely different than a capsule with small manuvering thrusters on the side.

Plus this Dragon has a whole bunch of hardware that has never flown before. Notice that the initial concern was with the docking system code. Modularity and code reuse only get you so far, and simulations contain all sorts of assumptions about how the hardware should work. Now it's time to make sure the code works with the capsule as built. Especially when they have never built one like this before.


I believe that's correct, that most of the rendezvous sensors did not fly on C1.
RocketJack
Full Member
**
Offline

Posts: 41
Location: Virginia


« Reply #28 on: 05/05/2012 02:44 AM »

Falcon uses parameter files, but Dragon is unknown.
(information from aborts)

One thing I know about software designers is that they tend to repeat designs that are successful. With F9 and Dragon having nearly identical hardware, except in Dragons case there is 4 computers instead of F9’s 3, it is highly likely such is the case. It would allow them to share a greate deal of debugged subroutines that exist in the F9 software and would have the same exact usage in the Dragon software.

The primary difference in the software would be the top level scheduler/AI that made decisions on how to handle anomalies since the 2 vehicles would have nearly completely different anomaly sets. The other item different is the algorithms and parameter sets used for 6 degree maneuvering and control: different vehicle - different control problem.


I would imagine that guidance / control of a vehicle that is being pushed from behind by engines that gimbal is completely different than a capsule with small manuvering thrusters on the side.

Plus this Dragon has a whole bunch of hardware that has never flown before. Notice that the initial concern was with the docking system code. Modularity and code reuse only get you so far, and simulations contain all sorts of assumptions about how the hardware should work. Now it's time to make sure the code works with the capsule as built. Especially when they have never built one like this before.


I believe that's correct, that most of the rendezvous sensors did not fly on C1.

DragonEye did fly on STS-127 and STS-133. Results are probably proprietary but there was a late push to add it to 133. Knocked Neptec off the flight I believe.
Jorge
Full Member
*****
Offline

Posts: 6763


« Reply #29 on: 05/05/2012 03:06 AM »

Falcon uses parameter files, but Dragon is unknown.
(information from aborts)

One thing I know about software designers is that they tend to repeat designs that are successful. With F9 and Dragon having nearly identical hardware, except in Dragons case there is 4 computers instead of F9’s 3, it is highly likely such is the case. It would allow them to share a greate deal of debugged subroutines that exist in the F9 software and would have the same exact usage in the Dragon software.

The primary difference in the software would be the top level scheduler/AI that made decisions on how to handle anomalies since the 2 vehicles would have nearly completely different anomaly sets. The other item different is the algorithms and parameter sets used for 6 degree maneuvering and control: different vehicle - different control problem.


I would imagine that guidance / control of a vehicle that is being pushed from behind by engines that gimbal is completely different than a capsule with small manuvering thrusters on the side.

Plus this Dragon has a whole bunch of hardware that has never flown before. Notice that the initial concern was with the docking system code. Modularity and code reuse only get you so far, and simulations contain all sorts of assumptions about how the hardware should work. Now it's time to make sure the code works with the capsule as built. Especially when they have never built one like this before.


I believe that's correct, that most of the rendezvous sensors did not fly on C1.

DragonEye did fly on STS-127 and STS-133. Results are probably proprietary but there was a late push to add it to 133. Knocked Neptec off the flight I believe.

Right. More importantly, it flew with its own data recording system, not anything resembling the current Dragon software.
HMXHMX
Full Member
*****
Offline

Posts: 865


« Reply #30 on: 05/05/2012 04:39 AM »

Has anyone mentioned on this thread how HMXHMX was saying years ago that software is the reason why automated commercial cargo delivery is a *lot* harder than commercial crew? How many years could have been saved if the focus had been on commercial crew from the beginning?


I've noticed it, haven't mentioned it. Should now.

Yes, I agree with HMXHMX, manual prox ops and docking is much easier than an automated system, IMO. Yes, I know the Soviets developed automated systems first, with manual backup. Motivations were different.

I've been watching this whole story unfold for the past several years, and have been largely biting my tongue in an attempt to avoid an "I told you so" moment.  But I am only human...

I know that sometimes it appears I'm just moving my lips for effect, but there was a method to my madness back in 2004, when I urged that the "program that would become COTS" not do cargo first, but crew, and should not go to ISS as a visiting vehicle but demonstrate a rendezvous and dock with the upper stage of the LV that flew it to orbit.

Commercial operations could then have commenced with crew and carrying light cargo to ISS, while an automated docking system was developed and qualified in parallel with routine crewed flights.  We'd be much further ahead today and the current ridiculous debate in Congress would have been rendered moot.

Thanks to those who noticed.
Antares
ABO
Full Member
*****
Offline

Posts: 4608
Location: Done arguing with amateurs


« Reply #31 on: 05/05/2012 02:16 PM »

NASA Safety was adamant that humans not be on board.  All the sense in the world can be erased by one civil servant with an opinion.
A_M_Swallow
Elite Veteran
Full Member
*****
Offline

Posts: 5562
Location: South coast of England


« Reply #32 on: 05/05/2012 08:04 PM »

{snip}
Commercial operations could then have commenced with crew and carrying light cargo to ISS, while an automated docking system was developed and qualified in parallel with routine crewed flights.  We'd be much further ahead today and the current ridiculous debate in Congress would have been rendered moot.

Thanks to those who noticed.

It is worse than that - berthing is a manual docking system.  The arm operator in ISS performs the docking.
savuporo
Full Member
*****
Offline

Posts: 1893


« Reply #33 on: 05/05/2012 08:19 PM »

Quote
SpaceX and NASA are nearing completion of the software assurance process, ...Thus far, no issues have been uncovered during this process,
Right. But the seal of approval will make it safer for sure.
watermod
Full Member
***
Offline

Posts: 56


« Reply #34 on: 05/08/2012 02:11 AM »

For those in the know the following question:  Has SpaceX written their software in such a manner that all sensor data and events for a flight are recorded at a fine grain (with timestamps) in such a way that they can be data-mined at a later date to correlate whatever or drive software tests?
 
beancounter
Full Member
*****
Offline

Posts: 757
Location: Perth, Western Australia



« Reply #35 on: 05/08/2012 02:16 AM »

Quote
SpaceX and NASA are nearing completion of the software assurance process, ...Thus far, no issues have been uncovered during this process,
Right. But the seal of approval will make it safer for sure.
Why?  Both could overlook the same thing.  Just because more than one looks over design specs or testing doesn't necessarily mean it's 'safer', just that it's had 2 different reviewers.  It wouldn't necessarily identify an underlying problem if the testing doesn't expose it or isn't designed to test for that issue, set of circumstances, etc.
LegendCJS
Full Member
****
Online

Posts: 469
Location: Boston, MA


« Reply #36 on: 05/08/2012 02:27 AM »

Quote
SpaceX and NASA are nearing completion of the software assurance process, ...Thus far, no issues have been uncovered during this process,
Right. But the seal of approval will make it safer for sure.
Why?  Both could overlook the same thing.  Just because more than one looks over design specs or testing doesn't necessarily mean it's 'safer', just that it's had 2 different reviewers.  It wouldn't necessarily identify an underlying problem if the testing doesn't expose it or isn't designed to test for that issue, set of circumstances, etc.

Savuporo's remark sounded like sarcasm to me...  I think he agrees with you completely.
epistefiend
Full Member
**
Offline

Posts: 8


« Reply #37 on: 05/08/2012 03:38 AM »

For those in the know the following question:  Has SpaceX written their software in such a manner that all sensor data and events for a flight are recorded at a fine grain (with timestamps) in such a way that they can be data-mined at a later date to correlate whatever or drive software tests?
 
Yes.
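
Nothing in this thread says how SpaceX actually stores or mines its flight data, so the following is only a minimal sketch of the general idea in the question: write every sample with a timestamp, then replay the log later to answer questions offline or to feed a test. The channel names, values, and the JSON-lines format are all hypothetical, chosen purely for illustration.

```python
import io
import json

def record(log, t, channel, value):
    """Append one timestamped sample to the log as a JSON line."""
    log.write(json.dumps({"t": t, "channel": channel, "value": value}) + "\n")

def replay(log):
    """Yield recorded samples in the order they were written."""
    log.seek(0)
    for line in log:
        yield json.loads(line)

def times_over_limit(samples, channel, limit):
    """Example offline query: at which timestamps did a channel exceed a limit?"""
    return [s["t"] for s in samples
            if s["channel"] == channel and s["value"] > limit]

# An in-memory buffer stands in for a real log file (hypothetical data).
flight_log = io.StringIO()
record(flight_log, 0.00, "tank_pressure_kpa", 310.0)
record(flight_log, 0.02, "thruster_3_cmd", 1)
record(flight_log, 0.04, "tank_pressure_kpa", 352.5)
record(flight_log, 0.06, "tank_pressure_kpa", 349.0)

exceedances = times_over_limit(replay(flight_log), "tank_pressure_kpa", 350.0)
print(exceedances)
```

The same replay generator could just as well drive a software test, by feeding the recorded samples into the code under test instead of into a query.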
rds100
Full Member
**
Online

Posts: 35


« Reply #38 on: 05/08/2012 04:14 AM »

And why writing software for automatic docking is so hard? If it could be done 30-40 years ago with the ancient computers of that time i don't see why it should be so hard to be done today.
QuantumG
Full Member
*****
Offline

Posts: 3407
Location: Australia



WWW
« Reply #39 on: 05/08/2012 04:18 AM »

And why writing software for automatic docking is so hard? If it could be done 30-40 years ago with the ancient computers of that time i don't see why it should be so hard to be done today.

Perhaps if we knew what you were talking about...
RDoc
Full Member
***
Offline

Posts: 208


« Reply #40 on: 05/08/2012 04:53 AM »

Quote
SpaceX and NASA are nearing completion of the software assurance process, ...Thus far, no issues have been uncovered during this process,
Right. But the seal of approval will make it safer for sure.
Why?  Both could overlook the same thing.  Just because more than one looks over design specs or testing doesn't necessarily mean it's 'safer', just that it's had 2 different reviewers.  It wouldn't necessarily identify an underlying problem if the testing doesn't expose it or isn't designed to test for that issue, set of circumstances, etc.
I don't know if they actually do meaningful code reviews, but real code reading is amazingly effective at finding obscure bugs. OTOH, pro forma documentation and test evaluations are much less useful.
ChefPat
Full Member
*****
Offline

Posts: 721
Location: Earth, for now



« Reply #41 on: 05/08/2012 12:50 PM »

And why writing software for automatic docking is so hard? If it could be done 30-40 years ago with the ancient computers of that time i don't see why it should be so hard to be done today.
Maybe you should run on over there & tell'em how it's done? ::)
Airlock
Full Member
**
Offline

Posts: 44
Location: Orlando, FL



« Reply #42 on: 05/08/2012 01:09 PM »

And why writing software for automatic docking is so hard? If it could be done 30-40 years ago with the ancient computers of that time i don't see why it should be so hard to be done today.


Last time I checked, getting a fully robotic spacecraft to fly in formation with an orbiting space station while going 17,500 mph is not easy. Also, it's not like you can find a solid ISS berthing API up on sourceforge...
Herb Schaltegger
I used to be a rocket scientist
Full Member
*****
Online

Posts: 843
Location: Murfreesboro TN



« Reply #43 on: 05/08/2012 01:13 PM »

And why writing software for automatic docking is so hard? If it could be done 30-40 years ago with the ancient computers of that time i don't see why it should be so hard to be done today.


Yeah, automated systems are so reliable . . .

http://spaceflight.nasa.gov/history/shuttle-mir/history/h-f-foale-collision.htm
Lurker Steve
Full Member
*****
Offline

Posts: 841


« Reply #44 on: 05/08/2012 01:36 PM »

Quote
SpaceX and NASA are nearing completion of the software assurance process, ...Thus far, no issues have been uncovered during this process,
Right. But the seal of approval will make it safer for sure.
Why?  Both could overlook the same thing.  Just because more than one looks over design specs or testing doesn't necessarily mean it's 'safer', just that it's had 2 different reviewers.  It wouldn't necessarily identify an underlying problem if the testing doesn't expose it or isn't designed to test for that issue, set of circumstances, etc.

Having an extra set (or two or three) of eyes reviewing your work is always a good thing. Often the people who are too close to the design make assumptions about how a particular piece of code will work, as opposed to how it actually will work. An independent reviewer can look at the code without making any prior assumptions, and is able to verify that the code module appears to perform its designed function, and can gracefully handle any invalid inputs from external code or hardware modules.
woods170
IRAS fan
Full Member
*****
Offline

Posts: 3074
Location: The Netherlands


IRAS fan


« Reply #45 on: 05/08/2012 03:04 PM »

And why writing software for automatic docking is so hard? If it could be done 30-40 years ago with the ancient computers of that time i don't see why it should be so hard to be done today.


Yeah, automated systems are so reliable . . .

http://spaceflight.nasa.gov/history/shuttle-mir/history/h-f-foale-collision.htm

Wrong example. The automated systems on that Progress vehicle were shut down. It was a manual docking attempt. They tried to dock Progress thru tele-operation.
Danderman
Extreme Veteran
Full Member
*****
Offline

Posts: 6984



WWW
« Reply #46 on: 05/08/2012 03:14 PM »

And why writing software for automatic docking is so hard? If it could be done 30-40 years ago with the ancient computers of that time i don't see why it should be so hard to be done today.


SpaceX is not attempting a docking, they are attempting berthing, which requires a rendezvous with an empty point in space, using the nearby ISS as a reference point. This maneuver is really, really hard.
IRobot
Full Member
****
Offline

Posts: 462
Location: Portugal


« Reply #47 on: 05/08/2012 04:11 PM »

And why writing software for automatic docking is so hard? If it could be done 30-40 years ago with the ancient computers of that time i don't see why it should be so hard to be done today.


SpaceX is not attempting a docking, they are attempting berthing, which requires a rendezvous with an empty point in space, using the nearby ISS as a reference point. This maneuver is really, really hard.
Sorry, I'm not convinced it is that hard. For me hard is to have an autonomous car driving in the streets, like Google and other companies have been testing.
To have something comparable would be doing it without telemetry, just radar, cameras, GPS and IR sensors, while EVA's, Progress and ATV's crossing Dragon's path constantly, without having stars for reference, with rapid changes in light conditions and searching for traffic lights and crosswalks at the same time.

The only part where it is harder is the inertia, but that's minor compared to the rest.
docmordrid
Full Member
*****
Offline

Posts: 1776
Location: Michigan



« Reply #48 on: 05/08/2012 04:20 PM »

Driving a car is a 2D operation. Flying spacecraft is 3D.
Herb Schaltegger
I used to be a rocket scientist
Full Member
*****
Online

Posts: 843
Location: Murfreesboro TN



« Reply #49 on: 05/08/2012 04:24 PM »

And why writing software for automatic docking is so hard? If it could be done 30-40 years ago with the ancient computers of that time i don't see why it should be so hard to be done today.


Yeah, automated systems are so reliable . . .

http://spaceflight.nasa.gov/history/shuttle-mir/history/h-f-foale-collision.htm

Wrong example. The automated systems on that Progress vehicle were shut down. It was a manual docking attempt. They tried to dock Progress thru tele-operation.

And why were they testing the manual system? Think about the bigger context.  And the bigger lesson here.
Jim
Night Gator
Full Member
*****
Offline

Posts: 17592
Location: Cape Canaveral Spaceport



« Reply #50 on: 05/08/2012 04:26 PM »


Sorry, I'm not convinced it is that hard. For me hard is to have an autonomous car driving in the streets, like Google and other companies have been testing.
To have something comparable would be doing it without telemetry, just radar, cameras, GPS and IR sensors, while EVA's, Progress and ATV's crossing Dragon's path constantly, without having stars for reference, with rapid changes in light conditions and searching for traffic lights and crosswalks at the same time.

The only part where it is harder is the inertia, but that's minor compared to the rest.

wrong, that is not harder.

A. a car can remain stationary and stop.
b. They use radar, cameras, GPS and IR sensors.  Telemetry is just for status.  Commanding is just for aborts.
c.  Orbital mechanics requires more computations than cruise control on a vehicle.
savuporo
Full Member
*****
Offline

Posts: 1893


« Reply #51 on: 05/08/2012 04:31 PM »

Driving a car is a 2D operation. Flying spacecraft is 3D.
Um. No, that is very, very wrong.
There is a ton of freely available research and design papers back from Darpa Grand Challenge teams, browse some of these to get a high-level understanding.
cneth
Full Member
****
Offline

Posts: 256


« Reply #52 on: 05/08/2012 04:32 PM »

All you armchair coders crack me up.    "Should be easy", "minor", etc.    Try walking a mile in their shoes before you tell them how easy their job should be.

And FWIW, I've always found code reviews to be helpful, even on non-critical code.  It's a "best practice" for the Software industry, in general. 

Robotbeat
Full Member
*****
Offline

Posts: 14559
Location: Minnesota



« Reply #53 on: 05/08/2012 04:40 PM »

And why writing software for automatic docking is so hard? If it could be done 30-40 years ago with the ancient computers of that time i don't see why it should be so hard to be done today.


Yeah, automated systems are so reliable . . .

http://spaceflight.nasa.gov/history/shuttle-mir/history/h-f-foale-collision.htm

Wrong example. The automated systems on that Progress vehicle were shut down. It was a manual docking attempt. They tried to dock Progress thru tele-operation.

And why were they testing the manual system? Think about the bigger context.  And the bigger lesson here.
The bigger lesson is to follow proper safety procedures. There were a whole host of problems that led to the collision, and it could've been avoided at several points by following proper procedure.

Poor procedures means that you're going to have problems. If you turn off the safety systems, don't have someone else watching, have poor communication in general and poorly tested systems doing an unneeded test and ignore previous near-misses, then you're going to screw up and something terribly bad is going to happen. And no, an architectural change like going to fewer launches won't save you, it will just ensure that your inevitable screw up is more expensive.
Herb Schaltegger
I used to be a rocket scientist
Full Member
*****
Online

Posts: 843
Location: Murfreesboro TN



« Reply #54 on: 05/08/2012 04:53 PM »

And no, an architectural change like going to fewer launches won't save you (yes, I know that's what your implication is), it will just ensure that your inevitable screw up is more expensive.

I don't know where you're going with that, but there is no implication in my comments regarding ANYTHING except the supposed reliability of automated systems for complicated real-time processes. 
Thunderbird5
"How hard could it be?" TM
Full Member
***
Offline

Posts: 105
Location: London, Ol' Blighty



« Reply #55 on: 05/08/2012 05:06 PM »

Being in the software industry myself, albeit never flight or safety-critical systems, I have learnt to never underestimate the "minor" elements.

Some years ago, I remember watching a programme following the design and development of the, then new, Boeing 777. When it came time to flight test the new engines, there was some discussion as to whether it was necessary as it was felt that the latest computer modelling and simulation being used was sufficient assurance. The decision was to flight test the engine (on a 767 I believe) and assuming it worked as predicted, they could rely far more on just software modelling.

Needless to say, on the day of the test, the plane got about 300 feet in the air when the engine 'back-fired' big time with a spectacular flame burst, IIRC owing to a fan blade stall caused by the unexpected deformation of the engine casing.  :o

It was a very visual lesson learned for anyone who saw it.
Robotbeat
Full Member
*****
Offline

Posts: 14559
Location: Minnesota



« Reply #56 on: 05/08/2012 05:10 PM »

And no, an architectural change like going to fewer launches won't save you (yes, I know that's what your implication is), it will just ensure that your inevitable screw up is more expensive.

I don't know where you're going with that, but there is no implication in my comments regarding ANYTHING except the supposed reliability of automated systems for complicated real-time processes. 
Okay, sorry 'bout that. But remember they completely turned off the automated system. They were trying to get rid of the automated system as a cost-saving measure, and it bit them in the butt. Moral of the story is to have defense-in-depth. Use the automated system, but have a backup in case (or use manual and have a well-tested automated system as backup... either way, you need backup).
Danderman
Extreme Veteran
Full Member
*****
Offline

Posts: 6984



WWW
« Reply #57 on: 05/08/2012 05:16 PM »

And why writing software for automatic docking is so hard? If it could be done 30-40 years ago with the ancient computers of that time i don't see why it should be so hard to be done today.


SpaceX is not attempting a docking, they are attempting berthing, which requires a rendezvous with an empty point in space, using the nearby ISS as a reference point. This maneuver is really, really hard.

Sorry, I'm not convinced it is that hard.
  :o :o {snip}
The only part where it is harder is the inertia, but that's minor compared to the rest.

Wow. This post wins some sort of record.

I am even sorrier that it is so difficult to generate the automated rendezvous system for berthing; the Japanese lost 6 years in schedule slippage, partially over this issue, in developing the HTV.  And that was a national space agency. Elon is doing it with just a single company.

kevin-rf
Elite Veteran
Full Member
*****
Offline

Posts: 5314
Location: Next door to Mary's little Lamb


« Reply #58 on: 05/08/2012 05:35 PM »

Just a question... If 2D driving is so easy, why doesn't my car drive me to work?

Toyota demonstrated 2D parallel parking on the Prius some 10 years ago, and it is just now coming to market. The self driving car is not on the market, it only exists in research and challenges. Just like the X prize(s) for space flight.

Pretty impressive that SpaceX has managed to get from zero to a point of being able to test in 10 years. Anyone remember DART?
rklaehn
telemetry plumber
Full Member
*****
Offline

Posts: 1052
Location: germany


WWW
« Reply #59 on: 05/08/2012 05:54 PM »

Just a question... If 2D driving is so easy, why doesn't my car drive me to work?

You can probably thank our litigious society for that.

Even if an autopilot for a car would be ten times as safe as a human driver, there would still be accidents. And after each accident there would be a huge number of lawyers trying to sue the car company for billions.

It's much safer for the company from a legal point of view to offer dozens of driving assistants, but to leave the ultimate responsibility with the driver.

(by the way: I am not saying that writing an autopilot for a car is easy. It is a hard problem. But a lot of research has been done in this area)
savuporo
Full Member
*****
Offline

Posts: 1893


« Reply #60 on: 05/08/2012 07:11 PM »

One thing that baffles me, is how could the duration of code review or software QA be so unpredictable ?

For code review, you have a number of modules and lines, types of analysis and static testing you might do, and it's pretty easily countable in total hours.

For QA, you have your test plan to go through, with predictable cycle time. UNLESS you find bugs, need to improve your testing matrix, and restart the entire cycle. But they claim that "no issues have been found".

So i am puzzled.
Danderman
Extreme Veteran
Full Member
*****
Offline

Posts: 6984



WWW
« Reply #61 on: 05/08/2012 07:14 PM »

The problem with 2D automated driving is the environment is uncooperative; in the world of spacecraft, you always want a cooperative target like ISS.

AFAIK, there has never been an automated rendezvous with an uncooperative target.

Getting back to the real world, Elon is finding out that rendezvous with a point in space, albeit one near a cooperative target, is not that easy.

I should also mention that Soyuz and Progress perform rendezvous with a point in space, in the sense that Kurs guides these spacecraft to a general area in space, and then loose stationkeeping is performed, prior to the final dash into the docking apparatus.
Danderman
Extreme Veteran
Full Member
*****
Offline

Posts: 6984



WWW
« Reply #62 on: 05/08/2012 07:15 PM »

One thing that baffles me, is how could the duration of code review or software QA be so unpredictable ?

For code review, you have a number of modules and lines, types of analysis and static testing you might do, and it's pretty easily countable in total hours.

For QA, you have your test plan to go through, with predictable cycle time. UNLESS you find bugs, need to improve your testing matrix, and restart the entire cycle. But they claim that "no issues have been found".

So i am puzzled.

"Claim" is the operative word.

Another answer may be that during simulation, superior algorithms may emerge, requiring another round of coding.
starsilk
Full Member
****
Offline

Posts: 336
Location: Denver



« Reply #63 on: 05/08/2012 07:18 PM »

One thing that baffles me, is how could the duration of code review or software QA be so unpredictable ?

For code review, you have a number of modules and lines, types of analysis and static testing you might do, and it's pretty easily countable in total hours.

For QA, you have your test plan to go through, with predictable cycle time. UNLESS you find bugs, need to improve your testing matrix, and restart the entire cycle. But they claim that "no issues have been found".

So i am puzzled.

easy. the scope of the test plan is expanding. every test they do, someone says "but what if..." and they have to add a new set of variables. so far, none of the extra tests have failed (which is good news, of course).

if this was non-critical software, they would have reached the point quite some time ago where they would have declared it 'good enough', with all the core functionality tested, and shipped it. but since this is critical software, with lives and also a great deal of money on the line, they can't do that. instead, every 'what if' has to be tested, and they fervently hope they haven't missed any.
jongoff
Rocket Plumber
Full Member
*****
Offline

Posts: 3905
Location: Louisville, CO


WWW
« Reply #64 on: 05/08/2012 07:26 PM »

Driving a car is a 2D operation. Flying spacecraft is 3D.

I'm not convinced. I think the complexity comes in all of the abort modes they need to deal with, and the software/software and software/hardware interfaces, not with the underlying GN&C math of stationkeeping near another system. The relative velocities involved are very slow, and most of the algorithms they're using are probably older than I am. But I'm not a GN&C engineer, maybe IIIan would tell me differently.

~Jon
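
To give a feel for the kind of decades-old relative-motion math being referred to, here is a small textbook sketch of the closed-form Clohessy-Wiltshire solution (published in 1960) for motion near a target in a circular orbit. Nothing in the thread says which algorithms Dragon actually uses; the function name, the 92.7-minute orbit period, and the 250 m and 10 m offsets are assumptions picked only for illustration.

```python
import math

def cw_propagate(state, n, t):
    """Closed-form Clohessy-Wiltshire solution for unforced relative motion.

    state = (x, y, z, vx, vy, vz) relative to the target, in metres and m/s:
    x is radial (up), y is along-track, z is cross-track.
    n is the target's mean motion in rad/s, t is elapsed time in seconds.
    """
    x0, y0, z0, vx0, vy0, vz0 = state
    s, c = math.sin(n * t), math.cos(n * t)
    x = (4 - 3 * c) * x0 + (s / n) * vx0 + (2 / n) * (1 - c) * vy0
    y = (6 * (s - n * t) * x0 + y0
         - (2 / n) * (1 - c) * vx0 + ((4 * s - 3 * n * t) / n) * vy0)
    z = c * z0 + (s / n) * vz0
    vx = 3 * n * s * x0 + c * vx0 + 2 * s * vy0
    vy = -6 * n * (1 - c) * x0 - 2 * s * vx0 + (4 * c - 3) * vy0
    vz = -n * s * z0 + c * vz0
    return (x, y, z, vx, vy, vz)

ORBIT_PERIOD_S = 92.7 * 60            # assumed ISS-like orbit period
N = 2 * math.pi / ORBIT_PERIOD_S      # mean motion, rad/s

# Parked 250 m behind the target on the same orbit, no relative velocity.
hold_behind = cw_propagate((0.0, -250.0, 0.0, 0.0, 0.0, 0.0), N, ORBIT_PERIOD_S)

# Parked 10 m above the target, no relative velocity.
hold_above = cw_propagate((10.0, 0.0, 0.0, 0.0, 0.0, 0.0), N, ORBIT_PERIOD_S)

print(hold_behind[:2])
print(hold_above[:2])
```

In this model a vehicle sitting directly behind the target stays put, while one sitting just 10 m above it drifts back by 12*pi times that offset every orbit unless thrusters correct it. The math itself is compact, which is consistent with the point that the hard part lies in the abort modes and interfaces around it.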
Jorge
Full Member
*****
Offline

Posts: 6763


« Reply #65 on: 05/08/2012 07:27 PM »

One thing that baffles me, is how could the duration of code review or software QA be so unpredictable ?

For code review, you have a number of modules and lines, types of analysis and static testing you might do, and it's pretty easily countable in total hours.

For QA, you have your test plan to go through, with predictable cycle time. UNLESS you find bugs, need to improve your testing matrix, and restart the entire cycle. But they claim that "no issues have been found".

So i am puzzled.

"Claim" is the operative word.

Another answer may be that during simulation, superior algorithms may emerge, requiring another round of coding.

If they're still implementing algorithms at this stage, they are doing it wrong.

At this stage, the most likely causes for testing delays are interfaces (software-software and software-hardware, especially if the hardware is also being tweaked due to test results), and gracefully handling exceptions (including those caused by hardware failure).
jongoff
Rocket Plumber
Full Member
*****
Offline

Posts: 3905
Location: Louisville, CO


WWW
« Reply #66 on: 05/08/2012 07:32 PM »

AFAIK, there has never been an automated rendezvous with an uncooperative target.

XSS-11.

http://www.kirtland.af.mil/shared/media/document/AFD-070404-108.pdf

Now, I think that *docking/capture* of a totally non-cooperative target hasn't been done yet, but we're working on that...

~Jon
Antares
ABO
Full Member
*****
Offline

Posts: 4608
Location: Done arguing with amateurs


« Reply #67 on: 05/08/2012 07:54 PM »

All you armchair coders crack me up.    "Should be easy", "minor", etc.    Try walking a mile in their shoes before you tell them how easy their job should be.

+10
mmeijeri
Full Member
*****
Offline

Posts: 7067
Location: NL


Martijn Meijering


« Reply #68 on: 05/08/2012 08:03 PM »

All you armchair coders crack me up.    "Should be easy", "minor", etc.    Try walking a mile in their shoes before you tell them how easy their job should be.

You're obviously not a brogrammer, this stuff is easy.  ;D

The Ultimate Guide To Learning Brogramming -- The Hard Way

http://www.youtube.com/v/Qi_AAqi0RZM&rel=1
psloss
Veteran armchair spectator
Full Member
*****
Offline

Posts: 15152


« Reply #69 on: 05/08/2012 08:11 PM »

Software was recognized as easy a long time ago (back when the mauve database first came out):
http://dilbert.com/strips/comic/1994-10-17/
savuporo
Full Member
*****
Offline

Posts: 1893


« Reply #70 on: 05/08/2012 08:58 PM »

AFAIK, there has never been an automated rendezvous with an uncooperative target.
We had a thread about this. The DLR DEOS mission, after having its tech proven on DLR/SSC PRISMA, intends to do exactly that.
Source1 Source2.

savuporo
Full Member
*****
Offline

Posts: 1893


« Reply #71 on: 05/08/2012 09:05 PM »

easy. the scope of the test plan is expanding.
If that would be happening in my project with such stakes, my head of software QA would be fired by now along with half of his team.
Normally your test plans should be 99% final way before the software is complete.
Kabloona
Full Member
****
Offline

Posts: 384
Location: Fortress of Solitude

Velocitas Eradico


« Reply #72 on: 05/08/2012 10:00 PM »

Driving a car is a 2D operation. Flying spacecraft is 3D.

3D with 6 degrees of freedom...but anyway, as has been pointed out earlier, the GN&C algorithms for approaching ISS have already been solved by others.

What hasn't been solved yet, unfortunately, is how to keep a major software development program on schedule and budget.

savuporo
Full Member
*****
Offline

Posts: 1893


« Reply #73 on: 05/08/2012 10:10 PM »

What hasn't been solved yet, unfortunately, is how to keep a major software development program on schedule and budget.
Doesn't seem like this is the problem SpaceX is facing. Again, they claim their SW is complete and tests have not uncovered any issues. I'd pay to see a documentary that tells the actual story here.
starsilk
Full Member
****
Offline

Posts: 336
Location: Denver



« Reply #74 on: 05/08/2012 11:46 PM »

easy. the scope of the test plan is expanding.
If that would be happening in my project with such stakes, my head of software QA would be fired by now along with half of his team.
Normally your test plans should be 99% final way before the software is complete.

firing your QA department just before a release is an *awesome* way to stay on schedule. unfortunately it does not tend to improve the quality of the product.

Quote
no plan of battle survives first contact with the enemy

here's the problem. they are testing for something that has never been done before (or at least, something they have never done before). coming up with a test plan in that situation is like throwing darts at a wall. they have to refine the plan as they discover the areas that are not being tested, and that is bound to lead to schedule creep.
A_M_Swallow
Elite Veteran
Full Member
*****
Offline

Posts: 5562
Location: South coast of England


« Reply #75 on: 05/09/2012 01:01 AM »

One problem from encryption theory.  If your variable is 128 bits long and you have 128+ 'IF' statements it is trivial to write a program whose testing time exceeds the life expectancy of the universe.

The practical problem then becomes determining what you do not test.
Kabloona
Full Member
****
Offline

Posts: 384
Location: Fortress of Solitude

Velocitas Eradico


« Reply #76 on: 05/09/2012 01:16 AM »

What hasn't been solved yet, unfortunately, is how to keep a major software development program on schedule and budget.
Doesn't seem like this is the problem SpaceX is facing. Again, they claim their SW is complete and tests have not uncovered any issues. I'd pay to see a documentary that tells the actual story here.

Well, the SW is probably complete NOW, but they apparently had to make some changes following a January sim.

The best summary of the issue that I've seen was a Feb 3 Spaceflight Now article by Stephen Clark. Since SW is the main topic of discussion now, I thought it might be helpful to extract some nuggets from that article and reprint them here. These nuggets are not in the order that they appear in the article; I've resequenced them in an order I think gives the story a bit more coherence. I'll quote from three people:
Stephen Clark, who wrote the article's text; Mike Suffredini as quoted by Clark in the article, and Musk as quoted by Clark in the article. So here goes:

............Excerpts from February 3 Spaceflight Now article by Stephen Clark........

CLARK: "The software issues were noticed during a sim last month (i.e. January). SpaceX delivered the bulk of the SW to NASA in November, allowing the agency to complete an exhaustive series of analyses in December. No significant problems arose, but a sim in mid-January exposed some concerns with Dragon's real-time operations tools."

SUFFREDINI: "There are a few changes being made to SW that has to go through stage testing...enhancing their tools for real-time operations is very important...they've had a number of SW modifications...none of them are very critical, but all of them are important....all of them we agreed to get done and get the regression testing done."

MUSK: "critical path is verification of systems/failure response matrix...we need to make sure that the failover sequences work correctly in all scenarios."

CLARK: "Describing an 'insane amount of testing,' Musk said a sizeable chunk of the work in the weeks ahead will wring out the capsule's fault-tolerance capabilities...Five simulations are planned between now and launch to test the robustness of the fixed software, and regression testing to check for SW errors should be complete by the end of Feb, said Suffredini."

..........End of excerpts from article...............................................................

So it seems fairly clear that NASA identified some deficiencies in mid-January. SpaceX agreed to make the changes, and then they had to go through a huge process of verifying the changes via their "systems/failure response matrix."

No surprise that verifying the "fixed" software can handle every conceivable failure or combination of failures, in every conceivable scenario/vehicle configuration, would take a few months.
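
To illustrate why such a matrix grows so quickly, here is a rough sketch that enumerates every combination of up to two simultaneous failures in every mission phase. The component names and phases are invented for the example and are not taken from the article or from SpaceX; a real matrix would also distinguish failure modes, timing, and vehicle configuration.

```python
from itertools import combinations

def failure_cases(components, phases, max_simultaneous):
    """Yield (phase, failed_components) for 0..max_simultaneous failures."""
    for phase in phases:
        for k in range(max_simultaneous + 1):
            for failed in combinations(components, k):
                yield phase, failed

# Hypothetical components and mission phases, for illustration only.
components = ["flight_computer_a", "flight_computer_b", "lidar",
              "thermal_imager", "gps_receiver", "thruster_quad_1"]
phases = ["far_approach", "hold_point", "final_approach", "capture"]

cases = list(failure_cases(components, phases, max_simultaneous=2))
print(len(cases))
```

With only six components and four phases there are already dozens of cases to simulate, and each software change means re-running the affected ones as regression tests.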
beancounter
Full Member
*****
Offline

Posts: 757
Location: Perth, Western Australia



« Reply #77 on: 05/09/2012 01:30 PM »

What does '... none of them are very critical' translate as? If they're not critical, why do them? Otherwise, just call them required and leave it at that. Have I missed something in the translation?
starsilk
Full Member
****
Offline

Posts: 336
Location: Denver



« Reply #78 on: 05/09/2012 03:54 PM »

What does '... none of them are very critical' translate as? If they're not critical, why do them? Otherwise, just call them required and leave it at that. Have I missed something in the translation?

I'd bet they are examples of 'Dragon runs away' problems, where it's not an aggressive enough driver to get the job done. Not 'critical' in that it would not endanger the ISS, but a problem because it may not complete the mission.
MikeAtkinson
Full Member
*****
Offline

Posts: 645
Location: Bracknell, England


« Reply #79 on: 05/09/2012 08:44 PM »

I think that the Clark and Suffredini quotes are not about Dragon flight software itself. Instead they seem to refer to tools used on the ground to control Dragon or analyse data from Dragon.

There is probably a couple of orders of magnitude more control software and similar on the ground, which is important to mission success. It is likely that this has not been tested so thoroughly. That, together with the larger number of lines of code and likely changing requirements, is where I would bet any problems lie.
miguel
Full Member
**
Offline

Posts: 7


« Reply #80 on: 05/09/2012 09:37 PM »

Driving a car is a 2D operation. Flying spacecraft is 3D.

I'm not convinced. I think the complexity comes in all of the abort modes they need to deal with, and the software/software and software/hardware interfaces, not with the underlying GN&C math of stationkeeping near another system. The relative velocities involved are very slow, and most of the algorithms they're using are probably older than I am. But I'm not a GN&C engineer, maybe IIIan would tell me differently.

~Jon

Apart from many other differences, probably the most important is the difficulty of testing the system in the field. No matter what you do with cranes, helicopters and so on, it's not the same as the real thing. With cars it's easier: you can try, improve and try again, and scale the environment from an empty parking lot to off-road or to traffic.

Miguel
rcoppola
Full Member
****
Offline

Posts: 454
Location: USA



« Reply #81 on: 05/09/2012 09:42 PM »

From that article and others including interviews I have seen and heard with Musk, the biggest challenge has been the triple redundant failure response matrix.

I can't imagine the testing of 3 redundant strings that need to waterfall over if any one goes down, but without missing a beat.

Those must be some serious QA scenarios they are running. Love it!
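
For anyone wondering what "waterfalling over" might look like in the simplest possible form, here is a toy majority voter over three strings. This is only a sketch under my own assumptions (string names, commands as plain text, a strict majority rule); it is not a description of how Dragon actually does it.

```python
from collections import Counter


def vote(outputs):
    """Return the strict-majority value among outputs, or None if there is none."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def run_cycle(commands, alive):
    """Vote over the commands of the strings that are still alive.

    commands: dict mapping a (hypothetical) string name to its command.
    alive: set of string names that have not dropped out.
    """
    return vote([commands[name] for name in sorted(alive)])


commands = {"A": "fire", "B": "fire", "C": "hold"}
print(run_cycle(commands, {"A", "B", "C"}))  # all three strings voting
print(run_cycle(commands, {"A", "C"}))       # one string lost, survivors disagree
```

The second call is the interesting one for QA: once a string is lost and the remaining two disagree, a plain majority vote has no answer, so something else has to decide what happens next.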



oldAtlas_Eguy
Full Member
*****
Offline

Posts: 1074
Location: Florida



« Reply #82 on: 05/10/2012 06:06 PM »

I was thinking about what would be the easiest, cheapest and most reliable (from the standpoint of crew safety) to evolve from cargo dragon software / hardware implementation to the DragonRider implementation.

The first thing was to make the least amount of software changes to the GN&C software and hardware.

The second would be to isolate the user interface and the non-critical, non-real-time support applications from the GN&C system hardware and software.  This isolation could be done by adding three additional computers running a GUI on a general-purpose OS such as Ubuntu, which would primarily display the massaged telemetry output from the GN&C and could also run support applications. If the GN&C computers have a lot of spare CPU capacity, the GUI could instead run in an OS partition that isolates it from the flight-critical GN&C functions. The manual controls critical for flight would be connected directly to the GN&C computers/software, bypassing the GUI systems/software.

By adding three additional computers (redundant, but if one fails it can simply be rebooted, because these are not flight critical), they can be configured to carry much of the same software that was developed to display Dragon flight data and perform the support functions that are performed in the Dragon command center in Hawthorne. The software would be upgraded and more integrated than the current version, but it would have a history of operations. Also, during some flight periods these computers can be turned off, or in the case of cargo Dragon left off completely, giving cargo Dragon and DragonRider identical hardware/software configurations for GN&C.
JohnFornaro
Not an expert
Full Member
*****
Offline

Posts: 6905


« Reply #83 on: 05/10/2012 07:35 PM »

The brogramming video was great.  I remember a Dilbert strip where Dilbert was talking with Wally:

"When I was learning to brogram, all we had was ones and zeros."

Wally:  "They let you have ones?"
miguel
Full Member
**
Offline

Posts: 7


« Reply #84 on: 05/12/2012 11:27 AM »

The first thing was to make the least amount of software changes to the GN&C software and hardware.

The second would be to isolate the user interface and the non-critical, non-real-time support applications from the GN&C system hardware and software.  This isolation could be done by adding three additional computers running a GUI on a general-purpose OS such as Ubuntu, which would primarily display the massaged telemetry output from the GN&C and could also run support applications.

First you have to ensure that the GNC designed for autonomous operations is still valid for manned ones. The difference is that, as soon as the crew has the capability to control the vehicle, they can take it to a situation that is unforeseen in the design of the automation.

For instance, suppose that you want to manually maneuver the capsule so that it can be visually inspected from the ISS, just like for the Shuttle tiles. What happens if the GUI computer fails then? Can the GNC system take the vehicle to a safe place starting from that situation? The GNC software for the cargo Dragon probably can't. Can the crew control the vehicle without the GUI?

Consequently, either the GNC software needs important modifications, or the GUI computer becomes more critical, so a regular OS like Ubuntu won't be safe to use. In the end, the crew interfaces will have more critical and less critical functions, which will be allocated to the GUI or to other means (video, physical panels, etc). A regular OS like Linux or Windows (as on the ISS laptops) is only suitable for the non-critical stuff.

Remes
Full Member
**
Offline

Posts: 20
Location: Germany


« Reply #85 on: 05/12/2012 10:00 PM »

The problem with 2D automated driving is the environment is uncooperative; in the world of spacecraft, you always want a cooperative target like ISS.

AFAIK, there has never been an automated rendezvous with an uncooperative target.

As far as I know, one object is the active one, the other one is kept as passive as possible (as the process is easier to manage). I would guess the lighter vehicle (fuel consumption) will always be the active one (accelerating/approaching/decelerating). A passive target can always be considered cooperative.

Going back to the comparison with cars: a car that crashes into something at the Grand Challenge will not destroy irreplaceable equipment worth $150 billion. If things must not go wrong, they get very complicated.

I also think that implementing the laws of rigid body motion is the smallest part, especially if everyone agrees that the smaller the distance gets, the smaller the allowed velocities become.
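
As a minimal sketch of that rule, the allowed closing speed can simply be made proportional to the remaining range, with a small floor so the vehicle can still finish the approach. The rate and floor values below are placeholders I picked for the example, not real ISS approach limits.

```python
def max_closing_speed(range_m, rate_per_s=0.001, floor_mps=0.03):
    """Allowed closing speed in m/s at a given range in m.

    Proportional to range, with a small floor; both constants are
    illustrative placeholders.
    """
    if range_m < 0:
        raise ValueError("range must be non-negative")
    return max(floor_mps, rate_per_s * range_m)


def within_envelope(range_m, closing_speed_mps):
    """True if the closing speed respects the limit at this range."""
    return closing_speed_mps <= max_closing_speed(range_m)


print(within_envelope(200.0, 0.1))  # slow enough at 200 m
print(within_envelope(200.0, 0.5))  # too fast at 200 m
```

The math really is this small; the hard part is everything that has to happen when the check fails.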

But nevertheless the typical problems remain: Optical sensors might be confused by changing light conditions or reflections. Radar-based sensors might be disturbed by other electronic equipment.  Sensors might be defective, so there is always a lot of diagnosis software involved to detect defective sensors and even more software to isolate corrupt sensor data. The same applies to actuators. Abort manoeuvres need to be planned for any point in time and in combination with different failure modes (e.g. faulty propulsion). We are talking about two systems which are docking. Both system states must converge, therefore there is a necessity for communication, which might be faulty again.
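
One textbook way to isolate a single bad sensor among redundant ones is to take the median and flag anything that strays too far from it. The sketch below assumes three range sensors with made-up names and a made-up tolerance; real diagnosis software is of course far more involved.

```python
from statistics import median


def isolate_faulty(readings, tolerance):
    """Return (voted_value, suspects).

    readings: dict mapping a (hypothetical) sensor name to its measurement.
    voted_value is the median; suspects are the sensors whose reading
    differs from the median by more than tolerance.
    """
    voted = median(readings.values())
    suspects = sorted(
        name for name, value in readings.items() if abs(value - voted) > tolerance
    )
    return voted, suspects


readings = {"range_a": 100.2, "range_b": 99.9, "range_c": 250.0}
voted, suspects = isolate_faulty(readings, tolerance=1.0)
print(voted, suspects)
```

With three sensors this tolerates one wild reading; two sensors failing the same way would fool it, which is part of why the combinations of failures matter so much.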

garidan
Full Member
**
Offline

Posts: 29


« Reply #86 on: 05/13/2012 03:59 AM »

Just one thought.
QNX was born for real-time requirements and has developed pretty good GUI libraries too.
Blackberry, which owns QNX and much more, now looks like it is going to "collapse".
I don't know how much money is involved, but QNX could become cheap: Tesla and SpaceX could benefit from having a true real-time platform that is all theirs, the same for the GUI and for the core systems.
It's not their core business, but it could be interesting to invest in it and adopt QNX, while continuing to sell it to others in the real-time and automotive markets as happens now.

http://www.qnx.com/
krytek
Full Member
*****
Offline

Posts: 527


« Reply #87 on: 08/18/2012 11:33 PM »

This page came up in a google search.
http://www.kevin.org/

Quote
What am I working on right now? Well, having finished the first generation fault-tolerant computer and networking hardware used on our Dragon spacecraft, I'm now working on the second generation of computers and networking hardware that will be used on future Falcon rockets and Dragon spacecraft.

-Kevin

dcporter
Full Member
*****
Offline

Posts: 506


« Reply #88 on: 08/18/2012 11:44 PM »

Last updated 4/24/12 (FWIW).
All content © 2005-2011 NASASpaceFlight.com