-
#1040
by
LouScheffer
on 09 Feb, 2020 14:09
-
This is Boeing's John Mulholland at 21:08 of the teleconference, talking about the bad thruster mapping in the service module:
when we went into the lab during the flight, conducted the verification test with the flight propulsion controller, it exhibited that issue, the team very quickly recoded the software, we verified it in the labs and we were able to upload that software correction...
This seemingly means that, before launch, they did NOT run the verification test with the flight propulsion controller, or they would have seen this problem?? How can this be possible, that they did not run the flight software load with the flight propulsion controller? I'd think that would be the very first requirement before uploading any software. How can anyone sign off on software that has not been tried on the most realistic tests available?
-
#1041
by
ugordan
on 09 Feb, 2020 14:20
-
This is Boeing's John Mulholland at 21:08 of the teleconference, talking about the bad thruster mapping in the service module:
when we went into the lab during the flight, conducted the verification test with the flight propulsion controller, it exhibited that issue, the team very quickly recoded the software, we verified it in the labs and we were able to upload that software correction...
This seemingly means that, before launch, they did NOT run the verification test with the flight propulsion controller, or they would have seen this problem?? How can this be possible, that they did not run the flight software load with the flight propulsion controller? I'd think that would be the very first requirement before uploading any software. How can anyone sign off on software that has not been tried on the most realistic tests available?
My impression was that once they got a "stinker" flag with respect to the MET issue they scrambled to see what other software areas were suspect. And lo and behold, they found an additional one.
Basically, implying their software verification is woefully insufficient, in the sense that they did only simplistic checks, like which sequence gets started when, but not what that sequence actually did.
Personally, I think that speaks volumes on the quality of their preflight software testing, but what do I know.
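To illustrate the distinction being made here, a minimal Python sketch (all sequence names and commands are invented for the example, not Starliner's actual software): a shallow check only verifies that the right sequence gets selected for an event, while a deep check verifies the commands inside it.

```python
# Hypothetical sketch of shallow vs. deep verification of an event
# sequencer. All sequence names and commands are invented.

SEQUENCES = {
    "orbit_insertion": ["arm_engines", "burn_40s", "safe_engines"],
    # Bug buried in the details: the valve steps are reversed.
    "sm_separation": ["close_valves", "fire_sep_thrusters", "open_valves"],
}

EXPECTED_SM_SEP = ["open_valves", "fire_sep_thrusters", "close_valves"]

def select_sequence(event):
    """Return the command list scheduled for a mission event."""
    return SEQUENCES[event]

# Shallow check: "which sequence gets started when" -- passes.
shallow_ok = select_sequence("sm_separation") is SEQUENCES["sm_separation"]

# Deep check: what the sequence actually commands, step by step -- fails,
# catching the reversed valve steps the shallow check never looked at.
deep_ok = select_sequence("sm_separation") == EXPECTED_SM_SEP

print(shallow_ok, deep_ok)  # True False
```

A verification program that only runs the first kind of check can report success while the sequence contents are wrong, which is the failure mode being alleged here.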
-
#1042
by
GWR64
on 09 Feb, 2020 14:33
-
Chris Ferguson bears a heavy responsibility. He'll have to make sure that Starliner is thoroughly tested before it flies with a crew.
For his own life and that of his two passengers, but also for future missions.
-
#1043
by
Coastal Ron
on 09 Feb, 2020 15:07
-
This is Boeing's John Mulholland at 21:08 of the teleconference, talking about the bad thruster mapping in the service module:
when we went into the lab during the flight, conducted the verification test with the flight propulsion controller, it exhibited that issue, the team very quickly recoded the software, we verified it in the labs and we were able to upload that software correction...
This seemingly means that, before launch, they did NOT run the verification test with the flight propulsion controller, or they would have seen this problem?? How can this be possible, that they did not run the flight software load with the flight propulsion controller? I'd think that would be the very first requirement before uploading any software. How can anyone sign off on software that has not been tried on the most realistic tests available?
My impression was that once they got a "stinker" flag with respect to the MET issue they scrambled to see what other software areas were suspect. And lo and behold, they found an additional one.
Yes, I recall them saying in the press conference that once they found the one issue, they specifically went looking for other potential issues, and the separation sequence issue is the one they found.
Basically, implying their software verification is woefully insufficient, in the sense that they did only simplistic checks, like which sequence gets started when, but not what that sequence actually did.
Personally, I think that speaks volumes on the quality of their preflight software testing, but what do I know.
It certainly calls into question not only their software testing system, but their software production system too.
-
#1044
by
SoftwareDude
on 09 Feb, 2020 16:33
-
Software errors are like cockroaches: when inspection finds one, there are at least 100 more unseen.
-
#1045
by
Lee Jay
on 09 Feb, 2020 16:54
-
Starliner used its CM thrusters during the pad abort test, so why didn't the mapping error show up then?
The failure is very strange. It should have shown up in the simulations as well.
I've seen this happen before. Engineer A writes the thruster command table, saying what the software expects when each thruster is fired. Engineer B writes the code that simulates spacecraft attitude changes as a result of thruster firings. This will contain a very similar, if not identical table. If both are wrong in the same way, the sim will pass but the spacecraft fail.
In theory these two pieces are written independently. But maybe they were both written from a spec that contained an error. Maybe A and B, tracking down a sim failure, noticed they were different, but made the wrong fix. Maybe B asked A for his tables as a starting point. Maybe A and B were completely independent but made the same mistake. (This is quite common: if a single programmer has an error probability E, having a second person program the same thing independently only reduces the error chance to about E/4, not E*E, since programmers tend to make similar errors.) There are lots of ways a cancelling error can occur.
The same error can screw up hardware-in-the-loop simulations, too. Unless it's super-realistic, with real thrusters moving a fake mass on low friction bearings, measured by real gyroscopes, (this is very rare, especially in 3D), something still has to transform thruster firings into attitude changes. If that table is screwed up in the same way as the command software, the sim will pass and the flight will fail.
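As a hedged sketch of how such a cancelling error slips through (all channel names and axis mappings here are invented, not real Starliner data): if the simulator's table is derived from the flight table, the closed-loop test ends up validating the table against itself rather than against the hardware.

```python
# Hypothetical illustration of a cancelling table error. All channel
# names and axis mappings are invented for this sketch.

# How the vehicle is actually plumbed: command channel -> torque axis.
ACTUAL_WIRING = {"ch1": "+pitch", "ch2": "-pitch", "ch3": "+yaw", "ch4": "-yaw"}

# Engineer A's flight table, with ch3 and ch4 accidentally swapped.
FLIGHT_TABLE = {"ch1": "+pitch", "ch2": "-pitch", "ch3": "-yaw", "ch4": "+yaw"}

# Engineer B started from A's table, so the simulator inherits the swap.
SIM_TABLE = dict(FLIGHT_TABLE)

def sim_response(channel):
    """What the simulator says happens when a channel fires."""
    return SIM_TABLE[channel]

def flight_response(channel):
    """What the real vehicle does when a channel fires."""
    return ACTUAL_WIRING[channel]

def commanded(channel):
    """What the flight software expects to happen."""
    return FLIGHT_TABLE[channel]

# The simulation "verifies" the flight software: every channel agrees,
# so the test passes -- the two identical wrong tables cancel out.
sim_passes = all(sim_response(ch) == commanded(ch) for ch in FLIGHT_TABLE)

# Against the real hardware, the swapped channels fire the wrong way.
flight_mismatches = [ch for ch in FLIGHT_TABLE
                     if flight_response(ch) != commanded(ch)]

print(sim_passes, flight_mismatches)  # True ['ch3', 'ch4']
```

The only defense against this is an independent source of truth for the mapping, such as the dry end-to-end valve test described below, which checks the commands against the actual plumbing rather than against another copy of the same table.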
You don't need the thrusters to fire to do end-to-end testing - just see which valves open when given which commands. No fuel or oxidizer need be present.
-
#1046
by
FinalFrontier
on 09 Feb, 2020 16:58
-
This is the systemic problem within Boeing, and it's the reason why the previous CEO was sent packing.
In recent years they took an unusual approach to software development in the name of reducing costs and long development lead times. Essentially this boiled down to developing flight rationale for avionics software based on simulation only, not actual testing with flight loads. When they would find issues on the ground, they would do a code patch and then ASSUME they had fixed it, without bothering to run a real-world test or look for additional issues; or, in the case of MCAS, they just ASSUMED no situation would ever arise that would cause MCAS to malfunction.
They also simultaneously decided that, in order to make certification of new aircraft and new systems EASIER, if a piece of software was deemed not mission critical by Boeing, then they didn't have the obligation (in their mindset) to even explain that it existed or how it worked to regulators.
On top of this, Boeing (under the previous CEO) decided that they didn't want to retain experienced employees or pay for large development teams for new systems. MCAS was coded by H1B interns on a team that was too small to handle it, with a deadline that could not be met, on a shoestring budget, so it was no surprise that what you got was garbage. Hence the emails describing a program run by clowns.
We know this was not quite the approach with Starliner, as it seems most of the software development was done in house, but the lack of QC or proper testing is still front and center.
Another way Boeing in this mindset decided to save money was to eliminate debugging entirely. Now, in this regard they aren't alone. A lot of companies, including some software giants like Apple, have recently been moving to a mindset where they don't properly debug new software, and the attitude seems to be "if it breaks on the user's end, f them; we will just patch it into a working product after launch". This has disgusted people worldwide, and it's also made various regulators very angry in recent years.
Here's the rub. It's one thing to release a phone or a laptop that's effectively a paper launch because of software, and patch it into a working product afterward. Same thing with a video game. Yes, it's a scummy practice, but nobody gets hurt.
You can't take this approach with planes, trains, automobiles, or especially spacecraft. Now you are dealing with complex machines, and if you don't do debugging beforehand, chances are something will crash or explode.
And yet this is exactly what Boeing has been doing.
The company is under new leadership now as a result of all these messes, but it will take time before any of this gets fixed or cleaned up internally, and it will take time for a culture shift to occur, assuming they actually try to fix this situation. With that said, as it relates to Starliner, it should probably be assumed (by NASA) that this thing hasn't been debugged at any point during software development and that Boeing is lying about how much testing or quality verification they actually did.
Realistically speaking, the entire flight software package should be audited, patched or totally recoded if necessary, then debugged, patched, and THEN tested, the way software is supposed to be developed for sensitive systems. I think it would be extremely unwise for NASA to trust Boeing to fix this rather than requiring a total audit and debug of the entire package. But NASA has at least said they are going to do a "total organizational review," so maybe this will fall under that.
I am not holding my breath, given how this same process played out with SLS.
-
#1047
by
lrk
on 09 Feb, 2020 17:18
-
-- In 2019, we completed:
-- Service module hot fire test, validating the performance of our propulsion system in both nominal and contingency scenarios.
-- All parachute qualification tests without a single test failure, demonstrating the resiliency of our parachute system even in dual-fault scenarios.
-- Discussions with NASA about our system led to our mutual agreement to perform even more tests and analysis, which validated our system as designed.
-- We are confident in the safety of our system, and we have proven through extensive testing that we have a robust design that has consistently performed above requirements, even in dual-fault scenarios.
-- Pad Abort Test, which was Starliner’s first flight test and a near-flawless performance of our integrated propulsion and flight control systems in an abort case.
Emphasis mine.
We don't know what is behind the little word "near".
Parachute deployment failure?
-
#1048
by
RonM
on 09 Feb, 2020 17:21
-
This is Boeing's John Mulholland at 21:08 of the teleconference, talking about the bad thruster mapping in the service module:
when we went into the lab during the flight, conducted the verification test with the flight propulsion controller, it exhibited that issue, the team very quickly recoded the software, we verified it in the labs and we were able to upload that software correction...
This seemingly means that, before launch, they did NOT run the verification test with the flight propulsion controller, or they would have seen this problem?? How can this be possible, that they did not run the flight software load with the flight propulsion controller? I'd think that would be the very first requirement before uploading any software. How can anyone sign off on software that has not been tried on the most realistic tests available?
Overconfidence. Remember the problem Hubble had with its main mirror? Any amateur astronomer who grinds their own mirrors could have found the error with a simple knife-edge test, but the people who ground Hubble's mirror didn't bother to check.
-
#1049
by
Lemurion
on 09 Feb, 2020 18:23
-
-- In 2019, we completed:
-- Service module hot fire test, validating the performance of our propulsion system in both nominal and contingency scenarios.
-- All parachute qualification tests without a single test failure, demonstrating the resiliency of our parachute system even in dual-fault scenarios.
-- Discussions with NASA about our system led to our mutual agreement to perform even more tests and analysis, which validated our system as designed.
-- We are confident in the safety of our system, and we have proven through extensive testing that we have a robust design that has consistently performed above requirements, even in dual-fault scenarios.
-- Pad Abort Test, which was Starliner’s first flight test and a near-flawless performance of our integrated propulsion and flight control systems in an abort case.
Emphasis mine.
We don't know what is behind the little word "near".
Parachute deployment failure?
One would hope, but I'm not sure that I would categorize the parachutes as part of the integrated propulsion or flight control system. Maybe the latter, but only maybe.
-
#1050
by
Kang54
on 09 Feb, 2020 18:37
-
Doug Loverro has answered Keith Cowing's questions - below - here:
http://nasawatch.com/archives/2020/02/boeing-really-n.html#comments . It's the first, featured comment.
1) Q: "Is there any overlap between software teams or management between Starliner and SLS (or 737 Max)?"
2) Q: "Since Boeing's current software process has clearly failed after many years and billions of dollars spent, what do you need to do differently in order to get this whole software thing working properly again?"
-
#1051
by
Comga
on 09 Feb, 2020 18:40
-
This is Boeing's John Mulholland at 21:08 of the teleconference, talking about the bad thruster mapping in the service module:
when we went into the lab during the flight, conducted the verification test with the flight propulsion controller, it exhibited that issue, the team very quickly recoded the software, we verified it in the labs and we were able to upload that software correction...
This seemingly means that, before launch, they did NOT run the verification test with the flight propulsion controller, or they would have seen this problem?? How can this be possible, that they did not run the flight software load with the flight propulsion controller? I'd think that would be the very first requirement before uploading any software. How can anyone sign off on software that has not been tried on the most realistic tests available?
Overconfidence. Remember the problem Hubble had with its main mirror? Any amateur astronomer who grinds their own mirrors could have found the error with a simple knife-edge test, but the people who ground Hubble's mirror didn't bother to check.
Analogies inevitably lead the topic OT so...
That’s not a correct assessment of the Hubble error.
PE did check the figure of the Hubble Primary mirror.
They checked it twice.
The issue is how they responded to those tests.
Having heard the full story from people directly involved, I'd say the similarity lies in the upper management, all the way up to NASA, and the work environment they created.
But that was a classic, custom build to NASA requirements contract.
Starliner under Commercial Crew is different.
Comparing them is difficult at best.
-
#1052
by
noogie
on 09 Feb, 2020 18:51
-
The alternative is to have less than nothing when Starliner slams into the space station or kills its crew, and a future Earth-centric, anti-science Congress responds by canceling all HSF programs and funding.
Just mention China and its HSF program and all that talk will go away, regardless of mishaps or crew losses. I can't see how any Congress, no matter how anti-science, would leave such a gap with a strategic rival.
-
#1053
by
RonM
on 09 Feb, 2020 21:45
-
This is Boeing's John Mulholland at 21:08 of the teleconference, talking about the bad thruster mapping in the service module:
when we went into the lab during the flight, conducted the verification test with the flight propulsion controller, it exhibited that issue, the team very quickly recoded the software, we verified it in the labs and we were able to upload that software correction...
This seemingly means that, before launch, they did NOT run the verification test with the flight propulsion controller, or they would have seen this problem?? How can this be possible, that they did not run the flight software load with the flight propulsion controller? I'd think that would be the very first requirement before uploading any software. How can anyone sign off on software that has not been tried on the most realistic tests available?
Overconfidence. Remember the problem Hubble had with its main mirror? Any amateur astronomer who grinds their own mirrors could have found the error with a simple knife-edge test, but the people who ground Hubble's mirror didn't bother to check.
Analogies inevitably lead the topic OT so...
That’s not a correct assessment of the Hubble error.
PE did check the figure of the Hubble Primary mirror.
They checked it twice.
The issue is how they responded to those tests.
Having heard the full story from people directly involved, I'd say the similarity lies in the upper management, all the way up to NASA, and the work environment they created.
But that was a classic, custom build to NASA requirements contract.
Starliner under Commercial Crew is different.
Comparing them is difficult at best.
I don't want to get into an argument over my example, but you're making my point. For some reason they assumed the two test instruments were wrong. In reality, the problem was with the reflective null corrector used during manufacturing. A simple knife-edge test would have confirmed the test instruments were correct and the mirror had spherical aberration.
The point is that there was a lack of proper quality control, probably because of overconfidence. I've seen this a lot when I worked in IT.
-
#1054
by
mn
on 09 Feb, 2020 23:35
-
This is the systemic problem within Boeing, and it's the reason why the previous CEO was sent packing.
In recent years they took an unusual approach to software development in the name of reducing costs and long development lead times. Essentially this boiled down to developing flight rationale for avionics software based on simulation only, not actual testing with flight loads. When they would find issues on the ground, they would do a code patch and then ASSUME they had fixed it, without bothering to run a real-world test or look for additional issues; or, in the case of MCAS, they just ASSUMED no situation would ever arise that would cause MCAS to malfunction.
They also simultaneously decided that, in order to make certification of new aircraft and new systems EASIER, if a piece of software was deemed not mission critical by Boeing, then they didn't have the obligation (in their mindset) to even explain that it existed or how it worked to regulators.
On top of this, Boeing (under the previous CEO) decided that they didn't want to retain experienced employees or pay for large development teams for new systems. MCAS was coded by H1B interns on a team that was too small to handle it, with a deadline that could not be met, on a shoestring budget, so it was no surprise that what you got was garbage. Hence the emails describing a program run by clowns.
We know this was not quite the approach with Starliner, as it seems most of the software development was done in house, but the lack of QC or proper testing is still front and center.
Another way Boeing in this mindset decided to save money was to eliminate debugging entirely. Now, in this regard they aren't alone. A lot of companies, including some software giants like Apple, have recently been moving to a mindset where they don't properly debug new software, and the attitude seems to be "if it breaks on the user's end, f them; we will just patch it into a working product after launch". This has disgusted people worldwide, and it's also made various regulators very angry in recent years.
Here's the rub. It's one thing to release a phone or a laptop that's effectively a paper launch because of software, and patch it into a working product afterward. Same thing with a video game. Yes, it's a scummy practice, but nobody gets hurt.
You can't take this approach with planes, trains, automobiles, or especially spacecraft. Now you are dealing with complex machines, and if you don't do debugging beforehand, chances are something will crash or explode.
And yet this is exactly what Boeing has been doing.
The company is under new leadership now as a result of all these messes, but it will take time before any of this gets fixed or cleaned up internally, and it will take time for a culture shift to occur, assuming they actually try to fix this situation. With that said, as it relates to Starliner, it should probably be assumed (by NASA) that this thing hasn't been debugged at any point during software development and that Boeing is lying about how much testing or quality verification they actually did.
Realistically speaking, the entire flight software package should be audited, patched or totally recoded if necessary, then debugged, patched, and THEN tested, the way software is supposed to be developed for sensitive systems. I think it would be extremely unwise for NASA to trust Boeing to fix this rather than requiring a total audit and debug of the entire package. But NASA has at least said they are going to do a "total organizational review," so maybe this will fall under that.
I am not holding my breath, given how this same process played out with SLS.
How much of this is baseless assumption?
-
#1055
by
dlapine
on 09 Feb, 2020 23:40
-
This is the systemic problem within Boeing, and it's the reason why the previous CEO was sent packing.
In recent years they took an unusual approach to software development in the name of reducing costs and long development lead times. Essentially this boiled down to developing flight rationale for avionics software based on simulation only, not actual testing with flight loads. When they would find issues on the ground, they would do a code patch and then ASSUME they had fixed it, without bothering to run a real-world test or look for additional issues; or, in the case of MCAS, they just ASSUMED no situation would ever arise that would cause MCAS to malfunction.
They also simultaneously decided that, in order to make certification of new aircraft and new systems EASIER, if a piece of software was deemed not mission critical by Boeing, then they didn't have the obligation (in their mindset) to even explain that it existed or how it worked to regulators.
On top of this, Boeing (under the previous CEO) decided that they didn't want to retain experienced employees or pay for large development teams for new systems. MCAS was coded by H1B interns on a team that was too small to handle it, with a deadline that could not be met, on a shoestring budget, so it was no surprise that what you got was garbage. Hence the emails describing a program run by clowns.
We know this was not quite the approach with Starliner, as it seems most of the software development was done in house, but the lack of QC or proper testing is still front and center.
Another way Boeing in this mindset decided to save money was to eliminate debugging entirely. Now, in this regard they aren't alone. A lot of companies, including some software giants like Apple, have recently been moving to a mindset where they don't properly debug new software, and the attitude seems to be "if it breaks on the user's end, f them; we will just patch it into a working product after launch". This has disgusted people worldwide, and it's also made various regulators very angry in recent years.
...
(Expanding on MN)
FF, those are pretty significant claims against Boeing management and its software development process. Are there sources you can point to which give us something other than your informed assertion that it's this bad? I'd ask the same question if this were a statement about the practices at SpaceX, ULA, or anywhere else.
-
#1056
by
FinalFrontier
on 10 Feb, 2020 00:16
-
The alternative is to have less than nothing when Starliner slams into the space station or kills its crew, and a future Earth-centric, anti-science Congress responds by canceling all HSF programs and funding.
Just mention China and its HSF program and all that talk will go away, regardless of mishaps or crew losses. I can't see how any Congress, no matter how anti-science, would leave such a gap with a strategic rival.
Prepare to be surprised, then. The anti-space lobby doesn't think in these terms; they think only in "we want this money now for X projects, how do we get it?"
It won't be the first time this has happened, either; similar types of anti-common-sense logic by specific factions in Congress have occurred in the past and resulted in strategic shortfalls and major misadventures.
Read Alfred Thayer Mahan's principles of sea power if you want further insight into this type of thinking regarding appropriations.
-
#1057
by
dalek
on 10 Feb, 2020 00:30
-
To stir up the hornets' nest: why hasn't some reporter made up a list of all the fod coming from Boeing and NASA and aggressively confronted both of them?
Apollo GSE 65 66
-
#1058
by
FinalFrontier
on 10 Feb, 2020 00:35
-
FF, those are pretty significant claims against Boeing management and its software development process. Are there sources you can point to which give us something other than your informed assertion that it's this bad? I'd ask the same question if this were a statement about the practices at SpaceX, ULA, or anywhere else.
There has been extensive reporting and disclosure by the FAA, former employees, whistleblowers, and Congress regarding internal software development processes within Boeing, all of which is widely and publicly available, up to and including emails leaked from within the company describing a program "designed by clowns, who in turn are supervised by monkeys" (a direct quote from a Boeing employee).
This is OT and outside the scope of this site, except that it was worth pointing out that this is a specific corporate-culture and management mindset regarding this issue, and that it's systemic.
Case in point: Boeing's investors have spoken, and the previous executive management team and CEO have been shown the door as a result of choosing to foster this type of culture and choosing a cheapskate, crackpot way of developing critical software for new products.
edit zubenelgenubi: fixed quote attribution
-
#1059
by
CyndyC
on 10 Feb, 2020 01:21
-
Boeing might be having a hard time maintaining a top programming team because so many younger people probably think it's cooler to work for SpaceX or Tesla or Google, or to be somewhere else in Silicon Valley, instead of working for their grandparents' or great-grandparents' kind of company. I don't have the stats in front of me, but the average age of SpaceX and Google employees is unusually low, somewhere around 30, and an old established company like Boeing is probably more accommodating and welcoming of older employees (AARP does research and can validate such distinctions). If the latter is the case, then a communication gap among the differing generations could be heavily at play in more ways than one.
That's mostly speculation, but no one has brought up the possibility of a need for intercession from further afield, by industrial psychologists. Edit to add: or by professional consultants whose sole purpose is to help companies grow and thrive.