Author Topic: Master Thread for NASASpaceflight.com and NSF Forum Outages, Upgrades and Error Message Reporting  (Read 249352 times)

Offline Jarnis

  • Full Member
  • ****
  • Posts: 1327
  • Liked: 847
  • Likes Given: 209
Eeek. The shakes... terrible when I kept getting Error 500...  :'(

(I'll live... stuff happens...)

Offline Bargemanos

  • Full Member
  • ***
  • Posts: 331
  • The Netherlands
  • Liked: 108
  • Likes Given: 684
This stuff happens in IT world.

No worries, had a very productive day at work  ;)

Offline Jarnis

  • Full Member
  • ****
  • Posts: 1327
  • Liked: 847
  • Likes Given: 209
No worries, had a very productive day at work  ;)

I had a similar problem today. Suddenly had more time to do some PC repair stuff on a couple of systems I had been putting off. I wonder how this happened.  :o

Online meekGee

  • Senior Member
  • *****
  • Posts: 17574
  • N. California
  • Liked: 17888
  • Likes Given: 1502
Well that was ridiculous!

We had no idea (and certainly no warning) the outage would be all morning UK time. Clearly something went wrong with the host's hardware replacement efforts - and that was certainly intimated as they lost their own portal pages.

Really very sorry for what happened there. Totally out of our control although we'll work to see what we can do to avoid that in future. Was our first major outage in years, but still totally unacceptable. If any L2 members were overly annoyed by this, contact me and I'll extend your term to compensate.

Speaking of which, I really want to thank the people who e-mailed me about this. Everyone - every single person - was very understanding and that meant so much as I was starting to feel very ill about this (no joke, felt physically sick!)

The site looks fine, but everyone's going to be piling in now, so it may be a bit sluggish for an hour or so.

Again, very sorry and those of us in charge of the site (Mark, Jester, Aaron and myself) will be having a discussion about ensuring this doesn't happen again.

Chris.

Ill?   Your standards are a tad too perfectionist....   You're doing 10x better than exceptionally well.   'nuff said.
ABCD - Always Be Counting Down

Offline Poole Amateur

  • Full Member
  • *
  • Posts: 153
  • Liked: 120
  • Likes Given: 6592
I missed NSF, too, but stuff happens.

Please don't tack any days onto my L2 subscription.  Who really needs some sort of "restitution" for a little outage like this?  (*Stifling the urge to break into some rant about the "entitlement mentality."*)

Keep up the great work.

Hear Hear!

Cheers Chris, this is a great site and your personal response to emails, including to a non space person like me, really does underline that greatness.

Offline RocketmanUS

  • Senior Member
  • *****
  • Posts: 2226
  • USA
  • Liked: 71
  • Likes Given: 31
Our hosting company in Dallas, Texas is doing some big maintenance tomorrow morning and there "may" be a small outage for a matter of minutes some time between 6am and midday UTC.

You probably won't notice it, if that happens, but best to give a heads up in case it does and you're in the middle of posting something. A good trick for this period would be to copy your post into Word before you post it.....but the chances are very small you'd be posting at the exact time.

(Read below, cause that didn't go to plan!)
Thanks for the heads up  :).

Checked other sites, they worked so figured it was just the server maintenance.
As I powered down my computer, I thought to check your Twitter page for additional info, oops.

For the future, do you have others (staff members) with Twitter who could post info if you're not online and the site has a problem?

Offline Jester

  • Administrator
  • Senior Member
  • *****
  • Posts: 7993
  • Earth
  • Liked: 6625
  • Likes Given: 162
Our hosting company in Dallas, Texas is doing some big maintenance tomorrow morning and there "may" be a small outage for a matter of minutes some time between 6am and midday UTC.

You probably won't notice it, if that happens, but best to give a heads up in case it does and you're in the middle of posting something. A good trick for this period would be to copy your post into Word before you post it.....but the chances are very small you'd be posting at the exact time.

(Read below, cause that didn't go to plan!)
Thanks for the heads up  :).

Checked other sites, they worked so figured it was just the server maintenance.
As I powered down my computer, I thought to check your Twitter page for additional info, oops.

For the future, do you have others (staff members) with Twitter who could post info if you're not online and the site has a problem?

What do you mean, not online ??? Chris is ALWAYS online ;)

Online darkenfast

  • Member
  • Full Member
  • ****
  • Posts: 1635
  • Liked: 1964
  • Likes Given: 10181
Three cheers for Chris and his associates, who give us the best space-related news and forum in the Greater Solar System!

Can I have a pony now?
Writer of Book and Lyrics for musicals "SCAR", "Cinderella!", and "Aladdin!". Retired Naval Security Group. "I think SCAR is a winner. Great score, [and] the writing is up there with the very best!"
-- Phil Henderson, Composer of the West End musical "The Far Pavilions".

Offline RocketmanUS

  • Senior Member
  • *****
  • Posts: 2226
  • USA
  • Liked: 71
  • Likes Given: 31
Our hosting company in Dallas, Texas is doing some big maintenance tomorrow morning and there "may" be a small outage for a matter of minutes some time between 6am and midday UTC.

You probably won't notice it, if that happens, but best to give a heads up in case it does and you're in the middle of posting something. A good trick for this period would be to copy your post into Word before you post it.....but the chances are very small you'd be posting at the exact time.

(Read below, cause that didn't go to plan!)
Thanks for the heads up  :).

Checked other sites, they worked so figured it was just the server maintenance.
As I powered down my computer, I thought to check your Twitter page for additional info, oops.

For the future, do you have others (staff members) with Twitter who could post info if you're not online and the site has a problem?

What do you mean, not online ??? Chris is ALWAYS online ;)
Work
Sleeping
Most important Football game ( Soccer )   ;D

Online Chris Bergin

OK, so that outage just now was totally out of the blue and really frustrating. Scheduled maintenance is one thing, but that was an unexpected host issue.

I saw the site go down (as I'm always on it) and immediately went to the host in Dallas, Texas and requested resolution.

Told it was a VSI outage at their end. Outage was resolved and we were rebooted back to life.

Most of the time was their resolution of the outage. Reboot was about 5-10 minutes I believe.

Really, really sorry for the inconvenience surrounding that. We are looking at mitigation options, as much as that appears to be the first ever specific fault of its kind since we've been with these guys.

We will likely take an option to do an OS Update Reboot early morning GMT (quietest time of the day for the site), which will be a 1-5 minute outage. I'll provide advance warning.
Support NSF via L2 -- JOIN THE NSF TEAM -- Site Rules/Feedback/Updates
**Not a L2 member? Whitelist this forum in your adblocker to support the site and ensure full functionality.**

Online Chris Bergin

So that was only a short outage, but any outage is not good.

The Dallas datacenter had a "software based crash" on their routers. They appeared to get on it within four minutes and then we had to reboot and remount (Mark did it, I don't understand all of that).

Really apologize for that. Amazing we've survived the mass events of SpaceX stuff (the servers are very good) and then go down to two random issues at the host in a month.

Mark is investigating and will talk to the host about mitigation.

Chris.

Offline kirghizstan

  • Full Member
  • ****
  • Posts: 671
  • Liked: 181
  • Likes Given: 86
if it matters, the forum links on the main page went down before the whole site did

Online Chris Bergin

That would point to the forum going down first, but the whole site is based in Dallas, so it was probably a snowball effect (all their hosted sites went down, not just us). Maybe time to look elsewhere.....although web people tell me there's no such thing as a host who is 100 percent up all the time.

Offline RonM

  • Senior Member
  • *****
  • Posts: 3340
  • Atlanta, Georgia USA
  • Liked: 2233
  • Likes Given: 1584
Even if a host is 99.9% reliable, that still means your site is down nearly nine hours a year. Nothing is perfect.
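
The arithmetic behind that figure can be checked with a quick sketch. This is only an illustration: the function name `annual_downtime_hours` and the 365-day year are assumptions made here, not anything the host publishes.

```python
# Rough downtime budget for a given availability percentage.
# Assumption: a non-leap year of 365 days (8760 hours).
HOURS_PER_YEAR = 365 * 24


def annual_downtime_hours(availability_percent: float) -> float:
    """Return the expected hours of downtime per year at this availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Compare a few common availability levels side by side.
    for pct in (99.0, 99.9, 99.99):
        hours = annual_downtime_hours(pct)
        print(f"{pct}% uptime -> {hours:.2f} hours down per year")
```

At 99.9% this works out to about 8.76 hours a year, which matches the "nearly nine hours" above.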

Online jcm

  • Senior Member
  • *****
  • Posts: 3941
  • Jonathan McDowell
  • Somerville, Massachusetts, USA
    • Jonathan's Space Report
  • Liked: 1745
  • Likes Given: 975
if it matters, the forum links on the main page went down before the whole site did

Yes, I had no problem seeing the main page and articles, but could not see any of the forums. Glad you are back!
-----------------------------

Jonathan McDowell
http://planet4589.org

Offline Hog

  • Senior Member
  • *****
  • Posts: 2863
  • Woodstock
  • Liked: 1722
  • Likes Given: 7074
I couldn't access anything; it was there, then not.  Wasn't down long.
Paul

Offline mvpel

  • Full Member
  • ****
  • Posts: 1125
  • New Hampshire
  • Liked: 1303
  • Likes Given: 1685
You'd think a commercial data center wouldn't have a SPOF in something as basic as their routers. I'm rather surprised.
"Ugly programs are like ugly suspension bridges: they're much more liable to collapse than pretty ones, because the way humans (especially engineer-humans) perceive beauty is intimately related to our ability to process and understand complexity. A language that makes it hard to write elegant code makes it hard to write good code." - Eric S. Raymond

Online Chris Bergin

So the one night I go to bed early due to a cold and I get up to "site's down" e-mails. Beats any need for a morning coffee, I can tell you!  :o

This was a very annoying and unfortunate incident where the servers were up, but the public IP accesses had failed at the host - whatever that means. Mark sorted it out.

Huge apologies for that. We've had a really good run of the site being up and through SpaceX launches (with some mitigation) etc. Then that happens.

Amazed the host didn't pro-actively sort it out the second it happened. Got an automated e-mail about it, but we had to personally do something. Hmmm!

Have to go to work. Watched the site for the last 15 mins and looks fine and Mark says it's fine now. Will keep watching it from work like an overly concerned father.

Chris.

Offline Tomness

  • Full Member
  • ****
  • Posts: 757
  • Into the abyss will I run
  • Liked: 345
  • Likes Given: 776
Congrats to the server admin. I fell asleep yesterday (CST), woke up, saw the site was down, and was like, oh no, CCtCap was awarded and I didn't know about it :-\ Well, I hope this reset allows you guys to handle CCtCap, the AsiaSat-6 launch, the Soyuz return and Dragon SPX-4. Looking forward to what's ahead!

Offline Ronpur50

  • Senior Member
  • *****
  • Posts: 2118
  • Brandon, FL
  • Liked: 1028
  • Likes Given: 1886
Glad you are back, I couldn't handle the withdrawals from my addiction to this place when I got home from work last night!
