Author Topic: Moving The Cloud to orbit  (Read 95435 times)

Online Lee Jay

  • Elite Veteran
  • Senior Member
  • *****
  • Posts: 9133
  • Liked: 4277
  • Likes Given: 407
Re: Moving "the cloud" to orbit
« Reply #80 on: 07/10/2025 05:57 pm »
The problem Lee Jay described was due to a faulty power supply.

Wow, you didn't read a single thing he wrote, did you?
he literally confirmed my suspicion that it was likely due to faulty power supply

Did you read it?  That's not a faulty power supply, it's a resonance between the power supplies (10,000ish of them) that were not "faulty".  They all worked and they all did the same thing.

You could just as easily say the transformers were faulty or the grid was faulty or the wiring was faulty.  It's a resonance between them stimulated by rapid changes in load in the servers.

Online Lee Jay

  • Elite Veteran
  • Senior Member
  • *****
  • Posts: 9133
  • Liked: 4277
  • Likes Given: 407
Re: Moving "the cloud" to orbit
« Reply #81 on: 07/10/2025 06:00 pm »
Inexcusable, then!

Please explain why I have seen publications and presentations by industry, government and academic experts, including those running high-MW field tests and collecting actual data, describing this problem in detail including how difficult it is to solve.  Some of the solves involve changing the software to stop the load from varying so much, but that costs energy (essentially when the actual load drops, you give the servers dummy tasks to keep the load from dropping to zero all at once).

It's a hard problem which you are oversimplifying because you haven't thought through it or studied the issue.
Because they’re all attaching to an AC grid dominated by slow response thermal power plants.

Additionally, the fact that a bunch of people are studying secondary effects from the load profile of AI training has almost no bearing on the actual discussion, as clearly these problems are solvable or tolerable because we HAVE these models and these models have been trained successfully again and again on the ground, even using the more complicated power system of the grid.

The much-larger and much more stable grid.

Quote
it’s pretty clear you’re not arguing in good faith as you think we should pursue some sort of energy degrowth policy and have, in the past, expressed hostility to the idea of human expansion into space. The thing about being highly educated is you can produce a whole bunch of highly technical sounding arguments that a lot of observers might find plausible… if you don’t actually look at the issues being discussed from first principles or apply basic logic. Such as:

Is AI load profile a showstopper for orbital datacenters? No more so than for Earth, where it clearly isn’t, and in fact, probably much less so due to the fact you have full control over the power distribution architecture.

And yet, data centers are having to invest in equipment specifically designed to mitigate this exact issue.  Why?  Two reasons - the utilities are making them do it and because it can cause equipment to trip and thus cause expensive outages and even damage.

Online edzieba

  • Virtual Realist
  • Senior Member
  • *****
  • Posts: 7408
  • United Kingdom
  • Liked: 11377
  • Likes Given: 52
Re: Moving "the cloud" to orbit
« Reply #82 on: 07/11/2025 02:28 pm »
Everyone IS doing it. Every power supply (including DC-DC) has a bunch of capacitors. The problem Lee Jay described was due to a faulty power supply.


We’re inventing fake problems where there are already plenty of real ones. Take an electricity or electronics course.
The problem is real. It is acknowledged by the people who build and operate the AI training datacentres causing the problem, and by the people who operate the grids affected by the problem. There are papers written on the subject (such as linked upthread) and articles that explain how the issue occurs (also linked upthread), and it has nothing to do with baseload powerplants (indeed, baseload powerplants with nice big spinning high-inertia generators serve to reduce the impact, not increase it).

For those clearly unfamiliar with the issues that inverter-based (rather than generator-based) power sources can cause, this video gives a good overview. Just declaring "DC, not AC, everything is fine!" does not actually work - any time you have a source and a sink at different voltages, you will have AC involved in the voltage step-up/step-down process, and DC alone does not mean voltage and current spikes suddenly stop occurring: the only difference is that line frequency is not a factor. Since power distribution within the affected AI training datacentres is DC-based, clearly this alone is not sufficient to prevent the issue. The idea that a standalone datacentre (earthbound or otherwise) would not be affected by this issue is also false: datacentres of that scale are operating their own internal distribution grid out of necessity in order to wrangle several tens to hundreds of megawatts without surprise molten copper issues, so going 'off grid' just insources the problem.


Unilaterally declaring a problem as non-existent sadly has not been an effective method of solving it historically.

Online mn

  • Full Member
  • ****
  • Posts: 1418
  • United States
  • Liked: 1343
  • Likes Given: 539
Re: Moving "the cloud" to orbit
« Reply #83 on: 07/11/2025 05:21 pm »
I suspect the energy generated by this 'discussion' could probably power a datacenter or two ;)

Offline sdsds

  • Senior Member
  • *****
  • Posts: 8641
  • “With peace and hope for all mankind.”
  • Seattle
  • Liked: 3051
  • Likes Given: 2783
Re: Moving "the cloud" to orbit
« Reply #84 on: 07/11/2025 10:08 pm »
[...] any time you have a source and a sink at different voltages, you will have AC involved in voltage step-up/step-down process, and DC alone does not mean voltage and current spikes suddenly stop occurring: the only difference is that line frequency is not a factor. Since power distribution within the affected AI training datacentres in question is DC-based, clearly this alone is not sufficient to prevent the issue.

Yes, this. Sadly system designers cannot avoid the realities of transient phenomena. It seems unlikely anyone here is arguing they can. Apparently the power system designs in datacenters today suffer from unexpected (maybe even 'emergent') resonances. I think the point is that those issues are solvable with sufficient analysis and redesign, both on Earth and in orbit.
— 𝐬𝐝𝐒𝐝𝐬 —

Offline InterestedEngineer

  • Senior Member
  • *****
  • Posts: 3586
  • Seattle
  • Liked: 2616
  • Likes Given: 4397
Re: Moving "the cloud" to orbit
« Reply #85 on: 07/14/2025 03:22 am »
[...] we *measured* 7kHz variations at the MW scale

This was on the AC side? That's insanity. Was the datacenter providing AC to each server in the cluster? My recollection of Cisco AGS routers is a bit fuzzy but I think back in the day when they were sometimes used in telco environments supplying the chassis with 24 or 48V DC was a configurable option. Is that not an option for AI servers these days?

Have you ever been in an old-school DC data center? I have.

The tons of copper are quite impressive, but insanely expensive, and they mass far more than what modern data centers are doing.

Running 16 kW of power across 48VDC requires 350 amps of wire capacity.  That's 3 4/0 wires in parallel, which masses 350 kg over 200 feet.

I have a friend installing electrical in a brand new data center.  It's 480V 3-phase 12 gauge wire to each rack.  That's very low mass. That's 1.73 * 480V * 20A = 16kW to each rack.  The mass of a 200 foot run of that is about 18kg using 12/4 MC cable.  They also run 10/4 which gives you 1.73 * 480V * 30A = 24kW.
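As a rough check on the figures in this post, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and unity power factor is an assumption that the post does not state. The 48V case comes out near 333A, which the post rounds up to 350A.

```python
import math

def dc_current(power_w, volts):
    """Current a DC feed must carry to deliver power_w at the given voltage."""
    return power_w / volts

def three_phase_power(line_volts, line_amps, power_factor=1.0):
    """Balanced three-phase power: sqrt(3) * V_line * I_line * PF.
    power_factor=1.0 is an assumption, not a figure from the thread."""
    return math.sqrt(3) * line_volts * line_amps * power_factor

# 16 kW on a 48 V DC bus
print(round(dc_current(16_000, 48)))                 # 333 (amps)

# 480 V three-phase at 20 A and 30 A per line
print(round(three_phase_power(480, 20) / 1000, 1))   # 16.6 (kW)
print(round(three_phase_power(480, 30) / 1000, 1))   # 24.9 (kW)
```

The 1.73 in the post is sqrt(3), which is where the "3-phase advantage" factor comes from.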

They then convert this to a ~380V DC bus for in-rack distribution, and each individual compute element steps that to whatever they need (e.g. down to 0.9V).

The electronics to do this are small, efficient and cheap.

It's funny you ask this, as it's the same answer that Tesla used to beat Edison 120+ years ago, with modern IGBT and power MOSFET switchers making the conversion easy and trivial.  See that factor of 1.73? That's the 3-phase advantage.  No return wire needed.

If you run a data center in space off of solar and batteries, you're going to convert the DC output into 480V 3-phase just to save the mass and expense of the wire.  I suspect it'll be similar to the modern data center.
« Last Edit: 07/14/2025 03:23 am by InterestedEngineer »

Online Lee Jay

  • Elite Veteran
  • Senior Member
  • *****
  • Posts: 9133
  • Liked: 4277
  • Likes Given: 407
Re: Moving "the cloud" to orbit
« Reply #86 on: 07/14/2025 01:25 pm »
[...] we *measured* 7kHz variations at the MW scale

This was on the AC side? That's insanity. Was the datacenter providing AC to each server in the cluster? My recollection of Cisco AGS routers is a bit fuzzy but I think back in the day when they were sometimes used in telco environments supplying the chassis with 24 or 48V DC was a configurable option. Is that not an option for AI servers these days?

Have you ever been in an old-school DC data center? I have.

The tons of copper are quite impressive, but insanely expensive, and they mass far more than what modern data centers are doing.

Running 16 kW of power across 48VDC requires 350 amps of wire capacity.  That's 3 4/0 wires in parallel,

No it's not!  I use 3 4/0's per phase to run 1200A at work.  Granted, that's good wire but even the worst wire in the world is capable of 180A per 4/0 strand.

Offline sdsds

  • Senior Member
  • *****
  • Posts: 8641
  • “With peace and hope for all mankind.”
  • Seattle
  • Liked: 3051
  • Likes Given: 2783
Re: Moving "the cloud" to orbit
« Reply #87 on: 07/14/2025 01:32 pm »
Interesting rule-of-thumb from @dmasten:
https://twitter.com/dmasten/status/1944745491532742736
— 𝐬𝐝𝐒𝐝𝐬 —

Offline Vultur

  • Senior Member
  • *****
  • Posts: 3406
  • Liked: 1513
  • Likes Given: 208
Re: Moving "the cloud" to orbit
« Reply #88 on: 07/14/2025 05:01 pm »
Interesting, I'd have thought thermal management would be way worse than that.

Offline InterestedEngineer

  • Senior Member
  • *****
  • Posts: 3586
  • Seattle
  • Liked: 2616
  • Likes Given: 4397
Re: Moving "the cloud" to orbit
« Reply #89 on: 07/14/2025 06:07 pm »
[...] we *measured* 7kHz variations at the MW scale

This was on the AC side? That's insanity. Was the datacenter providing AC to each server in the cluster? My recollection of Cisco AGS routers is a bit fuzzy but I think back in the day when they were sometimes used in telco environments supplying the chassis with 24 or 48V DC was a configurable option. Is that not an option for AI servers these days?

Have you ever been in an old-school DC data center? I have.

The tons of copper are quite impressive, but insanely expensive, and they mass far more than what modern data centers are doing.

Running 16 kW of power across 48VDC requires 350 amps of wire capacity.  That's 3 4/0 wires in parallel,

No it's not!  I use 3 4/0's per phase to run 1200A at work.  Granted, that's good wire but even the worst wire in the world is capable of 180A per 4/0 strand.

https://www.electricaltechnology.org/2022/04/american-wire-gauge-awg-chart-wire-size-ampacity-table.html

the 1200A is across 3 phases so it's 1200/1.72 = 700A per phase, which is within the capability of 4/0 wire (233A per wire).

350 amps really requires 4 wires - Kirchhoff's current law, you have to return the current.  350A is beyond one, so you need two, so you have to have 2 in and 2 out or 4 wires.  It was late last night, not sure how I came up with three.
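The wire count above can be sketched as a ceiling division followed by doubling for the return path. This is only an illustration: the helper name is hypothetical, and real ampacity depends on insulation temperature rating and installation conditions, which this ignores. The 233A and 180A per-wire figures are the ones quoted in this thread.

```python
import math

def conductors_needed(load_amps, ampacity_per_wire, include_return=True):
    """Number of parallel conductors for a DC run.
    Rounds up to whole wires per direction, then doubles if the
    return path is counted (current out must equal current back)."""
    per_direction = math.ceil(load_amps / ampacity_per_wire)
    return per_direction * (2 if include_return else 1)

print(conductors_needed(350, 233))                        # 4 (2 out, 2 back)
print(conductors_needed(350, 233, include_return=False))  # 2
print(conductors_needed(350, 180))                        # 4 even at the lower 180 A figure
```

Whether a "wire run" includes the return conductor is exactly the counting difference the two posters sort out in the following reply.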

Online Lee Jay

  • Elite Veteran
  • Senior Member
  • *****
  • Posts: 9133
  • Liked: 4277
  • Likes Given: 407
Re: Moving "the cloud" to orbit
« Reply #90 on: 07/14/2025 06:50 pm »
[...] we *measured* 7kHz variations at the MW scale

This was on the AC side? That's insanity. Was the datacenter providing AC to each server in the cluster? My recollection of Cisco AGS routers is a bit fuzzy but I think back in the day when they were sometimes used in telco environments supplying the chassis with 24 or 48V DC was a configurable option. Is that not an option for AI servers these days?

Have you ever been in an old-school DC data center? I have.

The tons of copper are quite impressive, but insanely expensive, and they mass far more than what modern data centers are doing.

Running 16 kW of power across 48VDC requires 350 amps of wire capacity.  That's 3 4/0 wires in parallel,

No it's not!  I use 3 4/0's per phase to run 1200A at work.  Granted, that's good wire but even the worst wire in the world is capable of 180A per 4/0 strand.

https://www.electricaltechnology.org/2022/04/american-wire-gauge-awg-chart-wire-size-ampacity-table.html

the 1200A is across 3 phases so it's 1200/1.72 = 700A per phase, which is within the capability of 4/0 wire (233A per wire).

350 amps really requires 4 wires - Kirchhoff's current law, you have to return the current.  350A is beyond one, so you need two, so you have to have 2 in and 2 out or 4 wires.  It was late last night, not sure how I came up with three.

It was 1200A *per phase* on a three-phase.  I used high-temp (150C) wire, though.

I didn't realize you were including the return.  When I say single-wire run on DC, that includes positive and negative for two total (one out, one back).

Offline Twark_Main

  • Senior Member
  • *****
  • Posts: 5315
  • Technically we ALL live in space
  • Liked: 2793
  • Likes Given: 1604
Re: Moving "the cloud" to orbit
« Reply #91 on: 07/16/2025 01:18 pm »
Batteries are around 200-300Wh/kg for low end ones for your car. For space, you can use newer ones at 400-500Wh/kg but they cost more.

ISS uses about 10-20% of the total storage capacity to increase cycle life from 500 cycles to 100,000 cycles, since you get 16 cycles a day.  So multiply your weights by 10 or divide your energy density by 10.


I have no problems accepting that range (160-200 Wh/kg).

It's more like 25Wh/kg because of the need to preserve cycle life.

Realistically you're looking at more like 30,000 cycles (5 year service life before it's obsolete anyway), which is readily achievable with automotive LiFePO4.

So the actual mass penalty drops from 5-10x (I note you took the most pessimistic end of that range ::) ) down to 2x.
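The numbers being traded here can be reproduced with a short sketch. It assumes a simple model in which usable energy density is nameplate density times the depth of discharge actually used, so the mass penalty is one over that fraction. The 250 Wh/kg nameplate value is just the midpoint of the 200-300 Wh/kg range quoted above, and the function names are illustrative.

```python
def cycles_over_life(cycles_per_day, years):
    """Total charge/discharge cycles over a service life."""
    return cycles_per_day * 365 * years

def effective_wh_per_kg(nameplate_wh_per_kg, depth_of_discharge):
    """Usable energy per kg when only a fraction of capacity is cycled."""
    return nameplate_wh_per_kg * depth_of_discharge

def mass_penalty(depth_of_discharge):
    """How many times more battery mass is needed versus full-depth cycling."""
    return 1.0 / depth_of_discharge

print(cycles_over_life(16, 5))          # 29200, i.e. roughly 30,000 cycles

# ISS-style shallow cycling at 10% of capacity (assumed 250 Wh/kg nameplate)
print(effective_wh_per_kg(250, 0.10))   # about 25 Wh/kg
print(mass_penalty(0.10))               # about 10x

# A 2x penalty corresponds to cycling half the capacity
print(effective_wh_per_kg(250, 0.50))   # 125.0 Wh/kg
print(mass_penalty(0.50))               # 2.0
```

The model says nothing about whether a given chemistry actually reaches 30,000 cycles at 50% depth of discharge; that is the point the posters dispute.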

Online Lee Jay

  • Elite Veteran
  • Senior Member
  • *****
  • Posts: 9133
  • Liked: 4277
  • Likes Given: 407
Re: Moving "the cloud" to orbit
« Reply #92 on: 07/16/2025 02:25 pm »
Batteries are around 200-300Wh/kg for low end ones for your car. For space, you can use newer ones at 400-500Wh/kg but they cost more.

ISS uses about 10-20% of the total storage capacity to increase cycle life from 500 cycles to 100,000 cycles, since you get 16 cycles a day.  So multiply your weights by 10 or divide your energy density by 10.


I have no problems accepting that range (160-200 Wh/kg).

It's more like 25Wh/kg because of the need to preserve cycle life.

Realistically you're looking at more like 30,000 cycles (5 year service life before it's obsolete anyway), which is readily achievable with automotive LiFePO4.

So the actual mass penalty drops from 5-10x (I note you took the most pessimistic end of that range ::) ) down to 2x.


LiFePO4s are already typically about half the energy density of high-grade NMC - typically in the 160 Wh/kg range in practice.  I know there are some supposedly-higher ones, but I haven't found them commercially available from the manufacturers who supposedly make them.  And they're in the range of 4000 cycles of total cycle life, not 30,000.

If the thing is obsolete in 5 years, there's even less of a point in building it since replacement cost is so high.

Offline Twark_Main

  • Senior Member
  • *****
  • Posts: 5315
  • Technically we ALL live in space
  • Liked: 2793
  • Likes Given: 1604
Re: Moving "the cloud" to orbit
« Reply #93 on: 07/19/2025 03:30 am »
I know there are some supposedly-higher ones, but I haven't found them commercially available

The idea that you wouldn't spin custom cells for such an extreme use case isn't very credible. That's like building your Saturn V out of Temu parts.

For this application, you'd definitely go with custom low-volume produced cells.



If the thing is obsolete in 5 years, there's even less of a point in building it since replacement cost is so high.

If 5 year lifespan were a show-stopper, there'd be no point in building terrestrial data centers either. That's pretty typical.

https://www.tomshardware.com/pc-components/gpus/datacenter-gpu-service-life-can-be-surprisingly-short-only-one-to-three-years-is-expected-according-to-unnamed-google-architect

https://medium.com/@celions/the-hidden-environmental-cost-of-data-center-growth-millions-of-tons-of-e-waste-0bb4a18dbaa1

If you need to replace the batteries at that same 5-year "upgrade," that means you're replacing 60% of the satellite mass instead of 50%. Probably not economical either way! Just launch a new one.

Just to be clear, I don't see the point of data centers in orbit. It's just cost for no benefit IMO. However having an (industry standard) hardware replacement interval on an extra subsystem isn't a huge show-stopper all by itself.

At the end of the day it's just one item in a big pile of extra costs, solving a bunch of problems in exchange for zero advantages.  IMHO.
« Last Edit: 07/19/2025 03:59 am by Twark_Main »

Online Lee Jay

  • Elite Veteran
  • Senior Member
  • *****
  • Posts: 9133
  • Liked: 4277
  • Likes Given: 407
Re: Moving "the cloud" to orbit
« Reply #94 on: 07/19/2025 04:48 am »
If 5 year lifespan were a show-stopper, there'd be no point in building terrestrial data centers either. That's pretty typical.

Yeah, but you can relatively easily re-hardware them.  On-orbit, that's a whole bunch harder - many orders of magnitude harder.

Offline Twark_Main

  • Senior Member
  • *****
  • Posts: 5315
  • Technically we ALL live in space
  • Liked: 2793
  • Likes Given: 1604
Re: Moving "the cloud" to orbit
« Reply #95 on: 07/19/2025 02:36 pm »
If 5 year lifespan were a show-stopper, there'd be no point in building terrestrial data centers either. That's pretty typical.

Yeah, but you can relatively easily re-hardware them.  On-orbit, that's a whole bunch harder - many orders of magnitude harder.

Exactly, not economical. The hardware is just a thin "shell" around the compute hardware, mostly just power and thermal and a bit of structure. Nothing terribly worth saving once the compute hardware is end-of-life.

The structure and thermal are going to be tightly integrated with the compute, so you wouldn't want to reuse those elements anyway because it would hold back the design. If the batteries are EOL and obsolete besides, then the only sensible "re-hardware" concept is really just to unbolt the solar panels and move them to the new satellite.  8)

Anyway, you get the point. Starlink Solution: no upgrades in space, you just replace the constellation over time.

Online Lee Jay

  • Elite Veteran
  • Senior Member
  • *****
  • Posts: 9133
  • Liked: 4277
  • Likes Given: 407
Re: Moving "the cloud" to orbit
« Reply #96 on: 07/19/2025 03:21 pm »
Anyway, you get the point. Starlink Solution: no upgrades in space, you just replace the constellation over time.

But you're talking about something that's millions of times bigger than a Starlink satellite.  In fact, it's probably bigger than an entire Starlink constellation.

Offline Asteroza

  • Senior Member
  • *****
  • Posts: 3127
  • Liked: 1211
  • Likes Given: 35
Re: Moving "the cloud" to orbit
« Reply #97 on: 07/19/2025 07:12 pm »
Once you are at such sizes, unless you are doing direct integration like a solar sail with thin film compute etched on, by definition you are plumbing into manifolds and interfaces allowing a degree of fungibility of power/cooling/comms. Then it becomes an economics question of whether keeping a solar/radiator bank plugged into an accreting-style structure is worth it.

Space coral, yea or nay?

Project Natick ideas centered around nameplate EoL capacity, with the pods pulled once internal failures exceeded certain margins. In operation, as servers die they are by definition abandoned in place, since there is no immediate serviceability. There is some redundancy in the pod overall but not excessively so.

So there is a theoretical secondary market for still functioning, but at reduced capacity, pods.

Current GEO sats are sorta similar, fitting EoL nameplate capacity solar to design spec, meaning they are overpowered at beginning of life.

Radiators I suppose could be built like that, having coolant circuit fuses to seal individual circuits in a fin that get punctured.


Though scaling does weird tricks to design the bigger you go. Hyperscalers are doing software redundancy at the data center level ("oops I lost datacenter" was supposedly muttered by a googler in the "I have too many" sense), so in the above radiator example, putting fault tolerance down to the radfin level might not be worth it to a hyperscaler when they are more interested in swapping whole fins. If you wanted to support a broad infrastructure ecosystem with compatibility in mind (and a degree of forced competition in vendors), dictating a degree of "gradual degradation" capability at lower parts levels may be beneficial.

Do we want space legos or not?

Offline Twark_Main

  • Senior Member
  • *****
  • Posts: 5315
  • Technically we ALL live in space
  • Liked: 2793
  • Likes Given: 1604
Re: Moving "the cloud" to orbit
« Reply #98 on: 07/19/2025 09:05 pm »
Anyway, you get the point. Starlink Solution: no upgrades in space, you just replace the constellation over time.

But you're talking about something that's millions of times bigger than a Starlink satellite.  In fact, it's probably bigger than an entire Starlink constellation.

First, I don't subscribe to your pulled-from-your-backside number right there.

Second, if so, and?? The fundamental economic tradeoff between the two options doesn't care about size, because the null hypothesis is that the cost of both options scales roughly with size (so it cancels out).

Do you have any reason (preferably something not utterly contrived merely for the sake of winning the argument) to think otherwise?  ???

Offline Coastal Ron

  • Senior Member
  • *****
  • Posts: 9798
  • I live... along the coast
  • Liked: 11422
  • Likes Given: 13081
Re: Moving "the cloud" to orbit
« Reply #99 on: 07/19/2025 10:14 pm »
If 5 year lifespan were a show-stopper, there'd be no point in building terrestrial data centers either. That's pretty typical.
Yeah, but you can relatively easily re-hardware them.  On-orbit, that's a whole bunch harder - many orders of magnitude harder.
Exactly, not economical. The hardware is just a thin "shell" around the compute hardware, mostly just power and thermal and a bit of structure. Nothing terribly worth saving once the compute hardware is end-of-life.

You are only talking about the compute side, but there is also the storage side. A manufacturer like Seagate has a general warranty of 2 years, but in reality you can expect them to last 3-5 years. Of course that is in an environment here on Earth, but we don't know what environment "the cloud" would experience.

Here on Earth you can just swap out a bad drive, or swap out a good drive to upgrade it. More factors to consider...

Quote
The structure and thermal are going to be tightly integrated with the compute...

Why is that? Servers here on Earth don't do that, the compute hardware is separate from the cooling system. Of course you are looking at this from a single disposable unit, but there are alternatives.

Quote
...so you wouldn't want to reuse those elements anyway because it would hold back the design.

If we're thinking in terms of Starlink type design, then it would be disposed of. But if the design is a big server farm on a station, then replacing hardware becomes easier. Not sure how economical, but could be.

Quote
If the batteries are EOL and obsolete besides, then the only sensible "re-hardware" concept is really just to unbolt the solar panels and move them to the new satellite.

Well again, for Starlink type disposable hardware, you only build for a defined lifespan, with no need to consider upgrades. But if you are doing this on a space station, then everything could be replaceable - and would be designed that way from the start.
If we don't continuously lower the cost to access space, how are we ever going to afford to expand humanity out into space?
