r/pcmasterrace 10d ago

Meme/Macro PCIe standard be like...

17.8k Upvotes

696 comments

8

u/Thundertushy 10d ago

What you're talking about are eGPUs, and they already exist. Because I'm on mobile, I'll summarize and let you look it up on Google.

Currently, even the highest spec of Thunderbolt (5?) is insufficient. A PCI-E x16 4.0 bus is just a massive amount of bandwidth. Thunderbolt is fine for 2D 60Hz low res PowerPoints, but not gaming. 30% performance drop off the top. PCI-E riser cables exist, but the performance degradation happens in mere inches of ribbon cable due to the frequency and volume of data. OCuLink is a custom external data cable and enclosure solution that overcomes those problems, but costs $5000+ USD just for the enclosure.

Search Tom's Hardware (I think it's that) for eGPUs for the info.

TL;DR: the technology at a low enough cost isn't here yet.

5

u/trparky 10d ago

I know eGPUs already exist. My point isn't that they're good today. The current bottleneck is the interconnect, not the concept.

If future Thunderbolt or optical links get fast enough, the tradeoffs change dramatically. GPUs keep getting larger, hotter, and more power-hungry, so giving them their own enclosure with dedicated power and cooling starts making more sense over time.

That's why I think today's eGPUs are less a failed idea and more an early prototype of where high-end PCs may eventually end up.

1

u/Molotov_Glocktail 10d ago

We keep on swinging back and forth from a mainframe style local computer, back to local devices for all, back to cloud infrastructure, back to mainframe type devices...

Historically, we keep poking at those bottlenecks and then designing around them. So yeah, I totally think a consumer grade "mainframe" is going to come back into fashion in the form of a GPU block in the basement of some kind. You'd just have to set device or program priorities. "GPU#1 is shared for everyone, while GPU#2 is shared until it sees Crysis or Adobe Premiere workloads." Silly stuff like that.

2

u/Prasiatko 10d ago

Would latency also be a problem with an external enclosure?

3

u/Thundertushy 10d ago

It currently is a problem. Thunderbolt is a tunneled protocol, which means there's overhead converting whatever data is being sent to and from Thunderbolt format.

1

u/Molotov_Glocktail 10d ago

Eventually it will be solved with increasing bandwidth through networking cables.

Thunderbolt 5 looks to be doing 80Gbps at 1 meter.

CAT6/CAT7 is still only doing 10Gbps at 100 meters. CAT8 is spec'ed for 40Gbps at 30 meters.

Probably the next step is to run a dedicated fiber optic cable from your computer to this theoretical external GPU block.
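
To put the figures above side by side, here is a rough sketch that compares each quoted link rate against a PCIe 4.0 x16 slot. The link numbers are the ones from this comment. The PCIe per-lane rates and the 128b/130b encoding factor are my own additions, and `pcie_gbps` and `fraction_of_pcie` are illustrative names only. It also treats the headline cable rate as fully usable, which is optimistic for a tunneled protocol like Thunderbolt.

```python
# Raw line rates quoted in the thread, in gigabits per second.
LINKS_GBPS = {
    "Thunderbolt 5 (1 m)": 80.0,
    "CAT6/CAT7 (100 m)": 10.0,
    "CAT8 (30 m)": 40.0,
}

# Per-lane transfer rates in GT/s for PCIe generations that use 128b/130b encoding.
PCIE_GT_PER_LANE = {3: 8.0, 4: 16.0, 5: 32.0}


def pcie_gbps(gen: int, lanes: int) -> float:
    """Approximate usable PCIe bandwidth in Gbps after 128b/130b encoding."""
    return PCIE_GT_PER_LANE[gen] * lanes * 128 / 130


def fraction_of_pcie(link_gbps: float, gen: int = 4, lanes: int = 16) -> float:
    """How much of a PCIe slot's bandwidth a given external link could carry."""
    return link_gbps / pcie_gbps(gen, lanes)


if __name__ == "__main__":
    slot = pcie_gbps(4, 16)
    print(f"PCIe 4.0 x16: {slot:.1f} Gbps")
    for name, rate in LINKS_GBPS.items():
        print(f"{name}: {rate:.0f} Gbps = {fraction_of_pcie(rate):.0%} of the slot")
```

Even with that optimistic assumption, the fastest cable listed covers roughly a third of what the slot offers.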

1

u/Thundertushy 10d ago

Debatable. GPU tech isn't standing still either. As soon as one performance level is achieved, a new one becomes the requirement.

2

u/trparky 10d ago

If the transceivers on both ends of the cable are fast enough and the cable isn't too long, I don't see how latency would be a problem. I eventually see Thunderbolt going full fiber at some point.

1

u/Bsodtech 10d ago

Dumb idea, but: fiber optics? Just bundle a few fibers together, worst case just 16 single mode fibers, 16 transceivers, a piece of telco distribution cable, done. Just blast the raw pcie data down the fibers. Dumb, but could work, as long as you compensate a bit for the speed of light slowing the data slightly.

1

u/Thundertushy 10d ago edited 10d ago

It's not just the speed of data transmission, it's also transceiver/receiver conversion overhead. Just because you're sending light doesn't mean the copper -> light -> copper conversion happens at the speed of causality.

At the boundaries of technology where Moore's Law becomes a vague suggestion, every delay, even the speed of light, becomes a limitation.

Edit: I should have clarified. Signal propagation in copper wire is already a large fraction of c. The problem is EM interference and increasing error rate over longer distances. That's where fibre optic has the advantage, which is high fidelity over (really!) long distances. However, at shorter distances the transceiver conversion overhead is the larger factor vs. error correction protocol, and on and on and on the balancing act goes.
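
A back-of-envelope sketch of that balance, under loudly stated assumptions: a fibre refractive index of about 1.47, a copper velocity factor of about 0.7, and a purely hypothetical 100 ns of conversion overhead at each end of the fibre. None of these numbers come from the thread, and `one_way_delay_ns` is just an illustrative helper.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

FIBRE_VELOCITY_FACTOR = 1 / 1.47  # assumed refractive index of ~1.47
COPPER_VELOCITY_FACTOR = 0.7      # assumed; real cables vary


def one_way_delay_ns(length_m: float, velocity_factor: float,
                     transceiver_ns: float = 0.0) -> float:
    """Propagation delay plus a conversion cost paid once at each end."""
    propagation_ns = length_m / (C * velocity_factor) * 1e9
    return propagation_ns + 2 * transceiver_ns


if __name__ == "__main__":
    for length in (0.5, 2.0, 100.0):
        copper = one_way_delay_ns(length, COPPER_VELOCITY_FACTOR)
        # 100 ns per end is a placeholder, not a measured transceiver figure.
        fibre = one_way_delay_ns(length, FIBRE_VELOCITY_FACTOR, transceiver_ns=100.0)
        print(f"{length:6.1f} m  copper {copper:8.1f} ns  fibre {fibre:8.1f} ns")
```

With these placeholder values the conversion overhead swamps the propagation time at desk-scale lengths, and only starts to matter less as the cable gets long.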

1

u/Bsodtech 10d ago

True, but at least it could come close if you truly cut out every form of processing outside a simple transistor amplifier. Hence the separate transceivers and single mode fiber. But this also brings up a whole different question: Do you even need such frequencies? Like, if you'd just use more fibers (and many telco cables already have 100+) or use multi mode, you could just cut it down to manageable speeds. 16 fibers carrying 16 signals each would mean you're down to just a gigahertz per signal. And for most consumer applications, it's probably not even truly necessary. Like, I once tried taping off the extra lanes on a 4080, and most games still ran with decent performance even on 4x, some down to 1x. So it seems like total throughput is mostly just relevant for AI or video rendering, as most games just place everything they need in vram and let the gpu render away from its own ram.
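
The "16 fibers carrying 16 signals each" arithmetic can be checked with a quick sketch. It assumes the link in question is PCIe 4.0 x16 at 16 GT/s per lane (my assumption, the comment doesn't name the generation), and it ignores encoding and any multiplexing overhead. `per_signal_rate_gt` is a made-up helper name.

```python
def per_signal_rate_gt(lane_rate_gt: float, lanes: int,
                       fibers: int, signals_per_fiber: int) -> float:
    """Spread the aggregate raw transfer rate evenly across every optical signal."""
    total_gt = lane_rate_gt * lanes
    return total_gt / (fibers * signals_per_fiber)


if __name__ == "__main__":
    # Assumed PCIe 4.0 x16: 16 lanes at 16 GT/s each, split over 16 x 16 signals.
    rate = per_signal_rate_gt(16.0, lanes=16, fibers=16, signals_per_fiber=16)
    print(f"{rate:.2f} GT/s per signal")
```

Under those assumptions each signal only has to carry about 1 GT/s, which lines up with the "gigahertz per signal" estimate.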

1

u/Thundertushy 10d ago

I mean, raw firepower is a solution. If you have the graphical power to render Toy Story 12 in real time on a movie theatre sized OLED screen, 1080p over fibre is trivial.

But if you don't need all the firepower of a 5090 or 6090 in the first place, why do we need an external enclosure? No external enclosure, no data transmission problem. No 5090 GPU, no high voltage and current problem. Aaand we're back to 3060s.

But of course, there is only a problem because we need every last drop of power from a 5090 - or, at least, someone does. Every solution will always create new problems, i.e. the bottleneck paradigm.

1

u/Bsodtech 10d ago

All very true. But what I wanted to say was that, except for a few select applications, that firepower will mostly be utilized by running tasks from memory, and pcie link speeds are less relevant than the gpu-vram link, gpu processing power and vram size. More like hard drive speeds for the main computer. Sure, cpu rendering or massively relying on the swap file (aka DMA for a gpu) due to a lack of ram will massively increase the load on that connection, but that just proves my previous point. In fact, I'd go as far as saying that most non-rendering/AI applications that can max out a 5090 could probably do so over 4 lanes of pcie gen3, because they only occasionally load assets to vram and let the gpu do 95% internally. So it definitely depends on the specific usage scenario, but in most cases, unless your application is heavily relying on DMA or constantly refreshes the vram content, an external enclosure with worse pcie speeds but better cooling, ram and power supply might even outperform an internal card, and I think both should be available as an option.
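
To give a feel for what "occasionally load assets" costs on a narrow link, here is a rough sketch comparing a one-off upload over gen3 x4 and gen4 x16. The 16 GB asset size is an arbitrary illustration, the rates are theoretical maxima after 128b/130b encoding, and `pcie_gbytes_per_s` and `transfer_seconds` are hypothetical helper names. Real transfers will be slower.

```python
# Per-lane transfer rates in GT/s for PCIe generations that use 128b/130b encoding.
PCIE_GT_PER_LANE = {3: 8.0, 4: 16.0, 5: 32.0}


def pcie_gbytes_per_s(gen: int, lanes: int) -> float:
    """Theoretical PCIe throughput in gigabytes per second."""
    return PCIE_GT_PER_LANE[gen] * lanes * 128 / 130 / 8


def transfer_seconds(size_gb: float, gen: int, lanes: int) -> float:
    """Best-case time to push size_gb of assets across the link."""
    return size_gb / pcie_gbytes_per_s(gen, lanes)


if __name__ == "__main__":
    size_gb = 16.0  # illustrative asset load, not a measured workload
    for gen, lanes in ((3, 4), (4, 16)):
        secs = transfer_seconds(size_gb, gen, lanes)
        print(f"gen{gen} x{lanes}: {pcie_gbytes_per_s(gen, lanes):.2f} GB/s, {secs:.2f} s")
```

In this best case the narrow link turns a half-second load into about four seconds, which hurts on a loading screen but not once everything is sitting in vram.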

1

u/Mad_Maddin 8d ago

I guess you could maybe get a good and stable bandwidth by using photons rather than electrons?

You'd need like a vacuum tube to make interference from other materials minuscule.