We could do with some GPU acceleration....
Knaxknarke
- Posts: 21
- Joined: Thu Nov 23, 2006 10:14 am
Re: We could do with some GPU acceleration....
CoolColJ wrote: When Nvidia cards can do 3 teraflops!
http://redline-oc.spaces.live.com/blog/ ... !287.entry
Hi, I can't believe the claim in that link. More likely it will be a 65 nm, power-saving, PCIe 2.0 version of the G80. I think the G100 will be the next big step. But 3 TFLOPS - no way! The G80 does somewhere between 300 and 500 GFLOPS (it depends a lot on which ALUs you can actually use in parallel), and never count the texture filtering and blending units - they won't help you much with GPGPU work. If the spec had said 1 TFLOPS it would at least be a more realistic lie.
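To put a number on that (I'm assuming the usual 8800 GTX figures here, 128 stream processors at a 1.35 GHz shader clock - nothing from the linked page):

Code:

#include <stdio.h>

/* Back-of-envelope peak for a G80-class card (8800 GTX numbers assumed:
   128 stream processors, 1.35 GHz shader clock). */
int main(void)
{
    const double sps        = 128.0;  /* scalar stream processors             */
    const double clock_ghz  = 1.35;   /* shader clock                         */
    const double madd_flops = 2.0;    /* marketing counts one MADD as 2 FLOPs */

    double madd_peak = sps * clock_ghz * madd_flops;        /* ~346 GFLOPS */
    double with_mul  = sps * clock_ghz * (madd_flops + 1);  /* ~518 GFLOPS if the
                                                               co-issued MUL is counted too */
    printf("MADD-only peak : %.0f GFLOPS\n", madd_peak);
    printf("MADD+MUL 'peak': %.0f GFLOPS\n", with_mul);
    return 0;
}

So even the marketing number lands around 350-520 GFLOPS, nowhere near 3 TFLOPS.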
And: computing and ray tracing are not only about FLOPS. You also need to get the data in and out, and that isn't so simple with current GPU architectures: the caching is poor, and all this Giga-blah from ATI/NV only applies to toy problems that align nicely to 2D arrays and do a lot of float ops per byte of I/O traffic.
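A quick sanity check on the FLOPs-per-I/O point (the 86.4 GB/s bandwidth is the usual 8800 GTX figure, and the SAXPY-like kernel is just my toy example):

Code:

#include <stdio.h>

/* Roofline-style estimate: a streaming kernel like SAXPY (y = a*x + y) does
   2 FLOPs per element but moves 12 bytes (load x, load y, store y), so the
   memory bus, not the ALUs, sets the limit. */
int main(void)
{
    const double peak_gflops  = 345.6;  /* MADD-only peak assumed above  */
    const double bw_gb_s      = 86.4;   /* assumed G80 memory bandwidth  */
    const double flops_per_el = 2.0;    /* one MADD per element          */
    const double bytes_per_el = 12.0;   /* 2 float loads + 1 float store */

    double intensity  = flops_per_el / bytes_per_el;        /* FLOPs per byte  */
    double attainable = bw_gb_s * intensity;                /* bandwidth bound */
    if (attainable > peak_gflops) attainable = peak_gflops; /* compute bound   */

    printf("arithmetic intensity: %.3f FLOPs/byte\n", intensity);
    printf("attainable: ~%.1f of %.1f peak GFLOPS\n", attainable, peak_gflops);
    return 0;
}

That comes out at roughly 14 GFLOPS - that is what the Giga-blah looks like once the data actually has to move.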
BTW, they count the MADD instruction as 2 FLOPs. And ray tracing isn't all about MADD (well, it can speed up the intersection calculation and maybe some shading).
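To show what I mean, here is a minimal ray-sphere intersection sketch (plain textbook math, nothing from the CTM/CUDA papers) - most of it maps to multiply-adds, but the sqrtf and the branches do not:

Code:

/* Minimal ray-sphere intersection (textbook quadratic form, |d| == 1 assumed).
   The dot products and the discriminant are MADD chains; sqrtf and the
   branches around the miss case are not. */
__device__ bool intersect_sphere(float3 o, float3 d,          /* ray origin, unit direction */
                                 float3 c, float r, float *t) /* sphere center, radius      */
{
    float3 oc = make_float3(o.x - c.x, o.y - c.y, o.z - c.z);
    float b    = oc.x * d.x + oc.y * d.y + oc.z * d.z;            /* dot(oc, d)        */
    float cc   = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - r * r; /* dot(oc, oc) - r^2 */
    float disc = b * b - cc;
    if (disc < 0.0f) return false;                  /* ray misses the sphere */
    float s = sqrtf(disc);
    *t = (-b - s > 0.0f) ? (-b - s) : (-b + s);     /* nearest positive hit  */
    return *t > 0.0f;
}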
What really matters in GPGPU programming, for example with G80/CUDA, is to use only a handful of registers (so more hardware threads can be in flight to hide memory latency), to write code with few conditionals so that the SIMD groups (called "warps" in CUDA) all do the same thing (no pipeline bubbles for masked-out threads), to use the shared (on-chip) memory for much faster access, to align the memory accesses within a warp, and so on.
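A minimal CUDA sketch of those points (the kernel and every name in it are made up, not from a real ray tracer): consecutive threads of a warp read consecutive addresses, one tile of data is staged in the on-chip shared memory, the register count stays tiny, and the only branch is the bounds check, so a whole warp takes the same path.

Code:

#define TILE 256  /* one 256-thread block per tile; 1 KB of the 16 KB shared memory */

/* Toy kernel: out[i] = a[i] * sum(tile of b that this block loaded). */
__global__ void tile_scale(const float *a, const float *b, float *out, int n)
{
    __shared__ float tile[TILE];                /* fast on-chip memory, per block */

    int i = blockIdx.x * blockDim.x + threadIdx.x;

    /* Coalesced load: thread k of a warp reads element base+k. */
    tile[threadIdx.x] = (i < n) ? b[i] : 0.0f;
    __syncthreads();                            /* whole tile visible to the block */

    /* Every thread runs the same loop -> no warp divergence, few registers. */
    float s = 0.0f;
    for (int k = 0; k < TILE; ++k)
        s += tile[k];                           /* shared-memory reads + MADDs */

    if (i < n)                                  /* the only branch: bounds check */
        out[i] = a[i] * s;                      /* coalesced load and store */
}

Launched with something like tile_scale<<<(n + TILE - 1) / TILE, TILE>>>(d_a, d_b, d_out, n), so blockDim.x matches TILE.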
There are papers about ray tracing engines using ATI/CTM and NV/CUDA (Stanford and Saarbrücken). There is no 100x speedup over fast real-time CPU ray tracing; they are happy if they reach the speed of a good Intel ray tracer on a dual quad-core system (Penryn?).
In fact Intel is investing a lot in real-time ray tracing research, and then there may be Larrabee at the end of 2008 or the start of 2009.
If ATI/AMD and Nvidia start offering better GPGPU developer support (bug-free compilers, useful profiling tools, no hiding of technical details, ...), then they may have a chance in HPC. Otherwise they will live and die with game rasterization.
But CTM is a joke (where is the high-level compiler?) and CUDA isn't mature enough just yet (buggy compiler).
Knaxknarke
- Posts: 21
- Joined: Thu Nov 23, 2006 10:14 am
ZomB wrote: Speed isn't that important, but the lack of double precision support is...
Double precision will come in the next Gxx/CUDA generation, but it will only be supported in the Quadro and Tesla lines, not in GeForce - so not for the mass market and consumers. The Quadros are much too expensive; for that money you are better off buying two quad-core Xeons.
The Tesla PCIe board may be a good choice for GPGPU, but it doesn't do 3D rasterization graphics, so you still need a GeForce for the display.
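For illustration, this is the kind of double-precision kernel I mean (made up by me; the -arch=sm_13 target is my guess at how the next generation gets exposed - on G80/G9x nvcc simply demotes double to float with a warning):

Code:

/* Made-up example: per-block partial dot product in double precision.
   Assumes blockDim.x == 256 (a power of two) for the reduction.
   Presumably needs something like: nvcc -arch=sm_13 dotp.cu */
__global__ void dotp_partial(const double *x, const double *y, double *partial, int n)
{
    __shared__ double cache[256];               /* 2 KB of shared memory */
    int i   = blockIdx.x * blockDim.x + threadIdx.x;
    int tid = threadIdx.x;

    cache[tid] = (i < n) ? x[i] * y[i] : 0.0;
    __syncthreads();

    /* Tree reduction inside the block. */
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s)
            cache[tid] += cache[tid + s];
        __syncthreads();
    }
    if (tid == 0)
        partial[blockIdx.x] = cache[0];         /* one partial sum per block */
}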
ZomB wrote: GPU accelerated rendering should be the main boom for commercial biased rendering engines in 2008 (at least I hope so ^^).
It could also speed up every app that does ray shooting, for biased or unbiased light transport. But it won't be interactive or real-time any time soon.