The Turing architecture comprises three processors: the familiar compute and shading processor, plus two brand-new processing units dedicated to ray tracing and deep learning respectively. The announcement was made on Tuesday by Nvidia founder and chief executive Jensen Huang at SIGGRAPH 2018 in Vancouver.
“This is a historic moment,” Huang said.
“Nvidia started with its pursuit to produce the most amazing imagery within one-thirtieth of a second, at the price point consumers would pay for.”
The announcement added a fourth major milestone, he said.
“We have been able to operate and optimise performance across the stack, removing bottlenecks, and creating new ideas that break Moore’s law. All this horsepower and performance is following the road to real-time photorealism with amazing amounts of geometry in a scene, physically-based materials, simulated physics, facial animation and character animation.”
However, “there has been one enormous roadblock, one as fundamental as can be – the simulation of light. We can do amazing things, but they take a lot of effort like placing invisible lights in the environment. Simulating light for photorealism has been pursued as the holy grail of computer graphics since Turner Whitted first wrote about it in 1979.”
Whitted’s algorithm, multi-bounce recursive ray tracing, is computationally intensive. Back in 1979, Whitted needed 1.2 hours of computation on a VAX-11/780 to render a single frame. He estimated that real-time ray tracing would take a Cray supercomputer behind every single pixel.
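To make the recursion concrete, here is a minimal sketch of the Whitted idea in Python: each ray is shaded locally at its nearest hit, then spawns a mirror-reflection ray that is traced the same way, up to a small depth limit. The scene, material values and light direction below are invented purely for illustration and have nothing to do with Whitted's original scene or Nvidia's hardware.

```python
import math

def sub(a, b): return (a[0]-b[0], a[1]-b[1], a[2]-b[2])
def add(a, b): return (a[0]+b[0], a[1]+b[1], a[2]+b[2])
def dot(a, b): return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]
def scale(a, s): return (a[0]*s, a[1]*s, a[2]*s)
def norm(a): return scale(a, 1.0 / math.sqrt(dot(a, a)))

# Toy scene: a reflective sphere floating above a huge matte "floor" sphere.
SPHERES = [
    {"c": (0.0, 0.0, -3.0),    "r": 1.0,   "albedo": 0.6, "reflect": 0.4},
    {"c": (0.0, -101.0, -3.0), "r": 100.0, "albedo": 0.8, "reflect": 0.0},
]
LIGHT_DIR = norm((1.0, 1.0, 0.5))  # directional light (illustrative)

def hit_sphere(origin, d, s):
    """Distance to the nearest intersection with sphere s, or None (d is unit-length)."""
    oc = sub(origin, s["c"])
    b = 2.0 * dot(oc, d)
    c = dot(oc, oc) - s["r"] ** 2
    disc = b * b - 4 * c
    if disc < 0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 1e-4 else None

def trace(origin, d, depth=0):
    """Whitted-style recursion: local diffuse shading plus one mirror-reflected ray."""
    nearest, t_min = None, float("inf")
    for s in SPHERES:
        t = hit_sphere(origin, d, s)
        if t is not None and t < t_min:
            nearest, t_min = s, t
    if nearest is None:
        return 0.1  # background brightness
    p = add(origin, scale(d, t_min))          # hit point
    n = norm(sub(p, nearest["c"]))            # surface normal
    local = nearest["albedo"] * max(0.0, dot(n, LIGHT_DIR))
    if depth >= 2 or nearest["reflect"] == 0.0:
        return local                          # recursion bottoms out
    refl_dir = sub(d, scale(n, 2.0 * dot(d, n)))  # mirror reflection of d about n
    return local + nearest["reflect"] * trace(p, refl_dir, depth + 1)

def render(width=16, height=16):
    """Shoot one primary ray per pixel from a pinhole camera at the origin."""
    img = []
    for y in range(height):
        row = []
        for x in range(width):
            u = (x + 0.5) / width * 2 - 1
            v = 1 - (y + 0.5) / height * 2
            row.append(trace((0.0, 0.0, 0.0), norm((u, v, -1.0))))
        img.append(row)
    return img
```

Even this toy version shows why the cost explodes: every reflective hit multiplies the ray count, which is exactly the work the RT Core is designed to accelerate.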
“Thirty years later we can put a Cray supercomputer behind every pixel,” Huang said, announcing the world’s first real-time ray-tracing GPU, the Quadro RTX. Built on Nvidia’s Turing architecture, the RTX combines a powerful new compute and shader processor with a new ray-tracing processor named RT Core, and a further new processor for deep learning and artificial intelligence named Tensor Core.
The specifications include:
- Up to 10 Giga Rays per second
- Up to 16 TFLOPS plus 16 TIPS, with floating-point and integer calculations performed concurrently
- Up to 500 trillion tensor operations per second
- Up to 100 GB/sec data transfer over NVLink, a brand-new GPU interconnect that lets one RTX card access the frame buffer of another, effectively doubling the frame buffer capacity.
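To put the ray budget in perspective, a quick back-of-the-envelope calculation (my own, not a figure Nvidia quoted) shows how many rays per pixel per frame 10 Giga Rays per second buys at common resolutions, assuming the entire budget goes to shading:

```python
# Back-of-the-envelope: rays available per pixel per frame at a given
# resolution and frame rate, assuming the full ray budget is spent on pixels.
def rays_per_pixel(giga_rays, width, height, fps):
    return giga_rays * 1e9 / (width * height * fps)

full_hd = rays_per_pixel(10, 1920, 1080, 60)   # roughly 80 rays/pixel/frame
uhd_4k  = rays_per_pixel(10, 3840, 2160, 60)   # roughly 20 rays/pixel/frame
```

A handful of rays per pixel is far short of offline film quality, which is why the denoising role of the Tensor Core, described next, matters so much.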
The Tensor Core aids the ray-tracing engine by letting it render fewer pixels and then using deep learning to infer the missing ones, filling in the scene and producing the final image in less time with less computation. The network was trained on very high-quality images, Huang said.
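The render-less-and-infer-the-rest idea can be illustrated with a toy sketch. Nvidia's actual infill is a trained deep network whose details are not public; in the sketch below a simple neighbour average stands in for that network, purely to show the shape of the pipeline (compute some pixels, infer the gaps):

```python
def fill_missing(sparse):
    """Toy stand-in for the deep-learning infill step: the renderer has
    computed only some pixels (the rest are None), and the 'inference'
    pass fills each gap from its known horizontal neighbours.
    Nvidia's real infill is a trained DNN; this average is illustrative only."""
    out = []
    for row in sparse:
        filled = list(row)
        for i, v in enumerate(filled):
            if v is None:
                known = [filled[j] for j in (i - 1, i + 1)
                         if 0 <= j < len(filled) and filled[j] is not None]
                filled[i] = sum(known) / len(known) if known else 0.0
        out.append(filled)
    return out
```

The payoff is that the expensive ray-traced pixels can be halved (or better) while the cheap inference pass restores a full-resolution image.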
“No CPU in the history of CPU has ever commandeered this much computational power on one chip,” Huang said.
By comparison, Nvidia's Pascal architecture offers 11.8 billion transistors with 24 GB RAM and 10 Gbps memory throughput. The Turing architecture delivers 18.6 billion transistors with 48 GB RAM (plus another 48 GB via NVLink) at 14 Gbps.
Turing renders images at six times the speed of Pascal when running Epic's Unreal Engine 4 real-time ray tracing on top of Microsoft's DirectX Raytracing.
Software-wise, Turing includes a plug-in architecture for deep learning and offers interoperability between rasterisation, ray tracing, compute and AI. It provides hardware ray-tracing acceleration in OptiX, DXR and Vulkan, includes the NGX SDK for DNN plug-ins, supports Pixar’s Universal Scene Description (USD) language, and provides the new open-source Nvidia Material Definition Language (MDL).
Nvidia’s new GPU brings real-time ray tracing to market “years before anyone thought possible and it’s going to completely change how artists and designers work”, said Tim Sweeney, chief executive, Epic Games.
Huang said that Turing and the Quadro RTX will not only make amazing games today and in the future, but also open access to the $250-billion visual effects industry, which until now could not use GPU acceleration because it requires photorealism with global illumination, physically-based materials and very large, detailed assets. As a result, he said, sectors such as architecture, engineering, film and television still run mostly on CPUs, operating vast multi-million-dollar render farms in which artists pre-compute the lighting.
To serve that industry, Nvidia also announced the Nvidia RTX Server for production rendering with global illumination, reducing rendering time from hours to minutes. The server will be available in early access in the fourth quarter, with general release the following quarter.
Huang said a typical film-industry render farm comprises 240 dual-socket 12-core Skylake CPU servers, drawing 144 kW of power and costing US$2 million. The same rendering capacity, he said, can be delivered by four eight-GPU RTX servers consuming 13 kW and priced at US$500,000: one-tenth the space, a quarter of the cost and one-eleventh the power consumption.
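The quoted ratios check out against the stated figures, as a little arithmetic confirms (note that "one-tenth the space" refers to physical rack space rather than server count, since 240 versus 4 servers would be a sixty-fold reduction; the RTX units are presumably much larger boxes):

```python
# Figures as quoted in Huang's keynote comparison.
cpu_farm = {"servers": 240, "power_kw": 144, "cost_usd": 2_000_000}
gpu_farm = {"servers": 4,   "power_kw": 13,  "cost_usd": 500_000}

cost_ratio  = cpu_farm["cost_usd"] / gpu_farm["cost_usd"]   # 4.0 -> "a quarter of the cost"
power_ratio = cpu_farm["power_kw"] / gpu_farm["power_kw"]   # ~11.1 -> "one-eleventh the power"
```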
“A three-second shot can be produced in an hour instead of four to five hours,” Huang said, taking film producers from two shots per day to seven.
“We never expected to see results this dramatic. This will completely change how our artists work,” said Michele Sciolette, chief technology officer, Cinesite.
Nvidia says its RTX processor will be supported by Adobe Dimension CC, Autodesk Arnold, SolidWorks, RenderMan, Siemens NX, Unity, Unreal Engine and other leading imaging products.
The Quadro RTX will be released in three models:
- The Quadro RTX 5000 with 16 GB RAM (32 GB using NVLink) and 6 Giga Rays per second, for US$2300
- The Quadro RTX 6000 with 24 GB RAM (48 GB using NVLink) and 10 Giga Rays per second, for US$6300
- The Quadro RTX 8000 with 48 GB RAM (96 GB using NVLink) and 10 Giga Rays per second, for US$10,000.
Rendering the first Toy Story movie took 800,000 hours of computation on 100 MHz SPARC workstations, Huang said. Nvidia's announcement brings real-time rendering to the next generation of film and games.
The writer attended SIGGRAPH 2018 as a guest of Nvidia.