Nvidia’s new flagship single slot workstation GPU might not offer as big a gen-on-gen performance leap as some 4000-class graphics cards from the past, but it’s a very solid performer for CAD-centric visualisation, writes Greg Corke
Over the years, Nvidia’s 4000 class pro GPUs have formed a very important part of the company’s workstation GPU portfolio. With a total board power of around 150W, the single slot graphics cards have found their way into an extremely broad range of workstations, from the entry-level to the high-end. They have provided architects, engineers, and designers with enough horsepower to augment CAD and BIM modelling with more demanding workflows such as real time visualisation and VR.
The latest incarnation, the Nvidia RTX 4000 Ada Generation, is based on the AD104 graphics processor, the exact same chip found in the dual slot, small form factor version, the Nvidia RTX 4000 SFF Ada Generation (read our review).
Both GPUs feature 20 GB of GDDR6 memory, 4 GB more than the previous ‘Ampere’ generation Nvidia RTX A4000. However, the full sized Nvidia RTX 4000 Ada Generation has more memory bandwidth than its SFF sibling (360 GB/s vs 280 GB/s). It also draws more power, and hence offers more performance.
The Nvidia RTX 4000 Ada has 130W to play with compared to the SFF version, which has 70W. This means the processor can be clocked higher, resulting in more horsepower across the board. Single precision performance is rated at 26.7 TFLOPs versus 19.2, RT Core performance for ray tracing is rated at 61.8 TFLOPs versus 44.3, and Tensor performance for AI is rated at 327.6 TFLOPs versus 306.8.
Despite there being a significant performance difference between them on paper, both GPUs cost the same (£1,159 + VAT). In terms of price/performance, this means users pay a premium to have the AD104 chip in a small form factor or super compact workstation like the HP Z2 Mini or Lenovo ThinkStation P3 Ultra.
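To put the paper difference into numbers, the short Python sketch below works only from the rated figures quoted above. The function names and the performance-per-watt metric are our own illustrative choices rather than anything Nvidia publishes, and rated figures say nothing about real-world application performance.

```python
# Rated figures quoted in this review: (RTX 4000 Ada, RTX 4000 SFF Ada)
SPECS = {
    "Board power (W)": (130.0, 70.0),
    "Memory bandwidth (GB/s)": (360.0, 280.0),
    "Single precision (TFLOPs)": (26.7, 19.2),
    "RT Core (TFLOPs)": (61.8, 44.3),
    "Tensor (TFLOPs)": (327.6, 306.8),
}


def percent_uplift(full: float, sff: float) -> float:
    """Percentage by which the full-size card's figure exceeds the SFF card's."""
    return (full / sff - 1.0) * 100.0


def per_watt(value: float, watts: float) -> float:
    """Rated throughput per watt of board power (a paper metric only)."""
    return value / watts


if __name__ == "__main__":
    for name, (full, sff) in SPECS.items():
        print(f"{name}: +{percent_uplift(full, sff):.1f}%")

    full_w, sff_w = SPECS["Board power (W)"]
    full_fp32, sff_fp32 = SPECS["Single precision (TFLOPs)"]
    print(f"Full size: {per_watt(full_fp32, full_w):.3f} TFLOPs per watt")
    print(f"SFF:       {per_watt(sff_fp32, sff_w):.3f} TFLOPs per watt")
```

On these figures, the full-size card’s rated single precision and RT Core performance sit about 39% above the SFF card’s, in exchange for about 86% more board power. By the same paper measure, the SFF card delivers more rated throughput per watt.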
Interestingly, the SFF version comes with a half-height bracket and optional full height ATX bracket, so can be used in full sized towers as well. This doesn’t really make sense unless you are particularly focused on using less power.
The Nvidia RTX 4000 Ada Generation cannot draw all of its electricity from the PCIe slot on a workstation motherboard, so gets its additional power from a 16-pin 12VHPWR connector. This modern connector is only supported directly on new generation Power Supply Units (PSUs). It’s not a problem if your workstation doesn’t have one. Nvidia includes a standard 8-pin to 16-pin adapter, so you can plug into your existing PSU.
The pro viz workhorse
For testing, we compared the Nvidia RTX 4000 Ada Generation (20 GB) to the SFF version, as well as the previous Ampere generation, Nvidia RTX A4000 (16 GB).
Here, it’s important to state that our benchmark comparisons aren’t perfect, as not all GPUs were tested in the same workstation. For the Nvidia RTX 4000 Ada Generation and Nvidia RTX A4000 we used the 537.7 driver inside an AMD Ryzen Threadripper Pro 7800X workstation from Armari. For the Nvidia RTX 4000 SFF Ada Generation, we used the 536.25 driver inside an AMD Ryzen 7950X3D-based workstation, also from Armari. PNY loaned the Nvidia RTX 4000 SFF GPU to us several months ago, so we no longer have the card.
As both CPUs are built around the same AMD ‘Zen 4’ architecture and hit similar single core frequencies, any variance from testing on two different systems should be very small.
Our testing focused predominantly on design viz, with the architecture-focused Twinmotion, Lumion, V-Ray and Nvidia Omniverse, the product design-focused KeyShot, and Unreal Engine with an automotive model. We also tested with CAD software Solidworks, although for a GPU with this much horsepower, it’s a given that you’ll get good performance.
The big trend in visualisation at the moment is the expansion of GPU ray tracing for much more realistic renders. GPU ray tracing used to be handled exclusively by offline renderers like V-Ray and KeyShot. Now it’s also available in real time viz tools, typically enabled through the graphics APIs DirectX 12 with DirectX Ray Tracing (DXR) and Vulkan Ray Tracing.
In Unreal Engine, Twinmotion, Lumion, Omniverse and others, users can choose between standard rasterisation, or ray tracing to increase visual quality and realism. This is all done on the GPU and has largely been driven by Nvidia, and the dedicated RT and Tensor cores in its RTX cards. In the Nvidia RTX 4000 Ada, the RT cores are third generation, and the Tensor cores are fourth generation. While it has the same number of cores as the Ampere-based Nvidia RTX A4000, all are one generation ahead.
As you would expect, ray tracing increases the load on the GPU. In Unreal Engine, when testing at 4K resolution with the Audi car configurator model, viewport performance went down from 29.64 Frames Per Second (FPS) with ray tracing disabled to 18.31 FPS with it enabled. This is below the ideal 24 FPS, but navigation within the scene was still relatively smooth. On paper, it’s not a massive improvement over the Nvidia RTX A4000 (14.11 FPS) but there was a noticeable difference in terms of how easy it was to quickly and accurately reposition the model in the viewport of Unreal Engine Editor.
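The frame rates above are easier to compare as percentages and frame times. The sketch below does that arithmetic in Python. It assumes the Nvidia RTX A4000 figure of 14.11 FPS was also recorded with ray tracing enabled, and the helper names are ours.

```python
# Unreal Engine viewport figures quoted above (4K, Audi car configurator model)
RASTER_FPS = 29.64     # RTX 4000 Ada, ray tracing disabled
RT_FPS = 18.31         # RTX 4000 Ada, ray tracing enabled
A4000_RT_FPS = 14.11   # RTX A4000 (assumed: ray tracing enabled)
TARGET_FPS = 24.0      # the "ideal" frame rate mentioned above


def percent_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new / old - 1.0) * 100.0


def frame_time_ms(fps: float) -> float:
    """Average time spent on each frame, in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    print(f"Cost of ray tracing: {percent_change(RT_FPS, RASTER_FPS):.1f}%")
    print(f"Gain over RTX A4000: {percent_change(RT_FPS, A4000_RT_FPS):+.1f}%")
    for label, fps in [
        ("RTX 4000 Ada", RT_FPS),
        ("RTX A4000", A4000_RT_FPS),
        ("24 FPS target", TARGET_FPS),
    ]:
        print(f"{label}: {frame_time_ms(fps):.1f} ms per frame")
```

Under that assumption, enabling ray tracing costs roughly 38% of the frame rate, and the newer card is roughly 30% faster than the RTX A4000. In frame-time terms that is about 55 ms per frame against about 71 ms, with the 24 FPS target at about 42 ms.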
The rest of our main tests were all about render times, expressed either in seconds (smaller is better) or as a benchmark score (bigger is better). In general, the Nvidia RTX 4000 Ada Generation was about 25% to 30% faster than the Nvidia RTX A4000. However, in V-Ray it was only 12% faster and in Twinmotion with path tracer enabled there was no difference at all.
Comparisons to the Nvidia RTX 4000 SFF Ada Generation were, in some cases, quite unexpected. The full-sized Nvidia RTX 4000 Ada card had a clear lead in real time viz tools: Unreal Engine (41%) and Lumion (32% with default render and 25% with ray trace effect). However, in the offline renderers V-Ray and KeyShot, this lead was very slender (2% and 4%).
This could be explained by how much power the full sized Nvidia RTX 4000 Ada card consumes when rendering in those applications. In Unreal Engine, it uses close to its maximum 130W. In KeyShot and V-Ray, however, it uses considerably less (100W), much closer to the 70W maximum of the SFF version. It could also be that core count is more important than frequency in those applications (as a reminder, the chip in both cards is identical; it’s just clocked higher in the RTX 4000 Ada), or that the SFF version can maintain higher boost frequencies in those tests. Without having an SFF card at our disposal, it’s hard to tell.
Deep Learning Super Sampling (DLSS)
The Nvidia RTX 4000 Ada Generation also brings other technical advancements to the table. One of those is the Frame Generation feature in Nvidia’s Deep Learning Super Sampling (DLSS) 3 technology, which is supported exclusively on Nvidia Ada Generation GPUs.
Nvidia DLSS has been around for several years and with ‘Ada’, it is now on its third generation. It uses the GPU’s AI Tensor cores to boost frame rates in real time applications.
With Nvidia’s previous generation ‘Ampere’ GPUs, DLSS 2 took a low resolution current frame and the high resolution previous frame to predict, on a pixel-by-pixel basis, what a high resolution current frame would look like.
With Frame Generation in DLSS 3, the Tensor cores generate entirely new frames rather than just pixels. The technology processes the new frame and the prior frame to discover how the scene is changing as you navigate it. It then generates entirely new frames without having to process the graphics pipeline.
DLSS 3 Frame Generation has been implemented in Nvidia Omniverse, Autodesk VRED, Chaos Vantage, D5 Render and others.
In Omniverse USD Composer 2023.2.2, we tested out the feature with the Brownstone building sample model. In ‘RTX – Real-Time’ mode with DLSS enabled, the Nvidia RTX 4000 Ada was a substantial 2.24 times faster than the Nvidia RTX A4000. However, there’s a case of comparing apples with pears here, as the Nvidia RTX 4000 Ada uses DLSS 3 while the Nvidia RTX A4000 uses DLSS 2 (see above). That said, we saw no visual difference between the two. In ‘RTX – interactive (path tracing)’ mode, which doesn’t take advantage of DLSS, the Nvidia RTX 4000 Ada was 41% faster, which is closer to what we saw in other applications.
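As a very rough back-of-envelope exercise, the 2.24x figure can be split into a hardware part and a remainder. The Python sketch below assumes, purely for illustration, that the underlying uplift in ‘RTX – Real-Time’ mode would match the 41% seen in path tracing mode. That was not measured, the two modes stress the GPU differently, and the function name is ours, so treat the result as indicative at best.

```python
def residual_factor(total_speedup: float, baseline_uplift_percent: float) -> float:
    """Speedup left over once an assumed baseline uplift is divided out."""
    return total_speedup / (1.0 + baseline_uplift_percent / 100.0)


# Figures quoted above
REAL_TIME_SPEEDUP = 2.24           # 'RTX - Real-Time' mode with DLSS enabled
PATH_TRACING_UPLIFT_PERCENT = 41.0 # path tracing mode, no DLSS

if __name__ == "__main__":
    # ASSUMPTION: the 41% path tracing uplift stands in for the uplift
    # the real-time mode would show without the DLSS 2 vs DLSS 3 difference.
    factor = residual_factor(REAL_TIME_SPEEDUP, PATH_TRACING_UPLIFT_PERCENT)
    print(f"Remaining factor under this assumption: {factor:.2f}x")
```

If the assumption held, a factor of roughly 1.6x would be left over after the generational uplift, which is one way to gauge how much of the headline number may come from the move from DLSS 2 to DLSS 3.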
Normally with a new pro GPU, one would be very happy with a 25% to 41% performance improvement over the previous generation. This is what you’ll typically get if upgrading to the Nvidia RTX 4000 Ada Generation from the Nvidia RTX A4000 — although in some workflows the boost is much smaller.
The problem is, Nvidia has set expectation levels very high. In 2021, when it went from the ‘Turing’ Nvidia Quadro RTX 4000 to the ‘Ampere’ Nvidia RTX A4000, the generation-on-generation leap was much higher. In most real time viz tools you were talking about 45% to 60%. In KeyShot it was as much as 89%.
Of course, most workstation users don’t have the luxury of upgrading their GPU every two years. Many will be looking to make the move from 2019’s ‘Turing’ Nvidia Quadro RTX 4000. Those on tight budgets may perceive better value in the Nvidia RTX A4000, which is still available for £812 + VAT, or the AMD Radeon Pro W7700 (16 GB) (£833 + VAT) (read our review), but this could end up being a false economy over the lifetime of the card.
First, with either of those cards you get 4 GB less memory, and as datasets continue to swell, applications and operating systems become more memory hungry, and multi-application workflows become more prevalent, that 4 GB could be extremely important.
Second, you miss out on technologies exclusive to Ada Generation GPUs, which can deliver real benefits. Frame Generation in DLSS 3, for example, can increase frame rates quite considerably.
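The value question can be framed with some simple arithmetic on the prices and memory sizes quoted in this review. In the Python sketch below, the `Card` class and helper names are our own illustrative choices. Prices exclude VAT and will change over time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    price_gbp: float  # ex VAT, as quoted in this review
    memory_gb: int

    @property
    def price_per_gb(self) -> float:
        """Price divided by memory capacity, in pounds per GB."""
        return self.price_gbp / self.memory_gb


def premium_percent(card: Card, baseline: Card) -> float:
    """How much more the card costs than the baseline, in percent."""
    return (card.price_gbp / baseline.price_gbp - 1.0) * 100.0


ADA = Card("Nvidia RTX 4000 Ada Generation", 1159.0, 20)
A4000 = Card("Nvidia RTX A4000", 812.0, 16)
W7700 = Card("AMD Radeon Pro W7700", 833.0, 16)

if __name__ == "__main__":
    for alt in (A4000, W7700):
        extra = ADA.price_gbp - alt.price_gbp
        print(f"vs {alt.name}: +£{extra:.0f} ({premium_percent(ADA, alt):.1f}% more)")
    for card in (ADA, A4000, W7700):
        print(f"{card.name}: £{card.price_per_gb:.2f} per GB")
```

On these prices, the Nvidia RTX 4000 Ada Generation costs about 43% more than the Nvidia RTX A4000, for 25% more memory and the 25% to 30% performance uplift we typically measured. That goes some way to explaining why the older card can look like better value, before the extra memory and the Ada-only features are weighed in.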
Of course, the RTX 4000 Ada Generation is only one of many new Ada Generation workstation GPUs from Nvidia. To push performance higher, there’s also the Nvidia RTX 4500 Ada (24 GB) (£2,099), Nvidia RTX 5000 Ada (32 GB) (£3,699) and, if your pockets are really deep, the Nvidia RTX 6000 Ada (48 GB) (£6,700) (read our review). All three powerful dual slot GPUs should significantly boost real time interactivity and cut render times, especially at higher resolutions, and if visualisation is a critical part of your workflow, they should be under serious consideration.
This article is part of AEC Magazine’s Workstation Special report