Nvidia’s new Ampere-based pro GPUs, the Nvidia RTX A4000 and RTX A5000, offer a big step up from the Turing-based Quadro RTX family. With more memory and significantly enhanced processing, they promise to make light work of demanding real-time ray tracing, GPU rendering and VR workflows.
In February 2021 we reviewed the Nvidia RTX A6000, the first pro desktop GPU to be based on Nvidia’s ‘Ampere’ architecture. With 48 GB of memory and buckets of processing power, the dual-slot 300W graphics card is designed for the most demanding visualisation workflows – think city-scale digital twins or complex product visualisations using very high-fidelity textures, such as those captured from real-life scans.
Of course, the Nvidia RTX A6000 is complete overkill for most architects or product designers who simply want a capable GPU for real-time visualisation, GPU rendering or VR. And it’s here that the new Nvidia RTX A4000 and Nvidia RTX A5000 come into play.
Announced at Nvidia’s GTC event this year, the PCIe Gen 4 ‘Ampere’ Nvidia RTX A4000 and Nvidia RTX A5000 are the replacements for the PCIe Gen 3 ‘Turing’ Nvidia Quadro RTX 4000 and Quadro RTX 5000, which launched in 2019.
The RTX A4000 and A5000 are mid-range ‘Quadro’ GPUs in everything but name. Nvidia’s long-serving Quadro workstation brand might be being retired, but the features remain the same.
Both cards offer more memory than their consumer GeForce counterparts, are standard issue in workstations from Dell, HP and Lenovo, and come with pro drivers with ISV certification for a wide range of CAD/BIM applications.
And with an estimated street price of $1,000 for the Nvidia RTX A4000 and $2,250 for the Nvidia RTX A5000, they have much more palatable price tags than the Nvidia RTX A6000, which comes in at $4,650.
Nvidia RTX (Ampere) / Quadro RTX (Turing) comparison
|Specification||Nvidia RTX A4000||Nvidia RTX A5000||Nvidia RTX A6000||Quadro RTX 4000||Quadro RTX 5000||Quadro RTX 6000|
|GPU memory||16 GB GDDR6||24 GB GDDR6||48 GB GDDR6||8 GB GDDR6||16 GB GDDR6||24 GB GDDR6|
|SP perf||19.2 TFLOPS||27.8 TFLOPS||38.7 TFLOPS||7.1 TFLOPS||11.2 TFLOPS||16.3 TFLOPS|
|RT Core perf||37.4 TFLOPS||54.2 TFLOPS||75.6 TFLOPS||N/A||N/A||N/A|
|Tensor perf||153.4 TFLOPS||222.2 TFLOPS||309.7 TFLOPS||57.0 TFLOPS||89.2 TFLOPS||130.5 TFLOPS|
|Graphics bus||PCI-E 4.0 x16||PCI-E 4.0 x16||PCI-E 4.0 x16||PCI-E 3.0 x16||PCI-E 3.0 x16||PCI-E 3.0 x16|
|Connectors||DP 1.4 (4)||DP 1.4 (4)||DP 1.4 (4)||DP 1.4 (3), USB-C||DP 1.4 (4), USB-C||DP 1.4 (4), USB-C|
|Form factor||Single slot||Dual slot||Dual slot||Single slot||Dual slot||Dual slot|
|vGPU software support||N/A||Nvidia RTX vWS||Nvidia RTX vWS||N/A||N/A||Nvidia RTX vWS|
|NVLink||N/A||2x RTX A5000||2x RTX A6000||N/A||2x RTX 5000||2x RTX 6000|
|Power connector||1x 6-pin PCIe||1x 8-pin PCIe||1x 8-pin CPU||1x 6-pin PCIe||1x 8-pin PCIe||2x 8-pin PCIe|
Nvidia RTX A4000 (16 GB)
With 16 GB of GDDR6 ECC memory, the Nvidia RTX A4000 offers a big step up from the 8 GB Quadro RTX 4000. 8 GB is fine for mainstream viz workflows, but it can be limiting for more complex projects, so delivering 16 GB in a sub-$1,000 pro GPU is significant. Previously, 16 GB was only available on the ‘Turing’-based Quadro RTX 5000.
As you’d expect from Nvidia’s new ‘Ampere’ architecture, the Nvidia RTX A4000 also offers a significant improvement in processing. This can be seen in all areas of the GPU, with more CUDA cores for general processing, third-generation Tensor Cores for AI operations and second-generation RT Cores for hardware-based ray tracing. This leads to a substantial performance increase in many different applications (see the test results later in this article).
Furthermore, as the Nvidia RTX A4000 is a single slot GPU with a max power consumption of 140W delivered through a single 6-pin PCIe connector, it’s available in a wide range of desktop workstation form factors. This includes compact towers like the HP Z2 Tower G8 and Dell Precision 3650.
The board features four DisplayPort 1.4a ports and can drive up to four displays at 5K resolution. It is cooled by a single ‘blower’ type fan, which draws in cool air from the top and bottom of the card, pushes it through a radiator and then directly out of the rear of the workstation chassis. This is in contrast to most consumer GeForce GPUs which use axial fans that recirculate air inside the machine.
There are pros and cons to each design, but a blower fan does mean you can stack cards within the chassis without having to leave space between them. This means you can get a very good density of GPUs inside a mid-sized chassis.
With the AMD Threadripper Pro-based Lenovo ThinkStation P620, for example, you could get four Nvidia RTX A4000s back-to-back, which could be a very interesting proposition for GPU rendering. Even though the RTX A4000 doesn’t support NVLink (so there’s no pooling of GPU memory), 16 GB is still a good amount, and two, three or four RTX A4000s could work out well in terms of price/performance compared to the more powerful RTX A5000 or A6000.
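To put rough numbers on that price/performance argument, here is a minimal Python sketch that uses only the single precision (SP) figures from the comparison table and the estimated street prices quoted earlier. The function names are our own, and the multi-GPU helper assumes ideal linear scaling, which real renderers will not quite reach. Paper TFLOPS are also only a crude proxy for rendering speed.

```python
# Paper specs from the comparison table and estimated street prices from this article.
CARDS = {
    "RTX A4000": {"sp_tflops": 19.2, "memory_gb": 16, "price_usd": 1000},
    "RTX A5000": {"sp_tflops": 27.8, "memory_gb": 24, "price_usd": 2250},
    "RTX A6000": {"sp_tflops": 38.7, "memory_gb": 48, "price_usd": 4650},
}


def tflops_per_1000_usd(card):
    """Single precision TFLOPS bought per $1,000 of estimated street price."""
    spec = CARDS[card]
    return spec["sp_tflops"] / spec["price_usd"] * 1000


def multi_gpu(card, count):
    """Aggregate paper figures for `count` identical cards.

    Assumption: throughput scales linearly with the number of GPUs.
    Memory is reported per GPU because, without NVLink, it is not pooled.
    """
    spec = CARDS[card]
    return {
        "sp_tflops": spec["sp_tflops"] * count,
        "price_usd": spec["price_usd"] * count,
        "memory_per_gpu_gb": spec["memory_gb"],
    }


if __name__ == "__main__":
    for name in CARDS:
        print(f"{name}: {tflops_per_1000_usd(name):.1f} SP TFLOPS per $1,000")
    print("2x RTX A4000:", multi_gpu("RTX A4000", 2))
```

On these paper figures, two RTX A4000s roughly match a single RTX A6000 for SP throughput at well under half the price, but each card can still only hold a 16 GB scene.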
Another potential use case for high-density multi-GPU is workstation virtualisation using GPU passthrough, where each user gets a dedicated GPU. Again, this workflow looks well suited to the Lenovo ThinkStation P620, which can be configured with up to 64 CPU cores and 2TB of memory.
Other more niche pro viz features include support for 3D Stereo, Nvidia Mosaic for professional multi-display solutions, and Quadro Sync II, an add-in card that can synchronise the display and image output from multiple GPUs within a single system, or across a cluster of systems.
Nvidia RTX A5000 (24 GB)
With 24 GB of GDDR6 ECC memory, the Nvidia RTX A5000 offers only a 50% memory uplift compared to the Quadro RTX 5000 it replaces.
Like the Nvidia RTX A4000 it offers a significant upgrade in all areas of processing – CUDA, Tensor and RT cores.
It’s a double-height board with a max power consumption of 230W, which it draws from the PSU via an 8-pin PCIe connector, but it’s still available in compact towers.
The board also features four DisplayPort 1.4a ports and is cooled by a single ‘blower’ type fan, but only draws in cool air from one side of the card.
The Nvidia RTX A5000 supports all the same features as the Nvidia RTX A4000 but differs in two main areas.
One, it supports Nvidia NVLink, so GPU memory can be expanded to 48 GB by connecting two 24 GB GPUs together.
Two, it supports Nvidia RTX vWS (virtual workstation software) so it can deliver multiple high-performance virtual workstation instances that enable remote users to share resources. In the Lenovo ThinkStation P620, for example, you could get a very high density of CAD/BIM users who only need high-end RTX performance from time to time.
Testing the Nvidia RTX A4000 / RTX A5000
We put the Nvidia RTX A4000 and Nvidia RTX A5000 through a series of real-world application benchmarks, for GPU rendering, real-time visualisation and 3D CAD.
All tests were carried out using the AMD Ryzen-based Scan 3XS GWP-ME A132R workstation at 4K (3,840 x 2,160) resolution using the latest 462.59 Nvidia driver.
The full spec can be seen below. We’ll be posting a full review soon.
Scan 3XS GWP-ME A132R
- AMD Ryzen 9 5950X CPU (3.4GHz to 4.9GHz) (16 cores)
- 64 GB (2 x 32 GB) Corsair Vengeance DDR4 3200MHz memory
- 2 TB Samsung 980 Pro NVMe PCIe 4.0 SSD system drive
- 4 TB Samsung 870 Evo SATA SSD storage drive
- Asus Pro WS X570-ACE motherboard
- Noctua NH-D15 air cooler
- 750W Corsair RMX, 80PLUS Gold PSU
- 1GbE NIC Networking
- 3XS workstation case with tempered glass window
- Microsoft Windows 10 Professional 64-bit
- 3 Years – 1st Year Onsite, 2nd and 3rd Year RTB (Parts and Labour) warranty
- Price (with Nvidia RTX A4000) £3,333 (Ex VAT) (not yet available).
- Price (with Nvidia RTX A5000) £4,166 (Ex VAT)
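For anyone reproducing tests like these, it is worth confirming which GPU and driver the machine is actually running before benchmarking. The sketch below is one way to do that in Python, assuming the `nvidia-smi` utility that ships with the Nvidia driver is on the system path. The helper names are our own, and the parser expects the three-column CSV output produced by the query shown.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi; memory.total is reported in MiB.
QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_csv(text):
    """Parse `nvidia-smi --query-gpu=... --format=csv,noheader,nounits` output."""
    gpus = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        name, driver, memory_mib = (field.strip() for field in row)
        gpus.append({"name": name, "driver": driver, "memory_mib": int(memory_mib)})
    return gpus


def query_gpus():
    """Run nvidia-smi and return one dict per GPU (requires an Nvidia driver)."""
    cmd = [
        "nvidia-smi",
        f"--query-gpu={','.join(QUERY_FIELDS)}",
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

Calling `query_gpus()` on the test workstation and checking the `driver` value against the intended release (462.59 in our case) helps avoid comparing results across mismatched driver versions.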
For comparison, we used the last two generations of ‘4000’ class Nvidia pro GPUs – the 8 GB ‘Turing’ Nvidia Quadro RTX 4000 (from 2019) and the 8 GB ‘Pascal’ Nvidia Quadro P4000 (from 2017). Three to four years is quite a typical upgrade cycle in workstations, so the intention here is to give a good idea of the performance increase one might expect from an older machine.
We also threw some Nvidia RTX A6000 scores in there. These were done on two different workstations with a 32-core Threadripper Pro 3970X and a quad-core Intel Xeon W-2125 CPU. While both CPUs have lower frequencies and instructions per clock (IPC) than the Ryzen 5950X in our test machine, the results should still give a pretty good idea of comparative performance, especially in GPU rendering software.
Hardware-based ray tracing with the Nvidia RTX A4000 / A5000
It’s been just over two years since Nvidia introduced ‘Turing’ Nvidia Quadro RTX, its first pro GPUs with RTX hardware ray tracing.
In a classic chicken-and-egg launch, there were very few RTX-enabled applications back then, but this has now changed. For design viz, there’s Chaos V-Ray, Chaos Vantage, Enscape, Unreal Engine, Unity, D5 Render, Nvidia Omniverse, Autodesk VRED, KeyShot, Siemens NX Ray Traced Studio, Solidworks Visualize, Catia Live Rendering and others.
Nvidia RTX gave GPU rendering a massive kick start and while there is increased competition from hugely powerful CPUs like the 64-core AMD Threadripper [Pro], we are seeing deeper penetration of GPU rendering tools, especially in architect / engineer / product designer friendly workflows.
Nvidia RTX is being used to massively accelerate classic viz focused ray trace renderers like V-Ray, KeyShot and Solidworks Visualize, which we test below. However, some of the more exciting developments are coming from the AEC sector in tools like Enscape, Chaos Vantage and Unreal Engine, which really make ray tracing ‘real-time’. Vantage, for example, is built from the ground up for real-time ray tracing so can maximise the usage of RT cores within the new GPUs.
Chaos Group V-Ray
V-Ray is one of the most popular physically based rendering tools, especially in architectural visualisation. We put the new cards through their paces using the freely downloadable V-Ray 5 benchmark, which has dedicated tests for Nvidia CUDA GPUs, Nvidia RTX GPUs, as well as CPUs.
The results were impressive. In the CUDA test, the Nvidia RTX A4000 was 1.62 times faster than the previous generation Nvidia Quadro RTX 4000 and in the RTX test 1.70 times faster. The lead over the Pascal-based Quadro P4000 was nothing short of colossal – 3.53 times faster in the CUDA test. As the P4000 does not have dedicated RT cores, it could not run the RTX test.
Stepping up to the Nvidia RTX A5000 will give you an additional boost. Compared to the Nvidia RTX A4000 it was between 1.27 and 1.37 times faster.
Interestingly, the RTX A5000 was not that far behind the RTX A6000, which costs more than twice as much.
Luxion KeyShot 10
KeyShot, a CPU rendering stalwart, is a relative newcomer to the world of GPU rendering. But it’s one of the slickest implementations we’ve seen, allowing users to switch between CPU and GPU rendering at the click of a button.
In the KeyShot 10 benchmark, part of the free KeyShot Viewer, the performance leap was even more substantial than in V-Ray. The Nvidia RTX A4000 and Nvidia RTX A5000 outperformed the Quadro RTX 4000 by a factor of 1.89 and 2.51 respectively. And the RTX A5000 was only 20% slower than the RTX A6000.
DS Solidworks Visualize 2021 SP3
The name of this GPU-accelerated physically based renderer is a bit misleading as it works with many more applications than the CAD application of the same name. It can import models from Creo, Solid Edge, Catia and Inventor, as well as several neutral formats.
Since the 2020 release the software has supported Nvidia RT cores and Tensor cores to improve rendering performance with Nvidia RTX GPUs. Users can choose to render scenes with or without denoising enabled.
Denoising is a post-processing technique based on machine learning that filters out noise from unfinished / noisy images and is the foundation for many RTX-accelerated applications. It means you can get better looking renders with significantly fewer rendering passes.
DS Solidworks reckons that if a scene routinely needs 500 passes without the denoiser, then you may be able to achieve the same rendering quality with 50 passes with the denoiser enabled.
We tested the stock 1969 Camaro car model at 4K resolution with 1,000 passes (denoising disabled) and 100 passes (denoising enabled) set to accurate quality. Both settings produced excellent visual results.
The RTX A4000 and RTX A5000 delivered the 100-pass render in 22 seconds and 14 seconds respectively. This isn’t the most complex scene but being able to render at such speeds is quite incredible and can have a profound impact on workflows. In comparison, it took the Quadro P4000 GPU 105 seconds, so you can see just how far things have progressed in four years.
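For readers who prefer speed-up factors to raw times, the short Python sketch below converts the three render times quoted above into multiples of the Quadro P4000 result. The function name is our own. The numbers are simply the reported times.

```python
# Render times in seconds for the 100-pass denoised test, as reported above.
RENDER_SECONDS = {"RTX A5000": 14, "RTX A4000": 22, "Quadro P4000": 105}


def speedup(baseline_seconds, new_seconds):
    """How many times faster the new result is than the baseline."""
    return baseline_seconds / new_seconds


if __name__ == "__main__":
    baseline = RENDER_SECONDS["Quadro P4000"]
    for card, seconds in RENDER_SECONDS.items():
        print(f"{card}: {speedup(baseline, seconds):.2f}x vs Quadro P4000")
```

That works out at roughly 4.8 times faster for the RTX A4000 and 7.5 times faster for the RTX A5000 in this particular scene.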
Real time 3D with the Nvidia RTX A4000 / A5000
While GPU rendering is a major play for the Nvidia RTX A4000 and Nvidia RTX A5000, real-time 3D using OpenGL, DirectX and (in the future) Vulkan continues to be a very important part of architectural visualisation, with applications including Twinmotion, Lumion, Enscape, Unreal Engine, LumenRT and others.
Of course, the boundaries between real-time 3D and ray tracing continue to blur. In fact, out of the list above only Lumion and Twinmotion are yet to support RTX, although it should be coming to Twinmotion soon.
To test frame rates, we used a combination of monitoring software including FRAPS, Xbox Game Bar and MSI Afterburner. We only tested at 4K (3,840 x 2,160) resolution. At FHD (1,920 x 1,080) resolution this class of GPU simply isn’t stressed enough.
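Monitoring tools typically derive frame rates from per-frame render times. As a rough illustration of the arithmetic (not the method used by any of the tools named above), the Python sketch below computes an average and a worst-case frame rate from a list of frame times in milliseconds. The function names and the sample values in the usage note are hypothetical.

```python
def average_fps(frame_times_ms):
    """Average frame rate: frames rendered divided by total elapsed time."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def slowest_frame_fps(frame_times_ms):
    """Frame rate implied by the single slowest frame in the capture."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    return 1000.0 / max(frame_times_ms)
```

For a made-up capture of two 20 ms frames and one 60 ms frame, `average_fps([20, 20, 60])` gives 30 FPS. Averaging the three instantaneous rates (50, 50 and about 16.7 FPS) would suggest nearly 39 FPS, which overstates how smooth the experience feels, so dividing frame count by total time is the safer choice.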
Enscape
Enscape is a real-time viz and VR tool for architects that uses OpenGL and delivers very high-quality graphics in the viewport. Enscape has used elements of ray tracing in its software for some time. Version 3.0 is RTX-enabled, so full ray tracing can be toggled on and off. Later versions will use the more modern Vulkan API and support ray tracing on both Nvidia and AMD GPUs.
For our tests, we used a large architectural scene of a museum and its surrounding area in Enscape 2.6 (non RTX). At 7.5GB, the GPU memory requirements of this model are relatively high, but Enscape models can be much larger.
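To show how tight a 7.5 GB scene is on the older 8 GB cards, the Python sketch below compares it against the memory capacities quoted in this article. The `reserve_gb` parameter is a hypothetical allowance for memory used by the operating system and other applications. It is our assumption, not a measured figure.

```python
# GPU memory capacities in GB, as quoted in this article.
GPU_MEMORY_GB = {
    "Quadro P4000": 8,
    "Quadro RTX 4000": 8,
    "RTX A4000": 16,
    "RTX A5000": 24,
    "RTX A6000": 48,
}


def headroom_gb(card, scene_gb):
    """Memory left on the card once the scene is loaded (negative if it will not fit)."""
    return GPU_MEMORY_GB[card] - scene_gb


def fits(card, scene_gb, reserve_gb=0.0):
    """True if the scene plus a hypothetical reserve fits in GPU memory."""
    return scene_gb + reserve_gb <= GPU_MEMORY_GB[card]


if __name__ == "__main__":
    for card in GPU_MEMORY_GB:
        print(f"{card}: {headroom_gb(card, 7.5):.1f} GB headroom for a 7.5 GB scene")
```

The museum scene leaves only 0.5 GB free on an 8 GB card, compared with 8.5 GB on the RTX A4000, which is why the jump to 16 GB matters for larger models.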
In terms of performance, the Nvidia RTX A4000 and A5000 delivered a very smooth experience with 30 and 40 FPS respectively. This is around twice as fast as the Nvidia Quadro RTX 4000 and three to four times faster than the Nvidia Quadro P4000.
Autodesk VRED Professional 2022
Autodesk VRED Professional is an automotive-focused 3D visualisation, virtual prototyping and VR tool. It uses OpenGL and delivers very high-quality visuals in the viewport. It offers several levels of real-time anti-aliasing (AA), which is important for automotive styling, as it smooths the edges of body panels. However, AA calculations use a lot of GPU resources, both in terms of processing and memory. We tested our automotive model with AA set to ‘off’ and ‘ultra-high’.
Considering that this pro viz application used to only really run effectively on Nvidia’s ultra-high-end professional GPUs, it’s quite astounding that the Nvidia RTX A4000 – a sub-$1,000 card – delivered over 30 FPS at 4K resolution with medium anti-aliasing. That said, those really pushing the boundaries of automotive visualisation will still likely need the top-end Nvidia RTX A6000, especially for high-res VR workflows.
Unreal Engine 4.26
Over the past few years Unreal Engine has established itself as a very prominent tool for design viz, especially in architecture and automotive. It was one of the first applications to use GPU-accelerated real-time ray tracing, which it does through Microsoft DirectX Raytracing (DXR).
We used two datasets for testing, both freely available from Epic Games: an arch viz interior of a small apartment and the Automotive Configurator, which features an Audi A5 convertible. Both scenes were tested with ray tracing enabled (DXR) and without (DirectX 12 rasterisation).
The results were pretty much as expected, with good scaling between all the GPUs with DirectX 12 rasterisation. With real-time ray tracing enabled, performance naturally takes a hit in general, but the Quadro P4000 really suffers without any RT cores.
We also tested with VRMark, a dedicated Virtual Reality benchmark that uses both DirectX 11 and DirectX 12. It’s biased towards 3D games, so not perfect for our needs, but should give a good indication of the performance one might expect in ‘game engine’ viz tools.
CAD and BIM with the Nvidia RTX A4000 / A5000
Most 3D CAD and BIM tools tend to be CPU limited, so performance is largely bottlenecked by the frequency (GHz) of the CPU. As a result, the Nvidia RTX A4000 and Nvidia RTX A5000 are really overkill for most CAD and BIM applications. They are unlikely to give you significantly better 3D performance than more mainstream GPUs like the Nvidia Quadro P1000 or P2200.
However, CAD applications are changing and, in the future, should be able to make much better use of the plentiful power of higher-end GPUs like the RTX A4000 and A5000. Both Autodesk with its new One Graphics System and Dassault Systèmes are currently working on new graphics engines that use more modern graphics APIs like Vulkan. This should not only improve general 3D performance but will make real-time ray tracing available directly in the viewport. So, while the RTX A4000 / A5000 might not give notable performance benefits in CAD right now, they could certainly do so in the future.
In addition, it is important to note that both GPUs will be certified for a wide range of pro CAD / BIM applications, which is important for some firms. This is especially true for enterprises that buy 100s or 1,000s of workstations from large OEMs like HP, Dell and Lenovo and want assurance that the GPUs will be stable and that they will be properly supported by the software developer.
Certification is a major reason why some firms choose Nvidia’s pro-focused RTX GPUs over their ‘consumer GeForce’ counterparts so they can confidently use applications like Revit, Solidworks, PTC Creo, and Siemens NX alongside more viz-focused tools like Chaos V-Ray, Enscape, Luxion KeyShot and Solidworks Visualize.
While most CAD applications won’t benefit from any GPU more powerful than the Nvidia Quadro P1000 or P2200, Solidworks 2021 is an exception. By using OpenGL 4.5, a more modern version of the popular graphics API, more algorithms can be pushed onto the GPU, so there is a benefit to higher performance GPUs.
Even so, the application is still CPU limited to some extent, so the performance benefit the new cards give you isn’t as big as you’d get from a dedicated real-time viz tool.
As with most CAD tools, the most popular way to view models in Solidworks is in ‘shaded with edges’ mode. Using the SPECapc for SolidWorks 2021 benchmark, we saw a small generation-on-generation improvement with this display style. The Nvidia RTX A4000 was 1.10 times faster than the Quadro RTX 4000 and 1.44 times faster than the ‘Pascal’ Quadro P4000.
Solidworks also features more realistic display styles for viewing models in real time. SolidWorks RealView, which is only supported by pro GPUs, adds realistic materials and supports environment reflections and floor shadows. Meanwhile, ambient occlusion adds more realistic shadows and helps bring out details.
Both viewing styles are more GPU-intensive, so performance is less limited by the frequency of the CPU. In our tests, we saw a bigger benefit over older GPUs. With RealView, Shadows and Ambient Occlusion enabled the Nvidia RTX A4000 was 1.16 times faster than the Quadro RTX 4000 and 1.57 times faster than the Quadro P4000.
We were unable to test the Nvidia RTX A5000 as Solidworks 2021 SP3 did not recognise the card. We expect this to be fixed in SP4, out soon.
Conclusion
With the new Nvidia RTX A4000 and A5000, Nvidia has made its ‘Ampere’ GPU architecture much more accessible to a wider audience. In particular, we see the sub-$1,000 Nvidia RTX A4000 hitting the sweet spot for designers, engineers or architects who want a pro viz capability in their workflow.
The performance leap from ‘Turing’ to ‘Ampere’ (Quadro RTX 4000 to RTX A4000) is nothing short of impressive. In real-time 3D, a 45% to 60% boost, generation on generation, seems typical, with even bigger gains from real-time ray tracing when the enhanced RT and Tensor cores come into play. The step up from the four-year old ‘Pascal’ Quadro P4000 is simply phenomenal, especially for GPU rendering.
Equipping the RTX A4000 with 16 GB of memory is very significant. While we often see models / scenes that surpass 8 GB (the capacity of the previous generation Quadro RTX 4000), scenes that are 16 GB and above are certainly less common, and more the preserve of viz specialists than of most architects or product designers who use standard materials and assets.
As we wait for AMD to deliver a Pro version of its ‘Big Navi’ Radeon RX 6000 series GPUs with hardware ray tracing, Nvidia’s biggest competitor in pro graphics is currently itself.
The new 12 GB ‘consumer’ GeForce RTX 3080 Ti, for example, might have half the memory of the Nvidia RTX A5000, but offers more performance on paper for half the price. Nvidia even has a GeForce Studio driver for applications including Enscape, Unreal Engine and V-Ray.
Despite the obvious attraction of Nvidia’s consumer GPUs, Nvidia’s ‘A’ class models should continue to find favour in large firms and enterprises that buy in volume, want more memory, consistent supply, pro viz features or the assurance of certification. There’s also the question of supply. As the global chip shortage continues to bite, Nvidia may well prioritise manufacture of its higher-margin pro GPUs, making GeForce even harder to get hold of.
Nvidia still has some work to do to flesh out its Ampere family. While mobile workstations already have entry-level RTX A2000 and A3000 GPUs, there’s no equivalent for desktops.
The AEC industry would certainly welcome a sub $500 pro RTX GPU to replace the Pascal-based Quadro P2200, which is now long in the tooth. In years gone by, we would have expected to see a desktop RTX A2000 before the end of 2021, but with ongoing supply challenges and high demand, things are very hard to predict right now.