In the latest instalment of his series of articles looking at the components of a PC workstation, Robert Jamieson gives us the lowdown on hard disks, their physical characteristics and which are best for CAD.
The hard disk is the slowest component (bar floppy and CD drives) used in a modern workstation. This means it has more effect on total performance than most people realise. Even if you work from a network, loading applications and using the pagefile are all done locally on the hard drive. Most CAD applications put some of the loaded information into the pagefile, or at least create temporary files, all of which sit on the hard drive.
Physical characteristics
I will first go through the physical characteristics of a hard disk. Hard disks are mechanical devices, and as a result they fail more often than any other device in a computer. Each hard disk has a rotating platter (made of glass, for example) coated in magnetic media. Platters spin at rates from 4,200 RPM up to 15,000 RPM in the latest SCSI (Small Computer System Interface) models. A drive can have multiple platters, depending on its storage capacity. Data is read from and written to the platter by a head that tracks across the spinning disk. In the old days the disk would spin three times before the data was read off, but today it’s all read in one go. Each drive has its own RAM, or cache, which ranges from 2MB to 16MB. Drive performance is measured in access time and data transfer rate.
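To put the spin rates above in perspective, here is a minimal sketch of how rotational speed feeds into access time. It only covers average rotational latency (the wait for the wanted sector to come round under the head, which averages half a revolution) and ignores head seek time. The 4,200 and 15,000 RPM figures come from the text; the intermediate rates and the function name are illustrative choices.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average wait for a sector to reach the head: half of one revolution."""
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


# 4,200 and 15,000 RPM are the extremes quoted in the article;
# the values in between are illustrative.
for rpm in (4200, 5400, 7200, 10000, 15000):
    print(f"{rpm:>6} RPM: {average_rotational_latency_ms(rpm):.2f} ms")
```

On these assumptions a 15,000 RPM drive waits about 2 ms on average against roughly 7 ms at 4,200 RPM, before any seek time is added.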
Workstation disks fall into two main categories: IDE (Integrated Drive Electronics) and SCSI. SCSI was the traditional workstation drive and attracted the latest technology in spin rates and cache. SCSI needs a dedicated intelligent controller card or a high-end motherboard to support the drive. The Ultra 320 standard is the latest SCSI generation. The problem with SCSI is that each drive costs more and the controllers aren’t cheap either. IDE, on the other hand, has drive controllers embedded on each motherboard, and IDE drives offer reasonable performance cheaply. The current subtypes of IDE drive are parallel and serial.
Parallel ATA is the primary internal storage interconnect for computers, connecting the host system to peripherals such as hard drives, optical drives and removable magnetic media devices. Today’s Ultra ATA is an extension of the original parallel ATA interface introduced in the mid 1980s and maintains backward compatibility with all previous versions of the technology. The original cable standard was a 40-wire cable, which has been replaced by an 80-wire version for the Ultra standard. These are often supplied as “round cables” to improve airflow.
Serial ATA is an update of Parallel ATA with better interface speed and cabling. Serial ATA drives require a different interface and are not interchangeable with Parallel ATA drives. However, you can get an adapter to fit a Parallel ATA drive to a serial controller. There is no great difference in performance between PATA and SATA; it’s more of a future standard that will give improvements once the mechanical performance of drives improves. New SATA drives support NCQ (Native Command Queuing), which manages an internal queue in which commands can be dynamically rescheduled and reordered. This is supported on Intel’s 925 chipset, for example. This looks to be a good technology and shows that the various manufacturers have put a lot of development into SATA.
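The idea of reordering queued commands can be illustrated with a small sketch. This is not how any particular drive firmware works: real NCQ also takes the rotational position of the platter into account, and the scheduling policy is up to the manufacturer. The sketch below assumes a simple "nearest request first" policy, and the block addresses and function names are hypothetical.

```python
def reorder_nearest_first(head_position: int, pending: list[int]) -> list[int]:
    """Serve queued requests by always picking the one closest to the head.

    Assumption: a greedy nearest-first policy, used here only to illustrate
    reordering. Real drives use their own, more elaborate schedulers.
    """
    remaining = list(pending)
    order = []
    position = head_position
    while remaining:
        nearest = min(remaining, key=lambda block: abs(block - position))
        remaining.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


def total_travel(head_position: int, order: list[int]) -> int:
    """Total distance the head moves when serving requests in the given order."""
    travel = 0
    position = head_position
    for block in order:
        travel += abs(block - position)
        position = block
    return travel


queue = [900, 100, 850, 150, 800]  # hypothetical block addresses, in arrival order
start = 500                        # hypothetical starting head position

reordered = reorder_nearest_first(start, queue)
print("arrival order:", queue, "travel:", total_travel(start, queue))
print("reordered:    ", reordered, "travel:", total_travel(start, reordered))
```

With these made-up addresses the head travels 3,300 units serving requests as they arrive, and 1,200 units after reordering, which is the kind of saving the queue is meant to deliver.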
RAID
There is a lot of talk about whether or not RAID (Redundant Array of Independent (or Inexpensive) Disks) gives increased performance. RAID is where several physical disks are combined into an array for better speed and/or fault tolerance. A Level 0 RAID array implements data striping, where file blocks are written across separate drives. This setup doesn’t provide any fault tolerance: failure of any one drive results in data loss, so it actually reduces the MTBF (Mean Time Between Failures) of the array compared with a single drive. However, in practice a Level 0 array does help the performance of intensive applications quite a bit.
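As a rough sketch of both sides of that trade-off, the code below shows how striping spreads consecutive blocks over the drives, and why the array as a whole is less reliable than one drive. It assumes simple round-robin striping and independent drive failures; the 95% survival figure is invented purely for illustration, and the function names are hypothetical.

```python
def stripe_location(block_index: int, drive_count: int) -> tuple[int, int]:
    """Return (drive number, row on that drive) for a logical block.

    Assumes plain round-robin striping, one block per stripe unit.
    """
    return block_index % drive_count, block_index // drive_count


def array_survival_probability(drive_survival: float, drive_count: int) -> float:
    """A RAID 0 array survives only if every drive survives.

    Assumes drives fail independently of one another.
    """
    return drive_survival ** drive_count


# Eight consecutive blocks land alternately on two drives,
# so both drives can work on a large file at the same time.
for block in range(8):
    drive, row = stripe_location(block, drive_count=2)
    print(f"block {block} -> drive {drive}, row {row}")

# Illustrative figure only: if one drive has a 95% chance of lasting
# some period, a two-drive RAID 0 array has about a 90% chance.
print(array_survival_probability(0.95, 2))
```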
Level 1 implements data mirroring. Here, data is duplicated on two drives, either through software or hardware. It provides faster read performance than a single drive. Level 3 requires at least three drives. Each data block is striped at byte level across the drives, and error correction codes (parity information) are recorded on another drive. It provides fault tolerance but slower write performance. Level 5 improves performance by also striping the parity information across multiple drives, and provides redundancy for arrays of three or more drives.
RAID can be implemented with IDE (PATA and SATA) and SCSI drives. However, some of the controllers with extra cache cost quite a lot, and RAID is often more of a technology for a server than a humble workstation. A lot of the newer motherboards come with RAID onboard for IDE, which means you need only an extra drive to set up an array. However, a lot of the larger manufacturers don’t give you the option, as it’s harder to pre-install Windows (the RAID has to be functioning before the OS is installed).
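The parity information used by Levels 3 and 5 is, in the usual scheme, a byte-by-byte XOR of the data on the other drives. The sketch below shows why that is enough to rebuild a failed drive. The block contents and the function name are made up for the example, and a real controller works on whole stripes rather than four-byte strings.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    result = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Two data drives and one parity drive (illustrative contents).
data_a = b"BOLT"
data_b = b"NUT!"
parity = xor_blocks(data_a, data_b)

# Suppose the drive holding data_b fails: XOR the survivors to rebuild it.
rebuilt_b = xor_blocks(data_a, parity)
print(rebuilt_b)  # b'NUT!'
```

The extra XOR on every write is also the reason parity levels write more slowly than plain striping.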
What’s a hard disk cache?
The controller on a drive copies the most recently accessed data into the RAM, or cache, on the hard disk. The algorithm that controls this determines what is copied. The RAM is 10x faster than accessing the physical platters, so if the data is accessed again it’s supplied faster.
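How much the cache helps depends on how often the requested data is already in it. A minimal sketch of that arithmetic follows, taking the 10x figure above at face value. The 10 ms platter access time and the hit rates are hypothetical values chosen for illustration.

```python
def average_access_ms(platter_ms: float, hit_rate: float,
                      cache_speedup: float = 10.0) -> float:
    """Weighted average of cache hits and platter accesses.

    cache_speedup defaults to the 10x figure quoted in the article;
    platter_ms and hit_rate are inputs you would have to measure.
    """
    cache_ms = platter_ms / cache_speedup
    return hit_rate * cache_ms + (1.0 - hit_rate) * platter_ms


# Hypothetical 10 ms platter access at a few hit rates.
for hit_rate in (0.0, 0.25, 0.5, 0.9):
    print(f"hit rate {hit_rate:.0%}: {average_access_ms(10.0, hit_rate):.2f} ms")
```

On these numbers, a workload that finds half its data in the cache sees the average access time drop from 10 ms to 5.5 ms.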
What does CAD need?
Now that I have talked about the physical hardware, let’s look at what gives the best performance for CAD. SCSI in a RAID array still just gives the ultimate performance, but it is expensive. An IDE RAID setup with high-performance IDE drives is good and a lot more affordable. The thing about IDE drives is that there is a great range from entry-level to performance models, whereas practically all SCSI drives are performance drives, so you always get a good one. The performance of a single top-end IDE drive can be good enough for most CAD applications.
I have seen tests in some computer magazines stating that RAID doesn’t make any difference; this is because they tested loading games, which are single large files. An assembly of a 3D model has a lot of smaller files, often repeated. Imagine how many nuts and bolts are in a given assembly. If the drive or RAID array has a large RAM cache (RAID 0 doubles up the cache), a repeated bolt comes from the cache and therefore loads faster, with no need to access the platters. This is also true when saving or closing an application, where the data has to be written.
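The nuts-and-bolts argument can be sketched with a toy cache simulation. It assumes a least-recently-used replacement policy and tracks whole part files by name, both simplifications: a real drive cache holds disk blocks and uses whatever algorithm its firmware implements. The part names and cache sizes are invented for the example.

```python
from collections import OrderedDict


def count_platter_reads(requests: list[str], cache_slots: int) -> int:
    """Count how many requests have to go to the platters.

    Assumption: a least-recently-used cache holding `cache_slots` items.
    """
    cache: OrderedDict[str, bool] = OrderedDict()
    platter_reads = 0
    for name in requests:
        if name in cache:
            cache.move_to_end(name)        # cache hit: mark as recently used
        else:
            platter_reads += 1             # cache miss: read from the platters
            cache[name] = True
            if len(cache) > cache_slots:
                cache.popitem(last=False)  # evict the least recently used item
    return platter_reads


# A hypothetical assembly that keeps reusing the same bolt and nut.
assembly = ["bracket", "bolt", "nut", "plate", "bolt",
            "nut", "bolt", "nut", "cover", "bolt"]

print(count_platter_reads(assembly, cache_slots=1))  # tiny cache
print(count_platter_reads(assembly, cache_slots=4))  # larger cache
```

With room for only one item, all ten requests go to the platters; with room for four, only the five distinct parts do, and every repeat of the bolt and nut is served from the cache.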
This is all OK if you are buying a new computer, but what can you do to improve your current performance? Defragment your drives! As data is written to a drive it’s placed in the first available space. After a while, as data gets deleted and replaced with bigger files, the hard disk gets messy, with a single file spread all over the disk. The standard defragmenter in Windows is a start, but third-party defragmenters are a lot better and can put the applications and the most recently accessed data together. Some can even defragment the pagefile and put it at the front of the disk. Why should this make it faster?
If you look at the disk platter, the outside edge track covers more distance in one revolution than the inside edge track. Because the areal density is the same, more data passes under the head per revolution on the outer tracks, and this can account for a 20% difference in performance from the front of the disk (the outer edge) to the back. This is also why disks with increased areal density have better transfer rates. For the same reason it’s better to keep the disk no more than half full. With the sizes available today in IDE drives this should not be a problem. One old tip on defragmenting: don’t run other applications while the defragmenter is working, and restart the computer once it has finished. This will stop applications going for data that’s been moved.
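The geometry behind this can be sketched in a few lines. With the same density of bits along every track and a fixed spin rate, the data passing under the head each second is proportional to the track radius. The radii, spin rate and linear density below are hypothetical values, picked so that the inner track sits at 80% of the outer radius; they are not measurements of any real drive.

```python
import math


def track_transfer_rate(radius_mm: float, rpm: float, bits_per_mm: float) -> float:
    """Bytes per second read from one track at constant linear density."""
    bits_per_track = 2.0 * math.pi * radius_mm * bits_per_mm
    revolutions_per_second = rpm / 60.0
    return bits_per_track * revolutions_per_second / 8.0


# All three values are assumptions for illustration only.
RPM = 7200
BITS_PER_MM = 20_000
OUTER_RADIUS_MM = 45.0
INNER_RADIUS_MM = 36.0

outer = track_transfer_rate(OUTER_RADIUS_MM, RPM, BITS_PER_MM)
inner = track_transfer_rate(INNER_RADIUS_MM, RPM, BITS_PER_MM)
print(f"inner track runs at {inner / outer:.0%} of the outer track rate")
```

Because the rate scales with radius, the ratio depends only on the two radii, and raising the density speeds up every track in proportion.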
Reliability
As I said at the beginning, drives will always fail. Hard disks fail either when they are new (because of a manufacturing defect, for example) or when the bearings have worn out after extended use. It’s a good idea to test a new system before it’s put into a production environment and, likewise, to cycle out two-year-old drives that have had a hard life. Different manufacturers have different track records on reliability. I have a large collection of two to three year old drives from one manufacturer that have all failed with similar faults. I’m not buying them again! Some of the bleeding-edge stuff tends to have slightly more problems: a candle that burns twice as bright lasts half as long.
If I’m saying drives fail, why increase the chance of data loss by using RAID 0? I have been using SCSI RAID, and now IDE RAID 0, for ten years, and as long as you have good backups and replace drives after two years, I will still take the chance. I want the performance it offers!
Robert Jamieson works for workstation graphics specialist ATI.