RAID stands for Redundant Array of Independent Disks. It is a storage virtualization technology that combines multiple drives into an array for performance, redundancy, or both. There are different RAID levels, each with its own pros and cons. The three main techniques in RAID are striping, mirroring, and parity. The different RAID levels are built on these three techniques, which we will discuss in detail today.
RAID is mostly used in data centers, servers, Network Attached Storage (NAS), multimedia production, high-performance computing, etc. In consumer environments, its major applications are data protection, disaster recovery, and gaming. Different applications call for different RAID configurations. Some RAID levels only combine the throughput of several drives to increase read/write performance or lower latency. However, the more important use of RAID is fault tolerance. In those configurations, multiple drives are used to mirror the data or to store parity for it. In such a RAID, two or more drives can hold the same data so that nothing is lost if a drive fails.
Striping and mirroring are the most important concepts in RAID. In striping, the data is split into smaller chunks and spread across multiple disks. RAID 0, 5, and 6 use striping. Striping helps with parallel processing and read/write performance. In mirroring, copies of the same data are written to two or more disks. RAID 1 and 10 are the most common RAID levels that use mirroring. These RAID levels are mainly employed for redundancy or failure protection.
This might sound simple, but a lot of engineering goes into it. In the background, depending on the RAID type, a RAID controller or a complex program handles the striping or mirroring of the data to make sure the RAID does the work it is supposed to do.
History of RAID
RAID was introduced in 1987 by a team of three computer scientists at the University of California, Berkeley. Their idea was to combine multiple slow, cheap drives into a single logical drive and increase the total speed and reliability. At that time, RAID stood for Redundant Array of Inexpensive Disks. They named their paper “A Case for Redundant Arrays of Inexpensive Disks (RAID)”. You can read this paper here.
The 5th heading of this paper explains the MTTF formula and states “Without fault tolerance, large arrays of inexpensive disks are too unreliable to be useful“.
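The reasoning behind that statement can be sketched numerically. Assuming independent disk failures at a constant rate, the MTTF of an array without redundancy is the MTTF of a single disk divided by the number of disks. The function name and the figures below are illustrative assumptions, not values quoted from the paper.

```python
def array_mttf_hours(disk_mttf_hours: float, num_disks: int) -> float:
    """MTTF of an array with no redundancy.

    Assumes independent, constant-rate disk failures, so the first
    failure of any disk takes the whole array down.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_mttf_hours / num_disks


# Hypothetical single-disk MTTF, for illustration only.
single_disk_mttf = 30_000  # hours
for n in (1, 10, 100):
    print(f"{n:>3} disks: {array_mttf_hours(single_disk_mttf, n):,.0f} hours")
```

The more disks an array has, the sooner one of them is expected to fail, which is why redundancy becomes necessary as arrays grow.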
They named the 6th heading “A Better Solution: RAID“. In this, they explained mathematically how the drives are organized into reliability groups. The paper described five levels, RAID 1 through RAID 5, starting with mirroring (RAID 1). RAID 0 (striping without redundancy) was not part of the original paper and got its name later. Of the original levels, RAID 2, 3, and 4 are rarely used today.
RAID became popular in the 90s when hard drives started becoming cheap and storage vendors like Sun Microsystems and IBM started offering RAID solutions to big companies. In the 2000s, NAS became popular and started reaching home and office environments. RAID 10 (RAID 1 + RAID 0) became really popular among businesses at this time. RAID 1, on the other hand, became popular in home systems, helping people protect their personal as well as professional files.
As storage became affordable over time, the “inexpensive” label became less relevant. RAID started being called “Redundant Array of Independent Disks” in the 90s. Also, because it was reaching enterprise environments, RAID was no longer just about price but also about performance, scalability, and reliability.
Today, we can implement RAID on our home computers, and many motherboards come with built-in RAID support. However, for most users, RAID is still not very relevant because faster SSDs cover the performance side and cloud storage covers backups.
Implementation of RAID
There are two ways you can use RAID on your computer. One is hardware RAID and the other is software RAID. Interestingly, RAID was first introduced as software RAID. It relied on the operating system to combine drives and implement fault tolerance. The key thing about software RAID is that it doesn’t have a dedicated RAID controller or RAID card to manage the array. Hardware RAID, however, has dedicated hardware to handle the drive arrays. Let’s discuss both in more detail.
1. Hardware RAID
Hardware RAID generally has better performance, scalability, and features, and little to no dependency on the CPU or OS. Some motherboards have a dedicated RAID controller to manage the RAID. On other motherboards, you can install a PCIe RAID card and connect the drives to it. The key quality of hardware RAID is that it uses its own controller and does not rely on the operating system or the host CPU for managing the array. A hardware RAID will keep working even if you re-install the operating system. The RAID configuration is managed primarily through the controller’s BIOS or firmware during boot-up.
Hardware RAID comes with some extra features like hot spares (dedicated backup hard drive kept on standby) and battery-backed cache.
Pros of Hardware RAID | Cons of Hardware RAID |
---|---|
1. No extra load on the CPU (because of the RAID controller) | 1. Increased price due to extra hardware |
2. No OS dependency | 2. If the RAID fails, the same controller would be required |
3. Advanced fault-tolerance features | 3. Less flexibility |
4. Highly scalable with support for large arrays | |
5. High performance | |
A little about the Fake RAIDs
There isn’t just one type of hardware RAID. True hardware RAID uses its own RAID controller and onboard cache to handle the RAID processing independently. However, many consumer and even server-grade motherboards have their RAID functionality integrated into the chipset. This type of RAID relies on both the motherboard’s firmware and the CPU. It is common in consumer systems and is called Fake RAID or Firmware RAID. In fact, most motherboards that advertise a RAID controller in the chipset set up fake RAIDs. Along with the dependency on the CPU and OS, fake RAIDs are less reliable because of the increased complexity.
Setting up these RAIDs through the BIOS/UEFI can be a little complex for some users. Also, there is limited support and documentation. It is hard to move a Fake RAID array to another system. Check this great video from Level1Techs to understand the concept and issues of Fake RAIDs in consumer systems. Real hardware RAID has its own RAID controller, which works independently of the operating system and the system CPU.
2. Software RAID
Software RAID is set up, managed, and controlled by the operating system or by special RAID software. A software RAID can be configured on almost any system. On Linux, the mdadm tool is generally used to create, manage, and monitor RAIDs. On Windows, you get tools like Storage Spaces and the Disk Management utility. For example, in Windows, you can create a RAID 1 volume by right-clicking on unallocated space in Disk Management and selecting “New Mirrored Volume.” On macOS, software RAID can be configured through the Disk Utility menu. You can select multiple drives there and choose your desired RAID level.
Software RAID supports RAID 0, 1, 5, 6, and 10 (1+0), although the available levels depend on the tool. Along with lower performance and CPU overhead, the biggest problem of software RAID is OS dependency. If the OS fails, accessing the RAID array can be complicated and even impossible without proper recovery tools.
Pros of Software RAID | Cons of Software RAID |
---|---|
1. Cost-effective, mainly because no dedicated hardware is needed | 1. CPU overhead |
2. Support for multiple RAID levels | 2. OS dependency |
3. Easy migration between systems | 3. Limited Recovery options |
4. No Vendor Lock-In (Can use any drives) | 4. Longer Rebuild times |
5. Generally user-friendly with GUIs | 5. Risk of data loss and hard recovery |
6. Great for low workloads |
Types of RAID
There are many types of RAID. Some of them are really common like RAID 1 or 5 but there are some less utilized variations. Let’s discuss each of them one by one.
1. RAID 0 (Striping)
RAID 0 works on data striping, which means that the data is spread across drives to improve performance. This RAID is suitable for applications where high read/write speed is required, such as video editing, gaming, file copying, etc. However, because there is no redundancy, the failure of any single drive results in the loss of the whole array’s data. With RAID 0, the RAID controller or software takes advantage of parallel operations across the drives.
RAID 0 splits the data into smaller chunks called strips. These strips are then written to multiple disks in turn, so a single file usually ends up spread over all the drives in the array. The size of these strips may vary from 16KB to 128KB or even more depending on the performance requirements.
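The striping described above can be sketched in a few lines. This is a minimal illustration, not how a real controller is implemented: drives are modeled as Python lists, and the names `stripe` and `unstripe` are made up for this example.

```python
def stripe(data: bytes, num_drives: int, strip_size: int) -> list[list[bytes]]:
    """Split data into strips and deal them round-robin across drives."""
    if num_drives < 2 or strip_size < 1:
        raise ValueError("need at least 2 drives and a positive strip size")
    drives: list[list[bytes]] = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), strip_size):
        strip_number = offset // strip_size
        drives[strip_number % num_drives].append(data[offset:offset + strip_size])
    return drives


def unstripe(drives: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading strips in the same round-robin order."""
    depth = max(len(drive) for drive in drives)
    parts = []
    for row in range(depth):
        for drive in drives:
            if row < len(drive):
                parts.append(drive[row])
    return b"".join(parts)


layout = stripe(b"ABCDEFGHIJ", num_drives=2, strip_size=4)
print(layout)            # drive 0 holds strips 0 and 2, drive 1 holds strip 1
print(unstripe(layout))  # original data, as long as every drive is present
```

If any one of the lists is lost, `unstripe` can no longer produce the original data, which mirrors the lack of redundancy in RAID 0.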
2. RAID 1 (Mirroring)
RAID 1 provides redundancy because it mirrors the same data across different drives. By keeping the same data on different drives, the risk of data loss is reduced if one drive fails. When data is written to one drive, it is written to another drive simultaneously. This creates an exact copy which works like a live duplicate of the main drive. A minimum of 2 drives is required for RAID 1. For example, if you are using 4 drives of 1TB, two will be used for mirroring, which means you give up storage capacity with RAID 1 but get redundancy in return.
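The mirroring behavior can be sketched with a toy model. The class `Mirror` and its methods are hypothetical names for this illustration; drives are modeled as dictionaries mapping a block number to its data.

```python
class Mirror:
    """Toy RAID 1: every write goes to all healthy drives, reads use any one."""

    def __init__(self, num_drives: int = 2):
        if num_drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        self.drives = [dict() for _ in range(num_drives)]  # block number -> data
        self.failed = set()

    def write(self, block: int, data: bytes) -> None:
        # The same block is written to every drive that is still healthy.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block] = data

    def read(self, block: int) -> bytes:
        # Any healthy drive can serve the read.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                return drive[block]
        raise IOError("all mirrors have failed")

    def fail(self, index: int) -> None:
        self.failed.add(index)


array = Mirror(num_drives=2)
array.write(0, b"important file")
array.fail(0)          # simulate losing the first drive
print(array.read(0))   # the copy on the second drive is still readable
```

The model also shows the write cost: every write is repeated once per drive, while a read needs only one drive.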
RAID 1 can improve read performance, but write performance is generally slower because every write has to be performed on two drives simultaneously. RAID 1 is considered the simplest method to ensure redundancy without the need for complex technical knowledge. If you are using a RAID controller in a hardware RAID, RAID 1 can make use of the controller’s error detection to ensure data integrity.
RAID 1 is mostly used in databases, servers, small businesses, home NAS, and workstations where data integrity is important. Its fault tolerance is limited (a two-drive mirror survives only one drive failure), so RAID 1 is best suited for data that is important but not highly critical.
3. RAID 5 (Striping with Parity)
RAID 5 combines data striping with parity for better performance than RAID 1 together with single-drive fault tolerance. RAID 5 also utilizes the storage space much more efficiently. RAID 5 stripes the data across drives and adds parity information for redundancy. Parity is a type of error checking that allows data recovery in case of a failure. Parity is created by applying the XOR operation to the data blocks. For example, the parity for bits 0 and 1 will be 1. This parity is distributed across the disks in RAID 5 so the data can be recovered if a single disk fails. RAID 5 requires a minimum of 3 drives. You can calculate the total usable space in RAID 5 using our RAID calculator. The total usable capacity can be calculated using this formula:
Total Usable Capacity = (N-1) x Size of the smallest drive
N is the total number of drives in the array. For example, if you have three 1TB drives, the total usable space would be 2TB while the remaining 1TB is used for fault tolerance. In RAID 5, the number of data blocks and parity blocks depends on the total number of drives. A RAID 5 setup with 3 drives would look something like this.
RAID 5 offers good performance along with single-drive fault protection. Storage space is utilized much more efficiently. For example, if you are using 8 disks for RAID 5, you’ll be able to use the combined storage space of 7 drives while the equivalent of one drive is used for parity. The main drawback of RAID 5 is its slow write performance. Also, this RAID is complex and requires more rebuild time in case of failures.
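The capacity formula is easy to check in code. A minimal sketch, where the function names are made up for this example and sizes are given in TB:

```python
def raid5_usable_capacity(drive_sizes_tb: list[float]) -> float:
    """Usable capacity = (N - 1) x size of the smallest drive."""
    n = len(drive_sizes_tb)
    if n < 3:
        raise ValueError("RAID 5 requires a minimum of 3 drives")
    return (n - 1) * min(drive_sizes_tb)


def raid5_efficiency(num_drives: int) -> float:
    """Fraction of raw capacity that is usable, for equally sized drives."""
    if num_drives < 3:
        raise ValueError("RAID 5 requires a minimum of 3 drives")
    return (num_drives - 1) / num_drives


print(raid5_usable_capacity([1, 1, 1]))   # three 1TB drives
print(raid5_usable_capacity([1] * 8))     # eight 1TB drives
print(raid5_usable_capacity([1, 2, 4]))   # mixed sizes: the smallest drive sets the limit
print(f"{raid5_efficiency(8):.1%}")       # share of raw space that stays usable
```

With mixed drive sizes, the extra space on the larger drives is not used, which is why arrays are usually built from identical drives.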
RAID 5 is widely used in enterprise storage, server environments, and many other places where performance, redundancy, and storage efficiency are required.
Working of RAID 5 and Parity
Let’s take an example of 4 drives in a RAID 5. When data arrives, the array divides it into 3 blocks and stores them on the first 3 disks. The parity of these 3 blocks is stored on the 4th drive. Together, these four blocks form one stripe. This parity information can be used to reconstruct the data if any one of the four drives fails. If more than one drive fails, the data can’t be reconstructed. Now, let’s say we get three more write requests, so we have 4 stripes of data, each with its own parity block. The parity block sits on a different drive in each stripe, so parity is distributed across all four drives.
Before looking at how parity provides redundancy, note that data is written and read in the form of stripes. If the controller requests the first stripe, three disks are read. If any one drive fails, its data can be recovered from the remaining data blocks together with the parity block, which is stored on a different drive. However, when two drives fail, we can’t reconstruct the data because each stripe has only one parity block. After a single drive failure, we can install a new drive and the RAID controller rebuilds its contents, both data blocks and parity blocks, from the surviving drives.
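The XOR parity mechanism can be sketched as follows. This is a simplified model under stated assumptions: blocks are short byte strings of equal size, the function names are invented for this example, and the rotation order of the parity block is one possible choice rather than what any particular controller does.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def build_stripe(data_blocks: list[bytes], stripe_index: int, num_drives: int) -> list[bytes]:
    """Place N-1 data blocks plus one parity block on N drives.

    The parity position moves by one drive per stripe (assumed rotation).
    """
    if len(data_blocks) != num_drives - 1:
        raise ValueError("a stripe holds exactly N-1 data blocks")
    parity_drive = (num_drives - 1 - stripe_index) % num_drives
    stripe = list(data_blocks)
    stripe.insert(parity_drive, xor_blocks(data_blocks))
    return stripe


def rebuild(stripe: list[bytes], failed_drive: int) -> bytes:
    """Recover the block of one failed drive by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != failed_drive]
    return xor_blocks(survivors)


stripe0 = build_stripe([b"AAAA", b"BBBB", b"CCCC"], stripe_index=0, num_drives=4)
print(rebuild(stripe0, failed_drive=1))  # reconstructs the block that was on drive 1
```

Because XOR-ing all blocks of a stripe (data and parity) gives zero, any single missing block equals the XOR of the others. With two blocks missing, one equation is not enough to solve for both, which is the limit described above.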
4. RAID 6 (Striping with Dual Parity)
RAID 6 is an advanced RAID level with good performance along with a fault tolerance of up to two disks. A minimum of 4 disks is required for RAID 6. The usable space is lower than in RAID 5 because the equivalent of two drives is used for parity. You can calculate the total usable space using this formula:
Usable Capacity = (N-2) x Size of the smallest drive
N is the total number of drives in an array.
Incoming data is first striped and spread across the drives. If you have four drives in a RAID 6, each stripe places its data blocks on two drives while the remaining two drives store the two parity blocks for that stripe. The first parity is the same XOR parity used in RAID 5. The second parity is computed from the same data with a different algorithm. This means that if any two drives fail, the system can recover their contents from the other two drives.
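The text does not say which algorithms produce the two parities. One widely used construction, assumed here, keeps the XOR parity (often called P) and adds a second parity Q computed with Reed-Solomon style arithmetic in the finite field GF(2^8) using the polynomial 0x11D. The sketch below works on one byte per drive, and all function names are invented for this example.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) with the reducing polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    return gf_pow(a, 254)  # a^255 == 1 for every non-zero a


def pq_parity(data: list[int]) -> tuple[int, int]:
    """P is the plain XOR; Q weights drive i by 2^i in GF(2^8)."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q


def recover_two(data: list, x: int, y: int, p: int, q: int) -> tuple[int, int]:
    """Recover the bytes of two failed data drives x and y from P and Q."""
    a, b = p, q
    for i, d in enumerate(data):
        if i not in (x, y):          # entries at x and y are never read
            a ^= d                   # a ends up as Dx ^ Dy
            b ^= gf_mul(gf_pow(2, i), d)  # b ends up as 2^x*Dx ^ 2^y*Dy
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(b ^ gf_mul(gy, a), gf_inv(gx ^ gy))
    return dx, a ^ dx


data = [0x12, 0x34, 0x56, 0x78]
p, q = pq_parity(data)
damaged = [None, 0x34, None, 0x78]   # drives 0 and 2 have failed
print(recover_two(damaged, 0, 2, p, q))
```

Two independent parity equations allow solving for two unknown blocks, which is why the second parity cannot simply be a copy of the first.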
The biggest benefits of RAID 6 are higher fault tolerance, enhanced error correction, and good read speed. The write operations are slower because of the additional steps for creating two parities. The configuration is more complex than any other RAID level that we discussed here. Again, because more parity is stored for redundancy, the total effective storage space is reduced. The rebuild time is longer in case of failures. Also, a rebuild that follows two drive failures leaves no redundancy, so one more failure during the rebuild results in data loss.
RAID 6 is suitable for highly-critical data storage environments. Servers, large-scale database and enterprise file storage systems generally use RAID 6.
5. RAID 10 (1 + 0)
RAID 10 combines RAID 1 and 0 to get the benefits of both mirroring and striping. RAID 10 works by building mirrored pairs of disks and then striping the data across those pairs. This increases the performance as well as provides redundancy.
The incoming data is striped into blocks, as is normally done in RAID 0, and each block is then written to both drives of a mirrored pair. If the data is striped into two blocks, each of the two blocks is mirrored on a different pair of drives. A minimum of 4 drives is required to configure a RAID 10. The usable storage capacity is 50% of the total capacity.
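The resulting layout can be sketched as below. The model assumes an even number of drives with adjacent drives forming the mirrored pairs (drives 0 and 1, drives 2 and 3, and so on); the function name is made up for this example.

```python
def raid10_layout(blocks: list[str], num_drives: int) -> list[list[str]]:
    """Stripe blocks across mirrored pairs; each block lands on both drives of its pair."""
    if num_drives < 4 or num_drives % 2:
        raise ValueError("this sketch needs an even number of drives, at least 4")
    drives: list[list[str]] = [[] for _ in range(num_drives)]
    pairs = num_drives // 2
    for i, block in enumerate(blocks):
        pair = i % pairs                      # striping: pick the pair round-robin
        drives[2 * pair].append(block)        # mirroring: first drive of the pair
        drives[2 * pair + 1].append(block)    # mirroring: second drive of the pair
    return drives


for index, contents in enumerate(raid10_layout(["B0", "B1", "B2", "B3"], num_drives=4)):
    print(f"drive {index}: {contents}")
```

Four blocks occupy eight block slots in total, which is where the 50% usable capacity comes from.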
The biggest benefit of RAID 10 is its high performance, especially in write-intensive tasks. Thanks to mirroring, the array tolerates the failure of one drive in each mirrored pair. The rebuild time is short because the data only has to be copied from the surviving mirror to the new drive, without any parity calculations. Higher cost and low storage efficiency are the biggest drawbacks. RAID 10 is widely used where high performance with some redundancy is demanded, such as in high-speed database servers.
The reverse of RAID 10, i.e. RAID 01, can also be implemented but it is less fault tolerant. In RAID 01, the data is first striped across multiple disks and then each striped set is mirrored. The biggest problem with this RAID is that if a single disk in a striped set fails, the entire striped set is lost.
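The difference in fault tolerance can be checked by enumerating every possible two-drive failure in a 4-drive array. The sketch uses a simple model matching the description above: in RAID 10, data is lost only when both drives of one mirrored pair fail, while in RAID 01 a single failure takes its whole striped set offline. Function names and drive numbering are assumptions made for this example.

```python
from itertools import combinations


def raid10_survives(failed, num_drives: int = 4) -> bool:
    """Mirrored pairs (0,1), (2,3), ...: lost only if both drives of a pair fail."""
    failed = set(failed)
    return not any({2 * p, 2 * p + 1} <= failed for p in range(num_drives // 2))


def raid01_survives(failed, num_drives: int = 4) -> bool:
    """Two mirrored striped sets: at least one set must have no failed drive."""
    failed = set(failed)
    half = num_drives // 2
    set_a, set_b = set(range(half)), set(range(half, num_drives))
    return not (failed & set_a) or not (failed & set_b)


def survival_count(survives, num_drives: int = 4, failures: int = 2) -> tuple[int, int]:
    """Return (surviving combinations, total combinations)."""
    combos = list(combinations(range(num_drives), failures))
    return sum(survives(c, num_drives) for c in combos), len(combos)


print("RAID 10:", survival_count(raid10_survives))
print("RAID 01:", survival_count(raid01_survives))
```

Under this model, RAID 10 survives more of the possible double failures than RAID 01 with the same four drives.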
Some other common RAID levels
There are some other RAID levels such as RAID 2, RAID 3, RAID 4, RAID 50, RAID 60, RAID 7, RAID 1E, RAID-DP, and RAID-Z. RAID-DP is popular in NetApp storage while RAID-Z is used in ZFS systems where data integrity is the priority. The other RAID levels are either very old or rarely used.
Conclusion
This was only a beginner’s guide to RAID because there is a lot more to say about it. A lot of mathematics and engineering goes into everything happening in a RAID. However, you should now have a basic idea of the purpose of RAID: redundancy, performance, or in some cases both. Different RAID levels exist to meet those demands, each with its own pros and cons.