Sunday, April 17, 2011

Tutorial 4: Redundant Array of Inexpensive Disks ("RAID")


What is RAID?

RAID is the idea of combining multiple small, inexpensive disk drives into one array that performs better than a “Single Large Expensive Drive” (SLED). The array appears to the computer as a single drive.

To find the right RAID, you first need to identify your needs.

There are different types (levels) of RAID:

RAID 0 (striping):
-An array that increases performance by interleaving data across two or more disk drives.
-Data is broken into blocks, called “stripes”, in order to increase speed (data is written to all drives simultaneously)

For example:

If you are writing the word DATA:
Drive 1: DA
Drive 2: TA

NOTE: This example uses only 2 disk drives. Keep in mind that the more disks you add, the faster RAID 0 works, and the higher the chance that the array fails.
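
To make the example above concrete, here is a minimal Python sketch of striping. It only models drives as lists in memory; the function names (stripe, read_back) and the stripe size of 2 characters are assumptions made for illustration, not part of any real RAID controller.

```python
def stripe(data: bytes, num_drives: int, stripe_size: int):
    """Split data into stripes and deal them out to the drives in turn."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), stripe_size):
        stripe_no = offset // stripe_size
        drives[stripe_no % num_drives].append(data[offset:offset + stripe_size])
    return drives


def read_back(drives):
    """Reassemble the original data by reading the drives row by row."""
    pieces = []
    rows = max(len(drive) for drive in drives)
    for row in range(rows):
        for drive in drives:
            if row < len(drive):
                pieces.append(drive[row])
    return b"".join(pieces)


layout = stripe(b"DATA", num_drives=2, stripe_size=2)
print(layout)             # [[b'DA'], [b'TA']]
print(read_back(layout))  # b'DATA'
```

With the word DATA, two drives, and a stripe size of 2, the layout matches the example: DA on the 1st drive and TA on the 2nd.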

*There is one huge problem with RAID 0: no redundancy exists.
In other words, if one disk drive fails, all the data in the array is lost.
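
The following rough sketch shows why more disks mean a higher chance of failure. It assumes that every drive fails independently with the same probability, which is a simplification; the 5% figure and the function name are made up for illustration.

```python
def raid0_failure_probability(p_drive: float, num_drives: int) -> float:
    """Chance that a RAID 0 array is lost, assuming independent drive failures.

    The array survives only if every single drive survives.
    """
    return 1 - (1 - p_drive) ** num_drives


# Hypothetical 5% chance that any one drive fails in a given period.
for drives in (1, 2, 4, 8):
    print(drives, round(raid0_failure_probability(0.05, drives), 4))
```

Under these assumptions the risk grows with every drive you add, because losing any one drive loses the whole array.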

Key factors:
Advantage: high performance
Disadvantage: no redundancy
Ideal for: highest-resolution HD and temporary storage.

RAID 1 (mirroring or duplexing):
-RAID 1 uses a pair of drives
-RAID 1 is always implemented as mirroring; data is copied to both drives using either hardware or software.
-Redundancy exists; if one drive fails, the other one keeps working as a single drive until the dead drive is replaced. In other words, data is not lost.

For example:

If you are writing the word DATA:
Drive 1: DATA
Drive 2: DATA

*Data is written simultaneously to both disk drives.

NOTE: Once one of the disk drives fails, you need to replace it and re-mirror the data to the new disk drive. The system keeps running at the same speed even with one disk drive.
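
The write, fail, and re-mirror steps can be sketched in a few lines of Python. This is only a toy model that keeps each drive as a dictionary in memory; the Mirror class and its method names are assumptions for illustration.

```python
class Mirror:
    """Toy RAID 1 pair: each drive maps a block number to its data."""

    def __init__(self):
        self.drives = [{}, {}]

    def write(self, block: int, data: bytes) -> None:
        # Every write goes to all working drives.
        for drive in self.drives:
            if drive is not None:
                drive[block] = data

    def read(self, block: int) -> bytes:
        # Any working drive can answer a read.
        for drive in self.drives:
            if drive is not None:
                return drive[block]
        raise IOError("both drives have failed")

    def fail(self, index: int) -> None:
        self.drives[index] = None

    def replace(self, index: int) -> None:
        # Re-mirror: copy everything from the surviving drive to the new one.
        survivor = self.drives[1 - index]
        if survivor is None:
            raise IOError("no surviving drive to copy from")
        self.drives[index] = dict(survivor)


mirror = Mirror()
mirror.write(0, b"DATA")
mirror.fail(0)
print(mirror.read(0))  # b'DATA', served by the surviving drive
mirror.replace(0)
```

After replace(), both drives hold the same data again, which is the re-mirroring step described in the note.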

*It is very important to know that we can use a hot spare disk. A hot spare disk is a 3rd disk that does absolutely nothing until one disk fails. At that point, the hot spare replaces the dead disk.

*A variant of RAID 1 is duplexing, which duplicates the controller card as well as the drive, providing tolerance against controller failure.

Key factors:
Advantage: high redundancy
Disadvantage: expensive (cost per GB is doubled)
Ideal for: maximum disk failure protection, enterprise servers, and small database systems
Fault tolerance: very good; better with duplexing
Array capacity: size of the smaller drive
Availability: Most RAID controllers will support hot sparing and automatically rebuild RAID 1

RAID 3 (striping with a dedicated parity drive):
-RAID level 3 stripes data at the byte level across several disk drives, with parity stored on one dedicated drive
-Byte-level striping requires hardware support for efficient use
-RAID level 3 can be used in data-intensive or single-user environments that access long sequential records, in order to speed up data transfer
-RAID level 3 does not handle multiple simultaneous operations well; overlapping requests degrade performance.
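
Here is a small Python sketch of byte-level striping with a dedicated parity drive. It assumes the parity is a plain XOR of the bytes in each row, which is the usual choice but is not stated above; the function names and the zero padding are also assumptions for illustration.

```python
def raid3_write(data: bytes, data_drives: int):
    """Deal bytes out to the data drives and build a separate parity drive."""
    padding = (-len(data)) % data_drives   # pad so the last row is full
    padded = data + bytes(padding)
    drives = [bytearray() for _ in range(data_drives)]
    parity = bytearray()
    for start in range(0, len(padded), data_drives):
        row = padded[start:start + data_drives]
        parity_byte = 0
        for drive_no, byte in enumerate(row):
            drives[drive_no].append(byte)
            parity_byte ^= byte            # XOR of every byte in the row
        parity.append(parity_byte)
    return drives, parity


def raid3_read(drives, length: int) -> bytes:
    """Interleave the data drives again and drop the padding."""
    out = bytearray()
    for row in zip(*drives):
        out.extend(row)
    return bytes(out[:length])


drives, parity = raid3_write(b"DATA", data_drives=2)
print(drives)                 # [bytearray(b'DT'), bytearray(b'AA')]
print(raid3_read(drives, 4))  # b'DATA'
```

Because every row touches all the drives, each read or write keeps the whole array busy, which fits the point above about multiple operations.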

Key factors:
Advantages: Efficient drive redundancy (only one drive is used for parity)
Disadvantages: Loses one disk drive's worth of capacity
Ideal for: Video work that needs a good balance of redundancy and performance

RAID 5 (striping with parity distributed across all drives):
-RAID level 5 is considered one of the most popular RAID levels due to its block-level striping with distributed parity.
-RAID 5 requires three or more drives
-In each stripe, data blocks are spread across all drives except one, which holds the parity for that stripe
-RAID 5 allows better parallelism in a multiple-transaction environment.
-Fault tolerance is kept by guaranteeing that the parity information for any given block of data is placed on a drive separate from those used to store the data itself
-Performance of RAID 5 can be adjusted by trying different stripe sizes.
   
   1. Fast reading speed (similar to RAID 0)
   2. Moderate writing speed (CPU must spend some time in computing parity)

-If a single drive fails, no data is lost (but the system runs slower)
-The dead disk drive needs to be replaced; the system can then reconstruct the data on the new drive, which should finish before another failure occurs (Mr. Olson)
-A hot spare disk can be used
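
The points above about distributed parity and rebuilding can be sketched in Python. This sketch assumes XOR parity and a simple rotation in which the parity moves one drive to the left on each stripe; real controllers use various layouts, and all the names here are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid5_write(data_blocks, num_drives: int):
    """Lay out blocks in stripes; each stripe puts its parity on a different drive."""
    per_stripe = num_drives - 1
    if num_drives < 3 or len(data_blocks) % per_stripe:
        raise ValueError("need 3+ drives and a whole number of stripes")
    drives = [[] for _ in range(num_drives)]
    for stripe_no, start in enumerate(range(0, len(data_blocks), per_stripe)):
        stripe_data = data_blocks[start:start + per_stripe]
        parity_drive = (num_drives - 1 - stripe_no) % num_drives
        remaining = iter(stripe_data)
        for drive_no in range(num_drives):
            if drive_no == parity_drive:
                drives[drive_no].append(xor_blocks(stripe_data))
            else:
                drives[drive_no].append(next(remaining))
    return drives


def rebuild_drive(drives, failed: int):
    """Recreate one lost drive by XOR-ing the surviving drives row by row."""
    survivors = [drive for i, drive in enumerate(drives) if i != failed]
    return [xor_blocks(row) for row in zip(*survivors)]


drives = raid5_write([b"AA", b"BB", b"CC", b"DD"], num_drives=3)
print(rebuild_drive(drives, failed=0) == drives[0])  # True
```

The same XOR works whether the lost block was data or parity, which is why any single drive can be rebuilt from the others.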

Key factors:
Advantage: Efficient drive redundancy; parity distributed to all drives.
Disadvantage: one disk drive's worth of capacity is lost.
Ideal for: network services; well-balanced for file servers requiring performance and redundancy.
Fault tolerance: good; can tolerate one disk loss.
Array capacity: (Size of Smallest Drive) * (Number of Drives - 1)
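
As a quick worked example of the capacity formulas given for RAID 1 and RAID 5, here is a short Python sketch. The function names and the drive sizes (in GB) are made up for illustration.

```python
def raid1_capacity(drive_sizes):
    """RAID 1: the array is as large as the smaller drive of the pair."""
    return min(drive_sizes)


def raid5_capacity(drive_sizes):
    """RAID 5: (size of smallest drive) * (number of drives - 1)."""
    if len(drive_sizes) < 3:
        raise ValueError("RAID 5 requires three or more drives")
    return min(drive_sizes) * (len(drive_sizes) - 1)


print(raid1_capacity([500, 1000]))            # 500
print(raid5_capacity([500, 500, 1000]))       # 1000
print(raid5_capacity([1000, 1000, 1000, 1000]))  # 3000
```

Note that mixing drive sizes wastes space: in the second example the 1000 GB drive is treated as if it were a 500 GB drive.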

RAID 6 (striping with two sets of distributed parity, “double parity”):
-RAID level 6 eliminates the risk of data loss if a second hard disk drive fails while the RAID array is rebuilding.
-A second set of parity is calculated, written, and distributed across all disk drives. This calculation provides very high fault tolerance because, if two disk drives fail, there is no data loss.
-RAID 6 requires a second set of parity calculations so that data from two failed drives can be rebuilt from the parity information retained on the surviving disks.
-Performance:

   1. Like RAID 5
   2. Fast reading
   3. Slower writing
   4. Needs hardware support

-RAID level 6 provides higher data protection to industries
-Higher data availability and fault tolerance than RAID 5.
-How to avoid a second drive failure?

  1. Periodically run “verify” on the array
  2. Hot sparing with automatic rebuilds
  3. Set the rebuild priority to its highest level.
  4. Balance reliability requirements and performance expectations against the number of arrays in a single volume.
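
The tutorial does not say how the second set of parity is calculated. One common approach, assumed here, is a Reed-Solomon style code: the first parity (P) is a plain XOR, and the second (Q) is an XOR in which each drive's data is first multiplied by a different power of 2 in the finite field GF(2^8). The polynomial 0x11D and all function names are assumptions for illustration. The sketch shows one stripe with separate P and Q blocks; a real RAID 6 array would also spread P and Q across the drives.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11D."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return product


def gf_pow2(exponent: int) -> int:
    """Return 2 raised to the given power in GF(2^8)."""
    value = 1
    for _ in range(exponent):
        value = gf_mul(value, 2)
    return value


def gf_inv(a: int) -> int:
    """Brute-force search for the multiplicative inverse of a non-zero byte."""
    return next(b for b in range(1, 256) if gf_mul(a, b) == 1)


def compute_pq(data_blocks):
    """Return the P (plain XOR) and Q (weighted XOR) parity blocks."""
    size = len(data_blocks[0])
    p, q = bytearray(size), bytearray(size)
    for index, block in enumerate(data_blocks):
        coeff = gf_pow2(index)
        for pos, byte in enumerate(block):
            p[pos] ^= byte
            q[pos] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(data_blocks, x: int, y: int, p: bytes, q: bytes):
    """Rebuild lost data blocks x and y from the surviving blocks plus P and Q."""
    size = len(p)
    # Treat the lost blocks as zeros; zeros add nothing to either parity.
    survivors = [bytes(size) if i in (x, y) else blk
                 for i, blk in enumerate(data_blocks)]
    p_part, q_part = compute_pq(survivors)
    gx, gy = gf_pow2(x), gf_pow2(y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(size), bytearray(size)
    for pos in range(size):
        a = p[pos] ^ p_part[pos]   # equals Dx ^ Dy
        b = q[pos] ^ q_part[pos]   # equals gx*Dx ^ gy*Dy
        dx[pos] = gf_mul(b ^ gf_mul(gy, a), denom_inv)
        dy[pos] = a ^ dx[pos]
    return bytes(dx), bytes(dy)


blocks = [b"DATA", b"RAID", b"DISK", b"TEST"]
p, q = compute_pq(blocks)
damaged = [blocks[0], None, blocks[2], None]   # drives 1 and 3 have failed
print(recover_two(damaged, 1, 3, p, q))        # (b'RAID', b'TEST')
```

With only P (as in RAID 5) there is one equation per byte, so only one unknown drive can be solved for. Q adds a second, independent equation, which is what makes two lost drives recoverable.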

Key factors:
Advantage: Double drive redundancy
Disadvantage: Loses two disk drives' worth of capacity
Ideal for: Mission-critical applications
