Friday, February 26, 2010

What is RAID

Basically, the different types of RAID provide a performance advantage, fault tolerance (for example by mirroring the hard disk), or both.
The types of RAID and their features:
If you want to put RAID onto your home PC then, in my opinion, RAID1 is the best way to go. It's simple, it works, and it only needs two disks. It will even perform well as a software implementation. If you run big storage systems with gigabytes of cache and hundreds of physical disks, then I would definitely go for RAID5. Why? It is cheaper, because it uses fewer disks for a given capacity, and it performs just as well as RAID1. If you have eighty 500GB disks, you can only store 20 terabytes of data on them with RAID1, but you will get 35 TB on them in a 7+1 RAID5 implementation. That's why I claim that RAID5 is cheaper than RAID1. That holds for big systems, but not for small systems of, say, less than a couple of terabytes. I had an animated discussion (which is one way of describing it) with a DBA last year who insisted that Oracle databases had to have RAID1 or they would not perform. We bought a DMX and ran some tests with the same database on RAID1 and RAID5, and the RAID5 setup actually performed better, I suspect because it was pulling the data off more spindles. However, I would never touch a software implementation of RAID5, as the write penalty will kill performance. So there you go: RAID1 for PCs and small systems, RAID5 for big systems, but at the end of the day it is your money.
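The capacity figures above are easy to check. Here is a minimal Python sketch of the arithmetic; the function names are my own, and it assumes decimal units (1 TB = 1000 GB) and ignores hot spares and formatting overhead.

```python
def usable_tb_raid1(n_disks, disk_gb):
    """RAID1 keeps two copies of everything, so half the raw space is usable."""
    return n_disks * disk_gb / 2 / 1000


def usable_tb_raid5(n_disks, disk_gb, data_disks, parity_disks=1):
    """RAID5 in groups of data_disks + parity_disks (e.g. 7+1).

    Only complete groups are counted; leftover disks are ignored.
    """
    group_size = data_disks + parity_disks
    groups = n_disks // group_size
    return groups * data_disks * disk_gb / 1000


print(usable_tb_raid1(80, 500))      # 20.0
print(usable_tb_raid5(80, 500, 7))   # 35.0
```

The same eighty disks give 15 TB more usable space under 7+1 RAID5, which is where the cost argument comes from.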
RAID can be implemented by software in the host, but this is not usually successful. It is best implemented by microcode in the storage subsystem controller. The various types of RAID are explained below. In the diagrams, the square box represents the controller and the cache. Parity is a means of adding extra data, so that if one of the bits of data is lost, it can be recreated from the parity. For example, suppose a four-bit binary value consists of the bits 1011. The total number of '1's in the value is odd, so we make the parity bit a 1. The value with its parity bit then becomes 10111. Suppose the third bit is lost, so that we have 10?11. We know from the last bit that there should be an odd number of '1's among the data bits, and the number of recognisable '1's is even, so the missing bit must be a '1'. This is a very simplistic explanation; in practice, disk parity is calculated on blocks of data using XOR hardware functions. The advantage of parity is that it is possible to recover data from errors. The disadvantage is that more storage space is required.
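The parity example above can be sketched in a few lines of Python. This is only an illustration: the helper names are my own, and a real controller does the XOR in hardware on whole disk blocks.

```python
from functools import reduce


def parity_bit(bits):
    """Return 1 if the number of 1s is odd, else 0 (the XOR of all the bits)."""
    return sum(bits) % 2


def recover_bit(known_bits, parity):
    """Recreate one missing data bit from the surviving bits and the parity bit."""
    return (sum(known_bits) + parity) % 2


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks, as used for disk parity."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# The 1011 example: parity is 1, and the lost third bit comes back as 1.
p = parity_bit([1, 0, 1, 1])
print(p, recover_bit([1, 0, 1], p))

# The same idea on blocks: lose one block, rebuild it from the rest plus parity.
data = [b"ABCD", b"EFGH", b"IJKL"]
parity = xor_blocks(data)
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

XOR works here because XORing a value with itself cancels out, so XORing the parity with every surviving block leaves only the missing one.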
RAID 0 is simply data striped over several disks. This gives a performance advantage, as it is possible to read parts of a file in parallel. However, not only is there no data protection, it is actually less reliable than a single disk, as all the data is lost if any one disk in the stripe fails.
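As a rough sketch of what striping means, the Python below maps a logical block number onto a disk and an offset on that disk, and estimates why the stripe is less reliable than one disk. The round-robin layout and the assumption that disks fail independently with the same probability are simplifications of mine, not a description of any particular controller.

```python
def locate_block(logical_block, n_disks):
    """Round-robin striping: return (disk index, block offset on that disk)."""
    return logical_block % n_disks, logical_block // n_disks


def stripe_survival_probability(disk_failure_probability, n_disks):
    """Chance that no disk in the stripe fails, assuming independent failures."""
    return (1 - disk_failure_probability) ** n_disks


# Consecutive blocks land on different disks, so they can be read in parallel.
for block in range(8):
    print(block, locate_block(block, 4))

# With more disks in the stripe, the chance of losing nothing goes down.
print(stripe_survival_probability(0.05, 1), stripe_survival_probability(0.05, 4))
```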
RAID1 is data mirroring. Two copies of the data are held on two physical disks, and the data is always identical. RAID1 has a performance advantage, as reads can come from either disk, and is simple to implement. However, it is expensive, as twice as many disks are needed to store the data.
RAID2 is a theoretical entity. It stripes data at bit level across an array of disks, then writes check bytes to other disks in the array. The check bytes are calculated using a Hamming code. Theoretical performance is very high, but it would be so expensive to implement that no-one uses it.
RAID3 stripes a block of data over an array of disks, then writes parity data to a dedicated parity disk. Successful implementations usually require that all the disks have synchronized rotation. RAID3 is very effective for large sequential data, such as satellite imagery and video. In the gif above, the right hand disk is dedicated parity, the other three disks are data disks.
RAID 4 writes data in blocks onto the data disks (i.e. each block is kept whole on one disk rather than being split across the disks), then parity is generated and written to a dedicated parity disk. In the gif above, the right hand disk is dedicated parity, the other three disks are data disks.
RAID 5 writes data in blocks onto the disks, and parity is generated and rotated around all the disks in the array rather than held on a dedicated disk. It gives good general performance and is reasonably cheap to implement, so it is used extensively for general data. The gif below illustrates the RAID5 write overhead. If a block of data on a RAID5 disk is updated, then all the unchanged data blocks from the RAID stripe have to be read back from the disks, and new parity calculated, before the new data block and new parity block can be written out.
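The write overhead can be sketched in Python by counting the disk I/Os for a single block update. The first function follows the approach described above. The second is a common shortcut that is not mentioned in the text (read the old data block and old parity, then XOR the change into the parity), added here only for comparison. All the names are my own, and real controllers choose between such strategies in microcode.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity_of(blocks):
    """Parity block for a stripe: the XOR of all its data blocks."""
    result = bytes(len(blocks[0]))
    for block in blocks:
        result = xor_bytes(result, block)
    return result


def update_by_reconstruct(stripe, index, new_block):
    """Read every unchanged data block, then recompute parity from scratch."""
    unchanged = [b for i, b in enumerate(stripe) if i != index]
    new_parity = parity_of(unchanged + [new_block])
    return new_parity, len(unchanged), 2   # parity, reads, writes


def update_by_read_modify_write(old_block, old_parity, new_block):
    """Read old data and old parity, then XOR the difference into the parity."""
    new_parity = xor_bytes(xor_bytes(old_parity, old_block), new_block)
    return new_parity, 2, 2                # parity, reads, writes


stripe = [bytes([i] * 4) for i in range(1, 8)]   # seven data blocks (7+1)
old_parity = parity_of(stripe)
new_block = b"\xff\x00\xaa\x55"

p1, reads1, writes1 = update_by_reconstruct(stripe, 2, new_block)
p2, reads2, writes2 = update_by_read_modify_write(stripe[2], old_parity, new_block)
print(p1 == p2, reads1, writes1, reads2, writes2)
```

Either way, one logical write turns into several physical reads and writes, which is the penalty that makes software RAID5 unattractive.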
RAID6 is growing in popularity because it uses double parity, and so is seen as the best way to guarantee data integrity. It was originally used in SUN V2X devices, where there are a lot of disks in a RAID array, and so a higher chance of multiple failures. RAID6 as implemented by SUN does not have a write overhead, as the data is always written out to a different block. The problem with RAID6 is that there is no standard method of implementation; every manufacturer has their own method.
RAID0+1 is implemented as a mirrored array whose segments are RAID 0 arrays, which is not the same as RAID10. RAID 0+1 has the same fault tolerance as RAID level 5: the data will survive the loss of a single disk, but at that point all you have is a striped RAID0 disk set. It does provide high performance, but with lower resilience than RAID10.
RAID-S, or parity RAID, is a specific implementation of RAID5 used by EMC. It uses hardware facilities within the disks to produce the parity information, and so does not have the RAID5 write overhead. It was originally called RAID-S, and is sometimes called 3+1 or 7+1 RAID.
RAIDZ is part of the SUN ZFS file system. It is a software based variant of RAID5 which does not use a fixed-size RAID stripe, but writes out the current block of data as a varying-size RAID stripe. With standard RAID, data is written and read in blocks, and several blocks are usually combined together to make up a RAID stripe. If you need to update one data block, you have to read back all the other data blocks in that stripe to calculate the new RAID parity. RAIDZ eliminates the RAID 5 write penalty, as any read and write of existing data will just include the current block. In a failure, data is re-created by reading checksum bytes from the file system itself, not the hardware, so recovery is independent of hardware failures. The problem, of course, is that RAIDZ closely couples the operating system and the hardware. In other words, you have to buy them both from SUN.
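Going back to the RAID0+1 and RAID10 comparison, the difference in resilience can be shown by simply counting which combinations of failed disks each layout survives. This Python sketch uses a disk numbering of my own and assumes the simplest layouts: RAID0+1 as two mirrored stripe sets, RAID10 as a stripe over two-disk mirror pairs.

```python
from itertools import combinations


def raid01_survives(failed, disks_per_side):
    """RAID0+1: disks 0..n-1 are stripe A, disks n..2n-1 are its mirror, stripe B.

    The data survives only while at least one whole stripe set is intact.
    """
    side_a_ok = all(d not in failed for d in range(disks_per_side))
    side_b_ok = all(d not in failed
                    for d in range(disks_per_side, 2 * disks_per_side))
    return side_a_ok or side_b_ok


def raid10_survives(failed, mirror_pairs):
    """RAID10: disks (2i, 2i+1) form mirror pair i, and the pairs are striped.

    The data survives while every pair still has at least one working disk.
    """
    return all(not (2 * i in failed and 2 * i + 1 in failed)
               for i in range(mirror_pairs))


def count_survivable(survives, total_disks, n_failures, width):
    """How many of the possible n_failures-disk failure sets leave the data intact."""
    return sum(1 for failed in combinations(range(total_disks), n_failures)
               if survives(set(failed), width))


# Four disks, two failing at once: six possible combinations.
print(count_survivable(raid01_survives, 4, 2, 2))   # 2
print(count_survivable(raid10_survives, 4, 2, 2))   # 4
```

Both layouts survive any single disk failure, but with four disks RAID10 survives four of the six possible double failures, while RAID0+1 survives only two.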
For more details, please email me at vikas_kotgarh@yahoo.co.in