Monday, July 13, 2009

Re: [BLUG] Large RAID config suggestions?

Wow, that's a lot of disks.

I have one major suggestion: don't make one big filesystem. Don't
even make one big Volume Group. With that much space, I'd recommend
dividing it up somehow; otherwise, if you ever need to recover, it can
take a day or more just to copy the files back over.

At last year's Red Hat Summit, Rik van Riel gave a presentation called
"Why Computers Are Getting Slower". Since he's a kernel developer, he
talked mostly from a low-level point of view, which was great. One of
his points was that filesystem sizes are getting too large for even
fast disks to handle a recovery in a reasonable amount of time, and
the algorithms involved need to be better optimized. So he recommended
breaking your filesystems up into chunks: on a server, your /home
partition might become /home1, /home2, /home3, etc., and on a home
machine you should probably put media files on a separate partition,
or maybe even break that up too. Volume management like LVM makes
this kind of layout practical.
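
As a sketch of what that can look like with LVM (the device names are
made up; substitute whatever your RAID controller actually presents):

    # Assume the controller exposes each RAID set as its own block
    # device, e.g. /dev/sdb and /dev/sdc (hypothetical names).
    pvcreate /dev/sdb /dev/sdc
    vgcreate vg01 /dev/sdb          # separate VGs, not one big one
    vgcreate vg02 /dev/sdc
    lvcreate -L 500G -n home1 vg01
    lvcreate -L 500G -n home2 vg01
    lvcreate -L 500G -n home3 vg02
    mkfs.ext3 /dev/vg01/home1       # repeat for the other LVs
    # Mount these as /home1, /home2, /home3 so that an fsck or a
    # restore only has to walk one chunk instead of the whole array.

The exact sizes don't matter much; the point is that each filesystem
stays small enough to fsck or restore in hours, not days.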

On something like a mail server you may have millions of little
files, and copying them all over takes a long time, even on a SAN:
with that many small files, per-file overhead and seeking dominate,
so the disks never get to stream data sequentially. I recently had to
recover a filesystem with 6 million files on it, and it was going to
take 16 hours or more just to copy everything over to a SATA2 RAID-1
array on a decent hardware RAID controller, even though it was only
about 180GB of data. That was direct disk to disk, too, not over the
network. What I had to do instead was some disk image trickery to get
the data moved to a new set of disks (I had a RAID controller fail).
If I had lost the array itself, I couldn't have done it this way.
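
The trick was basically a raw block-level copy instead of a file-level
one. Something along these lines (a rough sketch, not the exact
commands I ran; device names are hypothetical):

    # Image the old array block-for-block: sequential reads and
    # writes instead of millions of per-file operations.
    dd if=/dev/sdb of=/dev/sdc bs=1M conv=noerror,sync
    # noerror,sync keeps going past bad sectors, padding with zeros.
    # Verify the copy before putting it back into service:
    fsck.ext3 -f /dev/sdc1

At sequential transfer rates, 180GB moves in roughly an hour instead
of 16, but this only works while the source array is still readable.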

It was the first time in a while that I'd had to do such a recovery,
and I wasn't expecting it to take so long. Immediately afterwards I
decided that we need to break up our filesystems into smaller chunks
and find ways to reduce the amount of data affected if a RAID array
is lost.

The short answer is that you can build all kinds of redundancy into
your setup and still end up with a failed or corrupted filesystem,
which means major downtime.

Of course, this mythical thing called ZFS that comes with the J4400
may solve all of the problems listed above.
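
If you do get to play with it, a hypothetical pool matching your
RAID 60 idea (3x 8-drive double-parity sets) would look roughly like
this; note this needs an OS with ZFS support, which RHEL 5.3 doesn't
have natively, and the c#t#d# device names are made-up Solaris-style
placeholders:

    # Three 8-disk raidz2 vdevs, ZFS's rough analog of RAID 60.
    zpool create tank \
      raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
      raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0 \
      raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t6d0 c3t7d0
    # Filesystems within a pool are cheap, so the break-it-into-
    # chunks advice above is easy to follow:
    zfs create tank/home1
    zfs create tank/home2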

What kind of database system are you going to be using?

Mark

On Mon, Jul 13, 2009 at 06:45:54PM GMT, Josh Goodman [jogoodman@gmail.com] said the following:
>
> Hi all,
>
> I have a 24 x 1 TB RAID array (Sun J4400) that is calling out to be initialized and I'm going round
> and round on possible configurations. The system attached to this RAID is a RHEL 5.3 box w/
> hardware RAID controller. The disk space will be used for NFS and a database server with a slight
> emphasis given to reliability over performance. We will be using LVM on top of the RAID as well.
>
> Here are 2 initial configuration ideas:
>
> * RAID 50 (4x RAID 5 sets of 6 drives)
> * RAID 60 (3x RAID 6 sets of 8 drives)
>
> I'm leaning towards the RAID 60 setup because I'm concerned about the time required to rebuild a
> RAID 5 set with 6x 1 TB disks. Having the cushion of one more disk failure per set seems the better
> route to go. I'm interested in hearing what others have to say especially if I've overlooked other
> possibilities.
>
> I'm off to start simulating failures and benchmarking various configurations.
>
> Cheers,
> Josh
>

--
Mark Krenz
Bloomington Linux Users Group
http://www.bloomingtonlinux.org/
_______________________________________________
BLUG mailing list
BLUG@linuxfan.com
http://mailman.cs.indiana.edu/mailman/listinfo/blug
