Tuesday, July 14, 2009

Re: [BLUG] Large RAID config suggestions?

Hi Steven and Mark,

You both raise excellent points. What lured me toward RAID 5EE was the prospect of "enhanced
performance," plus my being overly optimistic about what the RAID card could actually deliver. My
initial tests this morning show that the compression phase would take ~2 hours, followed by another
3-4 hours for decompression; that's too risky for my taste.

Our current database setup is a product of many different factors. Our choice of PostgreSQL was in
part dictated by our use of an open source biological database schema called Chado
(http://gmod.org/wiki/Chado) that we helped develop. The mix of PostgreSQL and MySQL came about
because this work started 5 years ago when, as you pointed out, the two databases were very
different beasts. We also rely on a mixed bag of open source bioinformatics tools, each of which
supports only MySQL or only PostgreSQL.

Our MySQL servers are mostly v5, though we still have a couple of v4 boxes in production; those
are slated to be replaced soon. On the PostgreSQL side we are currently running 8.1 and plan to
move to 8.4 later this summer to take advantage of some nice new features. Of particular interest
are the improved performance, parallel restores, recursive queries, and no more max_fsm_pages!
I'm also happy to see that they finally have a migration tool in the works; I've really loathed
doing full dump-and-restores when upgrading.
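
For anyone curious, the parallel restore is just a new -j flag on pg_restore, and recursive queries
use the new WITH RECURSIVE syntax. A quick sketch (the database and dump file names here are just
placeholders, not our real setup):

    # restore a custom-format dump using 4 parallel jobs (new in 8.4)
    pg_restore -j 4 -d chado_test chado_backup.dump

    -- toy recursive query: sum the integers 1 through 10
    WITH RECURSIVE t(n) AS (
        VALUES (1)
      UNION ALL
        SELECT n + 1 FROM t WHERE n < 10
    )
    SELECT sum(n) FROM t;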

Josh

Steven Black wrote:
> On Tue, Jul 14, 2009 at 03:37:37PM +0000, Mark Krenz wrote:
>> The whole recompression of the array that can take hours or days
>> sounds VERY risky. I wouldn't do it unless you are experimenting. Some
> of these non-standard RAID levels are just companies coming up with
> new combinations to have an extra feature over the competition; they
> aren't necessarily good things.
>
> Yeah, I don't really see the benefit of this compared to having a
> straight RAID5 + hotspare.
>
> I mean, with a hotspare, the array is rebuilt onto the hotspare when a
> drive fails. That's one major I/O operation.
>
> With RAID 5EE you have a possibly similar major I/O operation during the
> compression, then an additional (possibly similar) I/O operation during
> decompression once you add the new drive.
>
> What troubles me is that the system is subject to "a second drive
> failure" during *decompression*. That is, it is subject to data loss if
> you have a drive failure after you've inserted the hot spare until the
> RAID5 finishes becoming RAID5EE. This means you're subject to a second
> drive failure being an issue for a longer period than you would be under
> RAID5.
>
> An example: Your RAID5EE has a drive failure. It compresses it down to
> RAID5. During that time it is subject to a second drive failure. (This
> is on par with having a RAID5 with a hotspare. You're subject to a second
> drive failure until the hotspare becomes a full member of the array.)
> Then you have a happy period where the previously RAID5EE array is now a
> RAID5 array and immune to a second drive failure. Then you plug in the
> spare drive and -- unlike RAID5 with a hotspare -- you're subject to a
> second period of time where the system is subject to data loss if there
> is a drive failure.
>
> Can you be sure that the compression and decompression will take less
> total time than a RAID5 with a hotspare? This concerns me, as the total
> I/O transferred for RAID5EE will be more than the total I/O for RAID5
> with a hotspare.
>
> Since the hotspare is spread across all of the drives, and after
> compression it becomes a standard RAID 5 array, you're still
> transferring a whole disk's worth of data. It's just, with RAID 5EE you're
> doing the full disk worth of data transfer twice. (Okay, not quite a
> full disk's worth of data, a full disk's worth of data minus the spare
> percentage. Still, twice whatever-it-is is still well more than a disk
> and a half worth of data.)
>
> I mean, most hotspare systems treat the replaced drive as the new
> hotspare. It isn't like slot 10 (or whatever) is always the hotspare
> slot. With a traditional hotspare, you transfer data once. With RAID5EE
> you transfer data twice.
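
To put rough numbers on that (purely hypothetical, say ten 1TB drives): the distributed spare takes
about 1/10 of each disk, so each compress or decompress pass shuffles roughly 0.9TB, or about 1.8TB
in total, versus roughly 1TB for an ordinary hotspare rebuild. No wonder my estimates this morning
added up to 5-6 hours.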
>
>> Also, PostgreSQL is different from MySQL when it comes to number of
>> files. MySQL has 2 or 3 files per table, whereas PostgreSQL has many
>> more files, but it might not get really high. I have a decent size
>> database with lots of data and somewhere over 120 tables, and it's only
>> 2000 files in /var/lib/pgsql. But if you are doing a large database
>> like flybase, you might pay attention to how many files you're using
>> on the filesystem like this:
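
(I think the command itself got lost in the quoting; I assume Mark means something along the lines
of

    find /var/lib/pgsql -type f | wc -l

which simply counts the regular files under the PostgreSQL data directory.)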
>
> With MySQL it depends on the database engine being used. MyISAM (and
> probably Maria -- I've not read much about it) uses three files per
> table. InnoDB uses one per table plus three shared data files. (There
> are additional database engines at this point, too.)
>
> InnoDB and Falcon (I'm not sure how many files it has per table) should
> both provide ACID compliance. (I'm not sure about Maria.)
>
> Maria and Falcon are new in MySQL 6. Prior to MySQL 5 you couldn't get
> anything close to full ACID compliance due to missing core features.
>
> InnoDB can run into issues with file-size limits on some systems. This
> can be relieved by enabling "innodb_file_per_table" prior to creating
> the tables. This results in InnoDB creating 2 files per table plus the
> earlier-mentioned three shared InnoDB files.
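
(For reference, in case anyone else on the list wants to try it: that's a one-line my.cnf setting,
and it only affects tables created after it is turned on:

    [mysqld]
    innodb_file_per_table = 1
)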
>
> Please tell me you're not using MySQL prior to v5. :)
>
> I'm actually a little surprised by the combined PostgreSQL/MySQL
> approach. (Unless, of course, it started with MySQL prior to version 5,
> at which point it would be understandable. That was before MySQL had the
> features, and before PostgreSQL had the speed.) I would have expected a
> consistent product with replication off to one or more slaves that only
> accept read operations.
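
If we ever do consolidate, the read-only slave part is at least cheap to set up on the MySQL side.
A minimal sketch, assuming the master already has binary logging and a replication account (host
names and credentials below are made up):

    # slave's my.cnf
    [mysqld]
    server-id = 2
    read_only = 1

    -- then, on the slave, point it at the master and start replicating
    CHANGE MASTER TO MASTER_HOST='dbmaster', MASTER_USER='repl',
        MASTER_PASSWORD='secret', MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=4;
    START SLAVE;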
>
>> It's nice talking about sysadmin-type stuff on the BLUG list again;
>> we don't do it enough.
>
> Indeed.
>
_______________________________________________
BLUG mailing list
BLUG@linuxfan.com
http://mailman.cs.indiana.edu/mailman/listinfo/blug
