Unimpressive performance of large MD raid...

kkkk...
Posted: Wed Apr 22, 2009 11:29 pm
Guest
Hi there,
we have a big "storage" computer: dual Xeon E5345 @ 2.33GHz (8 cores
total) with lots of disks partly connected to a 3ware 9650SE controller
and partly to the SATA/SAS controllers in the mobo.
The hard disks are Western Digital WD7500AYYS 750GB

We are using an ext3 filesystem with default mount options on top of LVM + MD
raid 6. The raid-6 is on 12 disks (hence it is 10 disks for data, 2 for
parity). 6 of those disks are through the mobo controller, the others
are through the 3ware.

I hoped I would get something like 1 GB/sec sequential write on 10 disks
:P but instead I see MUCH lower performance.

I can't understand where the bottleneck is!?

In sequential read with separate instances of "dd" one for each drive
(directly from the block device), I can reach at least 800 MB/sec no
problem (I can probably go much higher, I just have not tried). So I
would exclude that it is a bus bandwidth problem (it's pci-express in
any case, and the 3ware is on an 8x).
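
As a rough check on the bus-bandwidth reasoning above, here is a small Python sketch. It assumes a first-generation PCI Express link at about 250 MB/s per lane per direction (the post does not state the PCIe generation), and it assumes a full-stripe RAID-6 write sends 12 chunks to disk for every 10 chunks of data. The function names are made up for illustration.

```python
# Rough bus-bandwidth check for the setup described above.
# Assumption: PCIe 1.x, roughly 250 MB/s usable per lane per direction.
PCIE1_MB_S_PER_LANE = 250

def pcie_bandwidth_mb_s(lanes, per_lane_mb_s=PCIE1_MB_S_PER_LANE):
    """Nominal one-direction bandwidth of a PCIe link in MB/s."""
    return lanes * per_lane_mb_s

def disk_traffic_mb_s(data_rate_mb_s, total_disks, parity_disks):
    """Total MB/s sent to all disks for full-stripe writes (data + parity)."""
    data_disks = total_disks - parity_disks
    return data_rate_mb_s * total_disks / data_disks

link = pcie_bandwidth_mb_s(8)            # x8 slot of the 3ware
total = disk_traffic_mb_s(1000, 12, 2)   # hoped-for 1 GB/s of file data
via_3ware = total * 6 / 12               # 6 of the 12 disks sit on the 3ware
print(link, total, via_3ware)
```

Under these assumptions the hoped-for 1 GB/s would put well under the nominal link bandwidth through the 3ware, which is consistent with the poster's view that the bus is unlikely to be the limit.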

Here are my write performances:

I am writing a sequential 14GB file with dd
time dd if=/dev/zero of=zerofile count=28160000 conv=notrunc ; time sync
(the throughput I report is not the one reported by dd: I adjust it
by hand to include the time sync takes, so it is close to the real
throughput. I confirm the drive LEDs are off after sync finishes.)
There is no other I/O activity. Disk scheduler is deadline for all drives.
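
The numbers behind the dd command above can be sketched in Python. This assumes dd falls back to its default 512-byte block size because no bs= was given; the timings in the last line are hypothetical and only show how the sync time is folded into the throughput figure.

```python
# Back-of-the-envelope numbers for the dd command above.
# Assumption: no bs= was given, so dd uses its default 512-byte block size.
DD_DEFAULT_BLOCK_SIZE = 512      # bytes per write() call
COUNT = 28_160_000               # count= value from the command

def dd_total_bytes(count, block_size=DD_DEFAULT_BLOCK_SIZE):
    """Total bytes dd copies: one block per count."""
    return count * block_size

def effective_throughput_mb_s(total_bytes, dd_seconds, sync_seconds):
    """Throughput in MB/s (10**6 bytes), including the final sync time."""
    return total_bytes / (dd_seconds + sync_seconds) / 1e6

total = dd_total_bytes(COUNT)
print(total)    # bytes written, about 14.4 GB
print(COUNT)    # number of 512-byte write() calls dd has to issue

# Hypothetical timings (not from the post), only to show the adjustment:
print(round(effective_throughput_mb_s(total, 100.0, 30.0), 1))
```

With 512-byte blocks dd issues about 28 million write() calls, which may be part of why dd itself shows 100% CPU. The same amount of data could be written with bs=1M count=13750, since 13750 * 1048576 is also 14417920000 bytes.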

All caches enabled on both 3ware and disks attached to the MOBO:
first write = 111 MB/sec
overwrite = 194 MB/sec

Cache enabled only in disks connected to the MOBO (6 of 12):
first write = 95 MB/sec
overwrite = 120 MB/sec

Cache disabled everywhere: (this takes an incredibly long time to do the
final flush)
first write = 63 MB/sec
overwrite = 75 MB/sec


I have looked at what happens in top and htop. Htop reports LOTS of red
bars (iowait?), practically 50% red bars on every core (8 cores).

Here is what happens in a few of those situations:
- Cache all enabled, overwrite:
dd is constantly at 100% CPU (question: shouldn't it always be at 0% CPU,
always waiting on blocking IO??). Depending on the moment, either
kjournald or pdflush is at about 75%. Most of the time it is kjournald.
md1_raid5 (raid6 in fact) is around 35%.

- Cache all enabled, first write:
like above but there are often moments in which neither kjournald nor
pdflush are running. Hence the speed difference. dd is always at near
100% CPU.

- cache only in disks attached to mobo, overwrite:
Similar to "cache all enabled, overwrite" except that in this case dd
can never reach 100%, it is around 40%, the other processes are down
accordingly, hence the lower speed. There are more red bars shown in
htop, for all cores.

- cache only in disks attached to mobo, first write:
dd reaches 100% but kjournald reaches 40% max. Pdflush reaches 15% max.
md1_raid5 is down to about 15%.

- cache all disabled, overwrite:
dd reaches about 30%, kjournald is max 20% and md1_raid5 reaches 10%
max. Actually dd alone reaches even 100% but only in the first 20
seconds or so and at that time kjournald and md1_raid5 are still at 20%
and 10%.

- cache all disabled, first write:
similar to above.


So I don't understand how the thing works here.
I don't understand why dd CPU is at 100% (caches on) instead of being at 0%.
I don't understand why kjournald doesn't go to 100%. I don't understand
what kjournald has to do in the case of overwrite (there is no
significant journaling on overwrites, right? I am using defaults, which
should be data=ordered).
I don't understand why the caches change the performance so much for
sequential write...
Also, a question: if I had the most powerful hardware RAID, would
performance be limited to 200MB/sec anyway due to kjournald??


Then I have another question: "sync" from the bash really seems to work
in the sense that it takes time and after this time I confirm that the
activity LEDs of the drives are really off. I have a MD-raid-6+LVM here!
Weren't both MDraid5-6 AND LVM supposed NOT to pass the write barriers
downstream to the disks?? Doesn't sync use exactly barriers?
(implemented with device flushes) Sync here seems to work!

Thanks for your help
 
...
Posted: Thu Apr 23, 2009 12:55 am
Guest
In comp.arch.storage, kkkk <kkkk at (no spam) bbbb.com> wrote:
Quote:
We are using an ext3 filesystem with defaults mount on top of LVM + MD
raid 6. The raid-6 is on 12 disks (hence it is 10 disks for data, 2 for
parity). 6 of those disks are through the mobo controller, the others
are through the 3ware.

First of all, you've got best of breed 3Ware 9650SE controller which has the
best RAID6 of all SATA RAID controllers... And you're mixing it with onboard
controller to produce the software RAID6?! WHY?!!!

Second, 12 drives for RAID6 is suboptimal... Go with 8 or 16 drives and
attach them directly to 3Ware...

Do this:
Attach 8 drives to 3Ware 9650SE (it is 9650SE-8, right?), and 4 drives to
the onboard controller... Build hardware RAID6 on 3ware and if possible,
hardware RAID6 on onboard controller... And have 2 logical drives with
that... If you really want, concatenate them via LVM, but I won't suggest
it...

Other thing, RAID6 has double distributed parity (distributed the same way as
RAID5's single parity)... So, every drive is used for data, and every drive is
used for parity...

Quote:
I hoped I would get something like 1 GB/sec sequential write on 10 disks
:P instead I see MUCH lower performances

With your configuration - very unlikely! :/

--

Damir Lukic, calypso at (no spam) _MAKNIOVO_fly.srk.fer.hr
http://inovator.blog.hr
http://calypso-innovations.blogspot.com/
 
kkkk...
Posted: Thu Apr 23, 2009 2:51 pm
Guest
calypso at (no spam) fly.srk.fer.hr.invalid wrote:
Quote:
In comp.arch.storage, kkkk <kkkk at (no spam) bbbb.com> wrote:
We are using an ext3 filesystem with defaults mount on top of LVM + MD
raid 6. The raid-6 is on 12 disks (hence it is 10 disks for data, 2 for
parity). 6 of those disks are through the mobo controller, the others
are through the 3ware.

First of all, you've got best of breed 3Ware 9650SE controller which has the
best RAID6 of all SATA RAID controllers... And you're mixing it with onboard
controller to produce the software RAID6?! WHY?!!!

We are a research entity and the funding comes at unknown times. We
decided to build the system so that any component can be replaced with
any other similar component available in shops at any time, e.g. the
9650SE can be replaced with multiple controllers in the future. If the
3ware breaks in the future, at that time there might not be a compatible
controller in production. (We cannot buy from ebay!)
With the current setup, in any emergency, the disks can be connected via
any controller or even USB, and the MD linux raid will still work and we
will be able to get data out. Furthermore we trust visible, open,
old/tested, linux MD code more than any embedded RAID code which nobody
knows except 3ware. What if there was a bug in 9650SE code? It was a
recent controller when we bought it, and we would have found out only
later, maybe years later after setting up our array.
Also, we were already proficient with linux MD.

Anyway since linux MD raid never occupies more than 35% CPU (of a single
core!) in any test, I don't think it is the bottleneck. But this is part
of my question.


Quote:
Second, 12 drives for RAID6 is suboptimal...

WHY??

Quote:
Go with 8 or 16 drives and attach them directly to 3Ware...

We have already lots of data and virtual machines loaded in there. Even
if it was possible to attach all to 3ware controller (actually it might
indeed be possible, since it is a 16ML [we have 24 drives on the
machine]), we wouldn't have used the RAID from 3ware for the reasons
explained above.

With MD raid it shouldn't make a difference unless you say that the
larger cache in the 3ware speeds up the operation. This is again part of
my question: the cache seems to have a dramatic effect which I do not
completely understand for sequential I/O. It must be something related
to the bus overhead or the context switching of the CPU (for serving the
interrupts) but I would like a confirmation. Also consider that with 8
cores and a PCI express bus, both overheads should have been negligible.
Anyway the cache from the drives should be enough to minimize this
overhead (I mean for the MOBO drives) so I would not expect a tremendous
speedup from using the 3ware cache for all the drives (I mean still with
MD).

Quote:
If you really want, concatenate them via LVM, but I won't suggest
it...

LVM concatenation looks very unsafe...

Quote:
Other thing, RAID6 has double distributed parity (like RAID5 does)... So,
every drive is for data, and every drive is used for parity...

I know. My sentence was for explaining the exact chunk/stride size we have.

Quote:
I hoped I would get something like 1 GB/sec sequential write on 10 disks
:P instead I see MUCH lower performances

With your configuration - very unlikely! :/

What performance would you expect from 3ware raid-6 12-disks with ext3
(defaults mount) sequential dd write?

Thank you
 
Ed Wilts...
Posted: Thu Apr 23, 2009 2:58 pm
Guest
Quote:
Had we found a cheap no-raid 16 drives SATA controller for PCI-Express
we would have bought it. If you know of any, please tell me.

The standard rule of thumb is "good, fast, cheap" - pick any 2. If
you want reasonably good and cheap, you're not going to get fast.

I've seen cheap and fast implementations but they weren't any good - I
had the pleasure of recovering a corrupt 25TB file system over my
Christmas break. We've since replaced it with a more expensive but
good solution.
 
...
Posted: Thu Apr 23, 2009 4:52 pm
Guest
In comp.arch.storage, kkkk <kkkk at (no spam) bbbb.com> wrote:
Quote:
We are a research entity and the funding comes at unknown times. We
decided to build the system so that any component can be replaced with
any other similar component available in shops at any time, e.g. the
9650SE can be replaced with multiple controllers in the future. If the
3ware breaks in the future, at that time there might not be a compatible
controller in production. (We cannot buy from ebay!)

For data recovery purposes, anyone can spare 50$ and buy it as a private
person from ebay...

Quote:
With the current setup, in any emergency, the disks can be connected via
any controller or even USB, and the MD linux raid will still work and we
will be able to get data out. Furthermore we trust visible, open,
old/tested, linux MD code more than any embedded RAID code which nobody
knows except 3ware. What if there was a bug in 9650SE code? It was a
recent controller when we bought it, and we would have found out only
later, maybe years later after setting up our array.
Also, we were already proficient with linux MD.

There is very good support for 3Ware controllers on www.3ware.com, just
check knowledge base...

3Ware 9650SE is a controller that is at least 4 years old, and all the bugs are
solved (there were some incompatibilities at first with chipsets, OSes and
such stuff, but now it's working as it should)...

Quote:
Anyway since linux MD raid never occupies more than 35% CPU (of a single
core!) in any test, I don't think it is the bottleneck. But this is part
of my question.

RAID ASIC + onboard cache vs. software implementation? RAID ASIC always...

Quote:
Second, 12 drives for RAID6 is suboptimal...

WHY??

Because! :)

8 or 16 is optimal for RAID5 and RAID6...

How do you calculate block size of a stripe with number of drives different
than 8 or 16?

64/12 = ?
256/10 = ?

I don't know how cache is organized in 3Ware, but in EMC storage systems
(CLARiiON) you can choose the memory page size (4kB, 8kB and 16kB) to
optimize it for some applications...

Quote:
Go with 8 or 16 drives and attach them directly to 3Ware...

We have already lots of data and virtual machines loaded in there. Even
if it was possible to attach all to 3ware controller (actually it might
indeed be possible, since it is a 16ML [we have 24 drives on the
machine]), we wouldn't have used the RAID from 3ware for the reasons
explained above.

Your reasons are quite paranoid... And using 9650SE-16ML (1000$ controller)
as a normal SATA controller is, sorry for the term, stupid... ;)

http://store.3ware.com/?category=10&subcategory=8&productid=9650SE-16ML

Place 16 drives on this controller, and build RAID6 from them... If you
really want, buy yourself another one as a spare, but the company I worked
for had sold many servers and workstations on Supermicro+3Ware combinations,
and haven't heard yet that a controller went dead... That was around 5 years
ago when I first started working for that company, back then I've used 3Ware
8506 controllers...

Quote:
With MD raid it shouldn't make a difference unless you say that the
larger cache in the 3ware speeds up the operation. This is again part of
my question: the cache seems to have a dramatic effect which I do not
completely understand for sequential I/O. It must be something related
to the bus overhead or the context switching of the CPU (for serving the
interrupts) but I would like a confirmation. Also consider that with 8
cores and a PCI express bus, both overheads should have been negligible.
Anyway the cache from the drives should be enough to minimize this
overhead (I mean for the MOBO drives) so I would not expect a tremendous
speedup from using the 3ware cache for all the drives (I mean still with
MD).

256MB cache is quite a lot for a RAID controller, and it is used mostly as a
write cache (and as local memory for the RAID ASIC chip that does the RAID5 and
RAID6 calculations)...

And why a RAID controller is much faster than your software RAID is very
simple... The RAID controller has its own firmware with many optimized RAID
features (lots of years were spent researching the RAID algorithms that were
implemented in hardware, 9650SE uses the 8th generation of StorSwitch
technology) and has an onboard cache for RAID functions...

Quote:
I hoped I would get something like 1 GB/sec sequential write on 10 disks
:P instead I see MUCH lower performances

With your configuration - very unlikely! :/

What performance would you expect from 3ware raid-6 12-disks with ext3
(defaults mount) sequential dd write?

I haven't tested RAID6 on 9650SE, but have tested RAID5 on 9650SE (older
generation on PCI-X), and IIRC got around 250MB/s write from 15x160GB
Hitachi 7200rpm SATA drives... So, with this 9650SE I expect at least around
350MB/s from 16 drives (today's SATA)... Consider that bandwidth is not what
you'll be worried about, it's more the RAID6 write penalty, which cache memory
cancels out (it's 6 IOPS per write)...


Just use this 3Ware as it should be used and stop thinking about 'what will
happen' things... :) 3Ware won't die, drives will die more likely...

--

Damir Lukic, calypso at (no spam) _MAKNIOVO_fly.srk.fer.hr
http://inovator.blog.hr
http://calypso-innovations.blogspot.com/
 
David Brown...
Posted: Thu Apr 23, 2009 5:45 pm
Guest
kkkk wrote:
Quote:
Hi there,
we have a big "storage" computer: dual Xeon E5345 @ 2.33GHz (8 cores
total) with lots of disks partly connected to a 3ware 9650SE controller
and partly to the SATA/SAS controllers in the mobo.
The hard disks are Western Digital WD7500AYYS 750GB

We are using an ext3 filesystem with defaults mount on top of LVM + MD
raid 6. The raid-6 is on 12 disks (hence it is 10 disks for data, 2 for
parity). 6 of those disks are through the mobo controller, the others
are through the 3ware.


I understand entirely your reasons for wanting to use Linux software
raid rather than a hardware raid. But I've a couple of other points or
questions - as much for my own learning as anything else.

If you have so many disks connected, did you consider having at least
one as a hot spare? If one of your disks dies and it takes time to
replace it, the system will be very slow while running degraded.

Secondly, did you consider raid 10 as an alternative? Obviously it is
less efficient in terms of disk space, but it should be much faster. It
may also be safer (depending on the likely rates of different kinds of
failures) since there is no "raid 5 write hole". Raid 6, on the other
hand, is probably the slowest raid choice. Any writes that don't cover
a complete stripe will need reads from several of the disks, followed by
parity calculations - and the more disks you have, the higher the
chances of hitting such incomplete stripe writes.

<http://www.enterprisenetworkingplanet.com/nethub/article.php/10950_3730176_1>
 
kkkk...
Posted: Thu Apr 23, 2009 6:37 pm
Guest
David Brown wrote:
Quote:
If you have so many disks connected, did you consider having at least
one as a hot spare?

Of course! We have 4 spares shared among all the arrays.

Quote:
Secondly, did you consider raid 10 as an alternative?

I wouldn't expect the performance of raid 10 via MD to be higher than that of
the raid-6 of my original post (it might even be much slower at the same
number of drives) because, as I mentioned, the "md1_raid5" (raid-6
actually) process never goes higher than 35% CPU usage. Regarding
the read+checksum+write problem of raid5/6 for small writes, there
shouldn't be any in this case because I am doing a sequential write.

Quote:
Any writes that don't cover
a complete stripe will need reads from several of the disks,

Not the case here because I am doing sequential write.

Also, the overhead you mention is present if the stripe is not in cache,
but with large amounts of RAM I expect the stripe should be in cache
(especially the stripe related to the file/directory metadata should
be... while the rest doesn't matter as it is sequential). Yesterday
during the tests the free amount of RAM was 33GB on that machine over a
total of 48GB...
 
David Schwartz...
Posted: Fri Apr 24, 2009 2:43 am
Guest
On Apr 23, 3:51 am, kkkk <k... at (no spam) bbbb.com> wrote:

Quote:
Anyway since linux MD raid never occupies more than 35% CPU (of a single
core!) in any test, I don't think it is the bottleneck. But this is part
of my question.

It is the bottleneck, it's just not a CPU bottleneck, it's an I/O
bottleneck. The problem is simply the number of I/Os the system has to
issue. With a 12 disk RAID 6 array implemented in software, a write of
a single byte (admittedly the worst case) will require 10 reads
followed by 12 writes that cannot be started until all 10 reads
complete. Each of these operations has to be started and completed by
the MD driver.

I understand the reasoning behind your configuration choices, they
just utterly sacrifice performance.

DS
 
...
Posted: Fri Apr 24, 2009 7:50 am
Guest
In comp.arch.storage, Bill Todd <billtodd at (no spam) metrocast.net> wrote:
Quote:
Calypso seems especially ignorant when talking about optimal RAID group
sizes. Perhaps he's confusing RAID-5/6 with RAID-3 - but even then he'd
be wrong, since what you really want with RAID-3 is for the total *data*
content (excluding parity) of a stripe to be a convenient value, meaning
that you tend to favor group sizes like 5 or 9 (not counting any spares
that may be present). And given that you've got both processing power
and probably system/memory bus bandwidth to burn, there's no reason why
a software RAID-6 implementation shouldn't perform fairly competitively
with a hardware one.

RAID3 implementation doesn't exist on 3Ware controllers... So far, I've seen
RAID3 only on some storage arrays (EMC to tell you the truth only, can check
the others), and the old SCSI RAID controller - AMI MegaRAID Enterprise
1300 that I once had...

RAID3 is very similar to RAID5, but hasn't got distributed parity drive,
instead it has got dedicated parity drive, and, yes, it's used for special
purposes only where big sequential read/write speed is needed... But, like I
said already, 3Ware doesn't support it...

Seems like I was partially right with 8 or 16 drives as an optimal number of
drives... Seems like that for RAID6 it's optimal to have 6, 10 or 18 drives
(4+2, 8+2, 16+2)... Here's a nice text from EMC guy (look at Stripe size of
a LUN):

http://clariionblogs.blogspot.com/



--

Damir Lukic, calypso at (no spam) _MAKNIOVO_fly.srk.fer.hr
http://inovator.blog.hr
http://calypso-innovations.blogspot.com/
 
Michel Talon...
Posted: Fri Apr 24, 2009 9:21 am
Guest
calypso at (no spam) fly.srk.fer.hr.invalid wrote:
Quote:
In comp.arch.storage, Bill Todd <billtodd at (no spam) metrocast.net> wrote:
Calypso seems especially ignorant when talking about optimal RAID group
sizes. Perhaps he's confusing RAID-5/6 with RAID-3 - but even then he'd
be wrong, since what you really want with RAID-3 is for the total *data*
content (excluding parity) of a stripe to be a convenient value, meaning
that you tend to favor group sizes like 5 or 9 (not counting any spares
that may be present). And given that you've got both processing power
and probably system/memory bus bandwidth to burn, there's no reason why
a software RAID-6 implementation shouldn't perform fairly competitively
with a hardware one.

RAID3 implementation doesn't exist on 3Ware controllers... So far, I've seen
RAID3 only on some storage arrays (EMC to tell you the truth only, can check
the others), and the old SCSI RAID controller - AMI MegaRAID Enterprise
1300 that I once had...

RAID3 you can get with FreeBSD and its geom module, if you need it.

Quote:

RAID3 is very similar to RAID5, but hasn't got distributed parity drive,
instead it has got dedicated parity drive, and, yes, it's used for special
purposes only where big sequential read/write speed is needed... But, like I
said already, 3Ware doesn't support it...

Seems like I was partially right with 8 or 16 drives as an optimal number of
drives... Seems like that for RAID6 it's optimal to have 6, 10 or 18 drives
(4+2, 8+2, 16+2)... Here's a nice text from EMC guy (look at Stripe size of
a LUN):

http://clariionblogs.blogspot.com/




--

Michel TALON
 
...
Posted: Fri Apr 24, 2009 10:26 am
Guest
In comp.arch.storage, kkkk <kkkk at (no spam) bbbb.com> wrote:
Quote:
I haven't tested RAID6 on 9650SE, but have tested RAID5 on 9650SE (older
generation on PCI-X), and IIRC got around 250MB/s write from 15x160GB
Hitachi 7200rpm SATA drives... So, with this 9650SE I expect at least around
350MB/s from 16 drives (today's SATA)...

What filesystem and operating system? This is important...

Windows XP, NTFS...

Quote:
I assume you mean "first write of a sequential file"..?
(overwrites as you see are much faster)

These results are from a benchmarking tool used with BlackMagic Video Design
capture cards...

Quote:
Consider that bandwidth is not what
you'll be worried about, it's more the RAID6 write penalty, which cache memory
cancels out (it's 6 IOPS per write)...

"6 IOPS per write"? Could you explain this?

Normal write penalty for small writes in RAID6... RAID5 has got 4 IOPS write
penalty...

http://www.slichke.com/viewer.php?id=rgh1240568505h.png

This picture is taken from the Berkeley lectures about RAID by Prof. Patterson
(one of the inventors of RAID arrays)...
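
The 4 and 6 figures above can be reproduced with a small Python sketch. It counts disk operations for a write that touches a single data chunk, under two textbook strategies: read-modify-write (read old data and old parity, write new data and new parity) and reconstruct-write (read the other data chunks of the stripe, recompute parity, write). The function names are illustrative, and which strategy a given controller or MD version actually uses is not stated in this thread.

```python
from typing import Tuple

def rmw_ios(parity_chunks: int) -> Tuple[int, int]:
    """Read-modify-write of one data chunk: (reads, writes)."""
    reads = 1 + parity_chunks     # old data + old parity chunk(s)
    writes = 1 + parity_chunks    # new data + new parity chunk(s)
    return reads, writes

def reconstruct_write_ios(total_disks: int, parity_chunks: int) -> Tuple[int, int]:
    """Reconstruct-write of one data chunk: (reads, writes)."""
    data_chunks = total_disks - parity_chunks
    reads = data_chunks - 1       # every other data chunk in the stripe
    writes = 1 + parity_chunks    # new data + recomputed parity
    return reads, writes

print(sum(rmw_ios(1)))                 # RAID5 small-write penalty
print(sum(rmw_ios(2)))                 # RAID6 small-write penalty
print(reconstruct_write_ios(12, 2))    # 12-disk RAID6, one chunk rewritten
```

A full-stripe write needs no reads at all under either strategy, which is why the sequential case discussed elsewhere in the thread is treated separately from this small-write penalty.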

--

Damir Lukic, calypso at (no spam) _MAKNIOVO_fly.srk.fer.hr
http://inovator.blog.hr
http://calypso-innovations.blogspot.com/
 
...
Posted: Fri Apr 24, 2009 10:29 am
Guest
In comp.arch.storage, kkkk <kkkk at (no spam) bbbb.com> wrote:
Quote:
I still don't understand. Why should 4+2, 8+2, 16+2 be more optimal?
Please note that one raid chunk is NOT one block long (512 bytes). In
fact on my raid-6 it is 64KB long, so the stripes are 64*10=640KB long.
What's wrong with that? Why should that perform worse than 512KB
long? Please note that ext3 has blocksize 4K, so there are 160 and 128
ext3 blocks in one stripe respectively in the two configurations. I
don't see why 128 blocks should be significantly better than 160 blocks..!?

Because you're thinking decimal instead of binary/hexadecimal...

Cache memory optimizations and firmware optimizations are on base2, not
base10...
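
The stripe arithmetic from the quoted question can be checked with a short Python sketch. The 64KB chunk and 4K ext3 block size come from the quoted post; the function names and the power-of-two test are illustrative only, and whether a power-of-two stripe matters for MD or for a given controller is exactly what is being disputed here.

```python
def stripe_geometry(total_disks, parity_disks, chunk_kib, fs_block_kib=4):
    """Return (data_disks, stripe_kib, fs_blocks_per_stripe, stride_blocks)."""
    data_disks = total_disks - parity_disks
    stripe_kib = data_disks * chunk_kib      # data held by one full stripe
    return (data_disks, stripe_kib,
            stripe_kib // fs_block_kib,      # ext3 blocks per stripe
            chunk_kib // fs_block_kib)       # ext3 blocks per chunk ("stride")

def is_power_of_two(n):
    """True when n is a positive power of two."""
    return n > 0 and n & (n - 1) == 0

for disks in (12, 10):                       # 10+2 versus 8+2 RAID-6
    data, stripe, blocks, stride = stripe_geometry(disks, 2, 64)
    print(disks, data, stripe, blocks, stride, is_power_of_two(stripe))
```

Both layouts give a whole number of 4K blocks per stripe (160 and 128), so the only difference this sketch shows is that 512KB is a power of two and 640KB is not.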

--

Damir Lukic, calypso at (no spam) _MAKNIOVO_fly.srk.fer.hr
http://inovator.blog.hr
http://calypso-innovations.blogspot.com/
 
David Schwartz...
Posted: Fri Apr 24, 2009 11:02 am
Guest
On Apr 24, 2:10 am, kkkk <k... at (no spam) bbbb.com> wrote:
Quote:
David Schwartz wrote:
It is the bottleneck, it's just not a CPU bottleneck, it's an I/O
bottleneck.

With an 8x PCI-e bus there should be space for 2 GB/sec transfer...

Yeah, I agree with you. It looks like an MD issue. On the bright side,
I heard from a reliable source that:

"Furthermore we trust visible, open, old/tested, linux MD code more
than any embedded RAID code which nobody knows except 3ware. What if
there was a bug in 9650SE code? It was a recent controller when we
bought it, and we would have found out only later, maybe years later
after setting up our array. Also, we were already proficient with
linux MD."

The flipside is, you have an untested configuration and nobody
specific who is obligated to provide you with support. You're probably
ahead of the curve, so you may hit every problem before anyone else
does.

DS
 
...
Posted: Fri Apr 24, 2009 11:38 am
Guest
In comp.arch.storage, kkkk <kkkk at (no spam) bbbb.com> wrote:
Quote:
What filesystem and operating system? This is important...

Windows XP, NTFS...

I suspected that. I suspect NTFS is much faster than ext3, it will
probably be like XFS in Linux (and also more unsafe, e.g. in case of
power loss, just like XFS). Speed depends, among other things, on how
paranoid the journal behaviour is.

NTFS unsafe in case of power loss? You missed something, we're not talking
about FAT here (which is faster than NTFS)...

--

Damir Lukic, calypso at (no spam) _MAKNIOVO_fly.srk.fer.hr
http://inovator.blog.hr
http://calypso-innovations.blogspot.com/
 
kkkk...
Posted: Fri Apr 24, 2009 1:10 pm
Guest
David Schwartz wrote:
Quote:
It is the bottleneck, it's just not a CPU bottleneck, it's an I/O
bottleneck.

With an 8x PCI-e bus there should be space for 2 GB/sec transfer...

Quote:
The problem is simply the number of I/Os the system has to
issue. With a 12 disk RAID 6 array implemented in software, a write of
a single byte (admittedly the worst case) will require 10 reads
followed by 12 writes that cannot be started until all 10 reads
complete. Each of these operations has to be started and completed by
the MD driver.

This is true only for non-sequential write.

In my case the system starts writing 5 seconds after dd is pushing data
out (dirty_writeback_centisecs = 500). At that time there is so much
sequential data to write that it will fill many stripes completely.
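
To put a rough number on "many stripes", here is a Python sketch. It converts dirty_writeback_centisecs to seconds and estimates how many full 640KB stripes of data would be buffered in that time. As a deliberately low assumption it uses the 111 MB/sec first-write figure from the original post as the fill rate; the real rate at which dd fills the page cache is not given in the thread, and the function names are illustrative.

```python
def centisecs_to_seconds(centisecs):
    """Convert a *_centisecs kernel tunable value to seconds."""
    return centisecs / 100

def full_stripes_buffered(rate_mb_s, seconds, stripe_kib):
    """Whole stripes of data accumulated at rate_mb_s (10**6 bytes/s)."""
    bytes_buffered = rate_mb_s * 1e6 * seconds
    return int(bytes_buffered // (stripe_kib * 1024))

delay = centisecs_to_seconds(500)              # dirty_writeback_centisecs = 500
stripes = full_stripes_buffered(111, delay, 640)
print(delay, stripes)
```

Even at this conservative rate several hundred complete stripes are waiting when writeback starts. Whether MD then actually receives them as full-stripe writes depends on how the requests are merged on the way down, which this sketch does not model.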
 
 
All times are GMT