
IO Contention on the Redo Logs...

Pat...
Posted: Wed Oct 14, 2009 8:58 pm
Guest
I've been trying to troubleshoot a troublesome (sic) performance issue
on one of our busiest Oracle servers and I was hoping somebody here
might have some insight into what I'm seeing.

Every "now and then" (it's not predictable), when the box is under a
whole lot of IO load (lots of read/write activity), the whole box bogs
down incredibly and I get about a 3 minute "hang". If you look at the
wait tree, everybody is waiting on "log file parallel write".

Problem is, if I look at my SAR reports (or vmstat) during one of
these 3 minute hangs, the IOs on the box drop to the floor, i.e. we do
far fewer IOs during a hang than before and after. We're seeing maybe
25k blocks/sec in/out before and after, and drop down to 25-50
blocks/sec in/out during the hang.
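
To pin down exactly when these stalls start and stop, the per-interval block counts from vmstat or sar can be scanned for runs where throughput collapses relative to the normal rate. The following is only a rough sketch: the function name, the 1% threshold, and the sample numbers are assumptions made for illustration, shaped after the figures quoted above rather than taken from real output.

```python
from typing import List, Tuple

# One sample = (blocks_in, blocks_out) for a single sampling interval,
# e.g. the bi/bo columns of vmstat. The layout is an assumption.
Sample = Tuple[int, int]


def find_stalls(samples: List[Sample], baseline: float,
                fraction: float = 0.01, min_len: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every run of at
    least min_len samples where bi + bo is below fraction * baseline."""
    threshold = baseline * fraction
    stalls = []
    run_start = None
    for i, (bi, bo) in enumerate(samples):
        if bi + bo < threshold:
            if run_start is None:
                run_start = i          # a low-throughput run begins here
        else:
            if run_start is not None and i - run_start >= min_len:
                stalls.append((run_start, i))
            run_start = None
    # Handle a run that is still open when the samples end.
    if run_start is not None and len(samples) - run_start >= min_len:
        stalls.append((run_start, len(samples)))
    return stalls


if __name__ == "__main__":
    # Hypothetical data: about 25k blocks/sec normally, 25-50 during a hang.
    busy = [(12000, 13000)] * 4
    hung = [(20, 15)] * 5
    print(find_stalls(busy + hung + busy, baseline=25000))  # prints [(4, 9)]
```

Lining the resulting windows up against timestamps from the SAN side would show whether the array was busy with something else during the same minutes.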

We've engaged Oracle support on this, and, while they're not certain
they know what's going on, they have pointed to generally poor
performance of the IO subsystem when writing REDO logs.

Right now, the entire database has everything mounted on a single
Fibre Channel LUN on /u01. Even the REDO logs are on that same LUN.

One of the things the storage guys have been pointing to is that
there's a single IO queue on our QLogic cards per LUN, so my REDO
traffic is, in fact, fighting its way down the same scheduler queue as
my normal data blocks. They've suggested provisioning a new pair of
smaller LUNs, one for each half of the REDO log group.

Another thing that's been suggested is that I switch the RedHat IO
scheduler from CFQ to NOOP and just let the HBA and SAN handle the
block reordering. I'm dubious about this one, though, since I'm not
seeing a bottleneck on the host scheduler and I have to assume there's
some benefit to the block reordering going on here.
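
For reference, on 2.6 kernels the scheduler in use for a block device is the bracketed entry in /sys/block/<dev>/queue/scheduler. A minimal sketch for reading it follows; the device name sdb is a placeholder rather than anything from this system, and the helper names are made up for illustration.

```python
from pathlib import Path


def parse_active_scheduler(text: str) -> str:
    """Return the bracketed scheduler name from a sysfs scheduler line,
    e.g. 'noop anticipatory deadline [cfq]' gives 'cfq'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active scheduler marked in: %r" % text)


def active_scheduler(device: str, sysfs_root: str = "/sys/block") -> str:
    """Read and parse the scheduler file for one block device."""
    path = Path(sysfs_root, device, "queue", "scheduler")
    return parse_active_scheduler(path.read_text())


if __name__ == "__main__":
    # "sdb" is a hypothetical device name; substitute the device behind /u01.
    try:
        print(active_scheduler("sdb"))
    except (OSError, ValueError) as exc:
        print("could not determine scheduler:", exc)
```

Writing a scheduler name to that same file as root changes it at runtime, so NOOP or deadline can be tried on a single device and reverted without a reboot.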

So I suppose my questions to the group are:

1) Has anybody else seen similar "hangups" with the characteristic
lack of IO throughput I identified above?
2) If you're deploying Oracle on a SAN, what, in your experience, is
the optimal layout of files on LUNs? I know how I lay things out on
DASD, but the rules in the SAN world look to be subtly different.
3) Does anybody have any experience tweaking the RedHat IO schedulers?
What are folks' experiences with the different options?

Particulars:
Oracle: 10.2.0.4
Host: 8 cores (intel) 32G
OS: RedHat EL 5
Storage: NetApp 3040
HBA: QLogic
 
hpuxrac...
Posted: Wed Oct 14, 2009 10:32 pm
Guest
On Oct 14, 4:58 pm, Pat <pat.ca... at (no spam) service-now.com> wrote:

snip

Quote:
I've been trying to troubleshoot a troublesome (sic) performance issue
on one of our busiest Oracle servers and I was hoping somebody here
might have some insight into what I'm seeing.

Every "now and then" (it's not predictable), when the box is under a
whole lot of IO load (lots of read/write activity), the whole box bogs
down incredibly and I get about a 3 minute "hang". If you look at the
wait tree, everybody is waiting on "log file parallel write".

Problem is, if I look at my SAR reports (or vmstat) during one of
these 3 minute hangs, the IOs on the box drop to the floor, i.e. we do
far fewer IOs during a hang than before and after. We're seeing maybe
25k blocks/sec in/out before and after, and drop down to 25-50
blocks/sec in/out during the hang.

It sounds like the SAN workload may be impacting you. You are trying
to do IO but not able to do it quickly.

My system gets around 1 to 2 ms for log file parallel write. 25k
blocks/sec is when it is good? Ouch!
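
A comparable latency figure can be derived by differencing two snapshots of V$SYSTEM_EVENT for the "log file parallel write" event and dividing the elapsed wait time by the number of waits. The sketch below only does the arithmetic; the snapshot numbers are invented for illustration, and the function name is hypothetical.

```python
from typing import Optional

# Each snapshot would come from something like:
#   SELECT total_waits, time_waited_micro
#     FROM v$system_event
#    WHERE event = 'log file parallel write';
# TIME_WAITED_MICRO is reported in microseconds.


def avg_wait_ms(waits_start: int, micro_start: int,
                waits_end: int, micro_end: int) -> Optional[float]:
    """Average wait in milliseconds between two snapshots, or None if no
    waits were recorded in the interval."""
    waits = waits_end - waits_start
    if waits <= 0:
        return None
    return (micro_end - micro_start) / waits / 1000.0


if __name__ == "__main__":
    # Made-up numbers: 60,000 waits costing 90 seconds in total.
    print(avg_wait_ms(1_000_000, 1_500_000_000,
                      1_060_000, 1_590_000_000))  # prints 1.5
```

Taking the snapshots a few minutes apart, once during normal load and once spanning a hang, would show how far the average moves from the 1 to 2 ms range.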

Quote:
We've engaged Oracle support on this, and, while they're not certain
they know what's going on, they have pointed to generally poor
performance of the IO subsystem when writing REDO logs.

Right now, the entire database has everything mounted on a single
Fibre Channel LUN on /u01. Even the REDO logs are on that same LUN.

You really want to have separate LUNs. All my redo logs for any kind
of production system are on RAID 10. You probably do not want RAID 5
for redo logs.

Quote:
One of the things the storage guys have been pointing to is that
there's a single IO queue on our QLogic cards per LUN, so my REDO
traffic is, in fact, fighting its way down the same scheduler queue as
my normal data blocks. They've suggested provisioning a new pair of
smaller LUNs, one for each half of the REDO log group.

Sure.

Quote:
Another thing that's been suggested is that I switch the RedHat IO
scheduler from CFQ to NOOP and just let the HBA and SAN handle the
block reordering. I'm dubious about this one, though, since I'm not
seeing a bottleneck on the host scheduler and I have to assume there's
some benefit to the block reordering going on here.

Dunno about that, but we are using ASM, which is pretty similar to RAW,
and the async IO setup etc. goes along with all that.

My question is ... what are the SAN people telling you about what else
is impacting the SAN when your IO throughput goes thru the floor?
 
joel garry...
Posted: Wed Oct 14, 2009 10:59 pm
Guest
On Oct 14, 1:58 pm, Pat <pat.ca... at (no spam) service-now.com> wrote:
Quote:
3) Does anybody have any experience tweaking the RedHat IO schedulers?
What are folks' experiences with the different options?

Don't know anything about it, but this has an interesting graph:
http://www.redhat.com/magazine/008jun05/features/schedulers/ . The
comment about not wanting to use noop unless you have a saturated cpu
seems reasonable too, although simple changes to, say, SGA size or
some programs can change the cpu characteristics enormously.

Note that redo is the Achilles' heel of Oracle - if you mess it up,
you can lose data. That's why you want to have it on the fastest
possible serially writing device, with redundancy. That separate LUN
suggestion may be what you need. It is entirely possible that you
simply fill up the hardware buffer and then everyone has to wait -
LGWR can't continue until it is told the write has really happened,
and that's not something you want to turn off (google "asynchronous
commit oracle" if you want to be stupid, but note that PL/SQL does it,
because it is smart enough to wait for the final ack).
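
To get a feel for what waiting for an acknowledged write costs on a given filesystem, a small probe can time sequential appends that are each followed by fsync. This is a rough sketch and not how LGWR itself issues IO (Oracle may use direct or asynchronous IO depending on configuration); the function name, block size, and temporary path are assumptions chosen for illustration.

```python
import os
import tempfile
import time
from typing import List


def time_sync_writes(path: str, count: int = 100, size: int = 4096) -> List[float]:
    """Append count blocks of size bytes to path, calling fsync after each
    one, and return the per-write latency in milliseconds."""
    block = b"\0" * size
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, block)
            os.fsync(fd)   # do not return until the OS reports the data flushed
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
    return latencies


if __name__ == "__main__":
    # A temporary directory is used here; to test a candidate redo LUN,
    # point the path at a scratch file on that mount instead.
    with tempfile.TemporaryDirectory() as scratch:
        lat = sorted(time_sync_writes(os.path.join(scratch, "probe.dat"), count=50))
        print("median ms:", lat[len(lat) // 2], "max ms:", lat[-1])
```

Running it against the shared LUN while the database is busy, and again against a dedicated LUN, gives a crude comparison of how long a serial writer is kept waiting in each layout.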

Check out the first (especially the links) and last posts here:
http://www.linux-archive.org/device-mapper-development/8763-performance-considerations-io-schedulers-dmmultipathing.html

It's worth it to read through this entire thread:
http://www.freelists.org/post/oracle-l/log-writer-tuning

Not really apropos, but interesting thoughts on investigating this kind
of thing nonetheless:
http://oraclesponge.wordpress.com/2006/10/02/linux-26-kernel-io-schedulers-for-oracle-data-warehousing-part-ii/

jg
--
at (no spam) home.com is bogus.
http://www.networkworld.com/community/node/46155
 
 
All times are GMT