Main Page | Report this Page
Computers Forum Index  »  Computer - Databases - MS SQL Server  »  UPDATE into large table...
Page 1 of 1    

UPDATE into large table...

Author Message
imani_technology_spam at (no spam) yahoo.com...
Posted: Tue Sep 29, 2009 11:03 pm
Guest
We have a table that contains billions of rows. When we try to UPDATE
that table, we overload the transaction log. What are some ways to
avoid this? Is there a way to "turn off" the transaction log
temporarily?
 
Erland Sommarskog...
Posted: Wed Sep 30, 2009 10:00 pm
Guest
imani_technology_spam at (no spam) yahoo.com (imani_technology_spam at (no spam) yahoo.com) writes:
Quote:
We have a table that contains billions of rows. When we try to UPDATE
that table, we overload the transaction log. What are some ways to
avoid this? Is there a way to "turn off" the transaction log
temporarily?

For UPDATEs no. An alternative may be to copy the data to a new
table with SELECT INTO, which is minimally logged, meaning that only
page allocations are logged. Note that this applies only in simple and
bulk_logged recovery.

Obviously this approach comes with its own set of problems.

Plamen's suggestion to update in smaller batches is also worth
investigating.

--
Erland Sommarskog, SQL Server MVP, esquel at (no spam) sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
 
Thomas Arthur Seidel...
Posted: Thu Oct 01, 2009 1:37 am
Guest
Possibly it is worth to try a bigger machine? SQL 2008 on Win2008Server with lots
of RAM might solve a performance problem like yours, its worth a try, isn't it?

TAS


imani_technology_spam at (no spam) yahoo.com wrote:
Quote:
We have a table that contains billions of rows. When we try to UPDATE
that table, we overload the transaction log. What are some ways to
avoid this? Is there a way to "turn off" the transaction log
temporarily?
 
Plamen Ratchev...
Posted: Thu Oct 01, 2009 1:56 am
Guest
You can update on smaller batches to avoid transaction log growth:
http://www.tek-tips.com/faqs.cfm?fid=3141

--
Plamen Ratchev
http://www.SQLStudio.com
 
Tom van Stiphout...
Posted: Thu Oct 01, 2009 5:15 am
Guest
On Tue, 29 Sep 2009 16:03:16 -0700 (PDT),
"imani_technology_spam at (no spam) yahoo.com" <imani_technology_spam at (no spam) yahoo.com>
wrote:

You already got some good suggestions, but to answer your direct
question: no, transaction logging cannot be turned off as far as I
know. As I understand it, it's such a fundamental part of database I/O
that even a switch to turn it off (e.g. when in development mode) was
not possible.

-Tom.
Microsoft Access MVP


Quote:
We have a table that contains billions of rows. When we try to UPDATE
that table, we overload the transaction log. What are some ways to
avoid this? Is there a way to "turn off" the transaction log
temporarily?
 
Erland Sommarskog...
Posted: Thu Oct 01, 2009 7:27 am
Guest
Tom van Stiphout (tom7744.no.spam at (no spam) cox.net) writes:
Quote:
You already got some good suggestions, but to answer your direct
question: no, transaction logging cannot be turned off as far as I
know. As I understand it, it's such a fundamental part of database I/O
that even a switch to turn it off (e.g. when in development mode) was
not possible.

It is correct, that SQL Server does not offer a way to turn off logging.
However, it would be perfectly possible to so. It would probably even be
fairly simple to implement. The problem is that people would not understand
the implications.

Many many years ago I worked with DEC/Rdb which had SET TRANSACTION START
BATCH_UPDATE. The instruction was clear: this is a non-logged transaction,
and if the transaction fails, restore a backup. Later, I was told by one of
the developers - he now works for Microsoft and SQL Server - what they did
was that they set a bit in the database header, meaning "database corrupt",
and whe the transaction was committed they cleared that bit.

The reason why Microsoft do not offer this option is obvious: there would be
no end of misery in this world of people losing their databases, because
they don't understand the implications. There's more than enough with people
who delete their log files.


--
Erland Sommarskog, SQL Server MVP, esquel at (no spam) sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
 
David Portas...
Posted: Thu Oct 01, 2009 12:24 pm
Guest
"Erland Sommarskog" <esquel at (no spam) sommarskog.se> wrote in message
news:Xns9C9760453D5ACYazorman at (no spam) 127.0.0.1...
Quote:

Many many years ago I worked with DEC/Rdb which had SET TRANSACTION START
BATCH_UPDATE. The instruction was clear: this is a non-logged transaction,
and if the transaction fails, restore a backup. Later, I was told by one
of
the developers - he now works for Microsoft and SQL Server - what they did
was that they set a bit in the database header, meaning "database
corrupt",
and whe the transaction was committed they cleared that bit.


Erland,

How would that be possible in an ACID system though? Suppose I update 1
million rows in a single statement and then the 1 million +1 row causes a
constraint violation. Is my database corrupt?

--
David Portas
 
Erland Sommarskog...
Posted: Thu Oct 01, 2009 2:22 pm
Guest
David Portas (REMOVE_BEFORE_REPLYING_dportas at (no spam) acm.org) writes:
Quote:
"Erland Sommarskog" <esquel at (no spam) sommarskog.se> wrote in message
news:Xns9C9760453D5ACYazorman at (no spam) 127.0.0.1...
Many many years ago I worked with DEC/Rdb which had SET TRANSACTION
START BATCH_UPDATE. The instruction was clear: this is a non-logged
transaction, and if the transaction fails, restore a backup. Later, I
was told by one of the developers - he now works for Microsoft and SQL
Server - what they did was that they set a bit in the database header,
meaning "database corrupt", and whe the transaction was committed they
cleared that bit.

How would that be possible in an ACID system though? Suppose I update 1
million rows in a single statement and then the 1 million +1 row causes a
constraint violation. Is my database corrupt?

I don't think they had constraints in the versions of DEC/Rdb I had. But
of course they had unique indexes, and yes, an index duplicate would kiss
your database bye-bye.

Theoretically, the engine could pre-validate all rows before it actually
perform the update, and thereby reducing the risk for a "rollback". But
there could still be triggers and other nasties to cause problems.

Note that you can achieve this corrupt database today. Run your transaction,
don't commit. Stop SQL Server. Delete the log file. (A very very bad idea).
Get the database into emergency mode, and you can access it read-only.
But what you are looking at can be very inconsistent. (And I like to stress
this again for the causual passers-by: you should *never* delete your log
file.)


--
Erland Sommarskog, SQL Server MVP, esquel at (no spam) sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
 
Banana...
Posted: Thu Oct 01, 2009 3:50 pm
Guest
David Portas wrote:
Quote:
Erland,

How would that be possible in an ACID system though? Suppose I update 1
million rows in a single statement and then the 1 million +1 row causes a
constraint violation. Is my database corrupt?


I'm no Erland, but the point isn't so much about constraint violation
but rather being able to rollback the changes. If one wanted to update 1
million rows, then there has to be enough space to "cache" those
affected rows so if something goes wrong, be it constraint violation or
something else, it can be rollbacked. But if we allowed update and
didn't have big enough logs to write the changes those same changes
would then get overwritten and we lose the whole ACID-ity.

IMHO, it's unfortunate that such thing has made it more painful than
actually warranted though I can understand Microsoft's reasoning for not
wanting to provide "disable it" button as I do agree with Erland that
there's too many people out there who don't stop and think before they
push the giant, red button. Wink
 
Dave...
Posted: Fri Oct 02, 2009 5:47 am
Guest
You can't turn off logging. Here are some choices:

1. Switch to simple recovery model during your update and use
batches. You'll being flying without point-in-time recovery until you
can switch back to full model, but maybe you have other systems in
place to manage that risk.

2. Do it slowly. Figure out how much extra tlog you can handle per
day and break your update job to do it over a period of days.

3. Far out idea... Create a new filegroup in prod and backup the
db. Restore to another machine - create updated version of table with
new name in new file group - backup file group. do partial db restore
of just filegroup to prod, process delta since backup, rename tables.
 
Dave...
Posted: Fri Oct 02, 2009 6:01 am
Guest
One more thing: Keep your batches small (< 5000) otherwise your
update will take out a table lock. Play with your batchsize and
"paglock" hint to get the biggest batch that doesn't lock your table.
 
Enorme Vigenti...
Posted: Fri Oct 02, 2009 11:48 am
Guest
imani_technology_spam at (no spam) yahoo.com ha scritto:
Quote:
We have a table that contains billions of rows. When we try to UPDATE
that table, we overload the transaction log. What are some ways to
avoid this? Is there a way to "turn off" the transaction log
temporarily?
my tips:

Split the big table into multiple "Patition Tables" (enterprise version
only).
 
 
Page 1 of 1    
All times are GMT
The time now is Fri Dec 11, 2009 7:01 pm