
Monday, October 1, 2012

Deleting large amounts of data

Situation(s):

Need to delete a very large number of rows from a very large table in SQL Server.

Straightforward Approach:

Just delete all of the rows in one shot. Things are generally not this simple now, are they?
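
For example, the one-shot version might look like this (dbo.BigTable and the date filter are hypothetical placeholders):

    -- One-shot delete: a single statement, one potentially huge transaction.
    DELETE FROM dbo.BigTable
    WHERE CreatedDate < '2012-01-01';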

Problems:
  • This would be the quickest way to delete a large number of rows, but you will probably end up with lock escalation to a table lock, which essentially makes the table unavailable during the delete. You would need a maintenance window.
  • The transaction log can grow explosively during the operation, and you might run into issues with having to increase the log space.
  • If you have transactional replication on the table, or DB mirroring, then deleting a large number of rows will generate a lot of log activity and can bring the system under a lot of load.

Refined Approach:

To address the above problems we can:
  • Delete rows in a gradual and iterative manner, by running the DELETE statement in a loop (see the sketch after this list).
  • Take log backups at regular intervals to contain the log growth.
  • Space the operations to ease the pressure on the system if there is transactional replication or DB mirroring involved.
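
A minimal sketch of the batched delete, assuming a hypothetical dbo.BigTable with a CreatedDate column; the 5000-row batch size and the one-second delay are placeholders to tune for your system:

    -- Delete in small batches so each transaction stays short,
    -- avoiding lock escalation and runaway log growth.
    DECLARE @rows INT = 1;

    WHILE @rows > 0
    BEGIN
        DELETE TOP (5000)
        FROM dbo.BigTable
        WHERE CreatedDate < '2012-01-01';

        SET @rows = @@ROWCOUNT;

        -- Pause between batches to ease pressure on replication/mirroring.
        WAITFOR DELAY '00:00:01';
    END;

Take log backups (for example, from a scheduled job) while the loop runs so the log space can be reused between batches.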


Partitioned Tables (Special case):

In the case of partitioned tables, we can use an interesting approach to address the log explosion:
  • Create an identical Trash table (name it anything).
  • Switch the partitions that are to be deleted into the Trash table.
  • Truncate the Trash table.
TRUNCATE is minimally logged compared to DELETE.
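
A sketch of the switch-and-truncate pattern, assuming dbo.BigTable is partitioned and partition 3 holds the rows to be deleted; the Trash table must have an identical schema and indexes and sit on the same filegroup as the partition being switched out:

    -- Switching a partition out is a metadata-only operation.
    ALTER TABLE dbo.BigTable
    SWITCH PARTITION 3 TO dbo.BigTable_Trash;

    -- Truncating the Trash table is minimally logged,
    -- unlike deleting the same rows with DELETE.
    TRUNCATE TABLE dbo.BigTable_Trash;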

Other special cases:

If the amount of data to be deleted is far more than the amount to be retained, you can exploit bulk-logged operations (see the sketch after this list):

  • BULK COPY the data to be retained into another table.
  • TRUNCATE the table.
  • BULK COPY the data back.
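
A sketch of that sequence, assuming a hypothetical dbo.BigTable with no identity or foreign-key complications, and a database in the SIMPLE or BULK_LOGGED recovery model so the operations are minimally logged:

    -- 1. Stage the rows to keep (SELECT...INTO is minimally logged
    --    under SIMPLE or BULK_LOGGED recovery).
    SELECT *
    INTO dbo.BigTable_Keep
    FROM dbo.BigTable
    WHERE CreatedDate >= '2012-01-01';

    -- 2. Remove everything else in one minimally logged operation.
    TRUNCATE TABLE dbo.BigTable;

    -- 3. Copy the kept rows back; TABLOCK enables minimal logging.
    INSERT INTO dbo.BigTable WITH (TABLOCK)
    SELECT * FROM dbo.BigTable_Keep;

    DROP TABLE dbo.BigTable_Keep;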