Excel VBA Performance - 1 million rows - Delete rows containing a value, in less than 1 min Excel VBA Performance - 1 million rows - Delete rows containing a value, in less than 1 min vba vba

Excel VBA Performance - 1 million rows - Delete rows containing a value, in less than 1 min


I'm providing the first answer as a reference

Others may find it useful, if there are no other options available

  • Fastest way to achieve the result is not to use the Delete operation
  • Out of 1 million records it removes 100,000 rows in an average of 33 seconds

.

Sub DeleteRowsWithValuesNewSheet()  '100K records   10K to delete                                    'Test 1:        2.40234375 sec                                    'Test 2:        2.41796875 sec                                    'Test 3:        2.40234375 sec                                    '1M records     100K to delete                                    'Test 1:        32.9140625 sec                                    'Test 2:        33.1484375 sec                                    'Test 3:        32.90625   sec    Dim oldWs As Worksheet, newWs As Worksheet, rowHeights() As Long    Dim wsName As String, t As Double, oldUsedRng As Range    FastWB True:    t = Timer    Set oldWs = Worksheets(1)    wsName = oldWs.Name    Set oldUsedRng = oldWs.Range("A1", GetMaxCell(oldWs.UsedRange))    If oldUsedRng.Rows.Count > 1 Then                           'If sheet is not empty        Set newWs = Sheets.Add(After:=oldWs)                    'Add new sheet        With oldUsedRng            .AutoFilter Field:=1, Criteria1:="<>Test String"            .Copy                                               'Copy visible data        End With        With newWs.Cells            .PasteSpecial xlPasteColumnWidths            .PasteSpecial xlPasteAll                            'Paste data on new sheet            .Cells(1, 1).Select                                 'Deselect paste area            .Cells(1, 1).Copy                                   'Clear Clipboard        End With        oldWs.Delete                                            'Delete old sheet        newWs.Name = wsName    End If    FastWB False:   InputBox "Duration: ", "Duration", Timer - tEnd Sub

.

At high level:

  • It creates a new worksheet, and keeps a reference to the initial sheet
  • AutoFilters column 1 on the searched text: .AutoFilter Field:=1, Criteria1:="<>Test String"
  • Copies all (visible) data from initial sheet
  • Pastes column widths, formats, and data to the new sheet
  • Deletes initial sheet
  • Renames the new sheet to the old sheet name

It uses the same helper functions posted in the question

The 99% of the duration is used by the AutoFilter

.

There are a couple limitations I found so far, the first can be addressed:

  1. If there are any hidden rows on the initial sheet, it unhides them

    • A separate function is needed to hide them back
    • Depending on implementation, it might significantly increase duration
  2. VBA related:

    • It changes the Code Name of the sheet; other VBA referring to Sheet1 will be broken (if any)
    • It deletes all VBA code associated with the initial sheet (if any)

.

A few notes about using large files like this:

  • The binary format (.xlsb) reduce file size dramatically (from 137 Mb to 43 Mb)
  • Unmanaged Conditional Formatting rules can cause exponential performance issues

    • The same for Comments, and Data validation
  • Reading file or data from network is much slower than working with a locall file


A significant gain in speed can be achieved if the source data do not contain formulas, or if the scenario would allow (or want) the formulas to be converted into hard values during the conditional row deletions.

With the above as a caveat, my solution uses the AdvancedFilter of the range object. It's about twice as fast as DeleteRowsWithValuesNewSheet().

Public Sub ExcelHero()    Dim t#, crit As Range, data As Range, ws As Worksheet    Dim r&, fc As Range, lc As Range, fr1 As Range, fr2 As Range    FastWB True    t = Timer        Set fc = ActiveSheet.UsedRange.Item(1)        Set lc = GetMaxCell        Set data = ActiveSheet.Range(fc, lc)        Set ws = Sheets.Add        With data            Set fr1 = data.Worksheet.Range(fc, fc.Offset(, lc.Column))            Set fr2 = ws.Range(ws.Cells(fc.Row, fc.Column), ws.Cells(fc.Row, lc.Column))            With fr2                fr1.Copy                .PasteSpecial xlPasteColumnWidths: .PasteSpecial xlPasteAll                .Item(1).Select            End With            Set crit = .Resize(2, 1).Offset(, lc.Column + 1)            crit = [{"Column 1";"<>Test String"}]            .AdvancedFilter xlFilterCopy, crit, fr2            .Worksheet.Delete        End With    FastWB False    r = ws.UsedRange.Rows.Count    Debug.Print "Rows: " & r & ", Duration: " & Timer - t & " seconds"End Sub


On my elderly Dell Inspiron 1564 (Win 7 Office 2007) this:

Sub QuickAndEasy()    Dim rng As Range    Set rng = Range("AA2:AA1000001")    Range("AB1") = Now    Application.ScreenUpdating = False        With rng            .Formula = "=If(A2=""Test String"",0/0,A2)"            .Cells.SpecialCells(xlCellTypeFormulas, xlErrors).EntireRow.Delete            .Clear        End With    Application.ScreenUpdating = True    Range("AC1") = NowEnd Sub

took about 10 seconds to run. I am assuming that column AA is available.

EDIT#1:

Please note that this code does not set Calculation to Manual. Performance will improve if the Calculation mode is set to Manual after the "helper" column is allowed to calculate.