
Want VBA in excel to read very large CSV and create output file of a small subset of the CSV


The following code should do the trick. I don't have Excel in front of me, so I haven't tested it, but the concept is sound.

If this ends up being too slow, we can look at ways to improve the efficiency.

    Sub SelectSomeRecords(inputFileName As String, outputFileName As String)
        Dim testLine As String

        Open inputFileName For Input As #1
        Open outputFileName For Output As #2

        ' read one line at a time so the whole file is never held in memory
        While Not EOF(1)
            Line Input #1, testLine
            If RecordIsInteresting(testLine) Then
                Print #2, testLine
            End If
        Wend

        Close #1
        Close #2
    End Sub

    Function RecordIsInteresting(recordLine As String) As Boolean
        Dim lineItems(1 To 8) As String

        GetRecordItems lineItems, recordLine

        ' do your custom checking here:
        RecordIsInteresting = (lineItems(8) = "LS1 7AA")
    End Function

    ' Splits recordLine on commas, treating commas inside double quotes as data.
    ' Doubled quotes ("") inside a quoted field are not handled.
    Sub GetRecordItems(items() As String, recordLine As String)
        Dim finishString As Boolean
        Dim itemString As String
        Dim itemIndex As Integer
        Dim charIndex As Long
        Dim inQuote As Boolean
        Dim testChar As String

        inQuote = False
        charIndex = 1
        itemIndex = 1
        itemString = ""

        While charIndex <= Len(recordLine)
            testChar = Mid$(recordLine, charIndex, 1)
            finishString = False

            If inQuote Then
                If testChar = Chr$(34) Then
                    inQuote = False
                    finishString = True
                    charIndex = charIndex + 1 ' ignore the next comma
                Else
                    itemString = itemString + testChar
                End If
            Else
                If testChar = Chr$(34) Then
                    inQuote = True
                ElseIf testChar = "," Then
                    finishString = True
                Else
                    itemString = itemString + testChar
                End If
            End If

            If finishString Then
                ' ignore any fields beyond the size of the array
                If itemIndex <= UBound(items) Then items(itemIndex) = itemString
                itemString = ""
                itemIndex = itemIndex + 1
            End If

            charIndex = charIndex + 1
        Wend

        ' the last field has no trailing comma, so store it here
        If Len(itemString) > 0 And itemIndex <= UBound(items) Then
            items(itemIndex) = itemString
        End If
    End Sub
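
If the character-by-character parsing above does turn out to be too slow, one possible improvement is to run a cheap substring test on each line first and only parse the lines that pass. The following is a minimal, untested sketch of that pre-filter on its own. The name CopyLinesContaining and its parameters are hypothetical, and it assumes the value you are looking for appears literally in the line.

```vba
Option Explicit

' Hypothetical helper (not part of the answer above): copies every line of
' inputPath that contains targetText anywhere in it to outputPath.
Sub CopyLinesContaining(inputPath As String, outputPath As String, targetText As String)
    Dim inFile As Integer
    Dim outFile As Integer
    Dim testLine As String

    ' FreeFile returns an unused file number, which avoids hard-coding #1 and #2
    inFile = FreeFile
    Open inputPath For Input As #inFile
    outFile = FreeFile
    Open outputPath For Output As #outFile

    Do While Not EOF(inFile)
        Line Input #inFile, testLine
        ' InStr returns 0 when targetText does not occur in testLine
        If InStr(1, testLine, targetText, vbBinaryCompare) > 0 Then
            Print #outFile, testLine
        End If
    Loop

    Close #outFile
    Close #inFile
End Sub
```

A substring match can also hit the value in a different column, so to keep the results exact you would nest the RecordIsInteresting check inside the InStr check rather than replace it. Use nested If statements for that, because And in VBA evaluates both sides and would still parse every line.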


How about VBScript, though this would also work in Excel:

    Set cn = CreateObject("ADODB.Connection")

    'Note HDR=Yes, that is, the first row contains field names,
    'and FMT=Delimited, i.e. CSV
    strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\Docs\;" _
        & "Extended Properties=""text;HDR=Yes;FMT=Delimited"";"
    cn.Open strCon

    'You would not need delimiters ('') if the last field is numeric:
    strSQL = "SELECT FieldName1, FieldName2 INTO New.csv FROM Old.csv " _
        & " WHERE LastFieldName='SomeTextValue'"

    'Creates new csv file
    cn.Execute strSQL


This doesn't directly answer your question, but grep (or one of the Windows equivalents) would really shine for this, e.g.,

grep -e <regex_filter> foo.csv > bar.csv