Want VBA in Excel to read a very large CSV and create an output file from a small subset of the CSV
The following code should do the trick. I don't have Excel in front of me, so I haven't tested it, but the concept is sound.
If this ends up being too slow, we can look at ways to improve the efficiency.
    ' The file paths are passed in by the caller
    Sub SelectSomeRecords(inputFileName As String, outputFileName As String)
        Dim testLine As String

        Open inputFileName For Input As #1
        Open outputFileName For Output As #2

        ' Read one line at a time so the whole file never has to fit in memory
        While Not EOF(1)
            Line Input #1, testLine
            If RecordIsInteresting(testLine) Then
                Print #2, testLine
            End If
        Wend

        Close #1
        Close #2
    End Sub

    Function RecordIsInteresting(recordLine As String) As Boolean
        Dim lineItems(1 To 8) As String

        ' No parentheses: a Sub called with several arguments must not use them
        GetRecordItems lineItems, recordLine

        ' do your custom checking here:
        RecordIsInteresting = lineItems(8) = "LS1 7AA"
    End Function

    Sub GetRecordItems(items() As String, recordLine As String)
        Dim finishString As Boolean
        Dim itemString As String
        Dim itemIndex As Integer
        Dim charIndex As Long
        Dim inQuote As Boolean
        Dim testChar As String

        inQuote = False
        charIndex = 1
        itemIndex = 1
        itemString = ""
        finishString = False

        While charIndex <= Len(recordLine)
            testChar = Mid$(recordLine, charIndex, 1)
            finishString = False

            If inQuote Then
                If testChar = Chr$(34) Then
                    inQuote = False
                    finishString = True
                    charIndex = charIndex + 1 ' ignore the next comma
                Else
                    itemString = itemString + testChar
                End If
            Else
                If testChar = Chr$(34) Then
                    inQuote = True
                ElseIf testChar = "," Then
                    finishString = True
                Else
                    itemString = itemString + testChar
                End If
            End If

            If finishString Then
                ' Ignore any fields beyond the size of the array
                If itemIndex <= UBound(items) Then items(itemIndex) = itemString
                itemString = ""
                itemIndex = itemIndex + 1
            End If

            charIndex = charIndex + 1
        Wend

        ' An unquoted last field has no trailing comma, so store it here
        If itemIndex <= UBound(items) Then items(itemIndex) = itemString
    End Sub
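If the character-by-character parse does turn out to be the slow part, one cheap option is to reject lines that cannot possibly match before parsing them. The following is only a sketch and is untested: `RecordMightMatch` is a hypothetical helper name, and it assumes the value being filtered on ("LS1 7AA", taken from the example above) appears literally in every line you want to keep.

```vba
' Hypothetical helper, not part of the code above: a plain substring test
' that lets most non-matching lines skip the full parse.
Function RecordMightMatch(recordLine As String) As Boolean
    ' Assumption: the wanted value appears literally in every matching line
    RecordMightMatch = InStr(1, recordLine, "LS1 7AA", vbBinaryCompare) > 0
End Function

' Inside the read loop of SelectSomeRecords, nest the two checks, because
' VBA's And operator evaluates both operands and would still run the parser:
'     If RecordMightMatch(testLine) Then
'         If RecordIsInteresting(testLine) Then Print #2, testLine
'     End If
```

The full check in `RecordIsInteresting` still decides whether a line is written, so the pre-filter can only save work and cannot let extra records through.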
How about VBScript, though this would also work in Excel:
    Set cn = CreateObject("ADODB.Connection")

    'Note HDR=Yes, that is, first row contains field names
    'and FMT=Delimited, i.e. CSV
    strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\Docs\;" _
        & "Extended Properties=""text;HDR=Yes;FMT=Delimited"";"
    cn.Open strCon

    'You would not need delimiters ('') if last field is numeric:
    strSQL = "SELECT FieldName1, FieldName2 INTO New.csv FROM Old.csv " _
        & " WHERE LastFieldName='SomeTextValue'"

    'Creates new csv file
    cn.Execute strSQL