Matching values in string array Matching values in string array vba vba

Matching values in string array


If we're going to talk about performance then there's no substutute for running some tests. In my experience Application.Match() is up to ten times slower than calling a function which uses a loop.

Sub Tester()    Dim i As Long, b, t    Dim arr(1 To 100) As String    For i = 1 To 100        arr(i) = "Value_" & i    Next i    t = Timer    For i = 1 To 100000        b = Contains(arr, "Value_50")    Next i    Debug.Print "Contains", Timer - t    t = Timer    For i = 1 To 100000        b = Application.Match(arr, "Value_50", False)    Next i    Debug.Print "Match", Timer - tEnd SubFunction Contains(arr, v) As BooleanDim rv As Boolean, lb As Long, ub As Long, i As Long    lb = LBound(arr)    ub = UBound(arr)    For i = lb To ub        If arr(i) = v Then            rv = True            Exit For        End If    Next i    Contains = rvEnd Function

Output:

Contains       0.8710938 Match          4.210938 


I used to look for a best replace solution. It should work for simple finding as well.

To find first instance of a string you can try using this code:

Sub find_strings_1()Dim ArrayCh() As VariantDim rng As RangeDim i As Integer ArrayCh = Array("a", "b", "c")With ActiveSheet.Cells    For i = LBound(ArrayCh) To UBound(ArrayCh)        Set rng = .Find(What:=ArrayCh(i), _        LookAt:=xlPart, _        SearchOrder:=xlByColumns, _        MatchCase:=False)        Debug.Print rng.Address    Next iEnd WithEnd Sub

If you want to find all instances try the below.

Sub find_strings_2()Dim ArrayCh() As VariantDim c As RangeDim firstAddress As StringDim i As Integer ArrayCh = Array("a", "b", "c") 'strings to lookupWith ActiveSheet.Cells    For i = LBound(ArrayCh) To UBound(ArrayCh)        Set c = .Find(What:=ArrayCh(i), LookAt:=xlPart, LookIn:=xlValues)        If Not c Is Nothing Then            firstAddress = c.Address 'used later to verify if looping over the same address            Do                '_____                'your code, where you do something with "c"                'which is a range variable,                'so you can for example get it's address:                Debug.Print ArrayCh(i) & " " & c.Address 'example                '_____                Set c = .FindNext(c)            Loop While Not c Is Nothing And c.Address <> firstAddress        End If    Next iEnd WithEnd Sub

Keep in mind that if there are several instances of searched string within one cell it will return only one result due to the specific of FindNext.

Still, if you need a code for replacing found values with another, I'd use the first solution, but you'd have to change it a bit.


"A more efficient way (compared to Application.Match)of finding whether a string value exists in an array":

I believe there is no more efficient way than the one you are using, i.e., Application.Match.

Arrays allow efficient access in any element if we know the index of that element. If we want to do anything by element value (even checking if an element exists), we have to scan all the elements of the array in the worst case. Therefore, the worst case needs n element comparisons, where n is the size of the array. So the maximum time we need to find if an element exists is linear in the size of the input, i.e., O(n). This applies to any language that uses conventional arrays.

The only case where we can be more efficient, is when the array has special structure. For your example, if the elements of the array are sorted (e.g. alphabetically), then we do not need to scan all the array: we compare with the middle element, and then compare with the left or right part of the array (binary search). But without assuming any special structure, there is no hope..

The Dictionary/Collection as you point, offers constant key access to their elements (O(1)). What perhaps is not very well documented is that one can also have index access to the dictionary elements (Keys and Items): the order in which elements are entered into the Dictionary is preserved. Their main disadvantage is that they use more memory as two objects are stored for each element.

To wrap up, although If Not IsError(Excel.Application.match(...)) looks silly, it is still the more efficient way (at least in theory). On permission issues my knowledge is very limited. Depending on the host application, there are always some Find-type functions (C++ has find and find_if for example).

I hope that helps!

Edit

I would like to add a couple of thoughts, after reading the amended version of the post and Tim's answer. The above text is focusing on the theoretical time complexity of the various data structures and ignores implementation issues. I think the spirit of the question was rather, "given a certain data structure (array)", what is the most efficient way in practice of checking existence.

To this end, Tim's answer is an eye-opener.

The conventional rule "if VBA can do it for you then don't write it again yourself" is not always true. Simple operations like looping and comparisons can be faster that "agreegate" VBA functions. Two interesting links are here and here.