How to normalize filenames listed in a range How to normalize filenames listed in a range vba vba

How to normalize filenames listed in a range


Personally I hate using VBA where worksheet functions will work, so I've worked out a way to do this with worksheet functions. Although you could cram this all into one cell, I've broken it out into a lot of independent steps in separate columns so you can see how it's working, step by step.

For simplicity I'm assuming your file name is in A1

B1 =LEN(A1)
determine the length of the filename

C1 =SUBSTITUTE(A1," ","")
replace spaces with nothing

D1 =LEN(C1)
see how long the string is if you replace spaces with nothing

E1 =B1-D1
determine how many spaces there are

F1 =SUBSTITUTE(A1," ",CHAR(8),E1)
replace the last space with a special character that can't occur in a file name

G1 =SEARCH(CHAR(8), F1)
find the special character. Now we know where the last space is

H1 =LEFT(A1,G1-1)
peel off everything before the last space

I1 =MID(A1,G1+1,255)
peel off everything after the last space

J1 =FIND(".",I1)
find the first dot

K1 =FIND(".",I1,J1+1)
find the second dot

L1 =FIND(".",I1,K1+1)
find the third dot

M1 =MID(I1,1,J1-1)
find the first number

N1 =MID(I1,J1+1,K1-J1-1)
find the second number

O1 =MID(I1,K1+1,L1-K1-1)
find the third number

P1 =TEXT(M1,"00")
pad the first number

Q1 =TEXT(N1,"00")
pad the second number

R1 =TEXT(O1,"00")
pad the third number

S1 =IF(ISERR(K1),M1,P1&Q1&R1)
put the numbers together

T1 =H1&" "&S1&".pdf"
put it all together

It's kind of a mess because Excel hasn't added a single new string manipulation function in over 20 years, so things that should be easy (like "find last space") require severe trickery.


Here's a screenshot of a simple four-step method based on Excel commands and formulas, as suggested in a comment to the answered post (with a few changes)...

enter image description here


This function below works. I've assumed that the date is in ddmmyy format, but adjust as appropriate if it's mmddyy -- I can't tell from your example.

Function FormatThis(str As String) As String    Dim strDate As String    Dim iDateStart As Long    Dim iDateEnd As Long    Dim temp As Variant    ' Pick out the date part    iDateStart = GetFirstNumPosition(str, False)    iDateEnd = GetFirstNumPosition(str, True)    strDate = Mid(str, iDateStart, iDateEnd - iDateStart + 1)    If InStr(strDate, ".") <> 0 Then        ' Deal with the dot delimiters in the date        temp = Split(strDate, ".")        strDate = Format(DateSerial( _            CInt(temp(2)), CInt(temp(1)), CInt(temp(0))), "ddmmyy")    Else        ' No dot delimiters... assume date is already formatted as ddmmyy        ' Do nothing    End If    ' Piece it together    FormatThis = Left(str, iDateStart - 1) _        & strDate & Right(str, Len(str) - iDateEnd)End Function

This uses the following helper function:

Function GetFirstNumPosition(str As String, startFromRight As Boolean) As Long    Dim i As Long    Dim startIndex As Long    Dim endIndex As Long    Dim indexStep As Integer    If startFromRight Then        startIndex = Len(str)        endIndex = 1        indexStep = -1    Else        startIndex = 1        endIndex = Len(str)        indexStep = 1    End If    For i = startIndex To endIndex Step indexStep        If Mid(str, i, 1) Like "[0-9]" Then            GetFirstNumPosition = i            Exit For        End If    Next iEnd Function

To test:

Sub tester()    MsgBox FormatThis("Smith, J. 01.03.12.pdf")    MsgBox FormatThis("Smith, J. 010312.pdf")    MsgBox FormatThis("Smith, J. 1.03.12.pdf")    MsgBox FormatThis("Smith, J. 1.3.12.pdf")End Sub

They all return "Smith, J. 010312.pdf".