How to normalize filenames listed in a range
Personally I hate using VBA where worksheet functions will work, so I've worked out a way to do this with worksheet functions. Although you could cram this all into one cell, I've broken it out into a lot of independent steps in separate columns so you can see how it's working, step by step.
For simplicity I'm assuming your file name is in A1
B1 =LEN(A1)
determine the length of the filename
C1 =SUBSTITUTE(A1," ","")
replace spaces with nothing
D1 =LEN(C1)
see how long the string is if you replace spaces with nothing
E1 =B1-D1
determine how many spaces there are
F1 =SUBSTITUTE(A1," ",CHAR(8),E1)
replace the last space with a special character that can't occur in a file name
G1 =SEARCH(CHAR(8), F1)
find the special character. Now we know where the last space is
H1 =LEFT(A1,G1-1)
peel off everything before the last space
I1 =MID(A1,G1+1,255)
peel off everything after the last space
J1 =FIND(".",I1)
find the first dot
K1 =FIND(".",I1,J1+1)
find the second dot
L1 =FIND(".",I1,K1+1)
find the third dot
M1 =MID(I1,1,J1-1)
find the first number
N1 =MID(I1,J1+1,K1-J1-1)
find the second number
O1 =MID(I1,K1+1,L1-K1-1)
find the third number
P1 =TEXT(M1,"00")
pad the first number
Q1 =TEXT(N1,"00")
pad the second number
R1 =TEXT(O1,"00")
pad the third number
S1 =IF(ISERR(K1),M1,P1&Q1&R1)
put the numbers together
T1 =H1&" "&S1&".pdf"
put it all together
It's kind of a mess because Excel hasn't added a single new string manipulation function in over 20 years, so things that should be easy (like "find last space") require severe trickery.
Here's a screenshot of a simple four-step method based on Excel commands and formulas, as suggested in a comment to the answered post (with a few changes)...
This function below works. I've assumed that the date is in ddmmyy
format, but adjust as appropriate if it's mmddyy
-- I can't tell from your example.
Function FormatThis(str As String) As String Dim strDate As String Dim iDateStart As Long Dim iDateEnd As Long Dim temp As Variant ' Pick out the date part iDateStart = GetFirstNumPosition(str, False) iDateEnd = GetFirstNumPosition(str, True) strDate = Mid(str, iDateStart, iDateEnd - iDateStart + 1) If InStr(strDate, ".") <> 0 Then ' Deal with the dot delimiters in the date temp = Split(strDate, ".") strDate = Format(DateSerial( _ CInt(temp(2)), CInt(temp(1)), CInt(temp(0))), "ddmmyy") Else ' No dot delimiters... assume date is already formatted as ddmmyy ' Do nothing End If ' Piece it together FormatThis = Left(str, iDateStart - 1) _ & strDate & Right(str, Len(str) - iDateEnd)End Function
This uses the following helper function:
Function GetFirstNumPosition(str As String, startFromRight As Boolean) As Long Dim i As Long Dim startIndex As Long Dim endIndex As Long Dim indexStep As Integer If startFromRight Then startIndex = Len(str) endIndex = 1 indexStep = -1 Else startIndex = 1 endIndex = Len(str) indexStep = 1 End If For i = startIndex To endIndex Step indexStep If Mid(str, i, 1) Like "[0-9]" Then GetFirstNumPosition = i Exit For End If Next iEnd Function
To test:
Sub tester() MsgBox FormatThis("Smith, J. 01.03.12.pdf") MsgBox FormatThis("Smith, J. 010312.pdf") MsgBox FormatThis("Smith, J. 1.03.12.pdf") MsgBox FormatThis("Smith, J. 1.3.12.pdf")End Sub
They all return "Smith, J. 010312.pdf"
.