Removing repeated duplicated characters
This works for strings that are exclusively commas or have up to 398 contiguous commas.
SELECT CASE WHEN TargetString NOT LIKE '%[^,]%' THEN '' /*The string is exclusively commas*/ ELSE REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(TargetString, REPLICATE(',',16),','), /*399/16 = 24 remainder 15*/ REPLICATE(',',8),','), /* 39/ 8 = 4 remainder 7*/ REPLICATE(',',4),','), /* 11/ 4 = 2 remainder 3*/ REPLICATE(',',2),','), /* 5/ 2 = 2 remainder 1*/ REPLICATE(',',2),',') /* 3/ 2 = 1 remainder 1*/ END FROM T
Add extra powers of 2 at the top if you need more or remove from the top if you need less. The comments by each stage indicate the smallest number that this stage will not deal with successfully.
All the comment lines are in this format
/* L/D = Q remainder R */D: Corresponds to the length of the string generated by `REPLICATE`R: Is always D-1Q+R: Form L for the next step
So to extend the series upwards with another
D = 32 R = 31Q = 368 (399-31)L = (368 * 32) + 31 = 11807
So that would deal with sections of commas up to 11,806 characters.
I would suggest a UDF to do this. Since the UDF I am about to suggest doesn't touch any tables, the performance should be pretty good.
CREATE Function [dbo].[CleanDuplicates](@Data VarChar(8000), @DuplicateChar VarChar(1))Returns VarChar(8000)WITH SCHEMABINDINGASBegin Set @Data = @DuplicateChar + @Data While PATINDEX('%' + @DuplicateChar + @DuplicateChar + '%',@Data) > 0 Set @Data = REPLACE(@Data, @DuplicateChar + @DuplicateChar,@DuplicateChar) Return Right(@Data, Len(@Data)-1)End
You can test the function like this:
Select dbo.CleanDuplicates(',,,', ',')Select dbo.CleanDuplicates(',,,sam,,bob,', ',')