split alpha and numeric using sql
If the numeric part is always at the beginning, then you can use this:
PATINDEX('%[0-9][^0-9]%', ConcUnit)
to get the index of the last digit.
Thus, this:
DECLARE @str VARCHAR(MAX) = '4000 ug/ML' SELECT LEFT(@str, PATINDEX('%[0-9][^0-9]%', @str )) AS Number, LTRIM(RIGHT(@str, LEN(@str) - PATINDEX('%[0-9][^0-9]%', @str ))) As Unit
gives you:
Number Unit-------------4000 ug/ML
EDIT:
If numeric data include double values as well, then you can use this:
SELECT LEN(@str) - PATINDEX ('%[^0-9][0-9]%', REVERSE(@str))
to get the index of the last digit.
Thus, this:
SELECT LEFT(@str, LEN(@str) - PATINDEX ('%[^0-9][0-9]%', REVERSE(@str)))
gives you the numeric part.
And this:
SELECT LEFT(@str, LEN(@str) - PATINDEX ('%[^0-9][0-9]%', REVERSE(@str))) AS Numeric, CASE WHEN CHARINDEX ('%', @str) <> 0 THEN LTRIM(RIGHT(@str, LEN(@str) - CHARINDEX ('%', @str))) ELSE LTRIM(RIGHT(@str, PATINDEX ('%[^0-9][0-9]%', REVERSE(@str)))) END AS Unit
gives you both numberic and unit part.
Here are some tests that I made with the data you have posted:
Input:
DECLARE @str VARCHAR(MAX) = '50 000ug/ML'
Output:
Numeric Unit------------50 000 ug/ML
Input:
DECLARE @str VARCHAR(MAX) = '99.5%'
Output:
Numeric Unit------------99.5
Input:
DECLARE @str VARCHAR(MAX) = '4000 . 35 % ug/ML'
Output:
Numeric Unit------------------4000 . 35 ug/ML
Here is my answer. Check output in SQLFiddle for the same.
create TABLE temp( string NVARCHAR(50))INSERT INTO temp (string)VALUES ('4000 ug\ml'), ('2000 ug\ml'), ('%'), ('ug\ml')SELECT subsrtunit,LEFT(subsrtnumeric, PATINDEX('%[^0-9]%', subsrtnumeric+'t') - 1)FROM ( SELECT subsrtunit = SUBSTRING(string, posofchar, LEN(string)), subsrtnumeric = SUBSTRING(string, posofnumber, LEN(string)) FROM ( SELECT string, posofchar = PATINDEX('%[^0-9]%', string), posofnumber = PATINDEX('%[0-9]%', string) FROM temp ) d) t
Updated Version to handle 99.5 ug\ml
create TABLE temp( string NVARCHAR(50))INSERT INTO temp (string)VALUES ('4000 ug\ml'), ('2000 ug\ml'), ('%'), ('ug\ml'), ('99.5 ug\ml')SELECT subsrtunit,LEFT(subsrtnumeric, PATINDEX('%[^0-9.]%', subsrtnumeric+'t') - 1)FROM ( SELECT subsrtunit = SUBSTRING(string, posofchar, LEN(string)), subsrtnumeric = SUBSTRING(string, posofnumber, LEN(string)) FROM ( SELECT string, posofchar = PATINDEX('%[^0-9.]%', string), posofnumber = PATINDEX('%[0-9.]%', string) FROM temp ) d) t
Updated Version: To handle 1 000 ug\ml,20 000ug\ml
create TABLE temp( string NVARCHAR(50))INSERT INTO temp (string)VALUES ('4000 ug\ml'), ('2000 ug\ml'), ('%'), ('ug\ml'), ('99.5 ug\ml'), ('1 000 ug\ml'), ('20 000ug\ml')SELECT substring(replace(subsrtunit,' ',''),PATINDEX('%[0-9.]%', replace(subsrtunit,' ',''))+1,len(subsrtunit)),LEFT(replace(subsrtnumeric,' ',''), PATINDEX('%[^0-9.]%', replace(subsrtnumeric,' ','')+'t') - 1)FROM ( SELECT subsrtunit = SUBSTRING(string, posofchar, LEN(string)), subsrtnumeric = SUBSTRING(string, posofnumber, LEN(string)) FROM ( SELECT string, posofchar = PATINDEX('%[^0-9.]%', replace(string,' ','')), posofnumber = PATINDEX('%[0-9.]%', replace(string,' ','')) FROM temp ) d) t
Check out SQLFiddle for the same.
Would something like this work? Based on the shown data it looks like it would.
Apply it to your data set as a select and if you like the results then you can make an update from it.
WITH cte as (SELECT 'ug/mL' ConcUnit, 500 as [Numeric], '' as Unit UNION ALL SELECT '2000 ug/mL', NULL, '')SELECT [ConcUnit] as [ConcUnit], [Numeric] as [Original Numeric], [Unit] as [Original Unit], CASE WHEN ConcUnit LIKE '% %' THEN SUBSTRING(ConcUnit, 1, CHARINDEX(' ', ConcUnit) - 1) ELSE [Numeric] END as [New Numeric], CASE WHEN ConcUnit LIKE '% %' THEN SUBSTRING(ConcUnit, CHARINDEX(' ', ConcUnit) + 1, LEN(ConcUnit)) ELSE ConcUnit END as [New Unit]FROM cte