Group by is case sensitive in T-SQL even though db and server collations are CI Group by is case sensitive in T-SQL even though db and server collations are CI sql sql

Group by is case sensitive in T-SQL even though db and server collations are CI


This is more of extended comment that real answer.

I believe this issue is coming from how SQL Server is attempting to evaluate case statement expression.

To prove that server is case insensetive you can run the following two statements

SELECT CASE WHEN 'Bertrand' = 'bertrand' THEN 'true' ELSE 'false' end

-

DECLARE @base TABLE(NAME VARCHAR(1),value INT)INSERT INTO @base Values('a',0),('b',0),('B',0)SELECT * FROM @baseSELECT name, COUNT(value) AS CntFROM @baseGROUP BY NAME

results:

enter image description here

as you can see here even though letter in second row is lower case and in third row is upper case, group by clause ignores the case. Looking at execution plan there are two expression for

Expr 1007 COUNT([value])    Expr 1004 CONVERT_IMPLICIT(int,[Expr1007],0)    

now when we change it to case

SELECT CASE WHEN name = 'a' THEN 'adam' ELSE 'bertrand' END AS name, COUNT(value) AS CntFROM @baseGROUP BY CASE WHEN name = 'a' THEN 'adam' ELSE 'bertrand' END

execution plan shows 3 expressions. 2 from above and new one

Expr 1004 CASE WHEN [NAME]='a' THEN 'adam' ELSE 'bertrand' END  

so at this point aggregate function is no longer evaluating value of the column name but now it evaluating value of the expression.

What i think is happening is, could be incorrect. When SQL server converts both CASE statement in SELECT and GROUP BY clause to a expression it comes up with different expression value. In this case you might as well do 'bertrand' in select and 'charlie' in group by clause because if CASE expression is not 100% match between select and group by clause SQL Server will consider them as different Expr aka (columns) that no longer match.


Update:

To take this one step further, the following statement will also fail.

SELECT CASE WHEN name = 'a' THEN 'adam' ELSE UPPER('bertrand') END AS name    ,COUNT(value) AS CntFROM @baseGROUP BY CASE WHEN name = 'a' THEN 'adam' ELSE UPPER('Bertrand') END

Even wrapping the different case strings in UPPER() function, SQL Server is still unable to process it.


The problem is the case when statement in select and group by should be same, but you can write your query as below

select name, count(value) from(    SELECT CASE name WHEN 'a' THEN 'adam' ELSE 'bertrand' END name, value FROM        (        SELECT 'a' AS name,1 AS value        UNION        SELECT 'b',1        UNION        SELECT 'b',2        )a    )tGROUP BY name

This is not about case sensivity, take a look at this sample

This worksSELECT CASE name WHEN 'a' THEN 'adam' when 'b' then 'bertrand' end name, COUNT(value) FROM(SELECT 'a' AS name,1 AS valueUNIONSELECT 'b',1UNIONSELECT 'b',2)aGROUP BY CASE name WHEN 'a' THEN 'adam' when 'b' then 'bertrand' ENDThis does not workSELECT CASE name when 'b' then 'bertrand' WHEN 'a' THEN 'adam' end name, COUNT(value) FROM(SELECT 'a' AS name,1 AS valueUNIONSELECT 'b',1UNIONSELECT 'b',2)aGROUP BY CASE name WHEN 'a' THEN 'adam' when 'b' then 'bertrand' END

While the case of 'adam' and 'bertrand' are the same.


You have found something that is genuinely weird, but I think the problem is that you are using the case statement at all in the group by statement. It should be:

SELECT CASE name WHEN 'a' THEN 'adam' ELSE 'bertrand' END AS name   FROM(SELECT 'a' AS name,1 AS valueUNIONSELECT 'b',1UNIONSELECT 'b',2) aGROUP BY name

The group by should apply to the entire table and not individual rows. I could be missing some reason to do this, but I don't think it makes sense to conditionally group by a value.

I am more surprised that the first one works at all than that the second one doesn't work. Comparing 'a' = 'A' is subtly different from comparing a column to another column. SQL server doesn't seem to use the collation settings in this check to see if the column is in the group by. The error message you receive from your second query is saying 'this column in the select is not the same as the column in the group by' and not 'these values are not equal'.