Optimal way to concatenate/aggregate strings
SOLUTION
The definition of optimal can vary, but here's how to concatenate strings from different rows using regular Transact SQL, which should work fine in Azure.
;WITH Partitioned AS
(
    SELECT
        ID,
        Name,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name) AS NameNumber,
        COUNT(*) OVER (PARTITION BY ID) AS NameCount
    FROM dbo.SourceTable
),
Concatenated AS
(
    -- Anchor: the first name in each ID group.
    -- nvarchar(max) is used because CAST to nvarchar without a length defaults to 30 characters.
    SELECT
        ID,
        CAST(Name AS nvarchar(max)) AS FullName,
        Name,
        NameNumber,
        NameCount
    FROM Partitioned
    WHERE NameNumber = 1

    UNION ALL

    -- Recursive step: append the next name in the same group.
    SELECT
        P.ID,
        CAST(C.FullName + ', ' + P.Name AS nvarchar(max)),
        P.Name,
        P.NameNumber,
        P.NameCount
    FROM Partitioned AS P
        INNER JOIN Concatenated AS C
            ON P.ID = C.ID
            AND P.NameNumber = C.NameNumber + 1
)
SELECT
    ID,
    FullName
FROM Concatenated
WHERE NameNumber = NameCount
EXPLANATION
The approach boils down to three steps:
1. Number the rows using OVER and PARTITION BY, grouping and ordering them as needed for the concatenation. The result is the Partitioned CTE. We keep the count of rows in each partition to filter the results later.
2. Using a recursive CTE (Concatenated), iterate through the row numbers (the NameNumber column), appending Name values to the FullName column.
3. Filter out all results except those with the highest NameNumber.
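To see what the first step produces, here is a minimal sketch that runs the numbering query on its own. The table variable @SourceTable is a hypothetical stand-in for dbo.SourceTable, filled with the same rows used in the test further down, and the final ORDER BY is only there to make the listing easy to read.

```sql
-- Hypothetical stand-in for dbo.SourceTable, so the snippet runs on its own.
DECLARE @SourceTable TABLE (ID int, Name nvarchar(50));

INSERT @SourceTable (ID, Name)
VALUES (1, 'Matt'), (1, 'Rocks'), (2, 'Stylus'), (3, 'Foo'), (3, 'Bar'), (3, 'Baz');

-- Step 1 only: number the names inside each ID group and count the group size.
SELECT
    ID,
    Name,
    ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name) AS NameNumber,
    COUNT(*) OVER (PARTITION BY ID) AS NameCount
FROM @SourceTable
ORDER BY ID, NameNumber;

-- ID  Name    NameNumber  NameCount
-- 1   Matt    1           2
-- 1   Rocks   2           2
-- 2   Stylus  1           1
-- 3   Bar     1           3
-- 3   Baz     2           3
-- 3   Foo     3           3
```

The rows where NameNumber equals NameCount are the ones at which the recursive step has accumulated the full list for that ID.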
Please keep in mind that to make this query predictable, one has to define both the grouping (in your scenario, rows with the same ID are concatenated) and the sorting (I assumed that you simply sort the strings alphabetically before concatenation).
I've quickly tested the solution on SQL Server 2012 with the following data:
INSERT dbo.SourceTable (ID, Name)
VALUES
(1, 'Matt'),
(1, 'Rocks'),
(2, 'Stylus'),
(3, 'Foo'),
(3, 'Bar'),
(3, 'Baz')
The query result:
ID          FullName
----------- ------------------------------
2           Stylus
3           Bar, Baz, Foo
1           Matt, Rocks
Are methods using FOR XML PATH like below really that slow? Itzik Ben-Gan writes that this method has good performance in his T-SQL Querying book (Mr. Ben-Gan is a trustworthy source, in my view).
create table #t (id int, name varchar(20))

insert into #t
values (1, 'Matt'), (1, 'Rocks'), (2, 'Stylus')

select id
    ,Names = stuff((select ', ' + name as [text()]
        from #t xt
        where xt.id = t.id
        for xml path('')), 1, 2, '')
from #t t
group by id
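One caveat with the plain FOR XML PATH form: characters such as & and < come back as XML entities (for example &amp;), and without an ORDER BY the order inside each group is not guaranteed. A commonly used variant, sketched below, adds the TYPE directive with .value() and an ORDER BY in the subquery. The #names table and its rows are hypothetical, chosen only to include an XML-special character, and this says nothing about relative performance.

```sql
-- Hypothetical temp table and rows, chosen to include an XML-special character.
create table #names (id int, name varchar(20));

insert into #names
values (1, 'Matt'), (1, 'R&D'), (2, 'Stylus');

select t.id,
       Names = stuff((select ', ' + xt.name
                      from #names xt
                      where xt.id = t.id
                      order by xt.name          -- fixes the order within each group
                      for xml path(''), type    -- return xml rather than an entitized string
                     ).value('.', 'varchar(max)'), 1, 2, '')
from #names t
group by t.id
order by t.id;

-- id  Names
-- 1   Matt, R&D
-- 2   Stylus

drop table #names;
```

Without TYPE and .value(), the first row would read Matt, R&amp;D.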
STRING_AGG() in SQL Server 2017, Azure SQL, and PostgreSQL:
https://www.postgresql.org/docs/current/static/functions-aggregate.html
https://docs.microsoft.com/en-us/sql/t-sql/functions/string-agg-transact-sql
GROUP_CONCAT() in MySQL:
http://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat
(Thanks to @Brianjorden and @milanio for Azure update)
Example Code:
select Id
    , STRING_AGG(Name, ', ') Names
from Demo
group by Id
SQL Fiddle: http://sqlfiddle.com/#!18/89251/1
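STRING_AGG on its own does not promise any particular order inside each group. If the order matters, SQL Server 2017 and later accept a WITHIN GROUP clause. The sketch below uses a hypothetical #Demo temp table as a stand-in for the Demo table above, with sample rows borrowed from the first answer.

```sql
-- Hypothetical stand-in for the Demo table, so the snippet runs on its own.
create table #Demo (Id int, Name varchar(20));

insert into #Demo (Id, Name)
values (1, 'Matt'), (1, 'Rocks'), (2, 'Stylus'), (3, 'Foo'), (3, 'Bar'), (3, 'Baz');

-- WITHIN GROUP sorts the names inside each Id group before they are joined.
select Id,
       STRING_AGG(Name, ', ') within group (order by Name) as Names
from #Demo
group by Id
order by Id;

-- Id  Names
-- 1   Matt, Rocks
-- 2   Stylus
-- 3   Bar, Baz, Foo

drop table #Demo;
```

This is T-SQL syntax. PostgreSQL puts the ordering inside the call instead, as in string_agg(name, ', ' order by name).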