Click here to Skip to main content
15,884,388 members
Please Sign up or sign in to vote.
2.55/5 (4 votes)
See more:
Hey guys,

Please help me in building a logic that gives a comma seperated result set based on 3 columns and over a 65000 record set.
I've tried using STUFF, COALESCE, SUBSTRING, by the time the query completes execution, I can make a cup of tea.. :( :P.

Note:
SQL Server : MSSQL 2005.
HTML
type	Country	        Year	filename	FileID
A	Singapore	2010	A.xls	        20
A	Singapore	2010	B.xls	        21
A	Singapore	2010	C.Xls	        22
A	Malaysia	2008	A.xls	        55
A	Malaysia	2008	B.xls	        53
B	Malaysia	2008	Cat.xls	        88
B	Malaysia	2008	doll.xls	78
				
				
Expected output				
type	Country	        Year	filename	        FileID
A	Singapore	2010	a.xls,b.xls,c.xls	20,21,22
A	Malaysia	2008	a.xls,b.xls	        55,53
B	Malaysia	2008	cat.xls,doll.xls	88,78
Posted
Updated 10-May-14 4:57am
v4
Comments
DamithSL 10-May-14 10:14am    
update the question with your code
Venkat Devireddy 10-May-14 10:30am    
Hi Damith, due to privacy, I cant share my code here., please send a test mail () I will share my code. Well, i've used the similar logic the one here(http://goo.gl/EJ1jgP)..
CHill60 11-May-14 9:06am    
You might want to remove that email address before a spambot finds it
Venkat Devireddy 11-May-14 9:27am    
thank you CHill
[no name] 10-May-14 10:49am    
You would need to tell us what sort of "help" you need. We cannot see your code, your screen or your server to know what the problem might possibly be.

Any effort?

Anyway, this should work:

SELECT 
	[type], 
	Country, 
	[Year], 
	fileName = 
		STUFF((SELECT ', ' + fileName
           FROM Records FileNameRecords
           WHERE 
				FileNameRecords.[type] = Records.[type]
				AND FileNameRecords.Country = Records.Country
				AND FileNameRecords.[Year] = Records.[Year]
          FOR XML PATH('')), 1, 2, ''),
    FileId = 
		STUFF((SELECT ', ' + FileId
           FROM Records FileNameRecords
           WHERE 
				FileNameRecords.[type] = Records.[type]
				AND FileNameRecords.Country = Records.Country
				AND FileNameRecords.[Year] = Records.[Year]
          FOR XML PATH('')), 1, 2, '')
FROM	
	Records
GROUP BY
	[type], Country, [Year]
ORDER BY
	type, Year DESC
 
Share this answer
 
Comments
Maciej Los 10-May-14 12:42pm    
+5!
Venkat Devireddy 10-May-14 12:58pm    
Hi manas,

I've written exactly the same query (with STUFF, substring), but its taking hell lot of time(5+ mins) with that query, as i've a data of 65000+ records.
Actually I heard we can build a logic using CTE, but i'm not sure how.
If you want to use CTE, please see:
SQL
DECLARE @tmp TABLE (atype VARCHAR(5), Country VARCHAR(30), aYear INT, afilename VARCHAR(30), FileID INT)

INSERT INTO @tmp (atype, Country, aYear, afilename, FileID)
VALUES('A', 'Singapore', 2010, 'A.xls', 20),
('A', 'Singapore', 2010, 'B.xls', 21),
('A', 'Singapore', 2010, 'C.Xls', 22),
('A', 'Malaysia', 2008, 'A.xls', 55),
('A', 'Malaysia', 2008, 'B.xls', 53),
('B', 'Malaysia', 2008, 'Cat.xls', 88),
('B', 'Malaysia', 2008, 'doll.xls', 78)

;WITH CTE AS
(
    SELECT atype, Country, aYear, afilename, CONVERT(VARCHAR(MAX),afilename) AS fnames, FileID, CONVERT(VARCHAR(MAX),FileID) AS fIds, 1 AS LoopNo
    FROM @tmp
    UNION ALL
    SELECT C.atype, C.Country, C.aYear, C.afilename, C.fnames + ',' + T.afilename AS fnames, C.FileID, C.fids + ',' + T.FileId AS fIds, LoopNo + 1 AS LoopNo
    FROM CTE AS C INNER JOIN (
        SELECT atype, Country, aYear, CONVERT(VARCHAR(MAX),afilename) AS afilename, CONVERT(VARCHAR(MAX),FileID) AS FileID
        FROM @tmp
        ) AS T ON C.aType = T.aType AND C.Country = T.Country AND C.aYear = T.aYear AND C.FileID < T.FileID
    WHERE CHARINDEX(CONVERT(NVARCHAR(10),T.FileID), C.fIds)=0
)
SELECT C.atype, C.Country, C.aYear, fnames, fIds
FROM CTE AS C
ORDER BY C.atype, C.Country, C.aYear


Above query produces:
A	Malaysia	2008	A.xls			55
A	Malaysia	2008	B.xls			53
A	Malaysia	2008	B.xls,A.xls		53,55
A	Singapore	2010	B.xls,C.Xls		21,22
A	Singapore	2010	A.xls,B.xls		20,21
A	Singapore	2010	A.xls,C.Xls		20,22
A	Singapore	2010	A.xls,C.Xls,B.xls	20,22,21
A	Singapore	2010	A.xls,B.xls,C.Xls	20,21,22
A	Singapore	2010	A.xls			20
A	Singapore	2010	B.xls			21
A	Singapore	2010	C.Xls			22
B	Malaysia	2008	Cat.xls			88
B	Malaysia	2008	doll.xls		78
B	Malaysia	2008	doll.xls,Cat.xls	78,88


There is no guarantee that CTE will be faster than in-build SQL functions. Change the code to your needs.

Note: Above sample code shows how CTE work.

For further information about CTE, please see:
WITH common_table_expression (Transact-SQL)[^]
CTEs (Common Table Expressions)[^]
CTE In SQL Server[^]
 
Share this answer
 
Comments
Manas Bhardwaj 11-May-14 16:09pm    
Yes, +5!
Maciej Los 11-May-14 16:12pm    
Thank you, Manas ;)
Maciejs solution reminded me that I solved that problem a couple of years ago. Link[^]
Adjusted for your needs it would look similar to this:
SQL
CREATE TABLE Records
	([type] varchar(1), [Country] varchar(9), [Year] int, [filename] varchar(8), [FileID] int)
;

CREATE INDEX records_ix
     ON Records
       ([type], [Country], [Year], [filename])
     INCLUDE ([FileID]);
	
INSERT INTO Records
	([type], [Country], [Year], [filename], [FileID])
VALUES
	('A', 'Singapore', 2010, 'A.xls', 20),
	('A', 'Singapore', 2010, 'B.xls', 21),
	('A', 'Singapore', 2010, 'C.Xls', 22),
	('A', 'Malaysia', 2008, 'A.xls', 55),
	('A', 'Malaysia', 2008, 'B.xls', 53),
	('B', 'Malaysia', 2008, 'Cat.xls', 88),
	('B', 'Malaysia', 2008, 'doll.xls', 78)
;

WITH ranked AS (
    SELECT  [type], [Country], [Year], [filename], [FileID], Rank() over (PARTITION BY [type], [Country], [Year] ORDER BY [filename]) Rnk 
    FROM    Records
    )
,cte ([type], [Country], [Year], FileNames, FileIDs, rnk) as
    (
    SELECT  [type]
           ,[Country]
           ,[Year]
           ,cast([FileName] as varchar(max)) as Filenames
           ,cast([FileID] as varchar(max)) as FileIDs
           ,rnk
    FROM    Ranked r
    WHERE   rnk = 1
    UNION ALL
    SELECT  cte.[type]
           ,cte.[Country]
           ,cte.[Year]
           ,FileNames + ', ' + [FileName]
           ,FileIDs + ', ' + cast([FileID] as varchar(max))
           ,r.rnk
    FROM    cte
    INNER JOIN ranked r
        ON  cte.[type] = r.[type]
        AND cte.[Country] = r.[Country]
        AND cte.[Year] = r.[Year]
        AND cte.Rnk = r.rnk - 1
    )
SELECT  [type], [Country], [Year], Max(FileNames) FileNames, Max(FileIDs) FileIDs, Max(rnk)
FROM    cte
GROUP BY [type], [Country], [Year]
ORDER BY [type], [Year] desc

Note the covering composite Index on the record table.
Still no guarantee that it's faster, but a proper index is important.
 
Share this answer
 
Comments
Manas Bhardwaj 11-May-14 16:09pm    
Yes 5!
Maciej Los 11-May-14 16:14pm    
+5!
Nice to know that my answer was an inspiration...
Jörgen Andersson 12-May-14 3:10am    
Sometimes my memory needs to be nudged in the right direction. :-)
I believe though that the Stuff For XML Path solution is faster when used with a properly indexed table.
Pity that SQL Server does not have any functionality similar to Oracles ListAgg. That's really fast.
Just realized there is yet another way to do this.
You can create your own aggregate.
In VS create a new SQL Server Database Project, add a new item "SQL CLR C# Aggregate".
Add code similar to this:
C#
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Collections.Generic;

[Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(Format.Native)]
public struct ListAgg
{
    private List<sqlstring> values;

    public void Init()
    {
        this.values = new List<sqlstring>();
    }

    public void Accumulate(SqlString Value)
    {
        this.values.Add(Value.Value);
    }

    public void Merge (ListAgg Group)
    {
        this.values.AddRange(Group.values.ToArray());
    }

    public SqlString Terminate ()
    {
        return new SqlString(string.Join(", ", this.values.ToArray()));
    }
}
Build and deploy[^] this assembly on the server.

Now you should be able to run a query like this:
SQL
SELECT  [type]
       ,[Country]
       ,[Year]
       ,dbo.ListAgg([filename])
       ,dbo.ListAgg([FileID])
FROM    Records
GROUP BY [type]
        ,[Country]
        ,[Year]
This solution should be running very fast compared to the others, BUT, note that this is untested and should be thought of as an idea of how you can solve your problem.
 
Share this answer
 
v3
Comments
Maciej Los 12-May-14 5:00am    
Very interesting alternative!
+5!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900