Click here to Skip to main content
15,891,597 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
Can someone please clarify some SQL for me.
Please see the following script and comment:
SQL
CREATE TABLE [dbo].[TimeValues]
(
    [Time] DATETIME NOT NULL,
    [Collection_Id] INT NOT NULL,
    [Value] DECIMAL(18,6) NOT NULL,
    CONSTRAINT [PK_TimeValues] PRIMARY KEY ([Time],[Collection_Id]) ON [MyPartitionScheme]([Time])
) ON [MyPartitionScheme]([Time])

--The following script takes 2 min to run ???
SELECT
    DATEADD(DAY,DATEDIFF(DAY,'2000-01-01 00:00:00',[Time]),'2000-01-01 00:00:00') AS [Time]
    ,1 AS [Collection_Id]
    ,SUM([Value]) AS [Value]
FROM [dbo].[TimeValues]
WHERE [Time] BETWEEN '2000-01-01' AND '2020-01-01'
    AND Collection_Id = 1
GROUP BY DATEDIFF(DAY,'2000-01-01 00:00:00',[Time])

--If I add the following Index the script takes 0 sec. to run WHY ????
CREATE NONCLUSTERED INDEX [IX_TimeValues_Collection_Id_Time]
ON [dbo].[TimeValues]([Collection_Id],[Time])
INCLUDE ([Value]) WITH (FILLFACTOR = 80)
Posted

1 solution

Your primary key is a composite index using Time and Collection_Id, in that order. So when your Where Clause states [Time] BETWEEN '2000-01-01' AND '2020-01-01' the optimizer will in best case use a range scan, but if this range covers a large part of available dates but with only one Collection_Id per date so it will probably resort to a full table scan.
You can probably confirm this by checking the execution plan.

When you add the index that starts with a unique Collection_Id and then one small ordered range to aggregate the values from.
This is very fast.
Again, check the execution plan.

If all your queries use unique Collection_Ids I would strongly suggest that you recreate your Primary key in the same order as your index.
 
Share this answer
 
v2

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900