Click here to Skip to main content
15,120,739 members
Articles / Database Development / SQL Server / SQL 2017
Tip/Trick
Posted 8 Jan 2020

Stats

12.8K views
6 bookmarked

How to Create and Use a Self-referencing Hierarchical Table

Rate me:
Please Sign up or sign in to vote.
5.00/5 (3 votes)
8 Jan 2020CPOL3 min read
This article explains how to create and use a self referencing key in a SQL Server Table.

Introduction

Representing organizational structures in an SQL Table can be tricky. One way to do this is use a self referencing key in a table that identifies the parent-child relationship.

hierarchy

Can be referenced as:

Self Referencing Data Structure

Id RefId Name
1 null Root
2 1 Customer1
3 1 Customer2
4 1 Customer3
5 2 Dept 1.1
6 2 Dept 1.2
7 2 Dept 1.3
8 3 Dept 2.1
9 4 Dept 3.1
10 4 Dept 3.2

As you can see, the RefId points to the id of each parent. Through this data structure, we can identify all of the descendants for the parent. For example, the children of id 4 would be ids 9 and 10.

This article will also introduce a table valued function that will provide a list of descendant ids for a given id.

Background

To set this up, a single table must be created and populated.

SQL
 CREATE TABLE [dbo].[Organizations](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [nvarchar](500) NOT NULL,
    [ParentId] [int] NULL,
 CONSTRAINT [PK_Organizations] PRIMARY KEY CLUSTERED 
(
    [id] ASC
)WITH (PAD_INDEX = OFF, 
       STATISTICS_NORECOMPUTE = OFF, 
       IGNORE_DUP_KEY = OFF, 
       ALLOW_ROW_LOCKS = ON, 
       ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_Organizations_ParentId] ON [dbo].[Organizations]
(
    [ParentId] ASC
)WITH (PAD_INDEX = OFF, 
       STATISTICS_NORECOMPUTE = OFF, 
       SORT_IN_TEMPDB = OFF, 
       DROP_EXISTING = OFF, 
       ONLINE = OFF, 
       ALLOW_ROW_LOCKS = ON, 
       ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
set identity_insert [dbo].[Organizations] on;

insert into dbo.Organizations ([id], [ParentId], [Name]) values
(1 ,null, 'Root    '),
(2 ,1,'Customer1 '),
(3 ,1,'Customer2 '),
(4 ,1,'Customer3 '),
(5 ,2,'Dept 1.1  '),
(6 ,2,'Dept 1.2  '),
(7 ,2,'Dept 1.3  '),
(8 ,3,'Dept 2.1  '),
(9 ,4,'Dept 3.1  '),
(10 ,4,'Dept 3.2  ')

set identity_insert [dbo].[Organizations] off; 

As the records are added, the ParentId with take the value of the Parent Records Id. All three customer's ParentIds reference the root id (1). This creates the hierarchy we can use to recursively select the child records.

Selecting the Children

The query that is used to select the children is pretty complicated. It is using recursion (calling itself) to identify each of the children of the parent. This example uses a CTE (Common Table Expression) to simplify the query.

SQL
;with recur as (
      select a.id,  a.parentid from organizations a
      where a.id=2
      union all 
      select b.id, b.parentid from Organizations b 
      inner join recur c on c.id = b.parentid
    )
    select id from recur

This code would select each of the record ids that are descendant from the record with id 2.

id
-----------
2
5
6
7

This is the simplest form of the query and could easily be extended to present more values than just the id. However, it is trimmed down to suit the next step - creating a table valued function.

Table Valued Functions

Table valued functions are SQL functions that return a table. For example, a table value function called dbo.GetDecendants with an input parameter of id could be expressed as:

SQL
Select * from dbo.GetDecendants(3)

A table value function is set up to return its results as a table. Therefore, it can be used to return a set of integers (ids) that can be used in a more complex select clause.

SQL
select id, parentid, substring(name,1,12) from dbo.Organizations 
       where id in (select * from dbo.GetDecendants(2))

Where the results of this query would be:

id          parentid    
----------- ----------- ------------
2           1           Customer1 
5           2           Dept 1.1  
6           2           Dept 1.2  
7           2           Dept 1.3  

(4 rows affected)

Here is the code to create the GetDecendants function:

SQL
create FUNCTION GetDecendants 
(    
    @id int 
)
RETURNS TABLE 
AS
RETURN 
(
    with recur as (
      select a.id,  a.parentid from organizations a
      where a.id=@id
      union all 
      select b.id, b.parentid from Organizations b 
      inner join recur c on c.id = b.parentid
    )
    select id from recur
)
GO 

Points of Interest

This data structure is useful for cascading configuration values where the base configuration is set at the parent and each level of descendant fine tunes the configuration for the associated data object. In other words, the configuration values for the descendant would override the values set for the parent.

This could also be used as an index table that identifies the parents and children of an object and detail tables are joined to this id.

At the company, I was introduced to this data structure, it was used to identify hot spots in a hospital's performance. For example, if the hierarchy represents a hospital, floor, wing, room and the metrics were applied to the room, this pattern could be used to aggregate metrics for each section of the hospital.

History

  • 7th January, 2020: Initial draft

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

E. Scott McFadden
Architect 1960
United States United States
No Biography provided

Comments and Discussions

 
QuestionMS T-SQL HierarchyID data type Pin
Eek Ten Bears8-Jan-20 5:13
MemberEek Ten Bears8-Jan-20 5:13 
AnswerRe: MS T-SQL HierarchyID data type Pin
E. Scott McFadden8-Jan-20 16:35
professionalE. Scott McFadden8-Jan-20 16:35 
GeneralRe: MS T-SQL HierarchyID data type Pin
Eek Ten Bears8-Jan-20 21:35
MemberEek Ten Bears8-Jan-20 21:35 
GeneralRe: MS T-SQL HierarchyID data type Pin
E. Scott McFadden10-Jan-20 5:14
professionalE. Scott McFadden10-Jan-20 5:14 
I really appreciate this discussion.

I started playing with it a bit and it appears you really need to understand the string representation syntax of the hierarchy really well. The syntax can be a bit confusing.

However, its concepts are pretty neat. For example, you can manage the order of nodes on the current leaf - something my example does not really support. However, managing this order can be pretty complicated because the insert/function needs to know about the adjacent nodes - making it a bit more than a straight forward insert. (See example on this page ( GetDescendant (Database Engine) )

Another thing I noticed is that there are a few concerns that need to be address during database version migrations (to 2019) and in replication (More information here (hierarchyid data type method reference))

But probably the greatest benefit of using this method over the one introduced in my article is that is is published and supported by Microsoft. My code example is "roll your own" which tend to make the Database Admins a bit nervous. Where as this "supported" data type is something they are usually more comfortable with.

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.