Tuesday, July 15, 2008

SQL Server 2008 (KATMAI) TSQL Enhancements - Part 7 - HIERARCHYID (Part 1)

HIERARCHYID is a new data type introduced in SQL Server 2008. I had worked on a few assignments where I had to deal with hierarchical data. I guess most of you have at least once worked with a hierarchical data model. Traversing a hierarchy is not very easy in SQL Server 2005. The new data type HIERARCHYID is all set to make the life easier.

Let us first see a simple example and then examine the data type in detail. For the purpose of the example, let us create a table to store some hierarchical data that looks like the following.

-- ----------------------------------------------------------
-- CEO
-- PM1
-- TechLead1
-- Jacob
-- Steve
-- TechLead2
-- Bob
-- Mike
-- PM2
-- TechLead3
-- Laura
-- Jeff
-- TechLead4
-- Bill
-- Andrew
-- ----------------------------------------------------------
As you must have figured out so far, this is the organizational hierarchy of a technology company. Now let us create a table and populate it with the above data.
IF OBJECT_ID('employees') IS NOT NULL DROP TABLE employees
CREATE TABLE employees (
EmployeeID INT,
EmployeeName VARCHAR(20),
ManagerID INT )

INSERT INTO employees (EmployeeID, EmployeeName, ManagerID)
SELECT 1, 'CEO', 0 UNION ALL
SELECT 2, 'PM1', 1 UNION ALL
SELECT 4, 'TechLead1', 2 UNION ALL
SELECT 6, 'Jacob', 4 UNION ALL
SELECT 7, 'Steve', 4 UNION ALL
SELECT 5, 'TechLead2', 2 UNION ALL
SELECT 8, 'Bob', 5 UNION ALL
SELECT 9, 'Mike', 5 UNION ALL
SELECT 3, 'PM2', 1 UNION ALL
SELECT 10, 'TechLead3', 3 UNION ALL
SELECT 11, 'Laura', 10 UNION ALL
SELECT 12, 'Jeff', 10 UNION ALL
SELECT 13, 'TechLead4', 3 UNION ALL
SELECT 14, 'Bill', 13 UNION ALL
SELECT 15, 'Andrew', 13
Looks pretty simple. Now we could easily answer the following questions.

To whom does Mike Report?

SELECT
e.EmployeeName,
p.EmployeeName AS Manager
FROM employees e
INNER JOIN employees p ON p.EmployeeID = e.ManagerID
WHERE e.EmployeeName = 'Mike'
/*
EmployeeName Manager
-------------------- --------------------
Mike TechLead2
*/
Who are the developers reporting to TechLead3?
SELECT
e.EmployeeName,
p.EmployeeName AS Manager
FROM employees e
INNER JOIN employees p ON p.EmployeeID = e.ManagerID
WHERE p.EmployeeName = 'TechLead3'
/*
EmployeeName Manager
-------------------- --------------------
Laura TechLead3
Jeff TechLead3
*/
Now look at the following questions:
  1. Who are the bosses of TechLead4?
  2. Who are at the second level below PM2?
  3. Who are the employees directly or indirectly reporting to PM2?

We could answer the questions easily when it is about people one level above or below a given position. However, it will be harder to find information from more than one level up or down. HIERARCHYID makes this much easier. To find all the bosses of TechLead4, we need to go upward in the hierarchy starting from the immediate boss of TechLead4. To find the people at second level below PM2, we need to find all the people who directly reports to PM2 and all the people below them. To answer the 3rd question, we need to traverse the entire hierarchy downwards, starting with PM2.

Now, let us see how HIERARCHYID helps to answer these questions. Let us first of all, recreate the Employees table having a HIERARCHYID column and then populate it. Here is the version of the insert statement that uses a HIERARCHYID.
INSERT INTO employees (EmployeeNode, EmployeeID, EmployeeName, ManagerID)
VALUES
(CAST('/' AS HIERARCHYID), 1, 'CEO', 0),
(CAST('/1/' AS HIERARCHYID), 2, 'PM1', 1),
(CAST('/1/1/' AS HIERARCHYID), 4, 'TechLead1', 2),
(CAST('/1/1/1/' AS HIERARCHYID), 6, 'Jacob', 4),
(CAST('/1/1/2/' AS HIERARCHYID), 7, 'Steve', 4),
(CAST('/1/2/' AS HIERARCHYID), 5, 'TechLead2', 2),
(CAST('/1/2/1/' AS HIERARCHYID), 8, 'Bob', 5),
(CAST('/1/2/2/' AS HIERARCHYID), 9, 'Mike', 5),
(CAST('/2/' AS HIERARCHYID), 3, 'PM2', 1),
(CAST('/2/1/' AS HIERARCHYID), 10, 'TechLead3', 3),
(CAST('/2/1/1/' AS HIERARCHYID), 11, 'Laura', 10),
(CAST('/2/1/2/' AS HIERARCHYID), 12, 'Jeff', 10),
(CAST('/2/2/' AS HIERARCHYID), 13, 'TechLead4', 3),
(CAST('/2/2/1/' AS HIERARCHYID), 14, 'Bill', 13),
(CAST('/2/2/2/' AS HIERARCHYID), 15, 'Andrew', 13)

There are a few different ways to populate hierarchy information in a column with HIERARCHYID data type. We will discuss this in the next post. Let us now try to write the queries to answer the 3 questions we saw previously.

Question 1: Who are the bosses of TechLead4

DECLARE @h HIERARCHYID
SELECT @h = EmployeeNode FROM employees WHERE EmployeeName = 'TechLead4'
SELECT EmployeeName FROM employees WHERE @h.IsDescendantOf(EmployeeNode) = 1
AND EmployeeNode <> @h
/*
EmployeeName
--------------------
CEO
PM2
*/

HIERARCHYID is implemented internally as a CLR data type and it has a number of methods. We will examine the methods exposed by HIERARCHYID in the next post. The method "IsDescendantOf()" returns 1 if it is the child of another HIERARCHYID value. The query first stores the HierarchyID of TechLead4 to a variable and calls the "IsDescendantOf()" method against all the rows in the table.

Question 2: Who are at the second level below PM2

DECLARE @h HIERARCHYID
SELECT @h = EmployeeNode FROM employees WHERE EmployeeName = 'PM2'
SELECT EmployeeName FROM employees WHERE EmployeeNode.GetAncestor(2) = @h
/*
EmployeeName
--------------------
Laura
Jeff
Bill
Andrew
*/

Another useful method is "GetAncestor()". This method returns true if a given value is the ancestor of another value by a given level. The above query stores the HierarchyID of 'PM2' to a variable. It then matches all records in the table and returns the rows with a HierarchyID value whose second level ancestor is 'PM2'.

Question 3: Who are the employees directly or indirectly reporting to PM2?

DECLARE @h HIERARCHYID
SELECT @h = EmployeeNode FROM employees WHERE EmployeeName = 'PM2'
SELECT EmployeeName FROM employees WHERE EmployeeNode.IsDescendantOf(@h) = 1
AND EmployeeNode <> @h
/*
EmployeeName
--------------------
TechLead3
Laura
Jeff
TechLead4
Bill
Andrew
*/

In the first example, we have seen the usage of "IsDescendantOf()". But the logic we used in the first example was different. In the first example, we matched all the rows against the variable and checked if a given row is the descendant of the variable. This example shows the reverse case. Here are we are testing if the variable is the descendent of any of the rows in the table.

HIERARCHYID has 10 methods and we just saw two of them. We will examine the other methods in the next posts.

SQL Server 2008 (KATMAI) TSQL Enhancements

  1. SQL Server 2008 (KATMAI) TSQL Enhancements - Part 1 - Variable initialization and Compound Assignment Operators
  2. SQL Server 2008 (KATMAI) TSQL Enhancements - Part 2 - Row Constructors
  3. SQL Server 2008 (KATMAI) TSQL Enhancements - Part 3 - New Data Type: DATE
  4. SQL Server 2008 (KATMAI) TSQL Enhancements - Part 4 - New Data Type: TIME
  5. SQL Server 2008 (KATMAI) TSQL Enhancements - Part 5 - New Data Type: DATETIME2
  6. SQL Server 2008 (KATMAI) TSQL Enhancements - Part 6 - DATETIMEOFFSET

2 comments:

Ayyanar said...

Excellent!!!. I like your clean explanation.

RR said...

Thank you for excellent posts.