Playing With Hierarchies

In our world of data, often times we come across data that represents a hierarchical structure.  I recently was asked to work on a design that would utilize the hierarchyid datatype that was introduced in SQL Server 2008.

Our task was to bulk load multiple nodes of a hierarchy into a single table structure, with the possibility of having multiple parents.  Since multiple parents are not supported in the hierarchyid datatype, I had some creative work ahead of me.

The first thing I needed to do was understand hierarchyid.  This datatype is meant to provide all of the information required to denote a “Single Parent” hierarchy.  However, the value displayed is a hexadecimal, and has no real world application.  Thankfully we have the following methods to help us.

  • GetAncestor(n) -  This will get you parent hierarchyid at the level (int) that you supply
  • GetDescendant(child1, child2) – Gets the descendant of a hierarchyid based on the child hierarchyids you pass in. NOTE: this does not guarantee an unused hierarchyid
  • GetLevel() – This will return the level of the Hierarchyid.
  • hierarchyid::GetRoot() -  returns the root of a hierarchyid.  In my experience, this is always equivalent to “/”
  • IsDescendantOf() – Will return a bit value of 0 or 1 based on if the Hid is a descendant of the hierarchyid parameter.
  • hierarchyid::Parse(string) – Will return the hierarchyID of a string.  Ex: ‘/45/142/’ returns 0xCBBE06F4.  This is the opposite of ToString.  Cast(‘/45/142/’ AS hierarchyid) performs the same action.
  • GetReparentedValue(oldparent, newparent) – This will exchange the old parent value for the new parent value.
  • ToString() – Will return the string representation of the hierarchyID.  Ex: 0xCBBE06F4 returns ‘/45/142/’.  This is the opposite of Parse.

Since we have the ToString function and the hexadecimal value is hard to look at, I will refer to examples of hierarchyids with their string value.  After studying hierarchyid, I discovered a few related things.

First, the string representation of the hierarchyid is very intuitive with its structure.  /76/ is the parent of /76/39/, which in turn is the parent of /76/39/2/.

Second, I noticed that there is no function of this datatype that enforces any sort of integrity. A new record does not need to have its ancestors in the table.  There can be a single record in your table with a hierarchyid of /4/3/5/6/, and the 4, 3, and 5 hierarchies can be non existent.  In fact, if there is no unique constraint on the column, duplicate records with /4/3/5/6/ can exist in the same table.  It is in this way that hierarchyid is not really aware of the other nodes in the table.  Even great care must be taken with the GetDescendant function, because unlike an identity column, it will not just give you the next unused node.

Finally, given this information, I realized that the value represented in each node was completely arbitrary.  This means that it is another structure to store information in!

With this great piece of information in hand, I built a table similar to the following:

Existing tables: Country, SubCountry

CountryId CountryName
206 United States
314 Canada
SubCountryId SubCountryName CountryId
74 Washington 1
19 Oregon 1
261 British Columbia 2

(hierarchyids are in string format)

GeographyId GeographyHierarchyId GeographyName
1 /206/ United States
2 /314/ Canada
3 /206/74/ Washington
4 /206/19/ Oregon
5 /314/261/ British Columbia

In this fashion, the original IDs were preserved inside of the hierarchyid of the new structure, giving an easy way to update additional attributes that might be in the original table when the case arises.

So, how has your experience with hierarchyid been? Do you have any tricks or tips that have helped you out?

Until next time,
-Mike

Leave a Comment


NOTE - You can use these HTML tags and attributes:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>