Friday, August 03, 2007

Unique identifiers

Holy wars and worse have been fought over unique identifiers for data items, and the last word has not been written yet. The purist solution is a meaningless numeric sequence, and this is usually the recommendation. So why are other solutions adopted more often than this simple one?

Well, people get confused in discussions about unique identification because, if you uniquely identify something, wouldn't it be nice to recognise it as well? Why have a meaningless number if you can also call me Evert? And this is where people make the mistake. There is a difference between how a human recognises a thing or person and what systems need to do with the related records. Every system deals with the names of things in its own way, so any recognisable identifier has to go through multiple conversions if the item is used in many systems. A neutral number does not have this problem.

The other issue is that things you can recognise can change. Take country codes as an example. You would think that a country is a pretty stable object, but Upper Volta became Burkina Faso and Burma became Myanmar, and I have not even started on Serbia ... Every time a country changes, you have to change the identification of the object in all systems. This can lead to lots of (technical) issues.

My view is that we should allow for both: a unique (meaningless) number and a unique meaningful alias (or set of aliases). The number is the actual primary key, but the alias is what you use in practice on your screen. You can use the alias as much as you like, and when the alias changes or has to be converted for another system, the key stays the same, so you don't run into technical problems. The more meaningless the identifier, the less discussion when situations change.
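As a rough illustration of the number-plus-alias idea, here is a minimal Python sketch. The names `Country`, `CountryRegistry` and the key value 1001 are made up for this example; the point is only that other records store the meaningless key, so renaming the alias touches nothing else.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Country:
    key: int      # meaningless surrogate key, never changes
    alias: str    # current human-readable name, may change
    former_aliases: List[str] = field(default_factory=list)


class CountryRegistry:
    """Hypothetical registry: unique key plus unique alias(es) per country."""

    def __init__(self) -> None:
        self._by_key: Dict[int, Country] = {}
        self._key_by_alias: Dict[str, int] = {}

    def add(self, key: int, alias: str) -> Country:
        if key in self._by_key or alias in self._key_by_alias:
            raise ValueError("key and alias must both be unique")
        country = Country(key, alias)
        self._by_key[key] = country
        self._key_by_alias[alias] = key
        return country

    def rename(self, key: int, new_alias: str) -> None:
        if new_alias in self._key_by_alias:
            raise ValueError("alias already in use")
        country = self._by_key[key]
        country.former_aliases.append(country.alias)
        country.alias = new_alias
        # The old alias stays registered, so old names still resolve.
        self._key_by_alias[new_alias] = key

    def resolve(self, alias: str) -> int:
        """Translate any known alias (current or former) to the key."""
        return self._key_by_alias[alias]

    def display_name(self, key: int) -> str:
        return self._by_key[key].alias


registry = CountryRegistry()
registry.add(1001, "Upper Volta")

# Other records refer to the country by key only.
orders = [{"order": "A-17", "country_key": 1001}]

# The country changes its name; no order record has to be touched.
registry.rename(1001, "Burkina Faso")
```

After the rename, the order still points at key 1001, the screen shows the new alias, and a system that still sends the old name can be translated through `resolve`.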

The unique identifier needs to be assigned when the object is created and should never change. The best way to do this is via a 'service', an independent component in your system architecture, but this only makes sense in large, complex, enterprise-wide architectures.
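To make "assigned at creation, never changed" concrete, here is a small Python sketch. `IdService` is a hypothetical in-process stand-in for such a service; in a real enterprise architecture it would be a separate component that every system calls, and `Customer` is just an example object.

```python
import itertools
import threading


class IdService:
    """Hypothetical stand-in for a central identifier service."""

    def __init__(self, start: int = 1) -> None:
        self._counter = itertools.count(start)
        self._lock = threading.Lock()

    def next_id(self) -> int:
        # The lock keeps numbers unique when several threads ask at once.
        with self._lock:
            return next(self._counter)


class Customer:
    def __init__(self, ids: IdService, name: str) -> None:
        self._id = ids.next_id()   # assigned once, when the object is created
        self.name = name           # the recognisable part is free to change

    @property
    def id(self) -> int:
        # Read-only: no setter is defined, so assigning to .id fails.
        return self._id


ids = IdService()
first = Customer(ids, "Evert")
second = Customer(ids, "Anoop")

first.name = "Evert Ruijs"   # changing the name leaves the identifier alone
```

The design choice is that the object never picks its own number. It asks the service, and afterwards nothing in the object's interface allows the number to be changed.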


2 Comments:

At 10:07 AM, Blogger Unknown said...

Hi, I am enjoying going through your blog. Very helpful information. I have a question re: unique identifiers, say for customer or product. If you have a purchased application that handles customer or product data management, what are the best practices re: whether you use the purchased application's identifier or an in-house one (especially if one uses the purchased app to manage the data but another app as an ODS to serve it up to other applications in the enterprise)? My thinking is that the corporate identifier should prevail and be the "handle" for other apps. But there's obviously a cost. The other option is that the identifiers generated by the purchased package are exposed and accessible; that may work as well. Thoughts?

thanks

 
At 4:09 PM, Blogger Evert Ruijs said...

Anoop, I would advocate using your own keys as unique identifiers, and using the external identifiers as an alternate key (and extra attribute).
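A minimal Python sketch of that arrangement, under assumptions: `Product`, `ProductIndex`, `corporate_id`, `vendor_id` and the sample values are invented names for illustration and are not taken from any particular package.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Product:
    corporate_id: int   # our own key: the handle other applications use
    vendor_id: str      # identifier generated by the purchased package
    name: str


class ProductIndex:
    """Hypothetical store: own key as primary key, vendor key as alternate."""

    def __init__(self) -> None:
        self._by_corporate_id: Dict[int, Product] = {}
        self._corporate_id_by_vendor_id: Dict[str, int] = {}

    def add(self, product: Product) -> None:
        if product.corporate_id in self._by_corporate_id:
            raise ValueError("corporate_id must be unique")
        if product.vendor_id in self._corporate_id_by_vendor_id:
            raise ValueError("vendor_id must be unique")
        self._by_corporate_id[product.corporate_id] = product
        self._corporate_id_by_vendor_id[product.vendor_id] = product.corporate_id

    def get(self, corporate_id: int) -> Product:
        return self._by_corporate_id[corporate_id]

    def find_by_vendor_id(self, vendor_id: str) -> Optional[Product]:
        # Alternate-key lookup, used when talking to the purchased package.
        key = self._corporate_id_by_vendor_id.get(vendor_id)
        return None if key is None else self._by_corporate_id[key]


index = ProductIndex()
index.add(Product(corporate_id=501, vendor_id="PKG-000042", name="Widget"))
```

Other applications would only ever see 501. The vendor identifier is kept as an attribute and is needed only at the boundary with the purchased package.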

 
