We are designing a large transactional database that will end up being deployed in a clustered SQL Server 2000 environment. We have initially designed all of our primary key fields as GUID type to eliminate any potential future replication issues. Will
this design impact the performance of the db negatively versus using an auto-increment PK design using big int type fields and following a ranged identities design approach?
Primary key selection and performance are the same regardless of whether
you are in a clustered environment or not.
GUIDs as PKs are not a good idea. First, they are effectively random, so
there will be lots of page splits and table/index fragmentation if you
allow the default of clustered order on the PK. Second, they are totally
artificial, which I think is a very bad idea from a data integrity
standpoint. Finally, they are wide. Remember, every non-clustered index
uses the clustered index key to look up data rows. GUIDs are 16 bytes wide
and will slow down index intersection as a possible query resolution path.
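To make the width and fragmentation points concrete, here is a rough T-SQL
sketch. The table name GuidDemo and its columns are made up for
illustration, and the fragmentation check only means something after you
have loaded a representative number of rows yourself.

```sql
-- Compare key widths: uniqueidentifier is 16 bytes, bigint 8, int 4.
SELECT DATALENGTH(NEWID())            AS GuidBytes,
       DATALENGTH(CAST(1 AS bigint))  AS BigIntBytes,
       DATALENGTH(CAST(1 AS int))     AS IntBytes
GO

-- Hypothetical table with a GUID primary key. With no other clustered
-- index present, PRIMARY KEY defaults to clustered, so rows are stored
-- in (effectively random) GUID order.
CREATE TABLE GuidDemo (
    ID      uniqueidentifier NOT NULL DEFAULT NEWID() PRIMARY KEY,
    Payload char(200)        NOT NULL
)
GO

-- After loading test rows, inspect fragmentation for every index.
-- The ScanDensity and LogicalFragmentation columns are the ones to watch.
DBCC SHOWCONTIG ('GuidDemo') WITH TABLERESULTS, ALL_INDEXES
GO
```

GO is the batch separator used by Query Analyzer and osql, not a T-SQL
statement. Running the same check against a copy of the table clustered on
an identity column should give you a side-by-side comparison on your own
data.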
Here is the (very condensed) version of how I prefer to lay out index
structures.
First, I create a RowID column, Int (or BigInt) Identity(1,1), on each
table. I then create a unique, clustered index on this column. Note it is
NOT the Primary Key. Primary keys are data-centric, not artificial. If you
cannot identify a data-centric key, then you need to rethink your design,
since it is by definition not in third normal form. With data-centric keys,
I don't worry about new RowIDs if I need to replicate the data.
My PK index is non-clustered. Any additional indexes are also
non-clustered.
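A minimal T-SQL sketch of that layout might look like the following. The
table Orders and the columns OrderNumber, CustomerCode, and OrderDate are
hypothetical, chosen only to show the shape; OrderNumber stands in for
whatever data-centric key your own design identifies.

```sql
-- Hypothetical table; all names here are illustrative only.
CREATE TABLE dbo.Orders (
    RowID        bigint IDENTITY(1,1) NOT NULL,  -- artificial, insert-ordered
    OrderNumber  varchar(20)          NOT NULL,  -- assumed data-centric key
    CustomerCode varchar(10)          NOT NULL,
    OrderDate    datetime             NOT NULL
)
GO

-- Unique clustered index on the identity column: rows are stored in
-- insert order, and this narrow key is what other indexes carry.
CREATE UNIQUE CLUSTERED INDEX IX_Orders_RowID
    ON dbo.Orders (RowID)
GO

-- Primary key on the data-centric column, explicitly non-clustered.
ALTER TABLE dbo.Orders
    ADD CONSTRAINT PK_Orders PRIMARY KEY NONCLUSTERED (OrderNumber)
GO

-- Any additional indexes are non-clustered as well.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerCode
    ON dbo.Orders (CustomerCode)
GO
```

Creating the clustered index before adding the primary key keeps the PK
from claiming the clustered slot by default; spelling out NONCLUSTERED
makes the intent explicit either way.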
Since the data is stored in insert order, the cache manager gets a bit of
help. Most databases query newly inserted data more often than older data,
especially in large tables. Inserts are intentionally hot-spotted at the
end of the table, which also helps the lazy writer and checkpoint
processes. Index intersection works well because every non-clustered index
carries only the narrow RowID as its row locator.
I have been using this technique for a few years now and it makes a HUGE
difference in performance.
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
Careerbuilder.com
I support the Professional Association for SQL Server
www.sqlpass.org
"toml" <anonymous@.discussions.microsoft.com> wrote in message
news:A2293202-5C5F-4240-9413-628E5913FF58@.microsoft.com...
> We are designing a large transactional database that will end up being
> deployed in a clustered SQL Server 2000 environment. We have initially
> designed all of our primary key fields as GUID type to eliminate any
> potential future replication issues. Will this design impact the
> performance of the db negatively versus using an auto-increment PK design
> using big int type fields and following a ranged identities design approach?