Database Design Guidelines For Developers

One of the developers that I work with asked me to write a “brief” (really brief) guide on database design. What I have written below is a summary in bullet point form of best practices when creating databases, tables, indexes, and then querying the data within. It is designed to be short and sweet so the reader can check each section quickly when creating a new database. References are provided so that the reader can use if they wish to explore a certain topic in more detail.

Let me know what you think! Is there anything glaringly obvious that I’ve missed? My email is dbafromthecold@gmail.com

Database Design Guidelines

Database Design

  • PRIMARY filegroup should be reserved for system objects
  • Secondary DATA filegroup should be set to default.
  • Minimum of two physical files mapped to each secondary filegroups
  • Data and Log files presized for a minimum of one year’s data inserts
  • Data and Log files should reside on separate drives
  • Denormalisation should only be considered in special circumstances. A properly normalised OLTP database will outperform a denormalised one
  • Recommended collation is Latin1_General_CI_AS

Table Design

  • Each table created within the database should have a Primary Key and a Clustered Index. These need not be the same column
  • GUID fields should not be used for clustered indexes even if used as table’s Primary Key
  • Composite keys should be avoided. Surrogate INT IDENTITY keys can be used as a replacement
  • Columns set to the smallest size possible – avoiding NVARCHAR(MAX), TEXT etc datatypes
  • Avoid Nullable columns. Nullable columns require extra space within the database to designate the fact that they are nullable
  • Use referential integrity – Foreign keys and unique constraints should be applied

MSDN CREATE TABLE Documentation – http://technet.microsoft.com/en-us/library/ms174979.aspx

Querying Data

  • Stored procedures should be used as much as possible, to allow query execution plans to be re-used
  • Do not use SELECT * – SELECT only the columns needed to keep the query’s memory usage as low as possible
  • As a standard, cursors should be avoided. Removal of RBAR (row-by-agonising-row) processing
  • SET NOCOUNT ON should be at the start of each SQL batch to reduce network traffic
  • Dynamic SQL should be executed using sp_executesql.
  • Do not repeatedly call functions within stored procedures, functions, batches and triggers.
  • Sargability (Search ARGument Able) – Confirm queries executed are able to seek on indexes in database
    • Avoid wildcard characters at the beginning of a word while searching using the LIKE keyword
    • Avoid searching using not equals operators (<> and NOT)
    • Avoid functions in WHERE clause on table columns
    • Avoid implicit conversions in WHERE clause (use CAST/CONVERT if necessary)
BAD:   SELECT ... WHERE Year(StartDate) = 2014
FIXED: SELECT ... WHERE StartDate &gt;= '01-01-2014' AND StartDate &lt; '01-01-2015'

BAD:   SELECT ... WHERE isNull(Surname,'Williams') = 'Williams'
FIXED: SELECT ... WHERE ((Surname = 'Williams') OR (Surname IS NULL))

BAD:   SELECT ... WHERE LEFT(FirstName,1) = 'P'
FIXED: SELECT ... WHERE FirstName LIKE 'P%'

BAD:   SELECT ... WHERE DateDiff(mm,OrderDate,GetDate()) &gt;= 30
FIXED: SELECT ... WHERE OrderDate &lt; DateAdd(mm,-30,GetDate())

Sargability References

Stack Overflow – What makes a SQL Statement sargable? http://stackoverflow.com/questions/799584/what-makes-a-sql-statement-sargable
Rob Farley – Sargable Functions http://blogs.msmvps.com/robfarley/2010/01/21/sargable-functions-in-sql-server/
Brent Ozar – Why %string% is slow http://www.brentozar.com/archive/2010/06/sargable-why-string-is-slow/

Indexes

  • ¬†Naming convention – Prefix_TableName_ColumnName
  • ¬†Prefix – PK for Primary Keys, IX for non-clustered indexes, UC for Unique Indexes
  • Make sure the option ONLINE = ON is in place
  • Correct nonclustered indexes created for incoming queries – avoid RID and KEY lookups on tables
  • No Indexes on columns like “Gender” as this would only have a maximum of three possible entries (Male, Female & NULL) making an index ineffective. Nonclustered indexes must be as selective as possible