DbSchema Database Designer

DbSchema | SQL Server Database Normalization Techniques

Publish on DbSchema Blog >>>

SQL Server: How to Apply Database Normalization Techniques in sqlcmd and DbSchema

SQL Server alt >

Table of Contents

  1. Introduction
  2. Prerequisites
  3. What is Database Normalization?
  4. Purpose of Using Database Normalization
  5. Restrictions and Permissions for Database Normalization
  6. Advantages and Limitations of Database Normalization
  7. Principles of Database Normalization
  8. Understanding the Normal Forms
  9. Applying Normal Forms in sqlcmd and DbSchema
  10. Conclusion
  11. References

Introduction

With the exponential growth of data, organizing it in a meaningful and efficient manner has become extremely important. This is where Database Normalization comes into play, an important concept for any database developer. This article provides a comprehensive guide to understanding and applying Database Normalization techniques in SQL Server using tools like sqlcmd and DbSchema. We will take a deep dive into different Normal Forms and also discuss how to use them in real-world scenarios.

Prerequisites

  • Basic understanding of SQL Server
  • Familiarity with SQL commands and syntax
  • Installed instances of SQL Server and DbSchema
  • Basic understanding of Database Management Systems

For installation and establishing connection you can read our article SQL Server-How to create a database?

What is Database Normalization?

Database Normalization is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics like Insertion, Update and Deletion Anomalies. It is a multi-step process that puts data into a tabular form, removing duplicated data from the relation tables. The end result is a set of tables that are well-structured and reliable for the long run.

Purpose of Using Database Normalization

The primary purpose of Database Normalization is to:

  • Eliminate redundant data
  • Ensure data dependencies make sense
  • Improve data consistency
  • Protect the integrity of databases

Restrictions and Permissions for Database Normalization

To perform database normalization in SQL Server, you need certain permissions. These include:

  • The CREATE TABLE permission in the database
  • The ALTER permission on the schema

However, there are restrictions such as:

  • Tables must not contain sparse columns
  • Tables should be on the same filegroup

Advantages and Limitations of Database Normalization

Advantages

  • Reduces data redundancy and improves data integrity
  • Makes database schema flexible
  • Facilitates efficient data retrieval with quicker queries

Limitations

  • Over-normalization can lead to excess complexity and impact performance
  • It might require more joins, which can be costly in terms of resources

Principles of Database Normalization

Normalization involves arranging attributes in relations based on dependencies between attributes. The key principles of normalization include:

  • Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key. This refers to the concept of functional dependency.
  • The values in an attribute should be atomic.
  • There should be no redundant data.

Understanding the Normal Forms

First Normal Form (1NF)

This form imposes the following rules:

  • Each table should have atomic (indivisible) values.
  • Each record needs to be unique.

Second Normal Form (2NF)

  • All requirements for 1st NF must be met.
  • Redundant data across multiple rows of a table should be moved to a separate table.

Third Normal Form (3NF)

  • All requirements for 2nd NF must be met.
  • There should be no transitive dependencies for non-prime attributes.

Boyce-Codd Normal Form (BCNF or 4NF)

  • All requirements for 3rd NF must be met.
  • For any dependency A → B, A should be a super key.

Fifth Normal Form (5NF)

  • All requirements for 4th NF must be met.
  • There should be no possibility of generating more information than contained in the concatenated keys and the related columns.

Sixth Normal Form (6NF)

It is used for temporal databases like time-series databases.

Applying Normal Forms in sqlcmd and DbSchema

For this tutorial, let’s consider a sample database Bookstore. For the sake of simplicity, we will use a single Books table, and we will normalize it step-by-step.

1NF

Firstly, we must ensure that each cell of our table contains atomic (indivisible) values. For example, in our Books table, we should separate the Authors column if a book can have multiple authors.

In sqlcmd:

To do this in sqlcmd, you would run:

CREATE TABLE Authors(
  AuthorId INT PRIMARY KEY,
  Name NVARCHAR(100)
);

CREATE TABLE Books(
  BookId INT PRIMARY KEY,
  Title NVARCHAR(100),
  AuthorId INT FOREIGN KEY REFERENCES Authors(AuthorId)
);

To know more about creating a table you can read our article SQL Server-How to create a table?

In DbSchema:

In DbSchema, you would follow a similar process, but with a visual interface. Right-click in the layout area, select Create new table, then define the columns as above.

2NF

In this stage, we want to remove data redundancy by moving the data to separate tables and establish a relationship between them using foreign keys.

In sqlcmd:

CREATE TABLE Publishers(
  PublisherId INT PRIMARY KEY,
  Name NVARCHAR(100)
);

CREATE TABLE Books(
  BookId INT PRIMARY KEY,
  Title NVARCHAR(100),
  AuthorId INT FOREIGN KEY REFERENCES Authors(AuthorId),
  PublisherId INT FOREIGN KEY REFERENCES Publishers(PublisherId)
);

In DbSchema:

You would follow similar steps in DbSchema to create the Publishers table and define the relationship with Books.

3NF

In the 3rd NF, we remove transitive dependencies. Assume there’s a column PublisherAddress in the Books table, which depends on PublisherId, not BookId. We remove it from Books and move to Publishers.

Boyce-Codd Normal Form (BCNF)

BCNF addresses a scenario where a table has more than one candidate key and there are dependencies between those candidate keys. This can lead to anomalies in the database.

For instance, let’s assume there is a table ClassRoom which contains columns ClassRoomID, Subject, Teacher. Both ClassRoomID and Subject could be considered candidate keys for this table, because a classroom will have only one subject and a subject is taught by one teacher only.

This relation violates BCNF rules because Teacher is dependent on the candidate key Subject (non-prime attribute). To normalize this relation to BCNF, we need to break this table up into two tables: ClassRoom and Subject.

In sqlcmd:

CREATE TABLE ClassRoom(
  ClassRoomID INT PRIMARY KEY,
  Subject NVARCHAR(100),
);

CREATE TABLE Subject(
  Subject NVARCHAR(100) PRIMARY KEY,
  Teacher NVARCHAR(100)
);

In DbSchema:

In DbSchema, create the tables using the GUI as previously explained.

Fourth Normal Form (4NF)

4NF aims to handle multivalued dependencies. Assume a Students table in our Bookstore database, which records the books each student has borrowed and their respective return dates. Each student can borrow multiple books and each book can have multiple return dates.

This table could potentially have redundant data and lead to update anomalies. To normalize it to the 4NF, we need to split it into two tables.

In sqlcmd:

CREATE TABLE StudentBooks(
  StudentId INT PRIMARY KEY,
  BookId INT FOREIGN KEY REFERENCES Books(BookId),
);

CREATE TABLE BookReturnDates(
  BookId INT PRIMARY KEY REFERENCES Books(BookId),
  ReturnDate DATE
);

In DbSchema:

Again, in DbSchema, these tables are created through the GUI.

Fifth Normal Form (5NF)

5NF or Project-Join Normal Form (PJNF) deals with cases where information can be reconstructed from smaller pieces of information that can be maintained with less redundancy.

For example, assume we have a BookAuthorPublisher table in our Bookstore database with BookId, AuthorId, and PublisherId as composite key. However, a book can be written by multiple authors and can be published by multiple publishers.

To resolve this, we will decompose this table into three tables: Books, Authors, and Publishers.

In sqlcmd:

CREATE TABLE BookAuthors(
  BookId INT PRIMARY KEY REFERENCES Books(BookId),
  AuthorId INT FOREIGN KEY REFERENCES Authors(AuthorId)
);

CREATE TABLE BookPublishers(
  BookId INT PRIMARY KEY REFERENCES Books(BookId),
  PublisherId INT FOREIGN KEY REFERENCES Publishers(PublisherId)
);

In DbSchema:
As before, use DbSchema’s interface to create these tables.

Note:

It’s important to note that normalizing a database to 4NF or 5NF is quite rare as most of the real-world databases are normalized up to 3NF or BCNF. The higher normal forms are more theoretical and are less likely to be found in practical application, but they’re useful for understanding how data can be efficiently organized.

Results from the Query:

First Normal Form (1NF)

After executing 1NF, your Authors table might look like this:

AuthorId _Name_
1 John Smith
2 Mary Johnson
3 James Williams

And your Books table might look like this:

BookId _Title_ AuthorId
1 Programming 101 1
2 Design Patterns 2
3 Learn SQL 1

Second Normal Form (2NF)

After executing 2NF, you might have a new Publishers table:

PublisherId _Name_
1 TechBooks Publishing
2 Educational Reads

And your Books table might now look like this:

BookId Title AuthorId PublisherId
1 Programming 101 1 1
2 Design Patterns 2 2
3 Learn SQL 1 1

Third Normal Form (3NF)

After executing 3NF, your Publishers table might now look like this:

PublisherId Name PublisherAddress
1 TechBooks Publishing 123 Tech St, NY
2 Educational Reads 456 Education Lane, LA

Fourth Normal Form (4NF)

After executing 4NF, your StudentBooks table might look like this:

StudentId BookId
1 1
2 1
2 2

And you might have a new BookReturnDates table:

BookId ReturnDate
1 2023-07-20
1 2023-07-21
2 2023-07-22

Fifth Normal Form (5NF)

After executing 5NF, your BookAuthors table might look like this:

BookId AuthorId
1 1
2 2
3 1
1 2

And your BookPublishers table might look like this:

BookId PublisherId
1 1
2 2
3 1
1 2

Conclusion

Database normalization is a crucial process to create reliable and efficient databases. While the process can seem complex, tools like sqlcmd and DbSchema, along with a solid understanding of the Normal Forms, can make it manageable. Despite its limitations, normalization is a best practice for database design, and it is an essential skill for any database professional.

References

  1. Microsoft Docs: sqlcmd
  2. DbSchema: Interactive Diagrams
  3. Database Normalization: Database Normalization

Visually Manage Databases using DbSchema

DbSchema is a databases client and visual designer. DbSchema has a free Community Edition, which can be downloaded here.
DbSchema main features include:

DbSchema Designer alt >

Interactive Diagrams

Design tables, column and foreign keys directly in diagrams, by double-clicking them. Changes will be saved to the design model and, if DbSchema is connected to the database also into the database. More.


Connection Dialog alt >

Simple Connection Dialog

Choose the database location, the user and password, and simply get connected. Choose 'Edit Manually' into the JDBC URL combo to enter a custom URL. More.


Relational Data Explorer alt >

Relational Data Explorer

Explore data from multiple tables simultaneously, using foreign keys or virtual foreign keys. Double-click cells to edit the data. More.


Query Builder alt >

Query Builder

Create SQL Queries featuring JOINS, GROUP BY, ORDER BY just using the mouse. More.


SQL Query Editor alt >

SQL Query Editor

Edit and execute SQL Queries. The editor is autocompletion-enabled. More.


Schema Synchronization alt >

Design Schema in Team & Schema Deployment

DbSchema is using the design model, a copy of the schema structure, independent of the database.
The design model can be saved to file and shared in a team.
Connecting to another database you may compare the design model with the database, commit the differences or merge them in the design model. More.


Dark Theme alt >

Dark Theme

Configurable styles & dark theme. More.


Many features are available in the free Community edition.
The Pro edition adds capabilities to save the design to the model file, design schema in team and deploy the schema on multiple databases.


DbSchema can be downloaded for free. No registration is required.