📘 Normalization of Database Tables
Welcome to another part of the Database Management System (DBMS) study series! In this post, we’ll learn about Normalization — one of the most essential concepts in database design. You’ll understand why normalization is needed, how functional dependencies work, and how to convert tables into different normal forms such as 1NF, 2NF, 3NF, BCNF, 4NF, and 5NF — all with simple explanations and examples.
🔍 What is Normalization?
Normalization is a process used in database design to organize data efficiently by reducing data redundancy and ensuring data integrity. It divides large tables into smaller related tables and defines relationships between them.
🎯 Goals of Normalization
- Eliminate redundant data (repetition of same data).
- Ensure data dependencies make sense (each data item depends on the key).
- Simplify data maintenance and updates.
- Improve consistency and storage efficiency.
⚙️ Need for Normalization
When a database is not normalized, several problems occur:
- Data redundancy: Same data stored multiple times.
- Update anomalies: Updating one record requires updating many others.
- Insertion anomalies: You can’t insert data due to missing fields.
- Deletion anomalies: Deleting one record may unintentionally remove useful data.
Example of Unnormalized Data:
STUDENT Table
| Student_ID | Student_Name | Course_1 | Course_2 | Instructor |
|-------------|---------------|-----------|-----------|-------------|
| 1 | Ananya | DBMS | OS | Dr. Mehta |
| 2 | Rahul | DBMS | NULL | Dr. Mehta |
Problems here:
- Redundancy — “Dr. Mehta” is repeated.
- Difficulty adding a new course or instructor without a student.
- Partial and inconsistent data.
🧩 Understanding Functional Dependencies (FDs)
A Functional Dependency is a relationship between two attributes in a table. It shows how one attribute uniquely determines another.
Notation: A → B means attribute A functionally determines B.
Example:
Student_ID → Student_Name
This means for every Student_ID, there is only one Student_Name associated.
🧱 Normalization Conversion Process
1️⃣ First Normal Form (1NF)
Rule: Each cell should contain a single value; no repeating groups or arrays.
Conversion to 1NF:
STUDENT_COURSE Table (After 1NF)
| Student_ID | Student_Name | Course | Instructor |
|-------------|---------------|------------|-------------|
| 1 | Ananya | DBMS | Dr. Mehta |
| 1 | Ananya | OS | Dr. Mehta |
| 2 | Rahul | DBMS | Dr. Mehta |
✅ Each field now contains only atomic (single) values.
2️⃣ Second Normal Form (2NF)
Rule: The table should be in 1NF, and all non-key attributes should depend on the entire primary key, not just part of it.
In our case, the composite key is (Student_ID, Course).
Problem: Instructor depends only on Course, not on Student_ID.
Conversion to 2NF:
STUDENT Table
| Student_ID | Student_Name |
|-------------|---------------|
| 1 | Ananya |
| 2 | Rahul |
COURSE Table
| Course | Instructor |
|-------------|-------------|
| DBMS | Dr. Mehta |
| OS | Dr. Mehta |
ENROLLMENT Table
| Student_ID | Course |
|-------------|--------|
| 1 | DBMS |
| 1 | OS |
| 2 | DBMS |
✅ Now, all non-key attributes depend on the full key.
3️⃣ Third Normal Form (3NF)
Rule: The table should be in 2NF, and all non-key attributes should depend directly on the primary key — not on other non-key attributes (no transitive dependency).
Example: If we had:
Course → Instructor → Department
Then Department is transitively dependent on Course.
Conversion to 3NF:
- Separate the
Departmentinto a new table.
INSTRUCTOR Table
| Instructor | Department |
|-------------|-------------|
| Dr. Mehta | Computer Science |
✅ Now all non-key attributes depend only on the key.
🏛️ Higher Normal Forms
🔸 Boyce-Codd Normal Form (BCNF)
Rule: For every functional dependency A → B, A must be a super key.
Example:
Teacher_ID → Subject
Subject → Teacher_ID
This violates BCNF because Subject is not a super key.
Solution: Split into separate tables:
TEACHER Table: Teacher_ID → Subject
SUBJECT Table: Subject → Teacher_ID
✅ Eliminates anomalies caused by overlapping dependencies.
🔸 Fourth Normal Form (4NF)
Rule: A table is in 4NF if it is in BCNF and contains no multi-valued dependencies.
Example: A student can have multiple skills and multiple hobbies:
STUDENT Table
| Student_ID | Skill | Hobby |
|-------------|--------|--------|
| 1 | C++ | Chess |
| 1 | Python | Chess |
| 1 | C++ | Music |
Problem: Skills and Hobbies are independent — causes unnecessary combinations.
Solution: Separate into two tables:
STUDENT_SKILL (Student_ID, Skill)
STUDENT_HOBBY (Student_ID, Hobby)
✅ Now there are no multi-valued dependencies.
🔸 Fifth Normal Form (5NF)
Rule: A table is in 5NF when it cannot be further decomposed without losing data. It removes join dependencies.
Example: A project may involve multiple students, instructors, and tools. Breaking such relationships correctly into smaller tables ensures accurate joins without redundancy.
✅ 5NF ensures complete data integrity in complex many-to-many relationships.
🧠 Summary Table of Normal Forms
| Normal Form | Main Rule | Removes |
|---|---|---|
| 1NF | Atomic values, no repeating groups | Repeating data |
| 2NF | Full dependency on entire key | Partial dependency |
| 3NF | No transitive dependency | Transitive dependency |
| BCNF | Every determinant is a key | Functional anomalies |
| 4NF | No multi-valued dependencies | Redundant multi-values |
| 5NF | No join dependency | Join anomalies |
💡 Key Takeaways
- Normalization is essential to maintain data accuracy and reduce redundancy.
- Start with 1NF and move step by step up to 3NF — most databases use 3NF or BCNF.
- Higher forms (4NF, 5NF) are needed for complex relationships.
- Always balance normalization with performance — over-normalization can slow queries.
🧪 Practice Exercise
Try normalizing the following unnormalized table:
BOOK Table
| Book_ID | Title | Author | Publisher | Publisher_Address | Author_Phone |
Convert it step-by-step to 1NF, 2NF, and 3NF.
Written by Saksham Shekher — MCA student & tech blogger passionate about simplifying computer science concepts for learners.
