Normalization is a database design technique used to organize data in a relational database. The goal of normalization is to reduce data redundancy and dependency by organizing fields and table of a database. This process involves breaking down large tables into smaller, more manageable tables and defining relationships between them.
The most common normal forms are the first normal form (1NF), second normal form (2NF), and third normal form (3NF). Here's a brief overview of each:
First Normal Form (1NF):
Eliminate duplicate columns from the same table.
Create separate tables for each group of related data.
Identify a primary key to uniquely identify each row.
Second Normal Form (2NF):
Meet the requirements of 1NF.
Remove partial dependencies by separating columns that are dependent on only part of the primary key into their own tables.
A table is in 2NF if it is in 1NF and every non-prime attribute is fully functionally dependent on the primary key.
Third Normal Form (3NF):
Meet the requirements of 2NF.
Remove transitive dependencies by removing columns that are dependent on other non-primary key columns.
A table is in 3NF if it is in 2NF and every non-prime attribute is non-transitively dependent on the primary key.
There are additional normal forms such as Boyce-Codd Normal Form (BCNF) and Fourth Normal Form (4NF), but they are refinements of the above and might not be needed in all cases.
It's important to note that normalization is a trade-off, and over-normalization can lead to complex queries and performance issues. In some cases, denormalization (introducing redundancy for performance gains) may be considered, but it should be done carefully.
Normalization is a key concept in database design, and understanding the principles can help create efficient, scalable, and maintainable database structures.
First Normal Form (1NF): explain with example
First Normal Form (1NF) is a property of a relational database that ensures that each column in a table contains atomic (indivisible) values, and there are no repeating groups or arrays of values. In other words, every cell of a table must hold a single, indivisible piece of data. Let's look at an example to illustrate the concept of 1NF.
Consider the following table that represents information about students and their courses:
Student_ID Student_Name Courses
1 John Math, Physics
2 Jane Chemistry, Biology, Physics
3 Bob Math, Chemistry
In the table above, the "Courses" column violates the 1NF because it contains multiple values separated by commas. To bring it into 1NF, we need to split the courses into separate rows for each student. The modified table would look like this:
Student_ID Student_Name Course
1 John Math
1 John Physics
2 Jane Chemistry
2 Jane Biology
2 Jane Physics
3 Bob Math
3 Bob Chemistry
Now, the table adheres to 1NF because each cell contains a single, indivisible value. The courses are no longer stored as a comma-separated list within a single cell.
In summary, First Normal Form ensures that there are no repeating groups or arrays within a table, and each column contains atomic values. It is a fundamental step in the normalization process to organize data in a relational database efficiently.
Second Normal Form (2NF):
Second Normal Form (2NF) is a database normalization form that builds on the concepts of First Normal Form (1NF). To be in 2NF, a table must first satisfy the requirements of 1NF, and additionally, it should eliminate partial dependencies by removing columns that are dependent on only part of the primary key.
A table is in 2NF if, and only if:
It is in 1NF.
All non-prime attributes (attributes not part of the primary key) are fully functionally dependent on the entire primary key.
Let's illustrate this with an example:
Consider the following table, which represents information about a company's employees and projects:
Employee_ID Project_ID Employee_Name Project_Name Hours_Worked
1 101 Alice ProjectA 20
1 102 Alice ProjectB 30
2 101 Bob ProjectA 25
2 103 Bob ProjectC 15
In this table, the composite key is {Employee_ID, Project_ID}, as no combination of these two columns repeats. However, we can see that the "Employee_Name" column is functionally dependent only on the "Employee_ID" part of the key, and the "Project_Name" column is functionally dependent only on the "Project_ID" part of the key. This indicates a partial dependency.
To bring this table into 2NF, we split it into two tables:
Employee Table:
Employee_ID Employee_Name
1 Alice
2 Bob
Project Table:
Project_ID Project_Name
101 ProjectA
102 ProjectB
103 ProjectC
Hours_Worked Table:
Employee_ID Project_ID Hours_Worked
1 101 20
1 102 30
2 101 25
2 103 15
Now, each table has a single-purpose, and there are no partial dependencies. The "Employee_Name" and "Project_Name" are stored in separate tables, ensuring that each non-prime attribute is fully functionally dependent on the entire primary key. The resulting tables are in Second Normal Form (2NF).
Third Normal Form (3NF):
Sure, let's discuss Second Normal Form (2NF) and Third Normal Form (3NF).
Second Normal Form (2NF):
The primary goal of 2NF is to address partial dependencies within a table. A table is in 2NF if it is in First Normal Form (1NF) and every non-prime attribute (attributes not part of the primary key) is fully functionally dependent on the entire primary key. In simpler terms, there should be no partial dependencies on a composite primary key.
Consider the following table:
Order_ID Product_ID Customer_Name Quantity
1 101 John 5
1 102 Jane 3
2 101 Bob 2
2 103 Alice 1
In this example, (Order_ID, Product_ID) is the composite primary key. The issue here is that the "Customer_Name" attribute is dependent only on the "Order_ID" part of the key, and not on the entire key.
To bring this table into 2NF, we split it into two tables:
Table 1: Orders
Order_ID Customer_Name
1 John
2 Bob
Table 2: Order_Details
Order_ID Product_ID Quantity
1 101 5
1 102 3
2 101 2
2 103 1
Now, "Customer_Name" is fully dependent on the primary key in the "Orders" table.
Third Normal Form (3NF):
The primary goal of 3NF is to address transitive dependencies within a table. A table is in 3NF if it is in 2NF and every non-prime attribute is non-transitively dependent on the primary key. In simpler terms, there should be no indirect dependencies between non-prime attributes.
Consider the following table:
Table: Employee
Employee_ID Department Manager_Name Manager_Department
101 Sales John Sales
102 HR Jane HR
In this example, "Manager_Department" is dependent on "Manager_Name," which is not directly part of the primary key.
To bring this table into 3NF, we split it into two tables:
Table 1: Employee
Employee_ID Department Manager_ID
101 Sales 1
102 HR 2
Table 2: Managers
Manager_ID Manager_Name Manager_Department
1 John Sales
2 Jane HR
Now, "Manager_Department" is directly dependent on the primary key in the "Managers" table, meeting the requirements of 3NF.
Comments
Post a Comment