What is First Normal Form (1NF) in DBMS? Explained with Examples
What is normalization in DBMS?
Database normalization is a process we use to organize columns and tables in relational databases to reduce data redundancy and improve data integrity. Normalization helps us logically structure the relationships between different columns in a table and between two tables in a database for efficient data storage.
During normalization, we divide a large table with composite attributes into one or more tables with simpler attributes. For this, we use several rules defined by different normal forms like first normal form (1NF), second normal form (2NF), third normal form (3NF), and more. All these normal forms define rules to remove redundancy and dependency issues from the tables.
By progressively applying these normal forms, we can make the database tables more structured and resilient to anomalies. This process begins with the first normal form (1NF). Let’s discuss what the first normal form is.
What is first normal form (1NF)?
The first normal form (1NF) is the first level of normalization. A table is said to be in 1NF if it consists of only atomic values. Atomic values are indivisible, single values that cannot be further broken down while still retaining meaning. If a table contains multiple values or composite values in a column, it violates 1NF. To decide whether or not a table is in 1NF, we use four rules, as discussed below:
Four rules for 1NF
We consider a table to be in 1NF if it follows the following rules:
- Atomicity of Data: All the values in a table must be atomic or indivisible. It must not contain composite values or lists within a single column. For example, if a table has a column
subjectscontaining multiple comma-separated subject names, the table will not be in 1NF as we can split the list into multiple values. Similarly, a table withaddressas an attribute cannot be in 1NF as we can split theaddresscolumn into attributes likestreet,city,state,country, andZIP code. - Uniqueness of Attributes: Each column must have a unique name and store a single type of data. For instance, we cannot have three columns named
subject_1,subject_2, andsubject_3in the same table if we want the table to be in 1NF. This is because all these columns contain the same type of data. - Same domain values in a column: Each column in a table must store the same type of data for the table to be in 1NF. We cannot put the marks of a student in the
subjectcolumn. Thesubjectcolumn must only store the subject name, and marks should only be stored in a separatemarkscolumn. - Distinct Unordered Rows: In 1NF, the order of rows doesn’t affect the meaning of the data, and each row in the table must be unique.
Now that we have discussed the rules a table must satisfy to be in the first normal form, let’s discuss the steps to convert a table to 1NF.
Steps to convert a table to 1st normal form
We can convert any database table into the first normal form in a four-step process. In this process, we modify the table in each step so that the modified table satisfies one of the rules for 1NF. After applying all the four steps, we get a table that satisfies all the four rules for 1NF.
To understand this four-step process, consider we have the following table:
| Student_ID | Name | Subject_1 | Subject_2 | Marks | Address |
|---|---|---|---|---|---|
| 101 | Alice | Math | Science | Math: 85, Science: 85 | 123 Main St, New York, NY, 10001 |
| 102 | Bob | English | History | English: 78, History: 85 | 456 Oak St, Los Angeles, CA, 90002 |
| 101 | Alice | Math | Science | Math: 85, Science: 85 | 123 Main St, New York, NY, 10001 |
| 103 | Charlie | Physics | Chemistry | Physics: 82,Chemistry: 89 | 789 Pine St, Houston, TX, 77003 |
| 104 | David | Biology | NULL | Biology: 91 | 321 Elm St, Miami, FL, 33101 |
| 105 | Emma | History | Math | History: 87, Math: 92 | 654 Maple St, Chicago, IL, 60607 |
This table has composite values, contains multiple values in a column cell, and has similar column names. Hence, the table isn’t in the first normal form. Let’s convert this table into 1NF in a stepwise manner.
Step 1: Ensure atomicity of data
The Marks column in the table contains multiple values in a single cell (e.g., “Math: 85, Science: 85”). Also, the Address column has composite values, i.e. (Street, City, State, and ZIP are all combined). To ensure atomicity of the data, we will use the following operations:
- Separate the
Markscolumn into two columns, namelySubject_1_MarksandSubject_2_Marks. - Split the
Addresscolumn into separate columns i.e.Street,City,State, andZIP.
After applying these steps to the original table, we will get the following table:
| Student_ID | Name | Subject_1 | Subject_2 | Subject_1_Marks | Subject_2_Marks | Street | City | State | ZIP |
|---|---|---|---|---|---|---|---|---|---|
| 101 | Alice | Math | Science | 85 | 85 | 123 Main St | New York | NY | 10001 |
| 102 | Bob | English | History | 78 | 85 | 456 Oak St | Los Angeles | CA | 90002 |
| 101 | Alice | Math | Science | 85 | 85 | 123 Main St | New York | NY | 10001 |
| 103 | Charlie | Physics | Chemistry | 82 | 89 | 789 Pine St | Houston | TX | 77003 |
| 104 | David | Biology | NULL | 91 | NULL | 321 Elm St | Miami | FL | 33101 |
| 105 | Emma | History | Math | 87 | 92 | 654 Maple St | Chicago | IL | 60607 |
You can see that all the columns in this table contain atomic values. Now, we will apply the rule for the uniqueness of attributes to further modify the table.
Step 2: Ensure uniqueness of attributes
After step 1, the columns Subject_1 and Subject_2 store the same type of data, i.e., subject names in different columns. Similarly, Subject_1_Marks and Subject_2_Marks are also of the same type. To ensure the uniqueness of the attributes, we will create separate rows for each subject per student, as shown below:
| Student_ID | Name | Subject | Marks | Street | City | State | ZIP |
|---|---|---|---|---|---|---|---|
| 101 | Alice | Math | 85 | 123 Main St | New York | NY | 10001 |
| 101 | Alice | Science | 85 | 123 Main St | New York | NY | 10001 |
| 102 | Bob | English | 78 | 456 Oak St | Los Angeles | CA | 90002 |
| 102 | Bob | History | 85 | 456 Oak St | Los Angeles | CA | 90002 |
| 101 | Alice | Math | 85 | 123 Main St | New York | NY | 10001 |
| 101 | Alice | Science | 85 | 123 Main St | New York | NY | 10001 |
| 103 | Charlie | Physics | 82 | 789 Pine St | Houston | TX | 77003 |
| 103 | Charlie | Chemistry | 89 | 789 Pine St | Houston | TX | 77003 |
| 104 | David | Biology | 91 | 321 Elm St | Miami | FL | 33101 |
| 105 | Emma | History | 87 | 654 Maple St | Chicago | IL | 60607 |
| 105 | Emma | Math | 92 | 654 Maple St | Chicago | IL | 60607 |
In this table, you can see that no two columns have the same type of data. Hence, we have ensured the atomicity of data and the uniqueness of attributes. Now, let’s move to the next step.
Step 3: Ensure the same domain values in a column
In step 3, we need to ensure that all the values in a column belong to the same domain. As all the columns in the table have the same type of values in all the rows, this rule is already satisfied. So, we will move to the next step.
Step 4: Ensure distinct unordered rows
In this step, we need to ensure that the table contains unique rows. As we have duplicate rows for student ID 101, we will remove the duplicate rows and ensure distinct unordered rows, as shown below:
| Student_ID | Name | Subject | Marks | Street | City | State | ZIP |
|---|---|---|---|---|---|---|---|
| 101 | Alice | Math | 85 | 123 Main St | New York | NY | 10001 |
| 101 | Alice | Science | 85 | 123 Main St | New York | NY | 10001 |
| 102 | Bob | English | 78 | 456 Oak St | Los Angeles | CA | 90002 |
| 102 | Bob | History | 85 | 456 Oak St | Los Angeles | CA | 90002 |
| 103 | Charlie | Physics | 82 | 789 Pine St | Houston | TX | 77003 |
| 103 | Charlie | Chemistry | 89 | 789 Pine St | Houston | TX | 77003 |
| 104 | David | Biology | 91 | 321 Elm St | Miami | FL | 33101 |
| 105 | Emma | History | 87 | 654 Maple St | Chicago | IL | 60607 |
| 105 | Emma | Math | 92 | 654 Maple St | Chicago | IL | 60607 |
This table satisfies all four rules for the first normal form. Hence, we have successfully converted the input table to 1NF. In real-world scenarios, you may need to apply one or more steps to ensure the table is in the first normal form. However, it’s recommended to follow the steps in the order discussed to simplify the process.
Conclusion
Normalization is crucial in database design as it ensures data integrity and reduces redundancy. The first normal form is the first step in normalizing the database tables. 1NF is the foundational step focusing on the atomicity of values and the uniqueness of rows and columns. By following the steps to convert a table to 1NF, you can improve the efficiency and consistency of your database. To improve them further, you can apply higher normal forms like the second normal form (2NF), third normal form (3NF), Boyce-Codd normal form (BCNF), fourth normal form (4NF), etc, in a progression.
To learn more about databases, you can take up this introduction to SQL course on Codecademy. You might also like this course on designing databases with PostgreSQL.
Frequently asked questions
1. What is the difference between 1NF and 2NF?
First Normal Form (1NF) focuses on eliminating repeating groups and ensuring atomic values in each cell. Second Normal Form (2NF) builds upon 1NF by additionally eliminating partial dependencies - it requires that all non-key attributes be fully functionally dependent on the entire primary key, not just part of it.
2. How do you know if a table is in first normal form?
A table is in first normal form if it meets all four criteria: contains only atomic (indivisible) values, has unique column names with consistent data types, stores the same type of data in each column, and contains only distinct rows with no duplicates.
3. What are common 1NF violations?
Common 1NF violations include: storing multiple values in a single cell (like “Math, Science” in one column), using composite attributes (like combining street, city, and state in one address field), having duplicate column names, and storing different data types in the same column.
4. Can a table be in 1NF but not 2NF?
Yes, a table can satisfy 1NF requirements but still violate 2NF rules. This happens when the table has atomic values and unique rows (1NF) but contains partial dependencies where some non-key attributes depend on only part of a composite primary key (violating 2NF).
5. Why is 1NF important in database design?
1NF is crucial because it establishes the foundation for all database normalization. It eliminates data redundancy, prevents update anomalies, ensures data consistency, and makes the database more efficient for querying and maintenance. Without 1NF, higher normal forms (2NF, 3NF) cannot be achieved.
6. What is an example of 1NF violation?
A classic 1NF violation is a student table with a “Subjects” column containing “Math, Science, History” in a single cell. This violates atomicity because the cell contains multiple values that can be further divided. The correct 1NF approach would be to separate rows for each subject per student.
'The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.'
Meet the full teamRelated articles
- Article
What is Normalization in DBMS? Explained with Examples
Learn what normalization in DBMS is. Complete guide to 1NF, 2NF, 3NF, BCNF, 4NF, and 5NF with examples and practical applications. - Article
Common SQL Interview Questions
Practice with some common SQL interview questions. - Article
What is a Relational Database (RDBMS)?
Learn what relational databases are, how SQL is used in RDBMS, and explore the key components like tables, keys, and relationships with examples.
Learn more on Codecademy
- Use SQL to create, access, and update tables of data in a relational database.
- Beginner Friendly.2 hours
- In this SQL course, you'll learn how to manage large datasets and analyze real data using the standard data management language.
- Beginner Friendly.5 hours
- Learn how to query SQL databases and design relational databases to efficiently store large quantities of data.
- Includes 5 Courses
- With Certificate
- Beginner Friendly.13 hours