Articles

What is First Normal Form (1NF) in DBMS? Explained with Examples

  • Use SQL to create, access, and update tables of data in a relational database.
    • Beginner Friendly.
      2 hours
  • In this SQL course, you'll learn how to manage large datasets and analyze real data using the standard data management language.
    • Beginner Friendly.
      5 hours

What is normalization in DBMS?

Database normalization is a process we use to organize columns and tables in relational databases to reduce data redundancy and improve data integrity. Normalization helps us logically structure the relationships between different columns in a table and between two tables in a database for efficient data storage.

During normalization, we divide a large table with composite attributes into one or more tables with simpler attributes. For this, we use several rules defined by different normal forms like first normal form (1NF), second normal form (2NF), third normal form (3NF), and more. All these normal forms define rules to remove redundancy and dependency issues from the tables.

By progressively applying these normal forms, we can make the database tables more structured and resilient to anomalies. This process begins with the first normal form (1NF). Let’s discuss what the first normal form is.

What is first normal form (1NF)?

The first normal form (1NF) is the first level of normalization. A table is said to be in 1NF if it consists of only atomic values. Atomic values are indivisible, single values that cannot be further broken down while still retaining meaning. If a table contains multiple values or composite values in a column, it violates 1NF. To decide whether or not a table is in 1NF, we use four rules, as discussed below:

Four rules for 1NF

We consider a table to be in 1NF if it follows the following rules:

  1. Atomicity of Data: All the values in a table must be atomic or indivisible. It must not contain composite values or lists within a single column. For example, if a table has a column subjects containing multiple comma-separated subject names, the table will not be in 1NF as we can split the list into multiple values. Similarly, a table with address as an attribute cannot be in 1NF as we can split the address column into attributes like street, city, state, country, and ZIP code.
  2. Uniqueness of Attributes: Each column must have a unique name and store a single type of data. For instance, we cannot have three columns named subject_1, subject_2, and subject_3 in the same table if we want the table to be in 1NF. This is because all these columns contain the same type of data.
  3. Same domain values in a column: Each column in a table must store the same type of data for the table to be in 1NF. We cannot put the marks of a student in the subject column. The subject column must only store the subject name, and marks should only be stored in a separate marks column.
  4. Distinct Unordered Rows: In 1NF, the order of rows doesn’t affect the meaning of the data, and each row in the table must be unique.

Now that we have discussed the rules a table must satisfy to be in the first normal form, let’s discuss the steps to convert a table to 1NF.

Steps to convert a table to 1st normal form

We can convert any database table into the first normal form in a four-step process. In this process, we modify the table in each step so that the modified table satisfies one of the rules for 1NF. After applying all the four steps, we get a table that satisfies all the four rules for 1NF.

To understand this four-step process, consider we have the following table:

Student_ID Name Subject_1 Subject_2 Marks Address
101 Alice Math Science Math: 85, Science: 85 123 Main St, New York, NY, 10001
102 Bob English History English: 78, History: 85 456 Oak St, Los Angeles, CA, 90002
101 Alice Math Science Math: 85, Science: 85 123 Main St, New York, NY, 10001
103 Charlie Physics Chemistry Physics: 82,Chemistry: 89 789 Pine St, Houston, TX, 77003
104 David Biology NULL Biology: 91 321 Elm St, Miami, FL, 33101
105 Emma History Math History: 87, Math: 92 654 Maple St, Chicago, IL, 60607

This table has composite values, contains multiple values in a column cell, and has similar column names. Hence, the table isn’t in the first normal form. Let’s convert this table into 1NF in a stepwise manner.

Step 1: Ensure atomicity of data

The Marks column in the table contains multiple values in a single cell (e.g., “Math: 85, Science: 85”). Also, the Address column has composite values, i.e. (Street, City, State, and ZIP are all combined). To ensure atomicity of the data, we will use the following operations:

  • Separate the Marks column into two columns, namely Subject_1_Marks and Subject_2_Marks.
  • Split the Address column into separate columns i.e. Street, City, State, and ZIP.

After applying these steps to the original table, we will get the following table:

Student_ID Name Subject_1 Subject_2 Subject_1_Marks Subject_2_Marks Street City State ZIP
101 Alice Math Science 85 85 123 Main St New York NY 10001
102 Bob English History 78 85 456 Oak St Los Angeles CA 90002
101 Alice Math Science 85 85 123 Main St New York NY 10001
103 Charlie Physics Chemistry 82 89 789 Pine St Houston TX 77003
104 David Biology NULL 91 NULL 321 Elm St Miami FL 33101
105 Emma History Math 87 92 654 Maple St Chicago IL 60607

You can see that all the columns in this table contain atomic values. Now, we will apply the rule for the uniqueness of attributes to further modify the table.

Step 2: Ensure uniqueness of attributes

After step 1, the columns Subject_1 and Subject_2 store the same type of data, i.e., subject names in different columns. Similarly, Subject_1_Marks and Subject_2_Marks are also of the same type. To ensure the uniqueness of the attributes, we will create separate rows for each subject per student, as shown below:

Student_ID Name Subject Marks Street City State ZIP
101 Alice Math 85 123 Main St New York NY 10001
101 Alice Science 85 123 Main St New York NY 10001
102 Bob English 78 456 Oak St Los Angeles CA 90002
102 Bob History 85 456 Oak St Los Angeles CA 90002
101 Alice Math 85 123 Main St New York NY 10001
101 Alice Science 85 123 Main St New York NY 10001
103 Charlie Physics 82 789 Pine St Houston TX 77003
103 Charlie Chemistry 89 789 Pine St Houston TX 77003
104 David Biology 91 321 Elm St Miami FL 33101
105 Emma History 87 654 Maple St Chicago IL 60607
105 Emma Math 92 654 Maple St Chicago IL 60607

In this table, you can see that no two columns have the same type of data. Hence, we have ensured the atomicity of data and the uniqueness of attributes. Now, let’s move to the next step.

Step 3: Ensure the same domain values in a column

In step 3, we need to ensure that all the values in a column belong to the same domain. As all the columns in the table have the same type of values in all the rows, this rule is already satisfied. So, we will move to the next step.

Step 4: Ensure distinct unordered rows

In this step, we need to ensure that the table contains unique rows. As we have duplicate rows for student ID 101, we will remove the duplicate rows and ensure distinct unordered rows, as shown below:

Student_ID Name Subject Marks Street City State ZIP
101 Alice Math 85 123 Main St New York NY 10001
101 Alice Science 85 123 Main St New York NY 10001
102 Bob English 78 456 Oak St Los Angeles CA 90002
102 Bob History 85 456 Oak St Los Angeles CA 90002
103 Charlie Physics 82 789 Pine St Houston TX 77003
103 Charlie Chemistry 89 789 Pine St Houston TX 77003
104 David Biology 91 321 Elm St Miami FL 33101
105 Emma History 87 654 Maple St Chicago IL 60607
105 Emma Math 92 654 Maple St Chicago IL 60607

This table satisfies all four rules for the first normal form. Hence, we have successfully converted the input table to 1NF. In real-world scenarios, you may need to apply one or more steps to ensure the table is in the first normal form. However, it’s recommended to follow the steps in the order discussed to simplify the process.

Conclusion

Normalization is crucial in database design as it ensures data integrity and reduces redundancy. The first normal form is the first step in normalizing the database tables. 1NF is the foundational step focusing on the atomicity of values and the uniqueness of rows and columns. By following the steps to convert a table to 1NF, you can improve the efficiency and consistency of your database. To improve them further, you can apply higher normal forms like the second normal form (2NF), third normal form (3NF), Boyce-Codd normal form (BCNF), fourth normal form (4NF), etc, in a progression.

To learn more about databases, you can take up this introduction to SQL course on Codecademy. You might also like this course on designing databases with PostgreSQL.

Frequently asked questions

1. What is the difference between 1NF and 2NF?

First Normal Form (1NF) focuses on eliminating repeating groups and ensuring atomic values in each cell. Second Normal Form (2NF) builds upon 1NF by additionally eliminating partial dependencies - it requires that all non-key attributes be fully functionally dependent on the entire primary key, not just part of it.

2. How do you know if a table is in first normal form?

A table is in first normal form if it meets all four criteria: contains only atomic (indivisible) values, has unique column names with consistent data types, stores the same type of data in each column, and contains only distinct rows with no duplicates.

3. What are common 1NF violations?

Common 1NF violations include: storing multiple values in a single cell (like “Math, Science” in one column), using composite attributes (like combining street, city, and state in one address field), having duplicate column names, and storing different data types in the same column.

4. Can a table be in 1NF but not 2NF?

Yes, a table can satisfy 1NF requirements but still violate 2NF rules. This happens when the table has atomic values and unique rows (1NF) but contains partial dependencies where some non-key attributes depend on only part of a composite primary key (violating 2NF).

5. Why is 1NF important in database design?

1NF is crucial because it establishes the foundation for all database normalization. It eliminates data redundancy, prevents update anomalies, ensures data consistency, and makes the database more efficient for querying and maintenance. Without 1NF, higher normal forms (2NF, 3NF) cannot be achieved.

6. What is an example of 1NF violation?

A classic 1NF violation is a student table with a “Subjects” column containing “Math, Science, History” in a single cell. This violates atomicity because the cell contains multiple values that can be further divided. The correct 1NF approach would be to separate rows for each subject per student.

Codecademy Team

'The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.'

Meet the full team

Learn more on Codecademy

  • Use SQL to create, access, and update tables of data in a relational database.
    • Beginner Friendly.
      2 hours
  • In this SQL course, you'll learn how to manage large datasets and analyze real data using the standard data management language.
    • Beginner Friendly.
      5 hours
  • Learn how to query SQL databases and design relational databases to efficiently store large quantities of data.
    • Includes 5 Courses
    • With Certificate
    • Beginner Friendly.
      13 hours