4.2. Normalization 1st Normal Form

  • Atomic Values

  • Unique Identifiers (rows uniquely identified by PRIMARY KEY)

  • Entries in a column are same type

4.2.1. 1st Normal Form

  • A cell must never contain more than one value

  • Each row must be unique

  • Each column name must be unique

  • There must be no repeating groups

4.2.2. Atomic Values

  • Cell with single values

  • A cell must never contain more than one value

  • Each value needs to be in its own column

Table 4.14. Assignments (problem)

name

role

Melissa Lewis

Commander

Mark Watney

Botanist

Rick Martinez

Pilot

Note that in this example astronaut's firstname and lastname is in the same column (name).

Table 4.15. Assignments (solution)

firstname

lastname

role

Melissa

Lewis

Commander

Mark

Watney

Botanist

Rick

Martinez

Pilot

../../_images/normalform-1st-atomicvalues.png

Figure 4.10. A cell must never contain more than one value (each value needs to be in its own column) Image credits: [1]

4.2.3. Unique Rows

  • Each row must be unique

  • One column, or combination of columns must uniquely identify a row

  • Also known as PRIMARY KEY

Table 4.16. assignments (problem)

firstname

lastname

Melissa

Lewis

Mark

Watney

Mark

Watney

Note, that this in problematic example we can have duplicate names, for example there could be more than one Mark Watney in the world.

Table 4.17. assignments (solution)

id (PK)

firstname

lastname

1

Melissa

Lewis

2

Mark

Watney

3

Mark

Watney

Now, even if we have two Mark Watneys, each of them will have a unique identifier (id).

../../_images/normalform-1st-primarykey.png

Figure 4.11. Each row must be unique (one column, or combination of columns must uniquely identify a row also known as PRIMARY KEY). Image credits: [1]

4.2.4. Unique Column Names

  • Each Column Name Must be unique

Table 4.18. assignments (problem)

name

year

name

Melissa Lewis

2035

Ares3

Mark Watney

2035

Ares3

Rick Martinez

2035

Ares3

Some column names are very (such as name) to the point, that there could be several independent columns which we can call by name. Each column should have unique name.

Table 4.19. assignments (solution)

astronaut

year

mission

Melissa Lewis

2035

Ares3

Mark Watney

2035

Ares3

Rick Martinez

2035

Ares3

../../_images/normalform-1st-uniquecolumnnames.png

Figure 4.12. Each column name must be unique. Image credits: [1]

4.2.5. No Repeating Groups

  • There must be no repeating groups

Table 4.20. Assignments (problem)

astronaut

year1

mission1

year2

mission2

Melissa Lewis

2031

Ares1

2035

Ares3

Mark Watney

2031

Ares1

2035

Ares3

Rick Martinez

2031

Ares1

2035

Ares3

../../_images/normalform-1st-norepeatinggroups.png

Figure 4.13. There must be no repeating groups. Image credits: [1]

4.2.6. Recap

../../_images/normalform-1st-summary.png

Figure 4.14. If both Atomic Values and Unique Identifiers rules are applied the database achieved 1st Normal Form. Image credits: [1]

4.2.7. References