Functional Dependencies

Functional dependencies (FDs) are a fundamental concept in relational database design. They describe the relationship between attributes (columns) in a relation (table). Understanding FDs is crucial for achieving database normalization, which helps reduce data redundancy and improve data integrity.

Definition

A functional dependency, denoted as X → Y, states that for any two tuples (rows) r1 and r2 in a relation R, if r1[X] = r2[X], then it must be that r1[Y] = r2[Y]. In simpler terms, the value(s) of attribute(s) in set X uniquely determine the value(s) of attribute(s) in set Y.

X is called the determinant.
Y is called the dependent.
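
The definition above can be checked against sample data. The following is a minimal sketch, assuming rows are stored as Python dictionaries; the helper name `fd_holds` and the sample rows are illustrative only. Note that a sample can only refute an FD: an FD is a constraint on every legal state of the relation, so passing this check on one sample does not prove that the FD holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}  # maps each lhs value combination to the rhs values first seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same lhs values, different rhs values
    return True

# Hypothetical sample rows, for illustration only.
rows = [
    {"StudentID": 1, "StudentName": "Ana", "CourseID": "C1"},
    {"StudentID": 1, "StudentName": "Ana", "CourseID": "C2"},
    {"StudentID": 2, "StudentName": "Ben", "CourseID": "C1"},
]

print(fd_holds(rows, ["StudentID"], ["StudentName"]))  # True
print(fd_holds(rows, ["StudentID"], ["CourseID"]))     # False
```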

Types of Functional Dependencies

Functional dependencies can be categorized based on the attributes involved:

Trivial Functional Dependency: If Y is a subset of X (Y ⊆ X), then X → Y is a trivial FD. These are always true and don't provide new information about the data's constraints. For example, if we have attributes {A, B}, then {A, B} → A is trivial.
Non-Trivial Functional Dependency: If Y is not a subset of X, then X → Y is a non-trivial FD. These are the dependencies that carry meaningful information about the data.
Full Functional Dependency: An FD X → Y is a full functional dependency if no proper subset X' ⊂ X functionally determines Y. That is, for all A ∈ X, (X - {A}) ¬→ Y.
Partial Functional Dependency: An FD X → Y is a partial dependency if some proper subset X' ⊂ X already determines Y (X' → Y), so that Y depends on only part of the determinant. This typically matters when X is a composite key.
Transitive Functional Dependency: An FD X → Z is a transitive dependency if there is a set of attributes Y such that X → Y and Y → Z, Y does not determine X (Y ¬→ X), and Z is not contained in Y. The case that matters for normalization is A → B and B → C, where A is the primary key and B and C are not part of it: C then depends on A only through B.
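
The trivial, full, and partial categories above can be tested mechanically once a set of FDs is given. This is a rough sketch, assuming each FD is represented as a pair of Python sets; the names `closure`, `is_trivial`, and `is_full` are illustrative. It relies on the attribute closure (all attributes determined by a given set), and checks fullness by removing one attribute at a time, as in the definition of a full dependency.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_trivial(lhs, rhs):
    """X -> Y is trivial when Y is a subset of X."""
    return set(rhs) <= set(lhs)

def is_full(lhs, rhs, fds):
    """X -> Y is full when no X - {A} still determines Y; otherwise it is partial."""
    lhs, rhs = set(lhs), set(rhs)
    return all(not rhs <= closure(lhs - {a}, fds) for a in lhs)

# Illustrative FDs: each pair is (determinant, dependent).
fds = [
    ({"StudentID"}, {"StudentName"}),
    ({"StudentID", "CourseID"}, {"InstructorID"}),
]
key = {"StudentID", "CourseID"}

print(is_trivial({"A", "B"}, {"A"}))          # True
print(is_full(key, {"InstructorID"}, fds))    # True  (full dependency)
print(is_full(key, {"StudentName"}, fds))     # False (partial dependency)
```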

Inference Rules (Armstrong's Axioms)

Armstrong's axioms are a set of inference rules for deriving functional dependencies from a given set of FDs. They are sound (every FD they derive is implied by the given set) and complete (every FD implied by the given set can be derived with them).

Reflexivity: If Y ⊆ X, then X → Y. (Trivial FDs)
Augmentation: If X → Y and W is any set of attributes, then XW → YW.
Transitivity: If X → Y and Y → Z, then X → Z.

From these three, we can derive others:

Union: If X → Y and X → Z, then X → YZ.
Decomposition: If X → YZ, then X → Y and X → Z.
Pseudotransitivity: If X → Y and YZ → V, then XZ → V. (Augment X → Y with Z to get XZ → YZ, then apply transitivity.)
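
In practice, whether an FD follows from a given set is usually decided by computing an attribute closure rather than by chaining the rules by hand: X → Y follows exactly when Y is contained in the closure of X. A minimal sketch, assuming FDs are pairs of Python sets; the function names `closure` and `implies` and the attributes A to D are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:  # repeat until no FD adds anything new
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds."""
    return set(rhs) <= closure(lhs, fds)

fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"D"})]

print(sorted(closure({"A"}, fds)))       # ['A', 'B', 'C', 'D']
print(implies(fds, {"A"}, {"C"}))        # True  (transitivity)
print(implies(fds, {"A"}, {"C", "D"}))   # True  (union)
print(implies(fds, {"C"}, {"A"}))        # False
```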

Example

Consider a relation Students with attributes: {StudentID, StudentName, CourseID, CourseName, InstructorID, InstructorName}.

Suppose we have the following functional dependencies:

StudentID → StudentName (A student ID uniquely determines a student's name)
CourseID → CourseName (A course ID uniquely determines a course's name)
{StudentID, CourseID} → InstructorID (A specific student in a specific course is assigned a specific instructor)
InstructorID → InstructorName (An instructor ID uniquely determines an instructor's name)

Analysis:

StudentID → StudentName is a non-trivial FD.
CourseID → CourseName is a non-trivial FD.
{StudentID, CourseID} → InstructorID is a non-trivial FD.
InstructorID → InstructorName is a non-trivial FD.

Let's look for potential issues related to normalization:

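Together, StudentID and CourseID determine every other attribute, and neither does so alone, so {StudentID, CourseID} is the candidate key of Students.

Partial dependencies: StudentID → StudentName and CourseID → CourseName each determine a non-key attribute from only part of the composite key, so both are partial dependencies. The FD {StudentID, CourseID} → InstructorID, by contrast, is a full dependency as long as InstructorID cannot be determined from StudentID alone or from CourseID alone. If we also knew that StudentID → InstructorID, then InstructorID would be partially dependent on the key as well.

Transitive dependencies: We know {StudentID, CourseID} → InstructorID and InstructorID → InstructorName. By transitivity, we can infer {StudentID, CourseID} → InstructorName. Since InstructorID is a non-key attribute that does not determine the key, InstructorName depends on the key only through InstructorID, which is a transitive dependency.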
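
The analysis of the Students relation can be reproduced with a short script. This is a sketch under stated assumptions: it uses only the four FDs listed in the example (not the hypothetical StudentID → InstructorID), finds candidate keys by brute force, and uses a simplified transitive-dependency check that is adequate for a single candidate key. All function names are illustrative.

```python
from itertools import combinations

ATTRS = {"StudentID", "StudentName", "CourseID", "CourseName",
         "InstructorID", "InstructorName"}
FDS = [
    ({"StudentID"}, {"StudentName"}),
    ({"CourseID"}, {"CourseName"}),
    ({"StudentID", "CourseID"}, {"InstructorID"}),
    ({"InstructorID"}, {"InstructorName"}),
]

def closure(attrs, fds):
    """All attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Brute force: minimal attribute sets whose closure is all of attrs."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attrs:
                keys.append(candidate)
    return keys

keys = candidate_keys(ATTRS, FDS)
key = keys[0]  # assumption: the relation has a single candidate key

# Non-key attributes determined by a proper part of the key.
partial = set()
for attr in key:
    partial |= closure(key - {attr}, FDS) - key

# Simplified check: attributes determined by a non-key, non-superkey determinant.
transitive = set()
for lhs, rhs in FDS:
    if lhs.isdisjoint(key) and closure(lhs, FDS) != ATTRS:
        transitive |= rhs - lhs

print([sorted(k) for k in keys])  # [['CourseID', 'StudentID']]
print(sorted(partial))            # ['CourseName', 'StudentName']
print(sorted(transitive))         # ['InstructorName']
```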

Importance in Normalization

Functional dependencies are the bedrock of database normalization:

First Normal Form (1NF): Addresses repeating groups and atomic values.
Second Normal Form (2NF): Requires 1NF and eliminates partial dependencies. Every non-key attribute (one that belongs to no candidate key) must be fully functionally dependent on each candidate key.
Third Normal Form (3NF): Requires 2NF and eliminates transitive dependencies. No non-key attribute may be transitively dependent on a candidate key.
Boyce-Codd Normal Form (BCNF): A stricter version of 3NF. For every non-trivial FD X → Y, X must be a superkey.
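
The BCNF condition lends itself to a direct check: an FD violates BCNF when it is non-trivial and its determinant is not a superkey. The sketch below, with illustrative names, applies this to the Students relation and its four FDs from the example. It inspects only the listed FDs, which is enough to decide whether the original relation is in BCNF, though not to test relations produced by decomposing it.

```python
ATTRS = {"StudentID", "StudentName", "CourseID", "CourseName",
         "InstructorID", "InstructorName"}
FDS = [
    ({"StudentID"}, {"StudentName"}),
    ({"CourseID"}, {"CourseName"}),
    ({"StudentID", "CourseID"}, {"InstructorID"}),
    ({"InstructorID"}, {"InstructorName"}),
]

def closure(attrs, fds):
    """All attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """Listed FDs that are non-trivial and whose determinant is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != attrs]

for lhs, rhs in bcnf_violations(ATTRS, FDS):
    print(sorted(lhs), "->", sorted(rhs))
# ['StudentID'] -> ['StudentName']
# ['CourseID'] -> ['CourseName']
# ['InstructorID'] -> ['InstructorName']
```

Only {StudentID, CourseID} → InstructorID passes, because its determinant is the key; the other three FDs are the ones a decomposition would need to separate into their own relations.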

By analyzing and enforcing functional dependencies, we can systematically normalize a database schema, reducing redundancy and the update anomalies that come with it.
