Functional dependencies (FDs) are a fundamental concept in relational database design. They describe the relationship between attributes (columns) in a relation (table). Understanding FDs is crucial for achieving database normalization, which helps reduce data redundancy and improve data integrity.
A functional dependency, denoted as X → Y
, states that for any two tuples (rows) r1
and r2
in a relation R
, if r1[X] = r2[X]
, then it must be that r1[Y] = r2[Y]
. In simpler terms, the value(s) of attribute(s) in set X
uniquely determine the value(s) of attribute(s) in set Y
.
X
is called the determinant.Y
is called the dependent.Functional dependencies can be categorized based on the attributes involved:
Y
is a subset of X
(Y ⊆ X
), then X → Y
is a trivial FD. These are always true and don't provide new information about the data's constraints. For example, if we have attributes {A, B}
, then {A, B} → A
is trivial.Y
is not a subset of X
, then X → Y
is a non-trivial FD. These are the dependencies that carry meaningful information about the data.X → Y
is a full functional dependency if no proper subset X' ⊂ X
functionally determines Y
. That is, for all A ∈ X
, (X - {A}) ¬→ Y
.Y
) is functionally determined by only a part of the determinant (X
). This typically occurs when the determinant is a composite key.A → B
and B → C
, where A
is the primary key, B
is not part of the primary key, and C
is not part of the primary key. (A → B
and B → C
, and B ¬→ A
). More formally, if X → Y
and Y → Z
, and Y
is not a superkey, then X → Z
is a transitive dependency.Armstrong's axioms are a set of rules that allow us to infer all valid functional dependencies from a given set of FDs. These rules are complete, meaning any FD that logically follows from a set of FDs can be derived using these axioms.
Y ⊆ X
, then X → Y
. (Trivial FDs)X → Y
and W
is any set of attributes, then XW → YW
.X → Y
and Y → Z
, then X → Z
.From these three, we can derive others:
X → Y
and X → Z
, then X → YZ
.X → YZ
, then X → Y
and X → Z
.X → Y
and WZ → V
, and Y
and W
are disjoint, and Y ⊆ W
, then XZ → V
.Consider a relation Students
with attributes: {StudentID, StudentName, CourseID, CourseName, InstructorID, InstructorName}
.
Suppose we have the following functional dependencies:
StudentID → StudentName
(A student ID uniquely determines a student's name)CourseID → CourseName
(A course ID uniquely determines a course's name){StudentID, CourseID} → InstructorID
(A specific student in a specific course is assigned a specific instructor)InstructorID → InstructorName
(An instructor ID uniquely determines an instructor's name)StudentID → StudentName
is a non-trivial FD.CourseID → CourseName
is a non-trivial FD.{StudentID, CourseID} → InstructorID
is a non-trivial FD.InstructorID → InstructorName
is a non-trivial FD.Let's look for potential issues related to normalization:
Consider the FD {StudentID, CourseID} → InstructorID
. If we can determine InstructorID
from StudentID
alone, or from CourseID
alone, then this is a partial dependency.
Suppose we also know that StudentID → InstructorID
. This would be a partial dependency because InstructorID
is determined by only part of the composite key {StudentID, CourseID}
.
Now, consider the FD StudentID → InstructorName
. We know StudentID → InstructorID
and InstructorID → InstructorName
. By transitivity, we can infer StudentID → InstructorName
. Since StudentID
is not a superkey for the whole relation, this represents a transitive dependency.
Functional dependencies are the bedrock of database normalization:
X → Y
, X
must be a superkey.By analyzing and enforcing functional dependencies, we can systematically normalize a database schema to achieve optimal structure.