Functional dependencies (FDs) are a fundamental concept in relational database design. They describe the relationship between attributes (columns) in a relation (table). Understanding FDs is crucial for achieving database normalization, which helps reduce data redundancy and improve data integrity.
A functional dependency, denoted as X → Y, states that for any two tuples (rows) r1 and r2 in a relation R, if r1[X] = r2[X], then it must be that r1[Y] = r2[Y]. In simpler terms, the value(s) of attribute(s) in set X uniquely determine the value(s) of attribute(s) in set Y.
X is called the determinant.Y is called the dependent.Functional dependencies can be categorized based on the attributes involved:
Y is a subset of X (Y ⊆ X), then X → Y is a trivial FD. These are always true and don't provide new information about the data's constraints. For example, if we have attributes {A, B}, then {A, B} → A is trivial.Y is not a subset of X, then X → Y is a non-trivial FD. These are the dependencies that carry meaningful information about the data.X → Y is a full functional dependency if no proper subset X' ⊂ X functionally determines Y. That is, for all A ∈ X, (X - {A}) ¬→ Y.Y) is functionally determined by only a part of the determinant (X). This typically occurs when the determinant is a composite key.A → B and B → C, where A is the primary key, B is not part of the primary key, and C is not part of the primary key. (A → B and B → C, and B ¬→ A). More formally, if X → Y and Y → Z, and Y is not a superkey, then X → Z is a transitive dependency.Armstrong's axioms are a set of rules that allow us to infer all valid functional dependencies from a given set of FDs. These rules are complete, meaning any FD that logically follows from a set of FDs can be derived using these axioms.
Y ⊆ X, then X → Y. (Trivial FDs)X → Y and W is any set of attributes, then XW → YW.X → Y and Y → Z, then X → Z.From these three, we can derive others:
X → Y and X → Z, then X → YZ.X → YZ, then X → Y and X → Z.X → Y and WZ → V, and Y and W are disjoint, and Y ⊆ W, then XZ → V.Consider a relation Students with attributes: {StudentID, StudentName, CourseID, CourseName, InstructorID, InstructorName}.
Suppose we have the following functional dependencies:
StudentID → StudentName (A student ID uniquely determines a student's name)CourseID → CourseName (A course ID uniquely determines a course's name){StudentID, CourseID} → InstructorID (A specific student in a specific course is assigned a specific instructor)InstructorID → InstructorName (An instructor ID uniquely determines an instructor's name)StudentID → StudentName is a non-trivial FD.CourseID → CourseName is a non-trivial FD.{StudentID, CourseID} → InstructorID is a non-trivial FD.InstructorID → InstructorName is a non-trivial FD.Let's look for potential issues related to normalization:
Consider the FD {StudentID, CourseID} → InstructorID. If we can determine InstructorID from StudentID alone, or from CourseID alone, then this is a partial dependency.
Suppose we also know that StudentID → InstructorID. This would be a partial dependency because InstructorID is determined by only part of the composite key {StudentID, CourseID}.
Now, consider the FD StudentID → InstructorName. We know StudentID → InstructorID and InstructorID → InstructorName. By transitivity, we can infer StudentID → InstructorName. Since StudentID is not a superkey for the whole relation, this represents a transitive dependency.
Functional dependencies are the bedrock of database normalization:
X → Y, X must be a superkey.By analyzing and enforcing functional dependencies, we can systematically normalize a database schema to achieve optimal structure.