Last Updated: October 26, 2023
Version: SQL Server 2022
Understanding Primary and Foreign Keys in SQL
Primary and Foreign Keys are fundamental concepts in relational database design. They are used to define relationships between tables and ensure data integrity. Understanding these concepts is crucial for building robust and efficient databases.
What are Database Keys?
In a relational database, tables are structured to store specific types of data. Keys are special columns or sets of columns that serve unique purposes:
- Uniquely Identifying Rows: Keys help to identify individual records within a table.
- Establishing Relationships: They link records in one table to records in another table.
- Enforcing Data Integrity: Keys ensure that the data entered into the database adheres to specific rules, preventing inconsistencies.
Primary Keys
A Primary Key is a column or a set of columns that uniquely identifies each record in a table. It has the following characteristics:
- Uniqueness: Every value in the primary key column(s) must be unique. No two rows can have the same primary key value.
- Non-Nullability: A primary key column cannot contain NULL values. Every record must have a primary key.
- One per Table: A table can have only one primary key, although it can consist of multiple columns (a composite primary key).
Primary keys are essential for retrieving specific records and for establishing relationships with other tables.
Example: Defining a Primary Key
Consider a table for `Customers`:
CREATE TABLE Customers (
    CustomerID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    Email VARCHAR(100) UNIQUE
);
In this example, CustomerID is the primary key. Each customer will have a unique CustomerID, and this value cannot be null.
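A composite primary key, mentioned above, can be sketched in the same way. The OrderItems table and its columns are hypothetical and are not part of the schema used in this article; they only illustrate that uniqueness applies to the combination of columns rather than to each column separately.

```sql
-- Hypothetical table: one row per product within an order.
CREATE TABLE OrderItems (
    OrderID   INT NOT NULL,
    ProductID INT NOT NULL,
    Quantity  INT NOT NULL,
    -- The pair (OrderID, ProductID) must be unique; each column alone may repeat.
    CONSTRAINT PK_OrderItems PRIMARY KEY (OrderID, ProductID)
);

-- Accepted: first row for order 1.
INSERT INTO OrderItems (OrderID, ProductID, Quantity) VALUES (1, 100, 2);

-- Accepted: same order, different product.
INSERT INTO OrderItems (OrderID, ProductID, Quantity) VALUES (1, 101, 1);

-- Rejected with a primary key violation: the pair (1, 100) already exists.
INSERT INTO OrderItems (OrderID, ProductID, Quantity) VALUES (1, 100, 5);
```

Naming the constraint (PK_OrderItems here) is optional, but it makes error messages and later ALTER TABLE statements easier to read.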
Foreign Keys
A Foreign Key is a column or a set of columns in one table that references the primary key (or another column set with a UNIQUE constraint) of a table, usually a different one. It establishes a link between the two tables, enforcing referential integrity. This means that every non-NULL value in the foreign key column must exist in the referenced column.
Key characteristics of a foreign key:
- Referential Integrity: Ensures that relationships between tables remain consistent. You cannot add a record with a foreign key value that doesn't exist in the parent table's primary key.
- Can be Null: Unlike primary keys, foreign keys can sometimes be NULL, indicating that the record is not linked to any record in the parent table (depending on the business rules).
- Multiple Foreign Keys: A table can have multiple foreign keys, linking to different tables.
Example: Defining a Foreign Key
Consider a table for `Orders`, which references the `Customers` table:
CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    OrderDate DATE,
    CustomerID INT,
    TotalAmount DECIMAL(10, 2),
    FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
);
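Here, CustomerID in the Orders table is a foreign key referencing the CustomerID (primary key) in the Customers table. This ensures that any CustomerID stored with an order belongs to an existing customer. Because the column is nullable as written, an order can still be saved with no customer at all; declare the column as CustomerID INT NOT NULL if every order must have one.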
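To see the constraint at work, the following sketch inserts a few rows into the Customers and Orders tables defined above. The sample values are made up for illustration, and the statements are meant to be run one at a time so that the rejected insert does not hide the others.

```sql
-- Parent row: the customer must exist before an order can reference it.
INSERT INTO Customers (CustomerID, FirstName, LastName, Email)
VALUES (1, 'Ada', 'Lovelace', 'ada@example.com');

-- Accepted: customer 1 exists.
INSERT INTO Orders (OrderID, OrderDate, CustomerID, TotalAmount)
VALUES (1001, '2023-10-26', 1, 49.99);

-- Rejected with a foreign key violation: there is no customer 999.
INSERT INTO Orders (OrderID, OrderDate, CustomerID, TotalAmount)
VALUES (1002, '2023-10-26', 999, 15.00);

-- Accepted: CustomerID is nullable, and a NULL foreign key is not checked.
INSERT INTO Orders (OrderID, OrderDate, CustomerID, TotalAmount)
VALUES (1003, '2023-10-26', NULL, 15.00);
```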
Constraints and Relationships
Primary and Foreign Keys are implemented as constraints in SQL. These constraints help define the structure and rules of your database.
- Referential Actions: When defining a foreign key, you can specify what happens to the child rows when the referenced parent row is updated or deleted. Common actions include:
  - ON DELETE CASCADE: If a parent row is deleted, corresponding child rows are also deleted.
  - ON UPDATE CASCADE: If a parent row's key value is updated, corresponding child foreign key values are also updated.
  - ON DELETE SET NULL: If a parent row is deleted, the foreign key in the child row is set to NULL. The foreign key column must be nullable for this to work.
  - ON DELETE NO ACTION: Rejects the deletion of a parent row while any child rows still reference it. This is the default when no action is specified. Some other database systems also accept RESTRICT for similar behavior, but SQL Server does not support that keyword.
- Relationship Types:
  - One-to-One: Each record in Table A corresponds to at most one record in Table B, and vice versa. Often implemented by having a foreign key in one table that also has a UNIQUE constraint.
  - One-to-Many: One record in Table A can correspond to many records in Table B, but each record in Table B corresponds to only one record in Table A. This is the most common type, implemented with a foreign key in the "many" side table referencing the primary key of the "one" side table.
  - Many-to-Many: A record in Table A can correspond to many records in Table B, and vice versa. This is typically implemented using an intermediate "junction" or "linking" table that has foreign keys referencing the primary keys of both Table A and Table B.
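As a rough illustration of a one-to-one relationship combined with cascading actions, the sketch below adds a CustomerProfiles table next to the Customers table from the earlier example. The table, its columns, and the sample rows are hypothetical; they are one possible way to apply the ideas above, not a prescribed design.

```sql
-- Hypothetical table: at most one profile per customer.
CREATE TABLE CustomerProfiles (
    ProfileID  INT PRIMARY KEY,
    CustomerID INT NOT NULL UNIQUE,  -- UNIQUE limits each customer to one profile
    Bio        VARCHAR(500),
    CONSTRAINT FK_CustomerProfiles_Customers
        FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
        ON DELETE CASCADE   -- deleting the customer also deletes the profile
        ON UPDATE CASCADE   -- a changed Customers.CustomerID is copied here
);

-- Sample data: a customer with no orders, plus that customer's profile.
INSERT INTO Customers (CustomerID, FirstName, LastName, Email)
VALUES (2, 'Grace', 'Hopper', 'grace@example.com');

INSERT INTO CustomerProfiles (ProfileID, CustomerID, Bio)
VALUES (1, 2, 'Prefers contact by email');

-- Deleting the parent row removes the dependent profile through the cascade.
DELETE FROM Customers WHERE CustomerID = 2;

-- The profile for customer 2 is gone, so this count is 0.
SELECT COUNT(*) AS RemainingProfiles
FROM CustomerProfiles
WHERE CustomerID = 2;
```

Note that the delete succeeds only because customer 2 has no rows in Orders. The foreign key on Orders was declared without an action, so it defaults to NO ACTION and would block the deletion of a customer that still has orders.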
Practical Examples and Use Cases
Let's consider a simple database for a library:
Table: Books
CREATE TABLE Books (
    BookID INT PRIMARY KEY,
    Title VARCHAR(255) NOT NULL,
    AuthorID INT,
    ISBN VARCHAR(20) UNIQUE,
    PublishedYear INT
);
Table: Authors
CREATE TABLE Authors (
    AuthorID INT PRIMARY KEY,
    AuthorName VARCHAR(100) NOT NULL,
    BirthDate DATE
);
Linking Books and Authors (One-to-Many)
The AuthorID in the Books table is a foreign key referencing the AuthorID in the Authors table.
ALTER TABLE Books
    ADD CONSTRAINT FK_BookAuthor
    FOREIGN KEY (AuthorID) REFERENCES Authors(AuthorID)
    ON DELETE SET NULL;
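This constraint ensures that any non-NULL AuthorID in Books refers to an existing author. Because AuthorID is nullable, a book can exist without an author, and if an author is removed from the system, the AuthorID for their books is set to NULL rather than the books themselves being deleted.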
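The effect of ON DELETE SET NULL can be checked with a small experiment against the Books and Authors tables above. The sample rows are illustrative, and the ISBN is a placeholder value rather than a real identifier.

```sql
-- Sample data: one author and one book that references the author.
INSERT INTO Authors (AuthorID, AuthorName, BirthDate)
VALUES (1, 'Mary Shelley', '1797-08-30');

INSERT INTO Books (BookID, Title, AuthorID, ISBN, PublishedYear)
VALUES (10, 'Frankenstein', 1, '9780000000001', 1818);  -- placeholder ISBN

-- Remove the author; FK_BookAuthor sets Books.AuthorID to NULL for book 10.
DELETE FROM Authors WHERE AuthorID = 1;

-- The book row is still present, now with AuthorID = NULL.
SELECT BookID, Title, AuthorID
FROM Books
WHERE BookID = 10;
```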
Table: Borrowings (Many-to-Many relationship between Books and Patrons)
To model who borrowed which book, we need a linking table.
CREATE TABLE Patrons (
    PatronID INT PRIMARY KEY,
    PatronName VARCHAR(100) NOT NULL,
    MembershipExpiry DATE
);
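CREATE TABLE Borrowings (
    BorrowingID INT PRIMARY KEY,
    -- A borrowing is meaningless without both a book and a patron,
    -- so neither foreign key column accepts NULL.
    BookID INT NOT NULL,
    PatronID INT NOT NULL,
    BorrowDate DATE NOT NULL,
    ReturnDate DATE,
    FOREIGN KEY (BookID) REFERENCES Books(BookID),
    FOREIGN KEY (PatronID) REFERENCES Patrons(PatronID)
);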
Here, BookID and PatronID in the Borrowings table are foreign keys. This structure allows a patron to borrow many books, and a book to be borrowed by many patrons over time.
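Reading the many-to-many relationship back out means joining through the linking table. The query below is a minimal sketch using the tables defined above; treating a NULL ReturnDate as "still on loan" is an assumption made for this example and is not stated by the schema itself.

```sql
-- Borrowing history: one row per borrowing, with patron and book names.
SELECT p.PatronName, b.Title, br.BorrowDate, br.ReturnDate
FROM Borrowings AS br
JOIN Patrons AS p ON p.PatronID = br.PatronID
JOIN Books   AS b ON b.BookID   = br.BookID
ORDER BY br.BorrowDate;

-- Books currently on loan, assuming a NULL ReturnDate means "not yet returned".
SELECT b.Title, p.PatronName, br.BorrowDate
FROM Borrowings AS br
JOIN Patrons AS p ON p.PatronID = br.PatronID
JOIN Books   AS b ON b.BookID   = br.BookID
WHERE br.ReturnDate IS NULL;
```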
Conclusion
Primary and Foreign Keys are the backbone of relational databases. They are vital for data integrity, consistency, and enabling efficient querying through well-defined relationships between tables. Implementing them correctly from the start will save significant effort and prevent data anomalies in the long run.