Introduction to SQL
Structured Query Language (SQL) is the standard language for querying and manipulating relational databases. It is a domain-specific language designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS).
SQL is used to:
- Execute queries against a database
- Retrieve data from a database
- Insert records into a database
- Update records in a database
- Delete records from a database
- Create and modify database structures
Understanding SQL Data Types
Data types define the type of data that can be stored in a column. Choosing appropriate data types is crucial for data integrity, storage efficiency, and performance. Common SQL data types include:
Data Type | Description | Example |
---|---|---|
INT | Integer numbers (whole numbers). | 100, -50 |
VARCHAR(n) | Variable-length character strings. | 'Microsoft', 'SQL Server' |
DATE | Date values (year, month, day). | '2023-10-27' |
DECIMAL(p,s) | Exact numeric value with fixed precision and scale. | 123.45 |
BOOLEAN | True or False values. | TRUE, FALSE |
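To see why an exact type such as DECIMAL(p,s) matters, the sketch below contrasts binary floating point with Python's `decimal` module. This is only an analogy: `Decimal` is a Python type, not a database column type, and the sample amounts are made up for illustration.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 or 0.20 exactly,
# so the sum carries a tiny rounding error.
float_total = 0.10 + 0.20
float_is_exact = float_total == 0.30

# Decimal keeps base-10 digits exactly, similar in spirit to DECIMAL(p,s).
decimal_total = Decimal("0.10") + Decimal("0.20")
decimal_is_exact = decimal_total == Decimal("0.30")

# quantize() fixes the scale to two digits after the point, like s = 2.
price = Decimal("123.456").quantize(Decimal("0.01"))

print(float_is_exact, decimal_is_exact, price)
```

This is the usual reason monetary amounts are stored in an exact numeric column rather than a floating-point one.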
Tables, Columns, and Rows
Databases are organized into tables. Each table contains data about a specific entity. A table is composed of columns (which define the attributes of the entity) and rows (which represent individual records or instances of the entity).
Columns: Each column has a name and a data type.
Rows: Each row contains a set of values, one for each column.
Primary Key: A column or set of columns that uniquely identifies each row in a table.
Example Table: Employees
Consider a simple Employees table:
EmployeeID (INT, PRIMARY KEY) | FirstName (VARCHAR(50)) | LastName (VARCHAR(50)) | HireDate (DATE) |
---|---|---|---|
101 | Nancy | Davolio | '2020-05-01' |
102 | Andrew | Fuller | '2019-08-15' |
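The table above could be created and populated roughly as follows. This is a minimal sketch that uses Python's built-in `sqlite3` module and an in-memory SQLite database purely as a convenient way to run the SQL; other database systems may need slightly different type names. The final insert reuses EmployeeID 101 on purpose to show the primary key rejecting a duplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

conn.execute("""
    CREATE TABLE Employees (
        EmployeeID INT PRIMARY KEY,
        FirstName  VARCHAR(50),
        LastName   VARCHAR(50),
        HireDate   DATE
    )
""")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?)",
    [
        (101, "Nancy", "Davolio", "2020-05-01"),
        (102, "Andrew", "Fuller", "2019-08-15"),
    ],
)

# A second row with EmployeeID 101 violates the primary key and is rejected.
try:
    conn.execute(
        "INSERT INTO Employees VALUES (101, 'Janet', 'Leverling', '2021-03-10')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]
print(duplicate_rejected, row_count)
```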
Basic SQL Queries
The SELECT statement is used to query the database and retrieve data that matches specified criteria.
Retrieving All Columns
To retrieve all columns and all rows from the Employees table:
SELECT *
FROM Employees;
Retrieving Specific Columns
To retrieve only the first name and last name of all employees:
SELECT FirstName, LastName
FROM Employees;
Filtering Data with WHERE
The WHERE clause is used to filter records. It specifies a condition that a row must meet to be returned.
To find employees hired after '2020-01-01':
SELECT EmployeeID, FirstName, LastName
FROM Employees
WHERE HireDate > '2020-01-01';
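The three queries in this section can be tried against the two sample rows with a short script. This is a sketch that assumes SQLite via Python's `sqlite3` module; the `ORDER BY` clauses and the `?` placeholder are additions made here to keep the output order predictable and to pass the cutoff date safely. Dates are stored as ISO-formatted text, so string comparison matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (EmployeeID INT PRIMARY KEY, FirstName VARCHAR(50), "
    "LastName VARCHAR(50), HireDate DATE)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?)",
    [
        (101, "Nancy", "Davolio", "2020-05-01"),
        (102, "Andrew", "Fuller", "2019-08-15"),
    ],
)

# All columns, all rows.
all_rows = conn.execute("SELECT * FROM Employees ORDER BY EmployeeID").fetchall()

# Only the name columns.
names = conn.execute(
    "SELECT FirstName, LastName FROM Employees ORDER BY EmployeeID"
).fetchall()

# Only rows whose hire date is after the cutoff.
cutoff = "2020-01-01"
hired_after = conn.execute(
    "SELECT EmployeeID, FirstName, LastName FROM Employees WHERE HireDate > ?",
    (cutoff,),
).fetchall()

print(all_rows)
print(names)
print(hired_after)
```

With this sample data, only Nancy Davolio (hired 2020-05-01) passes the filter.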
CRUD Operations
CRUD stands for Create, Read, Update, and Delete. These are the four basic operations performed on data in a database.
Create (INSERT)
Adds new records to a table.
INSERT INTO Employees (EmployeeID, FirstName, LastName, HireDate)
VALUES (103, 'Janet', 'Leverling', '2021-03-10');
Read (SELECT)
Retrieves existing records from a table (covered in Basic SQL Queries).
Update (UPDATE)
Modifies existing records in a table.
To update the hire date for EmployeeID 101:
UPDATE Employees
SET HireDate = '2020-05-05'
WHERE EmployeeID = 101;
Delete (DELETE)
Removes records from a table.
To delete the employee with EmployeeID 103:
DELETE FROM Employees
WHERE EmployeeID = 103;
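The four operations can be run end to end as a quick check. The sketch below again assumes SQLite through Python's `sqlite3` module and starts from a table holding only employee 101. Note that the WHERE clause is what limits UPDATE and DELETE to a single row; without it, every row in the table would be affected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (EmployeeID INT PRIMARY KEY, FirstName VARCHAR(50), "
    "LastName VARCHAR(50), HireDate DATE)"
)
conn.execute("INSERT INTO Employees VALUES (101, 'Nancy', 'Davolio', '2020-05-01')")

# Create: add employee 103.
cur = conn.execute(
    "INSERT INTO Employees (EmployeeID, FirstName, LastName, HireDate) "
    "VALUES (?, ?, ?, ?)",
    (103, "Janet", "Leverling", "2021-03-10"),
)
inserted = cur.rowcount

# Update: change the hire date of employee 101 only.
cur = conn.execute(
    "UPDATE Employees SET HireDate = ? WHERE EmployeeID = ?", ("2020-05-05", 101)
)
updated = cur.rowcount

# Read: confirm the new value.
new_date = conn.execute(
    "SELECT HireDate FROM Employees WHERE EmployeeID = 101"
).fetchone()[0]

# Delete: remove employee 103 only.
cur = conn.execute("DELETE FROM Employees WHERE EmployeeID = ?", (103,))
deleted = cur.rowcount

conn.commit()
remaining = [row[0] for row in conn.execute("SELECT EmployeeID FROM Employees")]
print(inserted, updated, new_date, deleted, remaining)
```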
Introduction to Joins
Joins are used to combine rows from two or more tables based on a related column between them. This is essential when your data is spread across multiple tables.
INNER JOIN: Returns records that have matching values in both tables.
Example: Joining Employees and Departments
Assume a Departments table:
DepartmentID (INT, PRIMARY KEY) | DepartmentName (VARCHAR(50)) |
---|---|
1 | Sales |
2 | Engineering |
Assume also that the Employees table has a DepartmentID column that refers to a row in Departments.
To retrieve employee names and their department names:
SELECT E.FirstName, E.LastName, D.DepartmentName
FROM Employees AS E
INNER JOIN Departments AS D
ON E.DepartmentID = D.DepartmentID;
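The join can be tried with a small script, sketched here with SQLite through Python's `sqlite3` module. The department assignments are invented for illustration (the original tables do not say who works where), HireDate is left out for brevity, and a third employee with no department is included to show that INNER JOIN drops rows without a match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Departments (
        DepartmentID   INT PRIMARY KEY,
        DepartmentName VARCHAR(50)
    );
    CREATE TABLE Employees (
        EmployeeID   INT PRIMARY KEY,
        FirstName    VARCHAR(50),
        LastName     VARCHAR(50),
        DepartmentID INT REFERENCES Departments (DepartmentID)
    );
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'Engineering');
    -- Hypothetical assignments; employee 103 has no department.
    INSERT INTO Employees VALUES
        (101, 'Nancy', 'Davolio', 1),
        (102, 'Andrew', 'Fuller', 2),
        (103, 'Janet', 'Leverling', NULL);
""")

rows = conn.execute("""
    SELECT E.FirstName, E.LastName, D.DepartmentName
    FROM Employees AS E
    INNER JOIN Departments AS D
        ON E.DepartmentID = D.DepartmentID
    ORDER BY E.EmployeeID
""").fetchall()

print(rows)
```

Janet Leverling does not appear in the result because her DepartmentID matches no row in Departments.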
Database Normalization
Database normalization is the process of organizing the columns and tables of a relational database to minimize data redundancy and improve data integrity. It involves a series of guidelines called normal forms.
Common normal forms include:
- First Normal Form (1NF): Each column holds a single atomic value, with no repeating groups within a table.
- Second Normal Form (2NF): The table is in 1NF, and every non-key attribute is fully functionally dependent on the whole primary key.
- Third Normal Form (3NF): The table is in 2NF, and no non-key attribute is transitively dependent on the primary key.
Normalization helps in designing efficient and robust database structures.
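As a rough illustration of the redundancy that 3NF removes, the sketch below compares a flat table with a normalized pair of tables. The `EmployeesFlat` table, the sample rows, and the new name 'Global Sales' are hypothetical and exist only for this example; SQLite through Python's `sqlite3` module is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Not in 3NF: DepartmentName depends on DepartmentID,
    -- and only transitively on the key EmployeeID.
    CREATE TABLE EmployeesFlat (
        EmployeeID     INT PRIMARY KEY,
        FirstName      VARCHAR(50),
        DepartmentID   INT,
        DepartmentName VARCHAR(50)
    );
    INSERT INTO EmployeesFlat VALUES
        (101, 'Nancy', 1, 'Sales'),
        (102, 'Andrew', 2, 'Engineering'),
        (103, 'Janet', 1, 'Sales');

    -- 3NF: each department name is stored once, keyed by DepartmentID.
    CREATE TABLE Departments (
        DepartmentID   INT PRIMARY KEY,
        DepartmentName VARCHAR(50)
    );
    CREATE TABLE Employees (
        EmployeeID   INT PRIMARY KEY,
        FirstName    VARCHAR(50),
        DepartmentID INT REFERENCES Departments (DepartmentID)
    );
    INSERT INTO Departments
        SELECT DISTINCT DepartmentID, DepartmentName FROM EmployeesFlat;
    INSERT INTO Employees
        SELECT EmployeeID, FirstName, DepartmentID FROM EmployeesFlat;
""")

# Renaming a department touches every matching employee row in the flat table...
flat_changed = conn.execute(
    "UPDATE EmployeesFlat SET DepartmentName = 'Global Sales' WHERE DepartmentID = 1"
).rowcount

# ...but exactly one row once the name lives in its own table.
normalized_changed = conn.execute(
    "UPDATE Departments SET DepartmentName = 'Global Sales' WHERE DepartmentID = 1"
).rowcount

print(flat_changed, normalized_changed)
```

In the flat design, missing one of the duplicated rows during such an update would leave the data inconsistent; the normalized design has only one place to change.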