CSV Storage Engine
The CSV storage engine is a special-purpose engine that allows you to store data in plain comma-separated value (CSV) files. Each table using the CSV engine is represented by a single CSV file in the data directory. This engine is primarily useful for exchanging data with applications that produce or consume CSV files.
Key Characteristics
- Plain Text Format: Data is stored as text in standard CSV format, making it human-readable and easily importable/exportable by various tools.
- No Indexes: The CSV engine does not support indexing, which means queries will always perform full table scans.
- No Transactions: It does not support ACID transactions, rollbacks, or locking mechanisms. Operations are atomic at the file system level.
- Limited Data Types: Primarily treats all data as strings. No data type validation or conversion is performed by the engine itself.
- No Primary Key/Unique Constraints: These are not enforced.
- No Auto-increment: Not supported.
Use Cases
- Data Interchange: Easily export data from your database to a CSV file for use in spreadsheets, reporting tools, or other applications.
- Data Import: Import data from existing CSV files into a table.
- Simple Logging: Can be used for very simple logging scenarios where a human-readable log is desired.
Limitations
Due to its simplicity, the CSV engine is not suitable for performance-critical applications or scenarios requiring complex querying, data integrity, or concurrency control.
Creating a CSV Table
To create a table that uses the CSV storage engine, specify ENGINE=CSV
in your CREATE TABLE
statement:
CREATE TABLE sales_data (
product_id INT,
sale_date DATE,
quantity INT,
price DECIMAL(10, 2)
) ENGINE=CSV;
Data Format and Delimiters
By default, the CSV engine uses a comma (,
) as the field delimiter and a newline character as the row terminator. These can be configured using table options:
CREATE TABLE customer_list (
customer_id INT,
name VARCHAR(100),
email VARCHAR(100)
) ENGINE=CSV
DEFAULT CHARACTER SET utf8mb4
FIELD DELIMITED BY ';'
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\n';
Table Options:
FIELD DELIMITED BY
: Specifies the character(s) used to separate fields.OPTIONALLY ENCLOSED BY
: Specifies the character used to enclose field values, especially those containing the delimiter.ESCAPED BY
: Specifies the character used to escape special characters within enclosed fields.LINES TERMINATED BY
: Specifies the character(s) used to mark the end of a row.
Note on Data Types
The CSV engine does not enforce data types. All values are stored and retrieved as strings. Any interpretation or conversion of data types must be handled by the application interacting with the CSV file.
Operations and Performance
Operations on CSV tables are generally slow due to the lack of indexing and the need to parse/write text files. Inserts and updates involve appending or rewriting the entire file, which can be inefficient for large datasets.
When to Use the CSV Engine
- When you need to easily share data with other systems that understand CSV.
- For simple data archiving where human readability is a priority.
- As a staging area for data that will be processed and loaded into a more robust storage engine.
When NOT to Use the CSV Engine
- For tables that require fast read or write performance.
- For tables that need data integrity constraints (primary keys, foreign keys, etc.).
- For applications requiring transactional integrity.
- For tables that will be frequently updated or modified.