Dataset and DataTable in ADO.NET

The DataSet and DataTable objects are core components of ADO.NET, providing a rich, in-memory representation of data. They enable you to work with data independently of the data source, allowing for features like disconnected data access, data manipulation, and data caching.

Understanding the DataSet

A DataSet is an in-memory cache of DataTable objects. It can hold multiple tables, their relationships, and constraints. A DataSet is particularly useful when you need to work with data from multiple tables, or when you need to maintain data integrity through constraints and foreign keys in a disconnected environment.

Key Features of DataSet:

Understanding the DataTable

A DataTable represents a single table of data in memory. It is composed of rows and columns, and can be used to store, retrieve, and manipulate data. Each DataTable has a collection of DataColumn objects that define the schema (column names, data types, etc.) and a collection of DataRow objects that represent the data itself.

Key Features of DataTable:

Creating and Populating a DataSet

You can create a DataSet and populate it programmatically or by using a DataAdapter.

Programmatic Creation:


using System;
using System.Data;

public class DataSetExample
{
    public static void Main(string[] args)
    {
        // Create a new DataSet
        DataSet myDataSet = new DataSet("MySampleData");

        // Create a DataTable for Customers
        DataTable customersTable = new DataTable("Customers");

        // Define columns for Customers table
        DataColumn customerIdColumn = new DataColumn("CustomerID", typeof(int));
        customerIdColumn.AutoIncrement = true;
        customerIdColumn.AutoIncrementSeed = 1;
        customerIdColumn.AutoIncrementStep = 1;
        customerIdColumn.AllowDBNull = false;
        customersTable.Columns.Add(customerIdColumn);
        customersTable.PrimaryKey = new DataColumn[] { customerIdColumn };

        DataColumn customerNameColumn = new DataColumn("CustomerName", typeof(string));
        customerNameColumn.AllowDBNull = false;
        customersTable.Columns.Add(customerNameColumn);

        // Add a row to the Customers table
        DataRow newCustomerRow = customersTable.NewRow();
        newCustomerRow["CustomerName"] = "Alice Smith";
        customersTable.Rows.Add(newCustomerRow);

        newCustomerRow = customersTable.NewRow();
        newCustomerRow["CustomerName"] = "Bob Johnson";
        customersTable.Rows.Add(newCustomerRow);

        // Create a DataTable for Orders
        DataTable ordersTable = new DataTable("Orders");

        // Define columns for Orders table
        DataColumn orderIdColumn = new DataColumn("OrderID", typeof(int));
        orderIdColumn.AutoIncrement = true;
        orderIdColumn.AutoIncrementSeed = 1001;
        orderIdColumn.AutoIncrementStep = 1;
        orderIdColumn.AllowDBNull = false;
        ordersTable.Columns.Add(orderIdColumn);
        ordersTable.PrimaryKey = new DataColumn[] { orderIdColumn };

        DataColumn orderDateColumn = new DataColumn("OrderDate", typeof(DateTime));
        ordersTable.Columns.Add(orderDateColumn);

        DataColumn customerIdFkColumn = new DataColumn("CustomerID", typeof(int));
        ordersTable.Columns.Add(customerIdFkColumn);

        // Add a row to the Orders table
        DataRow newOrderRow = ordersTable.NewRow();
        newOrderRow["OrderDate"] = DateTime.Now;
        newOrderRow["CustomerID"] = 1; // Link to Alice Smith
        ordersTable.Rows.Add(newOrderRow);

        // Add the DataTables to the DataSet
        myDataSet.Tables.Add(customersTable);
        myDataSet.Tables.Add(ordersTable);

        // Create a relationship
        DataRelation relation = new DataRelation("CustomerOrders",
                                                 customersTable.Columns["CustomerID"],
                                                 ordersTable.Columns["CustomerID"]);
        myDataSet.Relations.Add(relation);

        Console.WriteLine("DataSet created and populated successfully!");
    }
}
            

Using DataAdapter:

The most common way to populate a DataSet is by using a DataAdapter (e.g., SqlDataAdapter, OleDbDataAdapter) in conjunction with a DataSet object.


using System;
using System.Data;
using System.Data.SqlClient; // Or other data provider

public class DataAdapterExample
{
    public static void Main(string[] args)
    {
        string connectionString = "YourConnectionStringHere";
        string query = "SELECT CustomerID, CustomerName FROM Customers; SELECT OrderID, OrderDate, CustomerID FROM Orders";

        using (SqlConnection connection = new SqlConnection(connectionString))
        {
            SqlDataAdapter adapter = new SqlDataAdapter(query, connection);
            DataSet myDataSet = new DataSet();

            // Fill the DataSet with data from the query
            // The adapter automatically names tables based on the order of SELECT statements or specified names
            adapter.Fill(myDataSet);

            Console.WriteLine("DataSet populated using DataAdapter.");

            // Accessing tables
            DataTable customers = myDataSet.Tables["Customers"];
            DataTable orders = myDataSet.Tables["Orders"];

            // ... process data ...
        }
    }
}
            

Working with Data

Once data is in a DataSet or DataTable, you can easily iterate through rows, access column values, filter, sort, and update data.

Iterating through Rows:


// Assuming 'customers' is a DataTable
foreach (DataRow row in customers.Rows)
{
    Console.WriteLine($"CustomerID: {row["CustomerID"]}, Name: {row["CustomerName"]}");
}
            

Filtering Data:

Filtering can be done using DataView or LINQ to DataSet.

Using DataView:


DataView dv = new DataView(myDataSet.Tables["Customers"]);
dv.RowFilter = "CustomerID > 5"; // Example filter

foreach (DataRowView rowView in dv)
{
    Console.WriteLine($"Filtered Customer: {rowView["CustomerName"]}");
}
            

Using LINQ to DataSet:


var filteredCustomers = from custRow in myDataSet.Tables["Customers"].AsEnumerable()
                        where custRow.Field<int>("CustomerID") > 5
                        select custRow.Field<string>("CustomerName");

foreach (string customerName in filteredCustomers)
{
    Console.WriteLine($"Filtered Customer (LINQ): {customerName}");
}
            

Handling Changes

When working in a disconnected scenario, you need to track changes made to the data and then send those changes back to the data source. The DataTable's AcceptChanges() and RejectChanges() methods, along with RowState and row versions, are key to this process.

Note on Row State

Each DataRow has a RowState property that indicates its status: Unchanged, Added, Modified, or Deleted. This state is crucial for determining which rows need to be updated in the database.

Conclusion

DataSet and DataTable are powerful tools in ADO.NET for managing data in memory. They provide flexibility and efficiency, especially for applications that require disconnected data access, complex data manipulation, or data caching. Understanding their features and how to use them effectively is fundamental to building robust data-driven applications with .NET.

Tip

For new development, consider using Entity Framework or other ORMs for more abstract and higher-level data access. However, DataSet and DataTable remain relevant for specific scenarios and legacy applications.