ADO.NET DataSet Object

The DataSet object is a central component of ADO.NET. It represents a complete, in-memory cache of data that can be manipulated independently of any data source. A DataSet can contain multiple, related tables, as well as constraints and relationships between those tables.

Key Features and Concepts

Creating and Populating a DataSet

You can create a DataSet programmatically or by using a DataAdapter to fill it from a data source.

Programmatic Creation:


using System.Data;

// Create a new DataSet
DataSet myDataSet = new DataSet("MyData");

// Create a DataTable
DataTable productsTable = new DataTable("Products");

// Add columns to the DataTable
productsTable.Columns.Add("ProductID", typeof(int));
productsTable.Columns.Add("ProductName", typeof(string));
productsTable.Columns.Add("UnitPrice", typeof(decimal));

// Add rows to the DataTable
productsTable.Rows.Add(1, "Chai", 18.00m);
productsTable.Rows.Add(2, "Chang", 19.00m);

// Add the DataTable to the DataSet
myDataSet.Tables.Add(productsTable);

// Accessing data
Console.WriteLine($"Product Name: {myDataSet.Tables["Products"].Rows[0]["ProductName"]}");
            

Populating with a DataAdapter:


using System.Data;
using System.Data.SqlClient; // Or your specific data provider

// Assume connectionString and selectCommand are properly initialized
string connectionString = "Your_Connection_String";
string selectCommand = "SELECT ProductID, ProductName, UnitPrice FROM Products";

using (SqlConnection connection = new SqlConnection(connectionString))
{
    SqlDataAdapter adapter = new SqlDataAdapter(selectCommand, connection);
    DataSet dataSet = new DataSet("ProductData");

    // Fill the DataSet with data
    adapter.Fill(dataSet, "Products");

    // Now dataSet.Tables["Products"] contains the data
    Console.WriteLine($"Number of products: {dataSet.Tables["Products"].Rows.Count}");
}
            

Working with Data in a DataSet

Accessing Tables and Rows:


// Assuming 'dataSet' is a populated DataSet
DataTable ordersTable = dataSet.Tables["Orders"]; // Accessing by table name
DataTable productsTable = dataSet.Tables[0];      // Accessing by index

foreach (DataRow row in ordersTable.Rows)
{
    Console.WriteLine($"Order ID: {row["OrderID"]}, Customer ID: {row["CustomerID"]}");
}

// Accessing a specific cell
int productId = (int)ordersTable.Rows[0]["ProductID"];
            

Filtering and Sorting with DataView:


// Assuming 'productsTable' is a DataTable
DataView productsView = new DataView(productsTable);

// Set a filter
productsView.RowFilter = "UnitPrice > 20";

// Set sorting
productsView.Sort = "ProductName ASC";

// Iterate through the filtered and sorted data
foreach (DataRowView rowView in productsView)
{
    Console.WriteLine($"Product: {rowView["ProductName"]}, Price: {rowView["UnitPrice"]}");
}
            

Merging DataSets

You can merge the contents of one DataSet into another. This is useful when dealing with multiple sources of data or when performing operations that result in separate DataSet objects.

Important: When merging, you can specify how to handle discrepancies, such as whether to preserve or discard changes, and how to handle missing or extra columns/rows.

DataSet vs. DataTable

While a DataSet can contain multiple DataTable objects, a DataTable itself represents a single table of data. A DataSet provides the container and context for these tables, including relationships and constraints between them.

Feature DataSet DataTable
Scope Collection of tables, relationships, and constraints Single table of data
In-Memory Cache Yes, for multiple tables Yes, for a single table
Relationships Can contain relationships between its DataTables Does not contain relationships (can be part of a DataSet's relationships)
Primary Use Representing complex, hierarchical data structures; offline data manipulation Representing a single tabular data source; used within a DataSet

Conclusion

The DataSet object in ADO.NET is a powerful tool for managing data in memory. Its ability to hold multiple tables, define relationships, and support offline operations makes it indispensable for building robust data-driven applications.