Introduction
The Windows I/O subsystem provides a uniform interface for user‑mode applications to access a wide range of devices. It abstracts device‑specific details and manages request queuing, cancellation, and completion. This article describes the high‑level architecture of the kernel‑mode I/O stack, focusing on the components that collaborate to process I/O requests.
Core Components
- IRP (I/O Request Packet) – The fundamental data structure that travels through the stack.
- Device Object – Represents a logical, virtual, or physical device and holds a pointer to the driver object that services it.
- Driver Object – Holds entry points for driver callbacks (AddDevice, Dispatch, Unload).
- Filter Drivers – Optional layers that can intercept and modify IRPs.
- I/O Manager – Core kernel component that creates IRPs, queues them, and dispatches completion notifications.
- File System Runtime Library (FsRtl) – Provides common services for file system drivers, such as byte‑range locking and helper routines for cached I/O.
- Cache Manager – Handles systemwide caching for file data.
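The relationship between these objects can be sketched with a small user-mode model. This is an illustration only, not WDK code: every `toy_` name is hypothetical, and the structures mimic just the links between a driver object, a device object, and a request, not the real `DRIVER_OBJECT`, `DEVICE_OBJECT`, or `IRP` layouts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical major function codes; the real ones are the IRP_MJ_* constants. */
enum { TOY_MJ_CREATE, TOY_MJ_READ, TOY_MJ_WRITE, TOY_MJ_COUNT };

struct toy_device;
struct toy_irp { int major; int status; };

typedef int (*toy_dispatch_fn)(struct toy_device *dev, struct toy_irp *irp);

struct toy_driver {
    toy_dispatch_fn dispatch[TOY_MJ_COUNT];  /* one entry point per major function */
};

struct toy_device {
    struct toy_driver *driver;               /* device object -> owning driver object */
};

/* Default handler: stands in for failing a request the driver does not support. */
int toy_unsupported(struct toy_device *dev, struct toy_irp *irp) {
    (void)dev;
    irp->status = -1;
    return irp->status;
}

/* Read handler: stands in for a successful read dispatch routine. */
int toy_read(struct toy_device *dev, struct toy_irp *irp) {
    (void)dev;
    irp->status = 0;
    return irp->status;
}

/* Plays the role of driver initialization: fill in the dispatch table. */
void toy_driver_init(struct toy_driver *drv) {
    for (int i = 0; i < TOY_MJ_COUNT; i++)
        drv->dispatch[i] = toy_unsupported;
    drv->dispatch[TOY_MJ_READ] = toy_read;
}

/* Plays the role of the I/O manager: route the request to the device's
   driver using the request's major function code. */
int toy_call_driver(struct toy_device *dev, struct toy_irp *irp) {
    assert(irp->major >= 0 && irp->major < TOY_MJ_COUNT);
    return dev->driver->dispatch[irp->major](dev, irp);
}
```

In this model a read request reaches `toy_read` through the table, while a write request falls through to `toy_unsupported`, which is roughly how a request's major function code selects a dispatch routine.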
Request Processing Flow
The following diagram illustrates the typical path an IRP follows from creation to completion:
Request path: User Mode → I/O Manager → Driver Stack (Filter Drivers → Function Driver → Device Driver) → Device
Completion path: Device → Completion Queue → Completion Routine → User Mode
Key stages:
- IRP Creation: The I/O manager allocates an IRP and initializes its fields based on the user request.
- Dispatch: The IRP is sent to the topmost driver in the stack; each driver may handle, forward, or complete the request.
- I/O Completion: Once the lowest driver finishes processing, it completes the IRP (typically by calling IoCompleteRequest), and the IRP is propagated back up the stack, invoking any registered completion routines.
- Cleanup: The I/O manager deallocates the IRP and notifies the originating thread.
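The ordering described in these stages can be sketched with a user-mode model. It is a simplification under stated assumptions: the `toy_` types and functions are hypothetical, every layer except the lowest registers a completion routine, and the "stack" is a fixed-size array rather than real I/O stack locations.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEPTH 4

struct toy_irp;
typedef void (*toy_completion_fn)(struct toy_irp *irp, int layer);

struct toy_irp {
    int status;
    int depth;                                   /* number of drivers in the stack */
    toy_completion_fn completion[TOY_MAX_DEPTH]; /* optional routine per layer */
    int trace[2 * TOY_MAX_DEPTH];                /* order in which layers ran */
    int trace_len;
};

static void toy_record(struct toy_irp *irp, int value) {
    irp->trace[irp->trace_len++] = value;
}

/* Completion routine shared by all layers: logs the layer as a negative number. */
void toy_on_complete(struct toy_irp *irp, int layer) {
    toy_record(irp, -(layer + 1));
}

/* Creation: initialize the packet for a stack of `depth` drivers. */
void toy_irp_init(struct toy_irp *irp, int depth) {
    assert(depth > 0 && depth <= TOY_MAX_DEPTH);
    irp->status = -1;
    irp->depth = depth;
    irp->trace_len = 0;
    for (int i = 0; i < TOY_MAX_DEPTH; i++)
        irp->completion[i] = NULL;
}

/* Dispatch and completion: each layer logs the request on the way down and
   registers a completion routine before forwarding; the lowest layer finishes
   the work, then the routines run from the bottom of the stack upward. */
int toy_send(struct toy_irp *irp) {
    for (int layer = 0; layer < irp->depth; layer++) {
        toy_record(irp, layer + 1);
        if (layer < irp->depth - 1)
            irp->completion[layer] = toy_on_complete;
    }
    irp->status = 0;  /* lowest driver completes the request */
    for (int layer = irp->depth - 2; layer >= 0; layer--)
        if (irp->completion[layer])
            irp->completion[layer](irp, layer);
    return irp->status;
}
```

For a three-layer stack, the recorded trace is layers 1, 2, 3 on the way down, followed by the completion routines of layer 2 and then layer 1 on the way up.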
Synchronization Mechanisms
To protect shared resources, the I/O subsystem uses a combination of:
- Fast Mutex (FAST_MUTEX) – For short critical sections.
- Spin Locks – For brief, low‑latency protection of data that may be accessed at raised IRQL, where the holder cannot block.
- Interlocked Operations – Atomic increments/decrements for reference counting.
- Event Objects – Signaling completion of asynchronous I/O.
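As an illustration of the reference-counting pattern, here is a portable sketch that uses C11 atomics as a stand-in for the kernel's interlocked routines (such as `InterlockedIncrement` and `InterlockedDecrement`). The `toy_object` type and its functions are hypothetical, and the `destroyed` flag merely marks the point where a real driver would free the object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_object {
    atomic_long refs;   /* outstanding references */
    bool destroyed;     /* set when the last reference is dropped */
};

/* A new object starts with one reference, owned by its creator. */
void toy_object_init(struct toy_object *obj) {
    atomic_init(&obj->refs, 1);
    obj->destroyed = false;
}

/* Atomically take a reference; returns the new count. */
long toy_reference(struct toy_object *obj) {
    return atomic_fetch_add(&obj->refs, 1) + 1;
}

/* Atomically drop a reference; whoever drops the last one tears the object down. */
long toy_dereference(struct toy_object *obj) {
    long remaining = atomic_fetch_sub(&obj->refs, 1) - 1;
    assert(remaining >= 0);
    if (remaining == 0)
        obj->destroyed = true;  /* a real driver would release resources here */
    return remaining;
}
```

Because the decrement and the read of the resulting count happen in one atomic step, exactly one caller observes zero, so teardown runs once without a lock.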
Performance Considerations
Optimizing I/O performance typically involves:
- Minimizing context switches by using IOCTL batching.
- Leveraging the Cache Manager for sequential reads.
- Implementing efficient cancellation paths.
- Using the FAST_IO path when possible to avoid full IRP processing.
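The fast-path idea can be sketched as a try-then-fall-back pattern. In Windows, a fast I/O callback returns a Boolean, and a FALSE result tells the I/O manager to build an IRP instead. The user-mode model below only imitates that shape: `toy_file` and its functions are hypothetical, and the "cache" is a plain byte array rather than the Cache Manager.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_CACHE_SIZE 16

struct toy_file {
    char cache[TOY_CACHE_SIZE];  /* bytes already cached for this file */
    size_t cached_len;           /* how many bytes of `cache` are valid */
    const char *backing;         /* stand-in for the device contents */
    size_t backing_len;
    int slow_requests;           /* reads that needed the full request path */
};

/* Fast path: succeeds only if the whole range is already cached. */
bool toy_fast_read(struct toy_file *f, size_t off, size_t len, char *out) {
    if (off > f->cached_len || len > f->cached_len - off)
        return false;            /* caller must fall back to the slow path */
    memcpy(out, f->cache + off, len);
    return true;
}

/* Slow path: stands in for building a request and sending it down the stack. */
bool toy_slow_read(struct toy_file *f, size_t off, size_t len, char *out) {
    if (off > f->backing_len || len > f->backing_len - off)
        return false;            /* range lies outside the file */
    f->slow_requests++;
    memcpy(out, f->backing + off, len);
    return true;
}

/* Try the cheap path first and pay for the full path only on a miss. */
bool toy_file_read(struct toy_file *f, size_t off, size_t len, char *out) {
    assert(f != NULL && out != NULL);
    return toy_fast_read(f, off, len, out) || toy_slow_read(f, off, len, out);
}
```

A read that lies entirely within the cached range never increments `slow_requests`, which mirrors the saving the fast path is meant to provide.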
Sample Code
The snippet below demonstrates how a driver forwards a read IRP to the next-lower driver and registers a completion routine for it:
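// Forward declarations so the dispatch routine can reference the completion routine.
DRIVER_DISPATCH MyReadDispatch;
IO_COMPLETION_ROUTINE MyReadComplete;

NTSTATUS
MyReadDispatch(
    _In_ PDEVICE_OBJECT DeviceObject,
    _Inout_ PIRP Irp
    )
{
    // PDEVICE_EXTENSION and its LowerDeviceObject field are defined by the driver itself.
    PDEVICE_EXTENSION devExt = (PDEVICE_EXTENSION)DeviceObject->DeviceExtension;

    // Give the next-lower driver the same parameters this driver received.
    IoCopyCurrentIrpStackLocationToNext(Irp);

    // Invoke MyReadComplete on success, error, and cancellation.
    IoSetCompletionRoutine(Irp, MyReadComplete, NULL, TRUE, TRUE, TRUE);

    return IoCallDriver(devExt->LowerDeviceObject, Irp);
}

NTSTATUS
MyReadComplete(
    _In_ PDEVICE_OBJECT DeviceObject,
    _In_ PIRP Irp,
    _In_opt_ PVOID Context
    )
{
    UNREFERENCED_PARAMETER(DeviceObject);
    UNREFERENCED_PARAMETER(Context);

    // The dispatch routine returned the lower driver's status directly,
    // so propagate the pending flag up the stack.
    if (Irp->PendingReturned) {
        IoMarkIrpPending(Irp);
    }

    // Inspect Irp->IoStatus here if needed. Returning STATUS_SUCCESS lets
    // completion processing continue in the drivers above this one.
    return STATUS_SUCCESS;
}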
For a full driver example, see Sample Driver Code.