CPU Optimization
Effective CPU tuning can improve application responsiveness and reduce power consumption. Consider the following practices:
- Set processor affinity for high‑priority threads.
- Use SetThreadPriority to adjust thread scheduling.
- Enable Processor Power Management capabilities in the BIOS.
SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST);
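Setting processor affinity comes down to building a bitmask of logical processors and passing it to SetThreadAffinityMask. The following is a minimal sketch under stated assumptions, not a definitive implementation: makeAffinityMask and pinCurrentThread are hypothetical helper names, the mask only covers logical processors 0 to 63 (a single processor group), and the Windows call is compiled only when _WIN32 is defined.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: build a bitmask with one bit set per logical
// processor index. Only indices 0..63 are supported by this sketch.
inline std::uint64_t makeAffinityMask(std::initializer_list<unsigned> cores) {
    std::uint64_t mask = 0;
    for (unsigned core : cores) {
        assert(core < 64 && "index must fit in one 64-bit processor group");
        mask |= (std::uint64_t{1} << core);
    }
    return mask;
}

// Hypothetical helper: pin the calling thread to the given cores.
// Returns false on failure, and always on non-Windows builds,
// where this sketch does nothing.
inline bool pinCurrentThread(std::initializer_list<unsigned> cores) {
#ifdef _WIN32
    // On 32-bit Windows DWORD_PTR is 32 bits wide, so higher bits are dropped.
    const DWORD_PTR mask = static_cast<DWORD_PTR>(makeAffinityMask(cores));
    // SetThreadAffinityMask returns the previous mask, or 0 on failure.
    return SetThreadAffinityMask(GetCurrentThread(), mask) != 0;
#else
    (void)cores;
    return false;
#endif
}
```

For example, pinCurrentThread({0, 2}) would restrict the calling thread to logical processors 0 and 2. Pinning too aggressively can hurt throughput, so it is usually worth measuring before and after.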
Memory Management
Optimizing memory usage reduces paging and improves latency.
- Allocate large buffers with VirtualAlloc aligned to page size.
- Use the MEM_LARGE_PAGES flag where possible.
- Implement a caching strategy that avoids excessive allocations.
void* buf = VirtualAlloc(NULL, 64*1024*1024, MEM_RESERVE|MEM_COMMIT, PAGE_READWRITE);
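One possible caching strategy is a pool that keeps released buffers and hands them out again instead of returning them to the allocator. The sketch below is illustrative only: BufferPool is a hypothetical class, it uses the standard C++ allocator rather than VirtualAlloc to stay portable, and it is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical fixed-size buffer pool: released buffers are cached and
// reused so that steady-state work does not allocate.
class BufferPool {
public:
    explicit BufferPool(std::size_t bufferSize) : bufferSize_(bufferSize) {
        assert(bufferSize > 0);
    }

    // Hand out a cached buffer if one is available, otherwise allocate.
    std::unique_ptr<unsigned char[]> acquire() {
        if (!free_.empty()) {
            auto buf = std::move(free_.back());
            free_.pop_back();
            return buf;
        }
        ++allocations_;
        return std::make_unique<unsigned char[]>(bufferSize_);
    }

    // Return a buffer to the pool for later reuse. Null pointers are ignored.
    void release(std::unique_ptr<unsigned char[]> buf) {
        if (buf) free_.push_back(std::move(buf));
    }

    // Number of real allocations performed so far.
    std::size_t allocations() const { return allocations_; }
    std::size_t bufferSize() const { return bufferSize_; }

private:
    std::size_t bufferSize_;
    std::size_t allocations_ = 0;
    std::vector<std::unique_ptr<unsigned char[]>> free_;
};
```

A caller acquires a buffer, uses it, and releases it; after warm-up the allocations() counter stops growing. The pool never shrinks in this sketch, so a real implementation would likely cap the number of cached buffers.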
Disk I/O
Minimize seek time and maximize throughput.
- Use asynchronous I/O with OVERLAPPED structures.
- Align file writes to the sector size (typically 4 KB).
- Enable write caching on SSDs when data integrity policies allow.
// CreateFileW matches the wide-string path regardless of the UNICODE setting.
HANDLE h = CreateFileW(L"data.bin", GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_FLAG_OVERLAPPED, NULL);
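Sector alignment is mostly arithmetic: offsets are rounded down and lengths are rounded up to a multiple of the sector size. The helpers below are a small sketch with hypothetical names (alignUp, alignDown, kAssumedSectorSize). The 4096-byte value is taken from the "typically 4 KB" figure above and is an assumption; a real program would query the actual sector size of the target volume.

```cpp
#include <cassert>
#include <cstdint>

// Assumed sector size in bytes (illustrative; query the device in real code).
const std::uint64_t kAssumedSectorSize = 4096;

// True if n is a non-zero power of two.
inline bool isPowerOfTwo(std::uint64_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Round value up to the next multiple of alignment.
// alignment must be a power of two; value + alignment must not overflow.
inline std::uint64_t alignUp(std::uint64_t value, std::uint64_t alignment) {
    assert(isPowerOfTwo(alignment));
    return (value + alignment - 1) & ~(alignment - 1);
}

// Round value down to the previous multiple of alignment.
inline std::uint64_t alignDown(std::uint64_t value, std::uint64_t alignment) {
    assert(isPowerOfTwo(alignment));
    return value & ~(alignment - 1);
}
```

With these helpers, a 5000-byte write would be padded to alignUp(5000, kAssumedSectorSize), which is 8192 bytes, and a write starting at offset 5000 would begin at alignDown(5000, kAssumedSectorSize), which is 4096.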
GPU Performance
For graphics‑intensive workloads, consider these tips:
- Utilize Direct3D 12's command bundles to reduce driver overhead.
- Keep resource states explicit and transition only when required.
- Profile shader performance with Shader Model 6.0 metrics.
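// Example: record and close a command list.
// Assumes `device` (ID3D12Device*) and `allocator` (ID3D12CommandAllocator*) already exist.
ID3D12GraphicsCommandList* cmdList = nullptr;
device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT, allocator, nullptr, IID_PPV_ARGS(&cmdList));
// ... record draw or dispatch commands here ...
cmdList->Close();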
Power Management
Balancing performance and power is critical for mobile devices.
- Use PowerSetRequest (with a handle from PowerCreateRequest) to keep the system from sleeping during critical work.
- Monitor BatteryStatus and adapt workload dynamically.
- Configure PowerThrottling for background processes.
// hRequest must be a handle returned by PowerCreateRequest, not a process handle.
PowerSetRequest(hRequest, PowerRequestSystemRequired);
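Adapting the workload to the power state can be as simple as scaling the number of worker threads. The sketch below is one possible policy, not a recommended one: PowerSnapshot, chooseWorkerCount, and readPowerSnapshot are hypothetical names, the 20 percent threshold is arbitrary, and the Win32 GetSystemPowerStatus call is used here in place of the BatteryStatus value mentioned above. The Windows call is compiled only when _WIN32 is defined.

```cpp
#include <algorithm>
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical snapshot of the power state.
struct PowerSnapshot {
    bool onAcPower;      // true when plugged in (or when the state is unknown)
    int batteryPercent;  // 0..100, or -1 when unknown
};

// Hypothetical policy: full parallelism on AC power or unknown battery level,
// half the threads on battery, and a single thread at 20% or below.
inline unsigned chooseWorkerCount(const PowerSnapshot& p, unsigned hardwareThreads) {
    assert(hardwareThreads > 0);
    if (p.onAcPower || p.batteryPercent < 0) return hardwareThreads;
    if (p.batteryPercent <= 20) return 1;
    return std::max(1u, hardwareThreads / 2);
}

// Read the current power state. On non-Windows builds, or if the call
// fails, this sketch reports AC power with an unknown battery level.
inline PowerSnapshot readPowerSnapshot() {
    PowerSnapshot snap{true, -1};
#ifdef _WIN32
    SYSTEM_POWER_STATUS status;
    if (GetSystemPowerStatus(&status)) {
        // ACLineStatus: 0 = offline, 1 = online, 255 = unknown (treated as AC).
        snap.onAcPower = (status.ACLineStatus != 0);
        // BatteryLifePercent: 0..100, or 255 when unknown.
        if (status.BatteryLifePercent != 255) {
            snap.batteryPercent = status.BatteryLifePercent;
        }
    }
#endif
    return snap;
}
```

A scheduler could call chooseWorkerCount(readPowerSnapshot(), hardwareThreads) periodically and resize its thread pool when the result changes.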
Tuning Tools
Microsoft provides several utilities to assist with performance analysis:
- Windows Performance Analyzer (WPA) – deep latency analysis.
- Windows Performance Recorder (WPR) – collects trace events.
- Process Explorer – real‑time view of CPU, memory, I/O.
- Install via the Windows SDK.
- Open a .etl file and view the CPU Usage (Precise) graph.
- Use the "Hardware Events" view for cache miss analysis.
- Run wpr -start GeneralProfile -recordduration 30 to capture a 30‑second trace.
- Stop with wpr -stop trace.etl.
- Open the resulting .etl in WPA.
References
Further reading and official documentation: