XPath Expression Reference
Note: This document provides a reference for XPath expressions used within Windows API functions for XML processing. Ensure you have a fundamental understanding of XML and XPath concepts before proceeding.
Introduction to XPath
XPath (XML Path Language) is a query language for selecting nodes from an XML document. It is used extensively in Windows development when interacting with XML data, such as configuration files, data serialization, and web services. XPath expressions navigate through the elements and attributes of an XML document.
Core Components of XPath Expressions
XPath expressions are built from several key components:
1. Nodes and Node Types
An XML document is a tree structure. XPath can select different types of nodes:
- Element nodes: Represent XML elements (e.g., <element>).
- Attribute nodes: Represent attributes of elements (e.g., attribute="value").
- Text nodes: Represent the text content within elements.
- Namespace nodes: Represent XML namespaces.
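- Comment nodes and processing-instruction nodes: Represent comments and processing instructions in the document.
- Root node: The node at the top of the tree. It is the parent of the document element, not the document element itself.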
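The node types above can be inspected with a small sketch. It assumes Python's standard xml.dom.minidom module, which exposes a DOM tree rather than the XPath data model (it has no namespace nodes, for example), so treat it as an approximation. The one-element document is made up for illustration.

```python
from xml.dom import minidom, Node

# Hypothetical one-element document used only to show the node types.
doc = minidom.parseString('<book id="bk101">Midnight Rain</book>')

root_node = doc                              # root (document) node, parent of <book>
element = doc.documentElement                # the <book> element node
attribute = element.getAttributeNode("id")   # the id="bk101" attribute node
text = element.firstChild                    # the text node inside <book>

NAMES = {
    Node.DOCUMENT_NODE: "root (document)",
    Node.ELEMENT_NODE: "element",
    Node.ATTRIBUTE_NODE: "attribute",
    Node.TEXT_NODE: "text",
}
kinds = [NAMES[n.nodeType] for n in (root_node, element, attribute, text)]
print(kinds)
```

Note that the root node and the <book> element are distinct nodes, which is why /book and /* both start from the root node rather than from <book>.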
2. Path Expressions
These expressions select nodes based on their position in the XML tree. They can be:
- Absolute paths: Start from the root node.
- Relative paths: Start from the current node.
Common Path Steps:
| Syntax | Description | Example |
|---|---|---|
| / | At the start of a path, selects from the root node. Between steps, separates a parent step from a child step. | /root/element |
| // | Selects matching descendants at any depth: from the root when it starts the path, otherwise from the current node. | //element[@attribute='value'] |
| . | Selects the current node. | . |
| .. | Selects the parent of the current node. | ../sibling_element |
| @ | Selects attributes. | @attribute_name |
| * | Wildcard for any element. | /root/* |
| @* | Wildcard for any attribute. | element/@* |
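The path steps in the table can be tried out with a minimal sketch. It assumes Python's standard xml.etree.ElementTree module, which implements only a subset of XPath: paths are written relative to the document element (here <catalog>), and attribute steps such as @id are not returned as nodes. The trimmed catalog is illustrative.

```python
import xml.etree.ElementTree as ET

XML = """<catalog>
  <book id="bk101"><title>XML Developer's Guide</title><price>44.95</price></book>
  <book id="bk102"><title>Midnight Rain</title><price>5.95</price></book>
</catalog>"""

root = ET.fromstring(XML)  # root is the <catalog> element

# Child step: the equivalent of /catalog/book, written relative to <catalog>.
books = root.findall("./book")

# Descendant step (//): every <title> at any depth below the current node.
titles = [t.text for t in root.findall(".//title")]

# Wildcard (*): every child element of the first book.
child_tags = [c.tag for c in books[0].findall("*")]

# Parent step (..): the parent of each <title>, i.e. the <book> elements.
parents = root.findall(".//title/..")

# ElementTree does not return attribute nodes; read them from the element.
ids = [b.get("id") for b in books]

print(titles, child_tags, ids)
```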
3. Predicates
Predicates are used to filter nodes based on conditions, enclosed in square brackets [].
- Indexing: Select the Nth occurrence of a node. Positions are 1-based.
  - [1]: Selects the first element.
  - [last()]: Selects the last element.
  - [position()=3]: Selects the third element.
- Attribute value matching:
  - [@attribute_name='value']: Selects elements with a specific attribute value. Without the @, the name refers to a child element instead of an attribute.
- Text content matching:
  - [text()='exact text']: Selects elements with exact text content.
  - [contains(text(),'partial text')]: Selects elements containing specific text.
- Logical operators: and, or, not().
  - [price > 100 and @available='true']
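A short sketch of these predicates, again assuming Python's xml.etree.ElementTree. That module supports positional, attribute, and child-text predicates, but not comparison operators or contains(), so those two filters are emulated in plain Python here. The trimmed catalog is illustrative.

```python
import xml.etree.ElementTree as ET

XML = """<catalog>
  <book id="bk101"><author>Gambardella, Matthew</author><price>44.95</price></book>
  <book id="bk102"><author>Ralls, Kim</author><price>5.95</price></book>
</catalog>"""

root = ET.fromstring(XML)

first = root.find("./book[1]")                       # positional predicate, 1-based
last = root.find("./book[last()]")                   # last <book> child
by_attr = root.find("./book[@id='bk102']")           # attribute value match
by_text = root.find("./book[author='Ralls, Kim']")   # child element text match

# Emulates book[price > 10]: ElementTree cannot evaluate the comparison itself.
expensive = [b.get("id") for b in root.findall("./book")
             if float(b.findtext("price")) > 10]

# Emulates book[contains(author, 'Kim')] with a Python substring test.
has_kim = [b.get("id") for b in root.findall("./book")
           if "Kim" in b.findtext("author")]

print(first.get("id"), last.get("id"), expensive, has_kim)
```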
4. Functions
XPath provides a rich set of built-in functions:
- String functions: string(), concat(), substring(), string-length(), normalize-space().
- Numeric functions: number(), sum(), floor(), ceiling().
- Node-set functions: count(), last(), position(), id().
- Boolean functions: true(), false(), not().
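To make the results of a few of these functions concrete, the sketch below computes their Python equivalents. It does not call an XPath engine: Python's xml.etree.ElementTree (assumed here) does not evaluate XPath functions, so each line is annotated with the XPath expression it approximates. The whitespace-padded title is made up to exercise normalize-space().

```python
import math
import re
import xml.etree.ElementTree as ET

XML = """<catalog>
  <book id="bk101"><title>  XML   Developer's Guide </title><price>44.95</price></book>
  <book id="bk102"><title>Midnight Rain</title><price>5.95</price></book>
</catalog>"""

root = ET.fromstring(XML)
prices = [float(p.text) for p in root.findall("./book/price")]  # number(price)


def normalize_space(s):
    """Approximate normalize-space(): collapse XML whitespace runs, then trim."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")


book_count = len(root.findall("./book"))        # count(/catalog/book)
total = sum(prices)                             # sum(/catalog/book/price)
floor_first = math.floor(prices[0])             # floor(/catalog/book[1]/price)
ceil_first = math.ceil(prices[0])               # ceiling(/catalog/book[1]/price)
clean_title = normalize_space(root.findtext("./book/title"))  # normalize-space(title)
title_len = len(clean_title)                    # string-length(normalize-space(title))
label = "".join(["bk101", ": ", clean_title])   # concat('bk101', ': ', ...)

print(book_count, floor_first, ceil_first, clean_title, title_len, label)
```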
5. Axes
Axes specify the relationship between the context node and the nodes to be selected. Some common axes:
| Axis | Description |
|---|---|
| child:: | Selects children of the context node. |
| attribute:: | Selects attributes of the context node. |
| parent:: | Selects the parent of the context node. |
| ancestor:: | Selects ancestors of the context node. |
| descendant:: | Selects descendants of the context node. |
| following-sibling:: | Selects siblings that follow the context node. |
| preceding-sibling:: | Selects siblings that precede the context node. |
| self:: | Selects the context node itself. |
Shorthand notations:
- element_name is shorthand for child::element_name
- @attribute_name is shorthand for attribute::attribute_name
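The sketch below illustrates which nodes each axis reaches from one context node. It assumes Python's xml.etree.ElementTree, which does not accept the axis:: syntax, so each axis is emulated by walking the tree. The helper names (parent_of, ancestors) are hypothetical, and only element nodes are modeled.

```python
import xml.etree.ElementTree as ET

XML = """<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <price>44.95</price>
  </book>
</catalog>"""

root = ET.fromstring(XML)
title = root.find(".//title")  # context node for the axes below

# ElementTree elements have no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(node):
    """Yield the element ancestors of node, nearest first."""
    while node in parent_of:
        node = parent_of[node]
        yield node


book = parent_of[title]        # parent:: of <title>
siblings = list(book)          # child:: of <book>, in document order
i = siblings.index(title)

axes = {
    "self": [title.tag],
    "parent": [book.tag],
    "ancestor": [a.tag for a in ancestors(title)],
    "child (of book)": [c.tag for c in book],
    # iter() yields the start element first, so drop it to get descendants only.
    "descendant (of catalog)": [d.tag for d in root.iter()][1:],
    "preceding-sibling": [s.tag for s in siblings[:i]],
    "following-sibling": [s.tag for s in siblings[i + 1:]],
    "attribute (of book)": sorted(book.attrib),
}

for axis, names in axes.items():
    print(f"{axis}: {names}")
```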
Example XPath Expressions
Example XML Document:
<?xml version="1.0" encoding="UTF-8"?>
<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
    <description>An in-depth look at creating applications with XML.</description>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-12-16</publish_date>
    <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
  </book>
</catalog>
Expressions and their results:
| XPath Expression | Description | Result (based on example XML) |
|---|---|---|
| /catalog/book | Selects all book elements that are direct children of the catalog root element. | Both book nodes. |
| //book[@id='bk101'] | Selects all book elements anywhere in the document that have an id attribute with the value 'bk101'. | The first book node. |
| /catalog/book/title | Selects all title elements that are direct children of book elements, which are direct children of the catalog root. | The two title nodes ("XML Developer's Guide", "Midnight Rain"). |
| //author[text()='Ralls, Kim'] | Selects all author elements anywhere in the document whose text content is exactly 'Ralls, Kim'. | <author>Ralls, Kim</author> |
| //book[price > 10] | Selects all book elements that have a child price element with a value greater than 10. | The first book node (price 44.95). |
| //book[contains(description, 'architect')] | Selects book elements whose description contains the substring 'architect'. | The second book node. |
| /catalog/book[last()] | Selects the last book element under the catalog root. | The second book node. |
XPath in Windows API
Many Windows API functions and .NET classes utilize XPath for XML manipulation. Examples include:
- IXMLDOMDocument2::selectNodes: Selects a node-set satisfying the given XPath expression.
- IXMLDOMNodeList::nextNode: Retrieves the next node in the enumerated list.
- .NET Framework System.Xml.XmlDocument class: Provides methods like SelectNodes() and SelectSingleNode() that accept XPath expressions.
Best Practices
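- Be specific with your paths to improve performance and avoid unexpected results. For example, prefer /catalog/book over //book when the location is known, since // may require searching the entire tree.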
- Use absolute paths when possible for clarity, especially for top-level queries.
- Employ predicates effectively to filter down the node-set efficiently.
- Test your XPath expressions thoroughly with representative XML data.
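One way to follow the last recommendation is a table-driven check that pairs each expression with the result you expect. This is a minimal sketch assuming Python's xml.etree.ElementTree, so the expressions are limited to the subset that module supports and are written relative to <catalog>. The check helper and the CASES table are hypothetical names, and the catalog is an abbreviated copy of the example document.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the example catalog (some elements and text omitted).
CATALOG = """<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <price>44.95</price>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <price>5.95</price>
  </book>
</catalog>"""

# Each case maps an expression to the ids of the books it should match.
CASES = {
    "./book": ["bk101", "bk102"],
    ".//book[@id='bk101']": ["bk101"],
    "./book[last()]": ["bk102"],
    "./book[author='Ralls, Kim']": ["bk102"],
    ".//book[@id='missing']": [],  # an expression that should match nothing
}


def check(root, cases):
    """Return (expression, expected, actual) for every case that fails."""
    failures = []
    for expr, expected in cases.items():
        actual = [b.get("id") for b in root.findall(expr)]
        if actual != expected:
            failures.append((expr, expected, actual))
    return failures


root = ET.fromstring(CATALOG)
failures = check(root, CASES)
print("all expressions passed" if not failures else failures)
```

Including a case that should match nothing helps catch expressions that silently select more than intended.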