XPath Documentation - Windows System

XPath: XML Path Language

XPath is a query language for selecting nodes from an XML document. It is a W3C Recommendation and is widely used in conjunction with XSLT (Extensible Stylesheet Language Transformations) and XQuery.

What is XPath?

XPath provides a syntax for navigating through elements and attributes in an XML document. It treats an XML document as a tree structure, where each node (element, attribute, text, etc.) can be selected using path expressions. These expressions can filter and select specific nodes based on their names, attributes, content, and relationships to other nodes.

Key Concepts

1. The XML Tree

XPath operates on the XML document's hierarchical structure, which can be visualized as a tree. The main types of nodes are:

Root Node (the document itself)
Element Nodes
Attribute Nodes
Text Nodes
Namespace Nodes

2. Path Expressions

These are the core of XPath. They consist of one or more location steps, separated by forward slashes (/).

Absolute path: Begins with a forward slash and starts from the root node (e.g., /bookstore/book).
Relative path: Has no leading slash and starts from the current node (e.g., book/title).
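The difference between the two path forms can be sketched in Python with the standard library's xml.etree.ElementTree module. This is a minimal illustration, not a full XPath engine: ElementTree supports only a subset of XPath and always evaluates paths relative to an element, so the absolute path /bookstore/book corresponds here to the relative path book evaluated from the bookstore element. The trimmed document and the variable names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed-down bookstore document, used only for illustration.
XML = """
<bookstore>
  <book category="cooking"><title lang="en">Everyday Italian</title></book>
  <book category="web"><title lang="en">Learning XML</title></book>
</bookstore>
"""

# fromstring() returns the document element, here <bookstore>.
root = ET.fromstring(XML)

# Relative path book/title, evaluated with <bookstore> as the current node.
titles = [t.text for t in root.findall("book/title")]
print(titles)  # ['Everyday Italian', 'Learning XML']

# The relative path title gives a different result when the current node
# changes: evaluated from the first <book>, it sees only that book's title.
first_book = root.find("book")
first_title = first_book.find("title").text
print(first_title)  # Everyday Italian
```

The same expression selects different nodes depending on the current node, which is why relative paths are usually evaluated from a node that was itself selected by an earlier query.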

3. Location Steps

Each location step has three components:

Axis: Specifies the relationship between the current node and the nodes to be selected (e.g., child, attribute, parent, descendant).
Node Test: Specifies the name or type of the nodes to select (e.g., a specific name like book, * for any element, text() for text nodes, or node() for nodes of any type).
Predicate: An optional expression in square brackets ([]) that filters the selected nodes by a condition (e.g., [price > 50], [@lang='en']). A step may carry more than one predicate.

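If the axis is omitted, it defaults to child, so book is shorthand for child::book. The node test is always required. Common abbreviations are @ for attribute::, . for self::node(), .. for parent::node(), and // for /descendant-or-self::node()/.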
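As a rough sketch of how the parts of a location step fit together, the following Python snippet breaks a two-step path into axis, node test, and predicate. It assumes the standard library's xml.etree.ElementTree module, which accepts only abbreviated syntax and a subset of XPath; the shortened document is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Shortened version of the bookstore document, for illustration only.
XML = """
<bookstore>
  <book category="cooking"><title lang="en">Everyday Italian</title></book>
  <book category="children"><title lang="en">Harry Potter</title></book>
  <book category="web"><title lang="en">Learning XML</title></book>
</bookstore>
"""

root = ET.fromstring(XML)

# Path: book[@category='web']/title
#
# Step 1: book[@category='web']
#   axis      -> child (implied, because no axis is written)
#   node test -> book
#   predicate -> [@category='web']
# Step 2: title
#   axis      -> child (implied)
#   node test -> title
#   predicate -> none
web_titles = [t.text for t in root.findall("book[@category='web']/title")]
print(web_titles)  # ['Learning XML']
```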

Common Axes:

Axis	Description
`child`	Selects children of the current node.
`attribute`	Selects attributes of the current node.
`parent`	Selects the parent of the current node.
`ancestor`	Selects all ancestors (parent, grandparent, etc.) of the current node.
`descendant`	Selects all descendants (children, grandchildren, etc.) of the current node.
`following-sibling`	Selects all siblings that appear after the current node.
`preceding-sibling`	Selects all siblings that appear before the current node.
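Two of the axes above have abbreviations that the Python standard library can evaluate, which makes them easy to try out. The sketch below assumes xml.etree.ElementTree; that module does not accept the full axis::name syntax or the sibling and ancestor axes, so only // (descendants) and .. (parent) are shown. The shortened document is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Shortened version of the bookstore document, for illustration only.
XML = """
<bookstore>
  <book category="cooking"><title>Everyday Italian</title></book>
  <book category="children"><title>Harry Potter</title></book>
  <book category="web"><title>Learning XML</title></book>
</bookstore>
"""

root = ET.fromstring(XML)

# .//title : '//' walks the descendants, so every <title> below <bookstore>
# is selected, however deeply it is nested.
all_titles = [t.text for t in root.findall(".//title")]
print(all_titles)

# .//title/.. : '..' steps from each selected <title> up to its parent <book>.
parents = root.findall(".//title/..")
parent_categories = [b.get("category") for b in parents]
print(parent_categories)  # ['cooking', 'children', 'web']
```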

Common Node Tests:

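*: Selects any element node (or any attribute node when used on the attribute axis, as in @*).
element_name: Selects element nodes with the specified name.
@attribute_name: Selects attribute nodes with the specified name (shorthand for attribute::attribute_name).
text(): Selects text nodes.
node(): Selects nodes of any type.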

Predicates:

Predicates are used to filter the nodes selected by a location step. They can use comparisons, functions, or other XPath expressions.
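The snippet below sketches a few predicate forms in Python. It assumes the standard library's xml.etree.ElementTree module, which supports positional predicates, last(), attribute tests, and child-text equality, but not numeric comparisons such as [price > 50]. The shortened document is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Shortened version of the bookstore document, for illustration only.
XML = """
<bookstore>
  <book category="cooking"><title>Everyday Italian</title><year>2005</year></book>
  <book category="children"><title>Harry Potter</title><year>2005</year></book>
  <book category="web"><title>Learning XML</title><year>2003</year></book>
</bookstore>
"""

root = ET.fromstring(XML)

# Positional predicate: positions are counted from 1, not 0.
second_title = root.find("book[2]/title").text

# last() refers to the final node among the selected <book> children.
last_title = root.find("book[last()]/title").text

# Child-text predicate: books whose <year> child has the text '2005'.
titles_2005 = [b.findtext("title") for b in root.findall("book[year='2005']")]

# Attribute-existence predicate: books that carry a category attribute.
categorized = root.findall("book[@category]")

print(second_title)      # Harry Potter
print(last_title)        # Learning XML
print(titles_2005)       # ['Everyday Italian', 'Harry Potter']
print(len(categorized))  # 3
```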

Examples

Consider the following XML document:

<?xml version="1.0"?>
<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
    <book category="web">
        <title lang="en">Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
    </book>
</bookstore>

Common XPath Expressions and their results:

XPath Expression	Description	Result (based on the XML above)
`/bookstore/book`	Selects all `book` elements that are direct children of the `bookstore` root element.	All three `book` elements.
`/bookstore/book/title`	Selects all `title` elements that are direct children of `book` elements, which are in turn direct children of the `bookstore` root element.	The three `title` elements: "Everyday Italian", "Harry Potter", "Learning XML".
`//book`	Selects all `book` elements anywhere in the document (descendant-or-self axis).	All three `book` elements.
`/bookstore/book[1]`	Selects the first `book` element that is a child of `bookstore`.	The "Everyday Italian" book.
`/bookstore/book[@category='cooking']`	Selects `book` elements that have an attribute named `category` with the value 'cooking'.	The "Everyday Italian" book.
`/bookstore/book/title[@lang='en']`	Selects `title` elements with a `lang` attribute equal to 'en'.	All three `title` elements.
`/bookstore/book[price > 35]`	Selects `book` elements where the `price` child element's value is greater than 35.	The "Learning XML" book.
`/bookstore/book/author/text()`	Selects the text content of all `author` elements that are children of `book` elements.	"Giada De Laurentiis", "J K. Rowling", "Erik T. Ray".
`/bookstore/@name`	Selects the `name` attribute of the `bookstore` element. (In this XML, `bookstore` has no attributes).	No result.
`//title/text()[contains(., 'Potter')]`	Selects text nodes that are children of `title` elements and whose content contains the substring "Potter".	The text node for "Harry Potter".
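Several rows of the table can be reproduced with a short Python script. This is a sketch under stated assumptions: it uses the standard library's xml.etree.ElementTree module, which evaluates paths relative to the bookstore element and supports only a subset of XPath. The price > 35 comparison and the text() node test are not available there, so those rows are approximated in plain Python.

```python
import xml.etree.ElementTree as ET

# The sample bookstore document from above, written compactly.
XML = """
<bookstore>
  <book category="cooking"><title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author><year>2005</year><price>30.00</price></book>
  <book category="children"><title lang="en">Harry Potter</title>
    <author>J K. Rowling</author><year>2005</year><price>29.99</price></book>
  <book category="web"><title lang="en">Learning XML</title>
    <author>Erik T. Ray</author><year>2003</year><price>39.95</price></book>
</bookstore>
"""

root = ET.fromstring(XML)  # the <bookstore> element

# /bookstore/book
books = root.findall("book")

# /bookstore/book[@category='cooking']
cooking = [b.findtext("title") for b in root.findall("book[@category='cooking']")]

# /bookstore/book/title[@lang='en']
english_titles = [t.text for t in root.findall("book/title[@lang='en']")]

# /bookstore/book/author/text(), approximated by reading each element's text
authors = [a.text for a in root.findall("book/author")]

# /bookstore/book[price > 35], approximated with a Python filter because
# ElementTree cannot evaluate numeric comparisons inside a predicate
expensive = [b.findtext("title") for b in books if float(b.findtext("price")) > 35]

# /bookstore/@name, which selects nothing because the attribute is absent
name_attr = root.get("name")

print(len(books))   # 3
print(cooking)      # ['Everyday Italian']
print(expensive)    # ['Learning XML']
print(name_attr)    # None
```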

XPath Functions

XPath provides a rich set of built-in functions for string manipulation, numeric operations, node selection, and more. Some common ones include:

string(value): Converts a value to a string.
concat(string1, string2, ...): Concatenates strings.
contains(haystack, needle): Checks if a string contains another string.
starts-with(string, prefix): Checks if a string starts with a specified prefix.
string-length(string): Returns the length of a string.
sum(node-set): Returns the sum of the nodes in a node-set, after converting each node's string value to a number.
count(node-set): Returns the number of nodes in a node-set.
position(): Returns the position of the current node within the node-set being evaluated, counting from 1.
last(): Returns the number of nodes in the node-set being evaluated, which is also the position of its last node.
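To make the behavior of a few of these functions concrete, the sketch below computes the same values in Python. It is an approximation rather than a call into an XPath function library: the standard library's xml.etree.ElementTree module does not implement these functions, so nodes are selected with a path and the function is then mimicked with ordinary Python. The XPath expression each line stands in for is given in a comment, and the shortened document is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Shortened version of the bookstore document, for illustration only.
XML = """
<bookstore>
  <book><title>Everyday Italian</title><price>30.00</price></book>
  <book><title>Harry Potter</title><price>29.99</price></book>
  <book><title>Learning XML</title><price>39.95</price></book>
</bookstore>
"""

root = ET.fromstring(XML)
prices = root.findall("book/price")
titles = [t.text for t in root.findall("book/title")]

# count(/bookstore/book/price)
price_count = len(prices)

# sum(/bookstore/book/price): each string value is converted to a number first
price_total = sum(float(p.text) for p in prices)

# //title[contains(., 'Potter')]
with_potter = [t for t in titles if "Potter" in t]

# //title[starts-with(., 'Learning')]
starts_learning = [t for t in titles if t.startswith("Learning")]

# string-length(/bookstore/book/title): a node-set converts to the string
# value of its first node, so only the first title is measured
first_title_length = len(titles[0])

print(price_count)              # 3
print(round(price_total, 2))    # 99.94
print(with_potter)              # ['Harry Potter']
print(starts_learning)          # ['Learning XML']
print(first_title_length)       # 16
```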

XPath in Windows System

XPath is extensively used within the Windows operating system and its development tools for:

Querying configuration files (e.g., XML-based settings).
Processing XML data returned by Windows APIs.
Transforming XML data using XSLT for display or further processing.
Defining queries for data sources that expose XML interfaces.
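As an illustration of the first use case, the sketch below looks up a value in an XML-based settings file. The file layout, the key names, and the get_setting helper are hypothetical and chosen only for the example; the code assumes Python's standard xml.etree.ElementTree module and its XPath subset rather than any Windows-specific API.

```python
import xml.etree.ElementTree as ET

# Hypothetical application settings document. The element and key names
# are illustrative and do not describe a specific Windows component.
CONFIG = """
<configuration>
  <appSettings>
    <add key="LogLevel" value="Info"/>
    <add key="TimeoutSeconds" value="30"/>
  </appSettings>
</configuration>
"""


def get_setting(root, key):
    """Return the value attribute of the <add> element with this key, or None.

    The key is interpolated into the path, so this helper is only suitable
    for trusted keys that contain no quote characters.
    """
    node = root.find(f"appSettings/add[@key='{key}']")
    return None if node is None else node.get("value")


config_root = ET.fromstring(CONFIG)

timeout = get_setting(config_root, "TimeoutSeconds")
missing = get_setting(config_root, "CacheSize")

print(timeout)  # 30
print(missing)  # None
```

Attribute values come back as strings, so a numeric setting such as the timeout above still needs an explicit conversion before use.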

For developers working with XML data in Windows environments, XPath offers a concise way to locate and extract the nodes they need.
