MSDN Documentation

XPath: XML Path Language

XPath is a powerful query language for selecting nodes from an XML document. It's a W3C Recommendation and is widely used in conjunction with XSLT (Extensible Stylesheet Language Transformations) and XQuery.

What is XPath?

XPath provides a syntax for navigating through elements and attributes in an XML document. It treats an XML document as a tree structure, where each node (element, attribute, text, etc.) can be selected using path expressions. These expressions can filter and select specific nodes based on their names, attributes, content, and relationships to other nodes.

Key Concepts

1. The XML Tree

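XPath operates on the XML document's hierarchical structure, which can be visualized as a tree. In the XPath 1.0 data model, the node types are the root (document) node, element, attribute, text, comment, processing-instruction, and namespace nodes.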

2. Path Expressions

These are the core of XPath. They consist of one or more location steps, separated by forward slashes (/).
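As a rough illustration of how a path is built from location steps, the sketch below evaluates /bookstore/book/title in one go and then again one step at a time, each step using the nodes from the previous step as its context. It assumes Java's standard javax.xml.xpath API; the class name, helper methods, and the trimmed-down bookstore document are hypothetical and exist only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PathStepsDemo {
    // Hypothetical, trimmed-down bookstore document used only for this sketch.
    static final String XML =
        "<bookstore>"
        + "<book><title>Everyday Italian</title></book>"
        + "<book><title>Harry Potter</title></book>"
        + "<book><title>Learning XML</title></book>"
        + "</bookstore>";

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Evaluates expr relative to the given context node and returns the matches.
    static List<Node> select(String expr, Object context) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, context, XPathConstants.NODESET);
            List<Node> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i));
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(expr, e);
        }
    }

    // The whole path evaluated as a single expression.
    static List<String> titlesInOnePath() {
        List<String> out = new ArrayList<>();
        for (Node title : select("/bookstore/book/title", parse())) {
            out.add(title.getTextContent());
        }
        return out;
    }

    // The same path, one location step per loop level.
    static List<String> titlesStepByStep() {
        List<String> out = new ArrayList<>();
        Document doc = parse();
        for (Node store : select("bookstore", doc)) {
            for (Node book : select("book", store)) {
                for (Node title : select("title", book)) {
                    out.add(title.getTextContent());
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(titlesInOnePath());
        System.out.println(titlesStepByStep());
    }
}
```

Both methods should return the same list of titles, which is the point of the sketch: each slash hands the nodes selected so far to the next step as its context.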

3. Location Steps

Each location step has three components:

  1. Axis: Specifies the relationship between the current node and the nodes to be selected (e.g., child, attribute, parent, descendant).
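  2. Node Test: Specifies the name or kind of nodes to select (e.g., a specific name like book, * for any element, or node() for a node of any kind). On the attribute axis, * matches any attribute.
  3. Predicates: Zero or more expressions in square brackets ([]) used to filter the selected nodes based on conditions (e.g., [price > 50], [@lang='en']).

The node test is always required. If the axis is omitted, it defaults to child, so book is shorthand for child::book. Other abbreviations are @ for attribute::, . for self::node(), .. for parent::node(), and // for /descendant-or-self::node()/.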
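To make the relationship between abbreviated and unabbreviated steps concrete, here is a minimal sketch that evaluates both spellings of a few expressions and compares the results. It assumes Java's standard javax.xml.xpath API and a compact copy of the bookstore sample from the Examples section; the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AbbreviationDemo {
    // Compact copy of the bookstore sample (only the fields used here).
    static final String XML =
        "<bookstore>"
        + "<book category='cooking'><title lang='en'>Everyday Italian</title></book>"
        + "<book category='children'><title lang='en'>Harry Potter</title></book>"
        + "<book category='web'><title lang='en'>Learning XML</title></book>"
        + "</bookstore>";

    // Returns the string value of every node selected by expr.
    static List<String> values(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(expr, e);
        }
    }

    // Each row pairs an abbreviated expression with its unabbreviated form.
    static final String[][] PAIRS = {
        { "/bookstore/book[@category='web']/title",
          "/child::bookstore/child::book[attribute::category='web']/child::title" },
        { "//title",
          "/descendant-or-self::node()/child::title" },
        { "//title[.='Harry Potter']/../@category",
          "/descendant-or-self::node()/child::title[self::node()='Harry Potter']"
              + "/parent::node()/attribute::category" },
    };

    public static void main(String[] args) {
        for (String[] pair : PAIRS) {
            System.out.println(pair[0] + " -> " + values(pair[0]));
            System.out.println(pair[1] + " -> " + values(pair[1]));
        }
    }
}
```

In practice the abbreviated forms are what most people write; the long forms are mainly useful for axes that have no abbreviation.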

Common Axes:

Axis                 Description
child                Selects children of the current node.
attribute            Selects attributes of the current node.
parent               Selects the parent of the current node.
ancestor             Selects all ancestors (parent, grandparent, etc.) of the current node.
descendant           Selects all descendants (children, grandchildren, etc.) of the current node.
following-sibling    Selects all siblings that appear after the current node.
preceding-sibling    Selects all siblings that appear before the current node.
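The axes listed above can be tried out with a short sketch like the following, which navigates from one node to its siblings, parent, ancestors, and descendants. It assumes Java's standard javax.xml.xpath API and a compact copy of the bookstore sample from the Examples section; the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AxesDemo {
    // Compact copy of the bookstore sample (only the fields used here).
    static final String XML =
        "<bookstore>"
        + "<book category='cooking'><title>Everyday Italian</title><price>30.00</price></book>"
        + "<book category='children'><title>Harry Potter</title><price>29.99</price></book>"
        + "<book category='web'><title>Learning XML</title><price>39.95</price></book>"
        + "</bookstore>";

    // Returns the string value of every node selected by expr.
    static List<String> values(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(expr, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            // Sibling axes, starting from the "Harry Potter" book.
            "//book[title='Harry Potter']/preceding-sibling::book/title",
            "//book[title='Harry Potter']/following-sibling::book/title",
            // parent and attribute axes, starting from a title.
            "//title[.='Learning XML']/parent::book/attribute::category",
            // ancestor axis, starting from the price elements above 35.
            "//price[. > 35]/ancestor::book/title",
            // descendant axis, starting from the root element.
            "/bookstore/descendant::title",
        };
        for (String expr : expressions) {
            System.out.println(expr + " -> " + values(expr));
        }
    }
}
```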

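Common Node Tests:

Node Test                   Description
name (e.g., book)           Selects nodes with the given name.
*                           Selects any element (or any attribute on the attribute axis).
node()                      Selects nodes of any kind.
text()                      Selects text nodes.
comment()                   Selects comment nodes.
processing-instruction()    Selects processing-instruction nodes.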

Predicates:

Predicates are used to filter the nodes selected by a location step. They can use comparisons, functions, or other XPath expressions.
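The following sketch shows a few predicate styles side by side: a numeric comparison, a combined condition with and/or, and positional filters. It assumes Java's standard javax.xml.xpath API and a compact copy of the bookstore sample from the Examples section; the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PredicateDemo {
    // Compact copy of the bookstore sample (only the fields used here).
    static final String XML =
        "<bookstore>"
        + "<book category='cooking'><title>Everyday Italian</title>"
        + "<year>2005</year><price>30.00</price></book>"
        + "<book category='children'><title>Harry Potter</title>"
        + "<year>2005</year><price>29.99</price></book>"
        + "<book category='web'><title>Learning XML</title>"
        + "<year>2003</year><price>39.95</price></book>"
        + "</bookstore>";

    // Returns the string value of every node selected by expr.
    static List<String> values(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(expr, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            // Comparison: the price child is converted to a number.
            "/bookstore/book[price > 35]/title",
            // Two conditions joined with "and".
            "/bookstore/book[year = 2005 and price < 30]/title",
            // Attribute tests joined with "or".
            "/bookstore/book[@category='children' or @category='web']/title",
            // Positional predicates using functions.
            "/bookstore/book[last()]/title",
            "/bookstore/book[position() < 3]/title",
        };
        for (String expr : expressions) {
            System.out.println(expr + " -> " + values(expr));
        }
    }
}
```

Note that a bare number in a predicate, such as [1], is shorthand for [position() = 1], and that XPath positions start at 1 rather than 0.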

Examples

Consider the following XML document:

<?xml version="1.0"?>
<bookstore>
    <book category="cooking">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
    </book>
    <book category="children">
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
    <book category="web">
        <title lang="en">Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
    </book>
</bookstore>

Common XPath Expressions and their results:

Each entry below gives the XPath expression, a description of what it selects, and the result for the XML above.

/bookstore/book
    Description: Selects all book elements that are direct children of the bookstore root element.
    Result: All three book elements.

/bookstore/book/title
    Description: Selects all title elements that are direct children of book elements, which are in turn direct children of the bookstore root element.
    Result: The three title elements: "Everyday Italian", "Harry Potter", "Learning XML".

//book
    Description: Selects all book elements anywhere in the document (// is shorthand for /descendant-or-self::node()/).
    Result: All three book elements.

/bookstore/book[1]
    Description: Selects the first book element that is a child of bookstore.
    Result: The "Everyday Italian" book.

/bookstore/book[@category='cooking']
    Description: Selects book elements that have an attribute named category with the value 'cooking'.
    Result: The "Everyday Italian" book.

/bookstore/book/title[@lang='en']
    Description: Selects title elements with a lang attribute equal to 'en'.
    Result: All three title elements.

/bookstore/book[price > 35]
    Description: Selects book elements whose price child element has a value greater than 35.
    Result: The "Learning XML" book.

/bookstore/book/author/text()
    Description: Selects the text nodes of all author elements that are children of book elements.
    Result: "Giada De Laurentiis", "J K. Rowling", "Erik T. Ray".

/bookstore/@name
    Description: Selects the name attribute of the bookstore element.
    Result: No result (in this XML, bookstore has no attributes).

//title/text()[contains(., 'Potter')]
    Description: Selects text nodes that are children of title elements and whose content contains the substring "Potter".
    Result: The text node for "Harry Potter".
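One way to check the results listed above is to run each expression against the sample document and count the nodes it selects. The sketch below does this with Java's standard javax.xml.xpath API; the class and method names are hypothetical, and the XML is the bookstore sample with whitespace removed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ExpressionCountDemo {
    // The bookstore sample from above, written without indentation.
    static final String XML =
        "<bookstore>"
        + "<book category='cooking'><title lang='en'>Everyday Italian</title>"
        + "<author>Giada De Laurentiis</author><year>2005</year><price>30.00</price></book>"
        + "<book category='children'><title lang='en'>Harry Potter</title>"
        + "<author>J K. Rowling</author><year>2005</year><price>29.99</price></book>"
        + "<book category='web'><title lang='en'>Learning XML</title>"
        + "<author>Erik T. Ray</author><year>2003</year><price>39.95</price></book>"
        + "</bookstore>";

    // Returns how many nodes expr selects in the sample document.
    static int count(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(expr, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "/bookstore/book",
            "/bookstore/book/title",
            "//book",
            "/bookstore/book[1]",
            "/bookstore/book[@category='cooking']",
            "/bookstore/book/title[@lang='en']",
            "/bookstore/book[price > 35]",
            "/bookstore/book/author/text()",
            "/bookstore/@name",
            "//title/text()[contains(., 'Potter')]",
        };
        for (String expr : expressions) {
            System.out.println(count(expr) + " node(s): " + expr);
        }
    }
}
```

An expression that matches nothing, such as /bookstore/@name here, yields an empty node set rather than an error.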

XPath Functions

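XPath provides a rich set of built-in functions for string manipulation, numeric operations, node selection, and more. Some common ones include count(), sum(), position(), last(), contains(), starts-with(), string-length(), normalize-space(), and not().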
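XPath functions can return numbers, booleans, or strings instead of node sets, so the caller has to ask for the matching result type. The sketch below illustrates this with a handful of XPath 1.0 core functions, assuming Java's standard javax.xml.xpath API and a compact copy of the bookstore sample from the Examples section; the class and helper names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FunctionsDemo {
    // Compact copy of the bookstore sample (only the fields used here).
    static final String XML =
        "<bookstore>"
        + "<book><title>Everyday Italian</title><price>30.00</price></book>"
        + "<book><title>Harry Potter</title><price>29.99</price></book>"
        + "<book><title>Learning XML</title><price>39.95</price></book>"
        + "</bookstore>";

    // Evaluates expr and converts the result to the requested XPath type.
    static Object evaluate(String expr, QName returnType) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc, returnType);
        } catch (Exception e) {
            throw new IllegalStateException(expr, e);
        }
    }

    static double number(String expr) {
        return (Double) evaluate(expr, XPathConstants.NUMBER);
    }

    static boolean bool(String expr) {
        return (Boolean) evaluate(expr, XPathConstants.BOOLEAN);
    }

    static String string(String expr) {
        return (String) evaluate(expr, XPathConstants.STRING);
    }

    public static void main(String[] args) {
        // Node-set functions returning numbers.
        System.out.println(number("count(//book)"));
        System.out.println(number("sum(//price)"));
        // String functions returning a boolean, a number, and a string.
        System.out.println(bool("contains(/bookstore/book[2]/title, 'Potter')"));
        System.out.println(bool("starts-with(/bookstore/book[3]/title, 'Learning')"));
        System.out.println(number("string-length(/bookstore/book[1]/title)"));
        System.out.println(string("substring-before(/bookstore/book[1]/title, ' ')"));
    }
}
```

XPath 1.0 numbers are double-precision floating-point values, so a result such as sum(//price) may carry small rounding error and is best compared with a tolerance.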

XPath in Windows

XPath is extensively used within the Windows operating system and its development tools for:

Developers working with XML data in Windows environments will find XPath an indispensable tool for efficient data manipulation and retrieval.