XPath is a specification for a query language that locates and extracts data from XML documents, together with a comprehensive set of functions for manipulating that data. XPath is used to identify, filter, and test nodes and content, and to apply functions or operations to the resulting data sets.
XML provides tree-structured data objects. Each element (tag) in an XML document is a node in the tree. An element can contain attributes, other elements, and raw data content: these are also nodes in the tree. A well-formed XML document has exactly one element at the root of its tree. Contained by that element are its children and, contained in those children, descendant nodes branching out until reaching terminal nodes that contain only data content.
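The tree structure described above can be inspected with a short Python sketch that uses the standard-library xml.etree.ElementTree module. The sample document and its element names (library, book, title) are invented for illustration. Note also that ElementTree exposes attributes as a dictionary and text as a property of an element rather than as separate node objects, so this only approximates the node model that XPath works with.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration only.
DOC = """<library>
  <book id="b1"><title>Dune</title></book>
  <book id="b2"><title>Emma</title></book>
</library>"""

root = ET.fromstring(DOC)      # the single element at the root of the tree
children = list(root)          # the element children of the root
first_book = children[0]       # first <book> element
first_title = first_book[0]    # its <title> child, a terminal element

print(root.tag)                     # name of the root element
print([c.tag for c in children])    # names of the root's children
print(first_book.attrib)            # attributes of the first book
print(first_title.text)             # data content of the terminal element
```

Each level of nesting in the markup corresponds to one level of the tree: library contains book elements, each book contains a title, and each title contains only text.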
XPath provides a language with which to locate nodes, either by identifying a node's address or by finding nodes using tests or filters, and to perform operations on the identified nodes.
In identifying the location of a node, XPath uses the concept of an axis, which describes the relationship between the node that is currently identified (the context node) and the node that one wants to locate. These axes include familial relations (parent, child, descendant, sibling, etc.), linear relations (preceding, following, etc.), and two XML markup identifiers (attribute and namespace). Combined with tests and filters, axes give XPath a powerful but succinct way to identify exactly a specific node or collection of nodes.
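As a rough illustration of axes and filters, the following Python sketch again uses the standard-library xml.etree.ElementTree module. That module implements only a subset of XPath and does not accept the explicit axis names, so the sketch relies on XPath's abbreviated forms: a bare name selects along the child axis, // reaches descendants, .. steps to the parent, and @ refers to an attribute. The sample document and all names in it are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration only.
DOC = """<library>
  <shelf name="fiction">
    <book id="b1"><title>Dune</title></book>
    <book id="b2"><title>Emma</title></book>
  </shelf>
  <shelf name="science">
    <book id="b3"><title>Cosmos</title></book>
  </shelf>
</library>"""

root = ET.fromstring(DOC)  # the context node for every query below

# Child axis (the default): shelf elements directly under the context node.
shelf_names = [s.get("name") for s in root.findall("shelf")]

# Descendants: every title at any depth below the context node.
all_titles = [t.text for t in root.findall(".//title")]

# Filter on an attribute: the title of the book whose id is "b2".
emma = root.find(".//book[@id='b2']/title").text

# Parent step: from the book with id "b3" up to the shelf that holds it.
b3_shelf = root.find(".//book[@id='b3']/..").get("name")

print(shelf_names, all_titles, emma, b3_shelf)
```

A full XPath implementation would also accept the unabbreviated axis syntax and the sibling, preceding, and following axes, which this subset does not provide.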
An example XPath statement can illustrate this power. The only further knowledge you require is that the slash character separates steps in the traversal, double colons separate axes from tag names, and square brackets contain filtering statements.