
RE: XPath to identify a point in an XML document (Was: A sort of synthesis)

From: Dennis E. Hamilton <dennis.hamilton@acm.org>
Date: Mon, 11 Mar 2013 20:24:38 -0700
To: <liam@w3.org>, "'Casey Jordan'" <casey.jordan@jorsek.com>
Cc: <public-change@w3.org>
Message-ID: <008e01ce1ed1$22860650$679212f0$@acm.org>
@Liam,

Yes, all of those things can be done with unadorned XPath.  But being able to inspect, match, or take a substring of a text item in a formula is not the same as directing a path into the node that holds the text.  The same goes for pointing into the values of attributes, which can matter for lengthy attribute values such as those that store spreadsheet cell formulas.  I should perhaps have been more emphatic about "find a path."

I did wonder about those wonderful manipulations on the types of values in nodes, but I couldn't reconcile that with setting a path, except as search-condition predicates.  I think those just help filter us to an intended node (and avoid unintended ones).  Getting that last inch, extending the path into the text (or attribute value) node itself, seems to require augmentation of some kind.

I'm fine with that.  If there's some way that plain XPath can cross that gap, I'm very interested.  And I'm not disturbed by the need to augment XPath, since there is need for other augmentation anyhow.
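
As an illustration of the kind of augmentation meant here, the following is a minimal sketch, not a proposal: an ordinary path selects the element, and a separate character offset (the hypothetical augmentation) points into its text. It assumes Python's standard `xml.etree.ElementTree`, which exposes text as `.text`/`.tail` strings rather than as text nodes; the name `resolve_point` and the sample document are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the thread.
DOC = "<doc><p>Hello <b>brave</b> new world</p></doc>"


def resolve_point(root, path, offset):
    """Return (element, text before the point, text after the point).

    `path` is an ordinary ElementTree path to the element whose leading
    text holds the point; `offset` is the assumed augmentation: a
    character index into that text.
    """
    element = root.find(path)
    if element is None:
        raise LookupError(f"no element matches {path!r}")
    text = element.text or ""
    if not 0 <= offset <= len(text):
        raise IndexError(f"offset {offset} is outside text of length {len(text)}")
    return element, text[:offset], text[offset:]


root = ET.fromstring(DOC)
element, before, after = resolve_point(root, "./p/b", 3)
print(element.tag, repr(before), repr(after))
```

The path alone stops at the `b` element; only the extra offset distinguishes a point between "bra" and "ve" from any other point in the same text.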

 - Dennis

PS: Aside to @Casey.  Although ODF spreadsheet cells don't have their coordinates in the element for the cell, ones for OOXML spreadsheets do, and that makes for great XPath access.  I sympathize with the numeric compactness desire.  For the document formats I work with, that battle was lost long ago [;<).  
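
To make the OOXML point concrete, here is a small sketch under stated assumptions: a heavily trimmed, made-up SpreadsheetML fragment in which each `c` (cell) element carries its coordinate in the `r` attribute, queried with the limited XPath subset in Python's standard `xml.etree.ElementTree`. The cell contents are invented for the example.

```python
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace; the "s" prefix is chosen for this sketch only.
NS = {"s": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Trimmed, hypothetical worksheet fragment; real files carry much more.
SHEET = """\
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1"><c r="A1"><v>10</v></c><c r="B1"><v>20</v></c></row>
    <row r="2"><c r="A2"><v>30</v></c><c r="B2"><f>A1+B1</f><v>30</v></c></row>
  </sheetData>
</worksheet>
"""

root = ET.fromstring(SHEET)

# The coordinate stored on the cell element makes the lookup a one-step predicate.
cell = root.find(".//s:c[@r='B2']", NS)
formula = cell.findtext("s:f", namespaces=NS)
value = cell.findtext("s:v", namespaces=NS)
print(cell.get("r"), formula, value)
```

Without the `r` attribute, the same cell would have to be reached by counting rows and cells, which is the situation described above for ODF.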

I do find it interesting that it is easy to keep numerically-stepped path sequences sorted in XML-document scan order.  That makes it rather easy to match up change-tracking operations with their targets in the XML, at least to the element level.  I don't know if it is a help or a hindrance when editing.  I suspect the tracked-change paths are best recomputed whenever a modified document is persisted.  But the simplicity of match-up on opening a document for viewing or editing is something to look into.  
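
The sorting property can be sketched as follows, assuming each path is a sequence of 1-based child positions compared numerically, step by step (comparing the steps as strings would put "10" before "2"). The function name `numeric_paths` and the sample document are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = "<a><b><c/><d/></b><e><f/></e><g/></a>"


def numeric_paths(root):
    """List (path, tag) pairs in document scan order.

    Each path is a tuple of 1-based child positions from the root;
    the root itself has the empty path.
    """
    paths = []

    def walk(element, path):
        paths.append((path, element.tag))
        for position, child in enumerate(element, start=1):
            walk(child, path + (position,))

    walk(root, ())
    return paths


root = ET.fromstring(DOC)
in_scan_order = numeric_paths(root)

# Scramble the list, then sort by path alone: tuple comparison is
# lexicographic, and a prefix sorts before its extensions, so the
# result is document order again.
scrambled = sorted(in_scan_order, key=lambda item: item[1], reverse=True)
resorted = sorted(scrambled, key=lambda item: item[0])
print(resorted == in_scan_order)
```

A list of tracked-change targets kept in this order could be matched against the elements in a single forward scan of the document, which is the match-up simplicity referred to above.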

-----Original Message-----
From: Liam R E Quin [mailto:liam@w3.org] 
Sent: Monday, March 11, 2013 16:55
To: dennis.hamilton@acm.org
Cc: 'Innovimax W3C'; public-change@w3.org
Subject: RE: XPath to identify a point in an XML document (Was: A sort of synthesis)

On Mon, 2013-03-11 at 15:07 -0700, Dennis E. Hamilton wrote:
[...]

> It appears that XPath can find a text node, but not a path to the
> interior of that text node.  (A text node is a string and never has
> adjacent text nodes -- it is the largest string that can be made
> without crossing a tag.)

It depends what you mean by "find" - it can return a substring; XPath 2
(and 3) can return a (node, offset) pair if you like.

>   XPath, even abbreviated XPath can be far "wordier" but it can also
> be simpler because it can short-circuit using full paths because it is
> based on a search model, not a strictly-navigational model.

SoftQuad Panorama, years ago, used the nearest element ancestor with an
ID attribute, if there was one, and navigated from there, since IDs tend
to be relatively stable across document revisions.
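
The general idea of anchoring at the nearest ID can be sketched like this. It is not Panorama's actual algorithm, which is not described here; the attribute name `id`, the function name `anchored_path`, and the sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """\
<doc>
  <section id="intro">
    <p>First</p>
    <p>Second <em>target</em></p>
  </section>
</doc>
"""


def anchored_path(root, target, id_attr="id"):
    """Return (anchor id or None, 1-based child steps from anchor to target)."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    # Climb until an element with an ID is found or the root is reached.
    while node.get(id_attr) is None and node in parent_of:
        parent = parent_of[node]
        steps.append(list(parent).index(node) + 1)
        node = parent
    return node.get(id_attr), steps[::-1]


root = ET.fromstring(DOC)
target = root.find(".//em")
anchor_id, steps = anchored_path(root, target)
print(anchor_id, steps)
```

Edits outside the `intro` section leave this address unchanged, whereas a path counted from the document root would shift whenever an earlier sibling is added or removed.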

> For example, one advantage of an (augmented) XPath arrangement is the
> ability to find attributes via XPath.  This means that one can find
> elements by known xml:id attribute values.

Yes, indeed.

>  There are other aspects of XPath usage that might be used  easier to
> confirm that the target has what is expected there (sort of the way
> patch software often works).

Yes, one can also use contains() to check for a string, for example.
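
A patch-style check of this kind can be sketched as below. In XPath 1.0 the test could be written as a predicate, for example `/doc/p[2][contains(., 'gamma')]`; Python's standard `xml.etree.ElementTree` does not implement `contains()`, so this sketch does the string test in Python instead. The function name `find_checked` and the sample document are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = "<doc><p>alpha</p><p>beta gamma</p></doc>"


def find_checked(root, path, expected):
    """Return the element at `path` only if its text contains `expected`.

    Returns None when the path matches nothing or the expected string
    is absent, much as a patch hunk is rejected when its context lines
    do not match.
    """
    element = root.find(path)
    if element is None:
        return None
    text = "".join(element.itertext())
    return element if expected in text else None


root = ET.fromstring(DOC)
accepted = find_checked(root, "./p[2]", "gamma")
rejected = find_checked(root, "./p[2]", "delta")
print(accepted is not None, rejected is None)
```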

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org freenode/#xml
Received on Tuesday, 12 March 2013 03:25:10 GMT