RE: XPath to identify a point in an XML document (Was: A sort of synthesis)

From: Dennis E. Hamilton <dennis.hamilton@acm.org>
Date: Mon, 11 Mar 2013 15:07:48 -0700
To: "'Innovimax W3C'" <innovimax+w3c@gmail.com>
Cc: <public-change@w3.org>
Message-ID: <002401ce1ea4$de8d11e0$9ba735a0$@acm.org>

I certainly don't want to rule XPath out.  However, it appears that XPath only finds nodes.  I think you do mean XPointer, <http://www.w3.org/TR/xptr-framework/>.

XPointer is a framework for more than just XPath.  I can imagine something like 
xpointer(xpath(... to a text node ...) ct:offset(12))

and that XPointer-augmented XPath could be used in a URI fragment part.

 - Dennis

MORE THOUGHTS SPECIFIC TO XPATH

It appears that XPath can find a text node, but not a path to the interior of that text node.  (A text node is a string and never has adjacent text nodes -- it is the largest string that can be made without crossing a tag.)

I can think of ways to augment XPath to do that (and maybe XPointer too).  I think the limitation is real, though, in terms of what the W3C specs provide.  (I didn't look at the XPath 3 draft.  But I suspect this is tied to the XML InfoSet abstraction.  To point into a text node, some canonicalization must be assumed.)
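
To make the "point inside a text node" idea concrete, here is a minimal sketch of one possible augmentation: a path that selects a text node, plus a character offset into it.  This is only an illustration.  The function name resolve_point, the child-index path, and the sample markup are all hypothetical, and Python's xml.dom.minidom stands in for whatever XPath machinery would really be used.

```python
from xml.dom import minidom


def resolve_point(xml_text, child_path, offset):
    """Split the text node reached by child_path at a character offset.

    child_path is a list of child indexes counted from the document
    element (a stand-in for an XPath that selects a text node).
    Hypothetical helper, for illustration only.
    """
    node = minidom.parseString(xml_text).documentElement
    for index in child_path:
        node = node.childNodes[index]
    if node.nodeType != node.TEXT_NODE:
        raise ValueError("path does not end at a text node")
    if not 0 <= offset <= len(node.data):
        raise ValueError("offset lies outside the text node")
    # The point sits between these two halves of the text node.
    return node.data[:offset], node.data[offset:]


# The third child of <p> is the text node " world".
before, after = resolve_point("<p>Hello <b>big</b> world</p>", [2], 3)
print(repr(before), repr(after))
```

The offset is only meaningful if both sides agree on how the text is normalized (whitespace, entity expansion, and so on), which is the canonicalization concern mentioned above.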

I find the XPath avenue more appealing than the dot-notation Casey just illustrated.  I agree that a beginning and an ending might be needed, and a fair amount of other material as well.  XPath, even abbreviated XPath, can be far "wordier", but it can also be simpler: because it is based on a search model, not a strictly navigational model, it can avoid spelling out full paths.  

For example, one advantage of an (augmented) XPath arrangement is the ability to find attributes via XPath.  This means that one can find elements by known xml:id attribute values.  Those must be unique per XML document and that can be a big win.  The (augmented) access into text nodes should work on access to values in attribute nodes too.  Other aspects of XPath might also make it easier to confirm that the target has what is expected there (sort of the way patch software often works).
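
As a rough sketch of the xml:id lookup and the patch-style check, assuming Python's xml.etree.ElementTree and its limited XPath subset: the helper name find_and_check and the sample document are hypothetical, and a real change-tracking format would define its own way of stating the expected content.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the fixed XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def find_and_check(root, id_value, expected_text):
    """Find the element with the given xml:id and confirm its text.

    Hypothetical helper: like a patch tool, it refuses to proceed
    when the target does not contain what the change expects.
    """
    target = root.find(f".//*[@{XML_ID}='{id_value}']")
    if target is None:
        raise LookupError(f"no element with xml:id {id_value!r}")
    if (target.text or "") != expected_text:
        raise ValueError("target content differs from what was expected")
    return target


doc = ET.fromstring(
    '<doc><p xml:id="p1">First</p><p xml:id="p2">Second</p></doc>'
)
target = find_and_check(doc, "p2", "Second")
print(target.tag, target.text)
```

The search jumps straight to the element without naming any of its ancestors, which is the "search model" advantage over a strictly navigational path.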

Another (partial) advantage is the ability to use XPath in URIs (e.g., in XPointers), which might be material in dealing with cross-references among XML parts of a compound document (e.g., as done for OOXML and ODF using Zip packaging).  Even if that is not significant to CTMarkup, it might mean that XPath machinery can be used for more than one purpose once it is supported at all.  

If the markup instructions were in an XML document in the midst of such a compound structure, that would be a case for URI cross-reference to the marked-up content, making the use of XPath in URIs more interesting.
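
A small sketch of what such a cross-reference might look like to the software reading it, assuming a pointer of the form scheme(data) in the fragment part.  The part name content.xml, the helper split_pointer, and the single-pointer regular expression are illustrative assumptions, not a full XPointer parser (percent-encoding and multiple pointer parts are ignored).

```python
import re
from urllib.parse import urldefrag

# Matches one pointer part of the form scheme(data); illustration only.
POINTER_PART = re.compile(r"^(\w+)\((.*)\)$")


def split_pointer(uri_reference):
    """Split a URI reference into (document, scheme, scheme data)."""
    document, fragment = urldefrag(uri_reference)
    match = POINTER_PART.match(fragment)
    if match is None:
        raise ValueError("fragment is not of the form scheme(data)")
    return document, match.group(1), match.group(2)


# Hypothetical reference from a change-tracking part to a content part.
print(split_pointer("content.xml#xpointer(id('p2'))"))
```

The scheme data would then be handed to whatever XPath (or augmented XPath) evaluator is already in place for the change markup itself.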

 - Dennis

-----Original Message-----
From: innovimax@gmail.com [mailto:innovimax@gmail.com] On Behalf Of Innovimax W3C
Sent: Monday, March 11, 2013 13:47
To: dennis.hamilton@acm.org
Cc: public-change@w3.org
Subject: XPath to identify a point in an XML document (Was: A sort of synthesis)

Dennis,

Can you elaborate on the exact use case?

My understanding is that it is possible with XPath, depending on your definition of a *point* (how many points are there in an open tag?)

PTC has used oid + offset to do that for ages now, and it seems to work.

It seems like XPointer [1] or some variant of it was supposed to do that

Anyway, I think it's probably too hasty to rule XPath out without very good counter examples

Mohamed


On Mon, Mar 11, 2013 at 7:42 PM, Dennis E. Hamilton <dennis.hamilton@acm.org> wrote:


	I suspect that XPath works for pure XML change-tracking, although I have a question.
	
	I am not conversant enough with XPath to see whether it can isolate a point *within* a text node.  My superficial examination of XPath 1.0/2.0 suggests that there is no path expression into the interior of a text node.  Is that considered a problem here, or is there a well-known way of addressing that?
	
	 - Dennis
	





-- 
Innovimax SARL
Consulting, Training & XML Development
9, impasse des Orteaux
75020 Paris
Tel : +33 9 52 475787
Fax : +33 1 4356 1746
http://www.innovimax.fr
RCS Paris 488.018.631
SARL au capital de 10.000 € 
Received on Monday, 11 March 2013 22:08:18 GMT