Notes for 2012-10-09 TAG f2f brief session on XML fragment identifiers from Henry S. Thompson on 2012-10-09 (www-tag@w3.org from October 2012)

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Tue, 09 Oct 2012 13:20:58 +0100
To: www-tag@w3.org
Message-ID: <f5bipajkf9x.fsf@calexico.inf.ed.ac.uk>
The fragid compromise -- if the suffix (e.g. +xml) definition of the
fragment semantics doesn't resolve wrt a particular document, then
it's 'available' for a specific (e.g. foo/baz+xml) or generic
(e.g. image/...)  definition to 'take over'

Question: Is the fragid compromise only for barenames, or for all
XPointer fragments?

What would it mean to go for the 'all XPointer' alternative?

It will turn out to be relevant that XPointer allows multipart
pointers -- multiple alternatives, with silent failover

Note here and below 'XPointer. . . match' _includes_ barenames

XPointer failure modes:
 1) Doesn't match productions; DECL
 2) Doesn't map:
    a) unbound scheme prefix [not addressed by XPointer Framework spec.] DECL/PROC
    b) unsupported scheme [not distinguished from doesn't resolve,
                           because of multipart+failover semantics] PROC
 3) Doesn't resolve (scheme-specific -- no anchor, no 3rd elt child, etc.) DECL/PROC

DECL -- declarative/determinable by inspection of the fragment alone
PROC -- procedural/involves the stem-retrieved document/implementation-dependent
DECL/PROC -- could in principle be done based on fragment alone, plus
             facts about the implementation, i.e. what schemes are used (DECL),
             what prefixes are bound (DECL, but impls may cheat), and what
             schemes are supported
Liberal approach says "XPointer-matching frag-id identifies what XPointer says it
does, if it does".  "Non-XPointer-matching frag-id, or non-identifying
XPointer, is not defined by this spec. to identify anything [but
higher or lower spec. might]".

Conservative approach says "Barename frag-id identifies what XPointer
says it does, if it does.  Other XPointer-matching frag-id identifies what
XPointer says it does, if it does".  "Non-XPointer-matching frag-id,
or non-identifying barename, is not defined by this spec. to identify
anything [but higher spec. might].  Non-identifying non-barename
XPointer identifies nothing, ever".
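
One way to see the difference between the two approaches is as a small
decision function.  This is only a sketch of my reading of the two
paragraphs above; the function name and the three outcome labels
("xpointer", "open", "nothing") are invented for illustration.

```python
def who_defines(approach, matches_xpointer, is_barename, resolves):
    """Return who gets to say what the fragid identifies.

    "xpointer" -- it identifies what XPointer says it does
    "open"     -- not defined by this spec.; another spec. might define it
    "nothing"  -- identifies nothing, ever
    """
    if approach not in ("liberal", "conservative"):
        raise ValueError("unknown approach: " + approach)
    if matches_xpointer and resolves:
        return "xpointer"  # same under both approaches
    if approach == "liberal":
        return "open"  # non-matching, or matching but non-identifying
    # Conservative: only non-matching fragids and non-identifying
    # barenames are left open for another spec. to define.
    if not matches_xpointer or is_barename:
        return "open"
    return "nothing"  # non-identifying non-barename XPointer
```

The two approaches differ in exactly one cell: a scheme-based pointer
that matches the productions but doesn't identify anything is "open"
under the liberal reading and "nothing" under the conservative one.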
------------------------
barenames -- you can use XPointer syntax, but your interp. only holds
              if XPointer-based resolution does not "identify a subresource" 

XPointer scheme-based syntax

   1) You can't use;
   2) You can use, rules exactly as above for barenames, i.e. no
         constraint on overlap
   3) You can use syntax, but not any registered scheme except with
      its registered meaning

Consider a spec. defining application/schema+xml, which wants to
identify schema components via the elements which define them in an
XML-syntax schema document.  In the registration for that type
  (1) Says I have to make up a non-overlapping syntax
  (2) Appears to say I can say 'element(/1/3/2)' means "the component
      defined by the 2nd child of the 3rd child of the root"
        But it actually doesn't, because interpreted _as_ an XPointer
        that fragid _does_ resolve, so you don't get your chance
  (3) Says you have to define schemaComponentTumbler which is your
      'alias' for the element scheme, plus your interpretation

So (2) doesn't make sense in this case -- would it _ever_ make sense?
Complexity (of comprehension) would be high, probability of value
seems low, I'm inclined against it.
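
To make the problem with (2) concrete, here is a minimal sketch of
resolving an element() child sequence with Python's standard
ElementTree.  The schema-like document and the function name are made
up for illustration, and only the pure child-sequence form (e.g.
'/1/3/2') is handled, not the 'name/1/2' form.

```python
import re
import xml.etree.ElementTree as ET


def resolve_child_sequence(root, scheme_data):
    """Resolve element() scheme data such as '/1/3/2' against 'root'.
    Returns the element found, or None if the pointer doesn't resolve."""
    if not re.fullmatch(r"(/[1-9][0-9]*)+", scheme_data):
        return None
    steps = [int(s) for s in scheme_data[1:].split("/")]
    if steps[0] != 1:
        return None  # a document has exactly one root element
    node = root
    for n in steps[1:]:
        children = list(node)  # element children only, in document order
        if n > len(children):
            return None
        node = children[n - 1]
    return node


# Invented stand-in for an XML-syntax schema document.
DOC = """<schema>
  <annotation/>
  <simpleType name="t"/>
  <complexType name="c">
    <sequence/>
    <attribute name="a"/>
  </complexType>
</schema>"""

root = ET.fromstring(DOC)
hit = resolve_child_sequence(root, "/1/3/2")

# Under option (2) the media-type-specific reading only applies when the
# XPointer reading fails to identify anything; here it succeeds.
media_type_gets_a_turn = hit is None
```

Here 'hit' is the <attribute name="a"/> element, i.e. the 2nd child of
the 3rd child of the root, so the XPointer reading wins and the
"component defined by that element" reading never gets its chance.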

What about (3)?  Issues include
  
  * The element scheme incorporates barenames, so '#element(foo)' is
    defined to mean the same as '#foo' - should we treat the two
    exactly the same?
  * It would require a scheme registration which says "Generic XML
     processors MUST NOT implement this scheme", which seems at the
     very least a bit odd. . .

ht
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]
Received on Tuesday, 9 October 2012 12:21:27 UTC