Re: fragment identifier as uri? (Was: 000106 Minutes)

I've been thinking about this some more and believe we should just go
with what all normal people think of as a URI.

To a substantial extent, the WG tentative decision at the F2F was
based on some sort of aversion to the strange and wonderful things you
can in principle do with fragments, like selecting a time sub-interval
from an audio data type or doing an XPointer.  I don't really
understand that aversion since it is inherently the property of
something as powerful as a URI with arbitrary scheme and other parts,
sometimes including query parts thousands of characters long, that no
software has ever existed or will ever exist that can actually handle
all URIs.  The reality is that all software will have to perform a
number of checks on URIs and/or IDREFs.  It does not seem appreciably
harder to me to check for "#token" than it does to check for "token"
as an IDREF.  I'm sure there will be implementations, probably mostly
simple protocol oriented implementations, that just plain only handle
IDREFs.  But I don't see that their job is made much simpler by having
an IDREF attribute.  Why should we split IDREF and fragment like this
and do things in an unusual way?  Since all real applications are going
to have to impose lots of constraints on the URIs they can handle, how
does it matter that much how complex/ridiculous/whatever the URIs they
reject can be?

I don't see the non-validating parser / absence of a DTD as that much
of a problem as signature aware applications should, in effect, have
the XMLDSIG DTD built in and so can recognize our IDs in our elements,
including the Object element, for example, which can be used to wrap an
optionally encoded data item anywhere in the document which contains
the Signature element, if the application is designed that way.
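
A minimal sketch of what "having the DTD built in" could look like in
practice, assuming the application simply hard-codes which of our
elements carry an ID. The table ID_BEARING, the attribute name "Id" and
the helper names are illustrative assumptions, not taken from the draft:

```python
import xml.etree.ElementTree as ET

# Hypothetical built-in knowledge standing in for the DTD: local names of
# elements whose "Id" attribute is treated as an ID (assumed, illustrative).
ID_BEARING = {"Signature", "Object"}


def local_name(tag):
    """Strip an ElementTree '{namespace}' prefix, if any, from a tag."""
    return tag.rsplit("}", 1)[-1]


def find_by_id(root, token):
    """Return the first known ID-bearing element whose Id equals token."""
    for elem in root.iter():
        if local_name(elem.tag) in ID_BEARING and elem.get("Id") == token:
            return elem
    return None


doc = ET.fromstring(
    '<doc><Signature Id="sig1"><Object Id="obj1">data</Object></Signature>'
    '<Other Id="obj2"/></doc>'
)
target = find_by_id(doc, "obj1")
```

No DTD is read here; an "Id" on an element the application does not
know about (the Other element above) is deliberately not treated as an
ID.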

On the other hand, I don't give much weight to the argument against an
IDREF attribute because it is so limited compared with a URI when we
provide the URI attribute as an alternative.

From:  "Joseph M. Reagle Jr." <reagle@w3.org>
Resent-Date:  Fri, 28 Jan 2000 18:07:04 -0500 (EST)
Resent-Message-Id:  <200001282307.SAA12126@www19.w3.org>
Message-Id:  <3.0.5.32.20000128180649.04939530@localhost>
Date:  Fri, 28 Jan 2000 18:06:49 -0500
To:  Dan Connolly <connolly@w3.org>
Cc:  "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>,
            "C. M. Sperberg-McQueen" <cmsmcq@acm.org>,
            "Henry S. Thompson" <ht@cogsci.ed.ac.uk>,
            Tim Berners-Lee <timbl@w3.org>
In-Reply-To:  <388FC9CF.38E985AC@w3.org>
References:  <4.1.20000126152750.00ab4970@tigger.cc.uic.edu>

>http://www.w3.org/Signature/2000/01/URI-IDREF.html
>
>[Dan's take on URIs and IDREFs is below and a worthwhile read, I'm
>trying to summarize the WG's position in response.]
>
>The reason I asked the XML schema editors about the URI datatype is
>because I needed to understand the syntactical validation constraints
>placed over that type. If it permits fragments, we will likely have to
>create a user-generated  type by specifying a [1]pattern facet over
>the string type. This is admittedly awkward, and I'm of a mixed mind
>on it as are many of the WG members, but the reasons for this follow:
>
>   [1] http://www.w3.org/TR/xmlschema-2/#dt-pattern
>
>The WG is presently doing two things "oddly" in its treatment of
>references.
> 1. Our present course is to define a URI-clean (sans the fragment),
>    such that:
>    URI-clean = [ absoluteURI | relativeURI ]
>    This is done because the treatment of XPATH/XSLT or other fragment
>    expressions in the context of a URI can be confusing. As  XPath is
>    a feature some WG members will want to use, the semantics of the
>    transform are very important to the signature and it makes sense
>    that they be explicitly represented as part of a transform. As
>    part of a transform that we identify the WG _can_ properly specify
>    any serialization or canonicalization necessary for XPATH/XSLT to
>    work for our application. (Given that serialization and attribute
>    order are purposefully not specified by those specs and are punted
>    to the application, I wonder how other applications will address
>    this issue (consistent serialization) when they are expressed
>    merely as part of a URI...)

Consistent serialization is rarely a problem except for signatures
where it is absolutely critical...
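
To illustrate why, here is a small sketch, assuming SHA-1 over the raw
bytes as the digest and two hand-written serializations of the same
element that differ only in attribute order (both the sample data and
the helper names are made up for illustration):

```python
import hashlib
import xml.etree.ElementTree as ET

# Two serializations of the same element; only attribute order differs.
first = b'<item id="1" kind="x"/>'
second = b'<item kind="x" id="1"/>'


def digest(data):
    """Hex SHA-1 digest of the serialized bytes."""
    return hashlib.sha1(data).hexdigest()


def same_element(x, y):
    """Compare parsed tag and attributes, ignoring attribute order."""
    ex, ey = ET.fromstring(x), ET.fromstring(y)
    return ex.tag == ey.tag and ex.attrib == ey.attrib


equivalent = same_element(first, second)
digests_match = digest(first) == digest(second)
```

An ordinary XML consumer sees the two forms as the same element, but a
digest over the bytes does not, which is the case a signature has to
pin down.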

> 2. However, we still need to support signature references to XML
>    elements within a local document. (Where a signature is enveloped
>    by or [2]enveloping XML content in the same document. ) Given our
>    URI definition it makes sense to rely upon IDREFs for this purpose
>    for the following reasons:
>
> 1. I believe this was the intent of ID/IDREF as specified in XML1.0.
> 2. This method permits those members not keen on XPath to reference
>    local XML elements (within the same document) by using presently
>    implemented XML and not having to support XPath immediately.

You don't have to implement any part of XPath to recognize "#token"
and treat it as an IDREF.  In principle, when you do this, you are
implementing a part of XPointer but it is such a tiny part that I find
it misleading to think of it that way.
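
As a rough sketch of how small that check is, assuming an ASCII-only
approximation of an XML name (the real Name production allows many more
characters) and a hypothetical helper called local_idref:

```python
import re

# ASCII-only approximation of an XML name without colons (an assumption;
# the XML 1.0 Name production is considerably broader).
_TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def local_idref(uri_ref):
    """Return token if uri_ref is exactly '#token', otherwise None."""
    if uri_ref.startswith("#") and _TOKEN.match(uri_ref[1:]):
        return uri_ref[1:]
    return None


examples = ["#obj1", "http://example.com/doc#obj1", "#xpointer(id('a'))", "#"]
results = [local_idref(u) for u in examples]
```

Anything that is not a bare "#token" (a full URI, an XPointer
expression, an empty fragment) falls through to whatever other URI
handling, or rejection, the application already has.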

>   [2] http://www.w3.org/Signature/Drafts/WD-xmldsig-core-20000114/#def-SignatureEnveloping
>
>However, there are a number of reasons/arguments not to do this
> 1. It is my understanding that ID/IDREFs are not thought of that
>    highly by Berners-Lee as they permit "closed-world" references:
>    "The local identifier space is a subset of URI space. When an
>    attribute is defined as a URI, the simple "#" prefix gives access
>    to the local ID space - while still allowing great power of
>    expression by reference to anything else on the Web. When the
>    "IDREF" form is used, this is not possible. The IDREF form is a
>    weak form IMHO and  not wise for new designs which are not to be
>    deliberately constraining."
>    [3]http://www.w3.org/DesignIssues/Syntax.html
> 2. For XML applications to understand IDREFs they need access to the
>    DTD. However, I've heard arguments that this is not the case.
>    (Though I'm not sure how relevant the DTD is in any case as this
>    document will have element types from two different
>    DTD/schemas: the document and the signature.)
> 3. The end result of this is rather kludgey as already noted.
>
>   [3] http://www.w3.org/DesignIssues/Syntax.html
>
>Consequently the following two arguments were forwarded:
> 1. [4]Boyer has proposed we use XPath (or some profile subset/hack)
>    for doing local references. Everyone must support this particular
>    XPath instance, though not the whole specification.

The kind of pattern matching against XPath Transforms you would have
to do is inherently more complex than handling an IDREF attribute or
recognizing a "#token" URI reference.

> 2. [5]Karlinger has seconded Boyer's argument, or even suggested that
>    any XPath specification of a URI needs to be interpreted in the
>    context of our application serialization and canonicalization
>    profile.
>
>   [4] http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2000JanMar/0011.html
>   [5] http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2000JanMar/0028.html
>
>However, [6]this issue was discussed at the FTF meeting last week with
>the result that:
>
>   [6] http://www.w3.org/Signature/Minutes/SanJose/#IDREF
>
>Schaad: let's stay with what we have until we hear a compelling
>argument that we understand and agree with before we move away from
>what we have.
>Reagle: what about the "clean-URI" type, no such thing. Result: Define
>our own 'clean-URI' XML datatype.
>
>...
>

Donald
===================================================================
 Donald E. Eastlake 3rd                    dee3@torque.pothole.com
 65 Shindegan Hill Rd, RR#1            lde008@noah.dma.isg.mot.com
 Carmel, NY 10512 USA     +1 914-276-2668(h)    +1 508-261-5434(w)

Received on Monday, 31 January 2000 21:41:29 UTC