RE: DSig comments on XML Base

At 16:07 2000 07 26 -0700, John Boyer wrote:
>-----Original Message-----
>From: Paul Grosso [mailto:pgrosso@arbortext.com]
>Sent: Wednesday, July 26, 2000 12:29 PM
>
>What does C14N do about relative URIs in external entities in the
>absence of any xml:base?
>
><john>
>c14n is not a big consumer of xml:base per se.  It will use it the same way
>that Xpath will use it, namely for resolving relative URIs in namespace
>declarations, unless the W3C decides to stick with the Namespaces
>recommendation and patch XPath so that relative URIs are not absolutized (in
>which case, c14n will not directly use xml:base at all).
></john>

My point above is that the issue you have isn't with XML Base,
because all XML Base does is give you another way to set the
base URI, but you already have the problem of different base
URIs in different external entities even if XML Base is not
considered.
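
To make that concrete, here is a minimal sketch in Python, assuming two
made-up entity URIs and no xml:base anywhere.  It uses the standard
library's urllib.parse.urljoin for the resolution step (current Python
versions follow RFC 3986, the successor to RFC 2396; for a simple
reference like this one the result should be the same).

```python
from urllib.parse import urljoin

# Hypothetical URIs: the document entity and an external parsed entity
# that the document includes. Neither contains an xml:base attribute.
DOC_ENTITY_URI = "http://example.org/docs/main.xml"
EXT_ENTITY_URI = "http://example.org/shared/parts/chapter.xml"

# The same relative reference appears in both entities.
relative_ref = "images/fig1.png"

# Resolved against the entity in which it physically appears.
from_doc = urljoin(DOC_ENTITY_URI, relative_ref)
from_ext = urljoin(EXT_ENTITY_URI, relative_ref)

print(from_doc)  # http://example.org/docs/images/fig1.png
print(from_ext)  # http://example.org/shared/parts/images/fig1.png
```

The two results differ even though XML Base plays no part in the example.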

>
>and how does xml:base cause a problem that
>isn't already there with external entities?
>
><john>
>The problem isn't with any functionality that c14n needs, but rather with
>the interpretation of a document being different from the canonicalized
>document.  XML Base is (or will be) a recommendation that all XML
>applications should resolve relative URIs *wherever they may be in the
>application content* by using the rules given in xml:base.

No, it is not.  XML Base as a spec merely provides the way for
a resource of MIME type */xml to specify the base URI within
the document content (see RFC 2396 section 5.1.1).  It does NOT
say how to determine what is a relative URI and which relative
URIs should be resolved.  It isn't even normative about how to 
resolve a relative URI given one and the base URI--it merely
restates what RFC 2396 says normatively about this.  It is the
job of any spec using XML Base to describe how to determine what
it means to that application for something to be a relative URI
within the application content.  C14N is one such application.
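
As a rough sketch of that division of labor, the Python below only
computes the base URI in scope for each element from xml:base attributes;
it does not decide which content is a relative URI, since that is left to
the application.  The function name, the sample document and the document
URI are all hypothetical, and the sketch ignores external entities
entirely.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree reports xml:base under the predefined "xml" namespace URI.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def bases_in_scope(root, document_uri):
    """Return (tag, base URI) pairs in document order (hypothetical helper)."""
    pairs = []

    def walk(elem, inherited):
        # An xml:base value may itself be relative; resolve it against the
        # base inherited from the parent. An absent attribute inherits as-is.
        base = urljoin(inherited, elem.get(XML_BASE, ""))
        pairs.append((elem.tag, base))
        for child in elem:
            walk(child, base)

    walk(root, document_uri)
    return pairs


doc = ET.fromstring(
    '<doc xml:base="http://example.org/a/">'
    '<sec xml:base="b/"><p/></sec><q/></doc>'
)
for tag, base in bases_in_scope(doc, "http://example.org/main.xml"):
    print(tag, base)
```

What an application then does with those base URIs (which attribute
values or character data it treats as URI references) is outside what
this code, or XML Base, defines.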

>An application uses an XML processor to obtain XML content, so the
>application is unlikely to know whether the content was the result of an
>external entity or an internal entity because this information is not
>reported to the application.
>
>So, suppose the application has an element E containing char data that the
>application knows is a relative URI.  Further, suppose an ancestor element
>of E contains an xml:base attribute.  How does the application know whether
>or not to apply the xml:base to the relative URI?

If the application is built on the Infoset, then the Infoset carries
the correct base URI information for each element.

In addition, suppose there are no xml:base attributes at all.  Then
how does your application know which external entity's URI to use
as the base URI to resolve the relative URI?  If you can solve this
issue, you can probably handle xml:base, and if you can't solve this
issue, then you've already got a problem that has nothing to do with
XML Base.
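
A minimal sketch of what "built on the Infoset" could look like from the
application's side, assuming a hypothetical ElementInfo record standing in
for an element information item whose base URI has already been filled in
by the parser layer (the class, its fields and the URIs are illustrative,
not taken from any spec or library):

```python
from dataclasses import dataclass
from urllib.parse import urljoin


@dataclass
class ElementInfo:
    """Hypothetical stand-in for an Infoset element information item."""
    name: str
    base_uri: str   # supplied by the parser/Infoset layer, not the application
    text: str = ""  # character data the application knows is a URI reference


def resolve_content_uri(element: ElementInfo) -> str:
    """Resolve the element's URI-valued content against its own base URI."""
    return urljoin(element.base_uri, element.text.strip())


# Two elements with identical content: one from the document entity, one
# from an external entity (both URIs are made up for the example).
in_document = ElementInfo("E", "http://example.org/docs/main.xml", "notes.xml")
in_external = ElementInfo("E", "http://example.org/shared/part.xml", "notes.xml")

print(resolve_content_uri(in_document))  # http://example.org/docs/notes.xml
print(resolve_content_uri(in_external))  # http://example.org/shared/notes.xml
```

The application never needs to know whether the base URI came from an
entity boundary or from an xml:base attribute; it only reads the property.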

>See below for more...
></john>
>
>See also some of the following:
>
>http://lists.w3.org/Archives/Public/www-xml-linking-comments/2000JulSep/0056
>http://lists.w3.org/Archives/Public/www-xml-linking-comments/2000JulSep/0047
>http://lists.w3.org/Archives/Public/www-xml-linking-comments/2000JulSep/0062
>
>and the rest of the many messages in this archive on this issue.
>
><john>
>In these emails, you rationalize the fact that xml:base should not apply to
>external entities by using the following citation from XML 1.0 from section
>4.2.2 [1]:

Actually, this is not a rationale for what xml:base should do; it is
quoted in support of the position that relative URIs in external
entities--having nothing to do with XML Base--should be resolved
relative to the base URI of the external entity in which they appear.

>"Unless otherwise provided by information outside the scope of this
>specification (e.g. a special XML element type defined by a particular DTD,
>or a processing instruction defined by a particular application
>specification), relative URIs are relative to the location of the resource
>within which the entity declaration occurs"
>
>[1] http://www.w3.org/TR/REC-xml#sec-external-ent
>
>I have several comments about this.
>
>Firstly, the citation justifies a *default* base URI, so there is no
>justification for explicitly cutting off xml:base, which is used to override
>the default established by the citation.

You are using different terminology from RFC 2396, which talks of
"Establishing a Base URI" (section 5.1) and then gives four sources
for the base URI in order of precedence.  XML Base is precisely the
first of those four (5.1.1. Base URI within Document Content),
whereas the URI of the external entity (see RFC 2396 5.1.3. Base URI
from the Retrieval URI) is the third.  (Note that the term "entity"
in Section 5.1.2 Base URI from the Encapsulating Entity of RFC 2396
is used in the MIME encapsulation sense, not the XML external parsed
entity sense.)
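
The precedence in RFC 2396 section 5.1 can be sketched as a simple
first-match lookup.  The function and parameter names below are
hypothetical, and the sketch assumes each supplied value is already an
absolute URI:

```python
from typing import Optional


def establish_base_uri(
    embedded: Optional[str],       # 5.1.1 within document content (e.g. xml:base)
    encapsulating: Optional[str],  # 5.1.2 encapsulating (MIME) entity
    retrieval: Optional[str],      # 5.1.3 URI used to retrieve the entity
    app_default: Optional[str],    # 5.1.4 application-dependent default
) -> str:
    """Return the highest-precedence base URI that is available."""
    for candidate in (embedded, encapsulating, retrieval, app_default):
        if candidate:
            return candidate
    raise ValueError("no base URI available")


# With no xml:base, the retrieval URI of the external entity is used.
print(establish_base_uri(None, None, "http://example.org/ent.xml", None))

# An xml:base value, when present, takes precedence over the retrieval URI.
print(establish_base_uri("http://example.org/base/", None,
                         "http://example.org/ent.xml", None))
```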

>Secondly, the citation explicitly allows for the possibility that overrides
>to this default behavior are possible.  Therefore, the citation
>substantiates the possibility that xml:base could be applied to descendant
>content derived from external entities.  I do not see the citation
>justifying or mandating that xml:base settings should not apply to content
>derived from external entities.
>
>Thirdly, I provided a compelling citation from the XML 1.0 spec that seems
>to override your interpretation of [1].  The following can be read at [2]:
>
>[2] http://www.w3.org/TR/REC-xml#included
>
>"4.4.2 Included
>
>An entity is included when its replacement text is retrieved and processed,
>in place of the reference itself, as though it were part of the document at
>the location the reference was recognized. The replacement text may contain
>both character data and (except for parameter entities) markup, which must
>be recognized in the usual way, except that the replacement text of entities
>used to escape markup delimiters (the entities amp, lt, gt, apos, quot) is
>always treated as data. (The string "AT&amp;T;" expands to "AT&T;" and the
>remaining ampersand is not recognized as an entity-reference delimiter.) A
>character reference is included when the indicated character is processed in
>place of the reference itself."

I'm not trying to be a Philadelphia lawyer here--no question that the
XML 1.0 spec is not clear on this issue.  The problem is that 4.4.2
"Included" (if you look at the preceding table) is really only used
for internal entities; the table entry for external entities links
to 4.4.3 "Included If Validating", which doesn't have any wording that
is really helpful for our debate.

>The key phrase is *as though it were part of the document at the location
>the reference was recognized*.  

And that is true for internal entities.

>The issue I'm raising is that applications
>do not know that content was derived by internal or external means because
>the content appears as though it were part of the document.  Therefore,
>applications cannot make use of xml:base when its setting is given by an ancestor
>element.

But that is the tail wagging the dog.  Applications should be built
on the Infoset (or equivalent model), and the infoset should provide
the necessary base URI info for whatever needs the application may
have.  And this is necessary even in the absence of XML Base.

paul

Received on Thursday, 27 July 2000 13:15:55 UTC