
Re: #xmlbase [1234]

From: Jeremy Carroll <jjc@hpl.hp.com>
Date: Thu, 26 Apr 2007 21:06:28 +0100
Message-ID: <46310644.9080300@hpl.hp.com>
To: "Clark, John" <CLARKJ2@ccf.org>
CC: GRDDL Working Group <public-grddl-wg@w3.org>

So, it hinges on whether this text (RFC 3986, section 5.1.2) is or is not 
applicable to the RDF/XML produced by a GRDDL transform:

[[
    If no base URI is embedded, the base URI is defined by the
    representation's retrieval context.  For a document that is enclosed
    within another entity, such as a message or archive, the retrieval
    context is that entity.  Thus, the default base URI of a
    representation is the base URI of the entity in which the
    representation is encapsulated.

    A mechanism for embedding a base URI within MIME container types
    (e.g., the message and multipart types) is defined by MHTML
    [RFC2557].  Protocols that do not use the MIME message header syntax,
    but that do allow some form of tagged metadata to be included within
    messages, may define their own syntax for defining a base URI as part
    of a message.
]]

The second para seems irrelevant, and both our options agree on the 
first sentence of the first para.

Thus we are down to two sentences:
[[
    For a document that is enclosed within another entity, such as a
    message or archive, the retrieval context is that entity.  Thus,
    the default base URI of a representation is the base URI of the
    entity in which the representation is encapsulated.
]]
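
To make the precedence in RFC 3986 section 5.1 concrete, here is a minimal 
Python sketch. The function name, its parameters and the example.org / 
example.com URIs are all made up for illustration; nothing here is taken 
from the GRDDL spec or from any implementation. The first layer that 
supplies a base URI wins, so the whole question is whether the 
"encapsulating" slot is filled by the source document's base URI or left 
empty.

```python
def establish_base_uri(embedded=None, encapsulating=None,
                       retrieval=None, default=None):
    """Return the first available base URI, innermost layer first.

    Hypothetical helper: the four parameters mirror RFC 3986
    sections 5.1.1 to 5.1.4 in order.
    """
    for candidate in (embedded, encapsulating, retrieval, default):
        if candidate is not None:
            return candidate
    raise ValueError("no base URI available")


# Illustrative URIs only (assumptions, not from the thread).
RETRIEVAL_URI = "http://example.org/docs/source.xml"
SOURCE_BASE = "http://example.com/base/"

# Reading A: the source document counts as the encapsulating entity.
reading_a = establish_base_uri(encapsulating=SOURCE_BASE,
                               retrieval=RETRIEVAL_URI)

# Reading B: there is no encapsulating entity, so we fall through
# to the retrieval URI (5.1.3).
reading_b = establish_base_uri(retrieval=RETRIEVAL_URI)
```

With these inputs, reading A gives the source document's base URI and 
reading B gives the retrieval URI.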

In RFC 1808, section 3.2, the corresponding text appears to have been longer:
[[
Composite media types, such as the "multipart/*" and "message/*" media 
types defined by MIME (RFC 1521, [4]), define a hierarchy of retrieval 
context for their enclosed documents. In other words, the retrieval 
context of a component part is the base URL of the composite entity of 
which it is a part. Thus, a composite entity can redefine the retrieval 
context of its component parts via the inclusion of a base-header, and 
this redefinition applies recursively for a hierarchy of composite 
parts. Note that this might not change the base URL of the components, 
since each component may include an embedded base URL or base-header 
that takes precedence over the retrieval context.
]]
The last two sentences would be applicable in this case. The first does 
not seem to be. I am not sure if an XML doc with a GRDDL transform is a 
composite entity or not.

This text appears to have been deleted in this diff:
http://www.ics.uci.edu/~fielding/url/diff07to08.txt
I'm not sure what to read into that, if anything.

So choices:

1) base URI of RDF/XML is retrieval URI
2) base URI of RDF/XML is base URI of XML/XHTML
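
As a rough illustration of how the two choices differ for rdf:about="", 
here is a small Python sketch using urllib.parse.urljoin from the standard 
library. The helper name, the option numbering as a parameter, and the 
example URIs are assumptions of mine for the sake of the example.

```python
from urllib.parse import urljoin


def result_base_uri(retrieval_uri, source_embedded_base=None, option=1):
    """Pick the base URI for the RDF/XML result (hypothetical helper).

    option=1: always the retrieval URI of the source document.
    option=2: the base URI of the source document, i.e. its embedded
              base (if any) resolved against the retrieval URI.
    """
    if option == 1 or source_embedded_base is None:
        return retrieval_uri
    return urljoin(retrieval_uri, source_embedded_base)


# Illustrative values only.
RETRIEVAL = "http://example.org/docs/source.xml"
EMBEDDED = "http://example.com/base/"

# rdf:about="" is a same-document reference, so it resolves to the
# base URI itself.
about_option_1 = urljoin(result_base_uri(RETRIEVAL, EMBEDDED, option=1), "")
about_option_2 = urljoin(result_base_uri(RETRIEVAL, EMBEDDED, option=2), "")
```

Under (1) the empty reference names the retrieval URI; under (2) it names 
the embedded base of the source document. When the source has no embedded 
base, the two options coincide.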


Practical impact:

a) either way this is a moderately difficult aspect of the GRDDL Spec to 
get right.

b) authors of HTML transforms that reap hrefs will find (2) easier, 
because the GRDDL-aware agent processes html:base, and the transform 
doesn't have to.

c) authors of XML transforms that reap relative references will have 
more of a struggle with (2) than with (1). Under (2), the xml:base on 
the root element, if any, which modifies the base URI of the document 
(well, at least of the nodeset, which is all we actually care about), 
will be handled by the GRDDL agent, while the xml:base on non-root 
elements still needs to be handled by the XSLT. In either case, the 
library function embeddedRDF should do something (approx. what it 
currently does), so perhaps that should be wrapped up as a module that 
other GRDDL transforms can import.
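
To show what handling xml:base on non-root elements involves, here is a 
sketch in Python (standing in for what a transform would have to do in 
XSLT). It walks the tree, tracks the base URI in scope at each element, 
and resolves every href against it. The function name, the sample 
document and its href attributes are invented for illustration; this is 
not what embeddedRDF does.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the XML namespace URI.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def resolved_hrefs(element, inherited_base):
    """Collect href values resolved against the xml:base in scope.

    Hypothetical helper: `inherited_base` is whatever base URI the
    caller has already established for this element's parent.
    """
    # An xml:base attribute may itself be relative, so resolve it
    # against the inherited base; an absent attribute changes nothing.
    base = urljoin(inherited_base, element.get(XML_BASE, ""))
    found = []
    href = element.get("href")
    if href is not None:
        found.append(urljoin(base, href))
    for child in element:
        found.extend(resolved_hrefs(child, base))
    return found


# Made-up source document with xml:base on the root and on a child.
SOURCE = """<root xml:base="http://example.com/base/">
  <item xml:base="sub/"><link href="a.rdf"/></item>
  <link href="b.rdf"/>
</root>"""

root = ET.fromstring(SOURCE)
hrefs = resolved_hrefs(root, "http://example.org/docs/source.xml")
```

Here the walk starts from the retrieval URI and handles the root xml:base 
itself. Under (2) the agent would already have applied the root xml:base, 
so only the nested one ("sub/") would be left for the transform to deal 
with.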

I'll think through what happens during a transform, and send another 
message.

Jeremy



Clark, John wrote:
> I read the relevant specifications somewhat differently, which leads me
> to conclude that the test results as of 2007-04-25 were the correct
> results.  My analysis responds to Jeremy's below.
> 
>> =====
>> Stepping through again (skip to ISSUE for the divergence 
>> between what I was thinking, and what I think you think) ======
> 
>> ISSUE
>> =====
>>
>> rdf:about="" is a same document reference, that hence 
>> resolves to the baseURI of this intermediate representation.
> 
>> Chime points to section 5.1.3 of RFC 2396 see
>> http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.1
>> actually the picture is helpful.
>> I think Chime's argument is compelling.
>>
>>    |  .----------------------------------------------------.  |
>>    |  |  .----------------------------------------------.  |  |
>>    |  |  |  .----------------------------------------.  |  |  |
>>    |  |  |  |  .----------------------------------.  |  |  |  |
>>    |  |  |  |  |       <relative-reference>       |  |  |  |  |
>>    |  |  |  |  `----------------------------------'  |  |  |  |
>>    |  |  |  | (5.1.1) Base URI embedded in content   |  |  |  |
>>    |  |  |  `----------------------------------------'  |  |  |
>>    |  |  | (5.1.2) Base URI of the encapsulating entity |  |  |
>>    |  |  |         (message, representation, or none)   |  |  |
>>    |  |  `----------------------------------------------'  |  |
>>    |  | (5.1.3) URI used to retrieve the entity            |  |
>>    |  `----------------------------------------------------'  |
>>    | (5.1.4) Default Base URI (application-dependent)         |
>>    `----------------------------------------------------------'
>>
>> We have a relative reference "" inside an RDF/XML representation.
>> There is no base URI embedded in content, so 5.1.1 does not apply.
>> Thus we use 5.1.2 to get the Base URI of the 'encapsulating 
>> entity', which was the XML document above.
> 
> I don't think that it makes sense to think of the GRDDL source document
> as an 'encapsulating entity' for a GRDDL result associated with that
> document.  If this association held, what would be the context for
> encapsulating a GRDDL result within its GRDDL source document?  Would
> the context be the document, or would it be some element within that
> document?  XML Base provides us with (potentially) different base URIs
> for the document and for elements within that document; this is defined
> in section 4.2[0].  If the context will be an element, do we choose the
> element where the GRDDL transformation for a particular GRDDL result is
> referenced?  This may provide different base URIs for portions of GRDDL
> results produced by GRDDL transformations in an XHTML document, so what
> do we do about XHTML documents?  Finally, since a GRDDL result is a
> faithful rendition of the source document, wouldn't that make it an
> alternative to the source document rather than encapsulated within the
> source document?
> 
>> So we start again: the 'encapsulating entity' has a base URI 
>> embedded in content, so we never get to 5.1.3 URI used to 
>> retrieve the entity, and hence my current code that assumes 
>> that we do, is broken.
> 
> Because of my statement above, I think that we do get to 5.1.3, so we
> should use the retrieval URI of the GRDDL source document as the base
> URI of its GRDDL result(s).  I would not be opposed to the reading that
> Chime gives here, but I think some of the questions I raise might first
> need to be dealt with in the specification.
> 
> Take care,
> 
>     John

Received on Thursday, 26 April 2007 20:07:01 GMT
