Re: Review of SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs review, rev1.56

On 07/10/10 20:46, Chimezie Ogbuji wrote:
> On 10/6/10 6:25 AM, "Andy Seaborne"<andy.seaborne@epimorphics.com>  wrote:
>> The use of xml:base [1] applies to the XML element it is an attribute
>> of.  In XML, and Turtle, the base URI can change during parsing.
>>
>> -------------
>> PUT /rdf-graphs/service/?graph=1  HTTP/1.1
>> Host: example.com
>>
>> <?xml version='1.0' encoding='UTF-8'?>
>>     <rdf:RDF
>>        xml:base='http://example2.com/rdf-graphs/employees/'
>>        xmlns:rdf='...'>
>>        ...
>> </rdf:RDF>
>> -------------
>>
>> Here, it starts as http://example.com/rdf-graphs/service/?graph=1, and
>> that is in-scope for determining ?graph=1. The base for the parsing of
>> the XML document (the external base):
>> http://example.com/rdf-graphs/service/?graph=1
>>
>> It changes inside rdf:RDF element to
>> http://example2.com/rdf-graphs/employees/
>>
>> I think the graph to PUT to is http://example.com/rdf-graphs/service/1
>
> This is in conflict with my understanding of RFC3986 (in particular with the
> highest precedent: Base URI embedded in content).  According to the chain,
> the base is given by the embedded xml:base attribute first.

We may be talking at cross purposes.  The issue is a relative URI in the 
HTTP request line.

The simple solution is to require the URI in ?graph= to be absolute.  We
are defining the naming convention here so we can impose that condition.

If we allow relative URIs:

The important point for me is that RFC 3986 describes a process in 
section 5 that is applied when the relative URI is encountered.

I explain below why I think RFC 3986 describes a process to apply at the
point when a relative URI is found.  If that point is outside the XML
document, xml:base has not yet been encountered and does not apply.  I
provide supporting evidence for that view from the definition of
xml:base and from RFC 3986.

I don't understand how the scope of xml:base can extend outside the XML
document, or indeed outside the rdf:RDF element, for several reasons:

1/ xml:base, by its definition, applies only to XML documents
2/ xml:base is scoped to the element where it occurs
3/ xml:base is itself subject to relative URI resolution, so there is a
base of wider scope than the XML document.
4/ An XML document can have several xml:base attributes - if they apply
outside their element scope, which one applies?


(1) is also a practical matter - the request should be able to be
dispatched within the web server before the content is parsed.
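
As a rough illustration of points 2/ to 4/, here is a minimal Python
sketch that walks an XML document and records the base URI in scope at
each element.  The element names (root, a, a1, b, b1, c) and the
function name in_scope_bases are hypothetical, made up for this example,
and it assumes urllib.parse.urljoin is a close enough stand-in for
RFC 3986 section 5 resolution.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the predefined XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical document with two xml:base attributes, one of them relative.
DOC = """<root>
  <a xml:base="http://example2.com/rdf-graphs/employees/"><a1/></a>
  <b xml:base="other/"><b1/></b>
  <c/>
</root>"""

# The base in force before the XML document is parsed (the external base).
EXTERNAL = "http://example.com/rdf-graphs/service/?graph=1"


def in_scope_bases(xml_text, external_base):
    """Return (tag, base URI) pairs, one per element, in document order."""
    results = []

    def walk(element, inherited_base):
        declared = element.get(XML_BASE)
        # A declared xml:base is itself resolved against the base that is
        # already in scope, and then applies to this element and below.
        base = urljoin(inherited_base, declared) if declared else inherited_base
        results.append((element.tag, base))
        for child in element:
            walk(child, base)

    walk(ET.fromstring(xml_text), external_base)
    return results


for tag, base in in_scope_bases(DOC, EXTERNAL):
    print(tag, base)
```

In this sketch <c/> and <root> keep the external base, and neither
xml:base leaks out of the element that carries it.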

[[ http://www.w3.org/TR/xmlbase/#rfc3986
4.1 Relation to RFC 3986
...
    1. The base URI is embedded in the document's content.
...
This document specifies the details of rule #1 for embedding base URI 
information in the specific case of XML documents.
]]

The last part is significant.  xml:base is specific to XML documents.


[[RFC 3986
Section 5:
5 Reference Resolution

     This section defines the process of resolving a URI reference 
within a context that allows relative references so that the result is a 
string matching the <URI> syntax rule of Section 3.
]]

so we are talking about the process of resolution.  That occurs when the 
relative URI is encountered.

The diagram in RFC 3986 shows the nesting and shows the relative URI at 
the center.  When the relative URI is encountered, the parser (and RFC 
3986 talks about parsers) picks the base URI working out from that diagram.

This is illustrated when xml:base is relative - it is itself resolved
using the current base at that point.  There is a base URI of wider
scope than the xml:base in the document.

[[RFC 3986 - 5.1 Establishing a Base URI
If the base URI is obtained from a URI reference, then that reference 
must be converted to absolute form and stripped of any fragment 
component prior to its use as a base URI.
]]

In RDF/XML:
Document base:  http://example/A/
<rdf:RDF
     xml:base="base1/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns="http://example/">

=> base is now <http://example/A/base1/> .
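
The same resolution can be checked with a short Python sketch.  It
assumes urllib.parse.urljoin as the resolver; effective_base is a
hypothetical helper name, not something from either spec.

```python
from urllib.parse import urldefrag, urljoin


def effective_base(current_base, xml_base_value):
    """Resolve an xml:base value against the base currently in scope."""
    resolved = urljoin(current_base, xml_base_value)
    # RFC 3986 5.1: strip any fragment before using the result as a base.
    return urldefrag(resolved).url


print(effective_base("http://example/A/", "base1/"))
# prints http://example/A/base1/
```

The trailing slash of "base1/" is kept in the result.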


Let me work through the parsing process for the HTTP request:

PUT /rdf-graphs/service/?graph=1  HTTP/1.1
Host: example.com

so resolving "1" working outwards through that diagram, we get a base of 
http://example.com/rdf-graphs/service/?graph=1.
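
A minimal Python sketch of that step, assuming the request URI is
rebuilt from the Host header plus the request target and the scheme is
http.  The helper names request_uri and resolve_graph_param are
hypothetical.

```python
from urllib.parse import parse_qs, urljoin, urlsplit


def request_uri(host, request_target, scheme="http"):
    """Rebuild the absolute request URI from Host and the request line."""
    return f"{scheme}://{host}{request_target}"


def resolve_graph_param(host, request_target):
    """Resolve the ?graph= value against the request URI itself."""
    base = request_uri(host, request_target)
    graph = parse_qs(urlsplit(base).query)["graph"][0]
    # No XML has been read at this point, so no xml:base is involved.
    return urljoin(base, graph)


print(resolve_graph_param("example.com", "/rdf-graphs/service/?graph=1"))
# prints http://example.com/rdf-graphs/service/1
```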

It is also arguable that there is no base URI (or that it is the
application-dependent one) at this point.


The base inside the XML content does not apply: it's not in context yet.
(This also follows from the definition of xml:base, which applies only
to XML documents.)

<?xml version='1.0' encoding='UTF-8'?>
** base URI is http://example.com/rdf-graphs/service/?graph=1
   <rdf:RDF
       **Base becomes 'http://example2.com/rdf-graphs/employees/'
       xml:base='http://example2.com/rdf-graphs/employees/'
        xmlns:rdf='...'>

At this point the base is the one given by xml:base and, by the xmlbase
spec, it applies to the rdf:RDF element only.  Resolution of relative
URIs in xml:base provides evidence for this.

If the xml:base is itself relative:
[[ RFC 3986
If the base URI is obtained from a URI reference, then that reference 
must be converted to absolute form and stripped of any fragment 
component prior to its use as a base URI.
]]


For the ?graph=relURI, this leaves two possibilities:

1/ There is no base URI, and ?graph= cannot be a relative URI.
2/ The base is the URI used to retrieve the entity.

The case for (1) is that the base URI is derived from the request line
and HTTP headers taken together, so it only takes effect at the end of
the headers and is not yet active at the point ?graph=1 is encountered.

The case for (2) is that the base URI comes from the entity URI in the
request.  It's still arguable whether it is active by the time ?graph=
is reached or whether it only starts at the end of the method line.
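
To show how the two possibilities would differ for a server, here is a
rough Python sketch.  The function graph_name and its allow_relative
flag are hypothetical, not anything taken from the draft; with the flag
off it behaves like (1), with the flag on like (2).  Either way the
decision needs only the request line and headers, not the body.

```python
from urllib.parse import urljoin, urlsplit


def graph_name(request_uri, graph_value, allow_relative):
    """Pick the graph URI for a ?graph= value under either policy."""
    if urlsplit(graph_value).scheme:
        # Already absolute: both policies agree.
        return graph_value
    if not allow_relative:
        # Possibility 1: no base URI is in scope, so reject the request.
        raise ValueError("?graph= must be an absolute URI")
    # Possibility 2: the base is the URI used to retrieve the entity.
    return urljoin(request_uri, graph_value)


REQUEST = "http://example.com/rdf-graphs/service/?graph=1"
print(graph_name(REQUEST, "1", allow_relative=True))
# prints http://example.com/rdf-graphs/service/1
```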

	Andy

Received on Thursday, 7 October 2010 23:41:38 UTC