Re: Review of SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs review, rev1.56

Adding some thoughts based on our experience.

On 2010-12-01, at 21:19, Chimezie Ogbuji wrote:

> Hey Andy.  Per your email today, I'm responding to the original email you
> referred to (for context).  Apparently, I didn't respond to this part of the
> thread - see my response inline below and to your most recent questions at
> the bottom.
> 
> On 10/7/10 7:41 PM, "Andy Seaborne" <andy.seaborne@epimorphics.com> wrote:
>> On 07/10/10 20:46, Chimezie Ogbuji wrote:
>>>> -------------
>>>> PUT /rdf-graphs/service/?graph=1  HTTP/1.1
>>>> Host: example.com
>>>> 
>>>> <?xml version='1.0' encoding='UTF-8'?>
>>>>    <rdf:RDF
>>>>       xml:base='http://example2.com/rdf-graphs/employees/'
>>>>       xmlns:rdf='...'>
>>>>       ...
>>>> </rdf:RDF>
>>>> -------------
>>>> ..snip..
>>>> Here, it starts as http://example.com/rdf-graphs/service/?graph=1, and
>>>> that is in-scope for determining ?graph=1. The base for the parsing of
>>>> the XML document (the external base):
>>>> http://example.com/rdf-graphs/service/?graph=1
>>>> ..snip..
>>>> It changes inside rdf:RDF element to
>>>> http://example2.com/rdf-graphs/employees/
>>>> ..snip..
>>>> I think the graph to PUT to is http://example.com/rdf-graphs/service/1
> 
>> The simple solution is to require the URI in ?graph= to be absolute.  We
>> are defining the naming convention here so we can make that condition.
> 
> It has been my experience that proper support for relative URI resolution is
> very useful for deploying the same content in different locations without
> having to change hardcoded absolute URIs and for that reason I prefer not to
> have this requirement.

We've been using this feature for a number of years, and only allow absolute URIs here. Allowing relative URIs raises a number of difficulties, starting with: relative to what? The client can end up with different expectations from the server's, even without xml:base muddying the waters.

Not allowing relative URIs has not proved to be a limitation.
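
For illustration only, an absolute-only check on ?graph= could look something like the sketch below. This is not our actual code; the function name graph_uri_from_target is made up for the example, and "absolute" is approximated here as "has a scheme".

```python
from urllib.parse import parse_qs, urlsplit


def graph_uri_from_target(request_target: str) -> str:
    """Return the ?graph= value, rejecting anything that is not absolute.

    Hypothetical helper: "absolute" is approximated as "has a scheme".
    """
    # parse_qs percent-decodes the value, so an encoded URI comes back readable.
    values = parse_qs(urlsplit(request_target).query).get("graph", [])
    if len(values) != 1:
        raise ValueError("expected exactly one graph parameter")
    graph = values[0]
    if not urlsplit(graph).scheme:
        # Relative reference such as "1": refuse rather than guess a base.
        raise ValueError(f"graph URI must be absolute, got {graph!r}")
    return graph
```

With this, a request for /rdf-graphs/service/?graph=1 is refused outright (a server would presumably answer with a 4xx), so the question of which base to resolve against never comes up.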

>> If we allow relative URIs:
>> 
>> The important point for me is that RFC 3986 describes a process in
>> section 5 that is applied when the relative URI is encountered.
> 
> Yes.
> 
>> I explain below why I think RFC 3986 describes a process to apply at the
>> point when a relative URI is found so if outside the XML document,
>> xml:base has not been encountered and does not apply.
> 
> I believe that xml:base does apply, because the rule of highest precedence in
> determining the Base URI is to use what is embedded in the document and the
> xml:base document "specifies the details of [this rule] for embedding base
> URI information in the specific case of XML documents."
> 
> In particular, "The attribute xml:base may be inserted in XML documents to
> specify a base URI other than the base URI of the document or external
> entity." - Start of 3 xml:base Attribute / XML Base (Second Edition)

By my reading that doesn't apply "outwards" to the document URI, but in any case I don't think we should allow relative URIs here.

- Steve

>> I don't understand how the scope of xml:base can be outside the XML
>> document, actually how it can be outside the rdf:RDF element for several
>> reasons:
> 
> The notion of the 'scope' of an @xml:base attribute (as used in the
> corresponding specification) is only relevant for resolving relative URIs
> within the document.  The use of @xml:base to determine the base against
> which embedded, relative graph URIs are resolved is not as a result of being
> in scope of @xml:base but rather because @xml:base is how XML documents
> (specifically) implement the resolution rule of highest precedence in RFC
> 3986.  I.e., the XML base spec says how you can embed a base URI in content
> and RFC 3986 governs whether and how it is used to resolve relative URIs
> external to the content.
> 
>> 1/ xml:base, by its definition, applies only to XML documents
>> 2/ xml:base is scoped to the element where it occurs
>> 3/ xml:base is itself subject to relative URI resolution so there is
>> a base of wider scope than the XML document.
>> 4/ An XML document can have several xml:base - if they apply outside
>> their element scope, which one applies?
> 
>> (1) has a practical matter - the request should be able to be dispatched
>> within the web server before the content is parsed.
> 
> By my reading of RFC 3986 (as it relates to resolution of relative URIs),
> *if* the content is an XML document then the resolution mechanism *has* to
> parse to determine if @xml:base exists otherwise it is not doing URI
> resolution in a compliant manner.
> 
> This certainly can be explicitly stated as something a user of the protocol
> needs to be aware of as a caveat of using ?graph=..relativeUri.. with
> payloads in a format that allows you to specify a base URI.   Note that this
> is true for any other RDF concrete syntax that has a mechanism for
> indicating an explicit Base URI in content such as the use of @base in
> Turtle.
> 
> Generally, such indications serve 2 separate purposes: 1) to embed a URI (in
> content) for the purpose of the use of RFC 3986 resolution rule #1 to
> resolve a relative URI external to the content and 2) to allow relative
> references within the content to be resolved WRT the specified base URI.
> 
> This first purpose is not as ubiquitous as the second but it follows from my
> reading of RFC 3986.
> 
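
To make the two readings concrete, here is a rough sketch that computes both candidate graph URIs for the PUT example at the top of the thread. The function name candidate_graph_uris is invented for the example, only an xml:base on the root element is looked at, and the rdf namespace is filled in because the original elides it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the predefined XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def candidate_graph_uris(request_uri: str, graph_param: str, body: bytes):
    """Return (resolved against request URI, resolved against embedded base).

    Hypothetical helper; only the root element's xml:base is considered.
    """
    by_request = urljoin(request_uri, graph_param)
    embedded = ET.fromstring(body).get(XML_BASE)
    if embedded is None:
        return by_request, by_request
    # xml:base may itself be relative, so resolve it against the request URI.
    document_base = urljoin(request_uri, embedded)
    return by_request, urljoin(document_base, graph_param)


body = (
    b"<?xml version='1.0' encoding='UTF-8'?>"
    b"<rdf:RDF xml:base='http://example2.com/rdf-graphs/employees/'"
    b" xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'/>"
)
by_request, by_embedded = candidate_graph_uris(
    "http://example.com/rdf-graphs/service/?graph=1", "1", body
)
print(by_request)   # request-URI reading
print(by_embedded)  # embedded-base reading
```

Note that the second value cannot be computed without parsing the request body, which is the dispatch concern raised under point (1).
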
>> [[ http://www.w3.org/TR/xmlbase/#rfc3986
>> 4.1 Relation to RFC 3986
>> ...
>>    1. The base URI is embedded in the document's content.
>> ...
>> This document specifies the details of rule #1 for embedding base URI
>> information in the specific case of XML documents.
>> ]]
>> 
>> The last part is significant.  xml:base is specific to XML documents.
> 
> 
> Yes.  If we change the example to
> 
> PUT /rdf-graphs/service/?graph=1  HTTP/1.1
> Host: example.com
> Content-type: text/SomeRdfSyntax
> 
> ..RDF document serialized in text/SomeRdfSyntax ..
> 
> Then this particular discussion (regarding @xml:base) wouldn't apply.  If
> the specification for text/SomeRdfSyntax doesn't indicate how to embed an
> (absolute) base URI in content then the service would need to respond with
> either 201 (Created) or 200 (OK) depending on whether a named graph in the store
> with a URI of http://example.com/rdf-graphs/service/1 already exists per
> (5.1.3).
> 
>> [[RFC 3986
>> Section 5:
>> 5 Reference Resolution
>> 
>>     This section defines the process of resolving a URI reference
>> within a context that allows relative references so that the result is a
>> string matching the <URI> syntax rule of Section 3.
>> ]]
>> 
>> so we are talking about the process of resolution.  That occurs when the
>> relative URI is encountered.
> 
> Yes.
> 
>> The diagram in RFC 3986 shows the nesting and shows the relative URI at
>> the center.  When the relative URI is encountered, the parser (and RFC
>> 3986 talks about parsers)
> 
> It looks like all mention of parsers in that document is specifically
> speaking about URI parsers not parsers of the request body.
> 
>> picks the base URI working out from that diagram.
> 
> Yes
> 
>> This is illustrated when xml:base is relative - it itself is resolved
>> using the current base at that point.  There is a base URI of wider scope
>> than the xml:base in the document.
> 
> Yes, but this notion of scope is only relevant within the document.  So, if
> the @xml:base is relative, it is resolved using " .. the base URI of the
> parent element [...], if one exists [...], otherwise the base URI of the
> document entity or external entity containing the element." -- 4.3 Matching
> URIs with base URIs
> 
>> [[RFC 3986 - 5.1 Establishing a Base URI
>> If the base URI is obtained from a URI reference, then that reference
>> must be converted to absolute form and stripped of any fragment
>> component prior to its use as a base URI.
>> ]]
>> 
>> In RDF/XML:
>> Document base:  http://example/A/
>> <rdf:RDF
>>     xml:base="base1/"
>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>     xmlns="http://example/">
>> 
>> => base is now <http://example/A/base1/> .
> 
> I'm not sure of the context of your use of 'base' here.  If you mean that
> any relative URI associated with a descendent node in the RDF/XML is
> resolved using <http://example/A/base1/> as the base URI, then yes, but RFC
> 3986 doesn't apply.
> 
> However if your context is outside the RDF/XML document, then RFC 3986 does
> apply and the only way the document base would be http://example/A/ is if
> that is the request URI of the message for which the document serves as
> content:
> 
> "The base URI of a document entity or an external entity is determined
> by RFC 3986 rules, namely, that the base URI is the URI used to retrieve the
> document entity or external entity." -- 4.2 Granularity of base URI
> information
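
As a small sketch of the step quoted from RFC 3986 section 5.1 (convert to absolute form, strip any fragment), applied to the RDF/XML example above. The helper name effective_xml_base is made up for the illustration.

```python
from urllib.parse import urldefrag, urljoin


def effective_xml_base(document_base: str, xml_base: str) -> str:
    """Resolve a possibly relative xml:base and drop any fragment.

    Hypothetical helper mirroring the RFC 3986 section 5.1 wording.
    """
    absolute = urljoin(document_base, xml_base)
    return urldefrag(absolute).url


base = effective_xml_base("http://example/A/", "base1/")
print(base)
```

Resolving "base1/" against http://example/A/ this way keeps the trailing slash, giving http://example/A/base1/, which matters for anything subsequently resolved against it.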
> 
>> Let me work through the parsing process for the HTTP request:
> 
> Ok.
> 
>> PUT /rdf-graphs/service/?graph=1  HTTP/1.1
>> Host: example.com
>> 
>> so resolving "1" working outwards through that diagram, we get a base of
>> http://example.com/rdf-graphs/service/?graph=1.
> 
> Only if (5.1.1) doesn't apply and the only ways in which it wouldn't are if
> a) there is no specified way for documents of the data format indicated by
> the media type to embed a base URI or b) there is none embedded in content.
> The resolution method can't simply skip 5.1.1 without doing the diligence of
> checking the media-type and parsing (if it needs to for the given media
> type) without failing to be compliant.
> 
> Of course, there could be no content-type, but you are already in a grey area
> at that point.
> 
>> It is also arguable that there is no base URI (or it's the application
>> dependent one) at this point.
> 
> I'm not sure how you would ever end up with this since (5.1.3) takes
> precedence over (5.1.4).  Since we are talking about resolution in the
> context of an HTTP request, you would always have the 'URI used to retrieve
> the entity'.
> 
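
For reference, the precedence order in RFC 3986 section 5.1 (5.1.1 embedded in content, 5.1.2 encapsulating entity, 5.1.3 URI used to retrieve the entity, 5.1.4 default base URI) can be sketched as below. The function and parameter names are invented for the example, and the sketch takes no position on whether an embedded base applies to ?graph=; it only shows the ordering once the inputs are known.

```python
from typing import Optional
from urllib.parse import urldefrag


def establish_base_uri(
    embedded: Optional[str] = None,       # 5.1.1
    encapsulating: Optional[str] = None,  # 5.1.2
    retrieval: Optional[str] = None,      # 5.1.3
    default: Optional[str] = None,        # 5.1.4
) -> str:
    """Pick the first available base in precedence order (hypothetical helper)."""
    for candidate in (embedded, encapsulating, retrieval, default):
        if candidate:
            # A base URI is used without its fragment component.
            return urldefrag(candidate).url
    raise ValueError("no base URI available")
```

In an HTTP request the retrieval argument is always available, so the default in 5.1.4 is never reached; the open question in this thread is only what, if anything, gets passed as embedded.
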
>> The base inside the XML content does not apply: it's not in context yet.
> 
> See above.
> 
>>  (This also follows from the definition of xml:base as it applies to XML
>> documents.)
>> 
>> <?xml version='1.0' encoding='UTF-8'?>
>> ** base URI is http://example.com/rdf-graphs/service/?graph=1
> 
> base URI of the document entity (as determined by RFC 3986 and the XML Base
> spec) 
> 
>>   <rdf:RDF
>>       **Base becomes 'http://example2.com/rdf-graphs/employees/'
> 
> Base URI of descendent nodes in the infoset (as determined by XML Base spec)
> 
>>       xml:base='http://example2.com/rdf-graphs/employees/'
>>        xmlns:rdf='...'>
>> 
>> At this point the base is that given by xml:base and by the xmlbase
>> spec, applies to the rdf:RDF only. Resolution of relative URIs in
>> xml:base provides evidence for this.
> 
> See above.
> 
>> ..snip..
>> For the ?graph=relURI, this leaves two possibilities:
>> 
>> 1/ there is no base URI, and ?graph= can not be a relative URI.
>> 2/ The base is the URI used to retrieve the entity.
>> 
>> (1) is made by arguing that the base URI is the sum total of the request
>> line and HTTP headers and starts immediately at the end of the headers
>> and so is not active at the point ?graph=1 is encountered.
> 
> See above.  Also, the URI spec doesn't say anything about incremental
> parsing (which is what your use of 'active' seems to suggest)
> 
>> (2) is made by arguing that the base URI comes from the entity URI in
>> the request. It's still arguable if it is active by the time ?graph= is
>> reached or whether it starts at the end of the method line.
> 
> This certainly makes sense as the 'fallback', but my contention is that RFC
> 3986 requires that you rule out (5.1.1) and it is pretty clear about how the
> media-type (and the specification for its data format) determines how 5.1.1
> is implemented (otherwise, the notion of precedence is a misnomer).
> 
> As for your later questions:
> 
>> (which one? 
> 
> If you mean which assertion, then the one I'm using is "This document
> specifies the details of rule #1 [...]" in the XML Base specification.
> 
>> why outside the doc?
> 
> Because the resolution happens outside the document despite the fact that
> the base URI is embedded inside the document (and used to resolve relative
> URIs within it). 
> 
>> syntax issues? 
> 
> You mean if the document is not well-formed for instance?  Well then the
> entire operation would need to fail with a 400 (Bad Request) in any case.
> 
>> xmlbase only applies to XML elements)
> 
> Within the document, yes.  However, outside the document, its URI can be
> used by 5.1.1
> 
> -- Chime 

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD

Received on Monday, 6 December 2010 14:24:24 UTC