Re: Review of SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs review, rev1.56

Hey Andy.  Per your email today, I'm responding to the original email you
referred to (for context).  Apparently, I didn't respond to this part of the
thread - see my response inline below and to your most recent questions at
the bottom.

On 10/7/10 7:41 PM, "Andy Seaborne" <andy.seaborne@epimorphics.com> wrote:
> On 07/10/10 20:46, Chimezie Ogbuji wrote:
>>> -------------
>>> PUT /rdf-graphs/service/?graph=1  HTTP/1.1
>>> Host: example.com
>>> 
>>> <?xml version='1.0' encoding='UTF-8'?>
>>>     <rdf:RDF
>>>        xml:base='http://example2.com/rdf-graphs/employees/'
>>>        xmlns:rdf='...'>
>>>        ...
>>> </rdf:RDF>
>>> -------------
>>>..snip..
>>> Here, it starts as http://example.com/rdf-graphs/service/?graph=1, and
>>> that is in-scope for determining ?graph=1. The base for the parsing of
>>> the XML document (the external base):
>>> http://example.com/rdf-graphs/service/?graph=1
>>>..snip..
>>> It changes inside rdf:RDF element to
>>> http://example2.com/rdf-graphs/employees/
>>>..snip..
>>> I think the graph to PUT to is http://example.com/rdf-graphs/service/1

> The simple solution is to require the URI in ?graph= to be absolute.  We
> are defining the naming convention here so we can make that condition.

It has been my experience that proper support for relative URI resolution is
very useful for deploying the same content in different locations without
having to change hardcoded absolute URIs and for that reason I prefer not to
have this requirement.
 
> If we allow relative URIs:
> 
> The important point for me is that RFC 3986 describes a process in
> section 5 that is applied when the relative URI is encountered.

Yes.
 
> I explain below why I think RFC 3986 describes a process to apply at the
> point when a relative URI is found so if outside the XML document,
> xml:base has not been encountered and does not apply.

I believe that xml:base does apply, because the rule of highest precedence in
determining the base URI is to use what is embedded in the document, and the
XML Base specification "specifies the details of [this rule] for embedding
base URI information in the specific case of XML documents."

In particular, "The attribute xml:base may be inserted in XML documents to
specify a base URI other than the base URI of the document or external
entity." - Start of 3 xml:base Attribute / XML Base (Second Edition)
 
> I don't understand how the scope of xml:base can be outside the XML
> document, actually how it can be outside the rdf:RDF element for several
> reasons:

The notion of the 'scope' of an @xml:base attribute (as used in the
corresponding specification) is only relevant for resolving relative URIs
within the document.  @xml:base determines the base against which relative
graph URIs (embedded in the request URI) are resolved not because those URIs
are in scope of @xml:base, but because @xml:base is how XML documents
(specifically) implement the resolution rule of highest precedence in RFC
3986.  I.e., the XML Base spec says how you can embed a base URI in content,
and RFC 3986 governs whether and how it is used to resolve relative URIs
external to the content.
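
To make the two positions concrete, here is a minimal sketch using Python's
standard urllib.parse.urljoin, which implements RFC 3986 style reference
resolution.  The URIs are the ones from the PUT example quoted at the top;
the variable names are only illustrative.

```python
from urllib.parse import urljoin

# Values taken from the PUT example quoted at the top of this message.
request_uri = "http://example.com/rdf-graphs/service/?graph=1"
xml_base = "http://example2.com/rdf-graphs/employees/"
graph_ref = "1"  # the relative reference carried in ?graph=

# Reading A: the request URI is the base (RFC 3986, section 5.1.3).
graph_from_request = urljoin(request_uri, graph_ref)

# Reading B: the base embedded in the content is the base (section 5.1.1).
graph_from_xml_base = urljoin(xml_base, graph_ref)

print(graph_from_request)
print(graph_from_xml_base)
```

Reading A gives http://example.com/rdf-graphs/service/1 and reading B gives
http://example2.com/rdf-graphs/employees/1, so the choice of base decides
which named graph the PUT addresses.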
 
> 1/ xml:base, by its definition, applies only to XML documents
> 2/ xml:base is scoped to the element where it occurs
> 3/ xml:base is itself subject to relative URI resolution so there is
> base of wider scope than the XML document.
> 4/ An XML document can have several xml:base - if they apply outside
> their element scope, which one applies?

> (1) has a practical matter - the request should be able to be dispatched
> within the web server before the content is parsed.

By my reading of RFC 3986 (as it relates to resolution of relative URIs),
*if* the content is an XML document then the resolution mechanism *has* to
parse it to determine whether @xml:base exists; otherwise it is not doing
URI resolution in a compliant manner.

This certainly can be explicitly stated as something a user of the protocol
needs to be aware of as a caveat of using ?graph=..relativeUri.. with
payloads in a format that allows you to specify a base URI.   Note that this
is true for any other RDF concrete syntax that has a mechanism for
indicating an explicit Base URI in content such as the use of @base in
Turtle.

Generally, such indications serve 2 separate purposes: 1) to embed a URI (in
content) for the purpose of the use of RFC 3986 resolution rule #1 to
resolve a relative URI external to the content and 2) to allow relative
references within the content to be resolved WRT the specified base URI.

This first purpose is not as ubiquitous as the second but it follows from my
reading of RFC 3986.
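
As a rough illustration of purpose 1, the sketch below pulls an embedded
base URI out of a payload according to its media type.  It is only a sketch
under stated assumptions: the function name embedded_base is made up, the
media types application/rdf+xml and text/turtle are assumed, only the
document element is checked for xml:base, and the Turtle branch is a
simplified regular expression rather than a real Turtle parser.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:base attribute under the XML namespace URI.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def embedded_base(media_type, body):
    """Return the base URI declared in the content, or None if there is none.

    Hypothetical helper: `body` is the payload as a string.
    """
    if media_type == "application/rdf+xml":
        # Simplification: only the document element is inspected.
        return ET.fromstring(body).get(XML_BASE)
    if media_type == "text/turtle":
        # Simplification: first "@base <...> ." directive at a line start.
        match = re.search(r"^\s*@base\s+<([^>]*)>\s*\.", body, re.MULTILINE)
        return match.group(1) if match else None
    # No known way to embed a base URI for this media type.
    return None
```

A media type with no mechanism for embedding a base URI simply yields None
here, which is the case where the request URI would be used instead.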

> [[ http://www.w3.org/TR/xmlbase/#rfc3986
> 4.1 Relation to RFC 3986
> ...
>     1. The base URI is embedded in the document's content.
> ...
> This document specifies the details of rule #1 for embedding base URI
> information in the specific case of XML documents.
> ]]
> 
> The last part is significant.  xml:base is specific to XML documents.


Yes.  If we change the example to

PUT /rdf-graphs/service/?graph=1  HTTP/1.1
Host: example.com
Content-type: text/SomeRdfSyntax

..RDF document serialized in text/SomeRdfSyntax ..

Then this particular discussion (regarding @xml:base) wouldn't apply.  If
the specification for text/SomeRdfSyntax doesn't indicate how to embed an
(absolute) base URI in content, then the service would need to respond with
either 201 (Created) or 200 (OK), depending on whether a named graph with a
URI of http://example.com/rdf-graphs/service/1 (resolved per 5.1.3) already
exists in the store.
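
The status code choice can be sketched as follows.  This assumes a toy
store modelled as a Python set of graph URIs; put_status is a hypothetical
name, not something defined by the protocol document.

```python
def put_status(existing_graphs, graph_uri):
    """Return the status for a successful PUT to graph_uri.

    existing_graphs is a set of graph URIs standing in for the store
    (an assumption made for this sketch).
    """
    if graph_uri in existing_graphs:
        return 200  # OK: a named graph with this URI already existed
    existing_graphs.add(graph_uri)
    return 201  # Created: no named graph with this URI existed before


store = set()
first = put_status(store, "http://example.com/rdf-graphs/service/1")
second = put_status(store, "http://example.com/rdf-graphs/service/1")
```

Here first is 201 and second is 200, since the second request finds the
graph already in the store.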

> [[RFC 3986
> Section 5:
> 5 Reference Resolution
> 
>      This section defines the process of resolving a URI reference
> within a context that allows relative references so that the result is a
> string matching the <URI> syntax rule of Section 3.
> ]]
> 
> so we are talking about the process of resolution.  That occurs when the
> relative URI is encountered.

Yes.

> The diagram in RFC 3986 shows the nesting and shows the relative URI at
> the center.  When the relative URI is encountered, the parser (and RFC
> 3986 talks about parsers)

It looks like every mention of parsers in that document refers specifically
to URI parsers, not parsers of the request body.

> picks the base URI working out from that diagram.

Yes

> This is illustrated when xml:base is relative - it itself is resolved
> using the current base at that point.  There is base URI of wider scope
> than the xml:base in the document.

Yes, but this notion of scope is only relevant within the document.  So, if
the @xml:base is relative, it is resolved using " .. the base URI of the
parent element [...], if one exists [...], otherwise the base URI of the
document entity or external entity containing the element." -- 4.3 Matching
URIs with base URIs
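
A small sketch of that rule, again with urllib.parse.urljoin.  The relative
xml:base value "employees/" and the reference "smith" are hypothetical,
chosen only to show the two resolution steps; the document entity base is
the request URI from the PUT example.

```python
from urllib.parse import urljoin

# Base URI of the document entity: the request URI from the PUT example.
document_base = "http://example.com/rdf-graphs/service/?graph=1"

# Hypothetical relative xml:base on the document element (no parent element),
# so it is resolved against the base URI of the document entity.
relative_xml_base = "employees/"
effective_base = urljoin(document_base, relative_xml_base)

# A hypothetical relative reference inside that element then resolves
# against the effective base.
resolved = urljoin(effective_base, "smith")

print(effective_base)
print(resolved)
```

This prints http://example.com/rdf-graphs/service/employees/ followed by
http://example.com/rdf-graphs/service/employees/smith.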

> [[RFC 3986 - 5.1 Establishing a Base URI
> If the base URI is obtained from a URI reference, then that reference
> must be converted to absolute form and stripped of any fragment
> component prior to its use as a base URI.
> ]]
> 
> In RDF/XML:
> Document base:  http://example/A/
> <rdf:RDF
>      xml:base="base1/"
>      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>      xmlns="http://example/">
> 
> => base is now <http://example/A/base1> .

I'm not sure of the context of your use of 'base' here.  If you mean that
any relative URI associated with a descendent node in the RDF/XML is
resolved using <http://example/A/base1> as the base URI, then yes, but RFC
3986 doesn't apply.

However if your context is outside the RDF/XML document, then RFC 3986 does
apply and the only way the document base would be http://example/A/ is if
that is the request URI of the message for which the document serves as
content:

"The base URI of a document entity or an external entity is determined
by RFC 3986 rules, namely, that the base URI is the URI used to retrieve the
document entity or external entity." -- 4.2 Granularity of base URI
information

> Let me work though the parsing process for the HTTP request:

Ok.
 
> PUT /rdf-graphs/service/?graph=1  HTTP/1.1
> Host: example.com
> 
> so resolving "1" working outwards through that diagram, we get a base of
> http://example.com/rdf-graphs/service/?graph=1.

Only if (5.1.1) doesn't apply, and the only ways in which it wouldn't are if
a) there is no specified way for documents of the data format indicated by
the media type to embed a base URI or b) there is none embedded in content.
To be compliant, the resolution method can't simply skip 5.1.1; it has to do
the diligence of checking the media type and parsing the content (if it
needs to for the given media type).

Of course, there could be no Content-Type, but you are already in a grey
area at that point.
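
The precedence I am describing can be sketched like this.  The function
name and its parameters are made up for illustration; each argument is a
candidate base URI, or None when that source is unavailable (for 5.1.1,
None covers both case a and case b above).

```python
def establish_base_uri(embedded=None, encapsulating=None,
                       retrieval=None, default=None):
    """Pick a base URI in the order of RFC 3986, sections 5.1.1 to 5.1.4.

    Returns a (section, base_uri) pair for the first candidate present.
    """
    candidates = (
        ("5.1.1", embedded),       # base URI embedded in content
        ("5.1.2", encapsulating),  # base URI from the encapsulating entity
        ("5.1.3", retrieval),      # base URI from the retrieval URI
        ("5.1.4", default),        # default (application-dependent) base URI
    )
    for section, base in candidates:
        if base is not None:
            return section, base
    raise ValueError("no base URI available")


with_xml_base = establish_base_uri(
    embedded="http://example2.com/rdf-graphs/employees/",
    retrieval="http://example.com/rdf-graphs/service/?graph=1",
)
without_xml_base = establish_base_uri(
    retrieval="http://example.com/rdf-graphs/service/?graph=1",
)
```

With an embedded base the first call selects it under 5.1.1; without one,
the second call falls through to the request URI under 5.1.3.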

> It is also arguable that there is no base URI (or it's the application
> dependent one) at this point.

I'm not sure how you would ever end up with this since (5.1.3) takes
precedence over (5.1.4).  Since we are talking about resolution in the
context of an HTTP request, you would always have the 'URI used to retrieve
the entity'.

> The base inside the XML content does not apply: it's not in context yet.

See above.

>   (This also follows in definition of xml:base as it applies to XML
> documents.)
> 
> <?xml version='1.0' encoding='UTF-8'?>
> ** base URI is http://example.com/rdf-graphs/service/?graph=1

base URI of the document entity (as determined by RFC 3986 and the XML Base
spec) 

>    <rdf:RDF
>        **Base becomes 'http://example2.com/rdf-graphs/employees/'

Base URI of descendent nodes in the infoset (as determined by XML Base spec)

>        xml:base='http://example2.com/rdf-graphs/employees/'
>         xmlns:rdf='...'>
> 
> At this point the base is that given by xml:base and by the xmlbase
> spec, applies to the rdf:RDF only. Resolution of relative URIs in
> xml:base provides evidence for this.

See above.

> ..snip..
> For the ?graph=relURI, this leaves two possibilities:
> 
> 1/ there is no base URI, and ?graph= can not be a relative URI.
> 2/ The base is URI used to retrieve the entity.
> 
> (1) is made by arguing that the base URI is the sum total of the request
> line and HTTP headers and starts immediately at the end of the headers
> and so is not active at the point ?graph=1 is encountered.

See above.  Also, the URI spec doesn't say anything about incremental
parsing (which is what your use of 'active' seems to suggest).
 
> (2) is made by arguing that the base URI comes from the entity URI in
> the request. It's still arguable if it is active by the time ?graph= is
> reached or whether it starts at the end of the method line.

This certainly makes sense as the 'fallback', but my contention is that RFC
3986 requires that you rule out (5.1.1) and it is pretty clear about how the
media-type (and the specification for its data format) determines how 5.1.1
is implemented (otherwise, the notion of precedence is a misnomer).

As for your later questions:

> (which one? 

If you mean which assertion, then the one I'm using is "This document
specifies the details of rule #1 [...]" in the XML Base specification.

> why outside the doc?

Because the resolution happens outside the document despite the fact that
the base URI is embedded inside the document (and used to resolve relative
URIs within it). 

> syntax issues? 

You mean if the document is not well-formed for instance?  Well then the
entire operation would need to fail with a 400 (Bad Request) in any case.
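
For the XML case, a minimal sketch of that check with the standard library
parser might look like this.  The function name is hypothetical, and it
only tests well-formedness, not whether the content is valid RDF/XML.

```python
import xml.etree.ElementTree as ET


def status_for_xml_body(body):
    """Return 400 if the XML payload is not well-formed, otherwise None."""
    try:
        ET.fromstring(body)
    except ET.ParseError:
        return 400  # Bad Request: the whole operation fails
    return None  # well-formed; processing can continue
```

Since the payload has to be parsed anyway to store the graph, the failure
would surface at the same point where @xml:base is looked up.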

> xmlbase only applies to XML elements)

Within the document, yes.  However, outside the document, the base URI it
specifies can be used per 5.1.1.

-- Chime 



Received on Wednesday, 1 December 2010 21:20:51 UTC