Re: Call for Consensus: IRI resolution tests

On 10/25/2015 03:02 PM, Andy Seaborne wrote:
> On 25/10/15 16:14, Gregg Kellogg wrote:
>> On Oct 25, 2015, at 8:48 AM, Andy Seaborne <andy@apache.org> wrote:
>>
>>> On 25/10/15 12:01, Ruben Verborgh wrote:
>>>> Dear Andy,
>>>>
>>>>> The tests make an additional assumption that absolute URIs are not
>>>>> normalized.  This is not covered by the Turtle spec one way or
>>>>> another (nor should it be).  Both normalizing and not normalizing
>>>>> are possible.
>>>>
>>>> I disagree here; the Turtle spec should cover this.
>>>
>>> "should" or "does"? Are you arguing for a change to Turtle?
>>>
>>> If it's a change, then -1 to these tests.
>>>
>>> One way is to avoid the area that is a problem for 3986 and change the
>>> tests to use the "/../" form instead of the "/.." form.  As you
>>> yourself noted, normalization is assumed by RFC3986/5.2. Or follow
>>> RFC 3987 and don't have absolute URIs with them in.
>>>
>>>> Otherwise, two identical Turtle documents can result in different
>>>> sets of triples.
>>>
>>> ... in the one case where the base URI ends in "/.." which isn't good
>>> practice; RFC 3987/5.3.2.4 even says it is not intended usage.
>>>
>>>> I think it's clear that absolute URIs should not be touched,
>>>> and that the spec also says this.
>>>
>>> The spec being Turtle?
>>>
>>> Please quote text where it says that about @base.
>>
>> The key for me was this sentence from the IRIs section:
>>
>>  > Relative IRIs like <#green-goblin> are resolved relative to the
>>  > current base IRI.
>>
>> It says that _relative_ IRIs are resolved, but is silent on absolute
>> IRIs. Thus, if the value of @base is an absolute IRI it is not changed
>> at all, and used as is when resolving other relative IRIs. (Note, my
>> implementation did this previously, but I was convinced this was an
>> error; always resolving an IRI against the current base is supported in
>> RFC3986, but not called for by our specs. If it were, it would
>> arguably be more consistent).
>
> Being silent to me means the RFCs apply.  So we have two readings - we
> should have tests that do not choose one reading over another as we are
> not in the role of changing or interpreting the specs.

I don't think it was silent on this.
<http://www.w3.org/TR/turtle/#h3_sec-iri-references> says
[[
Relative IRIs are resolved with base IRIs as per Uniform Resource
Identifier (URI): Generic Syntax [RFC3986] using only the basic
algorithm in section 5.2. Neither Syntax-Based Normalization nor
Scheme-Based Normalization (described in sections 6.2.2 and 6.2.3 of
RFC3986) are performed.
]]

3986 §5.2 includes 4 sections:
[[
        5.2.  Relative Resolution
              5.2.1.  Pre-parse the Base URI
              5.2.2.  Transform References
              5.2.3.  Merge Paths
              5.2.4.  Remove Dot Segments
]]

I believe the answer to this is covered in step 2.D of the Remove Dot
Segments algorithm:
[[
D.  if the input buffer consists only of "." or "..", then remove
            that from the input buffer;
]]
<http://tools.ietf.org/html/rfc3986#section-5.2.4>
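
For reference, the steps of 5.2.4 can be sketched in a few lines of
Python. This is a hand transcription under assumptions, not code from
any Turtle parser: the name remove_dot_segments is borrowed from the
section title, and the output buffer is held as a list of segments
purely for convenience.

```python
def remove_dot_segments(path: str) -> str:
    """Sketch of RFC 3986 section 5.2.4, steps 2.A to 2.E (illustrative only)."""
    inp = path   # input buffer
    out = []     # output buffer, one entry per moved segment
    while inp:
        # 2.A: drop a leading "../" or "./"
        if inp.startswith("../"):
            inp = inp[3:]
        elif inp.startswith("./"):
            inp = inp[2:]
        # 2.B: a leading "/./", or an input of exactly "/.", becomes "/"
        elif inp.startswith("/./"):
            inp = inp[2:]
        elif inp == "/.":
            inp = "/"
        # 2.C: a leading "/../", or an input of exactly "/..", becomes "/"
        #      and the last output segment (if any) is removed
        elif inp.startswith("/../"):
            inp = inp[3:]
            if out:
                out.pop()
        elif inp == "/..":
            inp = "/"
            if out:
                out.pop()
        # 2.D: an input consisting only of "." or ".." is removed
        elif inp in (".", ".."):
            inp = ""
        # 2.E: move the first segment (with its leading "/", if any) to the output
        else:
            cut = inp.find("/", 1)
            if cut == -1:
                out.append(inp)
                inp = ""
            else:
                out.append(inp[:cut])
                inp = inp[cut:]
    return "".join(out)


print(remove_dot_segments("/a/b/c/./../../g"))  # /a/g  (example given in 5.2.4)
print(repr(remove_dot_segments("..")))          # ''    (step 2.D)
print(remove_dot_segments("/bb/ccc/.."))        # /bb/  (step 2.C)
```

In this sketch, step 2.D only applies once the remaining input is
exactly "." or ".."; a path ending in "/.." is consumed by step 2.C.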


> The RFCs say that ".." and "." are intended for relative URIs only.  RDF
> Concepts says they are "best avoided".
>
> I think it is a bug in RFC 3986 and called out in 3987 as "situation not
> intended to happen".
>
>      Andy
>
> [*] A fix is that merging the base and relative URI needs to treat
> "/.." as "/../", either by minimal normalization or in the merge rule.
> Otherwise various inconsistencies appear.
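
The difference between a base path ending in "/.." and one ending in
"/../" can be illustrated with a small Python sketch of the 5.2.3 merge
followed by dot-segment removal. The helper names (merge, drop_dots,
resolve_path) and the example paths are assumptions for illustration
only; drop_dots is a simplified segment-stack version that handles
absolute paths only, and merge assumes a non-empty base path.

```python
def merge(base_path: str, ref_path: str) -> str:
    """RFC 3986 5.2.3 for a non-empty base path: keep the base up to its last "/"."""
    return base_path[: base_path.rfind("/") + 1] + ref_path


def drop_dots(path: str) -> str:
    """Simplified dot-segment removal for absolute paths (hypothetical helper)."""
    segs = path.split("/")
    out = []
    for i, seg in enumerate(segs[1:], 1):
        last = i == len(segs) - 1
        if seg == ".":
            if last:
                out.append("")   # trailing "/." leaves a trailing "/"
        elif seg == "..":
            if out:
                out.pop()
            if last:
                out.append("")   # trailing "/.." leaves a trailing "/"
        else:
            out.append(seg)
    return "/" + "/".join(out)


def resolve_path(base_path: str, ref_path: str) -> str:
    """Path part of 5.2.2 for a relative-path reference: merge, then remove dots."""
    return drop_dots(merge(base_path, ref_path))


# Base used as-is: the ".." is discarded by the merge as the "last segment".
print(resolve_path("/bb/ccc/..", "d"))             # /bb/ccc/d
# Base written with the "/../" form.
print(resolve_path("/bb/ccc/../", "d"))            # /bb/d
# Base normalized before use.
print(resolve_path(drop_dots("/bb/ccc/.."), "d"))  # /bb/d
```

Under these assumptions, the same relative reference lands on
different paths depending on whether the base is used untouched or
normalized first.
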
>
>
>>
>> Looking at other specs, I think the same is true for JSON-LD, RDFa and
>> RDF/XML.
>>
>>>    Andy
>>>
>>> (RDF/XML is different on relative URIs)
>>
>> Why do you say this? Can you cite something from the spec?
>
> """
> 5.3 Resolving URIs
>
> RDF/XML supports XML Base [XML-BASE] which defines a ·base-uri· accessor
> for each ·root event· and ·element event·. Relative URI references are
> resolved into RDF URI references according to the algorithm specified in
> XML Base [XML-BASE] (and RFC 2396).
> """
> i.e. it says "use the algorithm".
>>
>> Gregg
>>
>>>> Best,
>>>>
>>>> Ruben
>>>>
>>>
>>>
>
>

Received on Tuesday, 27 October 2015 09:19:54 UTC