[Fwd: Re: SPARQL: QuotedIRIref too lax]

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Thu, 06 Oct 2005 11:14:14 +0100
Message-ID: <4344F8F6.2020509@hp.com>
To: RDF Data Access Working Group <public-rdf-dawg@w3.org>

Two additional comments arose from the [OK?] message:

1/ Stating that A.1 applies after escape processing.

I added ", after escape processing," in A.1

"""
Text matched by the Q_IRI_REF production and QName (after prefix expansion) 
production, after escape processing, must be conform to the generic syntax of 
IRI references in section 2.2 of RFC 3987 "ABNF for IRI References and IRIs" 
[RFC3987]. For example, the Q_IRI_REF <abc#def> may occur in a SPARQL query 
string, but the Q_IRI_REF <abc##def> must not.
"""

(This is virtually editorial, but this message needs to be sent anyway.)


2/ Additional excluded characters "{", "}", "

The full list is, from RFC 3987:
< > " space { } | \ ^ `

Making the rule:
Q_IRI_REF ::= '<' ([^<>"{}|^`]-[#00-#20])* '>'

(We'd have to put in \ to allow for escape processing after parsing)
which all checks out in yacker (grammar rq23final) and the syntax tests.
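
For illustration, the rule above can be transcribed into a regular expression.
This is an assumed Python translation of the production, not something taken
from yacker or the syntax tests, and is_q_iri_ref is a hypothetical helper name.

```python
import re

# '<' ([^<>"{}|^`]-[#00-#20])* '>'
# The second '^' inside the class is a literal caret; '\' is not excluded,
# so escape sequences still pass at this lexical stage.
Q_IRI_REF = re.compile(r'<[^<>"{}|^`\x00-\x20]*>')


def is_q_iri_ref(token: str) -> bool:
    """True if the whole token matches the Q_IRI_REF character rule."""
    return Q_IRI_REF.fullmatch(token) is not None
```

Note that <abc##def> matches this token rule; it is the A.1 text, not the
character class, that rules it out.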


3/ The comment in the grammar - as we have section A.1, I have dropped the 
comment from the grammar.


We discussed including the whole IRI grammar and decided not to, in recognition 
that IRI processing may be done by a separate library and not in a SPARQL parser.

v1.498

Dan - this now awaits the punctuationSyntax vote because of the grammar change.

	Andy

-------- Original Message --------
Subject: Re: SPARQL: QuotedIRIref too lax [OK?]
Date: Wed, 5 Oct 2005 16:22:20 +0100
From: Bjoern Hoehrmann <derhoermi@gmx.net>
To: Seaborne, Andy <andy.seaborne@hp.com>
CC: <public-rdf-dawg-comments@w3.org>
References: <431838ee.225599421@smtp.bjoern.hoehrmann.de> 
<1122916625.18971.70.camel@localhost> <4343E7AA.9000402@hp.com>

* Seaborne, Andy wrote:
>The SPARQL grammar section [1] says that the syntax produces IRIs (A.1)
>
>"""
>Text matched by the  Q_IRI_REF production and QName  (after prefix expansion) 
>production must be conform to the generic syntax of IRI references in section 
>2.2 of RFC 3987 "ABNF for IRI References and IRIs" [RFC3987]. For example, the 
>  Q_IRI_REF <abc#def> may occur in a SPARQL query string, but the  Q_IRI_REF 
><abc##def> must not.
>"""

This prohibits use of \u... escapes as the escape sequence would be part
of the matched text which must conform to the IRI syntax which excludes
the \ character, so this contradicts A.5. It should probably say that
the unescaped text must match.

>and in addition the grammar itself has:
>
>Q_IRI_REF ::= '<' ([^<>]-[#00-#20])* '>' /* An IRI reference : RFC 3987 */
>
>(rather than duplicate the whole of the ABNF in RFC 3986 and 3987).

This should also exclude other disallowed characters like { and } and it
might make sense to note that this "IRI reference" comment refers to the
value space rather than the lexical space.

I would prefer to "duplicate" the IRI Reference grammar (either in the
document or some normative reference) with some flexible means to allow
for the additional escape syntax as that would encourage early error
detection (parser would likely be derived from the grammar and the IRIs
be passed to some generic IRI handling API e.g. in case of FROM <...>
which are likely more tolerant, so keeping this out of the grammar is
likely to cause some implementations to fail to detect errors), but I
understand that this might be too much work for the DAWG at this point.
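
The tolerance of generic APIs can be sketched with Python's urllib.parse,
used here purely as a stand-in for "some generic IRI handling API" (the choice
of library and the helper names are assumptions, not anything named in this
thread). It splits <abc##def> without complaint, so the error surfaces only if
the caller checks for it.

```python
from urllib.parse import urlsplit


def fragment_of(ref: str) -> str:
    """Fragment as reported by the generic splitter (no validation)."""
    return urlsplit(ref).fragment


def has_invalid_fragment(ref: str) -> bool:
    """Extra check the caller must add: a fragment may not contain '#'."""
    return '#' in fragment_of(ref)
```

Here fragment_of('abc##def') yields '#def' rather than raising an error, and
only the additional has_invalid_fragment check flags the reference.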
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Received on Thursday, 6 October 2005 10:14:21 GMT