Re: SPARQL: QuotedIRIref too lax [OK?]

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Thu, 13 Oct 2005 13:54:34 +0100
Message-ID: <434E590A.3000208@hp.com>
To: Bjoern Hoehrmann <derhoermi@gmx.net>
CC: public-rdf-dawg-comments@w3.org



Bjoern Hoehrmann wrote:
> * Seaborne, Andy wrote:
> 
>>The SPARQL grammar section [1] says that the syntax produces IRIs (A.1)
>>
>>"""
>>Text matched by the  Q_IRI_REF production and QName  (after prefix expansion) 
>>production must be conform to the generic syntax of IRI references in section 
>>2.2 of RFC 3987 "ABNF for IRI References and IRIs" [RFC3987]. For example, the 
>> Q_IRI_REF <abc#def> may occur in a SPARQL query string, but the  Q_IRI_REF 
>><abc##def> must not.
>>"""
> 
> 
> This prohibits use of \u... escapes as the escape sequence would be part
> of the matched text which must conform to the IRI syntax which excludes
> the \ character, so this contradicts A.5. It should probably say that
> the unescaped text must match.
> 
> 
>>and in addition the grammar itself has:
>>
>>Q_IRI_REF ::= '<' ([^<>]-[#00-#20])* '>' /* An IRI reference : RFC 3987 */
>>
>>(rather than duplicate the whole of the ABNF in RFC 3986 and 3987).
> 
> 
> This should also exclude other disallowed characters like { and } and it
> might make sense to note that this "IRI reference" comment refers to the
> value space rather than the lexical space.
> 
> I would prefer to "duplicate" the IRI Reference grammar (either in the
> document or some normative reference) with some flexible means to allow
> for the additional escape syntax as that would encourage early error
> detection (parser would likely be derived from the grammar and the IRIs
> be passed to some generic IRI handling API e.g. in case of FROM <...>
> which are likely more tolerant, so keeping this out of the grammar is
> likely to cause some implementations to fail to detect errors), but I
> understand that this might be too much work for the DAWG at this point.

Bjoern,

The working group has decided to modify the production to exclude additional 
characters, as shown:

Q_IRI_REF ::= '<' ([^<>'{}|^`]-[#00-#20])* '>'

This allows \u escapes, which are covered by section A.6.

In addition, section A.1 requires that this production match the generic 
syntax of IRI references in section 2.2 of RFC 3987 after escape processing.
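
As a rough illustration of the two steps (lexical match first, then a check on 
the text after escape processing), here is a minimal Python sketch. The function 
names are hypothetical, the \uXXXX / \UXXXXXXXX escape forms are assumed, and 
the final check is only a necessary condition (at most one '#', no remaining 
backslash), not the full RFC 3987 grammar.

```python
import re

# Lexical production as modified in this message:
#   Q_IRI_REF ::= '<' ([^<>'{}|^`]-[#00-#20])* '>'
Q_IRI_REF = re.compile(r"<[^<>'{}|^`\x00-\x20]*>")

# Assumed escape forms: \uXXXX and \UXXXXXXXX (hex digits).
ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def matches_production(token: str) -> bool:
    """True if the whole token matches the lexical Q_IRI_REF production."""
    return Q_IRI_REF.fullmatch(token) is not None


def unescape(text: str) -> str:
    """Replace \\u / \\U escapes with the code points they denote.

    Code points above U+10FFFF raise ValueError; this sketch does not
    handle that case.
    """
    return ESCAPE.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)


def plausible_iri_ref(token: str) -> bool:
    """Lexical match, then a partial check on the unescaped text.

    The partial check stands in for the RFC 3987 requirement: an IRI
    reference has at most one '#', and '\\' is not an IRI character.
    A real implementation would apply the full ABNF instead.
    """
    if not matches_production(token):
        return False
    value = unescape(token[1:-1])  # strip the surrounding '<' and '>'
    return value.count("#") <= 1 and "\\" not in value
```

With this sketch, <abc#def> passes both steps, while <abc##def> matches the 
lexical production but is rejected by the check on the unescaped text.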

Please let us know whether this response addresses your comment to your 
satisfaction.

	Andy

[1] http://www.w3.org/2001/sw/DataAccess/rq23/#grammar
Received on Thursday, 13 October 2005 12:59:20 GMT
