Re: SPARQL and Unicode versions

From: Dave Beckett <dave@dajobe.org>
Date: Sat, 07 Jan 2006 20:01:40 -0800
Message-ID: <43C08EA4.3060406@dajobe.org>
To: Dan Connolly <connolly@w3.org>
CC: public-rdf-dawg-comments@w3.org

Dan Connolly wrote:
> On Sat, 2006-01-07 at 12:38 -0800, Dave Beckett wrote:
>>SPARQL refers to:
>>    The Unicode Standard, Version 4. ISBN 0-321-18578-1, as updated from
>>  time to time by the publication of new versions. The latest version of
>>  Unicode and additional information on versions of the standard and of
>>  the Unicode Character Database is available at
>>  http://www.unicode.org/unicode/standard/versions/.
>>which cites a moving target.  Please define SPARQL in terms of a
>>particular version of Unicode only, and no other.  Otherwise if or when
>>the Unicode Consortium makes some incompatible changes, all existing
>>implementations become invalid.
> How so? How is conformance to SPARQL sensitive to changes in Unicode?

The SPARQL query syntax is defined on Unicode characters:

A. SPARQL Grammar

A SPARQL query string is a Unicode character string (c.f. section 6.1
String concepts of [CHARMOD])

although the grammar defines precise ranges of codepoints for particular
things such as names of variables (based on XML 1.1 I think).

If the definition of a Unicode character string changes in some future
Unicode revision, such as for example by allowing additional codepoints,
then there will be additional codepoints allowed in a SPARQL query
string, following the sentence above.

Any part of the grammar that uses a negated range such as '[^...]'
will allow such codepoints.  Examples include:
and all string literals.

An implementation that supports Unicode 4.0 and nothing later may
reject these codepoints.
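
To make the concern concrete, here is a minimal Python sketch. It is not
taken from the SPARQL draft or from any implementation. STRING_BODY is a
simplified stand-in for a negated-range production, and the function
names are hypothetical. Python's unicodedata module bundles a frozen
Unicode 3.2.0 database (unicodedata.ucd_3_2_0), which is used here as a
stand-in for an implementation pinned to a single Unicode version; it is
not Unicode 4.0. U+20B9 serves as an example of a codepoint that was
only assigned in a later version of Unicode.

```python
import re
import unicodedata

# Simplified stand-in for a negated-range production: any character
# except a double quote, backslash, newline or carriage return.
# (Assumption: this is NOT the exact SPARQL grammar rule.)
STRING_BODY = re.compile(r'[^"\\\n\r]*')


def grammar_accepts(body):
    """True if the negated range admits every character in body."""
    return STRING_BODY.fullmatch(body) is not None


def pinned_accepts(body, ucd=unicodedata.ucd_3_2_0):
    """True if every character is assigned in the pinned database.

    Hypothetical strict check: category 'Cn' means the codepoint is
    unassigned in that version of the Unicode Character Database.
    """
    return all(ucd.category(ch) != "Cn" for ch in body)


# U+20B9 is unassigned in the frozen 3.2.0 database.
sample = "price \u20b9"
print(grammar_accepts(sample), pinned_accepts(sample))
```

With these definitions the negated range admits the sample string, while
the version-pinned check refuses it. That is the mismatch described
above: the grammar text and a strict, version-bound implementation can
disagree about the same query string.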

Received on Sunday, 8 January 2006 04:02:02 UTC
