
Re: RDF-ISSUE-75 (#x0): Valid plain literals containing #x0 are no longer valid in RDF 1.1

From: Ivan Herman <ivan@w3.org>
Date: Sat, 20 Aug 2011 06:39:11 +0200
Message-Id: <A504C2BA-7E69-41A8-AC0D-BD132B6D8197@w3.org>
To: RDF Working Group WG <public-rdf-wg@w3.org>
Do we know of any place whatsoever where #x0 was used?

I propose that we flag this explicitly as an issue in the document and ask for feedback, with the expectation that we will have this restriction in 1.1.


Ivan

On Aug 19, 2011, at 20:44 , RDF Working Group Issue Tracker wrote:

> 
> RDF-ISSUE-75 (#x0): Valid plain literals containing #x0 are no longer valid in RDF 1.1
> 
> http://www.w3.org/2011/rdf-wg/track/issues/75
> 
> Raised by: Richard Cyganiak
> On product: 
> 
> The lexical space of xsd:string doesn't cover all Unicode strings.
> 
> I assume we will end up referring to XSD 1.1 for the definition of xsd:string [1]. That document leaves it up to implementations whether they support XML 1.0 or XML 1.1; accordingly, the definition of allowed characters in an xsd:string is either [2] or [3].
> 
> The more permissive one from XML 1.1:
> 
>    Char ::= [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
> 
> This excludes #x0, Unicode codepoint U+0000. XML 1.0 also excludes a number of other control codes in the #x0-#x1F range.
> 
> The definition of “lexical form” in RDF 2004 [4] says “Unicode string”, which according to [5] includes *all* codepoints including the control codes.
> 
> So, any string that includes #x0 was a valid untagged plain literal in RDF 2004. In RDF 1.1, it will be typed as an xsd:string, and thus will be an ill-typed literal.
> 
> (On the other hand, such strings could never be serialized in RDF/XML or XHTML+RDFa; they were serializable only in N-Triples and Turtle.)
> 
> Is this a problem? Can we go ahead with the new literal design despite this restriction? Should we acknowledge it in the RDF Concepts spec?
> 
> [1] http://www.w3.org/TR/2005/WD-xmlschema11-2-20050224/datatypes.html#string
> [2] http://www.w3.org/TR/REC-xml/#dt-character
> [3] http://www.w3.org/TR/xml11/#NT-Char
> [4] http://www.w3.org/TR/rdf-concepts/#dfn-lexical-form
> [5] http://www.unicode.org/versions/Unicode6.0.0/UnicodeStandard-6.0.pdf
> 
> 
> 
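
To make the restriction concrete, here is a minimal sketch of the check the issue describes: whether a Unicode string falls inside the lexical space of xsd:string under either Char production. The XML 1.1 ranges are the ones quoted in the issue; the XML 1.0 ranges are taken from [2]. The function names and the xml_version parameter are made up for illustration and do not come from any RDF or XSD library.

```python
def is_xml11_char(cp: int) -> bool:
    """XML 1.1 Char production, as quoted in the issue (excludes only #x0,
    the surrogate block, #xFFFE and #xFFFF)."""
    return (0x1 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def is_xml10_char(cp: int) -> bool:
    """XML 1.0 Char production from [2]: additionally excludes the control
    codes below #x20 other than tab, line feed and carriage return."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def in_xsd_string_lexical_space(s: str, xml_version: str = "1.1") -> bool:
    """Return True if every code point of s matches the Char production
    of the chosen XML version (hypothetical helper, names are assumptions)."""
    check = is_xml11_char if xml_version == "1.1" else is_xml10_char
    return all(check(ord(ch)) for ch in s)


if __name__ == "__main__":
    for label, value in [("plain", "abc"), ("with #x0", "a\x00b"), ("with #x1", "a\x01b")]:
        print(label,
              in_xsd_string_lexical_space(value, "1.0"),
              in_xsd_string_lexical_space(value, "1.1"))
```

Under this sketch, a literal containing #x0 is rejected under both productions, while one containing #x1 is rejected only under the XML 1.0 production.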


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Saturday, 20 August 2011 04:36:47 GMT
