Re: Unicode normalization in Turtle

From: Richard Cyganiak <richard@cyganiak.de>
Date: Tue, 8 May 2012 22:56:28 +0100
Cc: Ivan Herman <ivan@w3.org>, David Wood <david@3roundstones.com>, RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <760E7F4C-1660-4D53-8200-6CA0B4123708@cyganiak.de>
To: Gavin Carothers <gavin@carothers.name>

On 8 May 2012, at 16:46, Gavin Carothers wrote:
>>>> The Turtle ED doesn't say anything about Unicode normalization. Should it?
> 
> I... don't think so?

Well, that's ok with me.

> Related note while in unicode hell, do we wish to define parsing
> Turtle in terms of UTF-8 with error recovery? See
> http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#utf-8
> Better compatibility with random goo that gets screwed up by web
> servers etc.

Probably not. Trying to fix up broken data by guessing the publisher's intent can cause all sorts of Fun. Also, since Turtle is always UTF-8, pointing fingers and fixing things when something is broken is much easier than with, say, HTML, where there are two dozen different places in which the encoding can be specified, guessed, contradicted, or otherwise screwed up.
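
To make the difference concrete, here is a minimal Python sketch of the two behaviours being discussed: rejecting invalid UTF-8 outright versus decoding with error recovery. The function names and the sample triple are hypothetical illustrations and come from neither the Turtle draft nor the WHATWG text; the recovery shown is Python's built-in "replace" handler, which I am assuming is close in spirit to what the WHATWG section describes.

```python
def decode_strict(data: bytes) -> str:
    """Decode as UTF-8 and fail loudly on any invalid byte sequence."""
    # Raises UnicodeDecodeError, which reports the offset of the bad byte.
    return data.decode("utf-8")


def decode_with_recovery(data: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for invalid byte sequences."""
    # The document always decodes, but the original bytes are lost.
    return data.decode("utf-8", errors="replace")


# Hypothetical input: a literal saved as Latin-1, where "é" is the single
# byte 0xE9. That byte is not valid UTF-8 when followed by an ASCII quote.
broken = b'<http://example.org/s> <http://example.org/p> "caf\xe9" .'

if __name__ == "__main__":
    try:
        decode_strict(broken)
    except UnicodeDecodeError as err:
        print("rejected:", err)
    print("recovered:", decode_with_recovery(broken))
```

With the strict variant the publisher gets an error pointing at the offending byte. With recovery the file parses, but the literal silently becomes something the publisher never wrote.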

Best,
Richard
Received on Tuesday, 8 May 2012 21:57:12 GMT