Re: Unicode Character Database in RDF? from Simon Reinhardt on 2011-01-05 (semantic-web@w3.org from January 2011)

From: Simon Reinhardt <simon.reinhardt@koeln.de>
Date: Wed, 05 Jan 2011 15:57:41 +0100
To: Felix Sasaki <felix.sasaki@fh-potsdam.de>
CC: Gerard de Melo <gdemelo@mpi-inf.mpg.de>, Bernard Vatant <bernard.vatant@mondeca.com>, Sampo Syreeni <decoy@iki.fi>, Ivan Herman <ivan@w3.org>, Shane Norris <norlesh@gmail.com>, W3C Semantic Web IG <semantic-web@w3.org>
Message-ID: <4D2486E5.4030904@koeln.de>

Hi

Felix Sasaki wrote:
> This makes a lot of sense, since otherwise you get many triples without 
> use cases. Also, there are differences between the properties in terms 
> of stability and data sources. See 
> http://www.unicode.org/Public/5.1.0/ucd/UCD.html for what is available 
> in version 5.1. Note also that there is information about characters 
> which is not in the properties' data base, e.g. whether a character can 
> be used in an internationalized domain name or not, see 
> http://unicode.org/cldr/utility/idna.jsp . So again, it really depends 
> on the use case what information you want in the RDF representation.

A lot of information is also being collected wiki-style at <http://www.decodeunicode.org/>.
It covers individual characters (e.g. <http://www.decodeunicode.org/en/u+203d>) and Unicode blocks (e.g. <http://www.decodeunicode.org/en/armenian>).
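
To make the "it depends on the use case" point a bit more concrete, here is a rough Python sketch of emitting triples for only a chosen subset of properties. It relies on the standard library's unicodedata module, which covers only a small part of the UCD. The http://example.org/ucd/ namespace, the property URIs and the char_triples helper are made up for illustration and are not an existing vocabulary.

```python
import unicodedata

# Hypothetical namespace, for illustration only; not an existing vocabulary.
UCD = "http://example.org/ucd/"

# Properties this sketch knows how to look up via the standard library.
GETTERS = {
    "name": lambda c: unicodedata.name(c, ""),
    "category": unicodedata.category,
    "bidirectional": unicodedata.bidirectional,
}


def char_triples(ch, props=("name", "category")):
    """Return N-Triples lines for the selected properties of one character."""
    subject = f"<{UCD}char/{ord(ch):04X}>"
    lines = []
    for prop in props:
        value = GETTERS[prop](ch)
        if value:  # skip properties that have no value for this character
            lines.append(f'{subject} <{UCD}prop/{prop}> "{value}" .')
    return lines


if __name__ == "__main__":
    # U+203D, the character from the decodeunicode.org example above.
    for line in char_triples("\u203d"):
        print(line)
```

Passing a different props tuple changes which triples are generated, so each consumer could ask only for the properties it actually needs.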

Regards,
  Simon

Received on Wednesday, 5 January 2011 14:59:40 UTC