Re: Unicode Character Database in RDF?

On 2011-01-03, Ivan Herman wrote:

> I have asked Felix Sasaki (cc'd), who knows both Unicode and RDF, 
> for advice. Here is his answer:

In theory, creating an RDF version should be a basic text manipulation 
exercise. Take the character database file and assign surrogate keys to 
all of the characters (they have, after all, already been painstakingly 
unified and deduplicated, to a degree that even most master data 
management initiatives do not reach). Then assign a predicate to each of 
the fields and split the file into triples, omitting empty fields. Put 
up an OWL schema, and you have a more than adequate base version of it 
in RDF.
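
As a rough illustration of the splitting step, here is a minimal Python 
sketch, not a finished converter. It assumes the input is the 
semicolon-separated UnicodeData.txt with its 15 fields per line. The 
two example.org namespaces and the predicate names are made up for 
illustration; they are not an official vocabulary. The code point is 
used as the surrogate key, and the "First"/"Last" range lines in the 
file get no special handling.

```python
# Hypothetical namespaces; neither is an official Unicode or W3C vocabulary.
CHAR_NS = "http://example.org/ucd/char/"
PROP_NS = "http://example.org/ucd/prop#"

# Field order of UnicodeData.txt (semicolon-separated, 15 fields per line).
# The predicate names themselves are illustrative assumptions.
FIELDS = [
    "codePoint", "name", "generalCategory", "canonicalCombiningClass",
    "bidiClass", "decomposition", "decimalDigitValue", "digitValue",
    "numericValue", "bidiMirrored", "unicode1Name", "isoComment",
    "simpleUppercaseMapping", "simpleLowercaseMapping",
    "simpleTitlecaseMapping",
]


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def line_to_triples(line):
    """Turn one UnicodeData.txt line into a list of N-Triples statements."""
    values = line.rstrip("\n").split(";")
    # The code point (first field) serves as the surrogate key.
    subject = "<%s%s>" % (CHAR_NS, values[0])
    triples = []
    for field, value in zip(FIELDS, values):
        if value == "":
            continue  # omit empty fields
        triples.append('%s <%s%s> "%s" .'
                       % (subject, PROP_NS, field, escape_literal(value)))
    return triples


if __name__ == "__main__":
    import sys
    # Usage: python ucd2nt.py < UnicodeData.txt > ucd.nt
    for raw in sys.stdin:
        if raw.strip():
            print("\n".join(line_to_triples(raw)))
```

Every value comes out as a plain string literal here. Typing the 
numeric fields, and turning the case mappings into links between 
character resources, would be the job of the schema step.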

Were I running UNIX, the basic processing step would take perhaps half a 
day using standard command-line tools. I'm sure somebody around here can 
do it even faster.
-- 
Sampo Syreeni, aka decoy - decoy@iki.fi, http://decoy.iki.fi/front
+358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2

Received on Tuesday, 4 January 2011 21:47:54 UTC