- From: Sampo Syreeni <decoy@iki.fi>
- Date: Tue, 4 Jan 2011 23:42:13 +0200 (EET)
- To: Ivan Herman <ivan@w3.org>
- cc: Shane Norris <norlesh@gmail.com>, W3C Semantic Web IG <semantic-web@w3.org>, Felix Sasaki <felix.sasaki@fh-potsdam.de>
On 2011-01-03, Ivan Herman wrote:

> I have asked an advise from Felix Sasaki, (cc-d), who knows both
> Unicode and RDF. Here is his answer:

In theory, creating an RDF version should be a basic character manipulation exercise. Take the character database file, assign surrogate keys to all of the characters (they have, after all, already been painstakingly unified, deduplicated, etc., in a manner even most master data management initiatives don't do), then assign a predicate to each of the fields, and proceed to split the file into triples, omitting empty fields. Put up an OWL schema, and you have a more than good base version of it in RDF.

Were I running UNIX, the basic processing step would take perhaps half a day, utilizing standard command line tools. I'm sure somebody around here can do it even faster.

-- 
Sampo Syreeni, aka decoy - decoy@iki.fi, http://decoy.iki.fi/front
+358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
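The splitting step described in the message can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: it assumes the input is the semicolon-separated UnicodeData.txt file, it uses the code point itself as the key where the message speaks of surrogate keys, and both namespaces and all predicate names (CHAR_NS, PROP_NS, FIELDS) are hypothetical placeholders rather than an established vocabulary. Output is N-Triples with plain string literals, and no OWL schema is produced.

```python
# Field names follow the column order of UnicodeData.txt. The predicate
# names derived from them are illustrative, not an established vocabulary.
FIELDS = [
    "codePoint", "name", "generalCategory", "canonicalCombiningClass",
    "bidiClass", "decomposition", "decimalValue", "digitValue",
    "numericValue", "bidiMirrored", "unicode1Name", "isoComment",
    "simpleUppercase", "simpleLowercase", "simpleTitlecase",
]

CHAR_NS = "http://example.org/unicode/char/"  # hypothetical namespace
PROP_NS = "http://example.org/unicode/prop#"  # hypothetical namespace


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def line_to_triples(line):
    """Turn one semicolon-separated record into a list of N-Triples lines.

    The first field (the code point) becomes the subject URI; every other
    non-empty field becomes one triple with a plain literal object.
    """
    values = line.rstrip("\n").split(";")
    subject = "<%s%s>" % (CHAR_NS, values[0])
    triples = []
    for field, value in zip(FIELDS[1:], values[1:]):
        if value == "":
            continue  # omit empty fields, as the message suggests
        triples.append(
            '%s <%s%s> "%s" .' % (subject, PROP_NS, field, escape_literal(value))
        )
    return triples


def convert(lines):
    """Yield N-Triples lines for every non-blank record in an iterable."""
    for line in lines:
        if line.strip():
            for triple in line_to_triples(line):
                yield triple
```

Given a local copy of the file, something like `for t in convert(open("UnicodeData.txt", encoding="utf-8")): print(t)` would write the triples to standard output. Ranges, decomposition mappings, and typed literals would need further handling that this sketch leaves out.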
Received on Tuesday, 4 January 2011 21:47:54 UTC