Fragment issues in ITS/HTML/XML mapping to NIF

From: Dave Lewis <dave.lewis@cs.tcd.ie>
Date: Wed, 11 Sep 2013 10:27:15 +0100
Message-ID: <52303773.6020302@cs.tcd.ie>
To: public-bpmlod@w3.org
Hi all,
I won't be able to make the call today, but as promised last week, here is a 
pointer to an issue raised by the RDF WG in relation to the use of 
fragment identifiers in NIF, as used in the ITS-NIF mapping in the ITS 
2.0 specification.

The issue is described at:
https://www.w3.org/International/multilingualweb/lt/track/issues/131

It basically points out that the 'char' media fragments used in the 
mapping to identify specific annotated text are only specified for plain 
text file types and not for HTML or XML. Also, the XPath option for 
fragment identifiers, while OK for XML files, does not apply to HTML 
files. The result is that if we use such fragment URLs in the RDF we 
generate from an xml+its or html+its mapping, the URLs won't be 
dereferenceable, therefore violating this core linked data principle.

The ideal solution would be to get these media types and associated 
processing descriptions registered with the RDF. This wasn't an option 
for the MLW-LT working group due to our time constraints, so we went for 
a query-style URL instead, which is dereferenceable, and added a note 
about the issues around the fragment option. The agreed text is in the 
spec at:
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#conversion-to-nif
Note that there is also a reverse mapping.
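
To make the fragment versus query distinction concrete, here is a minimal sketch in Python. The document URL, the offsets 0,11 and the query parameter name "char" are all illustrative assumptions and are not taken from the ITS 2.0 text. It only shows the general URI behaviour: a fragment is stripped by the client and never reaches the server, so its meaning depends on what is defined for the retrieved media type, whereas a query string is part of the request itself.

```python
from urllib.parse import urldefrag, urlsplit

# Hypothetical document URL and character offsets, for illustration only.
doc = "http://example.com/doc.html"
fragment_style = doc + "#char=0,11"   # fragment-style identifier
query_style = doc + "?char=0,11"      # query-style identifier (assumed parameter name)


def request_target(url):
    """Return the path-and-query part of a URL that an HTTP client sends.

    The fragment is removed first, because clients never transmit it.
    """
    without_fragment, _fragment = urldefrag(url)
    parts = urlsplit(without_fragment)
    return parts.path + ("?" + parts.query if parts.query else "")


print(request_target(fragment_style))  # the server only sees the document path
print(request_target(query_style))     # the server also sees the char range
```

With the fragment style, resolving the annotated span is left to the client, which is where the missing definitions for HTML and XML become a problem. With the query style, a service can in principle answer for the span directly.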

While this is obviously also an issue for NIF in the short term, it is 
acknowledged by the RDF group that registering those fragment types 
would be generally useful in tying together the web document parsing 
world with the linked data world more clearly.

Perhaps that is a task that we in this group could consider taking on? 
This guide gives us a starting point:
http://www.w3.org/TR/fragid-best-practices/

Regards,
Dave
Received on Wednesday, 11 September 2013 09:24:55 UTC