RE: Towards a TAG consideration of CURIEs from Misha Wolf on 2007-04-06 (www-tag@w3.org from April 2007)

From: Misha Wolf <Misha.Wolf@reuters.com>
Date: Sat, 07 Apr 2007 00:32:12 +0100
To: www-tag@w3.org, semantic-web@w3.org, public-xg-mmsem@w3.org
Cc: newsml-g2@yahoogroups.com
Message-id: <A29ADE959C70A1449470AA9A212F5D8004E9E1A9@LONSMSXM06.emea.ime.reuters.com>

Hi David,

The situation is as follows. 

The IPTC's first priority is B2B News interchange.  Support for 
painless discovery of additional information about Taxonomies and 
for the integration of News Taxonomies with the Semantic Web are 
desirable goals but, for the IPTC, they come second.  This 
prioritisation relates both to the importance we attach to each 
aspect, and to the order in which we are tackling them.

Now, the whole business of URI construction from tuples is a bit 
of a mess. XML Namespaces don't mandate such a mechanism.  RDF does 
require it, but the situation is difficult if many of the codes 
start with a digit.
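
To illustrate the digit problem: RDF/XML writes property and class 
names as XML QNames, and the local part of a QName must be an NCName, 
which cannot begin with a digit.  The sketch below is only a rough 
check under a stated assumption: the pattern is an ASCII-only 
simplification of the NCName production (the real rule admits many 
non-ASCII characters), and the function name is hypothetical.

```python
import re

# ASCII-only simplification of the XML NCName production (an assumption
# made for brevity): a letter or "_" first, then letters, digits,
# ".", "-" or "_".
_ASCII_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_ascii_ncname(local_part: str) -> bool:
    """Return True if local_part could serve as a QName local part."""
    return _ASCII_NCNAME.fullmatch(local_part) is not None


if __name__ == "__main__":
    for candidate in ("123456", "_123456"):
        print(candidate, is_ascii_ncname(candidate))
```

Under this check the bare code "123456" is rejected, while the 
"_"-prefixed form is accepted.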

Given a code such as "123456", and given that we refuse to change 
the code to, say, "_123456", the main legal choices before us 
appear to be:

1. Simple concatenation using "/" as the delimiter
   "http://www.iptc.org/NewsCodes/" & "123456" ->
   "http://www.iptc.org/NewsCodes/123456"

2. Simple concatenation using "#_" as the delimiter
   "http://www.iptc.org/NewsCodes#_" & "123456" ->
   "http://www.iptc.org/NewsCodes#_123456"

3. Concatenation using "#_" as the delimiter, where the "_" is 
   glue, mandated by the relevant specification
   "http://www.iptc.org/NewsCodes#" & "_" & "123456" ->
   "http://www.iptc.org/NewsCodes#_123456"

4. Concatenation using "#_" as the delimiter, where the "#_" is 
   glue, mandated by the relevant specification
   "http://www.iptc.org/NewsCodes" & "#_" & "123456" ->
   "http://www.iptc.org/NewsCodes#_123456"
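
The four options can be sketched as follows.  This is a minimal 
illustration only; the function names are hypothetical, and the 
"glue" parameters stand in for whatever the relevant specification 
would mandate.  It shows that options 2, 3 and 4 differ only in where 
the string is split, not in the resulting URI.

```python
CODE = "123456"


def option1(code: str) -> str:
    # Namespace ends in "/"; simple concatenation.
    return "http://www.iptc.org/NewsCodes/" + code


def option2(code: str) -> str:
    # Namespace already ends in "#_"; simple concatenation.
    return "http://www.iptc.org/NewsCodes#_" + code


def option3(code: str, glue: str = "_") -> str:
    # Namespace ends in "#"; the "_" is glue added by the processor.
    return "http://www.iptc.org/NewsCodes#" + glue + code


def option4(code: str, glue: str = "#_") -> str:
    # Namespace has no delimiter; the "#_" is glue added by the processor.
    return "http://www.iptc.org/NewsCodes" + glue + code


if __name__ == "__main__":
    for build in (option1, option2, option3, option4):
        print(build.__name__, build(CODE))
```

The choice between 2, 3 and 4 is therefore about who owns the 
delimiter (the namespace declaration or the specification), since all 
three yield the same string.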

As we would very strongly prefer to end up with a Web page per 
Taxonomy, containing a descriptive entry per concept, where the 
constructed URI resolves to the relevant entry, we are not 
enthusiastic about option 1.

That appears to leave options 2, 3 and 4.  We have felt uneasy 
about choosing between them without considered advice from the 
SemWeb community.

As we are freezing the XML Schema for the News Architecture for our 
G2 Standards next week, and hope to ratify it at our AGM in Tokyo 
next month, this is an excellent time to consider and resolve the 
question of how to build URIs for Taxonomies used in News.

We would very much welcome your input.

Misha Wolf
News Standards Manager, Reuters, http://www.reuters.com/
Vice Chair, News Architecture WP, IPTC, http://www.iptc.org/


-----Original Message-----
From: Booth, David (HP Software - Boston) [mailto:dbooth@hp.com] 
Sent: 06 April 2007 17:12
To: Misha Wolf; Henry S. Thompson; www-tag@w3.org;
newsml-g2@yahoogroups.com
Subject: RE: Towards a TAG consideration of CURIEs

> From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] 
> . . . 
> Henry pointed out at the Edinburgh AC meeting that if we used simple 
> concatenation to give people access to information about terms in 
> our taxonomies, we could end up with illegal fragment IDs.  So:
> 
> If:
>    <subject qcode="iptc:123456"/>
> and if:
>    iptc -> http://www.iptc.org/NewsCodes#
> and if we used simple concatenation, we'd get:
>    iptc -> http://www.iptc.org/NewsCodes#123456
> 
> There is, of course, the other option:
>    iptc -> http://www.iptc.org/NewsCodes/
> then if we used simple concatenation, we'd get:
>    iptc -> http://www.iptc.org/NewsCodes/123456
> 
> We've decided to side-step this by specifying that the concatenation
> rules are taxonomy-specific and are up to the provider of each 
> taxonomy.

So any URI-based program using multiple taxonomies must have special
concatenation rules built in for *each* taxonomy?  That sounds awful.
Was there some reason why the group could at least recommend that the
namespace part end with either "/" or "#" (along with corresponding
constraints on the local part)?
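
A rough sketch of what such a recommendation would allow: one 
expansion routine shared by every taxonomy, with no per-taxonomy 
rules.  The prefix table and the function name are hypothetical; the 
namespace and qcode are the ones quoted above.

```python
# Hypothetical prefix table; the single entry is the mapping quoted above.
PREFIXES = {"iptc": "http://www.iptc.org/NewsCodes#"}


def expand(qcode: str, prefixes: dict = PREFIXES) -> str:
    """Expand 'prefix:local' by simple concatenation."""
    prefix, sep, local = qcode.partition(":")
    if not sep:
        raise ValueError("not a prefixed code: %r" % qcode)
    namespace = prefixes[prefix]
    # The suggested constraint: the namespace part ends with "/" or "#".
    if not namespace.endswith(("/", "#")):
        raise ValueError("namespace must end with '/' or '#': %r" % namespace)
    return namespace + local


if __name__ == "__main__":
    print(expand("iptc:123456"))
```

Note that this reproduces the digit-initial fragment from the quoted 
example; the "corresponding constraints on the local part" are not 
modelled here.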

David Booth, Ph.D.
HP Software
+1 617 629 8881 office  |  dbooth@hp.com
http://www.hp.com/go/software 


Received on Friday, 6 April 2007 23:32:20 UTC