RE: Towards a TAG consideration of CURIEs

From: Booth, David (HP Software - Boston) <dbooth@hp.com>
Date: Fri, 13 Apr 2007 00:29:55 -0400
Message-ID: <EBBD956B8A9002479B0C9CE9FE14A6C2027E6B9B@tayexc19.americas.cpqcorp.net>
To: "Misha Wolf" <Misha.Wolf@reuters.com>, <www-tag@w3.org>, <semantic-web@w3.org>, <public-xg-mmsem@w3.org>
Cc: <newsml-g2@yahoogroups.com>

Misha,

Would it be feasible to mandate a particular prefix as part of all
taxonomy IDs, such as "code:"?  For example:

	< . . . xml:id="code:12345" . . . >
	< . . . xml:id="code:foo" . . . >

This would permit the XML to be valid, and simple concatenation of
the namespace with the ID value would then work, whether the namespace
ends in "#" or "/":

	"http://example.org/whatever#" + "code:12345" =
		"http://example.org/whatever#code:12345"

	"http://example.org/whatever/" + "code:12345" =
		"http://example.org/whatever/code:12345"

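To make the concatenation concrete, here is a minimal sketch in Python.
The function name build_uri and the check on the final character of the
namespace are assumptions for illustration only; they do not come from
any specification.

```python
def build_uri(namespace: str, id_value: str) -> str:
    """Join a namespace and a prefixed ID by plain string concatenation.

    Hypothetical helper: it only assumes that the namespace ends in
    "#" or "/", as suggested in this message.
    """
    if not namespace.endswith(("#", "/")):
        raise ValueError("namespace is expected to end in '#' or '/'")
    return namespace + id_value


# The same ID value works with either style of namespace.
hash_uri = build_uri("http://example.org/whatever#", "code:12345")
slash_uri = build_uri("http://example.org/whatever/", "code:12345")
print(hash_uri)
print(slash_uri)
```

The point of the sketch is that no taxonomy-specific rule is needed:
the caller never has to look at the ID value to decide how to join it.
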
I know you (or someone else) mentioned that publishers do not want to
modify their existing codes, but something like this would be easy for
both human and machine to syntactically distinguish from the original
codes ("12345" or "foo").  In that sense the prefix seems conceptually
no different from other XML syntax that surrounds the original codes and
must be parsed away to retrieve the original codes.
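
As a rough illustration of how little parsing that involves, here is a
Python sketch.  The constant PREFIX and the function names
to_id_value and original_code are hypothetical, and "code:" is simply
the example prefix used above.

```python
PREFIX = "code:"  # assumed mandated prefix, taken from the example above


def to_id_value(code: str) -> str:
    """Wrap a publisher's original code in the mandated prefix."""
    return PREFIX + code


def original_code(id_value: str) -> str:
    """Strip the mandated prefix to recover the publisher's original code."""
    if not id_value.startswith(PREFIX):
        raise ValueError(f"ID value does not start with {PREFIX!r}: {id_value!r}")
    return id_value[len(PREFIX):]


# Round trip: the original codes are unchanged by adding and removing the prefix.
for code in ("12345", "foo"):
    assert original_code(to_id_value(code)) == code
```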

David Booth, Ph.D.
HP Software
+1 617 629 8881 office  |  dbooth@hp.com
http://www.hp.com/go/software 

> -----Original Message-----
> From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] 
> On Behalf Of Misha Wolf
> Sent: Friday, April 06, 2007 7:32 PM
> To: www-tag@w3.org; semantic-web@w3.org; public-xg-mmsem@w3.org
> Cc: newsml-g2@yahoogroups.com
> Subject: RE: Towards a TAG consideration of CURIEs
> 
> 
> Hi David,
> 
> The situation is as follows. 
> 
> The IPTC's first priority is B2B News interchange.  Support for 
> painless discovery of additional information about Taxonomies and 
> for the integration of News Taxonomies with the Semantic Web are 
> desirable goals but, for the IPTC, they come second.  This 
> prioritisation relates both to the importance we attach to each 
> aspect, and to the order in which we are tackling them.
> 
> Now, the whole business of URI construction from tuples is a bit 
> of a mess.  XML Namespaces don't mandate such a mechanism.  RDF does 
> require it, but the situation is difficult if many of the codes 
> start with a digit.
> 
> Given a code such as "123456", and given that we refuse to change 
> the code to, say, "_123456", the main legal choices before us 
> appear to be:
> 
> 1. Simple concatenation using "/" as the delimiter
>    "http://www.iptc.org/NewsCodes/" & "123456" ->
>    "http://www.iptc.org/NewsCodes/123456"
> 
> 2. Simple concatenation using "#_" as the delimiter
>    "http://www.iptc.org/NewsCodes#_" & "123456" ->
>    "http://www.iptc.org/NewsCodes#_123456"
> 
> 3. Concatenation using "#_" as the delimiter, where the "_" is 
>    glue, mandated by the relevant specification
>    "http://www.iptc.org/NewsCodes#" & "_" & "123456" ->
>    "http://www.iptc.org/NewsCodes#_123456"
> 
> 4. Concatenation using "#_" as the delimiter, where the "#_" is 
>    glue, mandated by the relevant specification
>    "http://www.iptc.org/NewsCodes" & "#_" & "123456" ->
>    "http://www.iptc.org/NewsCodes#_123456"
> 
> As we would very strongly prefer to end up with a Web page per 
> Taxonomy, containing a descriptive entry per concept, where the 
> constructed URI results in the relevant entry, we are not 
> enthusiastic about option 1.
> 
> That appears to leave options 2, 3 and 4.  We have felt uneasy 
> about choosing between them without considered advice from the 
> SemWeb community.
> 
> As we are freezing the XML Schema for the News Architecture for our 
> G2 Standards next week, and hope to ratify it at our AGM in Tokyo 
> next month, this is an excellent time to consider and resolve the 
> question of how to build URIs for Taxonomies used in News.
> 
> We would very much welcome your input.
> 
> Misha Wolf
> News Standards Manager, Reuters, http://www.reuters.com/
> Vice Chair, News Architecture WP, IPTC, http://www.iptc.org/
> 
> 
> -----Original Message-----
> From: Booth, David (HP Software - Boston) [mailto:dbooth@hp.com] 
> Sent: 06 April 2007 17:12
> To: Misha Wolf; Henry S. Thompson; www-tag@w3.org;
> newsml-g2@yahoogroups.com
> Subject: RE: Towards a TAG consideration of CURIEs
> 
> > From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] 
> > . . . 
> > Henry pointed out at the Edinburgh AC meeting that if we used simple 
> > concatenation to give people access to information about terms in 
> > our taxonomies, we could end up with illegal fragment IDs.  So:
> > 
> > If:
> >    <subject qcode="iptc:123456"/>
> > and if:
> >    iptc -> http://www.iptc.org/NewsCodes#
> > and if we used simple concatenation, we'd get:
> >    iptc -> http://www.iptc.org/NewsCodes#123456
> > 
> > There is, of course, the other option:
> >    iptc -> http://www.iptc.org/NewsCodes/
> > then if we used simple concatenation, we'd get:
> >    iptc -> http://www.iptc.org/NewsCodes/123456
> > 
> > We've decided to side-step this by specifying that the concatenation
> > rules are taxonomy-specific and are up to the provider of each 
> > taxonomy.
> 
> So any URI-based program using multiple taxonomies must have special
> concatenation rules built in for *each* taxonomy?  That sounds awful.
> Was there some reason why the group could at least recommend that the
> namespace part end with either "/" or "#" (along with corresponding
> constraints on the local part)?
> 
> David Booth, Ph.D.
> HP Software
> +1 617 629 8881 office  |  dbooth@hp.com
> http://www.hp.com/go/software 
> 
> 
Received on Friday, 13 April 2007 04:30:25 GMT
