RE: CURIEs, xmlns and bandwidth from Misha Wolf on 2005-11-03 (public-rdf-in-xhtml-tf@w3.org from November 2005)

From: Misha Wolf <Misha.Wolf@reuters.com>
Date: Thu, 03 Nov 2005 13:45:30 +0000
To: "Henry S. Thompson" <ht@inf.ed.ac.uk>, public-rdf-in-xhtml-tf@w3.org
Cc: iptc-metadata@yahoogroups.com
Message-id: <A29ADE959C70A1449470AA9A212F5D808B3354@LONSMSXM06.emea.ime.reuters.com>

Henry wrote:

> I took the opportunity of there being an XML Core WG telcon this
> afternoon to summarize the state of this thread and your
> concerns/requirements.

Thanks.

> The XML Core WG asked me to share with you our concern that using
> QName syntax for something which is not a QName, in a context 
> where a QName might well appear, would be a very misleading and 
> confusing thing to do.  We strongly urge you _either_ to use a 
> different syntax, _or_ to use real QNames.

In our view, the fact that the semantics of our {scheme, code} 
pairs are the same as the semantics of QNames justifies our use of 
the same syntax.

> Speaking personally, I wonder if you've really exhausted the 
> design space wrt making QNames meet your requirements.  Perhaps 
> you have, but I offer the following observations:
> 
> 1) It's at least a _little_ bit odd to think of numerals as being
>    symbols appropriate for namespace qualification.  It feels 
>    quite different to say "in my namespace, 'title' means ... and
>    is used for ..." as opposed to "in my namespace '1105' means 
>    ... and is used for ...".

In the XML world, namespaces tend to contain small numbers of 
terms, maybe in the low hundreds.  In the financial industry and in 
the news industry, taxonomies tend to be very large, sometimes 
containing hundreds of thousands of terms.  Such taxonomies often 
use numeric codes.  Taxonomies using numeric codes that we 
encounter in daily life include ISBN and ISSN.  It is regrettable 
that QNames do not support such real-life use cases.
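
To make the limitation concrete, here is a minimal sketch of why a purely 
numeric code such as '1105' cannot be the local part of a QName.  The 
pattern is an ASCII-only approximation of the NCName production in 
Namespaces in XML (the real production admits many more characters), and 
the names NCNAME_ASCII and is_ncname are illustrative only:

```python
import re

# ASCII-only approximation of NCName: the first character must be a
# letter or underscore; digits, '.' and '-' are allowed only afterwards.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_ncname(local_part: str) -> bool:
    """Return True if local_part could be the local part of a QName."""
    return NCNAME_ASCII.fullmatch(local_part) is not None


for code in ("title", "JPY", "1105"):
    print(code, is_ncname(code))
```

Under this approximation 'title' and 'JPY' pass, while '1105' is rejected 
because it begins with a digit.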

> 2) Namespace declarations _can_ be packaged up and stored in a
>    document for reference and re-use in XML documents, using
>    parameter entities:
>
> bindings.ent:
> 
> <!ATTLIST news:item
>   xmlns:news CDATA #FIXED "???"
>   xmlns:dsig CDATA #FIXED "http://www.w3.org/2000/09/xmldsig#"
>   xmlns:nc CDATA #FIXED "http://www.iptc.org/NewsCodes#"
>   xmlns:rtr CDATA #FIXED "http://www.reuters.com/NewsCodes#"
>   xmlns:lang CDATA #FIXED "http://www.isi.edu/in-notes/bcp/bcp47.txt#"
>   xmlns:curr CDATA #FIXED "http://en.wikipedia.org/wiki/ISO_4217#">
>
> xmplItem.xml:
>
>   <!DOCTYPE news:item [
>      <!ENTITY % ATTLIST SYSTEM ".../bindings.ent">
>      %ATTLIST;
>   ]>
>   <news:item>
>     <dsig:Signature/>
>     <itemMeta>
>       ...
>     </itemMeta>
>     <contentMeta>
>       <created>2005-10-23T12:34:56Z</created>
>       ...
>       <subject code="curr:JPY"/>
>     </contentMeta>
>     . . .
>   </news:item>
> 
> If you want to _really_ pennypinch on bandwidth, you can publish 
> a 'widely known' URI for popular ATTLIST files and processors 
> don't even have to download them, they can have their values 
> 'built in'.

The IPTC News Architecture WP will discuss this approach.
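
For discussion purposes, the 'built in' bindings idea might look roughly 
like the sketch below.  The table copies the prefix bindings from the 
quoted bindings.ent example (the 'news' binding is omitted because its 
value is not given there).  The function names resolve and expand are 
hypothetical, and forming a full URI by simple concatenation is an 
assumption, not something this thread has settled:

```python
# Prefix bindings taken from the quoted bindings.ent example.
BINDINGS = {
    "dsig": "http://www.w3.org/2000/09/xmldsig#",
    "nc": "http://www.iptc.org/NewsCodes#",
    "rtr": "http://www.reuters.com/NewsCodes#",
    "lang": "http://www.isi.edu/in-notes/bcp/bcp47.txt#",
    "curr": "http://en.wikipedia.org/wiki/ISO_4217#",
}


def resolve(value, bindings=BINDINGS):
    """Split 'prefix:code' into a (scheme URI, code) pair."""
    prefix, sep, code = value.partition(":")
    if not sep or prefix not in bindings:
        raise ValueError(f"no binding for prefix in {value!r}")
    return bindings[prefix], code


def expand(value, bindings=BINDINGS):
    """Assumed expansion rule: concatenate the scheme URI and the code."""
    scheme, code = resolve(value, bindings)
    return scheme + code


print(resolve("curr:JPY"))
print(expand("curr:JPY"))
```

A processor with such a table built in would not need to fetch the 
bindings file in order to interpret code="curr:JPY".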

Thanks,

Misha Wolf
News Standards Manager, Reuters, www.reuters.com
Vice-Chair, News Architecture WP, IPTC, www.iptc.org/dev
Chair, News Metadata Framework WG, IPTC, www.iptc.org/dev


Received on Thursday, 3 November 2005 13:45:19 UTC