
RE: CURIEs, xmlns and bandwidth

From: Misha Wolf <Misha.Wolf@reuters.com>
Date: Mon, 31 Oct 2005 13:43:49 +0000
Message-ID: <T7453570a990a01f0191d0c@lonsmime01.rit.reuters.com>
To: Dan Connolly <connolly@w3.org>
Cc: public-rdf-in-xhtml-tf@w3.org, public-swbp-wg@w3.org, iptc-metadata@yahoogroups.com

Hi Dan,

> > We fully intend to use GRDDL to convert the metadata in News 
> > Items to triples.  We've decided to use URIs (expressed as 
> > CURIEs) for *every* term drawn from a vocabulary.  As any 
> > individual News Item will employ many vocabularies, this will 
> > require many prefix->URI declarations.  And we can't afford 
> > the impact of using xmlns for this purpose.

> You can just choose "well-known" prefixes and hard-code them
> in your XSLT transformation. i.e. well-known to everybody
> that uses your profile. Then they don't have to be declared
> in each document.

Within our standards, each news provider is free to use its own 
taxonomies (of subjects, entities, genres, etc.), so there are 
no "well-known" prefixes to hardwire.

> > Consider a broadcast stream of real-time headlines.
> > Let's say that the text of each headline requires 50 bytes.  
> > Let's also say that the story metadata (which needs to be 
> > carried with the headline to allow filtering by recipients) 
> > requires 20 vocabularies and that each prefix declaration 
> > takes 50 bytes.  So having started with 50 bytes of text, we 
> > now end up broadcasting 21 * 50 bytes.  This is why we want to 
> > use XInclude to allow the prefix->URI declarations to be 
> > outside the headline object.  And XInclude can't be used for 
> > xmlns declarations.
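
As a rough check of the arithmetic quoted above, here is a small Python sketch. The byte counts are the illustrative figures from the quoted text, not measurements, and the variable names are made up for this example.

```python
# Illustrative figures taken from the quoted estimate; not measured values.
HEADLINE_BYTES = 50      # text of one headline
VOCABULARIES = 20        # vocabularies referenced by the story metadata
DECLARATION_BYTES = 50   # size of one prefix->URI declaration

# Every headline carries its own copy of all the declarations.
inline_total = HEADLINE_BYTES + VOCABULARIES * DECLARATION_BYTES

# How many times larger the message is than the headline text alone.
overhead_factor = inline_total / HEADLINE_BYTES

print(inline_total, overhead_factor)
```

Under these assumptions each headline grows from 50 bytes to 21 * 50 bytes, which is the cost that moving the declarations outside the headline object is meant to avoid.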

> Are you broadcasting the headlines in little XHTML documents?
> Or in a custom XML vocabulary?

NewsML 2 (under development) uses our own XML Schema for the 
metadata (NewsML 1 is DTD-based).  NewsML is content-agnostic, and 
may carry any payload.  Where the payload is text, some IPTC 
members (including Reuters) use XHTML, while others use a markup 
language called NITF.

> Do you have some examples that you're kicking around? Sorry if 
> I'm asking you to repeat yourself.

Our current draft looks like this.  The URIs shown are examples 
only.

<news:item>
  <dsig:Signature/>
  <catalog>
    <ns prefix="nc" uri="http://www.iptc.org/NewsCodes#"/>
    <ns prefix="rtr" uri="http://www.reuters.com/NewsCodes#"/>
    <ns prefix="lang" uri="http://www.isi.edu/in-notes/bcp/bcp47.txt#"/>
    <ns prefix="curr" uri="http://en.wikipedia.org/wiki/ISO_4217#"/>
    ...
  </catalog>
  <itemMeta>
    ...
  </itemMeta>
  <contentMeta>
    <created>2005-10-23T12:34:56Z</created>
    <creator code="afp:llm"/>
    <contributor code="greekMythology:muse"/>
    <source code="org:iptc"/>
    <significance>100</significance>
    <audience code="aud:implementors"/>
    <service code="service:tech"/>
    <edNote>Eat after reading</edNote>
    <title>Hello World</title>
    <description>Something or other ...</description>
    <subject code="nc:04008018"/>
    <subject code="rtr:123"/>
    <subject code="rtr:456"/>
    <subject code="rtr:789"/>
    <subject code="curr:JPY"/>
    <genre code="spec:tech"/>
    <language code="lang:zh-Hant"/>
    ...
  </contentMeta>
  <news:content>
    ...
  </news:content>
</news:item>
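
To make the intent of the catalog concrete, here is a minimal Python sketch of how a recipient might expand the code values into full URIs. It assumes that a CURIE is expanded by plain concatenation of the catalog URI and the part after the first colon. The function names load_catalog and expand_curie are hypothetical and are not part of any NewsML or GRDDL API.

```python
import xml.etree.ElementTree as ET

# Catalog entries copied from the draft above; the URIs are examples only.
CATALOG_XML = """
<catalog>
  <ns prefix="nc" uri="http://www.iptc.org/NewsCodes#"/>
  <ns prefix="rtr" uri="http://www.reuters.com/NewsCodes#"/>
  <ns prefix="lang" uri="http://www.isi.edu/in-notes/bcp/bcp47.txt#"/>
  <ns prefix="curr" uri="http://en.wikipedia.org/wiki/ISO_4217#"/>
</catalog>
"""


def load_catalog(xml_text):
    """Build a prefix -> URI mapping from the <ns> elements (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    return {ns.get("prefix"): ns.get("uri") for ns in root.iter("ns")}


def expand_curie(curie, catalog):
    """Expand 'prefix:reference' by concatenation (assumed expansion rule)."""
    prefix, sep, reference = curie.partition(":")
    if not sep:
        raise ValueError("not a CURIE (no colon): %r" % curie)
    if prefix not in catalog:
        raise KeyError("prefix not declared in catalog: %r" % prefix)
    return catalog[prefix] + reference


catalog = load_catalog(CATALOG_XML)
for code in ("nc:04008018", "rtr:123", "curr:JPY", "lang:zh-Hant"):
    print(code, "->", expand_curie(code, catalog))
```

A prefix that is missing from the mapping, such as "afp" here (the draft elides the remaining catalog entries), raises KeyError rather than being guessed.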

> If you help me understand your target, I might be able to
> flesh out what I'm suggesting. 

Thanks,

Misha Wolf
News Standards Manager, Reuters, www.reuters.com
Vice-Chair, News Architecture Working Party, IPTC, www.iptc.org/dev


Received on Monday, 31 October 2005 13:43:31 GMT
