- From: Dan Connolly <connolly@w3.org>
- Date: Tue, 13 Jun 2006 10:35:23 -0400
- To: Misha Wolf <Misha.Wolf@reuters.com>
- Cc: www-tag@w3.org, newsml-2@yahoogroups.com, public-rdf-in-xhtml-tf@w3.org
On Jun 2, 2006, at 2:14 PM, Misha Wolf wrote:
> Hi folks,
>
> A modest proposal, drawing on ideas from Mark, Henry, Tim, Dan, Norm
> and others:

I found the notes from our discussion in Edinburgh, Misha, but then
I left them at home and I'm travelling.

I got a better picture of the requirements, and we discussed
several options.

As I understand it, IPTC has a whole bunch of codes... collections
of codes, in fact. Vocabularies, I gather. The goal is a compact
syntax to encode a code within a vocabulary, such that you can get
from this compact syntax a URI for the code within the vocabulary
and for the vocabulary itself.

Some of the codes start with digits. We suspect (though we're not
certain) that vocabularies are homogeneous in this respect: within
a vocabulary, either all the codes start with a digit or none do.

I gather these are for use in NewsML2, and there's a desire to share
technology between NewsML2 and XHTML2 and other formats and to
use the URIs with RDF tools.

We discussed a number of possibilities... for the sake of example,
a numeric code set I know about (though I'm not at all sure it's
actually used in IPTC...) is SIC codes
(http://en.wikipedia.org/wiki/SIC_codes ) and a non-numeric code set
that I know about is IATA codes
(http://en.wikipedia.org/wiki/IATA_airport_code ).

Option A. Have a syntax for binding, say, sic: to
http://sic.org/vocab1# and use sic:0070 for a code. To get a URI
for that code, concatenate them. http://sic.org/vocab1#0070 .
To get a URI for the vocabulary, concatenate them and then strip
off the fragment: http://sic.org/vocab1 .

Similarly, bind, say, iata: to http://iata.org/airports# and let
iata:LGA expand to http://iata.org/airports#LGA and then to get
the vocabulary, strip off the fragment http://iata.org/airports.

The sic:0070 short-hand does not match XML/XPath QName syntax,
so you can't use it in RDF/XML.
You can't even make up a QName for the URI http://sic.org/vocab1#0070
so you simply can't use it as a property name in RDF/XML. (The example
of a SIC code is not something that you're likely to want to use
as an RDF property name.)

Option B: Bind sic: to http://sic.org/vocab1 and use sic:0070.
To get a URI for that code, concatenate them with a # between:
http://sic.org/vocab1#0070 . To get a URI for the vocabulary,
look in the binding, and get http://sic.org/vocab1 .

Option C: Like A, but for any codes that don't start with an
XML name start character, put a _ in front of it before you use
it in any of these web technologies. So sic:_0070 is the short
syntax, http://sic.org/vocab1#_0070 is the URI for the code, and
again, to get the URI for the vocab, strip off the fragment:
http://sic.org/vocab1 . Now we can use the short syntax as a
QName in RDF/XML.

In Option C, the IATA stuff is the same as in Option A: bind
iata: to http://iata.org/airports# and let iata:LGA expand to
http://iata.org/airports#LGA and strip off the fragment to get
the vocabulary and get http://iata.org/airports .

There might have been some other options that I've forgotten.
And I'm not sure to what extent compatibility with existing
NewsML practice is a requirement.

The proposal you make here seems much more complicated than
any of those options, and it involves a lot more coordination
(new rules binding on "Groups within the W3C and elsewhere").

> 1 We agree on a generic syntax and generic rules for Compact URIs
>   (CURIEs) in attribute values.
>
> 2 We agree that restricted syntaxes and rules will be (or have
>   been) defined for specific purposes. One such purpose is XML
>   Namespaces and QNames.
>
> 3 Groups within the W3C and elsewhere will define other restricted
>   syntaxes and rules for their own purposes.
>
> 4 The generic syntax for a CURIE in an attribute value will be:
>   <foo bar="prefix:suffix"/>
>
> 5 The generic syntax for multiple CURIEs in an attribute value
>   will (where permitted) be:
>   <foo bar="prefix1:suffix1 ... prefixN:suffixN"/>
>
> 6 Both the prefix and the suffix may (in the generic case) be
>   numeric.
>
> 7 Each language must specify:
>
> 7a the syntactic constraints (if any) on the prefix and suffix.
>
> 7b how CURIEs and URIs are distinguished, eg through dedicated
>    attributes or through a special syntax.
>
> 7c the mechanism for specifying the prefix-to-IRI mapping. The
>    mechanism may use information provided out-of-band.
>
> 7d whether and, if so, how the prefix and suffix are combined to
>    form an IRI.
>
> 7e whether the prefix and suffix form a tuple or whether they are
>    just a compact representation for an IRI.
>
> 7f whether the IRI mapped to the prefix is required to be
>    dereferenceable.
>
> 7g whether the IRI built from the prefix and suffix (and, possibly,
>    including also other building blocks) is required to be
>    dereferenceable.
>
> 7h whether any fragment identifiers in these IRIs are required to
>    be legal XML names.
>
> 8 To avoid confusion with XML Namespaces and QNames:
>
> 8a The xmlns attribute is reserved for use with XML Namespaces and
>    QNames.
>
> 8b If a prefix matches an xmlns declaration then the CURIE MUST be
>    interpreted as a QName.
>
> Misha
> ------------------- NewsML 2 resources ------------------------------
> http://www.iptc.org | http://www.iptc.org/std-dev/NAR/1.0
> http://www.iptc.org/std-dev | http://groups.yahoo.com/group/newsml-2

--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
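
For concreteness, Options A, B and C as described in the message above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the bindings are passed in as a plain dict, and the QName test is an ASCII-only approximation of the XML NCName rule rather than the full production.

```python
import re
from urllib.parse import urldefrag

# ASCII-only approximation of an XML NCName (an assumption for this sketch;
# the real production also allows many non-ASCII characters).
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def expand_a(bindings, short):
    """Option A: the binding ends in '#'; concatenate, then strip the fragment."""
    prefix, code = short.split(":", 1)
    code_uri = bindings[prefix] + code
    vocab_uri = urldefrag(code_uri).url  # drop '#...' to get the vocabulary
    return code_uri, vocab_uri


def expand_b(bindings, short):
    """Option B: the binding is the vocabulary URI; join with '#' between."""
    prefix, code = short.split(":", 1)
    vocab_uri = bindings[prefix]
    return vocab_uri + "#" + code, vocab_uri


def to_option_c(short):
    """Option C: put '_' in front of codes that lack a name start character."""
    prefix, code = short.split(":", 1)
    if not re.match(r"[A-Za-z_]", code):
        code = "_" + code
    return prefix + ":" + code


def looks_like_qname(short):
    """Rough check that both halves of prefix:local are NCNames."""
    prefix, _, local = short.partition(":")
    return bool(NCNAME.fullmatch(prefix) and NCNAME.fullmatch(local))


bindings_a = {"sic": "http://sic.org/vocab1#", "iata": "http://iata.org/airports#"}
bindings_b = {"sic": "http://sic.org/vocab1"}

print(expand_a(bindings_a, "sic:0070"))
print(expand_b(bindings_b, "sic:0070"))
print(expand_a(bindings_a, to_option_c("sic:0070")))
print(looks_like_qname("sic:0070"), looks_like_qname("sic:_0070"))
```

Under these assumptions, Options A and B give the same code and vocabulary URIs for sic:0070 and differ only in where the # lives, while Option C changes the code URI itself (to ...#_0070) in exchange for a short form that passes the QName check.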
Received on Tuesday, 13 June 2006 14:35:42 UTC