I've written up some preliminary thoughts about this, in rather more
detail, but still very much a work-in-progress, something I would have
blogged except I don't have a blog.  Please bear this in mind when
responding -- there's very little here, particularly in the more
speculative sections towards the end, which I'm firmly convinced of.
So feedback is very much in order.

ht

[Attachment: names.html -- Names, namespaces and languages]

<?xml version="1.0" encoding="utf-8"?><html xmlns="http://www.w3.org/1999/xhtml"><head><meta HTTP-EQUIV="Content-type" CONTENT="text/html; charset=UTF-8"/><title>Names, Namespaces and Languages</title><style type="text/css">
 PRE.code {font-family: monospace}
 PRE {MARGIN-LEFT: 0em}
 OL OL {list-style-type: lower-alpha}
 </style></head><body STYLE="font-family: times">
 <div xmlns="" style="text-align: center">
 <h1>Names, Namespaces and Languages</h1>
 <div>Henry S. Thompson</div>
 <div>24 June 2005</div>
 </div>

 <h2 xmlns="">1. &#160; <a name="intro">Introduction</a></h2>
 <p xmlns="">This is very much a work-in-progress, something I would have blogged
except I don't have a blog.  Please bear this in mind when responding --
there's very little here, particularly in the more speculative sections towards
the end, which I'm firmly convinced of.
So feedback is very much in order.</p>

 <h2 xmlns="">2. &#160; <a name="background">Background</a></h2>
 <p xmlns="">TAG issues <a href="http://www.w3.org/2001/tag/issues.html?type=1#namespaceDocument-8">namespaceDocument-8</a> and <a href="http://www.w3.org/2001/tag/issues.html?type=1#abstractComponentRefs-37">abstractComponentRefs-37</a> were the topic of
<a href="http://www.w3.org/2001/tag/2005/06/14-16-minutes.html">extended discussion</a>
at the last TAG f2f.  There is considerable overlap between these two issues, and both are related to
<a href="http://lists.w3.org/Archives/Public/www-xml-schema-comments/2005JanMar/0080.html">Dan Connolly's comment</a> on the recently published Last Call Working Draft
of <a href="http://www.w3.org/TR/2005/WD-xmlschema-ref-20050329/">XML
Schema: Component Designators</a>.  Although a number of prior
misunderstandings were identified and overcome in the discussion, more work
is needed to make explicit the background assumptions about what problems we're
trying to solve and what the space of possible solutions is.  This note is an
attempt to begin that work.</p>

 <h2 xmlns="">3. &#160; <a name="namespaces">XML Namespaces: An evolving understanding</a></h2>
 <p xmlns="">The <a href="http://lists.w3.org/Archives/Public/www-tag/2005Feb/0017.html">recent discussion</a> about whether the <a href="http://www.w3.org/TR/2005/CR-xml-id-20050208/">xml:id</a> spec. 'changes' the XML namespace by 'adding' a new name to it helped clarify that the minimalist reading of the <a href="http://www.w3.org/TR/xml-names11/">XML Namespaces</a> REC has achieved dominance in the intellectual marketplace.  By "the minimalist reading" I mean the reading on which an XML namespace is primarily a syntactic mechanism for distinguishing one class of uses of a particular simple name from all other uses thereof.
This means a namespace is <i>not</i> a finite set of names, nor a more complex structured object as suggested by the (in)famous now-deleted non-normative <a href="http://www.w3.org/TR/REC-xml-names/#Philosophy">Appendix A: The Internal Structure of XML Namespaces</a> of version 1.0.</p>
 <p xmlns="">The minimalist reading is the only one consistent with actual usage --
people mint new namespaces by simply <i>using</i> them in an expanded name
or namespace declaration, without thereby incurring any obligation to define
the boundaries of some set.  You could say that a namespace springs into life
the first time anyone uses a URI as a namespace name, but on balance I prefer
an understanding which doesn't reify a namespace as such at all.  I don't
object to using phrases such as "[some name] in the [some URI] namespace", but
that's just another way of saying "the expanded name <code>&lt; some_URI,
some_name &gt;</code>".</p>
 <p xmlns="">On this account it makes sense to ask questions about namespace names, e.g. "What
namespace name will XSLT 2.0 use?" and about expanded names, e.g. "Does XSLT
2.0 change the definition of the element named <code>&lt;
http://www.w3.org/1999/XSL/Transform, output &gt;</code>?", but
questions about namespaces as such are rarely if ever useful (unless of course
they're understood as questions about namespace <i>names</i> or about
some otherwise-defined set of expanded names with a namespace name in common).</p>

 <h2 xmlns="">4. &#160; <a name="languages">From namespaces to languages</a></h2>
 <p xmlns="">Taking the argument one step further, it is a necessary consequence of the
position outlined above that it is incoherent to understand e.g.
"Such-and-such a type is defined in the XML Schema namespace" to mean that the
XML Schema namespace contains types (or type definitions).
Considering things
carefully, we must understand this sentence as meaning that the XML Schema
language assigns the expanded name <code>&lt; http://www.w3.org/2001/XMLSchema,
such-and-such &gt;</code> to some type definition.  This perspective actually
works well with our overall understanding of XML Schema: a schema document
for a particular target namespace corresponds to a schema which assigns element
declarations, type definitions, etc. to expanded names, all
of which have that target namespace as their namespace name.</p>
 <p xmlns="">So it's <i>languages</i> (or as we used to say,
<i>applications</i>, in the SGML sense) which assign expanded names
<i>to</i> things.  That assignment may be unique and unequivocal, but
evidently it is often one-to-many.  And of course it's the language which
determines what there is to be named, its own little (or large) ontology.</p>
 <p xmlns="">Many languages of course <i>do</i> provide only one sort of thing to be
named using a particular namespace name (e.g. <a href="http://www.w3.org/TR/xpath-functions/">XQuery Functions and Operators</a>), and others, although naming more than one sort of thing, constrain their use of names to be unambiguous (e.g. <a href="http://www.w3.org/TR/SVG/">SVG</a>, <a href="http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/">RDF</a>).  In both these cases, just an expanded name is sufficient to identify something, and constructing a URI for something is therefore straightforward.</p>
 <p xmlns="">On the other hand there are many examples of languages where the mapping
is one-to-many.  The most immediate example is XML
itself.  The low-level syntax of XML distinguishes two
sorts of things which are identified by expanded name: elements and attributes.
Since there is no prohibition on using the same expanded name for both an
element and an attribute, an expanded name is not sufficient to uniquely identify a named
aspect of an XML document (or document type, in the ordinary language sense) --
you need to know what I've been calling the <i>sort</i> as well, i.e.
<b>element</b> or <b>attribute</b>.  For example, all of the
following names:</p>
 <ul xmlns="">
 <li><code>abbr</code></li>
 <li><code>cite</code></li>
 <li><code>code</code></li>
 <li><code>dir</code></li>
 <li><code>label</code></li>
 <li><code>link</code></li>
 <li><code>object</code></li>
 <li><code>span</code></li>
 <li><code>style</code></li>
 <li><code>title</code></li>
 </ul>
 <p xmlns="">can be used for either elements or attributes in XHTML 1.0 (transitional)
documents, and at least three of these (<code>abbr</code>, <code>cite</code>
and <code>title</code>) survive as ambiguous in XHTML Basic 1.0.</p>
 <p xmlns="">When we expand our scope to XML validation, we suddenly get a
<i>much</i> more complex situation, in which there is in principle an
unbounded number of things which share a name and can be distinguished only by
context: we have element declarations (max. one per expanded
name), and attribute declarations (max. as many as there are
element declarations).  For example, there are four distinct attribute
definitions called <b>align</b> and five distinct attribute definitions
called <b>type</b> in the <a href="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">XHTML transitional DTD</a>.
W3C XML Schema not only has a richer set of what it calls "symbol spaces", so that there are seven things whose definitions can be named (it adds types, attribute and element groups, notations and identity-constraints alongside elements and attributes), it also allows elements as well as attributes to be defined in context.</p>
 <p xmlns="">Finally we should note that a language may encompass quite a
range of variation in terms of the things it assigns a particular expanded name
to.  There can be variation over time, as new versions of a language are released,
and even alternative variants released at the same time.  The HTML
<code>P</code> element has a long and complex history, and even the XHTML
<code>p</code> element has three distinct variants in version 1.0 (strict,
transitional and basic), none of which is exactly the same as the one in version 1.1.</p>
 <p xmlns="">None of this should come as a surprise.  Ordinary language uses names in
ways which are both ambiguous and context-determined, and whose use changes
over time.  But the consequences for the Web are more serious, particularly as
we consider the use of names for things on the Web intended for automatic
processing, where appeal to context for disambiguation may not be
straightforward at all.
At the very least it is clear that it is no longer
trivial to specify an approach to
constructing URIs for things which will cover all the cases just discussed.</p>

 <h2 xmlns="">5. &#160; <a name="abstractions">What abstractions to choose</a></h2>
 <p xmlns="">Broadly speaking there are three ways one could respond to the situation
outlined above:</p>
 <ol xmlns="">
 <li>Only expect to have a systematic approach to naming things with URIs when the
language or application involved has a single flat story about naming (e.g. <a href="http://www.w3.org/TR/SVG/">SVG</a>, <a href="http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/">RDF</a>).
Abstract over variations.  We might call this the <a name="simple"><b>simple</b></a> (or
<b>simplistic</b>) view.</li>
 <li>Demand a systematic approach in all cases, and over all variations,
but acknowledge that this means that in complex cases (e.g. WSDL, XML Schema)
the resulting URIs will themselves be complex, requiring new media types and/or new XPointer
schemes.  We might call this the <a name="rich"><b>rich</b></a> (or
<b>overkill</b>) view, exemplified by <a href="http://www.w3.org/TR/2005/WD-xmlschema-ref-20050329/">XML
Schema: Component Designators</a>.</li>
 <li>Look for a middle ground, which adopts the <a href="#simple">simple</a>
view wherever possible, otherwise an approximation to it which abstracts
over all variation and as much application-specific detail as possible, with
the option to fall back to the <a href="#rich">rich</a> view as and when
this is necessary.
We might call this the <a name="middle"><b>middle</b></a> (or <b>80/20</b>) view.</li>
 </ol>
 <p xmlns="">It's important to note that there's an unspoken assumption common to all
three of the above views: we're going to construct the URI for some named thing by adding
some variety of fragment identifier to the namespace name of its expanded name.
There is no space here for the possibility that two distinct languages might
use the <i>same</i> expanded name for two evidently distinct things.
This is intimately bound up with another assumption with respect to variation,
namely that it's possible to tell reliably when a change in something counts as
a variation, as opposed to a fundamental change of identity.  If I change the
named definition of a type by nudging its min or max a bit, that pretty
clearly just produces a variant of the same type.  But if I change the
definition assigned to a name from being an integer to being a date, it's
equally clear that that's no longer the same type at all.  Those are the
easy cases; there will be many which are much harder to call.</p>
 <p xmlns="">I expect that
both of these assumptions will want to be recast as Good Practice notes going
forward (e.g. "Don't use the same expanded name for two different things of the
same sort in different languages under your control"; "As a language evolves,
use new expanded names for new things, don't recycle old ones").</p>

 <h2 xmlns="">6. &#160; <a name="details">More details on the <a href="#middle">middle</a> ground</a></h2>
 <p xmlns="">Without more detailed examination of real usage scenarios, it's hard to
be sure of what general principles to establish here, but on the basis of my
limited experience to date it seems likely that something along the following
lines is a reasonable starting point.</p>
 <p xmlns="">It's up to the owner of a language, for each of the namespaces involved
in that language, to provide a constructive definition of the way in which
things which have expanded names can also be named with URIs.  I've identified
the following guidelines for such definitions:</p>
 <ul xmlns="">
 <li>Use the namespace URI as the basis of the constructed name;</li>
 <li>Where part of the complexity of a language's name structure comes
from giving expanded names to more than one sort of thing, include the sort in
the URI;</li>
 <li>Where evolution over time and/or simultaneous language variants are a
possibility, be clear that simple URIs are <i>not</i> capable of
capturing this;</li>
 <li>Try to provide retrievable representations so that the namespace
URI(s) you construct a) have a widely used media type and b) yield a useful
result when the fragment identifier is resolved.</li>
 </ul>

 <h2 xmlns="">7. &#160; <a name="example">The W3C XML Schema example</a></h2>
 <p xmlns="">The <a href="http://www.w3.org/2001/tag/2005/06/14-16-minutes.html#item031">position</a> that emerged at the end of the recent TAG f2f is consistent with the above guidelines, but obviously lacking in detail.
On balance my preferred approach would look something like this:</p>
 <blockquote xmlns=""><div>URI names are provided for everything defined or declared by name
at the top level which has some conceptual identity independent of the details
of W3C XML Schema, i.e. elements, attributes and simple and complex types.</div></blockquote>
 <blockquote xmlns=""><div>The URI name for something of one of the above four sorts is
constructed by concatenating the namespace name of its expanded name, a
<code>/</code> if that does not already end with one, its sort
(i.e. <b>attribute</b>, <b>complexType</b>, <b>element</b>
or <b>simpleType</b>), a <code>/#</code> and the local name of its
expanded name.</div></blockquote>
 <blockquote xmlns=""><div>URI names for languages which don't use namespaces are
based on a URI designated for the purpose in the language specification, e.g.
<a href="http://www.w3.org/2002/xmlspec/">http://www.w3.org/2002/xmlspec/</a> for the W3C's 'specprod' language.</div></blockquote>
 <p xmlns="">It would be the responsibility of language owners to provide retrievable
representations of resources at each sort-determined sub-URI of the namespace
URI to make this work (but see httpRange-14 below under <a href="#issues">Outstanding issues</a>).</p>
 <p xmlns="">So for example the URI for W3C XML Schema's own dateTime type would be</p>
 <blockquote xmlns=""><div><pre class="code">http://www.w3.org/2001/XMLSchema/simpleType/#dateTime</pre></div></blockquote>
 <p xmlns="">and perhaps, for the DAML+OIL example cited in <link>Dan Connolly's
feedback</link>, we would get the following ('perhaps' because there's no namespace involved in the example as published):</p>
 <blockquote xmlns=""><div><pre class="code">http://www.w3.org/TR/2001/NOTE-daml+oil-walkthru-20011218/simpleType/#over12</pre></div></blockquote>
 <p xmlns="">(My inspiration for this approach is at least
in part the IANA
structuring of their <a href="http://www.iana.org/assignments/media-types/">registry of media types</a>, which gives us e.g.</p>
 <blockquote xmlns=""><div><pre class="code">http://www.iana.org/assignments/media-types/application/mathematica</pre></div></blockquote>
 <p xmlns="">for <code>application/mathematica</code> (although, irritatingly, it gives us
nothing for e.g. <code>text/html</code>).)</p>

 <h2 xmlns="">8. &#160; <a name="issues">Outstanding issues</a></h2>
 <p xmlns="">This is by no means a fully-baked story.  Some things I <i>know</i>
are shaky are</p>
 <dl xmlns="">
 <dt><b>httpRange-14</b></dt><dd>The TAG's recent resolution of this issue leaves
the question of what sort of resource a namespace URI identifies, and whether you should be able to
retrieve any representation of it at all, very much up in the air.  The
knock-on implications of this with respect to fragment identifiers, sub-URIs, etc. are even
more unclear.</dd>
 <dt><b>Schema Component Designators</b></dt><dd>As presented there is a complete
disconnect between this story and SCDs.  Maybe that's the best that we can do,
but it would certainly be better if we could get a solution which shared more.</dd>
 <dt><b>Languages vs. namespaces</b></dt><dd>This notion of a language as distinct
from a namespace is only just (at least for me) in the process of being worked
out.  It may yet be the case that we would do better to use some kind of
'language URIs' as the base, rather than namespace URIs.  The continued
widespread use of languages such as DocBook which don't use namespaces
shouldn't be ignored.</dd>
 </dl>

</body></html>

--
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]

Received on Friday, 24 June 2005 16:28:30 GMT
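
The URI-name construction proposed in section 7 of the note above (namespace name, a "/" if needed, the sort, "/#", the local name) can be sketched in Python roughly as follows. This is only an illustrative sketch and not part of the original note: the function name uri_name and the constant SORTS are hypothetical, and treating the DAML+OIL walkthrough URI as if it were a namespace name follows the note's own hedged 'perhaps' example.

```python
# Hypothetical helper illustrating the URI-name construction described in
# section 7 of the note. The names used here are assumptions for illustration.

# The four sorts the proposal provides URI names for.
SORTS = ("attribute", "complexType", "element", "simpleType")


def uri_name(namespace_name: str, sort: str, local_name: str) -> str:
    """Build a URI name from an expanded name (namespace name, local name) and its sort."""
    if sort not in SORTS:
        raise ValueError(f"unsupported sort: {sort!r}")
    # Add a '/' only if the namespace name does not already end with one.
    base = namespace_name if namespace_name.endswith("/") else namespace_name + "/"
    # Append the sort, then '/#', then the local name.
    return f"{base}{sort}/#{local_name}"


if __name__ == "__main__":
    # The two examples given in the note.
    print(uri_name("http://www.w3.org/2001/XMLSchema", "simpleType", "dateTime"))
    print(uri_name(
        "http://www.w3.org/TR/2001/NOTE-daml+oil-walkthru-20011218",
        "simpleType",
        "over12",
    ))
```

The first call reproduces the dateTime URI given in the note. Because the sort is part of the path, the same expanded name used for, say, both an element and a type would yield two distinct URIs, which is the disambiguation the middle view relies on.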