- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Wed, 21 Dec 2005 18:36:51 +0000
- To: www-tag <www-tag@w3.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Seems like there was a problem in the background of our discussion of
namespaceState-48 which is worth foregrounding: we don't have a
consensus understanding of what a namespace is, independently of or
prior to our understanding of _XML_ namespaces (please work hard to
understand the word 'namespace' in the rest of this message in that
general, not-restricted-to-XML, sense).

Wikipedia [1] says (of *Namespace (computer science)*):

  "A namespace is a context for identifiers."

What that means, I'm pretty confident, is that any discussion of an
identifier is incomplete/underspecified unless it specifies a context,
that is, a namespace.

Turning for a moment to identifiers, we can take either a top-down or
a bottom-up view:

Bottom-up, the computer I'm typing on has an identifier, 'erasmus'.
And I myself have a number of identifiers, such as 'Henry Swift
Thompson', my UK National Insurance number, my US SSN, etc. The
function which I'll run when I'm ready to send this message has the
identifier 'message-send-and-exit'. In each case, the context for the
above identifiers is pretty clear. The first of those contexts itself
has a clear-cut nsid (i.e. 'uk.ac.ed.inf'), but the others don't.

[I'm using the name 'nsid' to denote the identifier of a namespace, in
order to avoid confusion with the identifiers for which a namespace is
the context.]

Top-down, my previous postings about computer languages and XML [2]
[3] provide examples -- it is a property of Python as designed that
each class is a namespace, that is, provides a context for a set of
names, and that each method _within_ a class corresponds to a further
namespace.
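To make the Python case concrete, here is a minimal sketch. The class
'Mailer' and its members are hypothetical names chosen for
illustration; they do not come from the earlier postings. It shows the
class as a context for one set of names, and a call to one of its
methods as a context for another.

```python
class Mailer:
    """Hypothetical class: its body is a context for the names defined in it."""

    subject = "namespaces"

    def send(self, body):
        # 'envelope' is an identifier whose context is this call to send()
        envelope = (self.subject, body)
        # Report the identifiers visible in the method's own namespace
        return sorted(locals())


# Identifiers whose context is the class (double-underscore names skipped)
class_names = sorted(n for n in vars(Mailer) if not n.startswith("__"))
print(class_names)               # ['send', 'subject']

# Identifiers whose context is one call of the method
print(Mailer().send("hello"))    # ['body', 'envelope', 'self']

# Python happens to expose something nsid-like for the method's context
print(Mailer.send.__qualname__)  # Mailer.send
```

Note that the class and the method each get a usable dotted name here,
which is the well-behaved end of the spectrum; nothing in the sketch
speaks to contexts that lack an nsid.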
Some tentative observations:

 1) Identifiers need not identify anything -- if we consider the
    namespace of SSNs, it's clear that it includes both numbers which
    once did, or still do today, identify individuals, and also
    numbers which don't, either because they were issued in error or
    because they haven't (yet) been used.

 2) It doesn't follow from anything I've said _yet_ that identifiers
    are unique in their context. There are three people named 'John'
    within a few tens of metres of me as I type, and 'John Brown'
    identifies at least three distinct members of staff at the
    University of Edinburgh.

Most systematic namespaces, that is, ones defined top-down, do
eventually narrow things down to a point where uniqueness is
guaranteed, but they _don't_ always provide nsids all the way down.
For example we could start out by observing that the Java language
spec. defines three kinds of namespace for which an nsid is well
defined, as follows:

  Context    Things identified by name therein
  -------    ---------------------------------
  Package    Class
  Class      Class, Method, Variable
  Method     Variable

As noted in the earlier email [2], the context established by a Java
class is not itself a namespace within which identifiers are
necessarily unique. We can _describe_ three as-it-were sub-namespaces
of that namespace (the middle row above): "the namespace for classes
within a class", "the namespace for methods within a class" and "the
namespace for variables within a class". Within _those_ namespaces
identifiers _are_ unique, but interestingly Java doesn't give us a
well-defined way to _assign nsids_ to those namespaces.

What does this have to do with the architecture of the Web, and the
namespaceState-48 issue?

First of all, we can observe that _XML_ namespaces as defined fit with
the story given above, as does the draft finding [4]. XML namespace
names are nsids, and XML namespace local names are identifiers.
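The nsid/identifier split for XML namespaces can be seen directly in
a parser's output. What follows is only a sketch: the namespace name
'http://example.org/purchase' and the element names are made up for
illustration, and 'split_expanded_name' is a hypothetical helper, not
part of any library. It relies on the '{namespace-name}local-name'
notation that Python's xml.etree.ElementTree uses for element tags.

```python
import xml.etree.ElementTree as ET

# Made-up document: one namespace name (the nsid) and two local names
doc = '<p:order xmlns:p="http://example.org/purchase"><p:item/></p:order>'
root = ET.fromstring(doc)


def split_expanded_name(tag):
    """Split ElementTree's '{nsid}local' tag into (nsid, identifier)."""
    if tag.startswith("{"):
        nsid, local = tag[1:].split("}", 1)
        return nsid, local
    return None, tag  # element is in no namespace


pairs = [split_expanded_name(el.tag) for el in root.iter()]
for nsid, identifier in pairs:
    print(nsid, identifier)
# http://example.org/purchase order
# http://example.org/purchase item
```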
It's worth noting that wrt point (2) above the XML namespaces REC as
it stands does _not_ require identifiers in an XML namespace to be
unique. We can furthermore see that within the XML namespace
_identified_ by an XML Namespace name there may be some number of
unidentified sub-namespaces within which identifiers _are_ unique,
parallel to the Java case discussed above.

Second of all, the question does naturally arise as to how the above
analysis fits with the WebArch imperative to name things with URIs.
Our analysis gives us two problems:

 1) Some namespaces don't automatically come with nsids;

 2) Not all namespaces guarantee uniqueness for their identifiers.

Even when we have a namespace with a well-defined nsid and a
uniqueness guarantee therein, there are at least three further things
in the way of mapping to URIs by the most transparent means, i.e. the
mapping which looks like this:

  URI(identifier in context of a namespace) == URI(nsid) #? identifier

 3) The nsid itself may not map directly to a valid URI;

 4) The identifier may not be a valid fragment id per RFC3986, or for
    the media type associated with the information resource identified
    by URI(nsid);

 5) URI(nsid) may not match the following production, which I wish was
    available in RFC 3986 [5]:

      core-URI = scheme ":" hier-part [ "#" ]

    That is, there's no straightforward way of gluing the two parts
    together to get a valid URI.

It's worth noting that as specified XML namespaces cannot suffer from
problems (1), (3) or (4), but they are vulnerable to (2) and (5).

That's enough for one posting -- I'll return to the still-open
question as to where RDF's notion of namespace fits in all this in a
subsequent message.
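For what it's worth, the gluing step and problems (3) to (5) can be
sketched in a few lines. This is a rough approximation under stated
assumptions, not a validator: the function 'glue' is a hypothetical
name, the scheme test only checks for a leading 'scheme:', the
fragment test covers RFC 3986 syntax only (it knows nothing about
media types), and the core-URI test is approximated by refusing any
nsid URI that carries a query or a non-empty fragment.

```python
import re

# Rough check for a leading 'scheme:' (assumption: enough to spot problem 3)
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")

# RFC 3986 fragment syntax: *( pchar / "/" / "?" ), pct-encoded allowed
_FRAGMENT = re.compile(r"^(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*$")


def glue(nsid_uri, identifier):
    """Return URI(nsid) #? identifier, or raise naming the problem hit."""
    if not _SCHEME.match(nsid_uri):
        raise ValueError("problem (3): nsid is not a URI")
    if not _FRAGMENT.match(identifier):
        raise ValueError("problem (4): identifier is not a valid fragment")
    head, _, fragment = nsid_uri.partition("#")
    if "?" in head or fragment:
        raise ValueError("problem (5): nsid URI does not match core-URI")
    # The '#?' in the mapping: add '#' only if the nsid URI lacks one
    return head + "#" + identifier


print(glue("http://example.org/ns", "item"))   # http://example.org/ns#item
print(glue("http://example.org/ns#", "item"))  # http://example.org/ns#item
```

With these checks, glue("uk.ac.ed.inf", "erasmus") fails on (3),
glue("http://example.org/ns", "John Brown") fails on (4) because of
the space, and glue("http://example.org/ns?v=1", "item") fails on (5).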
ht

[1] http://en.wikipedia.org/wiki/Namespace_%28programming%29
[2] http://lists.w3.org/Archives/Public/www-tag/2005Dec/0065.html
[3] http://lists.w3.org/Archives/Public/www-tag/2005Dec/0070.html
[4] http://www.w3.org/2001/tag/doc/namespaceState-2005-12-16.html
[5] http://www.gbiv.com/protocols/uri/rfc/rfc3986.html#collected-abnf
- -- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQFDqaDDkjnJixAXWBoRAm/7AJ90Y23mpMVSzEvFy/luxDDel4M6TACeJHfZ
PREICi7Y/Rp2kdT0Nfcf2AM=
=AlKS
-----END PGP SIGNATURE-----
Received on Wednesday, 21 December 2005 18:36:58 UTC