- From: Al Gilman <asgilman@iamdigex.net>
- Date: Sat, 27 May 2000 10:49:38 -0500
- To: "Tim Berners-Lee" <timbl@w3.org>, <xml-uri@w3.org>
At 08:50 AM 2000-05-27 -0400, Tim Berners-Lee wrote:
>
>Let's pop the stack - while it is useful to have shared context about
>URIs and what they can do, and this list may be a way to get it, the
>most essential point is that XML does not need to specify anything
>about particular schemes.
>
>The reason for using a URI is that you
>_separate_ the design issues associated with any particular URI
>scheme from the design of the language. In other words,
>discussions like these are broken into two parts: the design of the
>language in terms of URIs, and the design of the URI schemes.

As one of the people trying to make XML safe for the semantic web, and a frequent defender of the URI class in all its breadth as "one thing" for some purposes, maybe it should fall to me to explore why the actual suggestion here could be a mistake.

The first independence that one needs to secure in language-building is independence between the flows of instance:instance relationships including but not limited to part:whole, and the flows of subtype:subtype relationships including but not limited to generic:specific.

Software design to realize the functions connoted by the data in the messages will be far more healthy and effective if "what we create in the language-construction toolkit" is a canvas where this two-dimensional topology [e.g. as elaborated by iteration on part:whole and generic:specific arcs] is constructed in a way that can be shared and preserved across applications.

Insisting that locators for peer instances and names for superlanguages all be muddled into one flat (i.e. null or point set topology) space without any shred of distinction or further structuring of the space, for all practical purposes prevents the growth of a healthy software infrastructure for languages.

Yes, the architecture desperately needs separation of concerns. No, these are not the first concerns we need to separate here.
The implementation of separation that you suggest is overkill and significantly damaging to the implementation prospects of the resulting logical map. Before getting serious about enforcing separation of concerns, we need to rotate to a more canonical coordinate frame or we will just reduce all the filet mignon and other fine cuts in the beast to sausage as part of salvaging the hocks.

Based on my limited exposure to language building [senior participant or technical director for IEEE 1076 VHDL, IEEE 1029.1 WAVES, etc.] I would be inclined to believe that the way to get people to move to the effective application of multilevel partial understanding in language building is by creating room for generic:specific relationships in a subspace which is reliably _separable_ from part:whole and similar instance:instance links.

This to me is one of the central learnings of the OO revolution: that "the repeated and _independent_ application of part:whole and generic:specific" constructs your domain-analysis canvas for you. You can't give away either of 'repeated' or 'independent' and get to where you need to be.

Larry mentioned how hard it was to get RFC 2396 consensed on. The reason it was hard was that it was very hard to keep the URN devotees who were obsessing on the nominative requirements and the URL pragmatists who were obsessing on the "surviving data communication vicissitudes" believing that they were talking about a common category of beast.

Many times I have defended the one-class view, as lately as protesting that "the VIN is there so I can locate my stolen car." But the essential knowledge operation is not putting things in one bucket or putting them in two: it is the compare-and-contrast operation where one accounts in what sense the two things are the same and in what ways they are different.
One constructs the setwise least upper bound, the most specific category spanning both, and the setwise greatest lower bound, the coarsest vocabulary of traits which, acting as coordinates, allow one to clearly mark the line separating the two.

Classical OO gives us the framework for accounting why most people, including language and communication protocol specialists, view URLs and URNs as two classes of things as opposed to viewing their commonality. This is that for one class the only method guaranteed is an identity check and the other additionally has a conspicuous GET method. The identity check function of the VIN is in fact a material aid in an implicit GET method, which requires searching, but closes much faster because cars have VIN metadata physically integrated into the physical realisation of the instance. But the performance profile, the value that is attached to optimizing 'check equality' vs. 'get' is sufficiently different for the two classes so that for most design analyses the distinction is significant (a.k.a. germane, important not to lose). Multilevel partial understanding lets us notice or ignore distinctions as appropriate to the tradeoff at hand.

The problem is that 'namespace identification' is a misnomer for 'markup vocabulary module identification' in this case, and markup vocabulary identification and application _needs to distinguish_ a variety of cases: the superclass with only an equality check, the subclass with both an equality check and an apply-rules method, and the further subclass that uses a "get the rules on the fly and interpretively apply them." These are major practical breakpoints in what kind of a problem it is to implement a language.

For language engineering, we need to use our multilevel partial understanding skills to capture a mix/match concurrent assessment of costs and benefits of language constructs.
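
A minimal sketch of the three cases of markup vocabulary identification distinguished above, in Python. Every name here (VocabularyId, CompiledVocabulary, FetchedVocabulary, same_as, apply_rules, fetch) is hypothetical, and the "rules" are reduced to a set of permitted element names purely for illustration; a real vocabulary module would carry far richer rules.

```python
from typing import Callable, Iterable, Set


class VocabularyId:
    """Superclass: the only guaranteed operation is an equality check."""

    def __init__(self, name: str) -> None:
        self.name = name

    def same_as(self, other: "VocabularyId") -> bool:
        # Identity is decided by comparing names, nothing more.
        return self.name == other.name


class CompiledVocabulary(VocabularyId):
    """Subclass: equality check plus an apply-rules method.

    The rules are known to the implementation ahead of time.
    """

    def __init__(self, name: str, allowed: Iterable[str]) -> None:
        super().__init__(name)
        self._allowed: Set[str] = set(allowed)

    def apply_rules(self, element: str) -> bool:
        # Toy rule: an element is acceptable if its name is listed.
        return element in self._allowed


class FetchedVocabulary(CompiledVocabulary):
    """Further subclass: gets the rules on the fly, then applies them.

    `fetch` is an assumed callable mapping a vocabulary name to its
    permitted element names; it stands in for whatever retrieval
    mechanism an implementor would actually use.
    """

    def __init__(self, name: str, fetch: Callable[[str], Iterable[str]]) -> None:
        super().__init__(name, [])
        self._fetch = fetch
        self._loaded = False

    def apply_rules(self, element: str) -> bool:
        if not self._loaded:
            # Retrieval happens only when the rules are first needed.
            self._allowed = set(self._fetch(self.name))
            self._loaded = True
        return super().apply_rules(element)
```

Code that only needs the equality check can treat all three as VocabularyId and ignore the rest, while the cost of implementing each level differs sharply; that is the distinction a flat identifier space would erase.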
We cannot require that only the highest level (most unified) of constructs survive and that only the benefits side be viewed. Nobody will or should follow that lead. We have to be able to follow the implementors into the trenches and help them comprehend the cost:benefit tradeoffs they face. We may indeed be able to show unexpected benefits [the whole Universal Design premise of the WAI hangs on this] but it has to be in an open dialog where costs and benefits are all respected at the table.

My vision of the "ideal" language architecture has a lot of things in it that scare implementors, like mutual recursion among the definitions of different language modules. This is the 'recursive' proposition you alluded to by paraphrasing "Godel, Escher, Bach." Recursion is better. But it lacks the "can't live without" quality of "placing a subspace distinction between part:whole and generic:specific." The architecture has to secure the vital essentials before we can even start thinking about optimizations.

So the "URIs are one class" supposition has to be examined in light of the actual care-abouts for healthy generic:specific relationship flow before it can become a constraint on the language-building architecture.

Al

>Tempting though it is for the users of URI schemes to redesign XML
>(how many non-xml languages have come out of the IETF recently?)
>and the users of XML to redesign URIs, this reduces the power and
>resilience of the whole system. This is one of the basic reasons
>for my asking the XML designers to make it a URI pure and simple.
>
>[This is independent of the relative URI debate now, we are talking about
>the properties of the absolutized thing]
>
>This is the software engineering principle of modularity.
>
><analogy>The design of a towing hitch separates the design of car and
>trailer.
>While the designers of trailers discuss the number of cylinders a car
>should have, and the designers of cars discuss whether trailers should
>be made of fiberglass or aluminum, then nothing is ever settled.
>Once a car can provide, and trailer accept, a standard hitch then the
>customer can make a workable system with a big enough car and a stable
>enough trailer for the job at hand.</analogy>
>
>For an application which defines social expectations, then it may be
>reasonable to mandate particular uses of URIs. The Platform for Privacy
>Preferences, for example, states as part of the protocol that a URI given
>to a privacy policy must never ever be used for anything else but exactly
>that policy. But in the general language at a level as fundamental as the
>namespace spec, you can't presume the social conditions of an
>application. Just quote the URI spec.
>
>
>Tim BL
>
>(comments on the original message included below)
>
>-----Original Message-----
>From: Paul W. Abrahams <abrahams@valinet.com>
>To: xml-uri@w3.org <xml-uri@w3.org>
>Date: Friday, May 26, 2000 1:48 PM
>Subject: Namespace names: a semi-serious proposal
>
>
>>OK, the folks who brought the namespace spec into the world
>>are of one voice: namespace names don't mean anything. They
>>are just unique identifiers.
>
>
>(For those folks who had planned on society using the things,
>this is rather a disappointment. But perhaps it is as well,
>as others say XML documents don't mean anything either ;-)
>
>One wonders, if they mean nothing, why do they have to be unique?
>Perhaps they should be replaced by empty strings!
>Obviously they do have *some* meaning. They have a meaning because their
>identity properties allow information to be communicated about them. All
>sorts of information - in specs, schemas, corridor gossip, etc. Once you
>identify something, then in fact you can't stop people talking about it.
>
>>So let's make the connotation match the denotation.
>>Let W3C set up a website that dispenses unique integers to all
>>comers, no matter how nefarious or trivial their purpose.
>>You ask for one and you get one. Service on the spot, no
>>questions asked. In fact, you can get 10**12 of them at a
>>shot if you wish. As far as I know there is no imminent
>>shortage of integers, though for the sake of ecology we
>>might wish to use the Base64 notation or hexadecimal instead
>>of decimal.
>
>
>(We don't do this as a central repository, as we know that
>being a central repository, while lucrative, prevents the web from scaling
>and is socially unacceptable.
>
>However, for those doing W3C work, we happily do provide a persistent
>URI of the form http://www.w3.org/YYYY/xxxxxx
>where the YYYY is just a device to help us ensure there is no reuse.)
>
>>The value of the xmlns attribute, i.e., the namespace name,
>>is then a unique integer, obtained from the source from
>>which such blessings flow. The creator of the namespace can
>>decide if a new version is sufficiently similar to a
>>previous one to warrant a new number.
>
>This is exactly the sort of author control of persistence which
>the owner of an HTTP name has, in fact.
>
>> It then is
>>abundantly clear that a namespace name conveys no
>>information whatsoever.
>
>
>Alas, I will use the number to say something about it -- maybe even in
>a standard - and then, it will have some meaning to anyone who has read the
>standard.
>
>>In fact, there's no need even to restrict the dispensation
>>of unique integers to a single source. Anyone can get into
>>the business as long as they themselves get a unique integer
>>as their business card, and prefix the integers they
>>dispense with their own ID and some appropriate delimiter.
>>Any cad who sends the same integer to two people will
>>deserve the same fate as that old Monty Python character who
>>distributed fake Hungarian-English lexicons to Hungarian
>>tourists in London.
>>
>>Maybe the integer dispenser already exists.
>>If it doesn't, it should. It obviously has many uses.
>
>
>There are, as people have pointed out, many similar schemes
>which have similar properties.
>
>
>Tim BL
>
>
>>Paul Abrahams
>>
Received on Saturday, 27 May 2000 10:37:14 UTC