Re: Namespace names: a semi-serious proposal from Al Gilman on 2000-05-27 (xml-uri@w3.org from May 2000)

From: Al Gilman <asgilman@iamdigex.net>
Date: Sat, 27 May 2000 10:49:38 -0500
To: "Tim Berners-Lee" <timbl@w3.org>, <xml-uri@w3.org>
Message-Id: <200005271438.KAA2193988@smtp2.mail.iamworld.net>
At 08:50 AM 2000-05-27 -0400, Tim Berners-Lee wrote:
>
>Let's pop the stack - while it is useful to have shared context about
>URIs and what they can do, and this list may be a way to get it, the
>most essential point is that XML does not need to specify anything
>about particular schemes.
>
>The reason for using a URI is that you
>_separate_ the design issues associated with any particular URI
>scheme from the design of the language.   In other words,
>discussions like these are broken into two parts: the design of the
>language in terms of URIs, and the design of the URI schemes.

As one of the people trying to make XML safe for the semantic web, and a
frequent defender of the URI class in all its breadth as "one thing" for
some purposes, maybe it should fall to me to explore why the actual
suggestion here could be a mistake.  

The first independence that one needs to secure in language-building is
independence between the flows of instance:instance relationships including
but not limited to part:whole, and the flows of subtype:subtype
relationships including but not limited to generic:specific.

Software design to realize the functions connoted by the data in the
messages will be far more healthy and effective if "what we create in the
language-construction toolkit" is a canvas where this two-dimensional
topology [e.g. as elaborated by iteration on part:whole and
generic:specific arcs] is constructed in a way that can be shared and
preserved across applications.

Insisting that locators for peer instances and names for superlanguages all
be muddled into one flat (i.e. null or point set topology) space without
any shred of distinction or further structuring of the space, for all
practical purposes prevents the growth of a healthy software infrastructure
for languages.


Yes, the architecture desperately needs separation of concerns.

No, these are not the first concerns we need to separate here.

The implementation of separation that you suggest is overkill and
significantly damaging to the implementation prospects of the resulting
logical map.  

Before getting serious about enforcing separation of concerns, we need to
rotate to a more canonical coordinate frame or we will just reduce all the
filet mignon and other fine cuts in the beast to sausage as part of
salvaging the hocks.

Based on my limited exposure to language building [senior participant or
technical director for IEEE 1076 VHDL, IEEE 1029.1 WAVES, etc.] I would be
inclined to believe that the way to get people to move to the effective
application of multilevel partial understanding in language building is by
creating room for generic:specific relationships in a subspace which is
reliably _separable_ from part:whole and similar instance:instance links.
This to me is one of the central learnings of the OO revolution: that "the
repeated and _independent_ application of part:whole and generic:specific"
constructs your domain-analysis canvas for you.  You can't give away either
of 'repeated' or 'independent' and get to where you need to be.  

Larry mentioned how hard it was to get RFC 2396 consensed on.  The reason
it was hard was that it was very hard to keep the URN devotees who were
obsessing on the nominative requirements and the URL pragmatists who were
obsessing on the "surviving data communication vicissitudes" believing that
they were talking about a common category of beast.  

Many times I have defended the one-class view, as lately as protesting that
"the VIN is there so I can locate my stolen car."  But the essential
knowledge operation is not putting things in one bucket or putting them in
two: it is the compare-and-contrast operation where one accounts in what
sense the two things are the same and in what ways they are different.  One
constructs the setwise least upper bound, the most specific category
spanning both, and the setwise greatest lower bound, the coarsest
vocabulary of traits which, acting as coordinates, allow one to clearly
mark the line separating the two.


Classical OO gives us the framework for accounting for why most people,
including language and communication protocol specialists, view URLs and
URNs as two classes of things as opposed to viewing their commonality.
The reason is that for one class the only method guaranteed is an identity
check, while the other additionally has a conspicuous GET method.  The
identity check function of the VIN is in fact a material aid in an implicit
GET method, which requires searching, but closes much faster because cars
have VIN metadata physically integrated into the physical realisation of
the instance.  But the performance profile, the value that is attached to
optimizing 'check equality' vs. 'get', is sufficiently different for the two
classes that for most design analyses the distinction is significant
(a.k.a. germane, important not to lose).
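
To make the method-set distinction concrete, here is a minimal Python
sketch.  The class names Name and Locator, the helper find_by_name, and the
sample records are all invented for illustration; nothing here is taken
from RFC 2396 or from any real library.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Name:
    """Hypothetical superclass: the only guaranteed method is an identity check."""
    value: str

    def same_as(self, other: "Name") -> bool:
        return self.value == other.value


@dataclass(frozen=True)
class Locator(Name):
    """Hypothetical subclass: keeps the identity check and adds an explicit GET."""

    def get(self, store: Dict[str, str]) -> str:
        # 'store' stands in for whatever the locator dereferences against.
        return store[self.value]


def find_by_name(name: Name, instances: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Implicit GET for a bare name: a linear search that relies only on
    each instance carrying its own identifier as metadata (like a VIN)."""
    for inst in instances:
        if Name(inst["id"]).same_as(name):
            return inst
    return None


# Made-up sample data; the identifiers are placeholders, not real VINs.
cars = [{"id": "VIN-0001", "owner": "someone else"},
        {"id": "VIN-0002", "owner": "Al"}]
print(find_by_name(Name("VIN-0002"), cars))

page = Locator("http://example.org/doc")
print(page.get({"http://example.org/doc": "document body"}))
```

The bare name can still be "got", but only by searching; the locator
reaches its target directly.  That cost difference is the distinction the
paragraph above argues is worth keeping.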

Multilevel partial understanding lets us notice or ignore distinctions as
appropriate to the tradeoff at hand.  The problem is that 'namespace
identification' is a misnomer for 'markup vocabulary module
identification' in this case, and markup vocabulary identification and
application _needs to distinguish_ a variety of cases: the superclass with
only an equality check, the subclass with both an equality check and an
apply-rules method, and the further subclass that uses a "get the rules on
the fly and interpretively apply them" method.  These are major practical
breakpoints in what kind of a problem it is to implement a language.  For
language engineering, we need to use our multilevel partial understanding
skills to capture a mix/match concurrent assessment of costs and benefits
of language constructs.  We cannot require that only the highest level
(most unified) of constructs survive and that only the benefits side be
viewed.  Nobody will or should follow that lead.  We have to be able to
follow the implementors into the trenches and help them comprehend the
cost:benefit tradeoffs they face.  We may indeed be able to show unexpected
benefits [the whole Universal Design premise of the WAI hangs on this] but
it has to be in an open dialog where costs and benefits are all respected
at the table.
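
The three cases named above can be sketched as a small class hierarchy.
This is only an illustration under stated assumptions: the class names, the
"rules" (a bare set of permitted element names), and the fake_fetch helper
are hypothetical, and no real retrieval or schema language is involved.

```python
from typing import Callable, Iterable, List, Set


class VocabularyId:
    """Case 1 (hypothetical): a module identifier that supports only equality."""

    def __init__(self, name: str) -> None:
        self.name = name

    def same_as(self, other: "VocabularyId") -> bool:
        return self.name == other.name


class KnownVocabulary(VocabularyId):
    """Case 2: equality plus an apply-rules method, with the rules built in."""

    def __init__(self, name: str, allowed: Set[str]) -> None:
        super().__init__(name)
        self._allowed = allowed

    def rules(self) -> Set[str]:
        return self._allowed

    def apply_rules(self, elements: Iterable[str]) -> List[str]:
        # Return the element names that the rules do not permit.
        allowed = self.rules()
        return [e for e in elements if e not in allowed]


class FetchedVocabulary(KnownVocabulary):
    """Case 3: the rules are obtained on the fly, then applied interpretively."""

    def __init__(self, name: str, fetch: Callable[[str], Set[str]]) -> None:
        super().__init__(name, set())
        self._fetch = fetch  # stand-in for a real retrieval step

    def rules(self) -> Set[str]:
        return self._fetch(self.name)


def fake_fetch(name: str) -> Set[str]:
    # Assumption: a canned lookup in place of any network access.
    return {"title", "para"} if name == "http://example.org/ns/doc" else set()


built_in = KnownVocabulary("http://example.org/ns/doc", {"title", "para"})
fetched = FetchedVocabulary("http://example.org/ns/doc", fake_fetch)
print(built_in.same_as(fetched))
print(built_in.apply_rules(["title", "blink"]))
print(fetched.apply_rules(["title", "blink"]))
```

All three levels answer the equality question the same way; what changes
is how much machinery, and how much cost, stands behind applying the rules.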

My vision of the "ideal" language architecture has a lot of things in it
that scare implementors, like mutual recursion among the definitions of
different language modules.  This is the 'recursive' proposition you
alluded to by paraphrasing "Godel, Escher, Bach."  Recursion is better.
But it lacks the "can't live without" quality of "placing a subspace
distinction between part:whole and generic:specific."  The architecture has
to secure the vital essentials before we can even start thinking about
optimizations.  So the "URIs are one class" supposition has to be examined
in light of the actual care-abouts for healthy generic:specific relationship
flow before it can become a constraint on the language-building architecture.

Al


>Tempting though it is for the users of URI schemes to redesign XML
>(how many non-xml languages have come out of the IETF recently?)
>and the users of XML to redesign URIs,  this reduces the power and
>resilience of the whole system.  This is one of the basic reasons
>for my asking the XML designers to make it a URI pure and simple.
>
>[This is independent of the relative URI debate now, we are talking about
>the properties of the absolutized thing]
>
>This is the software engineering principle of modularity.
>
><analogy>The design of a towing hitch separates the design of car and
>trailer.  While the designers of trailers discuss the number of cylinders
>a car should have, and the designers of cars discuss whether trailers
>should be made of fiberglass or aluminum, nothing is ever settled.
>Once a car can provide, and trailer accept, a standard hitch then the
>customer can make a workable system with a big enough car and a stable
>enough trailer for the job at hand.</analogy>
>
>For an application which defines social expectations, it may be
>reasonable to mandate particular uses of URIs.  The Platform for Privacy
>Preferences, for example, states as part of the protocol that a URI given
>to a privacy policy must never ever be used for anything else but exactly
>that policy.  But in the general language at a level as fundamental as the
>namespace spec, you can't presume the social conditions of an application.
>Just quote the URI spec.
>
>
>Tim BL
>
>(comments on the original message included below)
>
>-----Original Message-----
>From: Paul W. Abrahams <abrahams@valinet.com>
>To: xml-uri@w3.org <xml-uri@w3.org>
>Date: Friday, May 26, 2000 1:48 PM
>Subject: Namespace names: a semi-serious proposal
>
>
>>OK, the folks who brought the namespace spec into the world
>>are of one voice: namespace names don't mean anything.  They
>>are just unique identifiers.
>
>
>(For those folks who had planned on society using the things,
>this is rather a disappointment.  But perhaps it is as well,
>as others say XML documents don't mean anything either ;-)
>
>One wonders, if they mean nothing, why do they have to be unique?
>Perhaps they should be replaced by empty strings!
>Obviously they do have *some* meaning. They have a meaning because their
>identity properties allow information to be communicated about them. All
>sorts of information - in specs, schemas, corridor gossip, etc.  Once you
>identify something, then in fact you can't stop people talking about it.
>

>>So let's make the connotation match the denotation.  Let W3C
>>set up a website that dispenses unique integers to all
>>comers, no matter how nefarious or trivial their purpose.
>>You ask for one and you get one.  Service on the spot, no
>>questions asked.   In fact, you can get 10**12 of them at a
>>shot if you wish.   As far as I know there is no imminent
>>shortage of integers, though for the sake of ecology we
>>might wish to use the Base64 notation or hexadecimal instead
>>of decimal.
>
>
>(We don't do this as a central repository, as we know that
>being a central repository, while lucrative, prevents the web from scaling
>and is socially unacceptable.
>
>However, for those doing W3C work, we happily do provide a persistent
>URI of the form http://www.w3.org/YYYY/xxxxxx
>where the YYYY is just a device to help us ensure there is no reuse.)
>
>>The value of the xmlns attribute, i.e., the namespace name,
>>is then a unique integer, obtained from the source from
>>which such blessings flow.  The creator of the namespace can
>>decide if a new version is sufficiently similar to a
>>previous one to warrant a new number.
>
>This is exactly the sort of author control of persistence which
>the owner of an HTTP name has, in fact.
>
>> It then is
>>abundantly clear that a namespace name conveys no
>>information whatsoever.
>
>
>Alas, I will use the number to say something about it -- maybe even in
>a standard - and then, it will have some meaning to anyone who has read the
>standard.
>
>>In fact, there's no need even to restrict the dispensation
>>of unique integers to a single source.  Anyone can get into
>>the business as long as they themselves get a unique integer
>>as their business card, and prefix the integers they
>>dispense with their own ID and some appropriate delimiter.
>>Any cad who sends the same integer to two people will
>>deserve the same fate as that old Monty Python character who
>>distributed fake Hungarian-English lexicons to Hungarian
>>tourists in London.
>>
>>Maybe the integer dispenser already exists.  If it doesn't,
>>it should.  It obviously has many uses.
>
>
>There are, as people have pointed out, many similar schemes
>which have similar properties.
>
>
>
>
>Tim BL
>
>
>
>>Paul Abrahams
>>
>>
>
Received on Saturday, 27 May 2000 10:37:14 UTC