Re: SD5 - Namespaces from Peter Murray-Rust on 1997-05-20 (w3c-sgml-wg@w3.org from May 1997)

From: Peter Murray-Rust <Peter@ursus.demon.co.uk>
Date: Tue, 20 May 1997 21:44:52 GMT
To: w3c-sgml-wg@w3.org
Message-Id: <6954@ursus.demon.co.uk>
In message <2.2.32.19970520190727.00b311a0@pop> "Steven J. DeRose" writes:
> At 03:58 PM 05/20/97 GMT, Peter Murray-Rust wrote:
> 
> 
> >This allows an author to create their own fully qualified names for GIs,
> >thus avoiding tag collision.  How they do this is outside the scope of XML.
> >Examples could be CML:MOL or org.w3.MathML:PRODUCT.  If the convention is 
> >adopted that the namespace prefix is either the DTD name or a fully qualified
> >version the chances of collision are reduced to the chances of DTD names 
> >colliding.
> 
> I'm not sure about this proposal, but I am sure we should not try to
> re-define the already common term "fully qualified". It already means the

I am sorry - I didn't realise this was an SGML term.  I used it by analogy
with the fully qualified domain names that I was using to resolve ambiguity.

> element type in the context of its *containing* element types; such as
> "BOOK,BDY,CHP,SEC,P".
> That is radically different from the sub-typing mechanism under discussion.

It's not a subtyping mechanism.  It's more like adding someone's postal 
address to their name to resolve difficulties.  If I write

'Peter just said...', that causes confusion.  If I write

'peter@ursus.demon.co.uk just said ...'  that uniquely identifies me out of
all the zillions of people on the planet.  But if people are always identified
by their e-mail address, the text becomes tedious to read.  So *if* the only
'Peter' in this
posting is me, I can define the identity once and there is no confusion.  But
if I include mail from (say) Peter Flynn then I might write:

peter@ursus.demon.co.uk mailed peter@curia.ucc.ie on ...

and it's unambiguous.


> 
> In short, "is-a" is not the same relation as "has-part".
> 
> >
> >It is possible to create one DTD that can be used for both forms (I hope
> >that I don't get shot down in flames...).  If we have:
> 
> Whether this works depends on what the constraints on subclassing are. Is it
> the case that every subtype of a type must permit all child-sequences that
> the type permits? Or

It isn't a subclass so much as an alias mechanism.  If my DTD is written:

<!ELEMENT org.chem.cml:CML (org.chem.cml:MOL | org.chem.cml:VAR)*>

and MathML is written:

<!ELEMENT org.w3.mathml:VAR (#PCDATA)>

then we have no namespace collisions.  If it is agreed that the field before
the ':' is a uniquifying prefix (and I used domain names because they are 
guaranteed to be unique), we can guarantee that we shall not have namespace
collisions.  That's all.
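
To make the convention concrete, here is a minimal sketch in Python. It is
only an illustration: the helper names qualify() and clashes() are invented
for this example, and the GI lists contain just the names that appear in the
two declarations above.

```python
def qualify(prefix, gi):
    """Join a uniquifying prefix and a bare GI with ':'."""
    return f"{prefix}:{gi}"


def clashes(names_a, names_b):
    """Return the names that occur in both vocabularies, sorted."""
    return sorted(set(names_a) & set(names_b))


# Bare GIs taken from the two declarations above (illustrative subset).
cml_gis = ["CML", "MOL", "VAR"]
mathml_gis = ["VAR"]

# Without prefixes, both vocabularies declare a VAR element.
bare_clashes = clashes(cml_gis, mathml_gis)

# With a domain-style prefix on every GI, the two sets are disjoint.
cml_qualified = [qualify("org.chem.cml", gi) for gi in cml_gis]
mathml_qualified = [qualify("org.w3.mathml", gi) for gi in mathml_gis]
qualified_clashes = clashes(cml_qualified, mathml_qualified)

print(bare_clashes)
print(qualified_clashes)
```

The bare lists share VAR, while the prefixed lists share nothing, so a clash
is only possible if two authors pick the same prefix.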

If we could get all DTD authors to adopt this convention, we could forget
about the namespace problem.  

The problem is that if humans have to author this, it puts them off.  Especially
if they've spent years creating a DTD with GIs like 'E' or 'Q'.  So mine was
a way to uniquify every tag ever used in XML without making it painful for
the authors.  We do it with URLs, why not tags?  It also has the advantage
that if every tag is uniquely identified with its domain name, then resolving
the semantics becomes potentially more tractable.  

Realising that humans don't like long names, I suggested a slightly kludgy
(but only slightly kludgy) mechanism to allow a shortened form.
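
To illustrate the idea of a shortened form, here is a small sketch in Python.
It is an assumption-laden illustration, not the mechanism actually proposed:
the alias table, the function name expand() and the default-prefix rule are
all invented for this example.

```python
# Hypothetical table mapping short prefixes to full, domain-style prefixes.
ALIASES = {
    "CML": "org.chem.cml",
    "MathML": "org.w3.mathml",
}


def expand(name, aliases, default_prefix=None):
    """Expand a possibly shortened element name to its full form.

    'CML:MOL'          -> 'org.chem.cml:MOL'  (short prefix looked up)
    'org.chem.cml:MOL' -> unchanged           (already a full prefix)
    'MOL'              -> default_prefix + ':MOL', if a default is given
    """
    prefix, sep, gi = name.rpartition(":")
    if not sep:
        # No ':' at all: the name is a bare GI.
        if default_prefix is None:
            raise ValueError(f"no prefix and no default for {name!r}")
        return f"{default_prefix}:{gi}"
    # Unknown prefixes are assumed to be full prefixes already.
    return f"{aliases.get(prefix, prefix)}:{gi}"


short_form = expand("CML:MOL", ALIASES)
bare_form = expand("MOL", ALIASES, default_prefix="org.chem.cml")
full_form = expand("org.w3.mathml:VAR", ALIASES)
```

All three calls yield a full name, so an author could type the short or bare
form while a processor compares only the expanded ones.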

I assume no-one actually minds if I create a DTD which is unlikely to collide
with anyone else's? :-) 


> is it vice versa? Or must the sets permitted be identical? Or unrelated?
> Architectural forms attempt to address this, but only provide the first case
> (approximately); and cannot be used to unify processing of
> slightly-differing structures, such as deflists with item-pair containers,
> versus without. Do we really want to dive into this?

There isn't any subclassing, nor any AFs or deflists, as far as I know.  Just
unique names.

> 
> If this subclassing mechanism is purely for people to be able to
> interoperate in terms of browsing, why is it better than HTML "CLASS=",
> where you leave the GI alone? That's certainly simpler, right?

I don't understand this - sorry :-).  Does this imply that the DTD is recast
as 
<OBJECT CLASS="MOL">
<OBJECT CLASS="ATOM">
...
to avoid GI clashes?  If so the traditional way that people write DTDs
will need to change and it won't appeal to people.

The underlying point is that XML opens up a wonderful opportunity of
exchanging information objects  and combining them in the same document,
except for the namespace clashes between tags.  (Everything else can be dealt
with).  I had expected that experienced SGML'ers would have had a simple and
elegant solution to this, but I haven't seen it.  

It isn't essential to have the ':', but it's much less elegant without it.

	P.

-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
Received on Tuesday, 20 May 1997 17:17:08 UTC