Re: Irony heaped on irony

[I missed this when it was first sent...]

David Megginson wrote in his message of Thu, 18 May 2000 16:44:04 -0400
(EDT):
> Dan Connolly writes:
> 
>  > >  The schema for schemas (and others) should reference an XML
>  > > schema for the xml: Namespace using the xsi:schemaLocation
>  > > attribute, as in
>  > >
>  > >   xsi:schemaLocation="http://www.w3.org/XML/1998/namespace
>  > >                       http://www.w3.org/XML/Schemas/xmlschema-20000518.xsd"
>  >
>  > Why? Why use schemaLocation when there's no need to?
> 
> I'm too tired to rehash this whole debate for the third (or fourth?)
> time -- we've fought it too many times already, for nearly two years,

This name/address/identifier discussion goes back longer than that...
I joined it in 1991. But it's central to the architecture that
some of us in W3C have been trying to set up since at least 1995.

> and in general, the W3C WGs have done the right thing in the end (as
> the Schema WG did with xsi:schemaLocation and as the XHTML WG did with
> the single Namespace URI)

Or the wrong thing, depending on your perspective.

We have compromised on this issue in order to get a few specs
out the door, thinking that we can do it better next time. But
it's clear that it's getting worse, not better. So we're trying
to see if we can actually address the issue to the satisfaction
of all concerned at this point, rather than working around it any more.

> -- so I'll summarize my position, then drop
> out of the debate and let those who are less jaded continue it.

I read your summary, but I don't find it compelling, and I'd
like to discuss it further. I can certainly understand your reluctance
to commit to participating in all the discussion, but
if we come to any conclusion, I hope that you'll be involved.

> Many people (not Dan, or Tim B-L, but many of the rest of us) consider
> XML namespaces to be conceptually equivalent to C++ namespaces or Java
> or Perl package names -- that is, they are unique identifiers that
> serve exactly two roles:
> 
>   1. group elements and attributes with different local names into a
>      single collection about which general statements may usefully be
>      made (i.e. "if a processor finds an unrecognized attribute in the
>      X Namespace, it should ignore it"); and
> 
>   2. disambiguate elements and attributes with the same local names.
> 
> That's it.  Period.

How ironic! Java and Perl package names are both used to look
up source code.


> (Now, it happens that both of these functions provide a useful service
> to schemas: (1) allows schemas to make general rules about a related
> collection of element and attribute names, and (2) allows schemas to
> be applied to documents with components defined by multiple
> authorities.  No one, I think, disagrees with either of these
> benefits.)
> 
> In Perl or Java, package names do not change with each new release:
> Java2 still uses the java.util package name just like Java 1.0 and
> Java 1.1 did, even though the package's contents have changed
> considerably.

There are advantages and disadvantages to this... Java software
is often labelled with out-of-band instructions on resolving
Java package names, a la "you need at least JDK 1.2 to compile
this package" and such. Stable package names make it more
convenient to install version 1.1 in place of version 1.0 on
some machine and continue to use the code that was compiled on
that machine against 1.0, but they make it less convenient to
coordinate releases on a global scale.

COM takes the other approach... they provide compile-time
friendly aliases for UUIDs, but released code references
the globally scoped UUIDs. And the COM design recognizes
that code built against 1.1 libraries (using, e.g.
function OpenEx from interface Syslib) might in fact
be installed on a machine with 1.0 libraries (where interface
Syslib only has Open, not OpenEx). COM provides different
global names for Syslib version 1.0 and Syslib version 1.1
so that this code won't crash by trying to call a
non-existent OpenEx function in the 1.0 library.
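
A rough sketch of that COM-style policy, in Python rather than
real COM. The identifiers, the example.org URLs, and the class
and function names are all made up for illustration; only Syslib,
Open and OpenEx come from the example above.

```python
import uuid

# Hypothetical identifiers. Real COM interface IDs are assigned GUIDs;
# deriving them from example.org URLs is only for illustration.
SYSLIB_1_0 = uuid.uuid5(uuid.NAMESPACE_URL, "http://example.org/Syslib/1.0")
SYSLIB_1_1 = uuid.uuid5(uuid.NAMESPACE_URL, "http://example.org/Syslib/1.1")


class SyslibV10:
    """Version 1.0 of the interface: only Open."""

    def open(self, name):
        return f"open:{name}"


class SyslibV11(SyslibV10):
    """Version 1.1 adds OpenEx."""

    def open_ex(self, name, flags):
        return f"open_ex:{name}:{flags}"


# What each machine has installed, keyed by the global, versioned name.
old_machine = {SYSLIB_1_0: SyslibV10()}
new_machine = {SYSLIB_1_0: SyslibV11(), SYSLIB_1_1: SyslibV11()}


def client_built_against_1_1(installed):
    """Ask for exactly the interface version the code was built against."""
    lib = installed.get(SYSLIB_1_1)
    if lib is None:
        # The mismatch shows up at lookup time, not as a failed call.
        return "Syslib 1.1 not available"
    return lib.open_ex("report.txt", 0)
```

Because the two versions have different global names, code built
against 1.1 finds out at lookup time that the old machine cannot
satisfy it, instead of failing in the middle of a call.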

Lest you should argue that this doesn't apply to markup
languages like XHTML, consider the case of a document
that depends on the XHTML version 6.3 "dwim" element...
the author checks the spec in 2003, and sure enough, dwim
is in the XHTML namespace. Then he ships his document
to a system WizDoc that claimed, way back in 2001, to
support the XHTML namespace. But WizDoc doesn't support
dwim. So the author has to label his document a la
"requires support for version 6.3 of the XHTML namespace"
using an ad-hoc labelling mechanism that works only
with a human in the loop.

Perhaps we expect this to be rare in the future usage
of XHTML, and we accept the cost in order to get
the benefit of being able to use the same XSLT
script across the next few revisions of XHTML.

But that doesn't mean Java package name style versioning
is right and COM interface style versioning is wrong.
It just means that each is more cost-effective than
the other in some cases.

RDF Schemas use the COM-style naming/versioning policy.
XHTML and XSLT use Java-style naming/versioning policy.


>  Likewise, Namespace URIs should not change with each
> new revision: if or when XML 3.0 comes out, the xml: prefix should
> still be mapped to the http://www.w3.org/XML/1998/namespace URI, so
> that it's easy for software to recognize it.

Maybe... or maybe it'll be less costly to use a different
namespace URI for new versions, so that software that
uses the old/present (1998) namespace URI doesn't get
any surprises.

In the particular case of the xml: namespace, I think it's
more likely that you're right... it'll be easier to
keep it bound to the same namespace URI,
and folks will just have to stay tuned to that URI to see
how the namespace changes over time.

> While Namespace URIs should be stable, schemas obviously need to
> evolve.  It would be very dangerous silently to update the schema at
> http://www.w3.org/XML/1998/namespace if XML 2.0 adds another xml:
> attribute, since it would change the meaning of every schema that
> referenced that one; on the other hand, it would be disastrous for XML
> 2.0 processors to use a *different* Namespace URI for the xml: prefix,
> since the millions of lines of code that had a hard-coded dependency
> on the old one would break.

Hang on... you can't have your cake and eat it too... either (a) the
http://www.w3.org/XML/1998/namespace namespace may change over time,
in which case having the meaning of every schema that references
http://www.w3.org/XML/1998/namespace change when
http://www.w3.org/XML/1998/namespace is updated is exactly what
is intended, or (b) the http://www.w3.org/XML/1998/namespace namespace
does *not* change over time, and we'll need a new URI when
we define a new xml:blort attribute in XML 2.0, and XML 2.0
processors will have to recognize the new URI as well as the
old one.
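
To make option (b) concrete, here is a small Python sketch of what
"recognize the new URI as well as the old one" might look like in
a processor. The second URI is invented (no such namespace
exists), and I am assuming the new namespace would carry over the
old attributes alongside xml:blort.

```python
XML_NS_1998 = "http://www.w3.org/XML/1998/namespace"
# Hypothetical URI for a frozen-namespace policy; not a real namespace.
XML_NS_NEW = "http://example.org/hypothetical/xml-2.0-namespace"

# Under option (b), the set of names behind each URI never changes.
KNOWN_ATTRIBUTES = {
    XML_NS_1998: frozenset({"lang", "space"}),
    # Assumption: the new namespace repeats the old names and adds blort.
    XML_NS_NEW: frozenset({"lang", "space", "blort"}),
}


def is_known(namespace_uri, local_name):
    """True if the attribute is defined in the namespace as frozen at that URI."""
    return local_name in KNOWN_ATTRIBUTES.get(namespace_uri, frozenset())
```

Under option (a) there would be a single entry in the table, and
its contents would change whenever the namespace is updated.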


> The solution to this problem is, or should be obvious.  We have two
> fundamentally different kinds of things -- the Namespace (or package
> name, if you prefer), which should change rarely, if at all, and the
> schema, which may change frequently, so each should be referenced with
> a different mechanism.

I agree that there's an obvious tension between wanting to exploit
references to an old namespace name on the one hand and wanting to be
able to update the description of the namespace at will on the other
hand, but
that's the nature of distributed information systems. Sure, we provide
schemaLocation for times when it's not feasible to use the same URI
both for looking up a schema and matching other references to the
namespace, but I don't see why this should preclude using the
same URI for both when it is feasible.
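
One way to picture the two mechanisms coexisting is the Python
sketch below. The function names are made up, and the fallback
rule is only my illustration of the position above, not something
the Schema spec requires: take the xsi:schemaLocation hint when
there is one, and otherwise dereference the namespace URI itself.

```python
def parse_schema_location(value):
    """Split an xsi:schemaLocation value into {namespace URI: schema URI}.

    The attribute value is a whitespace-separated list of pairs.
    """
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation must contain namespace/location pairs")
    return dict(zip(tokens[0::2], tokens[1::2]))


def schema_uri_for(namespace_uri, schema_location=""):
    """Pick where to look for a schema describing namespace_uri.

    Assumed policy: prefer an explicit hint; with no hint, use the
    namespace URI itself as the place to look.
    """
    hints = parse_schema_location(schema_location)
    return hints.get(namespace_uri, namespace_uri)
```

With the attribute value quoted at the top of this message, the
lookup for the xml: namespace goes to the dated .xsd file; with no
attribute at all, it goes to the namespace URI.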

I disagree that using two URIs is the obvious solution in all,
or even the majority of cases.

> Using (or even allowing) the same mechanism for referring to both will
> simply encourage confusion and the worse practice (I know that RDF
> already did so, but since virtually no one is using RDF schemas in
> real-world apps, that may not be much of a problem).

Virtually no one is using schemas to mix vocabularies at all. Of
those that are mixing vocabularies, RDF Schemas is one of the
most popular technologies, from what I can see
(cf http://www.w3.org/RDF/#projects).


> XML schemas should allow *only* the xsi:schemaLocation attribute and
> no other mechanism,

I accept this as your opinion, but it's not a conclusion that
I'm compelled to accept on the basis of the argument you've
presented.

> and the WG should set the example by using that
> attribute itself. If somebody insists on sticking a schema at the end
> of a Namespace URI, it's hard to stop them, but at least we can point
> at the practice as very sloppy engineering.

I disagree. I think that in the vast majority of cases, it
will be simpler for all concerned, not to mention immensely
more powerful, to use the same URI for
a namespace and a schema that describes/defines it.




> Done.
> 
> All the best,
> 
> David

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/

Received on Sunday, 21 May 2000 00:49:21 UTC