Re: Namespaces, the universe, and everything

Summary:

Tim makes some good points, and addresses my concerns (without fully
allaying them).

I ask a lot of questions to try to find out what's really up.

I at least see some signs that there may be a justification for jumping the
gun on standardizing new stuff.

At 11:54 AM -0500 6/19/97, Tim Bray wrote:

I'd also like to note that if the XML project had been taking a more
radical approach to redefining SGML I would be more positive about the
current debate, but the conservative design of the current language makes
me less willing to take chances, especially at the last minute.

>1. Why namespaces anyhow
>
>Here's the canonical scenario that I think we're trying to make possible.
>People want to stick together a chunk of XML that gives a PICS rating,
>another one that applies some Dewey Decimal classifications, another one
>that explains copyright status, pack them up with a chunk of content
>(HTML for now, but soon XML), and jam the whole shooting match down the
>wire in one blob.  They want to do this ad-hoc.

OK, as I have said, this ammounts to either a loophole in the notion of
validity, or a case where it needs to be extended. One of the reasons that
I have been a bit nasty is that no-one has just admitted this up front. We
are talking about a way to add stuff to instances without worrying about a
DTD.

>For one or more of these chunks, validity may be an issue.  For most of
>them, there will be a requirement to map them to a formal schema, which
>may or not be a DTD.  This compound blob will be well-formed but probably
>nobody cares if it's valid, as a whole, with respect to any particular
>DTD.  The goal is that you can cook up a chunk of meta-content, stick it
>at the front/end of the document, and somehow not worry about GI
>collisions confusing a simple-minded app that is reading the
>transmission.  Probably there will be multiple independent downstream
>apps, most of which care only about (a) the content or (b) one of the
>blocks of meta-content.

I am a bit dubious about this assumption... Especially since we _need_ a
notion of validity or at least ignorability that will let simple minded
applications work. This sounds like a good application of MIME/multipart,
but that's got a lot of unnatractive overhead too, so I'm still listening.

>[corollary: this scenario *presupposes* that most web metacontent is going
> to have XML syntax, one way or another.  Smiles.]

This is good as long as I can count on XML syntax to actually be a
meaningful notion... Maybe meta-data in XML is worth a jump into the
unknown. I'm not on the ERB, so I don't make that decision, but if that's
the argument, let's not deny that we're doing it. And let's hear why having
all meta-data in XML is a wonderful idea.

>2. Web community perceptions
>
>I mentioned in an earlier note that the people who are our customers
>for this facility are up to this point in time very hostile toward the
>idea that we do it all with AFs (er, fixed attributes).  Said mention
>drew down a storm of abuse, and as a result the first draft of this note
>was very nasty.  My best friend walked by the computer, glanced at the
>screen, and said "don't send that out tonight, OK?".  So I've confined
>the nastiness to the following well-marked excursion:
>
><nasty>
> At 03:23 PM 18/06/97 -0500, David G. Durand wrote:
> >I don't think that it is a good idea to throw away what we know to be
> >useful (DTDs) based on so-called "massive resistance" from people who don't
> >know what they are talking about.
>
>Gimme a $#%!# break.  Uh, how many pieces of DTD-driven software were
>downloaded voluntarily and used on a daily basis by 50 million people
>because they like it and they get useful work done without training?
>In fact, I <gasp> happen to think that some of these Web guys actually
>do, on a regular basis, know what they are talking about, when they are
>on the subject of the kind of software that people like to use and
>developers like to interface to.   Fortunately for the AF partisans in
>this debate, people like Jean and Jon and myself are insulating
>the web engineers from the above kind of militant ignorance; which means
>that 8879 conformance, for the moment, still has a fighting chance.
>Sheesh.
></nasty>

Well, as I said, the content of one of my objections seems to be true -- we
want an escape hatch from DTDs, and we _are_ throwing out, or at least
radically extending existing SGML practice. Having one's points
acknowledged (if not acceded to) goes a way to reducing overall spleen
level.

Given the performances that I've seen by some web engineers on this list,
and on the net generally, you must admit that I am not without a point. I
obviously did not mean, and don't think, that all the people associated
with the web are idiots, but there's clearly a gap in cultures, and a lot
less esperience with markup systems and the good and bad points of DTDs on
the web side.

HTML has not been an unalloyed success, though success it certainly is.

>OK, let's try again.  I and Jean (and I'm sure Jon at Sun and Dave at
>HP etc etc etc) are  having a lot of trouble with really smart and really
>senior people who at the moment look at us like we're nuts when we try to
>convince them that Architectural Forms are a viable way to go, because
>they exist and are standardized, etc... especially because we *really*
>don't want them to go and look at the relevant standard, if they do we're
>toast.  These people see what looks like a self-evident self-explanatory
>no-brainer syntax

After living with the HyTime document I agree that it should be kept under
wraps. I have only advocated the use of #FIXED attributes now because I
worry that we will standardize something that will not work in the long run.

I'm not opposed to the idea of namespaces, but the last minute rush is very
worrisome, and the reasons behind that rush are _not_ being clearly
presented to this group -- nor are the requirements and technical tradeoffs
being presented.

Given the choice between something that is workable but clumsy and invokes
no new mechanisms and a pig in a poke, I then to go for clumsy...

>Things they don't want to do include:
> 1) loading up every element with form attributes, or
> 2) figuring out, as they append successive chunks of XML, what things
>    have to be surgically implanted in the internal subset to avoid
>    problem #1, and

This doesn't sound like it's that hard, actually, but I can see the point
of view clearly enough.

> 3) ensuring each little lightweight application includes a DTD syntax
>    parser [yes, I know it's not that hard, I know first-hand, Lark
>    processes <!attlists from the internal subset precisely to support
>    constructions like the above, but they DON'T WANT TO and are going
>    to have to be given good solid reasons why they should].

In fact, the parsers make the case for element syntax much better.

>What is the correct answer?  Saying "we know what we're talking about
>and you don't" isn't good enough.  At the moment, I have not heard a
>good enough argument to convince me that AFs are better for the
>purpose described above than the colon syntax.

That was not the argument that I'm making. I'm saying that if AFs are a
solution _at all_ then we should be careful about standardizing a new, only
partially understoo mechanism in a  big hurry. If we don't get the
semantics of the colon syntax right, then we've tied a long-term millstone
around XML's neck. I'm saying that we're talking about standardizing
something that may be _much_ less good that it should be because we don't
understand it well at the moment.

The stakes may be worth it, but I haven't seen a refutation of the adequacy
point. That's why I'm taking a conservative line here. I can believe that
for political reasons it is imperative that we choose a new mechanism
because it's impossible to sell the one that we have -- but if that's the
case, I'd rather see it up front.

Maybe all meta-data in XML is so important to the future of XML that a
chance taken is worth it. I'm partly convinced that it might be, but only
partly at the moment.

>3. Validation, DTD's and schemata
>
>The discussion has suffered a bit from the assumption that when we
>say namespace or schema or whatever, this necessarily means we are
>tied to some DTD and we are worried about content models and the like.
>A lot of the schemata that people want to write are concerned with
>things that DTD's don't care about: data typing, domains and ranges
>of properties, inheritance of one kind or another.  Yes DTDs are
>schemata.  But the traditional SGML mind-set, that metadata means
>markup declarations, and that markup declarations means a DTD, and
>that the DTD must necessarily constrain the whole document from end
>to end, just doesn't apply any more.

I'm also worried about the inverse -- do namespaces force my DTD-loving
application to deal with arbitrary foreign-namespace gunk that I can't
anticipate? If DSIGs don't want to impose DTD stuff, that's fine by me, but
if I want to fix a DTD, will namespaces force me into writing code to deal
with "unexpected markup". This is a big step back from the reasons I
usually want to write a DTD.

>The concept of "validity" in XML isn't going to go away, but it's going
>to be subsumed into the much larger concept of "piecewise conformant to
>the rules for the schemata that apply to various document components".
>Lots of applications that are very stringent about what their data can
>contain just don't care about, for example, content models.  Nobody is
>trying to *discard* DTD's; we need more validation, not less.

Well, if we change the notion of validity, we are sure making a big change
in the infrastructure at the last minute. I'm more concerned that if DTDs
become optional, replaced by informal tag-salad specifications for diverse
namespaces that we will be worse off than with HTML, because now it won't
even be easy to format the sloppy tagging that will come in.

I just want to know how I get a DTD that describes my document when the
server jams a DSIG in there... Or be told straight to my face why not being
able to get that is such a good idea.

>It seems pretty clear that it's probably not realistic at this
>point to extend the 8879 syntax in 8 different directions at once
>to meet the needs of all the meta-content group requirements

I certainly agree -- though we may prefer that once we understand the
requirements, and their implications for SGML.

> - so
>for now we're preaching that they should at least use XML, not
>S-expressions or tab-delimited files or whatever other proprietary
>weirdness they may have been considering.

I can see this as a fine strategy. Why can't the use the "." character and
a suitable naming convention to uniqify tags for use with XML 1.0 and leave
extending the language for later. This is essentially the : proposal, but
avoids making the claim that we're adding a feature, and thus perhaps tying
our hands once we know what we really want.

>Perhaps in a future revision process DTDs will transmogrify, and
>son-of-8879 will have a much more ambitious world-view as to what
>constitutes a markup declaration.  In the meantime, these guys are
>convinced that they need namespaces and they need them well-defined
>by Q4 '97, and we shouldn't tell them that they can't have them.

I can believe that conviction, but by itself it's just an assertion. I feel
that I'm being asked to comment on something where the requirements are not
in the open. And that frustration may be making me a bit nasty...

Are the requirements (singly or in combination):

  1. attachment of abstract semantics
  2. unquification of names
  3. attachment of unique names to short names in an instance
  4. parsability by a one-pass parser
  5. generatability on the fly with no discontinuous dependencies that
would require buffering attlist declarations and the instance in progress
while you do generation.
  6. Conformance to notational prejudices of the engineers involved.
  7. ??? ...

And are the payoffs/risks:
  1. XML if not used for meta-data will not be included in commercial
web-browser software (eg MS, NS).
  2. XML if not used for meta-data will be in browser software, but not
anyplace else.
  3. XML, if lacking namespaces, will be rejected by the W3C as failing to
meet their needs.
  4. ??? ...

I think there is a potential case that XML for meta-data will enhance its
acceptance so that it's worth the risk of adding semantic or syntactic
conventions. But I would like to see the case.

  -- David

_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________

Received on Thursday, 19 June 1997 15:04:17 UTC