Re: Element content the real issue?...

At 11:20 PM 9/30/96, Charles F. Goldfarb wrote:
>On Mon, 30 Sep 1996 12:10:42 -0400, "David G. Durand" <dgd@cs.bu.edu> (David G.
>Durand) wrote:
>>   We basically cannot afford to process element and non-element content
>>differently with regard to whitespace or anything else.
>>   ==> So we can't allow any ignored whitespace anywhere without resorting
>>to quoting, because of the non-DTD parsing requirement.
>Exactly. So having recognized the unavoidable fact, why do you keep trying to
>avoid it? I can understand trying to ameliorate its impact, but let us first
>face up to the truth: this is the only way we can go.

Because the unavoidable fact is not that we need quoting, but rather that
  Ignored whitespace _and_ no DTD _and_ compatibility with SGML implies
quoted data.

   That means that we can avoid quoted data by eliminating ignorable
whitespace. I'd rather give up ignored whitespace than start quoting. And
we _do_ have that choice, since it makes XML more restrictive than SGML.
I'm still waiting for a reason why we _must_ ignore newlines (but not
trailing spaces) after a tag start. It is sometimes convenient, but is not

>>   My application of his theories says that we should change as little as
>>possible from HTML (the market leader), while adding the minimum we can
>>manage to get the most useful new functionality.
>This conclusion contradicts the inescapable fact.

    Only if you make keeping the ability to add arbitrary whitespace around
tags a hard requirement. I say we should give up a cosmetic convenience
that already causes trouble in HTML (e.g. in <table>, whitespace in <td>
can screw you up). Otherwise, we can shun quoting by outlawing whitespace
ignoration everywhere. A \ before newline convention is the _most_
intrusive solution that I can see working, and I am skeptical about
acceptance of that.
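
To make the DTD-less case concrete, here is a minimal sketch. It assumes
Python's standard xml.dom.minidom as a stand-in for a parser that has no DTD
to consult. The element names and the helper child_summary are invented for
illustration and are not part of any proposal in this thread. With no
content model available, the parser cannot tell element content from mixed
content, so a newline after a tag can only be handed to the application as
data:

```python
from xml.dom import minidom

# Two documents that differ only in "cosmetic" newlines around tags.
# The element names are hypothetical.
WITH_NEWLINES = "<row>\n<cell>a</cell>\n<cell>b</cell>\n</row>"
WITHOUT_NEWLINES = "<row><cell>a</cell><cell>b</cell></row>"


def child_summary(xml_text):
    """List the children of the root element: tag names for elements,
    the raw character data for text nodes."""
    root = minidom.parseString(xml_text).documentElement
    return [
        child.data if child.nodeType == child.TEXT_NODE else child.tagName
        for child in root.childNodes
    ]


if __name__ == "__main__":
    # No DTD is present, so the newlines show up as text children.
    print(child_summary(WITH_NEWLINES))
    # Without whitespace around the tags, only the elements remain.
    print(child_summary(WITHOUT_NEWLINES))
```

The two inputs give different child lists, which is the sense in which
whitespace around tags cannot be silently ignored without either a DTD or
some explicit rule in the syntax.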

>His earlier quote says we
>should keep things familiar, so I offer this premise:
>1. Most people will create XML with a program that hides the real data format,
>in which case the "niceness" and "familiarity" of the format are non-issues.

This was the plan for HTML. Tim Berners-Lee originally thought that HTML
would be the machine-language of the web, and that "nice" editors would be
the only thing to mess with HTML. He was _wrong_, because it was impossible
to distribute uniform, friendly platform-neutral editing tools to the
entire market (and some people don't want them anyway).

    We should learn from that failure that the _most likely_ authoring tool
for XML will be a text-editor (maybe with some extra assistance grafted
on), as it is for much SGML, and most HTML. Tools are nice, but the
coverage of user needs and multiple platforms is always spotty.

    Even the graphic designers where I work use plain-text HTML editors,
despite a completely fine-arts background. They need the visibility and

>2. People who will create XML with "dumb" editors or mode-assisted (dull?)
>editors also have written programs. To them, quoting data is quite familiar and
>macros can be written to assist with it.

I believe the claims above to be false. Most will _not_ be programmers.
HTML should be our user-model here. Most will _not_ be able to write
macros in a programming editor. Macros and specialized tools will become
available only when and _if_ there is enough acceptance of XML markup that
there is an existing market for those tools.

Even for the putative "technical" audience, I am doubtful of the
convenience of quoting syntax. Quoting is a source of a significant
fraction of all programming language syntax errors. I think it a
particularly poor solution for a non-programming user-base. And that \" is
a constant irritant, as well.

>>I must say that I don't
>>see the point of targeting only the SGML community, because they already
>>have SGML.
>The SGML community needs a lean, mean conforming delivery format, which XML can
>be. It will be confusing enough to the market to have SGML and XML. Having a
>third "version" of SGML is at least one too many.

If we target only the SGML market, there will be no platforms to deliver
to. Standards are a dime a dozen on the Internet, and to break out of our
community we need to bear the rest of the world in mind.

   I am not convinced (as you may have gathered) that SGML-conformance in
every detail is that important. I believe that I have proposed nothing
(yet) that will endanger that.

The third "version" comment seems a non-sequitur to me. No-one has made
such a suggestion.

-- David

RE delenda est.

(granted, I'm no Cicero)

David Durand                  dgd@cs.bu.edu | david@dynamicDiagrams.com
Boston University Computer Science          | Dynamic Diagrams
http://www.cs.bu.edu/students/grads/dgd/    | http://dynamicDiagrams.com/