Re: ERB Decisions of March 26th

At 11:52 PM -0500 3/26/97, lee@sq.com wrote:
>Deborah A. Lapeyre <dlapeyre@mulberrytech.com> wrote:
>> All my documents and those of my clients will also be non-compliant.
>Just how many XML documents do your clients have!??!?
>If you are talking about SGML documents, better make sure that
>* they don't use SDATA entities
>* they don't have any EMPTY elements lacking the <xx/> syntax
>* they don't use minimisation or SHORTTAG
>* the content models don't use &
>* there are no inclusions or exclusions
>* you don't have the minimisation parameters on Element declarations
>* the instances don't rely on whitespace being ignored in element content

For all of these features except SDATA, XML has equivalent representational
mechanisms, so that no _information_ in an SGML document must be removed to
become XML compliant. (DTDs may change, and instances may change, but the
expressive power of the language does not).

For PUBLIC and SDATA, useful (easy to implement, and document) features
have been removed without any corresponding mechanisms in place. There
isn't a workaround for the lack of PUBLIC or SDATA. (I'll leave SDATA out
from here on, as it's off topic.)

So this is something where the logical power of XML is at stake, and not
special pleading for a favorite syntax (such as we acceded to in DTDs).
Note that Debbie has not made the same level of argument over any of these
other features, though their absence will surely inconvenience her and her
clients.

>> Writing small programs to take out the minimization indicators
>> out of DTDs and make empty elements look different in instances
>> is trivial. Coming up with an entirely new mechanism to solve my
>> identification/location problems is NOT.
>Which is why I am pleased PUBLIC isn't in this draft.  We're not ready.
>Since the XML spec is not yet published, it's a little too soon for
>backwards-incompatibility hysteria :-)

Well, this is essential-semantics hysteria, not syntactic-sugar hysteria,
and it is completely justified. I will simply advise people to use PUBLIC
IDs anyway, and to use only non-conforming XML tools that support them.

If you need PUBLIC, you _need_ it.

>  If the use of existing SGML
>documents had been a _primary_ goal, we would have stopped and said,
>hey everyone, let's all market SoftQuad Panorama!  But the idea is
>to make a new language, as unhampered as possible by the ugliness of
>the old.

This misses the point.

>If you read Tim's note carefully, PUBLIC may well reappear in a
>later draft.

It has to.

>> In my opinion, Resolution is
>>       NOT
>> part of your charter.  You are not going to tell me
>> what to do with a URL, that's my problem as an application.

>Luckily that's not what is meant by resolution of PUBLIC identifiers.

No, it means that you are going to tell me what to do with a PUBLIC
identifier... This is a _BAD THING_.

>> There have been legions of PRO arguments.
>Actually I am still not convinced.  People have said lofty abstract
>things about keeping name and address separate, others have said that
>a query and an address must _not_ be separate (in a separate thread),
>some have given examples of how their existing non-web non-internet
>SGML systems need PUBLIC (but no-one is forcing them to be replaced)
>and I have not yet seen any need for PUBLIC identifiers in _XML_.

The problem that PUBLIC solves is location- and resolution-independent
naming. Just because you don't believe in it doesn't mean that it's not
important. No fixed Resolution strategy solves the problem. Broken names
are an inherent possibility with naming, but that's life...
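
A minimal Python sketch of what location- and resolution-independent
naming means in practice. Everything named here is an assumption made
for illustration: the identifier, the locations, and the helper
functions are hypothetical and do not come from any SGML or XML tool.
The document carries only the name; each site decides how to resolve it.

```python
from typing import Callable, Dict, Iterable, Optional

# A resolver maps a public identifier to a location, or None if it cannot.
Resolver = Callable[[str], Optional[str]]


def catalog_resolver(catalog: Dict[str, str]) -> Resolver:
    """Build a resolver backed by a local name-to-location table."""
    return lambda public_id: catalog.get(public_id)


def resolve(public_id: str, resolvers: Iterable[Resolver]) -> Optional[str]:
    """Try each resolver in order and return the first location found."""
    for resolver in resolvers:
        location = resolver(public_id)
        if location is not None:
            return location
    return None  # a broken name: possible with any naming scheme


# Hypothetical identifier and locations, for illustration only.
LEGS_DTD = "-//Example//DTD Legs//EN"

# Two sites resolve the same name in different ways.
site_a = [catalog_resolver({LEGS_DTD: "file:///usr/local/sgml/legs.dtd"})]
site_b = [
    catalog_resolver({}),  # the first catalog does not know the name
    catalog_resolver({LEGS_DTD: "http://mirror.example.org/dtds/legs.dtd"}),
]
```

Calling `resolve(LEGS_DTD, site_a)` and `resolve(LEGS_DTD, site_b)` gives
different locations for the same unchanged name, and an unknown name
simply resolves to `None`.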

>> The only counter argument that stuck (or did I miss a few?) was "there
>> is no agreed upon mechanism." This is true, but not necessarily useful!
>I.e. "lots of people want this feature, and the fact that it doesn't
>work and we can't all agree on what it is or does or means is jolly
>well not going to stop us!"

No, the fact that it _should not_ work _the same way_ _everywhere_ is going
to stop us from letting it work for _anyone_.

This is not progress.

>> -- a very disappointed Debbie Lapeyre
>Have a stiff brandy and think back to Michael's closing speeches at
>the last two Boston Web conferences... we're getting there... slowly.

I'm not ready to cry into my beer yet, except from the frustration of
repeating the blindingly obvious over and over. The experience of the URN
group does not make me very sanguine about this whole discussion -- they
essentially had to tell the "resolution insisting" folks to "shut up and
let us hang ourselves", as years of argument could not lead people who want
to require a single resolution mechanism to accept the requirements of a
naming scheme.

>I can show you over a gigabyte of SGML documents at SoftQuad that are
>not valid XML documents, and I expect I could find over a Terabyte
>at our customers if I wanted to, but I don't.  All of these documents
>can be converted to use XML.

Not the ones that use PUBLIC, because there is no equivalent mechanism in
XML right now.

>Incidentally, CATALOG can support looking up a system identifier and
>returning a URL reasonably efficiently (well, efficiently enough for
>the one to fifty system identifiers used for the DTD and its included
>fragments in most SGML documents/DTDs that I've seen).
>Personally I would favour getting rid of the special DTD line on DOCTYPE,
>and requiring inclusion.  Then I would like to allow entity expansion
>in SYSTEM Ids, so that I could do
>    <!ENTITY % server "http://www.sq.com">
>    <!ENTITY % theDTD SYSTEM "%server;/sgml-docs/dtds/legs.dtd">
>    %theDTD;
>That's one fewer construct in the language, fewer productions, and
>you get more powerful indirection than today, since you could load
>an external file that would define %server.
>This is more like an SGML approach to CATALOG in some ways.
>Unfortunately, SYSTEM identifiers are minimum literals.
>However, since they were always system specific, you could argue
>that this is not a disaster for backwards compatibility if
>SGML were changed to accommodate this.
>As far as I can see, apart from not supporting the rather odd syntax
>of FPIs, this has every feature that the PUBLICers are asking for,
>except complete backward compatibility with existing HTML/SGML files,
>and except for not using the keyword PUBLIC:
and except for requiring access to a _particular URL_ in order to resolve the
identifier -- and thus eliminating the whole point of PUBLIC identifiers.
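
To make the objection concrete, here is a minimal Python sketch of the
entity expansion proposed in the quoted text. The function name
`expand_system_id` and the regular expression are illustrative
assumptions, not part of any SGML or XML tool, and the expansion itself
is the quoted proposal rather than existing syntax. Whatever value
`server` takes, the result is still one URL that must be reachable.

```python
import re
from typing import Dict


def expand_system_id(system_id: str, entities: Dict[str, str]) -> str:
    """Replace each %name; reference with its declared replacement text."""
    return re.sub(
        r"%([A-Za-z][\w.-]*);",
        lambda match: entities[match.group(1)],  # KeyError if undeclared
        system_id,
    )


# Values taken from the quoted example.
entities = {"server": "http://www.sq.com"}
url = expand_system_id("%server;/sgml-docs/dtds/legs.dtd", entities)
```

Changing `server` moves the document to a different host, but the
identifier still ends up as an address rather than a name.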

Can we have some arguments that address the point?

  -- David

David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________