Re: ERB Decisions of March 26th

Note:
    There's an alternative suggestion to PUBLIC and CATALOG at the end of
    this message that simplifies the XML grammar slightly, whilst
    retaining the ability to do centralised indirection.


Deborah A. Lapeyre <dlapeyre@mulberrytech.com> wrote:

> All my documents and those of my clients will also be non-compliant.

Just how many XML documents do your clients have!??!?

If you are talking about SGML documents, better make sure that
* they don/t use SDATA entities
* they don't have any EMPTY elements lacking the <xx/> syntax
* they don't use minimisation or SHORTTAG
* the content models don't use &
* there are no inclusions or exclusions
* you don't have the minimisation parameters on Element declarations
* the instances don't rely on whitespace being ignored in element content

> Writing small programs to take out the minimization indicators
> out of DTDs and make empty elements look different in instances
> is trivial. Coming up with an entirely new mechanism to solve my
> identification/location problems is NOT.

Which is why I am pleased PUBLIC isn't in this draft.  We're not ready.
Since the XML spec is not yet published, it's a little too soon for
backwards-incompatibility hysteria :-)  If the use of existing SGML
documents had been a _primary_ goal, we would have stopped and said,
hey everyone, let's all market SoftQuad Panorama!  But the idea is
to make a new language, as unhampered as possible by the ugliness of
the old.

If you read Tim's note carefully, PUBLIC may well reapper in a
later draft.

> In my opinion, Resolution is 
>       NOT 
> part of your charter.  You are not going to tell me 
> what to do with a URL, that's my problem as an application.

Luckily that's not what is meant by resolution of PUBLIC identifiers.

> There have been legions of PRO arguments.

Actually I am still not convinced.  People have said lofty abstract
things about keeping name and address separate, others have said that
a query and an address must _not_ be separarate (in a separate thread),
some have given examples of how their existing non-web non-internet
SGML systems need PUBLIC (but no-one is forcing them to be replaced)
and I have not yet seen any need for PUBLIC identifiers in _XML_.

> The only counter argument that stuck (or did I miss a few?) was "there
> is no agreed upon mechanism." This is true, but not necessarily useful!

I.e. "lots of people want this feature, and the fact that it doesn't
woprk and we can't all agree on what it is or does or means is jolly
well not going to stop us!"

> -- a very disappointed Debbie Lapeyre
Have a stiff brandy and think back to Michael's closing speeches at
the last two Boston Web conferences... we're getting there... slowly.
I can show you over a gigabyte of SGML documents at SoftQuad that are
not valid XML documents, and I expect I could find over a Terabyte
at our customers if I wanted to, but I don't.  All of these documents
can be converted to use XML.

Incidentally, CATALOG can support looking up a system identifier and
returning a URL reasonably efficiently (well, efficiently enough for
the one to fifty system identifiers used for the DTD and its included
fragments in most SGML documents/DTDs that I've seen).

Personally I would favour getting rid of the special DTD line on DOCTYPE,
and requiring inclusion.  Then I would like to allow entitiy expansion
in SYSTEM Ids, so tht I could do

<!DOCTYPE ANKLE [
    <!Entity server % "http://www.sq.com">
    <!Entity theDTD STSTEM "%server;/sgml-docs/dtds/legs.dtd">
    %theDTD;
]>

That's one fewer construct in the language, fewer productions, and
you get more powerful indirection than today, since you could load
an external file that would define %server.
This is more like an SGML approach to CATALOG in some ways.

Unfortunately, SYSTEM identifiers are minimum literals.
However, since they were always system specific, you could argue
that this is not a disaster for backwards compatibility if
SGML were changed to accommodate this.

As far as I can see, apart from not supporting the rather odd syntax
of FPIs, this has every feature that the PUBLICers are asking for,
except complete backward compatibility with existing HTML/SGML files,
and except for not using the keyword PUBLIC:

<!DOCTYPE ANKLE [
    <!entity CATALOG % SYSTEM "catalog.xml">
	<!--* solves the question of how to find CATALOG *-->
    %CATALOG;

    <!Entity theDTD STSTEM %lower-leg-dtd;>
	<!--* lower-leg-dtd is a "PUBLIC" identifier defined in CATALOG *-->
    %theDTD;
]>



Lee

Received on Thursday, 27 March 1997 00:26:45 UTC