Re: HTML WG last call remark (no way to declare entities?)

I have only recently read the brief discussion on defining general entities
in XML schema which seems to have concluded with a response by Dan Connolly
on May 5 2000.

I was unable to understand the reasons for not supporting definition of
entities in XML schemas, and since this seems to me a regrettable omission I
propose to mention briefly why, and also to comment on Dan Connolly's
response on the matter.

XML Schema Requirements
-----------------------------
It seems to me that the omission of this feature is a substantial violation
of the spirit if not the letter of the requirements for XML Schema.
The requirements state that XML schemas should be "more expressive than XML
DTDs", which on this specific matter they clearly are not.
What the requirements perhaps should have stated more explicitly is that,
for any existing XML document which is well-formed or valid with respect to
some DTD, it should be possible to replace the DTD with an XML schema and
have the document remain well-formed or schema-valid respectively.
Surely it must be intended that people should ONLY have to rewrite their
DTDs and not their XML, and that all existing proper use of XML with DTDs
can migrate to XML with schemas?

With the current proposal, it is not possible to migrate my own XML
applications to schemas without changing the documents (in ways other
than referring to a schema instead of a DTD).
This surely applies also to two of the W3C's own activities.
XHTML documents will not be able to refer to an XML Schema instead of the
existing DTD, and the MathML standard has already been changed because of
this shortfall in XML Schema.

Why Entities are Needed (IMO)
---------------------------------
It has been suggested, and it is proposed in MathML, that entities used as
memorable alternatives to character entities should be replaced by the use
of XML elements.
For those who want to be able to edit XML documents using a text editor and
also want to make use of non-ASCII characters, elements are no substitute
for entities.
This applies to editing documents in XHTML which contain non-ASCII
characters, or to many of my own uses of XML for formal literate scripts.
The use of a general external parsed entity has the desired characteristic
of creating a document which is canonically identical to a document in which
either the relevant character entity or the Unicode character has been used.
The use of an element does not secure this effect, and forces users wanting
to include non-ASCII characters in their XML either to migrate from their
preferred editing environment (to one which supports the particular
characters they want, and there may not be one) or to work with documents
full of numeric character references.
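
To illustrate the point, here is a minimal Python sketch. The entity name
"alpha" and the element names are my own illustrative assumptions, the
entity is declared in an internal subset only to keep the example
self-contained, and re-serialization by the standard library's
xml.etree.ElementTree is used as a rough stand-in for canonical form rather
than a full Canonical XML implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical entity "alpha"; a DTD would normally supply this declaration.
WITH_ENTITY = '<!DOCTYPE doc [<!ENTITY alpha "&#945;">]><doc>x = &alpha;</doc>'
WITH_CHAR_REF = "<doc>x = &#945;</doc>"      # numeric character reference
WITH_UNICODE = "<doc>x = \u03b1</doc>"       # the literal Unicode character
WITH_ELEMENT = "<doc>x = <alpha/></doc>"     # element used as a substitute


def serialized(xml_text):
    """Parse a document and re-serialize its root element as a string."""
    return ET.tostring(ET.fromstring(xml_text), encoding="unicode")


# The entity, the character reference and the literal character all parse
# to the same content, so their serializations coincide.
forms = [serialized(s) for s in (WITH_ENTITY, WITH_CHAR_REF, WITH_UNICODE)]
all_same = len(set(forms)) == 1

# The element-based substitute yields a structurally different document.
element_differs = serialized(WITH_ELEMENT) != forms[0]
```

In this sketch the first three documents are indistinguishable once parsed,
whereas the fourth contains an extra child element where the others contain
a character.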

On the Reasons for Excluding Entities
--------------------------------------
I have been unable to understand the reasons quoted by Dan Connolly in his
response dated May 5 2000.

Here is why (insofar as I understand why I don't understand).

There is an oddity in XML 1.0 that well-formedness of documents using
externally defined parsed entities cannot be checked without reading the
external subset.
This may have been a bad choice in the definition of "well-formed", and
perhaps the definition should have been the one suggested in the response
for the term "nearly-well-formed".
However, this is a matter for the XML standard, and what concerns us here is
XML Schema.
What is completely unclear to me in the response is why this is more of a
problem for schemas than it is for DTDs, or why it is any part of the
business of XML Schema to "fix" the problem.
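
The oddity can be seen with any parser that does not read the external
subset. The following is a minimal sketch using Python's standard
xml.etree.ElementTree, which does not fetch external DTDs; the entity name
"alpha" and the system identifier "doc.dtd" are hypothetical.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Entity declared in the internal subset: the parser can resolve it.
DECLARED_INLINE = '<!DOCTYPE doc [<!ENTITY alpha "&#945;">]><doc>&alpha;</doc>'

# Entity presumed to be declared in an external subset that is never read
# ("doc.dtd" is a made-up name): the reference cannot be resolved.
DECLARED_ELSEWHERE = '<!DOCTYPE doc SYSTEM "doc.dtd"><doc>&alpha;</doc>'

inline_ok = parses(DECLARED_INLINE)
external_ok = parses(DECLARED_ELSEWHERE)
```

Here the first document is accepted and the second is rejected, and nothing
in the second case depends on whether the missing declarations would have
come from a DTD or from a schema.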

After the passage he quoted Dan says:

    "If you can think of a less awkward way to do it, let us know."

but for the life of me I can't comprehend what the quote did that might be
done better (i.e. I can't see what there is that needs to be done).
It just seems to note that well-formedness of a standalone=no document may
require reference to the Schema (as it does to the DTD).
It offers an alternative definition of well-formedness which would remove
this need, but this is a matter for XML surely, not for XML Schema.
It neither explains nor solves any problem which is present in Schema but
not also present in the use of XML with DTDs, so far as I can comprehend.

Conclusion
-----------
I have explained why I think definition of parsed entities should be
supported by XML Schema, and why I have been unable to understand the
reasons which I have seen cited for why they are not.
If I have the wrong end of the stick and there really are good reasons for
not doing it, I would be greatly obliged if someone could come up with a
better account of the reasons.
On the face of it I have difficulty in understanding why there is any aspect
of the content of DTDs which cannot and should not be covered by XML Schema.

Roger Jones

Received on Tuesday, 4 July 2000 07:02:49 UTC