
Re: DTDs and XML conformance

From: <lee@sq.com>
Date: Tue, 3 Jun 97 23:40:15 EDT
Message-Id: <9706040340.AA00225@sqrex.sq.com>
To: w3c-sgml-wg@w3.org
Eve's list is very helpful.

I've made some detailed comments, because I think it's useful enough to
be added to the XML FAQ.

Lee




>   1. The instance has to be well-formed: special empty-element and PI
>      syntax, normalization, etc.
Yes.  Perhaps this is actually where OMITTAG & SHORTTAG fit.

>   2. Either element type declarations can't use CDATA or RCDATA declared
>      content, or the elements' content in the instance must be transformed
>      to escape the appropriate characters that look like markup
Yes.
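
As a rough illustration of the transformation in [2], here is a minimal
Python sketch. It assumes the text of a former CDATA/RCDATA element has
already been extracted as a plain string; the function name
escape_declared_content is hypothetical and is not part of any tool
mentioned in this thread.

```python
from xml.sax.saxutils import escape


def escape_declared_content(text):
    """Escape characters that would otherwise be read as markup.

    Hypothetical helper: replaces & with &amp;, < with &lt; and
    > with &gt;, which is what the standard-library escape() does.
    """
    return escape(text)


if __name__ == "__main__":
    # Content that was safe inside a CDATA element in SGML.
    print(escape_declared_content("if (a < b && c > d)"))
```

The ampersand is replaced first, so entity references produced by the
other replacements are not escaped a second time.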

>   3. The DTD should avoid attribute value defaulting if you want to
>      minimize the need to put attribute list declarations in the internal
>      subset (use #IMPLIED plus a style sheet instead); if default values
>      are supplied, they must be quoted
Yes and yes, although the quoting is already covered by [1].

>   4. Attribute declared values can't be NAME[S], NUMBER[S], or NUTOKEN[S]
>      (probably use NMTOKEN[S] instead, but also possibly CDATA)
Yes.

>   5. Attribute default values can't use #CURRENT (no good substitute)
Well, arguably there are few good reasons to use #CURRENT in the first
place :-)  This would be a good place to mention RANK, I think, if anyone
actually uses it.

>   6. Attribute default values can't use #CONREF (use #IMPLIED plus a style
>      sheet instead)
No.  Interestingly, it would be very cheap to implement CONREF in XML,
because the tag would look different:
	<gi x="y"/>
vs.
	<gi>stuff</gi>
but you would have to have a way of knowing which was the conref attribute.
I don't think it's worth adding the feature.

>   7. Either SDATA entities can't be referenced, or SDATA entity references
>      must be replaced with decimal or hexadecimal character references (or
>      whatever substitute is appropriate) in the instance
No.  This is the first place where we disagree.  You can retain the
entity references in the document.  Only the entity definitions need to
be changed.  For example, change
    <!ENTITY eacute SDATA "[eacute]">
to
    <!ENTITY eacute "&#233;">
in the DTD.
If you need font changes, you can use
    <!ENTITY BoldRedRegisteredSymbol "<B><RED>&#174;</RED></B>">
instead.  You'd need to define B and RED to pass validation...
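
The declaration rewrite described for [7] could be automated along these
lines. This is only a sketch under stated assumptions: the CODE_POINTS
table and the function name rewrite_sdata are hypothetical, the table
would have to be filled in for whatever entity sets the DTD really uses,
and the regular expression handles only simple one-line declarations.

```python
import re

# Hypothetical lookup table: entity name -> Unicode code point.
# e-acute is U+00E9 (233); the registered sign is U+00AE (174).
CODE_POINTS = {"eacute": 0xE9, "reg": 0xAE}

# Matches simple declarations such as <!ENTITY eacute SDATA "[eacute]">.
SDATA_DECL = re.compile(
    r'<!ENTITY\s+([\w.-]+)\s+SDATA\s+(?:"[^"]*"|\'[^\']*\')\s*>',
    re.IGNORECASE,
)


def rewrite_sdata(dtd_text, code_points=CODE_POINTS):
    """Turn SDATA entity declarations into character-reference entities."""

    def replace(match):
        name = match.group(1)
        if name not in code_points:
            # Unknown entity: leave the declaration for manual review.
            return match.group(0)
        return '<!ENTITY %s "&#%d;">' % (name, code_points[name])

    return SDATA_DECL.sub(replace, dtd_text)


if __name__ == "__main__":
    print(rewrite_sdata('<!ENTITY eacute SDATA "[eacute]">'))
```

Only the DTD is touched; references such as &eacute; in the instance
stay exactly as they are.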

>   8. Either CDATA entities can't be referenced, or the entity type must be
>      changed and the contents transformed to escape characters that look
>      like markup
Yes.

>   9. Bracketed entities can't be referenced (in general, these make
>      ill-formed entities because they contain only half of a markup
>      construct)
I am not sure what a bracketed entity is.  It's not in the glossary to
ISO 8879, and I don't have the handbook here at home.

>  10. SUBDOC entities can't be referenced (it might take quite a bit of work
>      to extricate and transform any uses of SUBDOC entities)
This is not a problem since there is no way in XML to _declare_ a
SUBDOC entity in the first place :-)

>  11. Entity declarations must not have data attributes specified
Yes.  Actually, SoftQuad Explorer -- and probably Panorama -- uses
optional entity data attributes for the height and width of images,
which is a more SGML-like way of doing <IMG.... height=... width=....>.
But it's not worth a language feature in XML.

>  12. External entity declarations must conform to PUBLIC/SYSTEM syntax
>      requirements
Yes.  Is it necessary to state this??

>  13. DTD marked sections must be either transformed to remove any spaces
>      around status keywords, or resolved; the TEMP keyword can't be used
I think not allowing spaces there is a bug in the spec, right?
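
If the restriction does stand, the whitespace part of the transformation
in [13] is mechanical. Here is a minimal Python sketch; the function name
tighten_marked_sections is hypothetical, and it assumes the status
keyword (or a parameter entity reference standing in for it) is a single
token with no embedded whitespace.

```python
import re

# "<![", optional whitespace, one keyword token, optional whitespace, "[".
MARKED_SECTION_OPEN = re.compile(r'<!\[\s*([^\s\[\]]+)\s*\[')


def tighten_marked_sections(dtd_text):
    """Remove whitespace around marked-section status keywords."""
    return MARKED_SECTION_OPEN.sub(r'<![\1[', dtd_text)


if __name__ == "__main__":
    print(tighten_marked_sections('<![ INCLUDE [ <!ELEMENT a (#PCDATA)> ]]>'))
```

This does nothing about the TEMP keyword, which would still have to be
resolved by hand.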

>  14. Parameter entities either conform to whatever ends up being allowed,
>      or are transformed or resolved
Yes.

>  15. DTD comments within markup declarations are either removed or are
>      transformed to be moved outside and turned into full comment
>      declarations
Yes.  The comment syntax seems to have reverted to <!--......-->, although
I'd still prefer to use <!--*....*--> and have the extra robustness later
this year.

> ---------------------------------------------------------------------------
[...]
> The following list assumes that it's desirable to use the same DTD for SGML
> and XML applications, without transformation.

I'll stop here.  I think that this list could, with some minor edits,
usefully be added to the XML FAQ.

Lee
Received on Tuesday, 3 June 1997 23:40:18 EDT
