W3C home > Mailing lists > Public > w3c-sgml-wg@w3.org > November 1996

Comments: clause 4+

From: Christopher R. Maden <crm@ebt.com>
Date: Wed, 13 Nov 1996 04:15:32 GMT
Message-Id: <199611130415.EAA05542@dot.EBT.COM>
To: w3c-sgml-wg@w3.org
Comments updated for 0.02 of 10 November, PostScript from SunSite.

Clause 4.1, productions [64]-[66]

There's a mysterious 'PEReference' floating in the middle of the
production group.  Is Reference supposed to include '| PEReference'?
It doesn't appear to be referenced anywhere else.

Clause 4.1, No Recursion WFC:
   A text entity may not contain a reference to itself.

How many levels must an XML processor check for circular references to
be compliant with this specification?  One?  It should be explicit.

Clause 4.2, production [67]:
   [67] EntityDecl := '<!ENTITY' S Name EntityDef S? '>'

This doesn't allow space between the name and the definition.  Try:
[67] EntityDecl := '<!ENTITY' S Name S EntityDef S? '>'

Clause 4.2.2, production [69]:
   [69] ExternalDef := ExternalID NDataDecl?

Same problem; try
[69] ExternalDef := ExternalID (S NDataDecl)?

Clause 4.2.2, production [71]:
   [71] NDataDecl := 'NDATA' S Name S

The second S is redundant and not required.  Strike it.

Clause 4.2.3, first paragraph:
   It is recognized that for for some advanced work, particularly with
   Asian languages,...

a) There is a duplicate "for".

b) I don't think Asian-language speakers consider it advanced.  At a
minimum, replace "advanced" with "non-European".  (And call me a
P.C. Brown grad.)

Clause 4.2.3, production [74]:
   [74] Encoding := ...

Revised from 0.01.  Are the members of the enumerated list the only
encodings permitted in XML 1.0, at all?  What about other
IANA-registered encodings, if the system can handle them?  What about
X-prefixed ones that are spoken by the system?  EBCDIC isn't even
mentioned here.  It's not at all dead (is the Brown STG still using
BrownVM?).  Big5 (Chinese) and KOR-8 (Russian) are in use on the Web
for HTML now.

Clause 4.2.3, last paragraph:
   If an XML processor encounters an entity with an encoding that it
   is unable to process, it must inform the application of this fact
   and allow the application to request either that the entity should
   be treated as an binary entity, or that processing should cease.

a) s/an binary entity/a binary entity/
b) I believe the application should also be permitted to cancel
   processing just of that entity, and carry on.  If that is what is
   intended by the second option above, it needs to be better worded.

Clause 4.3, list item 1:
   ... for a character reference, the "&..." string.)

Which string?  That should be more explicit: 'the "&...;" string.'

Clause 4.4, production [75] and discussion:

I feel that *requiring* ExternalID for NotationDecl is unnecessary.
What is the system identifier for processing a text file, in a cross-
platform way?  I think this hurts XML's network friendliness.  I would
prefer to see MIME types recommended as system identifiers, as the
HyTime Technical Corrigendum allows in FSIs.

Appendix A, feature list:
    3. The "&" connector in content models

Revised from 0.01.  Make that 'The "and" connector (&)'.

Appendix B: References.

8879 isn't ISO/IEC, just ISO.  I don't know about 10646 or 10744.

Appendix D:

There appears to have been a one-to-one mapping made from Symbol font
characters to entity names.  There is actually a one-to-many for some.
The ones I caught are the Greek letters.  I'm at home, so please
pardon if I get the entity sets backwards, but:

"Gamma", "Delta", etc., are from iso-grk1, and refer to the
mathematical symbols capital gamma, capital delta, etc.  "Ggr", "Dgr",
etc., are from iso-grk3, and refer to the Greek capital letter G, the
Greek capital letter D, etc.  Both should be accepted by XML; whether
they should reference the same Unicode character, I'm not sure.  (Does
anyone know if Unicode distinguishs the two type of characters?)  In
any case, they would end up as the same Symbol glyph, most likely.  At
a minimum, though, XML should pick one or the other, if not allowing
both; the current chart mizes and matches (Agr, Bgr, Gamma, Delta,
Egr,...)  I think I see a pattern - iso-grk1 doesn't cover the whole
alphabet, and iso-grk3 was used to fill in the holes.  Never mind that
- both entity sets should be fully supported.

Extra comment #1:

Carried from 0.01.  XML desparately needs versioning information.
Otherwise, XML 2.0 will struggle painfully with how to make itself not
cripple XML 1.0 systems.  At a minimum, XML 1.0 can have an absence of
versioning information if it knows how to recognize a newer version.
I would prefer that an explicit mechanism be provided now, but with a
default of 1.0 in its absence.  A required element has been rejected;
perhaps a PI, then: <?XML Version 1.0?>.

Extra comment #2:

Has anyone registered text/xml and application/xml with IANA?  It
would be cool if it were done by next Wednesday.

<!ENTITY crism PUBLIC "-//EBT//NONSGML Christopher R. Maden//EN" SYSTEM
"<URL>http://www.ebt.com <TEL>+1.401.421.9550 <FAX>+1.401.521.2030
<USMAIL>One Richmond Square, Providence, RI 02906 USA" NDATA SGML.Geek>
Received on Tuesday, 12 November 1996 23:16:36 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 6 January 2015 21:25:20 UTC