Comments on the 26 July Working Draft

From: Wayne Steele <xmlmaster@hotmail.com>
Date: Wed, 02 Aug 2000 18:54:04 PDT
To: www-xml-infoset-comments@w3.org
Message-ID: <F6clX9QZwx7y39fm1zF00000dd7@hotmail.com>

I have several comments on this working draft.
These views are mine, and NOT those of my employer.

I am disappointed that Element Declarations and Content Models are not 
preserved in the information set. Already Notations, Entity Declarations, 
Comments and PIs in a DTD are represented, as well as Attribute information 
about attributes that are present.

I don't think Element Declarations and Content Models are too much to ask. 
You could still go ahead and leave them out of the definition of "CORE" 
conformance.

I do agree that Parameter Entities, and INCLUDE/IGNORE sections should be 
omitted, with just the "End DTD Result" presented.

2. [entity declaration information item]
This item is attempting to address too many different things, just because 
the XML 1.0 rec calls many different things "entities".

The [type] property has five different values. This should be re-specified 
as several (five?) different types of information items.

This would eliminate much of the complicated "whys and wherefores" for each 
of the properties.

3. [document information item] : [version]
What is the value of this property if there is no XML Declaration? Empty 
string?

4. [element information item] : [namespace name]
What is a "namespace name"? Is this the URI? Where is this defined?
What (if any) information item includes the prefix used to associate this 
element with a particular URI?

5. [element information item] : [declared namespaces]
"...or provided for in the DTD for this element type" is not clear;
we all know that DTDs are naive about namespaces.
I assume this is referring to an xmlns attribute declared in the DTD with a 
default value. This should be clarified.
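
To make the assumed reading concrete, here is a minimal sketch in Python. It assumes the draft's wording refers to an xmlns attribute given a default value in an ATTLIST declaration; the URI urn:x-from-dtd and the element names are made up for illustration. Python's ElementTree is used only as a convenient namespace-aware, non-validating parser and is not an infoset implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the only namespace declaration is a defaulted
# xmlns attribute in the internal DTD subset; the instance has none.
DOC = """<!DOCTYPE a [
  <!ATTLIST a xmlns CDATA "urn:x-from-dtd">
]>
<a><b/></a>"""

root = ET.fromstring(DOC)

# ElementTree reports names as "{namespace-uri}local-name".
print(root.tag)      # namespace supplied only by the DTD default
print(root[0].tag)   # child inherits the default namespace
print(root.attrib)   # namespace declarations are not reported as attributes
```

If this reading is right, a parser that does not read the DTD would report different namespaces for the same document, which is one more reason the wording deserves clarification.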

6. [element information item] : [in-scope namespaces]
   "namespace in effect for this element"
Is this concept defined anywhere?
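
One plausible reading of "namespace in effect for this element" is the set of prefix-to-URI bindings visible at that element. The sketch below assumes that reading; the function name in_scope_namespaces and the urn:x-* URIs are hypothetical. It also illustrates the concern in item 4: ElementTree keeps the URI in the element name, while the prefix is only available from the declaration events.

```python
import io
import xml.etree.ElementTree as ET

DOC = ('<a xmlns="urn:x-default" xmlns:p="urn:x-p">'
       '<p:b xmlns:q="urn:x-q"/><c/></a>')

def in_scope_namespaces(xml_text):
    """Return (tag, {prefix: uri}) for each element, in document order."""
    result = []
    pending = []  # declarations reported since the last "start" event
    stack = []    # one bindings dict per currently open element
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start", "end")):
        if event == "start-ns":
            # payload is (prefix, uri); the default namespace has prefix "".
            pending.append(payload)
        elif event == "start":
            scope = dict(stack[-1]) if stack else {}
            scope.update(pending)  # declarations on this element win
            pending = []
            stack.append(scope)
            result.append((payload.tag, scope))
        else:  # "end"
            stack.pop()
    return result

for tag, scope in in_scope_namespaces(DOC):
    print(tag, scope)
```

The sketch leaves out the implicit binding of the xml prefix, and it does not handle un-declaring the default namespace with xmlns="".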

7. [attribute information item] : [children]
What happens to Character References inside an attribute value?
What happens to them if they are of whitespace characters?
Are they normalized away? I could probably figure it out, but it would be 
nice if it were addressed here.
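
For reference, XML 1.0 attribute-value normalization turns a literal tab, linefeed or carriage return into a space, but appends a character referenced by a character reference unchanged. A minimal sketch with Python's expat-based ElementTree parser shows the difference; the element and attribute names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# "literal" contains a real linefeed character; "ref" and "tab" use
# character references to whitespace characters.
elem = ET.fromstring('<e literal="x\ny" ref="x&#10;y" tab="x&#9;y"/>')

print(repr(elem.get("literal")))  # literal linefeed was normalized
print(repr(elem.get("ref")))      # referenced linefeed survives
print(repr(elem.get("tab")))      # referenced tab survives
```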

8. [attribute information item] : [attribute type]
ENUMERATED should probably be changed to ENUMERATION to match the XML 1.0 
rec.
NOTATION is technically an "enumerated" type also.

9. [attribute information item] : [attribute type]
I find it very unfortunate that with an enumerated type, the various 
enumerated values are not specified. This information is essential to 
understanding what an enumerated type is.

10. [character information item]
The paragraph that begins with "Note, however..." is clear, until character 
references are mentioned.

If character references map one-to-one onto character information items, 
but (inside attributes) are whitespace-normalized differently, you have an 
inconsistency.
It creates a situation where an XML Document may be parsed into an 
infoset(1), serialized into XML, parsed again into an infoset(2), and the 
infosets (1) and (2) will not be identical.

This inconsistency may be resolved by
	(a) whitespace-normalizing characters created by character references 
identically with others,
	(b) preserving in the infoset the distinction between a linefeed character 
and a character reference to a linefeed character, or
	(c) providing as part of the infoset spec a clear algorithm on how to 
reserialize an attribute value back into XML - what characters should just 
be dropped in, and which ones need to be represented by a character 
reference.
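
Option (c) could look roughly like the following Python sketch. It is one possible algorithm, not anything taken from the draft: markup-significant characters and whitespace other than the space are written as references, and everything else is dropped in literally. The names serialize_attr_value and round_trip are hypothetical, and the value is assumed to be written inside double quotes.

```python
import xml.etree.ElementTree as ET

# Characters that cannot be written literally in a double-quoted attribute
# value without changing what a parser reports after normalization.
_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    '"': "&quot;",
    "\t": "&#9;",
    "\n": "&#10;",
    "\r": "&#13;",
}

def serialize_attr_value(value):
    """Escape an attribute value so that reparsing yields the same string."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)

def round_trip(value, escape):
    """Serialize value into an attribute with escape(), then parse it back."""
    doc = '<e a="%s"/>' % escape(value)
    return ET.fromstring(doc).get("a")

original = "line1\nline2\tend"
naive = round_trip(original, lambda v: v)            # whitespace dropped in literally
careful = round_trip(original, serialize_attr_value)

print(repr(naive))    # differs from the original
print(repr(careful))  # identical to the original
```

The naive path reproduces the infoset(1) versus infoset(2) mismatch described above; the escaping path avoids it for this input.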

11. [character information item] : [element content whitespace]
Instead of setting a special flag, I would really like to just discard 
characters of this kind. I think this is more common behavior among 
validating XML processors. Perhaps the definition of CORE conformance can be 
changed to allow omission of character information items where this flag is 
set to TRUE (instead of not including the flag, and requiring all characters 
to be presented). Obviously, only validating parsers would be able to take 
you up on this option.

12. [comment information item]
The definition is lacking in rigor, and should be rewritten.
What about comments in the DTD:
	Internal subset?
	External subset?
	External Parameter Entity?
What about comments that were not included in the original XML "Document 
Entity", but were pulled in from an external parsed entity?

13. [document type declaration] : [children]
It is unclear if comments and processing instructions located in External 
Parameter Entities appear in this list.
If the same Parameter Entity is referenced twice, will comments in the 
replacement value each create two information items?

14. Section 4.1; Core Conformance
The reference to the [parent] property of namespace declaration items should 
be changed to the [owner element] property.

15. Appendix C; What is not in the Infoset
This should be expanded, to more completely document various other data that 
are not included. For example:
	The target of the DOCTYPE declaration.
	Single quotes vs. Double quotes for each attribute
	Whitespace inside a PI, immediately following the target.
	Literal Characters vs. Character References.
	Whether declarations in a DTD come from the Internal subset, External 
Subset, or from an External Parameter Entity.
	Almost any information at all about Parameter Entities.
	Presence/Absence of an XML Declaration.
	Presence/Absence of a Text Declaration for each External Entity.

I am happy to respond in more detail with regard to any of these points.

-Wayne Steele

Received on Wednesday, 2 August 2000 21:54:39 UTC
