Re: Whitespace from Peter Murray-Rust on 1997-05-11 (w3c-sgml-wg@w3.org from May 1997)

From: Peter Murray-Rust <Peter@ursus.demon.co.uk>
Date: Sun, 11 May 1997 18:57:42 GMT
To: w3c-sgml-wg@w3.org
Message-Id: <6448@ursus.demon.co.uk>

In message <199705111729.TAA10393@mygale.inria.fr> Bert Bos writes:
[...]
>  > 
>  >  1. White space in element content
> 
> That is easy to fix by selecting a single whitespace handling method
> in the XML profile for SGML. `Keep-all-whitespace' is ugly, but
         ^^^^^^^^^^^
Please excuse my ignorance :-), but what is this and where does it get
implemented? 

> workable; a better rule is to simply ignore any newline directly
> after a '>' or directly before a '<'. The important thing is that this
> rule becomes part of the XML profile, and does not depend on the XML
> document itself.

Is it intended that the profile is unique and unchanging for all XML
documents?  If not, where does it get altered?

This then means that the content depends on the combination of the
document and the profile. 
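
To make the proposed rule concrete, here is a minimal sketch of "ignore
any newline directly after a '>' or directly before a '<'". It is only
an illustration: the function name is hypothetical, it treats every '>'
and '<' character as a tag delimiter (no real parsing), and it assumes
that exactly one line break is dropped on each side.

```python
import re

# Matches a single line break in any of the common conventions.
_NEWLINE = r"(?:\r\n|\n|\r)"

def strip_tag_adjacent_newlines(text: str) -> str:
    """Drop one newline directly after '>' and one directly before '<'.

    Hypothetical helper: a literal reading of the rule, applied to the
    raw character stream rather than to parsed markup.
    """
    # Remove a newline that immediately follows a '>'.
    text = re.sub(">" + _NEWLINE, ">", text)
    # Remove a newline that immediately precedes a '<'.
    text = re.sub(_NEWLINE + "<", "<", text)
    return text

sample = "<p>\nHello\nworld\n</p>"
result = strip_tag_adjacent_newlines(sample)
```

With this rule the newline between "Hello" and "world" survives, while
the two newlines next to the tags do not, and the result is the same
whatever DTD the document uses.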

> 
>  >  2. Default attributes
> 
> The previous XML-lang draft had a handy macro <?xml default...?> that

I liked this as well, and after its disappearance have vowed not to use
defaults in my own DTDs :-).
> 
>  >  3. Attribute values that are space/case normalized only if you
>  >     read the DTD and know they are NMTOKEN or ID or something.
> 
> This is another thing that will have to be added to the XML profile
> for SGML: all attributes are always treated as CDATA and never
> normalized. NMTOKEN, NUMBER, etc. can still be used for validation,
> but do not influence the parsing. I.e., in the XML datamodel the
> attributes foo="7" and foo="07" are different, even though some
> application may treat them the same.

I would be grateful (perhaps on xml-dev) for some explanation of NMTOKEN and
why it is useful.

A point here is that most *generic* applications do not need to know
what attribute type is used.  Obviously ID matters, because it's used
in TEIXptrs, and that isn't a parser matter.  Are there any other 
attribute types that applications need to know about?  Or can they assume
that any CDATA produced from the parser is typeless?  I can see that some
applications *might* be concerned as to whether something was a string or a 
number, but it's not easy to see how a generic application would react to
this.
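
As a rough illustration of the difference under discussion, the sketch
below contrasts a value left alone as CDATA with the same value
space/case normalized because the DTD declares a tokenized type. The
function name, the set of type names, and the choice to fold case to
upper case are all assumptions made for the example, not a statement of
what any particular parser does.

```python
# Declared types that this sketch treats as "tokenized" (an assumption).
TOKENIZED_TYPES = {"NMTOKEN", "NMTOKENS", "ID", "IDREF", "NAME", "NUMBER"}

def normalize_attribute(value: str, declared_type: str = "CDATA",
                        fold_case: bool = True) -> str:
    """Return the attribute value as a DTD-aware parser might report it.

    Hypothetical helper: CDATA is returned untouched; tokenized types
    have surrounding whitespace stripped, internal runs collapsed to one
    space, and (optionally) case folded.
    """
    if declared_type not in TOKENIZED_TYPES:
        return value
    collapsed = " ".join(value.split())
    return collapsed.upper() if fold_case else collapsed

raw = " Foo  bar "
as_cdata = normalize_attribute(raw, "CDATA")
as_tokens = normalize_attribute(raw, "NMTOKENS")
```

An application that never reads the DTD sees only the first form, which
is why treating every attribute as CDATA makes the result independent
of the declarations. Note that even with normalization "7" and "07"
stay distinct strings; treating them as equal is left to the
application.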

	P.

-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

Received on Sunday, 11 May 1997 15:24:10 UTC