
Re: code and blockcode

From: Christophe Strobbe <christophe.strobbe@esat.kuleuven.be>
Date: Mon, 01 Aug 2005 15:24:07 +0200
Message-Id: <6.0.0.22.2.20050714234331.030c1308@mailserv.esat.kuleuven.be>
To: www-html@w3.org

Hi Laurens,

Sorry for this late response. I was doing other things.

At 22:02 14/07/2005, Laurens Holst wrote:
>Bjoern Hoehrmann wrote:
>>
>>* Laurens Holst wrote:
>>>xml:space="preserve" by itself will not achieve the desired effect, it
>>>only controls how whitespace ends up in the DOM. It is additionally
>>>already set to that value on all elements in XHTML 2.0. See also:
>>>http://www.w3.org/MarkUp/2004/xhtml-faq#xmlspace
>>>
>>
>>Well, at least 
>>http://www.imc.org/atom-syntax/mail-archive/msg12799.html
>>Tim Bray and I disagree with this, and the latest XHTML 2.0 draft does
>>not seem to define this either. Of course, the HTML WG is not easily
>>persuaded by technical argument [...] about simple and obvious facts.
>>
>Anyways, I did some research, and these were the conclusions I made...

Thanks for the research, but you make some statements that I disagree with.




>
>What xml:space is used for
>
>The spec says to signal an intention that in that element, white space 
>should be preserved by applications. First of all, note that this is about 
>what the XML parser communicates to the application on top of it, which is 
>the DOM.

That's not how I read the XML spec.
The parser can be a DOM parser, SAX parser, JDOM, XOM, ... The application 
sits on top of the parser. A browser that supports XML consists of several 
components, one being an XML parser and another being an application that 
uses that parser. So when the XML spec says that the application should 
preserve whitespace, it is the browser that should preserve whitespace. The 
issue here is that it says "should" instead of "must", so a browser can 
happily ignore the "preserved" whitespace that the DOM parser passes on to 
it and use its default whitespace processing.
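
A minimal sketch of that division of labour, assuming Python's xml.etree.ElementTree as the parser. The render function and its honor_xml_space flag are hypothetical stand-ins for the application layer, not anything a real browser exposes. The parser reports the character data and the attribute unchanged, and the application decides whether to follow the hint:

```python
import xml.etree.ElementTree as ET

# Namespace that the reserved "xml" prefix is bound to.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

doc = ET.fromstring('<doc xml:space="preserve"><p>a   b\n  c</p></doc>')

# The parser hands the character data to the application untouched...
p_text = doc.find("p").text
# ...and exposes xml:space as an ordinary attribute value.
space = doc.get(XML_SPACE)


def render(text, space_value, honor_xml_space):
    """Hypothetical application step: keep or collapse whitespace."""
    if honor_xml_space and space_value == "preserve":
        return text
    # Default processing: collapse every whitespace run to one space.
    return " ".join(text.split())


print(repr(render(p_text, space, honor_xml_space=True)))
print(repr(render(p_text, space, honor_xml_space=False)))
```

Either branch is compatible with a "should": the first follows the hint, the second ignores it and falls back to default processing.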

>Now the confusing thing here is that there are really two things that are 
>dealing with whitespace: white space preservation by the application on 
>top of the XML processor (the DOM), and white space collapsing by CSS. My 
>guess would be that xml:space hints at how the application should process 
>the spacing, not how the styling language processes it.

The browser component that interprets the styling language "should" 
preserve the whitespace, but it is not required to do so. If the browser 
collapses whitespace in spite of 'xml:space="preserve"' before looking at 
the CSS, then the CSS rule 'white-space:pre' can have no effect.


>(...)
>Thus, if you just look at parsing and displaying documents with a generic 
>XML processor and CSS, you can distinguish the following cases. Example 1, 
>XML as a database, where element content whitespace and other adjacent 
>whitespace is not important:

Without a schema, a parser can't determine if whitespace is ignorable or 
not. In the example you provide below, you need a schema that says that 
<movie> can only have element content. If <movie> can only contain element 
content, the xml:space attribute is quite irrelevant because the 
application (e.g. code that turns the XML into database records) has no use 
for the whitespace between the elements. If <movie> can also contain 
character data, then xml:space can be a useful signal.
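
To make the schema point concrete, here is a rough sketch, again assuming Python's xml.etree.ElementTree. The ELEMENT_ONLY set is a hypothetical stand-in for what a schema would declare, and strip_ignorable is an illustrative helper, not part of any library. Without that extra knowledge the parser simply reports the whitespace between elements as text:

```python
import xml.etree.ElementTree as ET

# Assumption: stands in for a schema that declares these elements
# to have element-only content.
ELEMENT_ONLY = {"movies", "movie"}


def strip_ignorable(elem):
    """Drop whitespace-only text inside elements known to be element-only."""
    if elem.tag in ELEMENT_ONLY:
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        for child in elem:
            if child.tail is not None and not child.tail.strip():
                child.tail = None
    for child in elem:
        strip_ignorable(child)


source = """<movies xml:space="default">
  <movie>
    <title>The Fifth  Element</title>
    <director>Luc Besson</director>
  </movie>
</movies>"""

root = ET.fromstring(source)
before = root.text   # whitespace reported by the parser, no schema consulted
strip_ignorable(root)
after = root.text    # removed by the application, using schema knowledge
title = root.find("movie/title").text  # character data is left alone
```

The whitespace inside title survives because title is not in the element-only set; only the application, informed by the schema, can tell the two kinds of whitespace apart.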




><?xml version="1.0" encoding="UTF-8"?>
><movies xml:space="default">
>    <movie>
>        <title>The Fifth  Element</title>
>        <director>Luc Besson</director>
>    </movie>
></movies>
>
>Example 2, a text document, where element whitespace is important 
>in between "really" and "a good example":
>
><?xml version="1.0" encoding="UTF-8"?>
><doc xml:space="preserve">
>     <p>It is <em>really</em> <a href="">a good example</a>.</p>
></doc>
>
>
>
>Note by the way that nowhere in the DOM specification it says that this is 
>how xml:space should be processed.

xml:space is defined by the XML spec. It would be wrong for the DOM spec to 
redefine it, otherwise DOM would be an API for an "XML" that deviates from 
the W3C XML spec.

>Also note that setting xml:space has no effect on the way 
>IE's DOM treats them. But nowhere in XML does it say that the 
>application on top of the XML parser should be a DOM either.

Again, DOM is an API for XML parsers; it does not sit on top of the XML parser.


>Anyways, if you look at these examples, you will notice that the usage of 
>xml:space here has nothing to do with whether the content 
>should be displayed preformatted.

Since you did not consider the role of a schema, the examples do not 
exhaust all possibilities.

>[Discussion of DOM snipped.]
>So, for the sake of ensuring the preservation of the document in the DOM 
>or whatever other backend the XHTML UA has, xml:space="preserve" is 
>automatically set as the default value for the entire XHTML document. As 
>shown above, it is appropriate for rich text documents.
>
>Automatically placing xml:space="preserve" on elements is not at all an 
>unforeseen use by the way, because the last paragraph of section 2.10 of 
>the XML spec (http://www.w3.org/TR/REC-xml/#sec-white-space) explicitly 
>says that the attribute can be declared with a default value on the root 
>element. I cannot think of many other use cases for that other than the 
>reason XHTML 2.0 does it.

XML vocabularies are allowed to do this, but it is not necessarily a good 
idea to do this for all elements. In most XHTML elements you don't want to 
preserve the whitespace that you add for the readability of your code. So 
every stylesheet for XHTML 2 would have to define "html {white-space: 
normal;}" and then override this again for the elements where whitespace 
should be preserved.

The current draft of the XHTML 2 spec says "All XHTML 2 elements preserve 
whitespace." (http://www.w3.org/TR/xhtml2/conformance.html#conf_whitespace).




>
>Using xml:space to express preformatted content
>
>Given that default setting to preserve, it's not possible to 
>additionally specify xml:space="preserve" for elements in the 
>document.

It's merely superfluous, not impossible. This distinction is important for 
content that can be validated.
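
A small sketch of why the repeated attribute is redundant rather than invalid, assuming Python's xml.etree.ElementTree. The effective_space helper is hypothetical, and the explicit attribute on the root element stands in for the default that the XHTML 2 draft would supply. The value in effect for an element is its own xml:space if present, otherwise the value inherited from its nearest ancestor:

```python
import xml.etree.ElementTree as ET

XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def effective_space(elem, inherited="default"):
    """Return (tag, xml:space value in effect) for elem and its descendants."""
    value = elem.get(XML_SPACE, inherited)
    pairs = [(elem.tag, value)]
    for child in elem:
        pairs.extend(effective_space(child, value))
    return pairs


# Attribute on the root only (stand-in for the vocabulary's default).
root_only = ET.fromstring(
    '<html xml:space="preserve"><body><pre>x</pre></body></html>'
)
# Same document with the attribute repeated on <pre>.
repeated = ET.fromstring(
    '<html xml:space="preserve"><body>'
    '<pre xml:space="preserve">x</pre></body></html>'
)

same = effective_space(root_only) == effective_space(repeated)
```

Both documents parse, and the computed values are identical for every element, so the extra attribute adds no information.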

>After all, that would change nothing. When using the following CSS to 
>achieve the preformatted styling:
>
>*[xml:space=preserve] { white-space: pre }
>
>It would stop collapsing the whitespace and linebreaks for the entire 
>document.
>
>But even supposing that we could use xml:space for this 
>purpose, I am still left with the question: what would I present ASCII art in?

Use HTML 4.01 or XHTML 1.0. Or use a namespace for the XHTML 1.1 Text 
Module 
(http://www.w3.org/TR/xhtml-modularization/abstract_modules.html#s_textmodule) 
in your XHTML 2 document. I'd be glad to see ASCII art disappear, but I 
know that others will disagree. I am in favour of separating structure from 
presentation; <pre> and ASCII art are examples of how not to do this. Is 
there anything in ASCII art that cannot be expressed by means of an image 
format?
Note that the introduction to the XHTML 2 spec says: "XHTML 2 takes HTML 
back to these roots [HTML as document structuring language], by removing 
all presentation elements, and subordinating all presentation to style 
sheets. This gives greater flexibility, greater accessibility, ..." ASCII 
art is a nightmare for screenreader users; skiplinks to bypass them are a 
hack. If we had a structure (whether element, attribute or role) to 
unambiguously identify a piece of ASCII "art",  screenreaders would be able 
to ignore it (if the user wished that).

(The paragraph element is just as appropriate or inappropriate for ASCII 
art as it is for poems. Poems are not just paragraphs: they consist of 
lines that are often arranged in stanzas. XHTML 2 does not have poem and 
stanza elements, so we'll have to make do with p and l. But I would not 
oppose the introduction of poem and stanza elements; at least they are 
semantic, unlike pre.)

Regards,

Christophe




-- 
Christophe Strobbe
K.U.Leuven - Departement of Electrical Engineering - Research Group on 
Document Architectures
Kasteelpark Arenberg 10 - 3001 Leuven-Heverlee - BELGIUM
tel: +32 16 32 85 51
http://www.docarch.be/  
Received on Monday, 1 August 2005 13:25:09 GMT
