Re: About XHTML 2.0

From: Sebastian Redl <sebastian.redl@getdesigned.at>
Date: Tue, 24 May 2005 10:12:17 +0200
Message-ID: <4292E1E1.7040801@getdesigned.at>
To: www-html@w3.org

Orion Adrian wrote:

>However I believe a goal of HTML should be to maximize the number of
>documents it can represent semantically without requiring CSS.
This sentence is in itself something of a tautology. "Semantically
without requiring CSS" is a given: CSS only conveys presentation, never
semantics. Yes, HTML must represent the semantics of a document, and for
me, that means its structure.

>One could say we only need <list>, but not <ol>, <ul> and <nl>, but
>that would reduce the number of documents we could represent with
>simply HTML.
No, one could not say that, because it loses semantic meaning. (That's
ignoring that the <list> element might have a 'type' attribute.) On the
other hand, <sep> has no semantic meaning that isn't also presented by
putting the two separated parts in their own container. If you believe
that we need more fine-grained control over the type of block, propose
another container.

>CSS, ID and class should not be required to represent a document. They
>are used to enhance a document with additional information; they
>should not be a requirement for most documents. I'm not going to say
>all documents because that would be hard to assert, but for me it's a
But I feel very comfortable saying that CSS, ID and class may be
required to *present* a document. CSS might be turned off in a browser,
true, but then this is a shortcoming of the browser. Using this as an
argument for <sep> makes it a presentational element, taken because of
its default presentation. Not good. Do you use an <ol> over a <ul>
because it is presented with numbers instead of bullet points? Or do you
use it because the content you have is an ordered list of items? If you
do it for the former reason, you've misunderstood the spirit of HTML.
You're conveying presentation instead of semantics.

><group> doesn't accomplish this unless it implies there is a seperator
>between each group. If this is the case it has to represent something
>very specific.
<group> implies that there is a separation between the groups. Does it
imply a visual separation? No, because that's a presentation issue.

Mark Birbeck wrote:

>So, yes, we know that if you have:
>  X
>    A
>    B
>    separator
>    C
>you can make that into:
>  X
>    Y
>      A
>      B
>    Z
>      C
>That's simple.
>But did I say that A and B have a parent of Y? I didn't. Did I say that A
>and B were in the same 'group' as each other? I certainly didn't do that.
>Did I say that A and B are in a different group to C? No. In fact the only
>thing I said was that I want a separator -- I didn't even say what I want to
>separate. Everything else is a layer of semantics that you are trying to
>impose on my document that I did not imply.
Actually, yes, you implied all these things. By separating C from A and
B, you implied that A and B are one group, and C is another. Why else
would they be separated? By using XML, you've implied that every group
has a containing element. Kicking and screaming? Perhaps, but that's how
XML works. If you don't like it, perhaps you shouldn't use a tree-based
markup language.
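
To make the X/Y/Z rewrite concrete, here is a rough sketch in Python using
the standard xml.etree.ElementTree module. The function name
group_by_separator and the element names "separator" and "group" are
chosen purely for illustration; no specification defines them this way.

```python
import xml.etree.ElementTree as ET

def group_by_separator(parent, sep_tag="separator", group_tag="group"):
    """Build a copy of parent in which each run of children between
    separators is wrapped in its own container element."""
    result = ET.Element(parent.tag, parent.attrib)
    current = ET.SubElement(result, group_tag)
    for child in parent:
        if child.tag == sep_tag:
            # The separator itself vanishes; it only starts a new group.
            current = ET.SubElement(result, group_tag)
        else:
            current.append(child)
    return result

flat = ET.fromstring("<X><A/><B/><separator/><C/></X>")
grouped = group_by_separator(flat)
print(ET.tostring(grouped, encoding="unicode"))
```

The separator carries no information that the two containers do not
already carry, which is the point being argued.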

>It's a cliché indeed, but it does seem to me to be a case of "when you have
>a new hammer belt, you tend to carry your hammer around with you all day".
I thought it was, "when you have a hammer, everything looks like a nail."

Al Gilman wrote:

> On the other hand, separators are not 'impractical,' as their use in
> MIME, I think, conclusively demonstrates.

I disagree. The separators in MIME are syntactical constructs. The
semantics of a multipart message are about blocks of data, and the
separators entirely disappear from the semantic representation. And
blocks are best represented, in XML, by containers. The specification
even says so:

> The body must then contain
> one or more body parts, each preceded by a boundary delimiter line,
> and the last one followed by a closing boundary delimiter line.
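
A small sketch of that claim, using Python's standard email package. The
boundary string "XYZ" and the two text bodies are invented for the
example. After parsing, the message is a list of parts, and the boundary
lines are no longer present in any of them.

```python
from email import message_from_string

raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "first part\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "second part\n"
    "--XYZ--\n"
)

msg = message_from_string(raw)
parts = msg.get_payload()            # a list of sub-messages for multipart
bodies = [part.get_payload() for part in parts]

for body in bodies:
    print(repr(body.strip()))
```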

Incidentally, this very much resembles the implicit closing tags of SGML.
Take this valid HTML fragment:

<p>More text
<p>Even more text</p>

This has the very same grammatical representation as a MIME multipart
message. And yet the parse tree is about blocks, not separators.
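
As an illustration only, the following Python sketch uses the standard
html.parser module and applies a single simplified rule: a <p> start tag
begins a new paragraph block, whether or not the previous one was closed
explicitly. The class name ParagraphBlocks is hypothetical, and a real
HTML parser handles far more cases than this.

```python
from html.parser import HTMLParser

class ParagraphBlocks(HTMLParser):
    """Collect the text of each <p> block, treating a new <p> start tag
    as an implicit close of any paragraph that is still open."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self.in_paragraph = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.paragraphs.append("")   # start a new block
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph:
            self.paragraphs[-1] += data

parser = ParagraphBlocks()
parser.feed("<p>More text\n<p>Even more text</p>\n")
parser.close()
print([p.strip() for p in parser.paragraphs])
```

The result is two blocks of text; the tag that acted as a delimiter does
not survive as a node of its own.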

Laurens Holst wrote:

> I think it is good to have a separator element. There are many books
> where a chapter contains different perspectives in the story. Those
> parts inside a chapter are commonly separated by e.g. a drawing, or
> three stars.

But again, you're separating two blocks of content. Why not put them
into blocks instead? Because of the presentation?

Orion Adrian wrote:

>Not all separations will be perspective changes. Many won't be and it
>gets pretty ugly when you start diving all over.
Not all separations will be perspective changes? So they separate some
other kind of blocks. It's still blocks above and below the separator,
and they could and should be wrapped in their own block elements, thus
making the separator useless.

>What may make sense to us here doesn't make sense in another
Can you explain what kind of language you mean? Markup language, natural
language, ...

>Authors over an amazing number of years have used things like
>* * *
>to add to comprehension of their document. The text above and below
>doesn't necessarily have a commonality, but the author has decided
>that there should be a separation between the two bodies of text. It's
>really not the place of the W3C to say, "you must only separate things
>into logical groups of the same type that we can understand."
Authors over an amazing number of years have used these things to
separate sections of their text, because the readers of sequential text
do not appreciate reading opening and closing tags. But that's a
presentation issue. As far as the structure of the document goes, you
have separate blocks.
The W3C defines a language for "Hypertext Documents", and yes, it is the
W3C's place to say what a hypertext document is, and what belongs in it.
See, if you really want to accurately represent an online book, I'm sure
there's an ebook XML language out there. Or perhaps you can define your
own. Modern browsers all support client-side XSLT, so presentation
should not be an issue. (There's no need for the output of XSLT to be
semantic. That's why it's part of XSL.)

>Taking the following example (lifted from another post):
> A
> B
> Separator
> C
>Moving A and B into a subelement Y and C into a subelement Z
>fundamentally changes the semantics of A, B and C in ways I may not
>want to. For example A, B and C are no longer siblings. Expanding on
>this setup to:
> Y
>   A
>   B
> Z
>   C
> D
>My intention was that A, B, C and D (say E2) all be the same depth
>that they all be direct children of the class of element that is X and
>W (say E1).
Perhaps your intention was wrong? As I said in another post, structure
must be discovered, not defined. I'm sure that once you go from the
abstract to the concrete, you can find a satisfying solution for the
A, B and C are no longer siblings, but isn't that how it's supposed to
be? After all, you intentionally put a stronger separation between B and
C than between A and B. It would therefore be inappropriate to say that
A and B have the same relation on the document tree as B and C. B and C
are now first cousins.
D? Well, the most likely problem is that whatever Y is a type of should
also wrap D. And just like that they're on the same level again.

>There is no way to address E2 with the child sibling alone. If you
>remove Y and Z and replace it with a separator it simply becomes E1 >
>E2 which is what I wanted in the first place.
E1 > * > E2
Not much more work, and given the document structure, it is appropriate.
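
A rough analogy in Python, using the path syntax of the standard
xml.etree.ElementTree module rather than CSS itself: the path "E2"
corresponds to E1 > E2, and "*/E2" corresponds to E1 > * > E2. The id
attributes are added only so the matches can be told apart.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<E1>"
    "<Y><E2 id='A'/><E2 id='B'/></Y>"
    "<Z><E2 id='C'/></Z>"
    "</E1>"
)

direct = doc.findall("E2")     # direct children only, like E1 > E2
grand = doc.findall("*/E2")    # one level of wrapper, like E1 > * > E2

print([e.get("id") for e in direct])
print([e.get("id") for e in grand])
```

With the wrappers in place the direct-child path matches nothing, and the
one-wildcard path matches all three elements in document order.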

>document structure at
>the section level is taught to be designed before the actual writing
And then you must keep to this design. By introducing the <sep> element,
you're trying to break from the boundaries of the design you've given
yourself - clearly a design error.
Designing a document structure, if you want to call it that, means
thinking about what your document contains and how to represent it. It
does not mean making arbitrary decisions on how you want it to be.
Discovering, not creating.

><section> cannot be used except in overly document structure. If it is
>used for anything else it completely falls apart. Let's say that I
>number the paragraphs in my document so that each paragraph gets a
>number based on its position in the overall document.
>  <p>I will be 1.1</p>
>  <p>I will be 1.2</p>
>  <section>
>    <p>I will be 1.2.1</p>
>    <p>I will be 1.2.2</p>
>  </section>
>I start throwing in sections for other reasons and that falls apart.
>And if you don't like <p> imagine it with anything else. You can't
>just throw an extra <ol> into a list because you feel like there
>should be a pause.
What are you talking about? Are you claiming I want to insert <section>s
for presentation? No. What we have here is that you give, for your
particular document, additional semantics to <section> that do not exist
in the first place, namely the numbering of paragraphs by a scheme you
developed on your own. Let's add a separator to that.

  <p>I will be 1.1</p>
  <p>I will be 1.2</p>
    <p>I will be 1.2.1</p>
    <p>I will be 1.2.2</p>
    <p>I will be 1.2.3</p>

Can you explain to me why 1.2.3 is separate from 1.2.1? I cannot
imagine any real case where it wouldn't be more appropriate to introduce
another section with another subheading and another numbering level.

Or taking the <ol> example:
Show me one situation where you can justify the separator over grouping:

>Here's the issue with depth. All elements of a particular type in a
>document at the same depth should have the same semantic value. And
>while <section class="X"> seems like a nice approach that assumes that
>every group can be broken up into class X. It also assumes that I, as
>an author, wish to apply the same <separation /> semantics to every
>change in X.
You, the document author, decided to give the elements a common class X
in the first place. If you later want to have different separation
semantics for individual parts of the group, you made a mistake earlier,
because clearly, the elements are NOT of the same class.

><separation /> is useful, lightweight, allows for less typing and is
>in many ways a nice convenience while not detracting at all from the
>semantic web. So what's the problem? It solves lots of problems while
>not causing any and there is a clear use for it. If only all the
>elements in HTML had met that goal.
Separation is superfluous. In my opinion, that's enough reason to remove it.
It's also setting a dangerous precedent. Because if we introduce the
empty element <sep> to separate lightweight blocks, who can argue
against introducing <bigsep> to separate heavyweight blocks? On what basis?

Henri Sivonen wrote:

> <div> and class are not a proper substitute. <div> and class don't
> mean anything to the UA beyond being opaque identifiers you can bind
> styles to, and CSS is optional.

<div>, as the name says, places a division. It means that to the UA.
<separator> places a separation. The semantic value of the two is
exactly identical (something is a group, separate from the rest), but
<div> is easier to work with in processing and more consistent with the
tree model of XML.

Laurens Holst wrote:

> As far as CSS is concerned, you can select the elements following a
> <separator/> element by using the ‘~’ selector (or the element
> following it immediately using ‘+’). So within its common limitations,
> CSS is capable of styling sections around and inbetween separators as
> well.

And now apply a border to this selector. Let's just say that the result
will not be exactly what we'd (probably) like it to be.

Sebastian Redl
Received on Tuesday, 24 May 2005 08:11:57 UTC
