RE: Semantic Argument (Warning: Long Post)

On Sun, 14 Nov 2004, Doug Schepers wrote:
> 
> I see that we have fundamentally different visions for the future of the 
> Web. I have little expectation that I will change your mind, but in case 
> you're interested, here's my perspective. Thanks for replying to my 
> rather off-topic email.
> 
> |  * An element in a custom namespace bound to SVG using sXBL, 
> |  or a custom element that is styled using CSS, is 
> |  semantically poor because:
> | 
> |     1. The element will not be natively implemented by any 
> |        Web browsers, 
> 
> That's yet to be known. Perhaps not this year, or even in 5 years 
> (though it wouldn't surprise me to see more browsers supporting richer 
> semantics in that timeframe), but eventually browsers will support 
> something completely different than HTML and SVG, and probably to the 
> exclusion of those "old-fashioned" languages. Long after the current 
> presentational and quasi-structural layers have dropped off the face of 
> the Web, content with sufficient semantics will be able to be reframed 
> in the new idiom.

And when the browsers support those elements, then they're fine. But until 
they do, the elements are semantically poor for that reason.


> |        it will only be usable in the context of  
> |        the sXBL binding or the CSS stylesheet,
> 
> Or whatever languages XBL bridges to, or however future browsers 
> interpret it. In many cases it would be silly to represent some content 
> in SVG to a sightless person, and so that UA would use the appropriate 
> presentation (which could be provided by the author, the user, or a 
> third party).

Sure. The point is that without sXBL or CSS, that element is mostly 
useless, since the UA doesn't know what to do with it.


> |     2. It has no defined conceptual meaning that is not tied to 
> |        the binding or stylesheet,
> 
> Exactly the opposite. The conceptual meaning is tied into the ontology 
> itself, and is therefore more rich and robust.

By definition, if it's a custom element, it doesn't have a well-known 
meaning, since it is not from a well-known ontology.


> |     3. The elements can only be rendered by user agents that 
> |        implement SVG (for the binding case) or CSS (for the 
> |        stylesheet case).
> 
> Even if that were true, which I don't believe it is

How can it not be true? The UA knows nothing about it. The author is 
explicitly relying on sXBL or CSS to render it. What would a non-XBL and
non-CSS UA do with it?


> | That's not at all what I said. The semantics afforded by HTML 
> | are suitable for many millions, if not billions, of pages, 
> | such as FAQs, documentation, home pages about people's cats, 
> | contact pages, and all the other things people say to each 
> | other regularly.
> 
> Yes, but that's only a small part of what could be represented on the 
> Web. Those things are the activities people do on the Web now because 
> that's all that they can do. I see the need for far more complex data 
> and representation in the future, as well as quotidian text. It doesn't 
> harm that kind of data to have another kind on the Web as well.

There are specific industries where it makes sense to use more appropriate 
markup languages. Those industries have UAs that support those markup 
languages. XBL or CSS (or, more commonly, XSLT and CSS) might well be used 
to render such documents -- but the point is the markup is known to the 
user, or his agent. He doesn't _need_ to use CSS or XBL to view the page.

The harm is when those languages are used for the "quotidian" text. This 
is quite common: just look at the number of PDF or Microsoft Word 
documents on the Web. This is what we have to avoid encouraging. It's bad 
because in _those_ cases the user _doesn't_ know the markup language, and 
has no way to use the pages _except_ through CSS or XBL.


> | > Why is semantic markup styled with CSS better than semantic markup 
> | > styled with SVG? I submit that it isn't.
> | 
> | It wouldn't be, if that's what you actually had. But it 
> | isn't. The sXBL model, for instance, requires an <svg> root 
> | element. You can't take an HTML document and "style it with 
> | SVG" the way things stand now. (That was part of my technical 
> | comments, in fact.)
> 
> Maybe not, but you could model the HTML in SVG. Why does the root matter?

Because you need to be able to _also_ render it using other mechanisms 
(such as aural CSS) without changing the document. If I access the 
document from my HTML phone, I don't want it to tell me "I don't 
understand SVG", I want it to just render the HTML using its built-in 
rules, and ignore the author's suggested rendering.

With languages that the UA doesn't support, you can't do this.
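
To make the fallback argument concrete, here is a rough sketch of what 
"built-in rules" buy you. It is only an illustration: the table of 
elements, the speak() function, and the wording it produces are all 
made up for this example and don't come from any spec or real UA.

```python
# Hypothetical sketch: an aural UA with built-in rules for a few HTML
# elements. The rule table and output wording are illustrative
# assumptions, not taken from any specification.
BUILT_IN_RULES = {
    "h1": "heading level 1",
    "p": "paragraph",
    "em": "emphasis",
}


def speak(element, text):
    """Return what this imaginary UA would announce for one element."""
    role = BUILT_IN_RULES.get(element)
    if role is None:
        # No built-in rule, and no stylesheet or binding the UA supports:
        # all it can do is read out the raw text with no role attached.
        return text
    return f"{role}: {text}"


# A known element keeps its meaning without the author's stylesheet.
print(speak("h1", "Frequently Asked Questions"))
# A custom element degrades to bare text.
print(speak("chem:molecule", "H2O"))
```

The HTML element still gets announced as a heading when the author's 
suggested rendering is ignored; the custom element carries nothing the 
UA can act on.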


> | The biggest problem, though, is that you can write content 
> | purely in SVG, without the semantics. 
> 
> You can, but if there's an sXBL component that already does the 
> conversion for you (made by some third party, for sale or for free), it 
> will be easier to use that one and simply include the actual semantic 
> content.

This is quite clearly not the case, given the fraction of HTML pages that 
misuse WYSIWYG-like extensions instead of using HTML's semantic elements.


> | That's the problem
> | XSL-FO has too, and the problem HTML 3.2 had with <font>, 
> | <br>, and <table border>. People in general don't understand 
> | semantics, 
> 
> I'll bet many people do. They use specific jargon and specialized
> terminology within their fields. Just because they don't care about the
> limited semantics HTML offers, because it gains them too little to bother
> with, doesn't mean that they aren't very invested in their own semantics.
> I'll bet that they will use semantic content when they think it's
> appropriate to their needs.

It's not when it's appropriate to _their_ needs that matters. It's when 
it's appropriate to their blind user, who is unable to use the document 
because "well, semantics didn't matter in this case".


> | so if you let them write documents using a 
> | presentational language, and if it will work in Web browsers, 
> | they will. 
> 
> Not if it's easier to do otherwise.

WYSIWYG is easier than abstract markup with a presentation layer.


> | Same as CSS, sure. And in limited environments with only a 
> | few thousand people, or in intranets, that might be fine. But 
> | on the Web you have to deal with over half a billion people.
> 
> Not all at the same time. How many people are really going to be 
> interested in complex molecular frameworks? Does that mean we shouldn't 
> enable those who want to use the Web to further their research and 
> collaboration and data-sharing?

I posit that the majority of content created by humans (Web or not) is not 
of that nature.


It would be great if the Web worked as you suggest, with people using 
languages appropriate to their domain, users having the tools to deal with 
all the languages they are exposed to, and so forth. In fact that _is_ how 
the Web works _now_, except most of those languages aren't XML-based. My 
concern is simply that for the rest of the Web -- the common bit, the bit 
that is filled with punch-the-monkey ads, etc. -- the authors simply aren't 
competent enough to understand the difference between presentation and 
content. W3C languages that are expected to be used by that audience 
simply have to take this into account, and have to be _very_ careful about 
not making it easy to do the wrong thing.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Wednesday, 24 November 2004 16:06:23 UTC