RE: Liaison with CSS WG to provide a mechanism for expressing the style of document semantics

Robert -

I think we are nearly 100% on the same page now!

My only remaining concern is around the
concept of "semantics". To me, "semantics" must be machine-usable.
Definitely not machine-verifiable (impossible goal in too many cases), but
what you say about authors creating their own semantics all of the time via
CSS and such... it rubs me wrong. We have a different concept of
"semantics". To me, the "Semantic Web" means that I can write a user-less
system that spiders and indexes Web pages, and can, for example, know that:
<div semantic-role="Search Box">...</div>
can be safely ignored for a huge portion of things, but should be flagged
for, say, addition to a meta-search engine, and that <div
semantic-role="important">...</div> should be treated with extra
consideration, the same as <strong> or <em>.
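
A rough sketch of the kind of user-less spider described above, in Python. The "semantic-role" attribute and the two role values come from the examples in this message. Everything else (the class name, the weights, the way a search box is counted as a meta-search candidate) is an assumption made only for illustration.

```python
from html.parser import HTMLParser

# Role values are taken from the examples in the message; how a spider
# reacts to each one is an assumption made for this sketch.
SKIP_ROLES = {"search box"}
EMPHASIS_ROLES = {"important"}
EMPHASIS_TAGS = {"strong", "em"}
# Void elements never get an end tag, so they are kept off the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class RoleAwareIndexer(HTMLParser):
    """Collects page text, honouring a hypothetical semantic-role attribute."""

    def __init__(self):
        super().__init__()
        self.stack = []                  # one mode per open element
        self.indexed = []                # (text, weight) pairs
        self.meta_search_candidates = 0  # search boxes seen on the page

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        role = (dict(attrs).get("semantic-role") or "").strip().lower()
        if role in SKIP_ROLES:
            # Ignore the content for indexing, but flag the page.
            self.meta_search_candidates += 1
            mode = "skip"
        elif role in EMPHASIS_ROLES or tag in EMPHASIS_TAGS:
            # Same treatment as <strong> or <em>.
            mode = "boost"
        else:
            mode = "normal"
        self.stack.append(mode)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or "skip" in self.stack:
            return
        weight = 2 if "boost" in self.stack else 1
        self.indexed.append((text, weight))


def index_page(html):
    """Parse one page and return the populated indexer."""
    indexer = RoleAwareIndexer()
    indexer.feed(html)
    indexer.close()
    return indexer


if __name__ == "__main__":
    sample = (
        '<div semantic-role="Search Box"><input name="q">Search</div>'
        '<div semantic-role="important">Read this first</div>'
    )
    result = index_page(sample)
    print(result.indexed, result.meta_search_candidates)
```

The point is only that the spider needs no styling information and no guesswork: the role attribute alone decides what is skipped, flagged, or weighted.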

The really cool thing to me about semantic roles is that they are separated
from styling. @aria-role="important" (or aria:role="important") doesn't make
something bold like <strong> does by default in every major browser. That
encourages the HTML authors to use the tags appropriately, unlike the
current "semantic tags"; there is no reason to apply a semantic role to
something unless it is warranted and correct. If someone wants to grant a
semantic role to a CSS definition (such as in my separate proposal),
*great*. That will encourage consistent, widespread usage, and it is fairly
easy (much easier than writing such a UA without semantic role at all, but
trying to figure out semantics!) for a non-browser UA to fold the CSS in and
discover the semantic role.
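
A minimal sketch of that "fold the CSS in" step, in Python. It assumes a hypothetical "semantic-role" declaration inside a CSS rule. The actual syntax of the separate proposal is not reproduced in this message, so the property name, the function names, and the restriction to simple ".class { ... }" rules are all assumptions.

```python
import re

# Only handles simple rules of the form ".classname { declarations }".
RULE = re.compile(r"\.([A-Za-z_][\w-]*)\s*\{([^}]*)\}")
# "semantic-role" as a CSS property is hypothetical, for illustration only.
ROLE_DECL = re.compile(r"semantic-role\s*:\s*([^;]+)")


def roles_by_class(css):
    """Map each class name to the semantic role its rule declares, if any."""
    roles = {}
    for class_name, body in RULE.findall(css):
        match = ROLE_DECL.search(body)
        if match:
            roles[class_name] = match.group(1).strip().strip("\"'")
    return roles


def resolve_role(class_attr, roles):
    """Return the role of the first class on an element that has one."""
    for class_name in class_attr.split():
        if class_name in roles:
            return roles[class_name]
    return None


if __name__ == "__main__":
    stylesheet = """
    .warning { color: red; border: 1px solid red; semantic-role: important; }
    .nav { float: left; }
    """
    table = roles_by_class(stylesheet)
    print(resolve_role("note warning", table))
```

A non-browser UA would run something like this once per stylesheet and then look up each element's class attribute, without having to interpret any of the visual declarations.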

Your conception of semantics is much wider and more permissive than mine. Not
saying that either one of us is more right or more wrong (let alone an
absolutely wrong/right value!), but my conception is more oriented towards
people writing non-browser UAs, and yours is more oriented towards HTML
authors.

J.Ja

-----Original Message-----
From: Robert J Burns [mailto:rob@robburns.com] 
Sent: Thursday, June 05, 2008 6:31 AM
To: Justin James
Cc: 'HTML Issue Tracking WG'
Subject: Re: Liaison with CSS WG to provide a mechanism for expressing the
style of document semantics

Hi Justin,

Let me reply to your bullet points here:

On Jun 5, 2008, at 10:22 AM, Justin James wrote:
>
> Whew! It's nice to finally "get" this! Now that I understand it the way
> you intended it, I like it a lot better. Some comments though (I know,
> I'm a tough crowd):
>
> * I am proposing a touch more than just a rename (although the rename
> would make it more clear, IMHO)

Perhaps, but to me a legend in the sense of: "wording provided by  
authors explaining the meaning of the presentational properties" is a  
more apt description.

> * I believe that allowing user created "semantics" is a mistake; we keep
> trending away from pre-enumerated values in favor of this ultimate
> flexibility.

Well, I'm not proposing user-created "semantics", but instead
facilitating the extension of authoring semantics that's already
taking place on the web. Whenever an author coins a new class name or
even attaches the id value of footer or header to a DIV, they are
authoring semantics whether they themselves think of it that way or
not. When a CSS author adds a class selector or other intricate
selector, they are typically doing so to provide a unique
presentation mechanism to expose the semantics of the HTML document.

> I believe that an insanely small fraction of developers ever
> extend these systems, and when they do, consumers are never able to
> understand the extensions.

Quite the contrary: authors regularly extend the system whenever
coining class names or attaching those class names to a particular  
element. And by providing a CSS legend mechanism we give consumers of  
content the ability to understand those semantics.

> When we mandate a predefined enumeration, we
> allow developers to leverage that and write applications that can truly
> understand the semantics of a document.

Certainly.

> * The human user figures out semantics easily, regardless of markup.
> When I see "USS Nimitz", I know that it is a ship's name regardless of
> the formatting, thanks to the "USS" prefix. But when a human HTML author
> marks that string with <em> to provide the grammatically correct italics
> to it, any machine semantic parser will think that the string is
> important, which is not its semantic meaning at all.

Not always. Imagine the semantics of big issues in the current HTML5
draft. If the authored legend of "big issues are marked like this"  
were not included in the document, the user would not necessarily  
comprehend the semantics conveyed by the red text and red border  
styling. Obviously the user can draw on context and other clues, but  
it is better to provide an explicit mechanism for authors. For a user  
of unconventional media, the author may have failed entirely to  
provide legend information or even distinctive styling declarations.  
For that user, the semantics remain completely inaccessible.

> * I think that an awful lot of our problems come from trying to import
> XML's concept of extensibility into HTML. I posit that HTML, when it is
> extensible, does not need to be consistent. After all, HTML *is* an
> end-use standard. XML is a metastandard used to define other standards.
> That's a big difference. As it applies here, I don't see why authors
> need the ability to create their own semantic roles for things.

Authors already do create and use their own semantics by using class  
names, id values and even combinations of various attribute values  
(apparent in compound CSS selectors). Keep in mind that when a CSS
author makes a CSS declaration, it is typically associated with a
semantic in the document, and oftentimes with extended semantics not
provided by HTML<5.

> Indeed, if we just take the
> existing list of semantic tags, and strip off "<" and ">", that right
> there is a great start for an enumeration of all of the semantic
> concepts that we want to expose to "legend/purpose".

Well, the element type names contain semantic information, yes. However,
authors do add other semantic information through the other mechanisms  
available in HTML.

> I hope I am making sense here; to be frank, I haven't had real sleep in
> a while, and I know that my grammar and logic have been suffering a bit
> as a result.

I understand. I do think we're starting to understand each other, but  
there are clearly still some gaps.

> Thanks again for your patience! I think that we have a great start to an
> idea that will really improve things for Web browser implementers, Web
> developers, and the authors of applications that consume HTML.

Thanks for your feedback, confidence and enthusiasm. I do think this  
can be a valuable addition to the semantic and self-describing web.

Take care,
Rob

Received on Monday, 9 June 2008 04:07:14 UTC