Re: relevance of diverse HTML authoring practices [was: Versioning re-visited ...] from scott lewis on 2007-06-26 (public-html@w3.org from June 2007)

From: scott lewis <sfl@scotfl.ca>
Date: Tue, 26 Jun 2007 00:23:16 -0600
To: Philip & Le Khanh <Philip-and-LeKhanh@Royal-Tunbridge-Wells.Org>
Cc: HTML Working Group <public-html@w3.org>
Message-Id: <48A10127-FF2D-42AB-B740-A37180D3CA62@scotfl.ca>

On 25 Jun 2007, at 1444, Philip & Le Khanh wrote:

> scott lewis wrote:
>
>> What use is HTML except as a set of directions to a UA?
>
> The use of HTML is to mark up the semantics of a hypertext document.
> One deployment of such a document is to render it through a user
> agent; but just because that is the primary deployment does not
> mean that there are not others (indexing (a la Google), data
> mining, typesetting are just three that come immediately to mind).
>
> [...]

I apologize for using a visual browser as a shorthand for all HTML  
UAs. :)

The point I was trying to make is that the semantic meaning of HTML
and its presentation (in whatever form: visual, aural, braille,
database entry, etc.) both depend on the consumer of the markup. For
example, if a UA were to treat <ul> as a hypertext link and <li> as a
resource that the <ul> points to, it would be wholly incompatible with
the author's intent. The simplest way to keep UA behaviour and  
authors' intent in step is to specify the language with one document  
that applies to both.


>> The semantic meaning of "an unordered list with three items" is a  
>> processing requirement. I don't understand how HTML could be  
>> defined properly in the absence of describing UA behaviour.
>
> All that is necessary is to describe the semantics, and suggest
> appropriate renderings for a variety of common media.

Suggesting appropriate renderings does not guarantee compatible  
implementations amongst different UAs. The only way to do that is to  
provide a specific, detailed description of how a UA is to handle the  
markup. The way UAs handle the markup is the only thing that matters  
to authors because it does not matter how theoretically perfect your  
document is if no one can access it. Whenever an author witnessed UA
behaviour he or she didn't understand (for example, a page rendering
differently than expected in a visual browser), he or she would have to
consult the UA Spec in order to understand what the UA is doing and why
it doesn't match the author's expectations. And if authors are going to
have to read the UA Spec eventually, why do we need an Author Spec?


> [...]
>
>> The behaviour of the UAs would be defined by the "UA Behaviour  
>> Specification", not by the "HTML Author Specification". Thus  
>> anything mentioned in the Author Spec that is not precisely  
>> duplicated in the UA Spec would not be implemented by the UAs.  
>> Even if the Author Spec was just a copy of the UA Spec with  
>> commentary, that commentary could alter the interpretation of the  
>> spec in a way that would not be reflected in the UA Spec, and thus  
>> not implemented in UAs. The greater the difference between the two  
>> specs, the greater the likelihood of irreconcilable differences.
>
> The two specifications are totally different : one defines what is,
> and what is not, a valid production in the language, and ascribes
> semantics to those productions that are valid.  The other defines
> (actually I prefer the word "suggests") how valid productions are
> to be rendered, and provides /guidance/ as to how invalid productions
> might best be handled.
>
>> Which is precisely the situation we are in today, with the UAs  
>> implementing a poorly defined de facto specification that differs  
>> from the HTML4 and XHTML1 specifications provided to authors.
>
> And with whom do you lay the blame for this ?

I don't think anyone can be truly blamed for it. The specification  
attempted to use broad language and not get bogged down in  
implementation details. The implementers produced UAs that followed  
the specification as they understood it. The problem is that the
broad language of the spec could be interpreted in multiple ways, and
thus the implementations are incompatible. Everyone had good
intentions; the situation was simply unworkable.

We have the advantage of hindsight and past experience. We know what  
will likely happen if we do not rigorously specify processing
requirements in the specification, and we have the ability to avoid  
that.


>> We are not defining a new language, we are refining an existing  
>> one. To borrow your programming language metaphor: if the majority  
>> of programs written in that language define common subroutines,  
>> that indicates there is a shortcoming in the language that would  
>> be filled by adding those subroutines to the standard library. Or,  
>> to put it another way: if everyone cuts over a corner of the lawn,  
>> you would be better off paving that corner rather than posting  
>> more signs saying 'keep off the grass'.
>
> Everyone speeds (modulo epsilon, for very small epsilon); but
> speed limits tend to be reduced over time, rather than raised.
>
>> The current state of the web was created in spite of the existence
>> of HTML 4.01 Strict.  This tells us that, faced with a spec that
>> does not provide what they want, authors will choose to ignore the
>> spec in favour of non-conforming documents that work in popular
>> browsers and do do what the author wants.  Thus, if we want to see
>> the number of conforming HTML documents increase, we would be well
>> advised to adjust the spec in favour of actual practice, not
>> theoretical perfection.
>
> So you would recommend to your legislature that the legal code
> be based not on justice, ethics, equality and morality, but on
> the typical behaviour of "the man in the street".  Well, if it's
> all the same to you, I'd prefer to live in a country where
> Parliament understands that "the common man" doesn't always know
> best.

These are false analogies. The fact that wildly non-conforming  
documents are written and successfully parsed every day is not in any
way similar to people dying fiery deaths on the highways or societies  
that have crumbled into moral-less, unethical chaos. In addition, you  
are talking about *laws*. We are not writing laws. People who write  
non-conforming documents are not going to be threatened with  
imprisonment or violence.

The best we can do is to make writing conforming HTML 5 documents a  
preferable experience to writing non-conforming documents. To do  
that, we must meet the needs of authors: we must work to ensure  
compatible implementations of UAs, we must recognize the features  
authors use and include them (or include a better feature that  
provides similar functionality), and we must ensure that the author's  
understanding of the document matches closely with the way it will be  
processed by UAs.

If we do not provide a consistent, predictable and featureful  
language supported by the majority of UAs, authors will reject it.  
That is not to say that we shouldn't form the language in such a way  
that it guides authors towards semantic, accessible documents -- we  
just have to put a little sugar in with the medicine.

scott.
Received on Tuesday, 26 June 2007 06:23:26 UTC