Re: HTML interpreter vs. HTML user agent from Maciej Stachowiak on 2009-05-28 (public-html@w3.org from May 2009)

From: Maciej Stachowiak <mjs@apple.com>
Date: Thu, 28 May 2009 16:46:55 -0700
To: "Roy T. Fielding" <fielding@gbiv.com>
Cc: HTML WG <public-html@w3.org>
Message-id: <F45D8263-73E1-4074-8F7C-AC91C0A38B1F@apple.com>

On May 28, 2009, at 12:22 PM, Roy T. Fielding wrote:

> On May 28, 2009, at 3:17 AM, Maciej Stachowiak wrote:
>> On May 28, 2009, at 12:52 AM, Roy T. Fielding wrote:
>>
>>>
>>> HTML application = {HTML interpreter, HTML generator}
>>
>> Please see my post on this thread for why the term "HTML  
>> interpreter" is inaccurate and grating.
>
> I saw it.  I suggest you do a little more research on the term,
> because Larry is about thirty years ahead of you on that.
> See Interlisp.

First of all, this is an argument from authority.

Second, if you are trying to make some connection, HTML
implementations don't work at all the way Interlisp did, and they are
even more certainly not required to work that way by the spec.

Third, if Larry feels there is a resemblance between

>
>>> I think it is useful to distinguish them.  I don't think
>>> this has anything to do with the user agent terminology.
>>>
>>> OTOH, I do agree with your underlying points.  The fact is
>>> that different applications of HTML will have very different
>>> operational behavior, particularly in the presence of errors,
>>> so a specification that defines HTML in terms of operational
>>> behavior of just one type of implementation is broken for all
>>> applications that don't happen to be of that type.
>>
>> What kinds of HTML processors/implementations/whatever have  
>> different requirements and would have different behavior in the  
>> presence of errors?
>
> All of them.  CMSs, maintainers, validators, mash-ups, help systems,
> refrigerators, switches, and televisions, just to name a few.

CMSs and maintainers are content producers, not content consumers.

Conformance requirements for validators are given (they are allowed  
but not required to hard stop at parse errors, but if they perform  
error recovery they must do so the same as a browser).

It's not clear to me how a mash-up is an HTML processor. As I  
understand it, most mash-ups work without any HTML parsing being  
involved. However, if they were to process HTML, they would be
well-advised to do so the same way as browsers, unless they have
coordinated an out-of-band agreement with their data source.

Help systems should certainly display HTML the same way as browsers;  
much HTML help content is tested in the browser, and it is commonplace  
for HTML-based help systems to use a full browser rendering engine.

It is not obvious to me how a refrigerator, switch or television would  
use HTML, or why it would have no need to behave in accordance with  
the existing Web ecosystem. The only example I can think of is set-top  
boxes - either they can display general Web content, in which case  
they must behave the same way, or they display only a restricted set  
of "walled garden" content in which case they can rely on their own  
private standards for parsing and display.

So at least on initial review, your list does not appear to justify  
your assertion.

Perhaps you can give at least one specific example of a tool that  
would need to process HTML differently in the presence of errors and  
explain why.


>
>>> That's why the title of the document matters.  If the title
>>> applies to all HTML applications, then the specification must
>>> define HTML in a way that is suitable for all applications.
>>
>> The spec is meant to apply to any software that generates or  
>> processes HTML. Some requirements are scoped to particular  
>> conformance classes. If you feel it does not do so, then please  
>> describe how, rather than debating the title.
>
> I already did.

Can you provide a reference please? In the emails I have seen, you  
have only asserted this, not given an explanation or justification.

>
>>> If the title applies only to browsers, then the applications
>>> that are not browsers are not bound by its requirements
>>> beyond their need to interoperate with browsers.  Consensus
>>> here would then be reduced to a more tractable problem.
>>
>> The whole point of the spec is to enable interoperability,  
>> including among different classes of implementations. I can't think  
>> of any piece of useful HTML-processing software that has no need to  
>> interoperate with either browsers or existing deployed HTML content  
>> on the Web,  at the very least through some chain of indirection. I  
>> suppose you can imagine completely closed ecosystems of HTML where  
>> the content is never authored using a standard tool, or consumed by  
>> a standard browser, search engine, mail client, assistive  
>> technology, etc. But I don't think it is worth it to make a  
>> separate spec to cater to such walled garden use cases. Inside a  
>> walled garden you can do whatever you want, regardless of specs.
>
> Straw man.

I'm trying to imagine what kind of client you might have in mind that  
doesn't need to interoperate with browsers, even indirectly. If this  
is not the kind of thing you had in mind, then please give a concrete  
example.

>
>>> Hence, there should be one specification of the language
>>> in terms of syntax and declarative semantics (that applies
>>> to all applications) and separate specifications of behavior
>>> during application-specific processing of HTML.
>>
>> I continue to disagree, and I once again point out that things are  
>> not generally done this way for other languages, even other W3C  
>> languages.
>
> My bookshelf disagrees with your opinion.  Even the compiled languages,
> which have very limited application-specific behavior, are always split
> into separate sections for tutorial, reference (complete syntax/semantics),
> and application-specific behavior (e.g., differences between architectures
> that have 16, 32, or 64bit words).  Note that these are books, not
> standards specs.  The standards are just the reference section.

We are writing a standard spec, not a general-purpose technical book.  
I checked the actual standards for some popular languages (C, Java,
C++, ECMAScript, SVG, XForms) and they certainly cannot be said to
describe "syntax and declarative semantics" in a separate application.
They all define semantics in terms of operational processing  
requirements. Even languages defined with a very high degree of  
abstract formalism and mathematical rigor often use formal operational  
semantics rather than denotational semantics or other more declarative  
formalisms.

Regards,
Maciej
Received on Thursday, 28 May 2009 23:47:35 UTC