Re: HTML interpreter vs. HTML user agent from Roy T. Fielding on 2009-05-29 (public-html@w3.org from May 2009)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Thu, 28 May 2009 18:55:27 -0700
To: Maciej Stachowiak <mjs@apple.com>
Cc: HTML WG <public-html@w3.org>
Message-Id: <FC75560A-7305-4516-89A1-25864DC4B194@gbiv.com>

On May 28, 2009, at 4:46 PM, Maciej Stachowiak wrote:
> On May 28, 2009, at 12:22 PM, Roy T. Fielding wrote:
>
>> On May 28, 2009, at 3:17 AM, Maciej Stachowiak wrote:
>>> On May 28, 2009, at 12:52 AM, Roy T. Fielding wrote:
>>>
>>>>
>>>> HTML application = {HTML interpreter, HTML generator}
>>>
>>> Please see my post on this thread for why the term "HTML  
>>> interpreter" is inaccurate and grating.
>>
>> I saw it.  I suggest you do a little more research on the term,
>> because Larry is about thirty years ahead of you on that.
>> See Interlisp.
>
> First of all, this is an argument from authority.

No, it is a fact.  A piece of software that interprets the meaning
of a sequence of statements while reading those statements is an
interpreter.  The term is applicable to all languages (including
human languages) and is useful for distinguishing the very different
requirements/skills of authoring, translating, and interpreting.

*sheesh*

I am really tired of people using "argument from authority" as
an excuse for being clueless.  You are trying to change a standard
that is in use by a billion or so people, with tens of thousands
of deployed applications.  We don't have time to teach you everything
you have to know in order to not break the world.  Ultimately, you
are going to have to respect the opinions of the people who implement
the other 99% of HTML applications that are not browsers.

> Second, if you are trying to make some connection, HTML  
> implementations don't work at all the way Interlisp did, and are  
> even more certainly not required to per spec.
>
> Third, if Larry feels there is a resemblance between
>
>>
>>>> I think it is useful to distinguish them.  I don't think
>>>> this has anything to do with the user agent terminology.
>>>>
>>>> OTOH, I do agree with your underlying points.  The fact is
>>>> that different applications of HTML will have very different
>>>> operational behavior, particularly in the presence of errors,
>>>> so a specification that defines HTML in terms of operational
>>>> behavior of just one type of implementation is broken for all
>>>> applications that don't happen to be of that type.
>>>
>>> What kinds of HTML processors/implementations/whatever have  
>>> different requirements and would have different behavior in the  
>>> presence of errors?
>>
>> All of them.  CMSs, maintainers, validators, mash-ups, help systems,
>> refrigerators, switches, and televisions, just to name a few.
>
> CMSs and maintainers are content producers, not content consumers.

Wrong.  Content Management Software does both (import, gateway, etc.)
and maintenance spiders consume HTML in order to check it.

> Conformance requirements for validators are given (they are allowed  
> but not required to hard stop at parse errors, but if they perform  
> error recovery they must do so the same as a browser).

Sorry, I was talking about validators that actually validate input.

> It's not clear to me how a mash-up is an HTML processor. As I  
> understand it, most mash-ups work without any HTML parsing being  
> involved. However, if they were to process HTML, they would be  
> well-advised to do so the same way as browsers, unless they have  
> coordinated an out-of-band agreement with their data source.

See HTML stream transducers, Yahoo! Pipes, or my dissertation.

> Help systems should certainly display HTML the same way as  
> browsers; much HTML help content is tested in the browser, and it  
> is commonplace for HTML-based help systems to use a full browser  
> rendering engine.

Some do.  Some use a limited subset of HTML due to same-source
requirements (translation of documentation to multiple formats
and human languages).  Others just use HTML snippets.

> It is not obvious to me how a refrigerator, switch or television  
> would use HTML,

You've never seen a refrigerator with a built-in LCD screen
that uses hypertext to maintain shopping lists?  I have.
Most network switches are configured via a built-in browser.
This coming year will introduce a number of televisions with
built-in media engines (special purpose browsers) for Netflix,
iTunes, and similar VOD sites.  They all use some form of HTML.

> or why it would have no need to behave in accordance with the  
> existing Web ecosystem.

The existing Web ecosystem changes every month.  Most devices
don't want to deal with that much dynamism. They want a standard.
Of the above, only network switches are exposed to the needs
of general-purpose browsers.

> The only example I can think of is set-top boxes - either they can  
> display general Web content, in which case they must behave the  
> same way, or they display only a restricted set of "walled garden"  
> content in which case they can rely on their own private standards  
> for parsing and display.
>
> So at least on initial review, your list does not appear to justify  
> your assertion.

That's because you don't know the field.

> Perhaps you can give at least one specific example of a tool that  
> would need to process HTML differently in the presence of errors  
> and explain why.

Look at any of the systems I have personally been involved in
developing: wwwstat, MOMspider, libwww-perl, Apache httpd, Apache
Jackrabbit, Apache Sling, Day CQ5, ... none of which are capable of
conforming to HTML5-as-defined because the definition of conformance
is behaving like a browser.

>>>> That's why the title of the document matters.  If the title
>>>> applies to all HTML applications, then the specification must
>>>> define HTML in a way that is suitable for all applications.
>>>
>>> The spec is meant to apply to any software that generates or  
>>> processes HTML. Some requirements are scoped to particular  
>>> conformance classes. If you feel it does not do so, then please  
>>> describe how, rather than debating the title.
>>
>> I already did.
>
> Can you provide a reference please? In the emails I have seen, you  
> have only asserted this, not given an explanation or justification.

I've explained it more than enough times.  I won't do another detailed
review of the current document until the editor removes the parts
that are out of scope and otherwise violating existing standards.

....Roy
Received on Friday, 29 May 2009 01:55:54 UTC