Re: HTML interpreter vs. HTML user agent from Maciej Stachowiak on 2009-05-29 (public-html@w3.org from May 2009)

From: Maciej Stachowiak <mjs@apple.com>
Date: Thu, 28 May 2009 19:33:49 -0700
To: "Roy T. Fielding" <fielding@gbiv.com>
Cc: HTML WG <public-html@w3.org>
Message-id: <97DF89ED-101A-4C74-8D58-5915A3C80D52@apple.com>

On May 28, 2009, at 6:55 PM, Roy T. Fielding wrote:

> On May 28, 2009, at 4:46 PM, Maciej Stachowiak wrote:
>> On May 28, 2009, at 12:22 PM, Roy T. Fielding wrote:
>>
>>> On May 28, 2009, at 3:17 AM, Maciej Stachowiak wrote:
>>>> On May 28, 2009, at 12:52 AM, Roy T. Fielding wrote:
>>>>
>>>>>
>>>>> HTML application = {HTML interpreter, HTML generator}
>>>>
>>>> Please see my post on this thread for why the term "HTML  
>>>> interpreter" is inaccurate and grating.
>>>
>>> I saw it.  I suggest you do a little more research on the term,
>>> because Larry is about thirty years ahead of you on that.
>>> See Interlisp.
>>
>> First of all, this is an argument from authority.
>
> No, it is a fact.  A piece of software that interprets the meaning
> of a sequence of statements while reading those statements is an
> interpreter.  The term is applicable to all languages (including
> human languages) and is useful for distinguishing the very different
> requirements/skills of authoring, translating, and interpreting.

Now you're making a different argument. But it's wrong. That is not
what "interpreter" means in the field of computing. This Wikipedia
article is a good explanation:
<http://en.wikipedia.org/wiki/Interpreter_(computing)>.

There are other meanings of "interpreter" that apply outside the field  
of computing: <http://dictionary.reference.com/browse/interpreter>.  
But they don't really apply here. "Router" and "bridge" also have
non-computing meanings, but applying those meanings by analogy is just
going to be confusing.

> *sheesh*
>
> I am really tired of people using "argument from authority" as
> an excuse for being clueless.

It's not an excuse for being clueless. It's pointing out that you can't
just say "that's the way it is because I have 50 billion years of
experience building intergalactic flux capacitor widgets". You have to
actually explain why. As a recognized authority you may find this  
tiresome, but in a standards body we have to use logically valid  
reasoning. No kings or presidents, etc etc.

> You are trying to change a standard
> that is in use by a billion or so people, with tens of thousands
> of deployed applications.  We don't have time to teach you everything
> you have to know in order to not break the world.  Ultimately, you
> are going to have to respect the opinions of the people who implement
> the other 99% of HTML applications that are not browsers.

I'm still waiting for concrete examples from you of how the spec  
"breaks the world". All I see from you is a bunch of bluster and  
claims that you are a big shot expert so you don't need to justify  
yourself, and anyone who disagrees with you must be ignorant.

>>>>
>>>> What kinds of HTML processors/implementations/whatever have  
>>>> different requirements and would have different behavior in the  
>>>> presence of errors?
>>>
>>> All of them.  CMSs, maintainers, validators, mash-ups, help systems,
>>> refrigerators, switches, and televisions, just to name a few.
>>
>> CMSs and maintainers are content producers, not content consumers.
>
> Wrong.  Content Management Software does both (import, gateway, etc.)
> and maintenance spiders consume HTML in order to check it.

And why would their import functions require different error handling  
rules?

>
>> Conformance requirements for validators are given (they are allowed  
>> but not required to hard stop at parse errors, but if they perform  
>> error recovery they must do so the same as a browser).
>
> Sorry, I was talking about validators that actually validate input.

I guess you're saying that HTML5 validators don't validate input? That  
seems like an insult, not an argument.

>
>> It's not clear to me how a mash-up is an HTML processor. As I  
>> understand it, most mash-ups work without any HTML parsing being  
>> involved. However, if they were to process HTML, they would be well- 
>> advised to do so the same way as browsers, unless they have  
>> coordinated an out-of-band agreement with their data source.
>
> See HTML stream transducers, Yahoo! Pipes, or my dissertation.

I see. Seems like these would clearly need browser-compatible error  
handling, if they want to work with arbitrary HTML documents on the Web.

>
>> Help systems should certainly display HTML the same way as  
>> browsers; much HTML help content is tested in the browser, and it  
>> is commonplace for HTML-based help systems to use a full browser  
>> rendering engine.
>
> Some do.  Some use a limited subset of HTML due to same-source
> requirements (translation of documentation to multiple formats
> and human languages).  Others just use HTML snippets.

As far as I can tell, nothing in the HTML 5 spec prevents you from  
defining your own subset, or a snippet format. But if you can only  
process subsets or snippets then naturally you are not a conforming  
implementation.

>
>> It is not obvious to me how a refrigerator, switch or television  
>> would use HTML,
>
> You've never seen a refrigerator with a built-in LCD screen
> that uses hypertext to maintain shopping lists?  I have.

If this is all inside a closed world, it's a walled garden. If it can  
consume content from other sources, or produces content for other  
sources, then it needs to interoperate.

> Most network switches are configured via a built-in browser.

Really? The ones I have seen that use the Web for configuration have a  
built-in Web server that is configured using the administrator's  
browser of choice. So (a) they are content producers not consumers and  
(b) they need to interoperate with browsers.

> This coming year will introduce a number of televisions with
> built-in media engines (special purpose browsers) for Netflix,
> iTunes, and similar VOD sites.  They all use some form of HTML.

If they can only handle a defined subset inside their walled garden,  
then they are not conforming implementations. If they are

>> or why it would have no need to behave in accordance with the  
>> existing Web ecosystem.
>
> The existing Web ecosystem changes every month.  Most devices
> don't want to deal with that much dynamism. They want a standard.
> Of the above, only network switches are exposed to the needs
> of general-purpose browsers.

Most systems I am familiar with that use specialized HTML content also  
want to enable creating that content using standard tools, and testing  
it in an ordinary Web browser. Thus, they need to interoperate.  
However, any system that wanted to use a restricted subset would have  
to create its own specification for the subset, because there is no  
way a general spec can define your subset for you.

>> The only example I can think of is set-top boxes - either they can  
>> display general Web content, in which case they must behave the  
>> same way, or they display only a restricted set of "walled garden"  
>> content in which case they can rely on their own private standards  
>> for parsing and display.
>>
>> So at least on initial review, your list does not appear to justify  
>> your assertion.
>
> That's because you don't know the field.

Insults are not a substitute for reasoning.

>
>> Perhaps you can give at least one specific example of a tool that  
>> would need to process HTML differently in the presence of errors  
>> and explain why.
>
> Look at any of the systems I have personally been involved in
> developing: wwwstat, MOMspider, libwww-perl, Apache httpd, Apache
> Jackrabbit, Apache Sling, Day CQ5, ... none of which are capable of
> conforming to HTML5-as-defined because the definition of conformance
> is behaving like a browser.

Can you give a concrete example of how any of these tools, in the  
course of processing or producing HTML, cannot conform to HTML5? Just
one conformance criterion, and a concrete example of why a tool is  
unable to follow it.

For some of these pieces of software, I am surprised to hear they  
would process HTML at all. I can't see why libwww-perl would do so, for
instance, and I couldn't find code that does so. Examining the source
code, it seems to do exclusively HTTP things.

>
>>>>> That's why the title of the document matters.  If the title
>>>>> applies to all HTML applications, then the specification must
>>>>> define HTML in a way that is suitable for all applications.
>>>>
>>>> The spec is meant to apply to any software that generates or  
>>>> processes HTML. Some requirements are scoped to particular  
>>>> conformance classes. If you feel it does not do so, then please  
>>>> describe how, rather than debating the title.
>>>
>>> I already did.
>>
>> Can you provide a reference please? In the emails I have seen, you  
>> have only asserted this, not given an explanation or justification.
>
> I've explained it more than enough times.  I won't do another detailed
> review of the current document until the editor removes the parts
> that are out of scope and otherwise violating existing standards.

I haven't seen any detailed reviews from you, just summary dismissal.  
If you don't want to contribute constructively to improving the spec,  
then that is your choice.

Regards,
Maciej
Received on Friday, 29 May 2009 03:10:20 UTC