Re: HTML interpreter vs. HTML user agent

On May 26, 2009, at 12:26 PM, Larry Masinter wrote:

>> By "HTML interpreter" do you mean "HTML user agent"?
>
> The phrase "user agent" was introduced and used in the
> literature to mean software used by, well,
> users: humans or agents representing humans.

Specifically, it is any software acting as the agent of a
user in initiating actions.  It came from MIME.

> The "User-Agent" header was introduced into the HTTP
> protocol at a time when HTTP was thought only to
> involve clients (that were user agents) and servers
> (that were not user agents.)
>
> When spiders, search engines, proxies, and other
> web intermediaries got added to the web architecture,
> the header "User-Agent" remained, even when some of
> those agents are not operating directly in service
> of a human end user.

No, spiders and search engines are every bit as much
user agents as browsers are.  Proxies are not user agents
because they do not initiate actions (they perform them
on behalf of some user agent).

> For HTML, it seems useful to distinguish an
> HTML "User Agent" -- software that interprets HTML
> in service of an individual agent -- from other
> applications that need to read and interpret HTML,
> such as search engine analyzers, translation tools,
> etc.
>
> In particular, there are frequently different
> security requirements for "User Agents" vs. other
> autonomous tools. For HTTP the distinction can be
> made by making reference to the "client".
>
> "HTML interpreter" seems like a more general
> term that would include HTML user agents but also
> other kinds of agents. "HTML processor" might work,
> but an "XML processor" doesn't do any of the
> semantic interpretation.

HTML application = {HTML interpreter, HTML generator}

I think it is useful to distinguish them.  I don't think
this has anything to do with the user agent terminology.

OTOH, I do agree with your underlying points.  The fact is
that different applications of HTML will have very different
operational behavior, particularly in the presence of errors,
so a specification that defines HTML in terms of operational
behavior of just one type of implementation is broken for all
applications that don't happen to be of that type.

That's why the title of the document matters.  If the title
applies to all HTML applications, then the specification must
define HTML in a way that is suitable for all applications.
If the title applies only to browsers, then the applications
that are not browsers are not bound by its requirements
beyond their need to interoperate with browsers.  Consensus
here would then be reduced to a more tractable problem.

Hence, there should be one specification of the language
in terms of syntax and declarative semantics (that applies
to all applications) and separate specifications of behavior
during application-specific processing of HTML.

....Roy

Received on Thursday, 28 May 2009 07:52:40 UTC