Re: The 'javascript' scheme from Graham Klyne on 2006-11-08 (uri@w3.org from November 2006)

From: Graham Klyne <GK@ninebynine.org>
Date: Wed, 08 Nov 2006 18:39:48 +0000
To: Bjoern Hoehrmann <derhoermi@gmx.net>
CC: uri@w3.org, uri-review@ietf.org
Message-ID: <45522474.2060009@ninebynine.org>
Bjoern Hoehrmann wrote:
> * Graham Klyne wrote:
>> What is the "resource" that is identified by a javascript: URI?  If you mean the
>> javascript code, then how is the result of dereferencing the URI [1] (executing
>> the code) to be considered a "representation" of that resource?  In this
>> respect, I find the comparison with data: URIs to be misleading, since the
>> resource identified by a data: URI is clearly represented in some way by its body.
> 
> It may well be that the draft doesn't use the best available terminology
> and I am happy to change it if someone can come up with better terms. It
> currently does not say what resource is being identified, instead it de-
> fines two forms of "URI dereference". I think that executing code is not
> "the result" of dereference, but part of it.

I see my wording was subject to reasonable misinterpretation:  what I meant was:
[[
... then how is the result of dereferencing the URI [1] (i.e. the result of
executing the code) to be considered a "representation" of that resource?
]]

> I further think that dereference, as defined in RFC 3986, does not imply
> retrieval of a resource/representation--and as such I don't think your
> second question is valid.

This gets closer to the nub.  While I agree that it does not imply "retrieval",
I do think that obtaining a "representation" is implied, and it is the nature of
this representation -- what is it a representation *of* -- that I am questioning.

>   URI "resolution" is the process of determining an access
>   mechanism and the appropriate parameters necessary to
>   dereference a URI; this resolution may require several
>   iterations.  To use that access mechanism to perform an
>   action on the URI's resource is to "dereference" the URI.
>
> In the "resolution" process you determine whether you use only content
> retrieval or in-context evaluation, and in case of the latter other
> parameters like an ECMAScript global object, what type of code you have to
> bind the `this` object, etc. Then you dereference the identifier as the
> draft requires for the access method of your choosing.

RFC 3986 carefully separates "resolution" from "dereference".  I believe that
this is because there are several ways that one can manipulate the resource
identified by a URI, one of which is to obtain a representation of its state
("dereference", corresponding to an HTTP GET operation).  Some other
possibilities correspond to HTTP PUT and POST operations.

Implicit in my concern is that the treatment of javascript: URIs in a browser
corresponds to HTTP GET operations.  That is, if you type an HTTP URI into
a browser address bar, the browser does a GET.  If I do the same with a
javascript: URI-like string, the javascript code is executed and the result is
rendered (unless there are side effects, in which case something else may
happen).  For example, try typing this into a browser address bar:

  javascript:"<h1>hello world</h1>"

or even:

  javascript:"<a href='javascript:%2522hello world%2522'>say hello world</a>"
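
To illustrate why the nested link is written with %2522 rather than %22, here
is a minimal sketch.  It assumes (this is an assumption about browser
behaviour, not something stated in the draft text quoted here) that the part
after "javascript:" is percent-decoded once each time such a URI is
dereferenced, so the inner quotes must survive two decoding passes.  The
variable names are illustrative only.

```javascript
// The href value as it appears inside the outer javascript: URI.
const inner = "javascript:%2522hello world%2522";

// First decoding pass, assumed to happen when the outer URI is dereferenced:
// each %25 becomes "%", which leaves a fresh %22 escape behind.
const afterOuter = decodeURIComponent(inner);

// Second decoding pass, assumed to happen when the generated link is followed:
// each %22 becomes a double quote, giving a valid string literal.
const afterInner = decodeURIComponent(afterOuter);

console.log(afterOuter); // javascript:%22hello world%22
console.log(afterInner); // javascript:"hello world"
```

Had the inner URI used %22 directly, the first pass would already have
produced double quotes inside the outer double-quoted string literal.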

Side effects further complicate the issue.

Therefore, I assert that according to my Firefox browser, the result of
*dereferencing*
  javascript:"<h1>hello world</h1>"
is the string
  "<h1>hello world</h1>"

What I'm trying to do here is establish that common uses of javascript:
URI-alike things correspond to URI dereferencing.  Anything else would, I
think, violate the principle of least surprise -- when I type a URI into a
browser address bar, or click on a hyperlink, I think it's commonly held that
the browser performs a dereferencing operation on the URI.
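
The address-bar behaviour described above can be modelled, very roughly, as
follows.  This is only a sketch under stated assumptions: the function name
dereferenceJavascriptUri is hypothetical, the single percent-decoding pass and
the "render only string results" rule are assumptions about typical browser
behaviour, and no browser context (global object, side effects) is modelled.

```javascript
// Hypothetical helper, not taken from the draft: model "dereference" of a
// javascript: URI as decode, evaluate, and hand back any string result.
function dereferenceJavascriptUri(uri) {
  const scheme = "javascript:";
  if (!uri.toLowerCase().startsWith(scheme)) {
    throw new Error("not a javascript: URI");
  }
  // Assumption: the scheme-specific part is percent-decoded exactly once.
  const source = decodeURIComponent(uri.slice(scheme.length));
  // Indirect eval evaluates the source in the global scope.
  const value = (0, eval)(source);
  // Assumption: only a string result is treated as content to render.
  return typeof value === "string" ? value : undefined;
}

const rendered = dereferenceJavascriptUri('javascript:"<h1>hello world</h1>"');
console.log(rendered); // <h1>hello world</h1>
```

Under this model the "representation" is whatever string the evaluation
happens to return, which is exactly what makes it hard to say what resource
it represents.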

So far, the examples make javascript: look a bit like the data: URI scheme, but
what about:
  javascript:2+2+"2"
For which Firefox yields the string:
  42
Of what resource is "42" here a representation?
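
For reference, the "42" arises from ordinary left-to-right evaluation of the
+ operator with string coercion, which the following short sketch spells out
(the variable names are illustrative only):

```javascript
// + is left-associative, so 2 + 2 is evaluated first as numeric addition.
const sum = 2 + 2; // 4 (a number)

// Adding a string operand converts the number to a string and concatenates.
const result = sum + "2";

console.log(result);        // 42
console.log(typeof result); // string
```

The result is therefore a computed string, not something carried literally in
the URI as it would be with data:.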


> Note again that there is nothing in the draft that says
>
>   <script src='javascript:...' ...
>
> is any different from
>
>   <script src='data:...' ...
>
> It may be, and often is, but in some cases, like X3D and VRML, it is
> not. If that wasn't the case, it would have been possible to say that
> a javascript:... resource identifier identifies a JavaScript object
> or nothing, and that to determine what it identifies, you have to per-
> form in-context evaluation. But as this would contradict ISO standards
> and probably cause similar confusion, I have not attempted to define
> what resource is being identified, and the "result" of the evaluation
> is termed "dereference by-product". As I said, there may be better ways
> to phrase things; suggestions welcome.

I'm not sure where ISO standards come into this.  But the point you raise is, I
think, exactly why I have this concern about regarding javascript: as a URI
(rather than just a URI-like string that can be used in certain contexts where a
URI is otherwise expected).

The problem here is that URIs are a cornerstone of World Wide Web architecture,
and the use of URIs to denote resources is somewhat baked in to the way the Web
works.  In principle, any URI can be used in any context on the web, and serve
broadly the same purpose - it identifies a "resource".

cf.
  http://www.w3.org/TR/webarch/#identification
and
  http://www.w3.org/TR/webarch/#uri-opacity

The purpose of my original post, then, was to try and understand how the use of
javascript: as a URI can be interpreted in a way that is consistent with WWW
architecture.  (I'm not claiming it can't be done, I'm just not sure if and how
it can be done, and if it can be done then this would helpfully be clear from
the specification.)

Another approach, suggested by Julian, would be to acknowledge that javascript:
is not really a URI, but that it can occasionally be used in situations where a
URI is expected.  In some respects, the text of your proposal seems to say
something like this in its introductory text ("the 'javascript' resource
identifier scheme for applications that need to specify script code in contexts
where resource identifiers are expected").  But to do this would be to
acknowledge that there may be URI-like things that are not truly URIs in the Web
sense, which itself may be a source of confusion.

#g

-- 
Graham Klyne
For email:
http://www.ninebynine.org/#Contact
Received on Wednesday, 8 November 2006 21:37:04 UTC