Re: Updating the IRI spec to include "web addresses" from Dan Connolly on 2009-06-03 (public-iri@w3.org from June 2009)

From: Dan Connolly <connolly@w3.org>
Date: Wed, 03 Jun 2009 12:45:38 -0500
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: Larry Masinter <masinter@adobe.com>, "public-iri@w3.org" <public-iri@w3.org>
Message-Id: <1244051138.5864.35547.camel@pav.lan>

On Wed, 2009-06-03 at 19:12 +0200, Bjoern Hoehrmann wrote:
> * Dan Connolly wrote:
> >On Sun, 2009-05-31 at 10:12 -0700, Larry Masinter wrote:
> >
> >> However, “Web Address” (or “Hypertext Reference”, as has been
> >> suggested) is defined as a sequence of BYTES which in turn have a
> >> CHARACTER ENCODING which is taken from the DOCUMENT or SCRIPT in
> >> which it is embedded.
> >
> >No, it's a sequence of characters _plus_ another character encoding.
> 
> Both characterisations are inaccurate.

I base my characterization on two things: (1) the text
of the current HTML 5 draft:

"A URL has an associated URL character encoding,"
 -- http://dev.w3.org/html5/spec/Overview.html#terminology-0

and (2) some testing experience:

 http://www.w3.org/html/wg/href/elab10.html

(That test is currently broken because the .htaccess
on w3.org differs from the one on the local server where
I had it running.)

>  The character encoding is a
> property of the context where a string is processed and ultimately
> dereferenced; it is not a property of the string itself. If it were
> you would generally expect the property to be maintained e.g. when
> an element node is copied from one document to another which is not
> the case.

Could you elaborate? I'd especially appreciate a running
test case.
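
To make the question concrete, here is a minimal Python sketch of the
behavior under discussion. It assumes that the query part of a reference
is percent-encoded using the character encoding of the document in which
it is embedded. The function name encode_query and the sample string are
hypothetical, and urllib.parse.quote only stands in for what a user agent
might do; this is an illustration, not a browser test.

```python
from urllib.parse import quote


def encode_query(query: str, document_encoding: str) -> str:
    """Percent-encode a query string using the given document encoding.

    Hypothetical helper: it models the assumption that the same sequence
    of characters produces different bytes on the wire depending on the
    encoding of the embedding document.
    """
    # Keep "=" and "&" literal so the query structure stays readable.
    return quote(query, safe="=&", encoding=document_encoding)


# The same characters, as they might appear in an href attribute.
href_query = "q=caf\u00e9"

# Embedded in a UTF-8 document.
print(encode_query(href_query, "utf-8"))         # q=caf%C3%A9

# Embedded in a windows-1252 document.
print(encode_query(href_query, "windows-1252"))  # q=caf%E9
```

If the encoding were a property of the string itself, copying the
reference from one document to the other would be expected to preserve
the first result; if it is a property of the context, the second result
is what one would see after the copy.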


-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
gpg D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E

Received on Wednesday, 3 June 2009 17:45:45 UTC