Re: 4.13: URI decomposition - non-standard terminology

Ian Hickson wrote:

 [Testing <http://whatwg.org/html5> as proposed, the colour codes
  are nice, get the green or the blue draft, not the red draft...]
 
> The term "hostport" in HTML5 isn't intended to be the old RFC2396 
> terminology, it's meant to be a special term defined just in HTML5
> for the purposes of defining the legacy "host" DOM attribute

Okay, 2.3.5 in the green draft is slightly different from 4.13 in
the blue draft.  The OP talked about 4.13, that was what I looked
at, and it explicitly mentions RFCs 3986 / 3987, above the table
with RFC 3986 terms in the second column (incl. <host> + <port>).

> It's unclear to me what you have in mind.

The OP asked for standard terminology, mentioning "protocol" as an
example.  And I tried to figure out what he had in mind after you
had explained that "protocol" is a traditional name that can't be
changed.  If your <hostport> were different from the RFC 2396
<hostport>, this would be confusing.  I fear it is: you are talking
about <ihost> ":" <port>, aren't you?  Ditto <ihost> vs. <host>.
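
To make the terminology question concrete, here is a minimal sketch of
how a generic URI parser separates the pieces.  It uses Python's
urllib.parse, which is not what the draft specifies; the example URL
and the hostport() helper are assumptions for illustration only.

```python
from urllib.parse import urlsplit

def hostport(url: str) -> str:
    """Join host and port as host ":" port (hypothetical helper).

    The port is omitted when the URL does not carry one.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is None:
        return host
    return f"{host}:{parts.port}"

parts = urlsplit("http://example.com:8080/path")
print(parts.scheme)    # what the DOM calls "protocol", minus the colon
print(parts.hostname)  # the <host> component
print(parts.port)      # the <port> component, as an integer
print(hostport("http://example.com:8080/path"))
```

The point is only that <host> and <port> are separate components, and
any combined "hostport" term is a convention layered on top of them.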

> It's not clear to me how you tested this.

By clicking on the quoted URLs, http://example.com:0x50/ reports an
error (displaying http://example.com:50/), and the same test with
0x80 reaches http://example.com (at port 80).  That matches what
you specify in 2.3.5, "ignore all non-digits in <port>".

Similarly, I tested http://example.com:/ and arrived at example.com
(port 80).  Your draft says port would be set to 0, apparently.  Is
the 0 only a trick to indicate an erroneous port for the purposes
of chapter 2.3.5 ?

| Remove any characters in the new value that are not in the range
| U+0030 DIGIT ZERO .. U+0039 DIGIT NINE. If the resulting string
| is empty, set it to a single U+0030 DIGIT ZERO character ('0').
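
The quoted rule can be sketched as follows.  This is only my reading
of the three quoted lines, not the draft's algorithm as a whole; the
function name sanitize_port is an assumption.

```python
def sanitize_port(value: str) -> str:
    """Keep only U+0030..U+0039; fall back to "0" if nothing is left."""
    digits = "".join(ch for ch in value if "0" <= ch <= "9")
    return digits if digits else "0"

# The "x" is dropped, so "0x50" leaves the digits "050".
print(sanitize_port("0x50"))
# An empty port, as in http://example.com:/, becomes "0".
print(sanitize_port(""))
# Read as a decimal number, "050" is port 50, matching the test above.
print(int(sanitize_port("0x50")))
```

Note that the rule as quoted yields a string; interpreting "050" as
the number 50 is a separate step that the quoted text does not cover.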

> I recommend testing with a modern browser

I like old browsers, they enjoy security by obscurity, I know their
bugs (after some years), they are smaller and faster than "popular"
monsters wanting me to get a modern OS with modern hardware for
the purpose of watching modern ads.  I figured out how stuff works,
changing browsers is almost as bad as changing text editors.  

> The "latest published version" is a W3C anachronism

Folks on this list are not necessarily supposed to know that newer
drafts exist... :-)  June 10 is only a month old.  Discussing the
daily snapshots is "WG work" for the experts, who know precisely
which of the 977 KB talks about what, and why, and since when.

 Frank 

Received on Saturday, 12 July 2008 17:09:24 UTC