- From: Frank Ellermann <nobody@xyzzy.claranet.de>
- Date: Wed, 25 Jun 2008 21:31:43 +0200
- To: uri@w3.org

Ian Hickson wrote:

> could you quote the bits that are nonsensical?

With difficulties, the memo needs ages to load over a V.90
line, and then ages to run some scripts, until my browser
asks me if I want to abort whatever it is:

| A URL is a valid URL if at least one of the following
| conditions holds:
| * The URL is a valid URI reference [RFC3986].

Period, end of story, see STD 66.

| The URL is a valid IRI reference and it has no query component.
| [RFC3987]

Nope, that's an IRI, not an URL (matched in bullet 1).

| The URL is a valid IRI reference and its query component
| contains no unescaped non-ASCII characters. [RFC3987]

That's also an IRI, not an URL (matched in bullet 1). There
is also nothing special with query parts using unescaped
characters, at least not in RFC 3987.

| The URL is a valid IRI reference and the character encoding
| of the URL's Document is UTF-8. [RFC3987]

That's also an IRI, not an URL (matched in bullet 1). There
is nothing special about UTF-8 IRIs, this only accelerates
"transform to UTF-8" in an IRI-to-URI conversion.

> Actually we're trying to not reinvent the Web, but to
> document it, so that browser vendors can write browsers
> that handle existing Web content in a fashion compatible
> with legacy UAs without reverse-engineering each other.

2.3 claims to define the term URL. This term is defined in
STD 66. If you want to define something else, e.g., a BURL
(broken URL), or PURL (pseudo-URL), please pick a new term -
but not BURL or PURL, they are already in use for other
purposes. Maybe use "IRL", the IRI spec. doesn't use it.

Apparently what you really want is a new variant of IRI, with
special rules for <iquery> parts in non-UTF-8 documents.

> It's true that this is requiring defining things that are
> at odds with existing specifications, but that's mostly
> because those specifications aren't in fact in line with
> real usage.

"Real usage" is not only what numerous broken Web pages do,
or what a few browsers guess.
Broken URLs have caused real damage last year:

http://www.microsoft.com/technet/security/advisory/943521.mspx
http://www.heise-security.co.uk/news/97878

> I make no judgement as to whether that's a good thing or
> not, that doesn't much matter to me.

Of course you judge things. E.g. you judge <i> and <b> as
worth keeping, and you judge <s> and <tt> as worth killing,
and from my POV that is wrong.

Allowing them all as short and semantically equivalent to
corresponding longer tags would be nice for users forced to
type tags in contexts such as Wikis and comment forms, <s>
would be even better than <del> for old browsers, and some
tools don't support say <samp>, but permit <tt>.

Just an example - I know that the semantic cabal fights
about any comma in what they consider as "presentational".

 Frank
Received on Wednesday, 25 June 2008 19:30:49 UTC