- From: Calogero Alex Baldacchino <alex.baldacchino@email.it>
- Date: Sun, 14 Dec 2008 00:12:39 +0100
Nils Dagsson Moskopp wrote:
> On Saturday, 13.12.2008, at 19:09 +0100, Calogero Alex Baldacchino wrote:
>
>> Actually I'm not from any faction, to be honest. I think a rationale for
>> that may be "people write strange things, both in address bars and in
>> html code", thus relaxing rules when parsing an URL is meaningful; but I
>> think when resolving and recomposing a whole URI the strictest rules
>> should be applied.
>
> Accepting weird input is not a problem here, outputting is. Try writing
> a valid URI into the address bar, then get an invalid displayed.
>
> Greetings

Could you give an example, please? I wasn't able to reproduce this in IE7 or Opera 9.27 (e.g., "http://real.addressofasite.com/index.html#foo%20bar" wasn't changed into "http://real.addressofasite.com/index.html#foo bar"). Anyway, I guess you got the point. Relaxed parsing rules are for input URLs, but after parsing, normalization and/or the resolution algorithm should be applied, and the displayed URL, being absolute and complete, should conform to RFC 3986.
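To make "relaxed on input, strict on output" concrete, here is a minimal sketch in Python. It is only an illustration, not what any UA or the HTML5 draft actually does. The function name `normalize_for_display` and the choice of safe characters are my own assumptions. Existing %-escapes are left alone so that they are not encoded twice, which also means a stray "%" that is not part of an escape is left as-is.

```python
from urllib.parse import urlsplit, urlunsplit, quote

# Characters RFC 3986 allows unencoded in each component, in addition to the
# unreserved set that quote() never encodes. "%" is treated as safe so that
# existing escapes are not double-encoded (an assumption of this sketch).
SUB_DELIMS = "!$&'()*+,;="
PATH_SAFE = SUB_DELIMS + ":@/%"
QUERY_FRAGMENT_SAFE = SUB_DELIMS + ":@/?%"


def normalize_for_display(raw: str) -> str:
    """Hypothetical helper: parse leniently, recompose with strict encoding."""
    # Relaxed input: drop leading/trailing whitespace, accept raw spaces inside.
    parts = urlsplit(raw.strip())
    # Strict output: percent-encode each component separately, then recompose.
    return urlunsplit((
        parts.scheme,
        parts.netloc,
        quote(parts.path, safe=PATH_SAFE),
        quote(parts.query, safe=QUERY_FRAGMENT_SAFE),
        quote(parts.fragment, safe=QUERY_FRAGMENT_SAFE),
    ))


if __name__ == "__main__":
    print(normalize_for_display("http://real.addressofasite.com/index.html#foo bar"))
```

With this sketch, an input carrying a raw space in the fragment comes out with "%20", and an input that is already encoded comes out unchanged, so copying the result from one UA into another would give the same string.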
The current resolution algorithm (section 2.5.3 of the HTML5 spec) does not mention fragment identifiers explicitly. Although its 10th step says "Apply any relevant conformance criteria of RFC 3986 and RFC 3987, returning an error and aborting these steps if appropriate.", step 9 says "Apply the algorithm described in RFC 3986 section 5.2 Relative Resolution, using url as the potentially relative URI reference (R), and base as the base URI (Base)". AIUI, the algorithm described in section 5.2 of RFC 3986 might be applied to each component of a URI without building a complete URI (instead leaving each part separate and held as a property of an object; a component recomposition algorithm is defined in section 5.3 of RFC 3986, but that's not a 'must'). When a single component of a URI is handled, RFC 3986 does not require %-encoding as a 'must'; hence the freedom of interpretation and the different behaviors in different UAs, leading to inconsistent results when copying a URL from one UA and pasting it into another.

I think a uniform behaviour should be defined as standard (and implemented!) instead. The concern you raised about copy&paste perhaps leads to a further issue: how line breaks should be handled by the parsing rules (e.g. stripped, like leading and trailing characters).

Regards,
Alex
Received on Saturday, 13 December 2008 15:12:39 UTC