Re: URI parsing

On Tue, 24 Jun 2008, Julian Reschke wrote:
> > 
> > Oh, I wasn't aware that there was an active working group maintaining 
> > the URI specs. That would make my life much easier. I'll contact the 
> > group immediately. Thanks for the heads-up.
> 
> No, there is no working group, but there is a mailing list.

Ok, well, I've mailed the list anyway.


> > The URI specs don't define anything to do with error handling. The IRI
> 
> They don't need to. Garbage in, garbage out.

Yes, well, for HTML5 we're aiming slightly higher than that and are 
defining exactly how the input garbage gets turned into output garbage.


> > specs are incompatible with legacy non-UTF-8 content. Those are the 
> > main
> 
> How so?

Non-ASCII characters in the query component are encoded using the document 
character encoding instead of UTF-8.
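
To make the mismatch concrete, here is a rough Python sketch (not anything 
from the spec; encode_query_value is a made-up helper name and the sample 
string is just an illustration). It percent-encodes the same query value 
once with a legacy document encoding and once with UTF-8:

```python
from urllib.parse import quote

def encode_query_value(value: str, document_encoding: str) -> str:
    """Percent-encode a query value using the given character encoding.

    Hypothetical helper for illustration only: safe="" means reserved
    characters are escaped as well.
    """
    return quote(value, safe="", encoding=document_encoding)

# "caf\u00e9" is "cafe" with an acute accent on the final letter.
legacy_form = encode_query_value("caf\u00e9", "iso-8859-1")  # legacy page
utf8_form = encode_query_value("caf\u00e9", "utf-8")         # IRI-style

print(legacy_form)  # caf%E9
print(utf8_form)    # caf%C3%A9
```

A server written for the ISO-8859-1 page expects the first form, so always 
sending the second would break it.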


> There is work on updating RFC3987 going on, so you may want to 
> bring that up with Martin Dürst.

Good to know. Do you know which group is working on this? Is there a 
mailing list I should contact, or should I mail Martin directly?


> >    http://www.whatwg.org/specs/web-apps/current-work/#urls
> > 
> > I try to defer to URI (3986), IRI (3987), XML Base, and IDN (3490) as 
> > much as possible.
> 
> A quick look shows that you're still referencing RFC2396.

Oops, copy/paste error. Fixed.


> Also, what's wrong with the Reference Resolution defined by RFC3986?

Nothing. Step 10 or 11 of the "resolve a URL" algorithm will just defer 
straight to it; I haven't gotten that far yet.
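
For reference, the behaviour being deferred to looks roughly like this. 
The sketch uses Python's urllib.parse.urljoin purely as a stand-in for 
RFC 3986 reference resolution (it is not the HTML5 algorithm), and the base 
URL is the one used in the examples in RFC 3986 section 5.4:

```python
from urllib.parse import urljoin

# Base URL from the reference-resolution examples in RFC 3986, section 5.4.
BASE = "http://a/b/c/d;p?q"

# A relative-path reference replaces the last segment of the base path.
sibling = urljoin(BASE, "g")

# "../" removes one further path segment before appending.
parent = urljoin(BASE, "../g")

print(sibling)  # http://a/b/c/g
print(parent)   # http://a/b/g
```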

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 24 June 2008 10:16:22 UTC