Re: URI parsing

Ian Hickson wrote:
> On Tue, 24 Jun 2008, Julian Reschke wrote:
>>> Oh, I wasn't aware that there was an active working group maintaining 
>>> the URI specs. That would make my life much easier. I'll contact the 
>>> group immediately. Thanks for the heads-up.
>> No, there is no working group, but there is a mailing list.
> 
> Ok, well, I've mailed the list anyway.

That's the right thing to do.

>>> The URI specs don't define anything to do with error handling. The IRI
>> They don't need to. Garbage in, garbage out.
> 
> Yes, well, for HTML5 we're aiming slightly higher than that and are 
> defining exactly how the input garbage gets turned into output garbage.

That's fine, but I don't agree that this is something the URI specs 
should have defined in the first place.

>>> specs are incompatible with legacy non-UTF-8 content. Those are the 
>>> main
>> How so?
> 
> Non-ASCII characters in the query component are encoded using the document 
> character encoding instead of UTF-8.

How is that a problem with respect to URI/IRI? Even if the character 
encoding is a different one, the result is still a legal URI, thus a 
legal IRI.
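
To illustrate the point, here is a minimal sketch in Python (not taken 
from any spec or browser). It assumes the query value is percent-encoded 
after being converted to bytes in the chosen encoding. The helper names 
and the sample character "é" are hypothetical, picked only for the 
example, and the regular expression is a simplified rendering of the 
RFC 3986 query production.

```python
import re
from urllib.parse import quote

# Simplified check for the RFC 3986 "query" production:
# unreserved / sub-delims / ":" / "@" / "/" / "?" / pct-encoded.
QUERY_RE = re.compile(r"^(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*$")


def encode_query_value(value: str, charset: str) -> str:
    """Percent-encode a query value using the given character encoding.

    Hypothetical helper: the value is converted to bytes in `charset`,
    and every byte outside the unreserved set is then escaped.
    """
    return quote(value, safe="", encoding=charset)


def is_legal_query(query: str) -> bool:
    """Return True if `query` matches the simplified query syntax above."""
    return QUERY_RE.match(query) is not None


# The same character, encoded per UTF-8 and per a legacy document encoding.
utf8_form = encode_query_value("é", "utf-8")
legacy_form = encode_query_value("é", "iso-8859-1")

print(utf8_form, is_legal_query(utf8_form))      # %C3%A9 True
print(legacy_form, is_legal_query(legacy_form))  # %E9 True
```

Both results are syntactically legal. What differs is which bytes the 
escapes stand for, and nothing in the URI itself says which encoding 
was used.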

>> There is work on updating RFC3987 going on, so you may want to 
>> bring that up with Martin Dürst.
> 
> Good to know. Do you know which group is working on this? Is there a 
> mailing list I should contact, or should I mail Martin directly?

I think the URI mailing list is the best place.

> ...

BR, Julian

Received on Tuesday, 24 June 2008 11:09:01 UTC