Re: parsing URI (references) according to RFC 3986

On Mon, Jun 20, 2011 at 1:57 AM, Julian Reschke <julian.reschke@gmx.de> wrote:
> On 2011-06-20 10:47, Adam Barth wrote:
>>
>> On Mon, Jun 20, 2011 at 1:13 AM, Julian Reschke <julian.reschke@gmx.de> wrote:
>>>
>>> On 2011-06-20 10:03, Adam Barth wrote:
>>>>
>>>> Even just trivial things need to be cleaned up, like:
>>>>
>>>> http://ExAmple.CoM/
>>>
>>> What needs to be cleaned up here?
>>
>> * FF canonicalize('http://GoOgLe.CoM/') is 'http://google.com/'
>> * IE canonicalize('http://GoOgLe.CoM/') is 'http://google.com/'
>> * KR canonicalize('http://GoOgLe.CoM/') is 'http://google.com/'
>> * SA canonicalize('http://GoOgLe.CoM/') should be http://google.com/.
>> Was http://GoOgLe.CoM/.
>>
>> IE, Firefox, and Chrome convert host names to lower case.  Safari does
>> not.
>
> Yes. Why is this a problem? What does it have to do with URI/IRI parsing or
> resolution?
>
>>>> http://www.example.com/##asdf
>>>
>>> Either reject the reference as invalid, or treat this as a fragment with
>>> value "#asdf".
>>>
>>> *How* to handle fragments depends on media types, not URI parsing, so I'm
>>> not sure we should try to answer this here...
>>
>> FF canonicalize('http://www.example.com/##asdf') is
>> 'http://www.example.com/##asdf'
>> IE canonicalize('http://www.example.com/##asdf') is
>> 'http://www.example.com/##asdf'
>> KR canonicalize('http://www.example.com/##asdf') is
>> 'http://www.example.com/##asdf'
>> SA canonicalize('http://www.example.com/##asdf') should be
>> http://www.example.com/##asdf. Was http://www.example.com/#%23asdf.
>>
>> The question is whether # occurring in the fragment should be coerced
>> to be %-escaped.  My reading of the evidence here says "no."
>
> Again, what's to do here?
>
> You can observe this in the DOM, so it's a DOM issue.
>
> You can observe the *behavior* when navigating, and that's a media type
> issue.
>
>> On Mon, Jun 20, 2011 at 1:34 AM, Julian Reschke <julian.reschke@gmx.de> wrote:
>>>
>>> On 2011-06-20 10:24, Chris Weber wrote:
>>>>
>>>> 6) Handling percent-encoded values in various components
>>>
>>> Is there a *problem* related to this?
>>>
>>> I can see that the exposed DOM properties vary on how things are
>>> canonicalized, but that's a DOM issue, not a URI/IRI issue.
>>
>> You can play games about who needs to spec this stuff, but it needs to
>> be specced.  In implementations, this work is done by the URL
>> processing code, not by the DOM processing code.
>
> I'm not playing games, I'm trying to understand what needs to be done
> *here*. And I don't believe this belongs here (nor other DOM questions like
> how the port number defaulting works).

I can only repeat what I've said before.  This behavior needs to be
specced.  We can either spec it here or somewhere else.  Given that
the behavior is implemented in the URL processing code (and not the
DOM), the natural place to spec it is here (and not in the DOM specs).
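
To make the two cases quoted above concrete, here is a rough sketch of
the majority behavior (lower-case the host, leave a second "#" in the
fragment alone).  It is only an illustration built on Python's
urllib.parse, not any browser's actual code; the canonicalize() name
mirrors the test output above, and everything else about it is an
assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url: str) -> str:
    """Illustrative only: lower-case the host, keep the fragment verbatim."""
    parts = urlsplit(url)  # splits on the FIRST '#', so '##asdf' -> fragment '#asdf'

    # Separate optional userinfo from host[:port]; only the latter is lower-cased.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()

    # urlunsplit re-joins the pieces without percent-escaping the fragment.
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(canonicalize("http://GoOgLe.CoM/"))
    print(canonicalize("http://www.example.com/##asdf"))
```

Under these assumptions the first URL comes back with a lower-case
host and the second comes back unchanged, which matches the FF, IE and
KR results above.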

Adam

Received on Monday, 20 June 2011 09:03:11 UTC