
Re: URIs in HTML5 and issues arising

From: Ian Hickson <ian@hixie.ch>
Date: Sun, 29 Jun 2008 21:20:03 +0000 (UTC)
To: Julian Reschke <julian.reschke@gmx.de>
Cc: uri@w3.org, HTML WG <public-html@w3.org>
Message-ID: <Pine.LNX.4.62.0806292107190.13974@hixie.dreamhostps.com>

On Sun, 29 Jun 2008, Julian Reschke wrote:
> Ian Hickson wrote:
> > > Fair enough.  Use "HTML URL" a few times, then, particularly in the 
> > > context of the definition of validity.
> > 
> > It was pointed out that "HTML URL" would also be misleading, since 
> > there are already spec writers looking to use these definitions 
> > elsewhere.
> Not sure why this means it can't be called "HTML URL".

Because it would be even more confusing to have non-HTML specs talk about 
their URLs being HTML URLs.

> > > Interesting. If so that's a flat-out browser bug and should be 
> > > fixed.
> > 
> > That's nice in theory, but content depends on this behaviour now.
> How much? It would be nice to make this decision based on reliable 
> information, because it's an expensive one for the future.

Philip has already posted numbers, and I cited them in the e-mail to which 
you replied. If you're not going to do research yourself, the least you 
could do is read the e-mails to which you are replying completely before 
asking that other people do the research for you.

> > Having had to deal with content in mixed encodings before, I disagree 
> > that it's better. At least with data loss you get much quicker 
> > feedback that something went wrong.
> How do you know that it is data loss?

When you paste a URL into a document and then click it to see whether it 
worked, it is pretty obvious that it didn't if it takes you to a page you 
weren't expecting, all the more so when that happens because the data got 
converted into question marks.
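
To illustrate the kind of data loss described above, here is a minimal 
sketch. It assumes the query part is percent-encoded from the bytes of the 
document's character encoding, and that characters the encoding cannot 
represent are replaced with "?". The function name encode_query and the 
sample queries are hypothetical, and this is a simplification, not the 
algorithm from the HTML5 draft or from any particular browser.

```python
from urllib.parse import quote


def encode_query(query: str, document_encoding: str) -> str:
    """Percent-encode a query string via the given document encoding.

    Assumption for illustration: characters that the encoding cannot
    represent are replaced with "?" (Python's errors="replace" behaviour
    when encoding), which is then percent-encoded as %3F.
    """
    raw = query.encode(document_encoding, errors="replace")
    # Keep "=" and "&" literal so the key/value structure stays readable.
    return quote(raw, safe="=&")


# The same character yields different bytes depending on the encoding.
print(encode_query("q=é", "iso-8859-1"))
print(encode_query("q=é", "utf-8"))

# Characters outside the document encoding are lost as question marks.
print(encode_query("q=日本", "iso-8859-1"))
```

In the last call both characters collapse to the same replacement byte, so 
the server cannot recover what was typed. With UTF-8, which is what the 
IRI-to-URI mapping uses, nothing is lost.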

> > > On the other hand, documenting something that is clearly broken 
> > > seems to be the wrong approach to me, in particular as we have proof 
> > > that there currently isn't any reliable interoperability for this 
> > > edge case.
> > 
> > This is error handling (this can't happen for conforming documents), 
> > so I'm surprised that you have an opinion as to what should happen. 
> > :-)
> I care because I'd like to see documents using non-ASCII characters in 
> query parts become compliant no matter what encoding they are in.

Unless we change the definition of HTML5's URLs to be conforming even when 
those URLs would not be treated as IRIs, I don't see any way to get there 
from here.

> > I agree. However, in this case I don't believe "URL" as per RFC3986 is 
> > "well known". I think "URL" as per HTML5 is what it is most commonly 
> > assumed to mean.
> I believe many developers rely on the RFC definitions.

I would be surprised if this were so, but I have no data to back my 

> Whether or not RFC 3986 defines "URL" is really not the point. If it 
> didn't, another, earlier RFC would.

Terminology defined in obsolete RFCs would be even less of an issue 

> > They are certainly a big part of the intended audience.
> Sorry? People who think '"URL" simply means "the internet address you 
> can type in a web browser"' are the intended audience for this spec? If 
> you really think that, I recommend letting those people try to read and 
> understand it.

That would be premature, since we haven't yet set up the multiple views 
feature in the spec (the plan is to make a version of the spec available 
that hides user agent conformance requirements).

> [asking vendors]
> It would be nice to see these kinds of discussions being part of the 
> working group process, so that the other WG members can actually see 
> what was being proposed, and what the answer was.

The HTMLWG is only a small part of the broad range of places from which I 
take input, which includes hundreds of blogs, at least three separate bug 
systems, multiple other mailing lists, face to face discussions, IRC 
conversations on dozens of channels and privately, private e-mails, etc. I 
try to keep as much of the discussion as possible on the HTMLWG and WHATWG 
lists, but 
the sheer volume of traffic that would be generated by archiving all the 
sources of input on public-html would be staggering, and that's without 
even considering whether all those people would actually be willing to 
have their input forwarded in that way.

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Sunday, 29 June 2008 21:20:42 UTC
