Re: [widgets] Widgets URI scheme... it's baaaack! from Robin Berjon on 2009-09-15 (public-webapps@w3.org from July to September 2009)

From: Robin Berjon <robin@berjon.com>
Date: Tue, 15 Sep 2009 13:26:30 +0200
To: Mark Baker <distobj@acm.org>
Cc: public-webapps <public-webapps@w3.org>
Message-Id: <F6572B02-68F4-410F-ACDF-52D5719D8EEC@berjon.com>
On Sep 9, 2009, at 19:01 , Mark Baker wrote:
> On Wed, Sep 9, 2009 at 10:17 AM, Robin Berjon<robin@berjon.com> wrote:
>> On Sep 8, 2009, at 17:18 , Mark Baker wrote:
>>>>
>>>> function getSection () {
>>>>  return location.href.replace(/^http:\/\/magic.local\/([^\/]+).*/,
>>>> "$1").toLowerCase();
>>>> }
>>>>
>>>
>>> The regex could just as easily have been written to exclude the
>>> authority component of the URI.  Do you have a better example?
>>
>> It could have, but it wasn't. Interoperability isn't what happens when
>> people write to a W3C working group to get their code debugged, it's what
>> happens when real people write code on their own.
>
> Sure, some people will write really bad code.  I just don't think we
> have to accommodate all of them.

Of course, but the above piece of code isn't bad at all. It gets the
job done, and it's no more generic than one has reason to expect it
to be, especially from someone with a web development background. What
we're accommodating are expectations of interoperability and least
surprise; in my book that hardly qualifies as catering to "really bad code".

>> Let us assume that we don't at all say what is returned by the many
>> attributes that normally expose URIs. What regex would you "just as easily
>> have written" to match an unspecified value? Here are some samples from
>> several implementations given an image linked to as /img/dahüt.svg:
>>
>>  A: http://magic.local/img/dahüt.svg
>>  B: file://mushroom.local/img/dahüt.svg
>>  C: file:///img/dahüt.svg
>>  D: file:///C|/img/dahüt.svg
>>  E: \\myphone\img\dahüt.svg
>>  F: C:\MY DOCUMENTS AND SETTING\MY USERS\MY MARKB\MY DOCUMENTS\MY WIDGETS\MY ARSE\DAH~1.VML
>>  G: http:///img/dah%FCt.svg
>>  H: cool-product:/img/dah%u0055%u0308t.svg
>>  I: inode:DEADBABEC0EDBEEF
>>  J: many more things...
>
> Some of those aren't URIs, and some aren't hierarchical.  Of the
> others, "[:/]//?*/(.*$)" should cover it.

Sure, but in the absence of any indication from the specification, why  
should implementers use a URI there? In fact, one could make the case  
that it makes better sense to pick something that cannot

I'll note in passing that your regex doesn't take into account cases  
that might expose the query string in some implementations and not in  
others. Would you consider it to be "really bad code"? Certainly one  
could construct a more robust regex than yours, but it's a lot better  
to provide the means for implementations to be interoperable from the  
start rather than having to document which hacks work everywhere.
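
To make the point concrete, here is a minimal sketch that runs the getSection() logic quoted at the top of this thread against a few of the sample values listed above. It assumes the function is changed to take the href as a parameter (the original reads location.href), and the query-string variant labelled A2 is a hypothetical addition, not one of the listed samples. None of these values come from real implementations.

```javascript
// The section extractor from earlier in the thread, with the href passed in
// as a parameter (an assumption made here so it can run outside a browser).
function getSection(href) {
  return href.replace(/^http:\/\/magic.local\/([^\/]+).*/, "$1").toLowerCase();
}

// Illustrative values only: A, B, C and G are taken from the list above,
// A2 is a hypothetical variant of A that exposes a query string.
const samples = {
  A: "http://magic.local/img/dahüt.svg",
  A2: "http://magic.local/img?size=2",
  B: "file://mushroom.local/img/dahüt.svg",
  C: "file:///img/dahüt.svg",
  G: "http:///img/dah%FCt.svg",
};

for (const [impl, href] of Object.entries(samples)) {
  console.log(impl + ": " + getSection(href));
}
// A:  img
// A2: img?size=2
// B:  file://mushroom.local/img/dahüt.svg
// C:  file:///img/dahüt.svg
// G:  http:///img/dah%fct.svg
```

Only form A yields the section name. When the pattern does not match, replace() returns its input unchanged, so every other form comes back as the whole (lowercased) string, and the query-string variant leaks the query into the result.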

> But if it would simplify things, I wouldn't be averse to a getBaseURI() call.

I'm not sure what exactly that would cover, and how it would help.

>> Let's imagine we say nothing and you're an implementer: what would you do?
>> Everyone in this discussion understands that introducing new schemes should
>> be done with caution. What I don't understand is what architectural value
>> you are seeing in not using URIs to identify resources, encouraging
>> non-interoperable solutions, or sweeping the issue under the rug by
>> delegating to a special name instead of a scheme.
>
> I'm not doing any of those things AFAICT.  I encourage resources to be
> identified by URIs.  I just don't see a need to tell implementations
> what their URIs should look like, other than to say they should be
> hierarchical for obvious reasons.

Should they include query strings? Fragments? Can they contain UTF-8
characters? What are the security implications of reusing an existing
scheme with a magic name (given that it could be hijacked)?

This is a case where saying less only brings more problems. If we were  
to go your suggested route of telling implementations everything about  
what the URIs should look like except what the scheme would be, what  
we'll end up specifying is a URI scheme without a scheme name. Apart  
from scoring a perfect Montesquieu on the Mint New URI Schemes With A  
Trembling Hand, I'm not sure that it buys us much that simply  
providing implementers with information that they've been asking for  
doesn't.

> The identifiers produced by implementations you list above suggests
> that at least some implementors feel that they can reuse existing
> schemes, no?

No, the list is provided as an example, not taken from actual  
implementations. Implementers have been asking repeatedly for this URI  
scheme to be defined, I think largely because they want to be  
interoperable with one another. Push-back on this has come mostly from  
the armchair end of the spectrum.

-- 
Robin Berjon - http://berjon.com/
Received on Tuesday, 15 September 2009 11:27:07 UTC