Re: URL work in HTML 5

Hello Larry,

On 2012/10/16 16:12, Larry Masinter wrote:
>
> I think it would be useful to find a system that requires general IRI syntax to be more constrained.

Specs don't actively care. If we look at implementations, I found one 
very quickly, in the standard library for Ruby:

 > irb
irb(main):001:0> require 'uri'
irb(main):002:0> uri = URI.parse 'http://example.org/abc def/page.html'
URI::InvalidURIError: bad URI(is not URI?): ...
... [details of error removed]
irb(main):003:0> uri = URI.parse 'http://example.org/abcdef/page.html'
=> #<URI::HTTP:..... URL:http://example.org/abcdef/page.html>
irb(main):004:0> exit

[disclaimer: I'm a Ruby committer, but I have never touched that library 
so far]
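
The irb check above can also be written as a small script. This is only a sketch: the helper name strict_uri? is made up for illustration, and the percent-encoded variant is an extra test case that was not in the session above.

```ruby
require 'uri'

# Hypothetical helper: true if Ruby's standard URI library accepts the string,
# false if URI.parse rejects it with URI::InvalidURIError.
def strict_uri?(str)
  URI.parse(str)
  true
rescue URI::InvalidURIError
  false
end

puts strict_uri?('http://example.org/abc def/page.html')   # literal space in the path
puts strict_uri?('http://example.org/abc%20def/page.html') # same path, space percent-encoded
puts strict_uri?('http://example.org/abcdef/page.html')    # no space at all
```

The first call should print false and the other two true, matching the behavior shown in the transcript.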

I assume that quite a few other libraries have similar behavior, but 
I'm not familiar enough with Java/Python/PHP/Perl to do such a quick and 
easy test.

XML Schema and anyURI may also be worth looking at. For good reasons, 
it's not very popular around there these days, but some heavy hitters 
still use it.


> Of course, individual schemes can constrain their syntax in a scheme-specific way (making anything that doesn't match the scheme template invalid-as-instance-of-scheme even if valid-as-IRI).

Yes. One question is how to apply such scheme-specific restrictions to 
a much more lenient base syntax. It may be very easy, or it may not.
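
To make the layering question concrete, here is a rough sketch of a lenient base check followed by a scheme-specific one. Everything in it is an assumption for illustration: the LENIENT pattern, the SCHEME_RULES table, the check function and its return values are invented here and are not taken from any spec or library.

```ruby
# Hypothetical lenient base syntax: a scheme, a colon, then anything at all.
LENIENT = /\A([A-Za-z][A-Za-z0-9+.\-]*):(.*)\z/m

# Hypothetical scheme-specific templates (illustrative only, not from a spec).
# Here "http" requires "//authority" and forbids whitespace in the rest.
SCHEME_RULES = {
  'http' => %r{\A//[^/\s]+(/\S*)?\z}
}.freeze

# Classify a string: first against the lenient base, then against the
# scheme's own template if one is registered.
def check(str)
  m = LENIENT.match(str)
  return :not_an_iri unless m

  rule = SCHEME_RULES[m[1].downcase]
  return :valid_base_only unless rule

  rule.match?(m[2]) ? :valid_for_scheme : :invalid_for_scheme
end

p check('http://example.org/abc def/page.html')
p check('http://example.org/abcdef/page.html')
p check('snmp://example.com/bridge1;800002b804616263')
```

In this sketch the string with the space passes the lenient base but fails the http template, and a scheme with no registered template is only checked against the base. Whether real schemes can be layered this cleanly is exactly the open question.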

> And individual contexts (like "space separated list of IRIs") can provide additional constraints ("before adding an IRI to a space separated list of IRIs, replace all spaces with %20").  But those kinds of rules are layered on top of 3987(bis).

The advantage of having RFC 3986 and RFC 3987(bis) is that it's clear 
that one does not have to say anything like "before adding an IRI to a 
space separated list of IRIs, replace all spaces with %20". So lots of 
specs currently don't say this.
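
For illustration only, the quoted rule would amount to something like the following in Ruby. The helper name join_iris is made up. For input that already follows RFC 3986 / RFC 3987(bis), the replacement step changes nothing, which is why specs can currently leave it out.

```ruby
# Hypothetical helper: build a space-separated list of IRIs, replacing any
# literal space with %20 first so that each entry remains a single token.
def join_iris(iris)
  iris.map { |iri| iri.gsub(' ', '%20') }.join(' ')
end

# An entry with a literal space (not valid per RFC 3986/3987) gets escaped.
list = join_iris(['http://example.org/abc def/page.html',
                  'http://example.org/other'])
puts list
puts list.split(' ').length

# An entry without spaces passes through unchanged.
puts join_iris(['http://example.org/abcdef/page.html'])
```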

> I'd like to talk out some of these things in Atlanta, should we try to make it a separate (bar) bof, or try to use the IRI working group time to talk about this?

If you are looking to find examples of IRI/URI-related specs and 
implementations that actually check (some of) the restrictions of the 
IRI/URI spec, then I think the best way is to ask on a few of the 
relevant mailing lists. Asking at the end of the WG meeting, and moving 
further discussion to a bar (bof) would then be the next step (if still 
necessary).

Regards,   Martin.


>> -----Original Message-----
>> From: Ted Hardie [mailto:ted.ietf@gmail.com]
>> Sent: Monday, October 15, 2012 1:00 PM
>> To: Larry Masinter
>> Cc: Robin Berjon; Anne van Kesteren; plh@w3.org; Peter Saint-Andre
>> (stpeter@stpeter.im); Pete Resnick (presnick@qualcomm.com); "Martin Dürst
>> (duerst@it.aoyama.ac.jp)"; www-archive@w3.org
>> Subject: Re: URL work in HTML 5
>>
>> On Mon, Oct 15, 2012 at 11:37 AM, Larry Masinter<masinter@adobe.com>
>> wrote:
>>
>>> I think that's the bigger implication -- the vision that the web supplants all
>> other (network) apps; for some systems,  "URLs to non-Web things" is an empty
>> set.
>>>
>>> My understanding of Peter's survey of other specs that make reference to
>> RFC 3987 was that there weren't any whose implementations relied on anything
>> other than the browser to do URL/IRI resolution and processing.
>>>
>>
>> First, can you provide a pointer to the survey?
>>
>> Second, while there may be systems for which the only handle for URIs
>> is the browser, there are certainly systems for which that is not
>> true.  To pick one produced close to when URIs became a full standard,
>> look at RFC 4088 (http://tools.ietf.org/html/rfc4088).  I doubt there
>> are many browsers which dereference URIs like
>> snmp://example.com/bridge1;800002b804616263 with their own handlers.
>>
>> URIs used internally to systems outside the web may not be easily seen
>> in a web-based corpus, but that does not mean that they are not there,
>> nor that shifting the parsing rules won't affect them.
>>
>> regards,
>>
>> Ted Hardie
>

Received on Tuesday, 16 October 2012 09:45:42 UTC