RE: scheme-specific length limits (issue 48) from Larry Masinter on 2011-04-03 (public-iri@w3.org from April 2011)

From: Larry Masinter <masinter@adobe.com>
Date: Sun, 3 Apr 2011 05:48:17 -0700
To: Noah Mendelsohn <nrm@arcanedomain.com>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
CC: Ted Hardie <ted.ietf@gmail.com>, Tony Hansen <tony@att.com>, "public-iri@w3.org" <public-iri@w3.org>
Message-ID: <C68CB012D9182D408CED7B884F441D4D05A06574B7@nambxv01a.corp.adobe.com>

A scheme registration defines the syntax for URIs (IRIs) that are valid for the scheme.  A syntax definition can include limits -- that some strings are valid for the scheme and other strings are not. Those limits can be complicated: they can restrict the repertoire of characters, be expressed in BNF, and include length limits.

Syntactic restrictions should be justified, usually by the limits of the resolution mechanism or protocol associated with a string. And we should disallow any limits (or any other syntactic restrictions) that treat %-hex-encoded UTF-8 characters differently from their Unicode character equivalents.
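
To illustrate that last point, here is a rough sketch (not taken from any scheme registration) of a length check that gives the same answer for the %-hex form and the Unicode form of an identifier. The "example:" scheme, the limit of 64, and the names uri_form and within_limit are all made up for illustration; the only library call used is Python's urllib.parse.quote.

```python
from urllib.parse import quote

# Hypothetical limit for an imaginary scheme; not from any real registration.
MAX_LENGTH = 64

# ASCII characters left untouched: URI reserved characters plus "%", so that
# existing %-hex escapes are not encoded a second time.
_KEEP = "!#$%&'()*+,/:;=?@[]"


def uri_form(identifier: str) -> str:
    """Percent-encode non-ASCII characters as UTF-8, leaving existing escapes alone."""
    return quote(identifier, safe=_KEEP, encoding="utf-8")


def within_limit(identifier: str, limit: int = MAX_LENGTH) -> bool:
    """Apply the length limit to the URI form, so both spellings get the same answer."""
    return len(uri_form(identifier)) <= limit


iri = "example:caf\u00e9"   # IRI spelling, with a literal U+00E9
uri = "example:caf%C3%A9"   # URI spelling of the same identifier

print(len(iri), len(uri))                      # naive character counts differ
print(len(uri_form(iri)), len(uri_form(uri)))  # counts on the URI form agree
```

Counting on the URI form is only one possible choice here; a scheme could equally count on the decoded form, as long as both spellings of the same identifier are measured the same way.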

The conversation seems to have confused "length limits for URIs/IRIs in general" (there are none) with syntactic limits (including length limits) in particular scheme definitions.

Larry
--
http://larry.masinter.net

-----Original Message-----
From: Noah Mendelsohn [mailto:nrm@arcanedomain.com] 
Sent: Thursday, March 31, 2011 8:13 AM
To: "Martin J. Dürst"
Cc: Larry Masinter; Ted Hardie; Tony Hansen; public-iri@w3.org
Subject: Re: scheme-specific length limits (issue 48)

On 3/31/2011 4:49 AM, "Martin J. Dürst" wrote:
> As an example, it would be total nonsense to define something like the
> http: scheme to have a maximum length of 512 (or any other number you prefer).

I think that's just a bit too strong. I do agree that, on balance, we 
should at least strongly discourage, or perhaps prohibit outright, the 
imposition of limits that don't map directly to constraints in the 
underlying protocols.

That said, I don't think the arguments to the contrary are "total 
nonsense". Handling strings of arbitrary length can, in certain computing 
environments, lead to significant increases in complexity and/or increased 
performance overhead. In a virtual memory environment, things can get 
easier, but in something like an embedded system, having to architect for 
arbitrarily long strings when in practice you're unlikely to see any has 
real costs in design time, complexity, etc. I still think, on balance, that 
baking such restrictions into scheme definitions will rarely if ever be the 
right thing to do, but to imply that doing so would be "total nonsense" 
seems way too strong.

Noah

Received on Sunday, 3 April 2011 12:49:01 UTC