RE: [schemeProtocols-49] New draft of proposed "URI Schemes and Web Protocols" Finding from Larry Masinter on 2005-12-03 (www-tag@w3.org from December 2005)

From: Larry Masinter <LMM@acm.org>
Date: Sat, 03 Dec 2005 03:21:49 -0800
To: noah_mendelsohn@us.ibm.com
Cc: www-tag@w3.org
Message-id: <000901c5f7fb$c1389510$27f0070a@corp.adobe.com>
I wrote a long reply, but let me start with my summary:

I have less of a problem with what you're
trying to accomplish than I do with the way that you've
written it, giving "rules" which, if taken as written,
give blanket permission for nonsensical behavior.

No one needs blanket permission for nonsensical
behavior; implementors have no problem breaking the rules
when they want to. There may be specific cases where
alternative access methods -- outside of using the
protocol indicated by the scheme -- may be useful
and appropriate for trial, but those exceptions and
alternatives should be reviewed, analyzed, tested,
and carefully examined for their impact on
security, performance, scalability, and users
before being deployed.

Maybe you don't need to read on...
  
=======================================================================
>> But your rule R7 doesn't have any such contextual restriction, 
>> and so I think it makes no sense, as is. 

>I do understand your concern, but I think my finding offers
>just the necessary context in rule R2 [1], which requires that:

> "A server MUST serve resources faithfully. Regardless of the protocol
>  used, the server is responsible for ensuring that the correct resource is
>  accessed, that operations are correctly implemented according to the
>  specifications for the protocol, and thus that the correct resource state
>  is either retrieved or updated."
	 
R2 doesn't have much to do with R7. R7 is about how someone, given a URI, is
supposed to interpret the URI (isn't it?). R2 gives advice on what a server
implementor should do. These two pieces of advice apply to different parties,
so I don't see how R2 gives any context to R7.
	 
> I'm not trying to imply that it's in all cases a good idea for a user agent
> to access resources using a random or otherwise silly protocol, but I'm
> claiming that the architecture in no case prohibits them trying should they
> wish to,
	 
I don't think what you just said makes any sense. Of course, nothing really
prevents you from turning off the power switch on your computer, or
trying to compute pi instead of accessing a resource, but as far as
"architecture" goes, it seems like a bad architecture to say that
you can try whatever you want, for whatever reason, and it's OK.
"MAY" is normative language, suggesting that the behavior is legitimate
and ordinary.

I think it may only be the way that you are writing the text, so perhaps
you could rephrase it more like: 

 In some specific situations, an agent interpreting a URI
 may have additional information which defines an alternative access
 method for accessing a resource which has been identified with scheme
 "x", giving the agent information that it might instead use a different
 access method than normally defined by the "x" scheme. In those specific
 cases, the agent MAY use that alternate access method.


>   and crucially, that the rules provided make it safe to do so.
>   Consider a resource in an arbitrary scheme "x" (x://example/org/resource)
>   and attempt to access it with seemingly unrelated protocol RANDOM.
>   R2 requires that one of two things will happen:
	
>   A. If the server reports back success using the mechanisms of protocol
>   RANDOM then per R2 you have indeed succeeded in retrieving information
>   from or updating the state of that resource.  The server MUST NOT report
>   success unless it has accessed the intended resource and succeeded in the
>   requested manipulation.
	
	-or- 
	
>   B. The server MUST fail the interaction, in which case no harm is done
>   except for some wasted effort.

The problem with this formulation of a set of rules is that it has unbounded
scope -- it sounds like you're making rules for all possible protocols.
If you're not, then "RANDOM" isn't really 'random' at all. Perhaps you
are saying that there ARE some protocols for which it might be possible
to try to access a URI, and for which the access is harmless and efficient,
and, if so, an agent interpreting a URI might attempt to access the resource
identified using that protocol, in a failsafe manner.

Then you would have less of a rule (as if you might be making rules
for SIP and POP and IMAP) and more of a set of criteria for deciding
when a protocol is a candidate for "alternative resource access".

Certainly "alternative resource access" is part of the web architecture,
and might even be considered a generalization of the concept of
"proxy" in HTTP.

#  You can try any protocol for any resource, and at worst you will get a 
# reliable indication that nothing has been done. 

No, you can't try any protocol. You can only try protocols
for which it is actually true that "at worst you will get ..".

And "at worst" ignores resource consumption and security arguments.
For example, I wouldn't want my web client to broadcast the URIs
I am surfing to everyone else in the world. And I wouldn't
want everyone in my company spamming my machine just in case
my machine might have the resource they're searching for. I think,
if you're concerned about making it "safe", you should certainly
include security and performance as part of the criteria.
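
As a rough illustration only, here is what such criteria might look
like if an agent checked them before trying an alternative protocol.
This is a sketch under my own assumptions: AccessMethod, is_candidate,
methods_to_try, and the three properties are hypothetical names, not
anything defined in the finding or in any specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessMethod:
    """Hypothetical description of one way to reach a resource."""
    protocol: str
    fails_safely: bool             # a failed attempt changes no state
    leaks_uri_to_others: bool      # e.g. broadcasts the URI being accessed
    loads_unrelated_hosts: bool    # e.g. probes machines "just in case"


def is_candidate(method: AccessMethod) -> bool:
    """An alternative qualifies only if it is failsafe AND does not
    raise the security or performance concerns described above."""
    return (method.fails_safely
            and not method.leaks_uri_to_others
            and not method.loads_unrelated_hosts)


def methods_to_try(scheme_default: AccessMethod,
                   alternatives: list[AccessMethod]) -> list[AccessMethod]:
    """Vetted alternatives first, then the protocol the scheme indicates.
    Alternatives that fail the criteria are never attempted."""
    vetted = [m for m in alternatives if is_candidate(m)]
    return vetted + [scheme_default]


http = AccessMethod("http", True, False, False)
local_cache = AccessMethod("local-cache", True, False, False)
broadcast = AccessMethod("broadcast-query", True, True, True)

plan = methods_to_try(http, [local_cache, broadcast])
print([m.protocol for m in plan])
```

The point of the sketch is that "any protocol" never appears in it:
the broadcast method is failsafe in the narrow sense, yet it is still
excluded because of what it discloses and whom it burdens.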

  "A user agent MAY to attempt to access any resource using 
  any protocol."

People take sentences from standards documents and findings
out of context. This sentence gives 'permission' to do something
that is foolish. If you mean "In certain circumstances defined
in rule XXX, a user agent MAY attempt to access a resource using
a different protocol than the one normally indicated by the
scheme", you need to say so.

  "Insofar as networks are free from misconfiguration and tampering,
  any response received is by definition authoritative, regardless
  of whether the scheme is one traditionally associated with the protocol." 

The precondition of this statement is false (there are no
networks that are permanently free from misconfiguration and
tampering).
	
> With those rules in place, the finding then acknowledges that as a
> guideline [3], one should whenever practical deploy with the protocol(s)
> associated with a scheme

I don't think this is reasonable language in a finding.
"whenever practical" is taken to mean "whenever you feel like
it" or "whenever you can and still make your ship deadline".
Implementors only implement "MUST" clauses when it is
practical to implement them, and you don't need to
give them a loophole to not do so and still claim that
they are "compliant".

There's an important principle in system implementation -- 
"it is OK to cheat as long as you never get caught."
If I offer a server at http://my.example.com/blah and
tell someone about it, they're free to use some other protocol
to actually access that server, as long as no one can
tell that they're not really using HTTP, and yet get
the same results as if they used HTTP.
If you're trying to give them permission to
not use HTTP and instead use SIP to make a VoIP call
to someone, well, that doesn't make sense.
My resource is available at "my.example.com" using
the HTTP protocol.

(summary at top)

Larry
-- 
http://larry.masinter.net
Received on Saturday, 3 December 2005 11:21:30 UTC