Re: draft-hoffman-news-nntp-uri-01.txt from Al Gilman on 2004-09-22 (uri@w3.org from September 2004)

From: Al Gilman <Alfred.S.Gilman@IEEE.org>
Date: Wed, 22 Sep 2004 15:59:49 -0400
To: uri@w3.org
Message-Id: <p0611040abd7781630cb6@[10.0.1.2]>

+1 to all Charles's assertions.  I think he has the story straight.
This includes but is not limited to "the 'nntp' scheme should not go away."

On the maybes that have passed through this thread:

a. Allow <server> in URIs where the scheme is 'news'?
I believe that this is implemented in multiple browsers, so it should
be added as a friendly amendment that goes beyond RFC 1738.

b. Does a 'news' URI by message-id identify a single resource?
Yes.  In more current thinking about URIs and the integrity and identity
of resources, this fits well with the concept of an identifier as a key
sufficient to isolate something from its peer resources.  The language 'not
enough to locate' is actually saying 'not enough to address a server
for retrieval operations'.  That is a different question from whether the
URI is an 'identifier', which current writings about URIs treat as distinct
from a retrieval address.

c. '*' as a group wildcard?
This can be found 'in the wild' in running code in Lynx.
http://lynx.isc.org/lynx2.8.5/lynx2-8-5/lynx_help/lynx_url_support.html#news_url
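
To make points a. and c. concrete, here is a rough sketch of how a client
might split a 'news' URL along the lines of the grammar quoted below
(optional //server/ part, then a group name, '*', or a message-id).  The
names NewsURL and parse_news_url and the regular expression are my own
illustration, not text from the draft or code from any browser, and the
pattern is deliberately loose: it does not validate <server>, <id-left>,
or <id-right>.

```python
import re
from typing import NamedTuple, Optional


class NewsURL(NamedTuple):
    server: Optional[str]  # None when no "//server/" part is present
    kind: str              # "wildcard", "article", or "group"
    target: str            # "*", the message-id, or the newsgroup name


# Hypothetical, loose pattern: "news:", an optional "//server/", then the rest.
_NEWS_RE = re.compile(r"^news:(?://([^/]+)/)?(.+)$", re.IGNORECASE)


def parse_news_url(url: str) -> NewsURL:
    m = _NEWS_RE.match(url)
    if m is None:
        raise ValueError(f"not a news URL: {url!r}")
    server, target = m.group(1), m.group(2)
    if target.startswith("/"):
        # e.g. "news://host" with no trailing "/" and nothing after it
        raise ValueError(f"malformed server part in {url!r}")
    if target == "*":
        kind = "wildcard"   # "all available news groups"
    elif "@" in target:
        kind = "article"    # id-left "@" id-right
    else:
        kind = "group"      # period-delimited newsgroup name
    # Any userinfo inside the server part is kept verbatim; whether it is
    # allowed at all is still an open question in this thread.
    return NewsURL(server, kind, target)
```

For example, parse_news_url("news://news.example.com/*") would report the
server news.example.com and the kind "wildcard"; what a client then does
with the wildcard is exactly the open question.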

Hoping I held heat:light under 2:1 in this,

Al

At 2:16 PM +0100 9/22/04, Charles Lindsey wrote:
>>2.  Scheme Definition
>>
>>   The news and nntp URL schemes are used to refer to either news groups
>>   or individual articles of USENET news, as specified in RFC 1036.
>>
>>   The news URL takes the form:
>>
>>      newsURL     = "news"  ":" [ news-server ]
>>                       ( newsgroup-name | '*' | message-id )
>>      news-server =  "//" server "/"
>>      message-id  = id-left "@" id-right
>
>OK, that syntax is correct now, but you need a normative reference 
>to RFC 2822 for <id-left> and <id-right> (or maybe to the 
>definitions in the Usefor draft, if that manages to become an RFC by 
>the time this draft is ready - but don't hold your breath). 
>Presumably also to RFC 2396bis for <server>, and it still is not 
>clear to me whether <server> could include user+password 
>information, and if so what one does if the authentication required 
>by the server is SASL based, which will soon become the norm.
>>
>>   A <newsgroup-name> is a period-delimited hierarchical name, such as
>>   "comp.lang.perl.modules".  A <message-id> corresponds to the
>>   Message-ID of section 2.1.5 of RFC 1036 [RFC1036], without the
>>   enclosing "<" and ">"; it takes the form <unique>@<full_domain_name>.
>>   <unique> cannot be quoted text or have escaping characters.
>
>No, it is not as simple as that, because you can also have an IP 
>address after the '@' and this is covered, in RFC 2822, under the 
>guise of a <no-fold-literal> and that brings in quoted text and 
>escaping characters again.
>
>So I think what you have to say is something like:
>
>   "The <id-left> and the <id-right> MUST be in a canonical form in which no
>    <quoted-string> or <quoted-pair> is used in a context where the same
>    semantic meaning could have been rendered without such quoting;
>    moreover, no whitespace may be included, whether %-encoded or not and/or
>    quoted or not.
>
>    For example, neither
>       news:"abcd"@example.com
>    nor
>       "ab\cd"@example.com
>    is in canonical form, because the form
>       abcd@example.com
>    is available."
>
>Yes, there are indeed email systems around that will happily treat 
>email message identifiers using those three forms as being 
>identical, which is of course a complete no-no in News.
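
[A rough sketch of how the canonical-form rule quoted above might be
checked for the <id-left> half only.  The function name and the regular
expressions are assumptions made for illustration; the dot-atom pattern
follows the atext set of RFC 2822, and <id-right>, including the
<no-fold-literal> case, is not handled.]

```python
import re
from urllib.parse import unquote

# dot-atom-text per RFC 2822: runs of atext separated by single dots.
_ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"
_DOT_ATOM = re.compile(rf"^{_ATEXT}(?:\.{_ATEXT})*$")


def is_canonical_id_left(id_left: str) -> bool:
    """Return True if id_left uses no quoting that could have been avoided."""
    decoded = unquote(id_left)  # so that %20 etc. counts as whitespace too
    if any(ch.isspace() for ch in decoded):
        return False
    if len(decoded) >= 2 and decoded[0] == '"' and decoded[-1] == '"':
        inner = decoded[1:-1]
        # Drop quoted-pair backslashes to see what the text really says.
        unescaped = re.sub(r"\\(.)", r"\1", inner)
        if _DOT_ATOM.match(unescaped):
            return False  # same meaning is available without quotes
        # Quotes are needed; a quoted-pair is justified only for '"' and '\'.
        return re.search(r'\\[^"\\]', inner) is None
    # Unquoted form: canonical as long as it is a plain dot-atom.
    return bool(_DOT_ATOM.match(decoded))
```

With this check, "abcd" and "ab\cd" (both in quotes) are rejected, and the
bare abcd is accepted, matching the three forms in the example.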
>>
>>   If <newsgroup-name> is "*" (as in <URL:news:*>), it is used to refer
>>   to "all available news groups".
>
>OK, that feature has been available since RFC 1738, but I am not at 
>all sure what it is meant to DO. I just tried it on my browser, and 
>it was totally confused, telling me "411 Invalid group name (not in 
>active).". I would be quite happy to see it simply dropped (unless 
>someone can point me to a system that does something useful with it).
>
>Next, we really need some text to explain what resource is supposed 
>to be retrieved by this URL. Something like:
>
>   "The resource retrieved by this URL is the Netnews article with the
>    given <message-id>. In a properly working Netnews system, the same
>    article will be obtained whatever server is accessed for the purpose
>    (assuming the server in question carried that article in the first
>    place and that it has not expired). If no <server> is specified, the
>    article is to be retrieved from whatever server has been configured
>    for local use."
>
>One may then need wording as to whether this is truly a global 
>resource, such as people have been discussing regarding the file 
>scheme. I am keeping out of that one myself - the point may become 
>more relevant if we bring back the nntp scheme. I notice that RFC 
>1738 contained the paragraph:
>
>"The news URLs are unusual in that by themselves, they do not 
>contain sufficient information to locate a single resource, but, 
>rather, are location-independent."
>
>I am far from clear what that actually means (if anything), but 
>maybe it is related to what I was trying to say in my suggested 
>paragraph above. AFAICS, they do indeed locate a single resource (if 
>you count different copies of the same article as "single").
>>
>>   The nntp URL defined in RFC 1738 is deprecated.
>
>No, I don't think we ever agreed that, and a couple of people have 
>pointed out places where it is implemented. I have also seen it as 
>the recommended method, in Opera, to force the system to reload an 
>article from the server if the client has lost it somehow.
>
>So I would be in favour of bringing it back. I might even be 
>persuaded to combine it with the news scheme as originally proposed, 
>but only if we establish first exactly what it is meant to do on its 
>own.
>
>--
>Charles H. Lindsey ---------At Home, doing my own thing------------------------
>Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
>Email: chl@clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
>PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
Received on Wednesday, 22 September 2004 20:36:12 UTC