
draft-hoffman-news-nntp-uri-01.txt

From: Charles Lindsey <chl@clerew.man.ac.uk>
Date: Wed, 22 Sep 2004 14:16:45 +0100
To: uri@w3.org
Message-ID: <opseqch7ke6hl8nm@clerew.man.ac.uk>

> 2.  Scheme Definition
>
>   The news and nntp URL schemes are used to refer to either news groups
>   or individual articles of USENET news, as specified in RFC 1036.
>
>   The news URL takes the form:
>
>      newsURL     = "news"  ":" [ news-server ]
>                       ( newsgroup-name | '*' | message-id )
>      news-server =  "//" server "/"
>      message-id  = id-left "@" id-right

OK, that syntax is correct now, but you need a normative reference to RFC
2822 for <id-left> and <id-right> (or maybe to the definitions in the
Usefor draft, if that manages to become an RFC by the time this draft is
ready - but don't hold your breath). Presumably you also need one to RFC
2396bis for <server>. It is still not clear to me whether <server> could
include user+password information and, if so, what one does when the
authentication required by the server is SASL-based, which will soon
become the norm.
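
To make the quoted syntax concrete, here is a minimal Python sketch of
splitting a news URL into its parts. It is an illustration only, not part
of the draft: the names NewsURL and parse_news_url are hypothetical, the
<server> part is kept as an opaque string (since it is undecided whether
it may carry user+password information), and the target is classified
simply by looking for "*" or an "@".

```python
import re
from typing import NamedTuple, Optional


class NewsURL(NamedTuple):
    server: Optional[str]  # None when the URL has no "//server/" part
    kind: str              # "article", "group", or "all-groups"
    target: str            # message-id (no angle brackets), group name, or "*"


# news ":" [ "//" server "/" ] ( newsgroup-name | "*" | message-id )
_NEWS_RE = re.compile(r"^news:(?://(?P<server>[^/]+)/)?(?P<target>.+)$",
                      re.IGNORECASE)


def parse_news_url(url: str) -> NewsURL:
    m = _NEWS_RE.match(url)
    if m is None or m.group("target").startswith("//"):
        # Either not a news URL at all, or a "//server" with no trailing "/".
        raise ValueError(f"not a well-formed news URL: {url!r}")
    target = m.group("target")
    if target == "*":
        kind = "all-groups"
    elif "@" in target:
        kind = "article"   # id-left "@" id-right
    else:
        kind = "group"     # period-delimited newsgroup name
    return NewsURL(m.group("server"), kind, target)
```

For instance, parse_news_url("news://news.example.com/abcd@example.com")
yields the server "news.example.com", the kind "article" and the target
"abcd@example.com". The sketch does not validate <id-left> or <id-right>
against RFC 2822.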
>
>   A <newsgroup-name> is a period-delimited hierarchical name, such as
>   "comp.lang.perl.modules".  A <message-id> corresponds to the
>   Message-ID of section 2.1.5 of RFC 1036 [RFC1036], without the
>   enclosing "<" and ">"; it takes the form <unique>@<full_domain_name>.
>   <unique> cannot be quoted text or have escaping characters.

No, it is not as simple as that. You can also have an IP address after
the '@', which RFC 2822 covers under the guise of a <no-fold-literal>, and
that brings in quoted text and escaping characters again.

So I think what you have to say is something like:

   "The <id-left> and the <id-right> MUST be in a canonical form in which no
    <quoted-string> or <quoted-pair> is used in a context where the same
    semantic meaning could have been rendered without such quoting;
    moreover, no whitespace may be included, whether %-encoded or not, and
    whether quoted or not.

    For example, neither
       news:"abcd"@example.com
    nor
       news:"ab\cd"@example.com
    is in canonical form, because the form
       news:abcd@example.com
    is available."

Yes, there are indeed email systems around that will happily treat email  
message identifiers using those three forms as being identical, which is  
of course a complete no-no in News.
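
The canonical-form rule suggested above could be checked roughly as
follows. This is only a Python sketch under stated assumptions: the
function name is_canonical is hypothetical, it takes the message-id
without the "news:" prefix and without angle brackets, "redundant quoting"
is read as "the unquoted content is already a dot-atom-text, or a
quoted-pair escapes a character that does not need it", and the obsolete
RFC 2822 forms are ignored.

```python
import re

# atext and dot-atom-text, following RFC 2822 section 3.2.4
_ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
_DOT_ATOM = re.compile(_ATEXT + r"+(?:\." + _ATEXT + r"+)*")
# %-encoded space, tab, LF or CR
_ENCODED_WS = re.compile(r"%(?:20|09|0[AaDd])")


def _split(msgid):
    """Split into (id-left, id-right) at the '@' that ends id-left."""
    if msgid.startswith('"'):
        i = 1
        while i < len(msgid):
            if msgid[i] == "\\":      # quoted-pair: skip the escaped char
                i += 2
                continue
            if msgid[i] == '"':       # closing DQUOTE
                break
            i += 1
        if i >= len(msgid) - 1 or msgid[i + 1] != "@":
            raise ValueError("malformed quoted id-left")
        return msgid[:i + 1], msgid[i + 2:]
    left, sep, right = msgid.partition("@")
    if not sep:
        raise ValueError("no '@' in message-id")
    return left, right


def is_canonical(msgid):
    # No whitespace at all, literal or %-encoded.
    if any(c.isspace() for c in msgid) or _ENCODED_WS.search(msgid):
        return False
    left, right = _split(msgid)
    if left.startswith('"'):
        inner = left[1:-1]
        unquoted = re.sub(r"\\(.)", r"\1", inner)
        # Quotes are redundant if the content is already a dot-atom-text.
        if _DOT_ATOM.fullmatch(unquoted):
            return False
        # Inside quotes, only '"' and '\' may be backslash-escaped.
        if not re.fullmatch(r'(?:[^"\\]|\\["\\])*', inner):
            return False
    if right.startswith("[") and right.endswith("]"):
        # Inside a literal, only '[', ']' and '\' may be backslash-escaped.
        if not re.fullmatch(r"(?:[^\[\]\\]|\\[\[\]\\])*", right[1:-1]):
            return False
    return True
```

With this sketch, abcd@example.com is accepted, while "abcd"@example.com
and "ab\cd"@example.com are rejected as non-canonical.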
>
>   If <newsgroup-name> is "*" (as in <URL:news:*>), it is used to refer
>   to "all available news groups".

OK, that feature has been available since RFC 1738, but I am not at all  
sure what it is meant to DO. I just tried it on my browser, and it was  
totally confused, telling me "411 Invalid group name (not in active).". I  
would be quite happy to see it simply dropped (unless someone can point me  
to a system that does something useful with it).

Next, we really need some text to explain what resource is supposed to be  
retrieved by this URL. Something like:

   "The resource retrieved by this URL is the Netnews article with the
    given <message-id>. In a properly working Netnews system, the same
    article will be obtained whatever server is accessed for the purpose
    (assuming the server in question carried that article in the first
    place and that it has not expired). If no <server> is specified, the
    article is to be retrieved from whatever server has been configured
    for local use."
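
The server-selection rule in that suggested text can be sketched in a few
lines of Python. This is an illustration only: the function name
plan_article_fetch and the parameter configured_server are hypothetical,
and the sketch merely builds the NNTP ARTICLE command line without opening
a connection.

```python
from typing import Optional, Tuple


def plan_article_fetch(message_id: str,
                       url_server: Optional[str],
                       configured_server: str) -> Tuple[str, str]:
    """Return (host to contact, NNTP command line) for an article URL."""
    # Use the server named in the URL if there is one; otherwise fall
    # back to whatever server has been configured for local use.
    host = url_server if url_server else configured_server
    # NNTP takes the message-id with its enclosing angle brackets,
    # which the news URL omits.
    return host, f"ARTICLE <{message_id}>"
```

For example, plan_article_fetch("abcd@example.com", None,
"news.local.example") selects the locally configured server and the
command "ARTICLE <abcd@example.com>".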

One may then need wording as to whether this is truly a global resource,  
such as people have been discussing regarding the file scheme. I am  
keeping out of that one myself - the point may become more relevant if we  
bring back the nntp scheme. I notice that RFC 1738 contained the paragraph:

"The news URLs are unusual in that by themselves, they do not contain  
sufficient information to locate a single resource, but, rather, are  
location-independent."

I am far from clear what that actually means (if anything), but maybe it  
is related to what I was trying to say in my suggested paragraph above.  
AFAICS, they do indeed locate a single resource (if you count different  
copies of the same article as "single").
>
>   The nntp URL defined in RFC 1738 is deprecated.

No, I don't think we ever agreed that, and a couple of people have pointed  
out places where it is implemented. I have also seen it as the recommended  
method, in Opera, to force the system to reload an article from the server  
if the client has lost it somehow.

So I would be in favour of bringing it back. I might even be persuaded to  
combine it with the news scheme as originally proposed, but only if we  
establish first exactly what it is meant to do on its own.

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl@clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
Received on Wednesday, 22 September 2004 18:06:43 GMT
