Re: News and nntp URI schemes from Charles Lindsey on 2004-12-28 (uri@w3.org from December 2004)

From: Charles Lindsey <chl@clerew.man.ac.uk>
Date: Tue, 28 Dec 2004 16:56:13 -0000
To: "Frank Ellermann" <nobody@xyzzy.claranet.de>
Message-ID: <opsjp9bz2g6hl8nm@clerew.man.ac.uk>

On Tue, 21 Dec 2004 00:07:07 +0100, Frank Ellermann
<nobody@xyzzy.claranet.de> wrote:

> Charles Lindsey wrote:
>
>> The trouble is that there are just too many definitions of
>> message-id around the place.
>
> Different aspects of the same real thing, all describing the
> minimum they could get away with for their special purposes.
>

>> The alternative is to give a very loose definition, on the
>> grounds that URLs are supposed to contain only whatever has
>> been used in some existing news article (and the agent that
>> generated that article can worry about which standard it
>> conformed to). So any string of characters with an '@'
>> in the middle is good enough.
>
> That's a good idea, because it's what you really want for the
> news URL:  If somebody created a Message-ID with NO-WS-CTL,
> and a server accepted this, then maybe that's broken, but no
> problem of the news URL scheme.

OK, what I now propose is the following:

        message-id  = 1*printable-ascii "@" 1*printable-ascii
        printable-ascii = %d33-61 / %d63-126 ; excludes ">"

     A <message-id> corresponds to the <msg-id> of RFC 2822 and to the
     Message-ID of section 2.1.5 of RFC 1036, but without the enclosing
     "<" and ">". It MUST be the message identifier of an actual Netnews
     article and hence will in practice conform to the syntax defined in
     RFC 1036 or in any subsequent standard for Netnews articles. Thus not
     every <message-id> as defined above is valid.

     Observe the delimiter "@" which enables an <article> to be
     distinguished from a <newsgroup-name>. Observe also that any reserved
     character within a <printable-ascii> will need to be %-encoded.

Please can somebody tell me whether the remark about reserved characters
and %-encoding is the correct thing to say there?
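
As a rough illustration of the proposed grammar and of the %-encoding remark, here is a minimal Python sketch. The function names are hypothetical, and the escaping policy (everything except RFC 3986 unreserved characters and "@" gets %-encoded) is a conservative assumption, not something the draft text settles; which reserved characters really must be escaped is exactly the open question above.

```python
import re
from urllib.parse import quote, unquote

# printable-ascii = %d33-61 / %d63-126, i.e. printable ASCII without ">" (62)
PRINTABLE = r"[\x21-\x3D\x3F-\x7E]"
# message-id = 1*printable-ascii "@" 1*printable-ascii
MESSAGE_ID = re.compile(rf"{PRINTABLE}+@{PRINTABLE}+")


def is_message_id(text):
    """Check the loose syntax only; it does not prove the article exists."""
    return MESSAGE_ID.fullmatch(text) is not None


def to_news_url(message_id):
    """Build a news URL from a message-id given without "<" and ">".

    Assumption: "@" stays literal, and everything outside the unreserved
    set (letters, digits, "-", ".", "_", "~") is %-encoded.
    """
    if not is_message_id(message_id):
        raise ValueError("not a message-id under the proposed grammar")
    return "news:" + quote(message_id, safe="@")


def from_news_url(url):
    """Recover the message-id from a URL produced by to_news_url()."""
    if not url.startswith("news:"):
        raise ValueError("not a news URL")
    return unquote(url[len("news:"):])
```

For example, to_news_url("a/b?c@example.com") escapes "/" and "?" while leaving "@" alone, and from_news_url() reverses it.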

Note that all the text in my previous proposal concerning <id-left> and
<id-right> and canonical forms can now be deleted, since the problem is
shifted to whoever generated the article to be accessed, who will bear
responsibility for conforming to RFC 1036 or USEFOR or whatever.


So I think the remaining issue is what to do about the '*' notation.
Currently, it says:
     If the newsURL is of one of the following forms:
        <URL:news:*>
        <URL:news://news.example.com/*>
        <URL:news://news.example.com/>
        <URL:news://news.example.com>
     it refers to "all available news groups".  The resource retrieved by
     this URL is some means to gain access to all the newsgroups that are
     available on the given <server> (usually by invoking a suitable news
     reading agent).

That seems a huge overkill. The first two were in RFC 1738, but Opera at
least barfs on them (thinking that '*' is a newsgroup-name). Frank reports
that the last three all work in his UA, and it seems to me that the last
two are more comparable to the way things are done in other schemes.
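
To make the four forms quoted above concrete, here is a minimal Python sketch of how an agent might tell them apart from article and newsgroup URLs. The function name and the returned labels are hypothetical, and treating an empty target the same as '*' is an assumption based on the draft text quoted above, not on any particular UA's behaviour.

```python
from urllib.parse import urlsplit


def classify_news_url(url):
    """Return "all-groups", "article" or "newsgroup" for a news URL."""
    parts = urlsplit(url)
    if parts.scheme != "news":
        raise ValueError("not a news URL")
    # With a server (news://host/...), the target is the path minus its "/".
    target = parts.path[1:] if parts.netloc else parts.path
    # Assumption: "*" and an empty target both mean "all available newsgroups".
    if target in ("", "*"):
        return "all-groups"
    # The "@" delimiter distinguishes an article from a newsgroup-name.
    if "@" in target:
        return "article"
    return "newsgroup"
```

An agent like Opera, as described here, would in effect skip the '*' check and fall through to the newsgroup branch.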

My personal view is that we should declare the first two to be obsolete
(and maybe permit <URL:news:> as a substitute for the first, though since
it merely means "open up my news browser", I am sure we could live without
it entirely).

Comments and opinions?

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl@clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
Received on Thursday, 30 December 2004 14:29:34 UTC