Re: News and nntp URI schemes

On Tue, 04 Jan 2005 21:33:27 +0100, Frank Ellermann
<nobody@xyzzy.claranet.de> wrote:

> Charles Lindsey wrote:
>
>> the 'official' "@" in the <message-id> MUST NOT be %encoded
>> (because it is a delimiter, and should be declared to
>> be reserved
>
> Yes, that makes sense, so for the news: URL scheme we have two
> reserved characters "/" and "@", and for nntp only "/", ready.

I have been reading, and re-reading, and re-re-reading RFC 3986, and I
think I have finally sussed it out. Essentially, a URI consists of a
<scheme>, an <authority>, a <path>, a <query> and a <fragment>. What we
are doing is to provide something which satisfies the RFC 3986 syntax
for a <path>. Although our scheme makes no provision for a <query> or a
<fragment>, future extensions might do so, so we should reserve '?' and
'#' just in case (in any case, the syntax of <path> requires it, as does
the RE in Appendix B, which should be capable of dissecting ANY URI into
its 5 basic components).
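
To illustrate that dissection, here is a minimal Python sketch which applies
the RE from RFC 3986, Appendix B to a news URI. The function name split_uri
and the sample URIs are my own, chosen purely for illustration; they are not
part of any draft text.

```python
import re

# The regular expression given in RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def split_uri(uri):
    """Return (scheme, authority, path, query, fragment).

    Components that are absent from the URI come back as None.
    """
    m = URI_RE.match(uri)  # every part is optional, so this always matches
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)

print(split_uri("news://news.example.com/comp.lang.c"))
print(split_uri("news:comp.lang.c?x#y"))
```

An unencoded '?' or '#' always ends the <path>, which is why both have to be
reserved within our components even though we define no <query> or <fragment>.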

So what I have now written is:

     Within a <printable-ascii> and a <newsgroup-name>, the characters
     '%', '@', '/', '?' and '#' are reserved and MUST be %-encoded if they
     occur. All other characters MAY be used freely to represent
     themselves. It is not precluded that future extensions to the Netnews
     standard may permit octets outside of the given ranges, in which case
     they too MUST be %-encoded (except perhaps when used in an IRI [RFC
     3987]).
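
As a rough sketch of what that rule asks of a generating agent, the
following Python encodes a single component. The name encode_component is
hypothetical, and I am assuming that the "given ranges" are the printable
US-ASCII octets 0x21 to 0x7E and that anything outside them is encoded as
UTF-8 octets; neither assumption is stated in the text above.

```python
# Characters reserved within a <printable-ascii> or a <newsgroup-name>.
RESERVED = set("%@/?#")

def encode_component(text):
    """%-encode reserved characters and octets outside 0x21..0x7E.

    Assumption: non-ASCII text is first converted to UTF-8 octets.
    """
    out = []
    for octet in text.encode("utf-8"):
        ch = chr(octet)
        if 0x21 <= octet <= 0x7E and ch not in RESERVED:
            out.append(ch)                       # represents itself
        else:
            out.append("%{:02X}".format(octet))  # e.g. "/" becomes "%2F"
    return "".join(out)

print(encode_component("comp.lang.c"))
print(encode_component("a/b@c"))
```

The delimiting "@" of a <message-id> is not passed through such a function;
it is written literally between the two separately encoded halves.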

Note that RFC 3986 also attempts to reserve '[' and ']' within a <path>,
although they never have any delimiting meaning after the <authority>, and
they are not forbidden by the RE in Appendix B. I think that is a bug in
RFC 3986, and so I have not reserved them for our case.

> The ftp draft says:
>
> | Within a name or CWD component, the characters "/" and ";"
> | are reserved and must be encoded

I suspect it should have reserved '?' and '#' too, for RFC 3986 compliance.

>
>>> | Note that user agents may extend the ability to refer to
>>> | groups by use of "*" as a string wild-card.
>
>> Then you would be allowing "wildmats" as defined in the NNTP
>> draft. That might be workable, but does anyone anywhere
>> implement that?
>
> No idea, it's just an elegant way to keep the similar RfC 1738
> oddity somewhere without explicitly saying that it's dead.

I think we either have to deprecate it entirely, or go the whole hog and
turn it into a <wildmat>. Currently, of all the possibilities for
specifying <all-groups>, none of them works in all current servers, and
each of them fails to work in some current server, so we are damned
whatever we do. What is for sure is that _nobody_ implements <wildmat>s
currently.

So I have two alternative texts:

2.3  The newsURI contains an <all-groups>

     If the newsURI is of one of the following forms:
        <URI:news:*>
        <URI:news://news.example.com/*>
        <URI:news://news.example.com/>
        <URI:news://news.example.com>
     it refers to "all available news groups".  The resource retrieved by
     this URI is some means to gain access to all the newsgroups that are
     available from the given <authority> (usually by invoking a suitable
     news reading agent).

[Issue: Do we really want all those forms? Only the first was in [RFC
1738], but many agents currently accept the others. Moreover, some
agents are known to barf on anything with '*' in it. Maybe the '*' part
of the notation should be dispensed with. I therefore offer two
alternative formulations.]

[1st alternative]

        all-groups  = news-server [ "/" ] / <empty>

     The possibility for <all-groups> to consist of a "*", which was
     present in [RFC 1738], is now obsolete, and its continued use is
     deprecated. It was, in any case, only patchily implemented.

[That allows the following forms:
        <URI:news:>
        <URI:news://news.example.com/>
        <URI:news://news.example.com>
of which the first may or may not already work on current
implementations (but that is true of the others also).]
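
A minimal sketch of how an agent might recognise the first alternative
follows. The name is_all_groups is hypothetical, and I am assuming here that
<news-server> is "//" followed by a non-empty authority containing no '/',
'?' or '#'; the real production may be stricter.

```python
import re

# all-groups = news-server [ "/" ] / <empty>
# Assumption: <news-server> is "//" plus a non-empty authority.
ALL_GROUPS_RE = re.compile(r"news:(//[^/?#]+/?)?", re.IGNORECASE)

def is_all_groups(uri):
    """True if the URI is one of the three <all-groups> forms."""
    return ALL_GROUPS_RE.fullmatch(uri) is not None

for uri in ("news:", "news://news.example.com/", "news://news.example.com",
            "news:*", "news:comp.lang.c"):
    print(uri, is_all_groups(uri))
```

Under this alternative "news:*" is simply not an <all-groups>, so an agent
that still wishes to honour the RFC 1738 form has to special-case it.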

[2nd alternative]

        newsURI     = "news:" ( article / group )
        article     = [ news-server "/" ] message-id
        group       = [ news-server "/" ] wildmat

[where <wildmat> is defined in draft-ietf-nntpext-base-*.txt and would
allow the following forms:
        <URI:news:*>
        <URI:news:comp.*>
        <URI:news:*.test>
        <URI:news://news.example.com/*>
        <URI:news://news.example.com/comp.*>
        <URI:news://news.example.com/*.test>

This is an enhancement of draft-gilman-news-url-02.txt and preserves the
"*". It could readily be implemented, but it is quite certain that it is
not implemented anywhere currently. It would also be possible to
preserve the <empty> from alternative 1 as well.]
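
To give an idea of what "readily implemented" might amount to, here is a
sketch of matching a group name against a <wildmat>. It follows my reading
of the NNTP draft: "*" matches any sequence, "?" matches any single
character, patterns are separated by ",", and the rightmost matching pattern
decides, a leading "!" meaning exclusion. The function names are
hypothetical, and this is an illustration rather than a conforming
implementation.

```python
import re

def _pattern_to_regex(pattern):
    """Translate one wildmat pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any sequence, possibly empty
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)

def wildmat_match(wildmat, name):
    """True if name matches; the rightmost matching pattern decides."""
    matched = False
    for item in wildmat.split(","):
        negate = item.startswith("!")
        if negate:
            item = item[1:]
        if _pattern_to_regex(item).fullmatch(name):
            matched = not negate
    return matched

print(wildmat_match("comp.*", "comp.lang.c"))
print(wildmat_match("*.test", "alt.test"))
print(wildmat_match("comp.*,!comp.lang.*", "comp.lang.c"))
```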

Personally, I think <wildmat>s are a step too far, and I would recommend
alternative 1. But we need to discuss this.

I have attached my complete text, as it now stands.


-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl@clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5

Received on Friday, 18 February 2005 12:13:07 UTC