Fwd: [EAI] Re: draft-ellermann-news-nntp-uri-04

From: Charles Lindsey <chl@clerew.man.ac.uk>
Date: Fri, 08 Dec 2006 14:36:12 -0000
Message-ID: <op.tj8v6mse6hl8nm@clerew.man.ac.uk>
To: URI <uri@w3.org>

On Tue, 05 Dec 2006 21:21:57 -0000, Frank Ellermann
<nobody@xyzzy.claranet.de> wrote:

> Charles Lindsey wrote:

>> Again, you have a format for a whole group:
>> nntp://server.example/comp.lang.c
>> which was not actually allowed in RFC 1738, but might be useful even
>> though it means the same as
>> news://server.example/comp.lang.c
> Oddly RFC 1738 allows the server + group syntax only for the
> nntp-scheme.

Yes, but the point I was trying to make is that you were introducing a new
feature not present in RFC 1738, which does not permit
"nntp://server.example/comp.lang.c" or
"nntp://server.example/comp.lang.c/" at all. Not that I object to that
(since I want to go further by adding <range>s), but you did not point out
that you were introducing an extension.

> It's frustrating for users if some URLs don't work for them.  It's also
> bad for the support staff if they have to answer questions about why
> some fancy style of URLs for a protocol they've never before heard of
> (their own NNTP server) doesn't work.

But that's a problem with NNTP, not with your URIs. Generally speaking,
RFC 3977 is not backwards compatible with RFC 977 insofar as several new
features in 3977 (including full wildmats and ranges in some contexts)
will not be understood by existing implementations of 977. Nevertheless,
3977 is a vast improvement on 977 because it makes the usage of wildmats
and ranges consistently available in all the places where they would make
sense. But it is reasonable to suppose that implementations will catch up
within the next few years, and it would be a pity if URIs were
artificially constrained from taking advantage when they do.

> Not all features of a decent newsreader or NNTP-server implemented after
> 3977 was published some weeks ago need a 1:1 correspondence in URIs.  A
> URL is a "hey, look at this" thing, it's supposed to work for anybody.
> That's not the case with the fancier wildmats, and also unnecessary.
> The comma doesn't work unencoded with some UAs, the question mark is a
> nightmare, and the exclamation mark doesn't fly without the comma.

I see absolutely nothing in RFC 3986 which would require %-encoding of
comma or exclamation (of course, %-encoding, even if not necessary, never
does any harm). RFC 3986 provides:
        reserved    = gen-delims / sub-delims
        gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
        sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                    / "*" / "+" / "," / ";" / "="
where the <gen-delims> are the ones given meaning in RFC 3986 itself. What
that means is that no scheme is allowed to give special meaning to a
non-reserved character that would ever make %-encoding it necessary. But
%-encoding of <sub-delims> will never be necessary unless some particular
scheme chooses to make it so. Comma and exclamation are therefore
perfectly safe in the news and nntp schemes, and any UA that does not
accept them at face value is severely broken.
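
To make that concrete, here is a minimal Python sketch that transcribes the
RFC 3986 character sets quoted above and reports which characters of a raw
(not yet %-encoded) string fall outside what a path segment may carry
unencoded. The names PCHAR and chars_needing_encoding are purely
illustrative and come from no draft; the <unreserved> and <pchar> sets are
taken from RFC 3986 and are not quoted in this message.

```python
import string

# Transcribed from the RFC 3986 ABNF quoted above.
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")

# RFC 3986: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

# RFC 3986: pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
# (pct-encoded is left out because the input is assumed to be raw text).
PCHAR = UNRESERVED | SUB_DELIMS | {":", "@"}


def chars_needing_encoding(raw_segment):
    """Return the distinct characters of raw_segment that are not valid
    unencoded in a URI path segment, in sorted order."""
    return sorted({ch for ch in raw_segment if ch not in PCHAR})


if __name__ == "__main__":
    # Comma and exclamation mark are sub-delims, so nothing is reported.
    print(chars_needing_encoding("comp.lang.*,!comp.lang.c"))
    # "?" is a gen-delim (it starts the query), so it is reported.
    print(chars_needing_encoding("comp.lang.?"))
```

Under these assumptions a wildmat using "*", "," and "!" needs no
%-encoding at all, and only "?" is flagged.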

I agree that the '?' in wildmats is not especially useful and will not
often be used, and so it is no great inconvenience to require it to be
%-encoded. Though actually it should be safe left naked, except for the
unofficial and undocumented practice you discovered in relation to

So there is no argument against full <wildmat>s there.

> Doing it all in one giant jump doesn't work, look at the mailto-bis I-D,
> it tried to throw in IRIs and EAI and making coffee in various scripts.
> It's too much.  It breaks.  RFC 1738 is state of the art 1994.  I want
> to document state of the art 2005 (when 3986 was published) + Usefor-11.
> It uses the usefor-11 concept of Message-ID, that's a huge step forward,
> even in relation to 2822.  That's something I care about, because it is
> essential for NetNews, unlike those wildmats or article ranges.

And that is where you have made one huge mistake.

The syntax of <msg-id> in USEFOR is hideous (and you and I were
instrumental in making it so); but sadly it is necessary.

But that does not mean you have to copy it all into the URI draft (and
even less that you only copy part of it, as you have done). There is no
need for any implementation of the news scheme to check that syntax. If it
is offered some plausible sequence of characters with an "@" in the
middle, then you just throw it at the NNTP server and see what comes back.
If it is syntactically incorrect, then there will be no article that
matches it, and you will be told so.

So you can afford to be exceedingly liberal in what you accept and send to
the server, just so long as you avoid anything that might be a URI
delimiter. Your scheme is supposed to be an interface to RFC 3977, not to
USEFOR, and the only syntax you need for <msg-id> is the liberal syntax
given in RFC 3977, which is

       message-id = "<" 1*248A-NOTGT ">"
       A-NOTGT    = %x21-3D / %x3F-7E  ; exclude ">"

Well, you need to omit that "<" and ">", and you need to enforce an "@" in
the middle, and you probably need to enforce %-encoding of a few
characters. But that is all.
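
As a rough illustration of how little checking that amounts to, here is a
minimal Python sketch. It accepts anything matching the RFC 3977 syntax
above, insists on an "@" with something on either side, and %-encodes
whatever is not safe in a URI. The function name msgid_to_news_uri and the
exact set of characters left unencoded (unreserved, sub-delims, ":" and
"@") are my assumptions for the example, not text from any draft.

```python
import re
from urllib.parse import quote

# RFC 3977: A-NOTGT = %x21-3D / %x3F-7E (printable ASCII except ">"),
# repeated 1 to 248 times between the angle brackets.
A_NOTGT_RUN = re.compile(r"[\x21-\x3D\x3F-\x7E]{1,248}")

# Assumption: leave sub-delims, ":" and "@" unencoded; quote() already
# leaves letters, digits and "-._~" alone and encodes everything else.
SAFE = "@:!$&'()*+,;="


def msgid_to_news_uri(message_id):
    """Turn '<local@domain>' (or 'local@domain') into a news: URI,
    applying only the liberal RFC 3977 syntax check."""
    body = message_id
    if body.startswith("<") and body.endswith(">"):
        body = body[1:-1]  # the URI form omits the angle brackets
    if not A_NOTGT_RUN.fullmatch(body):
        raise ValueError("not a plausible message-id: %r" % message_id)
    local, at, domain = body.partition("@")
    if not (local and at and domain):
        raise ValueError("message-id needs an '@' in the middle")
    return "news:" + quote(body, safe=SAFE)


if __name__ == "__main__":
    print(msgid_to_news_uri("<op.tj8v6mse6hl8nm@clerew.man.ac.uk>"))
```

Whether the resulting identifier names a real article is then left to the
NNTP server, as described above.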

> The idea is to discourage nntp URLs, not to "improve" them.  I can see
> a point in an XREF to nntp URL transformation on dedicated servers like
> GMaNe, and that's documented.

I will grant you that the case for adding <range>s to the nntp scheme is
much weaker than the case for adding proper wildmats to the news scheme
(but then why did you add the "nntp://server.example/comp.lang.c"
extension which has no use at all?). And for sure the current use of the
nntp scheme in the wild is rare (I do not even own a browser that supports

Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131                       Web: http://www.cs.man.ac.uk/~chl
Email: chl@clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
Received on Friday, 8 December 2006 14:36:35 UTC
