Re: News and nntp URI schemes

From: Frank Ellermann <nobody@xyzzy.claranet.de>
Date: Fri, 31 Dec 2004 16:07:50 +0100
To: uri@w3.org
Message-ID: <41D56B45.6EA0@xyzzy.claranet.de>

Charles Lindsey wrote:

>|   message-id  = 1*printable-ascii "@" 1*printable-ascii
>|   printable-ascii = %d33-61 / %d63-126 ; excludes ">"
>|
>| A <message-id> corresponds to the <msg-id> of RFC 2822 and
>| to the Message-ID of section 2.1.5 of RFC 1036, but without
>| the enclosing "<" and ">".

I like it.
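
For illustration, the quoted ABNF can be turned into a quick syntax check.
This is only a minimal sketch in Python; the name is_message_id is made up
here, and the check covers the ABNF as quoted, nothing more (it does not
test RFC 2822 or RFC 1036 validity).

```python
import re

# printable-ascii = %d33-61 / %d63-126, i.e. visible ASCII without ">" (62).
# "@" (64) is itself a printable-ascii, so the regex engine may pick any "@"
# in the string as the delimiter required by the ABNF.
_PRINTABLE = r"[\x21-\x3d\x3f-\x7e]"
_MESSAGE_ID = re.compile(_PRINTABLE + "+@" + _PRINTABLE + "+")


def is_message_id(text: str) -> bool:
    """Return True if text matches the quoted <message-id> syntax."""
    return _MESSAGE_ID.fullmatch(text) is not None


print(is_message_id("unique@mdomain"))       # True
print(is_message_id('"foo@bar"@mdomain'))    # True
print(is_message_id("comp.lang.c"))          # False, no "@"
print(is_message_id("a>b@mdomain"))          # False, ">" is excluded
```

Being purely syntactic, the check accepts anything with an "@" between two
non-empty runs of printable characters.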

>| It MUST be the message identifier of an actual Netnews
>| article and hence will in practice conform to the syntax
>| defined in RFC 1036 or in any subsequent standard for
>| Netnews articles. Thus not every <message-id> as defined
>| above is valid.

True, but I'd avoid the MUST if it's not absolutely necessary.
Maybe only a matter of taste, and if that's the case forget it.

>| Observe the delimiter "@" which enables an <article> to be
>| distinguished from a <newsgroup-name>.

That's important, and I'm still unsure about news URLs with %40
instead of @: who is responsible for getting this right, the UA?

>| Observe also that any reserved character within a
>| <printable-ascii> will need to be %-encoded.

http://gbiv.com/protocols/uri/rev-2002/rfc2396bis.html#reserved
apparently says that all schemes are free to define their own
reserved chars.

For the news scheme we're interested in "/", "@", ">", and "%".
Your syntax handles ">" and asks for at least one "@".  That's
good enough for the normal <unique@mdomain> and for odd cases
like <"foo@bar"@mdomain>.

Is <news:unique%40mdomain> implicitly okay ?  <news:"foo@bar">
is implicitly bad, <"foo@bar"> is no Message-ID, and it's also
no group.  Ditto <news:"foo%40bar">.

<news:"foo%40bar"%40mdomain> could be again okay, it's about
the Message-ID <"foo@bar"@mdomain> (as per 2396bis 2.1).  My
UA doesn't like it, maybe it's a s-o-1036 extremist like me ;-)

<news:"bro%ken"%40mdomain> is broken beyond repair, it should
be <news:"bro%25ken"%40mdomain> for msg-id <"bro%ken"@mdomain>.
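
The encoding cases above can be sketched in Python with urllib.parse.  The
helper names (msgid_from_news_path, news_path_from_msgid) are hypothetical,
and rejecting a stray "%" is an assumption about what a careful UA might do,
not something any draft requires.  Note that unquote() alone is lenient and
leaves an invalid escape like "%ke" untouched, hence the explicit check.

```python
import re
from urllib.parse import quote, unquote

# A "%" that is not followed by two hex digits is not a valid escape.
_STRAY_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def msgid_from_news_path(encoded: str) -> str:
    """Decode the part after "news:" into a Message-ID (without "<" ">")."""
    if _STRAY_PERCENT.search(encoded):
        raise ValueError("stray '%' in " + repr(encoded))
    return unquote(encoded)


def news_path_from_msgid(msgid: str) -> str:
    """Encode a Message-ID for a news URI, keeping only "@" literal.

    Assumption: everything else outside the unreserved set is escaped,
    which is stricter than the hand-written examples in this message.
    """
    return quote(msgid, safe="@")


print(msgid_from_news_path('"foo%40bar"%40mdomain'))   # "foo@bar"@mdomain
print(msgid_from_news_path('"bro%25ken"%40mdomain'))   # "bro%ken"@mdomain
print(news_path_from_msgid('"bro%ken"@mdomain'))       # %22bro%25ken%22@mdomain
```

With these helpers <news:"bro%ken"%40mdomain> raises ValueError instead of
being silently passed through.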

What about <news://auth@example/unique@mdomain> ?  You didn't
explicitly say that "/" is reserved, is this about the (bad)
Message-ID <//auth@example/unique@mdomain>, or does the UA know
that it's <unique@mdomain> on server example with login auth ?

Let's assume that "/" is implicitly reserved by your syntax.
Then <news://example/path@mdomain> is tricky, it's NOT about
Message-ID <//example/path@mdomain>.  A real problem, because
Message-IDs like <path/file/2004-12-31@mdomain> exist, and
Message-ID <//example/path/date@mdomain> is allowed.

<news:%2F/example/path/date@mdomain> or
<news:/%2Fexample/path/date@mdomain> should IMHO work.  But a
<news://example%2Fpath/date@mdomain> is probably bad, and a
<news://example/path%2Fdate@mdomain> is something different,
it's Message-ID <path/date@mdomain> on server example.
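
As a sketch of how a UA could tell these cases apart, assuming "/" is
reserved and a leading "//" introduces the server: split first, decode
afterwards.  split_news_uri is a made-up name, and the code relies on
Python's urllib.parse.urlsplit, which treats a leading "//" as the start
of an authority for any scheme.  "?" and "#" are not handled here.

```python
from urllib.parse import unquote, urlsplit


def split_news_uri(uri: str):
    """Return (authority or None, decoded target) for a news URI."""
    parts = urlsplit(uri)
    if parts.scheme != "news":
        raise ValueError("not a news URI: " + repr(uri))
    if parts.netloc:
        # "//authority/target": drop the one "/" that ends the authority.
        target = parts.path[1:] if parts.path.startswith("/") else parts.path
        return parts.netloc, unquote(target)
    # No authority: the whole path is the target, decoded only after
    # splitting, so %2F never gets mistaken for a delimiter.
    return None, unquote(parts.path)


print(split_news_uri("news://example/path%2Fdate@mdomain"))
# ('example', 'path/date@mdomain')
print(split_news_uri("news:%2F/example/path/date@mdomain"))
# (None, '//example/path/date@mdomain')
print(split_news_uri("news://auth@example/unique@mdomain"))
# ('auth@example', 'unique@mdomain')
```

The order matters: decoding before splitting would turn the %2F forms back
into the ambiguous "//" forms.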

> Please can somebody tell me whether the remark about reserved
> characters and %-encoding is the correct thing to say there?

IMHO you should also say _which_ characters are reserved in the
news scheme.  AFAIK "%" is always reserved, it's in a separate
chapter of 2396bis.  Apparently you need "/" as reserved char.

But you don't need "@" and ":", they are only reserved in...

| authority   = [ userinfo "@" ] host [ ":" port ]

...and that's handled in 2396bis.  BTW, 2396bis says authority
instead of server, it's probably better to adopt this term:

| news-server = "//" authority
|
| <authority> is defined in [2396bis], and provides for a
[etc.]

> the remaining issue is what to do about the '*' notation.
[...]
> Comments and opinions?

I like the solution in draft-gilman-news-url-02 section 2.2:

| Note that user agents may extend the ability to refer to
| groups by use of "*" as a string wild-card.

Add this note to your section 2.2 (your first 2.2, the second
should be 2.3 ;-), and remove the "*" from the overall syntax:

| all-groups  = news-server [ "/" ]

This implicitly kills the "*" problem in 2.3, and if a 2.2 "*"
doesn't work as expected it's a problem or feature of the UA.

My UA treats, say, <news://news.gmane.org/gmane.ietf.*> like
<news://news.gmane.org/> or simply <news://news.gmane.org>.

The dubious RFC 1738 <news:*> is still covered by the note in
<http://www.newsreaders.com/tech/draft-gilman-news-url-02.txt>
if you copy it to 2.2 (see above), because in 2.2 the server...

| group       = [ news-server "/" ] newsgroup-name

...is optional, and Gilman's note allows "*" as wildcard.
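
A UA that wants to support "*" as a string wildcard, in the sense of
Gilman's note, could do something like the following.  This is only a
sketch: matching_groups and the list of group names are hypothetical, and
treating "*" as "any run of characters" is one possible reading of the note.

```python
import re


def matching_groups(pattern: str, groups: list) -> list:
    """Return the group names matched by pattern, where only "*" is special."""
    # Escape everything, then turn the escaped "*" back into a wildcard.
    regex = re.compile(re.escape(pattern).replace(r"\*", ".*"))
    return [name for name in groups if regex.fullmatch(name)]


known = ["gmane.ietf.announce", "gmane.ietf.rfc", "gmane.org.w3c.uri"]
print(matching_groups("gmane.ietf.*", known))
# ['gmane.ietf.announce', 'gmane.ietf.rfc']
print(matching_groups("*", known) == known)
# True
```

A UA without this extension can still fall back to listing all groups on
the server.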

For some questions about your NNTP section 3 and SNEWS see
<news://news.gmane.org/41C8775F.4294@xyzzy.claranet.de> or
<http://article.gmane.org/gmane.org.w3c.uri/333> or maybe
<http://article.gmane.org/gmane.org.w3c.uri:333>

The latter form is already one of the questions ;-)  Bye, Frank
Received on Friday, 31 December 2004 15:21:45 UTC
