Re: News and nntp URI schemes

Charles Lindsey wrote:

> The trouble is that there are just too many definitions of
> message-id around the place.

Different aspects of the same real thing, all describing the
minimum they could get away with for their special purposes.

> There is going to be strong pressure to keep to RFC 2822
> or at least to a subset of it (which the RFC 1036 definition
> is not).

All 1036 Message-IDs are 2822 Message-IDs, because "printable
ASCII" is a proper subset of "ASCII minus white space".

> The alternative is to give a very loose definition, on the
> grounds that URLs are supposed to contain only whatever has
> been used in some existing news article (and the agent that
> generated that article can worry about which standard it
> conformed to). So any string of characters with an '@'
> in the middle is good enough.

That's a good idea, because it's what you really want for the
news URL:  If somebody created a Message-ID with NO-WS-CTL,
and a server accepted this, then maybe that's broken, but it's
not a problem of the news URL scheme.

 [quoting NNTP]
>| For the purposes of this specification, message-ids are
>| opaque strings

That's fine, garbage in, garbage out, and the same garbage is
the same message and v.v.

>| MUST begin with "<" and end with ">", and MUST NOT contain
>| the latter except at the end.

Not good enough for the news URL, we need the "@", otherwise
it could be a newsgroup name.

>| A message-id MUST be between 3 and 250 octets in length.

The minimum is <@> ?  LOL, no problem.

>| A message-id MUST NOT contain octets other than printable
>| US-ASCII characters.

See ?  No NO-WS-CTL here.  That's a sane definition for servers
and user-agents.  For UseFor we need a "domain" as RHS (because
that's required for algorithms trying to create new unique IDs),
and the "@", because otherwise RHS makes no sense (right hand
side of what without an "@" ?)
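
As a rough sketch, the NNTP rules quoted above could be checked
like this in Python.  The function name is made up for
illustration, and "printable US-ASCII" is assumed to mean the
octets 0x21 to 0x7E (no space, no controls):

```python
def is_nntp_message_id(msgid: bytes) -> bool:
    """Check the NNTP rules quoted above (a sketch, not the full spec)."""
    # Between 3 and 250 octets in length.
    if not 3 <= len(msgid) <= 250:
        return False
    # MUST begin with "<" and end with ">".
    if not (msgid.startswith(b"<") and msgid.endswith(b">")):
        return False
    # MUST NOT contain ">" except at the end.
    if b">" in msgid[:-1]:
        return False
    # Only printable US-ASCII; assumed here to mean octets 0x21..0x7E.
    return all(0x21 <= octet <= 0x7E for octet in msgid)
```

Note that these rules alone do not ask for an "@" or a domain;
that part would come from UseFor, not from NNTP.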

For the news URL we need only an "@" (in theory more than one
is fine).  Putting it all together:  printable US-ASCII minus
">" and at least one "@" is good enough for the news URL.
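
A minimal sketch of that loose rule in Python, assuming the
string is the message-id as it would appear in a news URL
(without the angle brackets) and that "printable US-ASCII" means
0x21 to 0x7E.  Both function names are hypothetical:

```python
def is_loose_news_msgid(msgid: str) -> bool:
    # Printable US-ASCII only, assumed to be 0x21..0x7E (no space, no controls).
    if not all(0x21 <= ord(ch) <= 0x7E for ch in msgid):
        return False
    # No ">" anywhere, and at least one "@" (more than one is tolerated).
    return ">" not in msgid and "@" in msgid


def classify_news_target(target: str) -> str:
    # Without an "@" the string could just as well be a newsgroup name.
    return "message-id" if is_loose_news_msgid(target) else "not a message-id"
```

With this check "a@b@c" passes, while "comp.lang.python" and
"a>b@c" do not.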

>| Other specifications may define two different sequences as
>| being equal because they are putting an interpretation on
>| particular characters.

These other specifications are IMHO erroneous.

>| Note that RFC 1036 [RFC1036] never treats two different
>| strings as being identical.  Its draft successor restricts
>| the syntax of message-ids so that, whenever RFC 2822 would
>| treat two strings as equivalent, only one of them is valid

Probably talking about s-o-1036, but we could use the same
recipe.  You already did it, but somehow you forgot to exclude
NO-WS-CTL.  These drafts are not RfCs yet, though, so you can't
quote them in the new memo about a news URL.  Still, the "loose
definition" solution is okay:  printable with "@" and without ">".

> I just noticed that NNTP will not accept Non-WS-Controls

A fatal error in RfC 2822, don't mention it for the news URL.

> Anyway, the uri@w3c.org list is the proper place to continue
> this discussion.

Okay, I'll post my answer (t)here.  Do we need a note about any
Message-ID starting with "//" like a server name ?  Something
like "'/' MUST be escaped as %2F if (insert conditions)" ?  And
the opposite, '@' MUST NOT be escaped, or is it the job of the
user agent to see a %40 ?
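
To make the question concrete, here is a sketch using Python's
urllib.parse.  The helper names are hypothetical, and escaping
everything except "@" and the unreserved characters is only an
assumption for illustration, not something the news URL scheme
is known to prescribe:

```python
from urllib.parse import quote, unquote


def msgid_to_news_url(msgid: str) -> str:
    # safe="@" keeps "@" literal; "/" is not in the safe set, so it becomes %2F.
    return "news:" + quote(msgid, safe="@")


def news_url_to_msgid(url: str) -> str:
    # A tolerant user agent decodes %40 (and any other escape) back to the raw octet.
    return unquote(url[len("news:"):])
```

Under these assumptions a Message-ID starting with "//" can no
longer be mistaken for a server name, and a user agent that sees
%40 still recovers the "@".
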
                           Bye, Frank

Received on Monday, 20 December 2004 23:34:33 UTC