RE: RFC 2822 email addresses in tag URIs

Hi Tim,

I agree with all of your reasoning below _except_ for your last
two sentences:

> Tags containing those different forms are distinct.  We leave 
> that to the users.

URI schemes should define canonicalization/normalization rules
that allow correct comparisons based on user expectations.  No
user expects that changing the case in an email address will
cause it to differ in meaning.
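
A minimal Python sketch of such a comparison rule follows. The names
normalize_email and same_mailbox are hypothetical. Lowercasing the local
part is an assumption that matches the user expectation argued above; it
goes beyond RFC 2822, and SMTP formally allows the receiving host to treat
the local part as case-sensitive. Quoted local parts are unquoted only
when the result is a plain dot-atom.

```python
import re

# RFC 2822 dot-atom-text: runs of atext separated by single dots.
_DOT_ATOM = re.compile(
    r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+(\.[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
)


def normalize_email(address: str) -> str:
    """Return a comparison key for a simple addr-spec (illustrative only)."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a simple addr-spec: {address!r}")
    # Drop redundant quoting: "\timothy" names the same mailbox as timothy.
    if len(local) >= 2 and local[0] == '"' and local[-1] == '"':
        unquoted = re.sub(r"\\(.)", r"\1", local[1:-1])
        if _DOT_ATOM.fullmatch(unquoted):
            local = unquoted
    # Assumption: case is not significant anywhere in the address.
    return f"{local.lower()}@{domain.lower()}"


def same_mailbox(a: str, b: str) -> bool:
    """Compare two addresses by their normalized form."""
    return normalize_email(a) == normalize_email(b)


# The three spellings from the quoted message compare equal under this rule.
print(same_mailbox("timothy@hpl.hp.com", "TIMothy@hpl.hp.com"))
print(same_mailbox(r'"\timothy"@hpl.hp.com', "timothy@hpl.hp.com"))
```

A tag minting tool could apply such a step before embedding the address,
so that comparing the resulting tags stays a plain string comparison.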

Cheers,
- Ira

Ira McDonald (Musician / Software Architect)
Blue Roof Music / High North Inc
PO Box 221  Grand Marais, MI  49839
phone: +1-906-494-2434
email: imcdonald@sharplabs.com

> -----Original Message-----
> From: uri-request@w3.org [mailto:uri-request@w3.org] On Behalf Of Tim Kindberg
> Sent: Friday, September 23, 2005 10:10 AM
> To: uri@w3.org
> Cc: Tim Kindberg; sandro hawke
> Subject: RFC 2822 email addresses in tag URIs
> 
> 
> 
> Dear URI community,
> 
> I've been overloaded and only just managed to return to the problem of
> updating the tag URI specification to accommodate RFC 2822, the most
> recent standard for email addresses. I had some great input from Etan
> Wexler and Frank Ellerman a while back. Following that, I'm inclined to
> go for Frank's simpler approach: take a subset of RFC 2822 email
> addresses that users could be expected to read & manipulate by hand and
> brain (following the 'tag' philosophy), and simply %-encode certain of
> their characters.
> 
> Principle 1: only allow relatively simple, human-legible/tractable email
> addresses to be embedded in tags. So only allow printing characters
> (%20 - %7E). NB the only whitespace character is " " (which has to be
> quoted in RFC2822-land). No folding, no control characters.
> 
> Principle 2: disallow obsolete constructs.
> 
> Principle 3: disallow comments -- no value in a tag but lots of 
> potential for confusion.
> 
> In addition, the following characters should not appear literally as 
> part of an email address in a tag; they must be %-encoded (ONCE) from 
> the original email address:
> 
>        gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
> 
>        sub-delims-used-in-tag-syntax  = ","
> 
>        percent = "%"
> 
>        non-uri-characters = "<" / ">" / "^" / "`" / "{" / "|" / "}"
> 
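
The single %-encoding pass described above can be sketched in Python as
follows. The name encode_local_part is hypothetical. The first four
character sets are copied from the lists above; adding space, double quote
and backslash is an assumption, made because the grammar later in the
message writes them as %20, %22 and %5C. The sketch only maps characters;
it does not check the result against that grammar.

```python
# Characters listed above that must not appear literally in the tag.
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS_USED_IN_TAG = ","
PERCENT = "%"
NON_URI = "<>^`{|}"
# Assumption: not in the lists above, but %-encoded in the grammar.
ALSO_ENCODED_IN_GRAMMAR = ' "\\'

MUST_ENCODE = set(
    GEN_DELIMS + SUB_DELIMS_USED_IN_TAG + PERCENT + NON_URI
    + ALSO_ENCODED_IN_GRAMMAR
)


def encode_local_part(local_part: str) -> str:
    """%-encode the local part of an email address exactly once."""
    out = []
    for ch in local_part:
        # Principle 1: printing characters %20 - %7E only.
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"character not allowed in a tag: {ch!r}")
        out.append(f"%{ord(ch):02X}" if ch in MUST_ENCODE else ch)
    return "".join(out)


print(encode_local_part("tim/k"))  # tim%2Fk
print(encode_local_part("100%"))   # 100%25
```

Running the function on its own output turns "100%25" into "100%2525",
which is why the encoding has to be applied once only, to the original
address. The "@" between the local part and the DNSname stays literal, so
only the local part is passed in.
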
> 
> The above gives me the following syntax, based on similar terms in RFC 2822:
> 
> emailAddress      = tag-local-part "@" DNSname
> tag-local-part    = tag-dot-atom-text / tag-no-fold-quote
> tag-dot-atom-text = 1*tag-atext *("." 1*tag-atext)
> tag-atext         = ALPHA / DIGIT  /
>                       "!"   / "%23" /
>                       "$"   / "%25" /
>                       "&"   / "'"   /
>                       "*"   / "+"   /
>                       "-"   / "%2F" /
>                       "="   / "%3F" /
>                       "%5E" / "_"   /
>                       "%60" / "%7B" /
>                       "%7C" / "%7D" /
>                       "~"
> tag-no-fold-quote = "%22" *(tag-qtext / tag-quoted-pair) "%22"
> tag-quoted-pair   = "%5C"  tag-qptext
> tag-qtext         = tag-atext / "(" /
>                       ")"   /  "%2C" /
>                       "."   /  "%3A" /
>                       ";"   /  "%3C" /
>                       "%3E" /  "%40" /
>                       "%5B" /  "%5D"
> tag-qptext        = tag-qtext / "%20" / "%5C" / "%22"
> 
> (The defn of DNSname in 
> http://www.ietf.org/internet-drafts/draft-kindberg-tag-uri-07.txt 
> remains intact.)
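
As a rough check of the grammar above, its local-part rules can be
transcribed into a Python regular expression. This is a sketch under
stated assumptions: is_tag_local_part is a hypothetical name, the DNSname
part is not validated, and percent-escapes are matched in upper case only,
although ABNF string literals are formally case-insensitive.

```python
import re

# tag-atext: literal characters plus the %-encoded forms from the grammar.
ATEXT = r"(?:[A-Za-z0-9!$&'*+\-=_~]|%23|%25|%2F|%3F|%5E|%60|%7B|%7C|%7D)"
# tag-qtext: tag-atext plus the remaining RFC 2822 specials.
QTEXT = rf"(?:{ATEXT}|[().;]|%2C|%3A|%3C|%3E|%40|%5B|%5D)"
# tag-qptext: what may follow an encoded backslash.
QPTEXT = rf"(?:{QTEXT}|%20|%5C|%22)"

DOT_ATOM = rf"{ATEXT}+(?:\.{ATEXT}+)*"
NO_FOLD_QUOTE = rf"%22(?:{QTEXT}|%5C{QPTEXT})*%22"
LOCAL_PART = re.compile(rf"(?:{DOT_ATOM}|{NO_FOLD_QUOTE})")


def is_tag_local_part(text: str) -> bool:
    """True if text matches tag-local-part as transcribed above."""
    return LOCAL_PART.fullmatch(text) is not None


print(is_tag_local_part("timothy"))           # True
print(is_tag_local_part("tim%2Fk"))           # True
print(is_tag_local_part("tim/k"))             # False
print(is_tag_local_part("%22%5Ctimothy%22"))  # True
```

An encoded comma (%2C) is accepted only inside the quoted form, because
"," is a tag-qtext character but not a tag-atext character.
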
> 
> A final comment: there are several ways to write a given email address,
> e.g. timothy@hpl.hp.com = "\timothy"@hpl.hp.com = TIMothy@hpl.hp.com.
> Tags containing those different forms are distinct. We leave that to the
> users.
> 
> As always, the community's comments would be appreciated.
> 
> Cheers,
> 
> Tim.
> 
> -- 
> 
> Tim Kindberg
> hewlett-packard laboratories
> filton road
> stoke gifford
> bristol bs34 8qz
> uk
> 
> purl.org/net/TimKindberg
> timothy@hpl.hp.com
> voice +44 (0)117 312 9920
> fax +44 (0)117 312 8003
> 
> 
> 

Received on Friday, 23 September 2005 15:27:18 UTC