RE: '#' in mailto URIs from noah_mendelsohn@us.ibm.com on 2009-10-14 (public-iri@w3.org from October 2009)

From: <noah_mendelsohn@us.ibm.com>
Date: Wed, 14 Oct 2009 17:54:29 -0400
To: Larry Masinter <masinter@adobe.com>
Cc: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "jwz@jwz.org" <jwz@jwz.org>, "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>, "Michael A. Puls II" <shadow@shadow2531.com>
Message-ID: <OF1F09B6EC.4F1A679D-ON8525764F.006BDBA2-8525764F.00785B2F@lotus.com>
Martin Dürst wrote:

> The text that I might put in (if we think we need some) is:
> 
>  >>>>
> Note that this specification, like any URI scheme specification, does 
> not define syntax or meaning of a fragment identifier, because these 
> depend on the media type of the retrieved resource. In the currently 
> known usage scenarios, a 'mailto' URI does not serve to retrieve a 
> resource with a media type. Therefore, fragment identifiers are 
> meaningless, SHOULD NOT be used on 'mailto' URIs, and SHOULD be ignored 
> upon resolution.
>  >>>>

So, this reminds me of an aspect of RFC 3986 that I find surprising.  It 
says [1] :

>    The fragment identifier component of a URI allows indirect
>    identification of a secondary resource by reference to a primary
>    resource and additional identifying information.  The identified
>    secondary resource may be some portion or subset of the primary
>    resource, some view on representations of the primary resource, or
>    some other resource defined or described by those representations.  A
>    fragment identifier component is indicated by the presence of a
>    number sign ("#") character and terminated by the end of the URI.
> 
>       fragment    = *( pchar / "/" / "?" )
> 
>    The semantics of a fragment identifier are defined by the set of
>    representations that might result from a retrieval action on the
>    primary resource.  The fragment's format and resolution is therefore
>    dependent on the media type [RFC2046] of a potentially retrieved
>    representation, even though such a retrieval is only performed if the
>    URI is dereferenced.  If no such representation exists, then the
>    semantics of the fragment are considered unknown and are effectively
>    unconstrained.  Fragment identifier semantics are independent of the
>    URI scheme and thus cannot be redefined by scheme specifications.
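
As an illustration of the grammar quoted above, here is a minimal Python sketch that splits a URI at the first "#" and checks the fragment against the fragment rule, with pchar expanded as RFC 3986 defines it (unreserved, pct-encoded, sub-delims, ":" and "@"). The function names split_fragment and is_valid_fragment are hypothetical and chosen only for this example; this is a sketch, not a complete URI parser.

```python
import re

# fragment = *( pchar / "/" / "?" )
# pchar    = unreserved / pct-encoded / sub-delims / ":" / "@"
_FRAGMENT_RE = re.compile(
    r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*"
)


def split_fragment(uri):
    """Split at the first '#'; return (primary, fragment or None)."""
    primary, sep, fragment = uri.partition("#")
    return primary, (fragment if sep else None)


def is_valid_fragment(fragment):
    """True if the string matches the RFC 3986 fragment production."""
    return _FRAGMENT_RE.fullmatch(fragment) is not None
```

Note that the check is purely syntactic and scheme-independent: nothing in it knows whether the URI is 'http' or 'mailto', which is consistent with the last sentence of the quoted passage.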


What surprises me in the above is the specific reference to media types. 
If I hadn't read the above, I would have assumed that the Web worked 
something like this:

* Resources are identified with URIs, each of which has a scheme
* For some such URIs, protocols such as HTTP can be used to retrieve 
representations of the resource
* For the representation to be usable, it will typically be necessary for 
the protocol to convey (explicitly or implicitly) the type of each such 
representation.  In the case of HTTP, typing is done using media types 
[RFC 2046], but other protocols may use different typing schemes.

The quote from RFC 3986 seems to imply that media types are the only 
supported typing mechanism for representations, regardless of the protocol 
used for retrieval.  I understand that we are also trying to achieve a 
situation in which fragment identifier resolution is defined with respect 
to the type of the representation, not the URI scheme or retrieval 
protocol.  Still, I would have thought it should say something like:

"The semantics of a fragment identifier are defined by the set of 
representations that might result from a retrieval action on the
primary resource.  The fragment's format and resolution is therefore 
dependent on >the type< of a potentially retrieved representation >(media 
type [RFC2046] in the case of HTTP retrievals)<, even though such a 
retrieval is only performed if the URI is dereferenced."

Martin: given what's in 3986, your specific reference to media type is OK, 
I guess, but it still feels strange to me in the context of mailto.  I 
also find it somewhat more appropriate to speak of retrieving 
representations than retrieving resources.  Therefore, I wonder whether it 
might be a little better to say (changes marked with >...<):

---Proposed---
Note that this specification, like any URI scheme specification, does 
not define syntax or meaning of a fragment identifier, because these 
depend on the >type of a retrieved representation<. In the currently 
known usage scenarios, a 'mailto' URI >cannot be used to retrieve
such representations<. Therefore, fragment identifiers are meaningless,
SHOULD NOT be used on 'mailto' URIs, and SHOULD be ignored upon
resolution.
---End Proposed---
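
To make the "SHOULD be ignored upon resolution" behaviour in the proposed text concrete, here is a minimal Python sketch of a mailto resolver that discards any fragment before extracting recipients and header fields. The function name resolve_mailto and the shape of the returned dictionary are assumptions made for this example, not anything defined by the mailto draft, and the parsing is deliberately simplified.

```python
from urllib.parse import unquote


def resolve_mailto(uri):
    """Return recipients and header fields, ignoring any fragment."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "mailto":
        raise ValueError("not a mailto URI")

    # Proposed behaviour: the fragment is meaningless here, so drop it.
    rest, _, _ignored_fragment = rest.partition("#")

    addr_part, _, query = rest.partition("?")
    recipients = [unquote(a) for a in addr_part.split(",") if a]

    headers = {}
    for pair in query.split("&"):
        if not pair:
            continue
        name, _, value = pair.partition("=")
        headers[unquote(name).lower()] = unquote(value)

    return {"to": recipients, "headers": headers}
```

With this sketch, "mailto:user@example.org?subject=Hello%20there#frag" and the same URI without "#frag" resolve to the same result.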

Noah

[1] http://www.ietf.org/rfc/rfc3986.txt


--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------








Larry Masinter <masinter@adobe.com>
Sent by: uri-request@w3.org
10/14/2009 01:31 PM
 
        To:     "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "Michael A. Puls II" <shadow@shadow2531.com>
        cc:     "jwz@jwz.org" <jwz@jwz.org>, "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>, (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        RE: '#' in mailto URIs


What about encouraging URI/IRI scheme registrations to
say whether fragment identifiers are necessary,
important, useful, or allowed?

mailto: could then disallow # fragment identifiers.

Larry

-----Original Message-----
From: "Martin J. Dürst" [mailto:duerst@it.aoyama.ac.jp] 
Sent: Tuesday, October 13, 2009 9:37 PM
To: Michael A. Puls II
Cc: Larry Masinter; jwz@jwz.org
Subject: Re: '#' in mailto URIs

This is some very old mail. The current mailto: draft doesn't contain 
anything about fragment identifiers. Should it?

The text that I might put in (if we think we need some) is:

 >>>>
Note that this specification, like any URI scheme specification, does 
not define syntax or meaning of a fragment identifier, because these 
depend on the media type of the retrieved resource. In the currently 
known usage scenarios, a 'mailto' URI does not serve to retrieve a 
resource with a media type. Therefore, fragment identifiers are 
meaningless, SHOULD NOT be used on 'mailto' URIs, and SHOULD be ignored 
upon resolution.
 >>>>

Regards,   Martin.

On 2008/04/02 6:32, Michael A. Puls II wrote:
>
> On Tue, 01 Apr 2008 13:18:27 -0400, Larry Masinter <LMM@acm.org> wrote:
>
>>> So, it sounds like, in short, you're saying that Safari and Firefox
>>> shouldn't use # that way because it's reserved for future use in
>>> mailto URIs.
>>>
>>> Perhaps you could explicitly note that in your next draft?
>>
>> It isn't reserved "for future use", it's just not allowed.
>
> Martin said that # is *always* a fragment identifier. If it's not
> allowed, ever, then you're saying that mailto URIs don't support
> fragment identifiers and won't ever support fragment identifiers because
> # is not allowed. (Which would make sense to me)
>
> If that's true, then a raw # that is found in a mailto URI (even though
> it's not allowed) would not be anything special and could just be
> accepted literally (if you were not going to throw an error).
>
> That would make sense to me.
>
> However, if mailto URIs support fragment identifiers or might support
> fragment identifiers in the future, then # and everything after it in the
> URI needs to be ignored (at least by the mail client itself when parsing
> and filling in the compose fields).
>
> What I got from Martin's response is that mailto URIs (like http URIs)
> support fragment identifiers. It's just that no client *currently* makes
> use of them in any way for 'mailto'.
>
> Basically, I just need to be sure what to do with a raw # in a mailto
> URI (even if it's an error).
>
>> Not every possible string has to have an interpretation.
>
> I don't know what you mean by that sentence or what it pertains to.
> Please clarify.
>
> Thanks
>
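
The two readings of a raw "#" discussed in the quoted message can be compared with a small Python sketch. The function names below are hypothetical and the query parsing is simplified; the point is only to show where the two interpretations diverge for a mail client filling in the subject field.

```python
from urllib.parse import unquote


def _subject(query):
    """Return the decoded value of the first 'subject' field, or None."""
    for pair in query.split("&"):
        name, _, value = pair.partition("=")
        if name.lower() == "subject":
            return unquote(value)
    return None


def subject_literal(uri):
    """Reading 1: '#' is not special and is kept as ordinary text."""
    _, _, query = uri.partition("?")
    return _subject(query)


def subject_ignoring_fragment(uri):
    """Reading 2: '#' starts a fragment; it and what follows are dropped."""
    primary, _, _ = uri.partition("#")
    _, _, query = primary.partition("?")
    return _subject(query)
```

For "mailto:user@example.org?subject=Issue%20#42", the first reading yields the subject "Issue #42", while the second yields "Issue " with the "#42" discarded. An author who wants the first result under the second reading would need to write the "#" as "%23".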

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
Received on Wednesday, 14 October 2009 21:55:15 UTC