Re: FragIds in semantic web (ACTION-543) from Jeni Tennison on 2011-04-12 (www-tag@w3.org from April 2011)

From: Jeni Tennison <jeni.tennison@googlemail.com>
Date: Tue, 12 Apr 2011 08:25:13 +0100
To: Noah Mendelsohn <nrm@arcanedomain.com>
Cc: Larry Masinter <masinter@adobe.com>, "www-tag@w3.org List" <www-tag@w3.org>
Message-Id: <AB988B95-5948-480C-B981-A666271F684C@googlemail.com>

Noah,

I think that the phrase you're objecting to is in the first of the paragraphs that I gave, which was an almost direct copy of Larry's original text for the section.

Jeni

On 12 Apr 2011, at 03:28, Noah Mendelsohn wrote:

> Jeni,
> 
> Yves quoted one of the pertinent paragraphs from RFC 3986, but there are more, including:
> 
> "   The semantics of a fragment identifier are defined by the set of
>   representations that might result from a retrieval action on the
>   primary resource.  The fragment's format and resolution is therefore
>   dependent on the media type [RFC2046] of a potentially retrieved
>   representation, even though such a retrieval is only performed if the
>   URI is dereferenced.  If no such representation exists, then the
>   semantics of the fragment are considered unknown and are effectively
>   unconstrained.  Fragment identifier semantics are independent of the
>   URI scheme and thus cannot be redefined by scheme specifications."
> 
> So, I think it's a misleading paraphrase to say "The URL spec glibly noted that "the definition of the fragment identifier meaning depends on the Internet Media Type"", implying that you've made a direct quote. Also, while there's lots to dislike about what RFC 3986 says, there is some nuance and subtlety to it, and I think that needs to be preserved.  The simplification/paraphrase I might buy is along the lines of:
> 
> "In the case where a representation has been retrieved using HTTP, the semantics of a particular fragment identifier are determined by the specification for the media type (i.e. the Content-Type); if that specification does not exist or is silent on fragment interpretation, then the semantics of the fragment identifier are unspecified."
> 
> Noah
> 
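As a rough illustration of the paraphrase above, here is a minimal Python sketch of "semantics are determined by the media type specification, otherwise unspecified". The table `FRAGMENT_SPECS` and the function `fragment_semantics` are hypothetical names invented for this example, and the two entries only echo what the thread says about HTML and XML; they are not an authoritative registry.

```python
from typing import Optional

# Hypothetical lookup table: media type -> how its specification says
# fragment identifiers are interpreted. The entries are illustrative only.
FRAGMENT_SPECS = {
    "text/html": "element identified by a matching #identifier",
    "application/xml": "XPointer-based syntax",
}


def fragment_semantics(content_type: Optional[str]) -> str:
    """Describe how a fragment would be interpreted for a retrieved representation."""
    if content_type is None:
        # No representation was retrieved, so nothing constrains the fragment.
        return "unknown (no representation)"
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # A media type whose specification is missing or silent yields "unspecified".
    return FRAGMENT_SPECS.get(media_type, "unspecified")


print(fragment_semantics("text/html; charset=utf-8"))
print(fragment_semantics("image/png"))
```

The point of the sketch is only that the answer is a function of the Content-Type of whatever came back, not of the URI scheme.
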
> On 4/6/2011 6:18 PM, Jeni Tennison wrote:
>> Hi Larry,
>> 
>> I have an action (ACTION-543: Propose addition to MIME/Web draft to discuss sem-web use of fragids not grounded in media type) to propose some wording to slot into your "MIME and the Web" draft which I'm taking to be the version at:
>> 
>>   http://tools.ietf.org/id/draft-masinter-mime-web-info-02.html
>> 
>> You already have a Section 4.6 (Fragment identifiers) which touches on the issue, so I suggest extending that to read something like:
>> 
>> ---
>>   The Web added the notion of being able to address part of an entity
>>   and not the whole content by adding a 'fragment identifier' to the
>>   URL that addressed the data. Of course, this originally made sense
>>   for the original Web with just HTML, but how would it apply to other
>>   content types? The URL spec glibly noted that "the definition of the
>>   fragment identifier meaning depends on the Internet Media Type", but
>>   unfortunately, few of the Internet Media Type definitions included
>>   this information, and practices diverged greatly.
>> 
>>   Content negotiation becomes extremely difficult when the interpretation
>>   of fragment identifiers depends on the MIME type as there is no
>>   guarantee that the syntax of a fragment identifier that is legal for
>>   one MIME type is also legal (or interpreted in an equivalent way) for
>>   another MIME type. For example, the common `#identifier` syntax for
>>   HTML is not consistent with the XPointer-based syntax defined for XML.
>> 
>>   This is exacerbated in common semantic web practice, which not only
>>   makes heavy use of content negotiation but in which URLs with fragment
>>   identifiers are used to identify real-world Things. In these cases,
>>   the URI as a whole is used to identify the real-world Thing, and the
>>   fragment identifier does not address a part of any entity, so
>>   interpreting the fragment identifier based on the MIME type of whatever
>>   entity happens to be returned does not make sense.
>> ---
>> 
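To show why content negotiation makes this awkward, here is a small Python sketch under stated assumptions. The fragment is not part of the HTTP request, so the client strips it off and keeps it, then has to interpret it against whichever representation is returned. The URI `http://example.org/people#alice`, the helper `split_for_request`, and the two candidate media types are made up for the example; `urldefrag` is the standard library call.

```python
from urllib.parse import urldefrag


def split_for_request(uri: str) -> tuple:
    """Return (URI actually requested, fragment retained by the client)."""
    base, fragment = urldefrag(uri)
    return base, fragment


base, frag = split_for_request("http://example.org/people#alice")

# The server only ever sees `base`. The same fragment must then be read
# under the rules of whichever media type negotiation happens to select.
for negotiated in ("text/html", "application/rdf+xml"):
    print(f"GET {base} -> {negotiated}: interpret #{frag} by {negotiated} rules")
```

Nothing here guarantees that `#alice` means the same thing, or is even legal syntax, in both representations, which is the difficulty the proposed text describes.
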
>> Section 5.1.3 (Fragment identifiers) talks briefly about what might be done about fragment identifiers, stating that the problem is that MIME type definitions don't talk about fragment identifiers. I think the problem goes deeper than that because of the inconsistency of interpretation across media types. I think we might want to do something at the level of the URL specification to guarantee support for simple fragment identifiers (i.e. #identifier) across media types.
>> 
>> 
>> Having read through, I've also got one suggestion and some small editorial fixes.
>> 
>> The one suggestion is to include somewhere a section that describes the 'application/atom+xml' or 'application/schema+json' pattern (introduced in RFC 3023 I believe) in which there's a generic MIME type (application/xml or application/json) for a meta-language and a syntax pattern for MIME types for languages based on that meta-language. Perhaps it might make sense to have lower/fewer hoops to jump through if you're defining a MIME type for a language based on a meta-language. Maybe there are implications of compatibility between the language and the meta-language in sniffing and the interpretation of fragment identifiers that mean the registration needn't be so detailed.
>> 
>> The editorial fixes are:
>> 
>> 1. Introduction: s/are describes./are described./
>> 
>> 2.1 Origins of MIME: s/Message sent from A to B./Message is sent from A to B./
>> 
>> 2.2 Introducing MIME into the Web: s/HTTP have minor/HTTP are minor/
>> 
>> 3.1 Lack of clarity:
>> 
>>   s/its uses, the meaning/its uses, and the meaning/
>>   s/W3C specifications TAG findings and Internet/W3C specifications, TAG findings, and Internet/
>> 
>> It would be good to have some examples of the incorrect assumptions that this paragraph talks about.
>> 
>> 
>> 3.2 Differences between email and Web delivery
>> 
>> Can you clarify for me, in the first bullet point where you say 'GET has no content', is that always the case? I can't see the part of HTTP (1.1 or bis) that says this but suspect that's because I'm missing something.
>> 
>> 
>> 3.3 The Rules Weren't Quite Followed:
>> 
>>   s/that are registration/that are registered/
>>   s/sherperding/shepherding/
>>   s/Orgnaizations/Organizations/
>> 
>> 
>> 4.4 Evolution, Versioning, Forking:
>> 
>>   s/litle/little/
>>   s/try to insure/try to ensure/
>> 
>> 5. Recommendations:
>> 
>>   s/aggreement/agreement/
>>   s/to use of MIME/to the use of MIME/
>> 
>> 5.1.4. Application info: s/section to be clearer/section be clearer/
>> 
>> Hope this is useful,
>> 
>> Jeni
> 
>
Received on Tuesday, 12 April 2011 07:25:43 UTC