Re: FragIds in semantic web (ACTION-543)

Jeni,

Yves quoted one of the pertinent paragraphs from RFC 3986, but there are 
more, including:

"   The semantics of a fragment identifier are defined by the set of
    representations that might result from a retrieval action on the
    primary resource.  The fragment's format and resolution is therefore
    dependent on the media type [RFC2046] of a potentially retrieved
    representation, even though such a retrieval is only performed if the
    URI is dereferenced.  If no such representation exists, then the
    semantics of the fragment are considered unknown and are effectively
    unconstrained.  Fragment identifier semantics are independent of the
    URI scheme and thus cannot be redefined by scheme specifications."

So, I think it's misleading to say 'The URL spec glibly noted that "the 
definition of the fragment identifier meaning depends on the Internet 
Media Type"', because the inner quotation marks imply a direct quote when 
the text is really a paraphrase. Also, while there's lots to dislike about 
what RFC 3986 says, there is some nuance and subtlety to it, and I think 
that needs to be preserved.  The simplification/paraphrase I might buy is 
along the lines of:

"In the case where a representation has been retrieved using HTTP, the 
semantics of a particular fragment identifier are determined by the 
specification for the media type (i.e. the Content-Type); if that 
specification does not exist or is silent on fragment interpretation, then 
the semantics of the fragment identifier are unspecified."
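
To make that paraphrase concrete, here is a rough sketch of the lookup it 
implies. It is an illustration only, not anything taken from RFC 3986 or a 
media type registration: the FRAGMENT_RULES table and the 
fragment_semantics function are hypothetical names, and the two entries 
just echo the HTML and XML examples in the proposed draft text.

```python
from urllib.parse import urldefrag

# Hypothetical, deliberately incomplete table: media type -> how its
# specification says fragment identifiers are interpreted.
# The entries are illustrative assumptions, not normative statements.
FRAGMENT_RULES = {
    "text/html": "element with matching id or anchor name",
    "application/xml": "XPointer-based syntax",
}


def fragment_semantics(uri, content_type):
    """Return (fragment, rule) for a representation retrieved over HTTP.

    The rule is "unspecified" when the media type has no known entry,
    mirroring the case where the specification does not exist or is
    silent on fragment interpretation.
    """
    _, fragment = urldefrag(uri)
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    rule = FRAGMENT_RULES.get(media_type, "unspecified")
    return fragment, rule


if __name__ == "__main__":
    print(fragment_semantics("http://example.org/doc#intro",
                             "text/html; charset=utf-8"))
    print(fragment_semantics("http://example.org/doc#intro",
                             "application/octet-stream"))
```

The same URI gets a different answer depending on which Content-Type 
comes back, which is the dependency on the retrieved representation that 
the RFC text describes.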

Noah

On 4/6/2011 6:18 PM, Jeni Tennison wrote:
> Hi Larry,
>
> I have an action (ACTION-543: Propose addition to MIME/Web draft to discuss sem-web use of fragids not grounded in media type) to propose some wording to slot into your "MIME and the Web" draft which I'm taking to be the version at:
>
>    http://tools.ietf.org/id/draft-masinter-mime-web-info-02.html
>
> You already have a Section 4.6 (Fragment identifiers) which touches on the issue, so I suggest extending that to read something like:
>
> ---
>    The Web added the notion of being able to address part of an entity
>    and not the whole content by adding a 'fragment identifier' to the
>    URL that addressed the data. Of course, this originally made sense
>    for the original Web with just HTML, but how would it apply to other
>    content types? The URL spec glibly noted that "the definition of the
>    fragment identifier meaning depends on the Internet Media Type", but
>    unfortunately, few of the Internet Media Type definitions included
>    this information, and practices diverged greatly.
>
>    Content negotiation becomes extremely difficult when the interpretation
>    of fragment identifiers depends on the MIME type as there is no
>    guarantee that the syntax of a fragment identifier that is legal for
>    one MIME type is also legal (or interpreted in an equivalent way) for
>    another MIME type. For example, the common `#identifier` syntax for
>    HTML is not consistent with the XPointer-based syntax defined for XML.
>
>    This is exacerbated in common semantic web practice, which not only
>    makes heavy use of content negotiation but in which URLs with fragment
>    identifiers are used to identify real-world Things. In these cases,
>    the URI as a whole is used to identify the real-world Thing, and the
>    fragment identifier does not address a part of any entity, so
>    interpreting the fragment identifier based on the MIME type of whatever
>    entity happens to be returned does not make sense.
> ---
>
> Section 5.1.3 (Fragment identifiers) talks briefly about what might be done about fragment identifiers, stating that the problem is that MIME type definitions don't talk about fragment identifiers. I think the problem goes deeper than that because of the inconsistency of interpretation across media types. I think we might want to do something at the level of the URL specification to guarantee support for simple fragment identifiers (i.e. #identifier) across media types.
>
>
> Having read through, I've also got one suggestion and some small editorial fixes.
>
> The one suggestion is to include somewhere a section that describes the 'application/atom+xml' or 'application/schema+json' pattern (introduced in RFC 3023 I believe) in which there's a generic MIME type (application/xml or application/json) for a meta-language and a syntax pattern for MIME types for languages based on that meta-language. Perhaps it might make sense to have lower/fewer hoops to jump through if you're defining a MIME type for a language based on a meta-language. Maybe there are implications of compatibility between the language and the meta-language in sniffing and the interpretation of fragment identifiers that mean the registration needn't be so detailed.
>
> The editorial fixes are:
>
> 1. Introduction: s/are describes./are described./
>
> 2.1 Origins of MIME: s/Message sent from A to B./Message is sent from A to B./
>
> 2.2 Introducing MIME into the Web: s/HTTP have minor/HTTP are minor/
>
> 3.1 Lack of clarity:
>
>    s/its uses, the meaning/its uses, and the meaning/
>    s/W3C specifications TAG findings and Internet/W3C specifications, TAG findings, and Internet/
>
> It would be good to have some examples of the incorrect assumptions that this paragraph talks about.
>
>
> 3.2 Differences between email and Web delivery
>
> Can you clarify for me, in the first bullet point where you say 'GET has no content', is that always the case? I can't see the part of HTTP (1.1 or bis) that says this but suspect that's because I'm missing something.
>
>
> 3.3 The Rules Weren't Quite Followed:
>
>    s/that are registration/that are registered/
>    s/sherperding/shepherding/
>    s/Orgnaizations/Organizations/
>
>
> 4.4 Evolution, Versioning, Forking:
>
>    s/litle/little/
>    s/try to insure/try to ensure/
>
> 5. Recommendations:
>
>    s/aggreement/agreement/
>    s/to use of MIME/to the use of MIME/
>
> 5.1.4. Application info: s/section to be clearer/section be clearer/
>
> Hope this is useful,
>
> Jeni

Received on Tuesday, 12 April 2011 02:29:11 UTC