Re: Regarding terminology of fragid-best-practices from Jeni Tennison on 2012-09-23 (www-tag@w3.org from September 2012)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Sun, 23 Sep 2012 18:00:35 +0100
To: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
Cc: www-tag@w3.org
Message-Id: <8CA67D51-3E25-4C51-B402-79ACA044FA82@jenitennison.com>

Hi Sebastian,

Thank you very much for your comments on http://www.w3.org/TR/fragid-best-practices. I have put together a new editor's draft, which is available at:

  http://www.w3.org/2001/tag/doc/mimeTypesAndFragids-2012-09-23

I have addressed your points as follows:

On 3 Aug 2012, at 02:29, Sebastian Hellmann wrote:
> 1. http://tools.ietf.org/html/rfc2046 clearly defines the terms "media type, subtype, and zero or more optional parameters" . In the current draft it is often unclear, if you refer to type or subtype. "subtype" is not mentioned once.

The draft at http://tools.ietf.org/html/draft-ietf-appsawg-media-type-regs-14 says:

   The mechanism used to label such content is a media type, consisting
   of a top-level type and a subtype, which is further structured into
   trees.  Optionally, media types can define companion data, known as
   parameters.

So I am taking 'media type' to mean something like 'text/html', 'top-level type' to mean something like 'text' and 'subtype' to mean something like 'html'. I have scanned through and changed a few instances of 'top-level media type' to 'top-level type', but I couldn't find anywhere that using the term 'subtype' would make the text clearer. If you have specific examples where you think that term should be used, please let me know.
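As a rough illustration of that terminology, here is a minimal Python sketch that splits a media type string into the parts named above. The names `MediaType` and `split_media_type` are invented for this example and appear in neither draft, and the parsing is deliberately simplified (parameters are just discarded).

```python
from typing import NamedTuple, Optional


class MediaType(NamedTuple):
    top_level_type: str      # e.g. "text"
    subtype: str             # e.g. "html" or "svg+xml"
    suffix: Optional[str]    # e.g. "xml" for "svg+xml", otherwise None


def split_media_type(media_type: str) -> MediaType:
    """Split 'type/subtype' into its named parts, ignoring any parameters."""
    # Drop optional parameters such as "; charset=utf-8".
    essence = media_type.split(";", 1)[0].strip().lower()
    top_level_type, slash, subtype = essence.partition("/")
    if not slash or not top_level_type or not subtype:
        raise ValueError(f"not a media type: {media_type!r}")
    # The structured syntax suffix is whatever follows the last "+".
    _, plus, suffix = subtype.rpartition("+")
    return MediaType(top_level_type, subtype, suffix if plus and suffix else None)
```

Under these assumptions, `split_media_type("text/html")` has top-level type "text", subtype "html" and no suffix, while "image/svg+xml" has subtype "svg+xml" and suffix "xml".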

> 2. The current working draft talks a lot about  "named structured syntax suffix". There is only the IETF draft who uses this expression to explain the registration procedure. It would simplify the working draft, if you would make the name more intuitive such as in "subtype syntax", especially since "syntax" implies "structured" and "suffix" or "registered" implies "named".  The actual "+suffix" is of minor importance here in comparison to the IETF draft, so for me it would make more sense to talk about "subtype syntax definition", rather than the "named structured syntax suffix registration".

Thanks, I agree that the long noun phrases make the document hard to read. I have addressed this by adopting shorter and more common phrases throughout, as much as I could.

> 3. Throughout the document you use fragment identifier "rules", "processing requirements", "constraints", "processing behaviour".  I believe, what you really want to say is, that: "A fragment identifier structure is a defined set of fragment identifier syntax, semantics and pragmatics", the classical trinity of "form", "meaning" and "use".  The level of "meaning" specifies, what the fragment identifier "refers to", "selects" or "denotes" ; the level of "use" defines constraints, behaviour, rules, etc.

Thank you, that's a really helpful way of describing it. I'm not sure about using the term 'pragmatics' to describe 'use', though: I have not heard the term used in that way before, and it doesn't seem to match the linguistic term, at least as described in Wikipedia (http://en.wikipedia.org/wiki/Pragmatics).

> Please see the change:
> "Media type registrations should avoid "inheriting" generic fragment identifier rules from both the top-level type and any structured syntax suffix that they use if the fragment identifier syntaxes defined for these overlap and may provide different meanings for the same fragment identifier. "
> would be:
> "Media *sub*type registrations should avoid "inheriting" fragment identifier semantics and pragmatics from type, subtypes and any subtype syntax definitions that they use if the fragment identifier syntaxes defined for these overlap and may provide inconsistent meanings or conflicting usage for the same fragment identifier."

I have rephrased to:

  Media type registrations should avoid "inheriting" generic fragment identifier 
  semantics and processing requirements from both the top-level type and a 
  structured syntax suffix if the fragment identifier syntaxes defined for these 
  overlap and may provide inconsistent meanings or processing for the same fragment 
  identifier.

I haven't adopted the phrase "media subtype registrations" as I cannot see this phrase used anywhere within the media type registration draft, and I haven't adopted the term 'pragmatics' for the reason given above.

> "Structured syntax suffix registrations should define processor behaviour for fragment identifiers that is consistent with the relevant associated generic media type."
> would become:
> "Subtype syntax definitions should define fragment identifier pragmatics that are consistent with the associated media type." (associated and generic are superfluous)"

In this case I really was referring to structured syntax suffix registrations, as described in http://tools.ietf.org/html/draft-ietf-appsawg-media-type-regs-14#section-6. I have changed to:

  +Suffix registrations should define fragid semantics and processing requirements 
  that are consistent with their associated media type.

> Note, before it was unclear, whether "behaviour" includes "semantics". E.g. it does not make sense to make an XPointer fragment identifier refer to a fragment of an image.

That's an interesting example because, of course, this is exactly what SVG's XPointer scheme does.

> 4. Best Practices 1-2-3-6-7-8 are mostly the same, i.e. saying that syntax, semantics and pragmatics should be  consistent for subtypes regarding subtype syntax and type, i.e. no ambiguity, no uncertainty. Equal syntax should imply equivalent meaning.

Yes.

> ( I must admit, that I was unable to cognitively wrap my head around 7, at all)

I have rephrased to:

  +Suffix registrations should not classify as errors fragids that do not 
  match the defined fragid syntax for the +suffix or that do not resolve 
  to a fragment, or constrain what they identify; instead the +suffix registration 
  should say that such fragids are resolved according to rules in the registration 
  of the vocabulary and may identify anything.

The aim is to advise not to, for example, say that only XPointers could be used with +xml media types.
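To make the intent concrete, here is a small Python sketch of a processor that follows this advice. Everything in it is hypothetical: `SUFFIX_FRAGID` is a stand-in for whatever fragid syntax a +suffix registration defines, `ids` is an assumed lookup table of fragments, and `vocabulary_resolver` represents the rules of the more specific registration.

```python
import re
from typing import Callable, Dict

# Hypothetical stand-in for a +suffix fragid syntax: a bare XML-style name.
SUFFIX_FRAGID = re.compile(r"[A-Za-z_][\w.-]*")


def resolve_fragid(
    fragid: str,
    ids: Dict[str, str],
    vocabulary_resolver: Callable[[str], str],
) -> str:
    """Resolve by the +suffix rules when they apply; otherwise defer."""
    if SUFFIX_FRAGID.fullmatch(fragid) and fragid in ids:
        return ids[fragid]
    # Not an error: the fragid either fails the +suffix syntax or does not
    # resolve to a fragment, so the vocabulary's own rules decide.
    return vocabulary_resolver(fragid)
```

The point of the design is that the function has no error path of its own: a fragid such as "t=10,20" that fails the generic syntax is simply handed on rather than rejected.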

> 5. Section 6: The definitions for "syntax-based" and "semantic" fragment identifier structures are very fuzzy and I am unable to understand them.
> Which category does "plain name fragment identifier" fall into?

Plain name fragids could be either, depending on the definition of the media type. In XML, plain name fragids are syntax-based structures because they address elements within the XML tree. But it would be possible for, say, a music format to use plain name fragment identifiers to name phrases of music.

> Why is xPointer syntax-based? I would assume that addressing parts of the DOM is definitely "application-level understanding of its meaning" .

I guess the ambiguity arises because there are obviously applications at different levels: one application that processes the XML into a DOM and can resolve XPointers against that DOM, and another application that understands that the XML is encoding a financial transaction or a piece of music or whatever, and can resolve fragids against that level of meaning.

I have changed the definition of "semantic fragid structure" to:

  semantic fragid structures provide access to semantic fragments of a document 
  based on the information that it encodes, and may be used across multiple 
  media types that use different syntaxes; media fragment URIs are an example

Does that help? Can you suggest another phrasing?

> What is the main criteria distinguishing both categories? It sounds like the syntax-based is for the subtype syntax and the semantic is for the media type? Maybe you can categorize them by what level they specify. And what about subtypes that do not have a subtype syntax definition, e.g. text/html ?

I think the key is that semantic fragids could be used across content-negotiated (conneg'd) variants of a resource, whereas syntax-based fragids couldn't.

By definition any fragid structure referenced within a +suffix registration has to be syntax-based.

The same isn't true for top-level types. The text/plain fragment identifiers defined in RFC 5147 could be defined for all text/* subtypes, and they would be syntax-based (because they are based on lines and characters). Conversely, media fragment URIs could be defined for all image/*, audio/* and video/* subtypes, but I'd categorise them as semantic because the fragid would still make sense across conneg'd image/audio/video formats. You could imagine a fragid structure for image/audio/video based on bytes rather than areas and timestamps, which would be syntax-based.

Fragid structures that are only usable with one media type would be classified as syntax-based.
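A short Python sketch may show why I call the RFC 5147 fragids syntax-based. It handles only a simplified subset of that syntax (line ranges of the form "line=10,20", with no single positions, character positions or integrity checks), and the function name `select_lines` is invented for this example.

```python
import re
from typing import Optional

# Simplified subset of the RFC 5147 line scheme: "line=<start>,<end>".
LINE_RANGE = re.compile(r"line=(\d*),(\d*)")


def select_lines(text: str, fragid: str) -> Optional[str]:
    """Return the lines a 'line=start,end' fragid selects, or None."""
    match = LINE_RANGE.fullmatch(fragid)
    if match is None or not (match.group(1) or match.group(2)):
        return None  # not a line range this sketch understands
    lines = text.splitlines(keepends=True)
    # Positions count the gaps between lines, starting at 0 before the
    # first line, so "line=10,20" covers the 11th to the 20th line.
    start = int(match.group(1)) if match.group(1) else 0
    end = int(match.group(2)) if match.group(2) else len(lines)
    return "".join(lines[start:end])
```

The result depends entirely on where the line breaks fall in one particular text representation, so the same fragid would select something different, or nothing meaningful, in another variant of the resource. A fragid such as "t=10,20" on a video, by contrast, names the same ten seconds whatever the encoding.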

> I hope, what I wrote is not complete nonsense. I am currently trying to see what is what for myself.

Your comments have been really helpful. Please do get back to me if there are more changes that you'd suggest.

Cheers,

Jeni
-- 
Jeni Tennison
http://www.jenitennison.com
Received on Sunday, 23 September 2012 17:01:08 UTC