Review of versioning: terminology: http://www.w3.org/2001/tag/doc/versioning from Williams, Stuart (HP Labs, Bristol) on 2007-09-13 (www-tag@w3.org from September 2007)

From: Williams, Stuart (HP Labs, Bristol) <skw@hp.com>
Date: Thu, 13 Sep 2007 17:03:52 +0100
To: "David Orchard" <dorchard@bea.com>
Cc: <www-tag@w3.org>
Message-ID: <C4B3FB61F7970A4391A5C10BAA1C3F0DDA9AAD@sdcexc04.emea.cpqcorp.net>
Hello Dave,

Attached below is my review of the terminology part of the versioning
findings [1]. [Tracker: this is ACTION-35 [2]]

The review below is in document order...

I think that there is a lot of good stuff in the document and I think it
is clearer than when I read it before. I think that there are a few
places where it is inconsistent with itself or needs to speak more
rigorously. I think that the defined/accept/unknown set stuff could use
some small examples where the sets being spoken of are actually
elaborated (i.e. a small language example where all the texts of the
language can be enumerated and shown as sets) and then likewise for
intentional variations of the language in order to demonstrate that the
claimed superset/subset relations hold. That however is a lot of work!
and I am not sure that it is worth doing unless we are sure that this
extensional approach is really useful for us.

I'd really like to see reviews from folks that work on the philosophy of
language - they must have been working on this kind of stuff for years
and either have sound works that we can reference or terminology that we
can borrow. I fear that we will get significant feedback that we have
just gone off and invented our own approach to speaking of language that
flies in the face of years of philosophy.

I guess that the core of the terminology definition part of the document
ends roundabout the end of section 1.1.1.1. I think that sections 1.1.2,
1.1.3 and 1.1.4 illustrate the application of our model of a Language
and the language compatibility constraint to: an example language
(PHTML/PCSS); partial understanding (language flavours or subsets);
Divergent Understanding and Compatibility of agents (I think). But they
don't provide rigorous argument because they don't really elaborate the
text sets that are being spoken of.

So... part of me would like the document to stop at the end of 1.1.1.1
because that's really where defining terminology ends; and part of me
would like the document to continue with examples of the terminology in
use, *but* I would like to see examples where the sets being spoken of
get enumerated (or failing that the production rules for testing set
membership are stated) - with the intent of showing that the claimed
sub/superset relations hold (at least in those examples where compatible
or incompatible change is being shown). I think it is hard to make robust
claims about what is being shown without doing it that way.
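
A minimal sketch of the kind of enumerated example asked for here,
written in Python. The toy name language, its token lists and the two
compatibility tests are assumptions made purely for illustration; they
are not taken from the findings [1].

```python
from itertools import product

# Hypothetical toy language: a text is a space-separated run of tokens.
GIVEN = ("Ann", "Bob")
MIDDLE = ("Q",)
FAMILY = ("Lee",)
EXTRA = MIDDLE + ("X",)  # tokens a version 1 consumer tolerates and ignores


def texts(*slots):
    """Enumerate every text obtained by picking one token per slot."""
    return {" ".join(parts) for parts in product(*slots)}


# Version 1 defines "given family"; it accepts one extra token in between.
DEFINED_V1 = texts(GIVEN, FAMILY)
ACCEPT_V1 = DEFINED_V1 | texts(GIVEN, EXTRA, FAMILY)

# Version 2 additionally defines an optional middle name.
DEFINED_V2 = DEFINED_V1 | texts(GIVEN, MIDDLE, FAMILY)


def backward_compatible(old_defined, new_defined):
    """Assumed reading: every old defined text is still a defined text."""
    return old_defined <= new_defined


def forward_compatible(old_accept, new_defined):
    """Assumed reading: every new defined text falls in the old accept set."""
    return new_defined <= old_accept


if __name__ == "__main__":
    print(sorted(DEFINED_V1))
    print(sorted(DEFINED_V2))
    print(backward_compatible(DEFINED_V1, DEFINED_V2))
    print(forward_compatible(ACCEPT_V1, DEFINED_V2))
```

With sets this small the claimed relations can be checked by inspection
as well as by the subset tests.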

Also, I do wonder how much the other two documents need this document as
a foundation to build on... if they make very little use of the
terminology this document seeks to establish - then maybe, whilst it has
been a useful learning vehicle, I am wondering whether it is one that we
need to complete.

On the GPNs toward the end... I haven't really thought about them
deeply... I didn't find them particularly compelling, but equally I don't
have particular objection to them either.


Stuart
--
[1] http://www.w3.org/2001/tag/doc/versioning
[2] http://www.w3.org/2001/tag/group/track/actions/35
-- 

TOC/Layout:
----------

All the action of the document is 'compressed' into section 1.1 of the
document (and its subsections). Please reorganise the structure for more
breadth and less depth (in terms of the content hierarchy, which looks
unbalanced (IMO)).

--

1.1 Terminology
---------------
Drop the apologetic language about names being potentially a poor choice
of example.

--
Suggest replace:
	"Defn: An *Act of Production* is the creation of text"

with
	"Defn: An *Act of Production* is an act of creating a text or a
language"

--
Suggest replace:
	"Defn: An *Act of Consumption* is the processing of text of a
language"
with
	"Defn: An *Act of Consumption* is an act of processing a text of
a language."
--

Suggest replace:
	"Defn: A *Language* ..."
with:
	"Defn: A *Language* consists of a set of admissible text; a set
of constraints on the information expressible in the language; and a
mapping between admissible texts and the information they are intended
to convey"

FWIW: I am not sure that I like this definition. It is not clear what
"constraints on the information" refers to (an example of such a
constraint might help). Also, I am not at all sure how one characterises
the information conveyed in a text of a language, or how one tests for
similarity of information (as opposed to similarity of syntax). If
'information conveyed' is a way of saying 'meaning' I think that
language philosophers will have problems about there being some
objective meaning of a text that can be compared. Pat Hayes has earlier
referred us to some writings by Quine on this topic - which are
somewhat disparaging about working with the meaning of texts (IIRC).

--

I do not find the definition of *component* at all clear or appealing.
It feels like a component is some production in, say, the grammar
rules of a language...

The phrase "...that is a sub-part of the texts and carries a specific
meaning." is difficult to comprehend and (ironically) its own meaning
seems far from specific.

Standing back a bit: I think this is evidence of a tension between
defining the syntax of a language in an extensional way (enumerating a
set of admissible texts) and defining it in an intensional way (a set of
production rules for creating 'components' and for combining component
instances to create texts). Speaking of a "sub-part" of a text is the
piece that makes the above phrase difficult because at one level one
might be thinking of sub-setting the text-set (however the texts of the
component may not themselves be members of the language text-set in and
of themselves).

--

Non-Definition:

[Definition: A language may have a syntax that determines the set of
strings in a language.] does not seem to me to be a definition.

--

Definition of binding: is this a necessary definition - it is again in
the area of trying to couple texts with meaning (taking information as a
surrogate for meaning).

I think that this notion of 'binding' probably has some relation to the
concept of Interpretation in a model-theory which contains (amongst
other structures) the notion of a mapping from terms in a vocabulary to
things in a domain of discourse.

--

[Definition: A language is *Extensible* if the syntax of a language
allows information that is not defined in the current version of the
language.]

I have a little trouble with this definition. I expected it to say
something like [Definition: (The syntax of) a language is extensible if
it leaves open clear extension points where new terms (or production
rules or...) may be added - note that the extended language may be more
closed than the language that was being extended because the extension
restricts some of what may be added at the extension point].

The current definition speaks of 'syntax allowing (the conveyance?) of
information' not currently defined in the current language version. I
think that you could speak more specifically about an extensible syntax.

The middle name example could be expressed entirely in terms of
syntactic extension.

--

1.1.1  Compatibility
--------------------
I've mentioned this before: "Backward compatibility means that a newer
version of a consumer can be rolled out in a way that does not break
existing producers." This assumes a closed loop exchange between
producer and consumers - otherwise it is not possible for the mere
act-of-consumption to in some sense break the producer. The producer
only 'breaks' if either it is incapable of 'consuming' a new response or
the revised consumer fails to honour some expectation of the producer
entailed in the text that it sent... however, producer/consumer roles
sort of switch with respect to responses and the narrative so far has really been
about singleton producers and consumers rather than agents that
variously adopt producing and consuming roles in respect of the exchange
of messages in some higher level of message exchange protocol.

--

Example 2:

I find the diagram at ex 2 confusing. In particular, it is not at all
clear whether the compatibility relation is a relation between agents
(consumers or producers) or language versions. I would find the
document easier to read/understand if it was entirely about
compatibility/versioning of language definitions as a primary topic, and
the consequences for consumers/producers of forward and backward
compatible language changes.

I do not know what the arcs in the diagram are intended to signify.

--

A Positive comment!!
--------------------
The definitions of the defined, accept and unknown text sets at least
make it clear what each set is and its intention.

--

The intention of the definitions of strict and full compatibility also
now seems clear - though they are expressed again in terms of
compatibility between agents rather than as a relation between language
definitions (which I suppose is the best you can do... but I continue to
find the switching between language compatibility and agent
compatibility confusing).

--

Mathematical definitions:

5th bullet: "Text T is "full compatible" with language L2..." now
compatibility is a relation between a text and a language!  

Now (by example) we have all of: compat(language:language),
compat(agent:agent), compat(text:language)

6th bullet seems to be stated backward in that I2 arises *after* I1 at
least in an intentional sense. oh... and now we have
compat(information:information). I1 I2 ordering cascades backward to 5th
bullet.

Ok... the language compatibility defns (7 bullets beginning either L1,
L2 or And) seem to make sense and compat(text:language) and
compat(information:information) seem to be intermediate technical terms
on the scaffolding that gets to defn's of language compatibility.

Alternate defns includes: L2 is "full strictly" compatible with L1 -
wasn't expecting to see "fully" and "strictly" combined in this way.

--

1.1.1.1 Composition
-------------------

With the definition of text-sets as given, the composition of text sets
for the compound language is much more complicated than set unions of
the accept sets of the component languages. e.g. a text from the defined
set of the Name language is *very* unlikely (on its own) to be a member
of the accept set of the PO language.
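
A small illustration of this point, as a Python sketch. The element
names and the way a PO text wraps a name and an item are hypothetical,
and for brevity each toy language accepts exactly what it defines.

```python
# Hypothetical component languages, each enumerated in full.
NAME_ACCEPT = {"<name>Ann</name>", "<name>Bob</name>"}
ITEM_ACCEPT = {"<item>pen</item>"}


def po_texts(names, items):
    """A PO text embeds one name text and one item text in a wrapper."""
    return {f"<po>{n}{i}</po>" for n in names for i in items}


# The compound language is built from the components by construction...
PO_ACCEPT = po_texts(NAME_ACCEPT, ITEM_ACCEPT)

# ...which is not the same thing as the union of the component sets.
NAIVE_UNION = NAME_ACCEPT | ITEM_ACCEPT

if __name__ == "__main__":
    print(sorted(PO_ACCEPT))
    print(PO_ACCEPT == NAIVE_UNION)          # False
    print(PO_ACCEPT.isdisjoint(NAME_ACCEPT))  # True: no bare name is a PO
```

In this toy case the compound accept set and the union of the component
accept sets have no text in common at all.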

--

BANANNA Example

para below Example 6:
"...and all of the legal strings (texts) in that new language are in
it's Defined Text set -- the Accept Text set is the empty set..." 

The Accept Text set is a superset of the Defined Text set and cannot
therefore be empty. (suggests a more global difference in the use of
accept and defined set terminology in this section from the rest of the
document).

--

1.1.3 Partial understanding
---------------------------
2nd para: "The Defined Text set for the Given Name Language is given"
As stated this is just plain wrong... what is 'given' I assume is
some production rule that constrains what can show up at a particular
place in the document. The defined text set of the given language is the
set of all texts that satisfy its syntactic definition - and in some
sense is the extension (an infinitely large extension) of the syntactic
defn of the language. The literal text 'given' is almost certainly *not*
the Defined Set of this language.

Likewise "the information set of the Given Language contains just
given."

--


1.1.3.1 Agents that are Consumers and Producers.
-----------------------------------------------

The tail end of this speaks in terms of revised consumers and producers
rather than constraints on revisions to language definitions for the
messages exchanged by system components.

i.e. the language in one direction must undergo a backward compatible
change, whilst the language in the reverse direction must undergo a
forward compatible change (otherwise it ceases to be possible to update
the agent at at least one end of the interaction).
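
One way to read this constraint, sketched in Python. The request and
response texts are invented, and the subset tests are an assumed
reading of backward and forward compatibility rather than the
definitions given in the findings.

```python
def can_update_one_end(old_req_defined, new_req_accept,
                       new_resp_defined, old_resp_accept):
    """Check whether the agent at one end can be revised on its own.

    The revised agent consumes requests and produces responses, so:
    - requests: everything the old peer may send must still be accepted
      (a backward compatible change to the request language);
    - responses: everything the revised agent may send must be accepted
      by the old peer (a forward compatible change to the response
      language).
    """
    requests_ok = old_req_defined <= new_req_accept
    responses_ok = new_resp_defined <= old_resp_accept
    return requests_ok and responses_ok


# Hypothetical enumerated text sets for each direction.
OLD_REQ_DEFINED = {"get a", "get b"}
NEW_REQ_ACCEPT = {"get a", "get b", "get a fresh"}
OLD_RESP_ACCEPT = {"ok", "ok note", "ok other"}
NEW_RESP_DEFINED = {"ok", "ok note"}
NEW_RESP_DEFINED_BAD = {"ok", "moved"}  # "moved" is unknown to the old peer

if __name__ == "__main__":
    print(can_update_one_end(OLD_REQ_DEFINED, NEW_REQ_ACCEPT,
                             NEW_RESP_DEFINED, OLD_RESP_ACCEPT))
    print(can_update_one_end(OLD_REQ_DEFINED, NEW_REQ_ACCEPT,
                             NEW_RESP_DEFINED_BAD, OLD_RESP_ACCEPT))
```

If either test fails, the two ends have to be revised together.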

--

1.1.4 Divergent Understanding and Compatibility.
------------------------------------------------

It might be worth noting that in the TagSoup case different agents map
the same text in the Unknown Set to *different* texts in the defined set
in order to interpret them. Hixie has been showing this in terms of the
different DOMs that different agents build for TagSoup HTML (which at
least roughly corresponds to different fix-ups before processing to
create a DOM).
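
A toy rendering of that observation in Python. The two fix-up tables
are hypothetical and do not describe the behaviour of any real
browser; they only show two agents sending one Unknown Set text to
different Defined Set texts.

```python
# Hypothetical Defined Set: only properly nested markup is defined.
DEFINED = {
    "<b><i>x</i></b>",
    "<b><i>x</i></b><i></i>",
}

SOUP = "<b><i>x</b></i>"  # mis-nested, so it falls in the Unknown Set

# Each agent has its own fix-up from unknown texts to defined texts.
FIXUP_AGENT_A = {SOUP: "<b><i>x</i></b>"}
FIXUP_AGENT_B = {SOUP: "<b><i>x</i></b><i></i>"}


def interpret(text, fixups):
    """Return the defined text that an agent actually processes."""
    if text in DEFINED:
        return text
    return fixups[text]


if __name__ == "__main__":
    a = interpret(SOUP, FIXUP_AGENT_A)
    b = interpret(SOUP, FIXUP_AGENT_B)
    print(a)
    print(b)
    print(a == b)  # False: same input text, divergent understanding
```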







Received on Thursday, 13 September 2007 16:06:38 UTC