- From: Harry Halpin <hhalpin@ibiblio.org>
- Date: Fri, 08 Sep 2006 16:30:58 +0100
- To: David Orchard <dorchard@bea.com>
- Cc: noah_mendelsohn@us.ibm.com, www-tag@w3.org
I just went through David's finding. Wow, that was a *lot* of work, and I think David did an admirable job with a host of difficult issues. Yet I think that to a large extent David and Noah are *both* right (as I explain in the next paragraph), although David is thinking more in line with philosophers and Noah more in line with computer scientists. However, I'd like to make one suggestion: while I think the intuitions of the TAG are right and very informative, producing a *formalization of these intuitions* and producing a usable and readable document that your average corporation or even RDF/XML hacker can use might be two separate goals, and perhaps the "Versioning Primer" should be a separate document from "Versioning Semantics". I think most of David's document should stay as is, as a good introduction, and the troublesome bits should be further hashed out in a separate, perhaps formal, document ("Versioning Semantics" or something).

From the standpoint of theoretical computer science, one usually starts with the notion of an alphabet, which is a finite set of symbols. Then you have "strings", which are finite sequences of symbols from the alphabet, and a language is any set of strings over the alphabet - which means a language can contain an *infinite number* of strings. This is important, since you then need to define the language using a grammar (a la constraints): production rules, regular expressions, and so on. So Noah and Dan are sort of both right. Languages are usually thought of as "sets of strings", but since those sets can be infinite, we need a finite representation in terms of some constraints, just as an XML Schema can validate (sort of define) an infinite number of actual XML documents.
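
To make the sets-versus-constraints point concrete, here is a minimal Python sketch built on the two examples David uses in the message quoted below: a colour language whose texts can simply be listed, and a Name language whose texts cannot. None of this comes from the finding itself. The pattern NAME (two space-separated runs of ASCII letters) and both function names are assumptions made purely for illustration.

```python
import re

# Finite language: the set of texts can simply be enumerated.
COLOURS = {"Red", "Green", "Blue"}

# Infinite language: hypothetical "Name" texts of the form "<given> <family>",
# each part a non-empty run of ASCII letters. This pattern is an assumption;
# it is a finite constraint standing in for an infinite set of strings.
NAME = re.compile(r"[A-Za-z]+ [A-Za-z]+")


def in_colour_language(text: str) -> bool:
    """Membership decided by lookup in the listed set of texts."""
    return text in COLOURS


def in_name_language(text: str) -> bool:
    """Membership decided by evaluating the constraint.

    The set of all valid texts is never generated.
    """
    return NAME.fullmatch(text) is not None
```

Both functions answer the same question ("is this text in the language?"), which is the sense in which the set view and the constraint view describe one thing: the constraint is just the only practical way to write the set down once it is infinite.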

I sort of like the text->information division, even though it's a bit untraditional. I think you're trying to get at something a bit different from the language->interpretation/denotation division traditionally made in semantics, so this new terminology has to be *carefully* connected to semantics, or normalized with the terminology used in logic, math, and everywhere else, as Pat pointed out. I'll give it a go at some point, but then again, this sort of connection might be better made in a separate document. Using the terms "information" and "constraints" is not unheard of, though: it sounds very like "situation semantics" in flavor [1], and it might take some real work to map this out formally. There are still a few live situation semanticists floating about, but none of them are very Webby. Robin Cooper comes to mind [2].

[1] http://www-csli.stanford.edu/~john/PHILPAPERS/sitsem.pdf
[2] http://www.ling.gu.se/~cooper/

David Orchard wrote:
> Noah,
>
> Part 1 of ? Parts.
>
> I've gone through your comments. Thanks again for doing some extensive
> reviewing. The comments that you made did not substantially conflict
> with much of the work that I had done, which is goodness.
>
> I'm going to respond by quoting sections that you wrote followed by my
> comments. I hope that's the best way to work through the comments.
>
> NM>>>
> The finding claims that constraints are part of the language. I'm not
> convinced that's a good formulation, since the constraints are embodied
> in the set of texts & mappings. Stated differently, I think we're
> confusing a "language" with "the specification of a language", and those
> are very different. So, I think a language should be a set of texts and
> their interpretation as information, and I am very happy with the way
> you present that much.
>
> I think we should have separate sections that talk about managing the
> specifications for languages as they evolve, and certainly constraint
> languages like XML Schema are among the good tools for writing
> specifications. It's OK to talk about keeping a language and its
> specification in sync and to talk about constraint language features
> that facilitate versioning. I don't think the constraints are the
> language. I think they are emergent properties of the language that can
> sometimes be usefully set down in mathematical and/or machine readable
> notation such as regex's or XML Schemas. This is an important
> distinction on which I disagree with the finding as drafted.
> <<
>
> I'm not persuaded that a language doesn't include constraints on the
> language. I think the key part is that the set of texts may be
> determined by the constraints. Using one of your favourite examples, if
> I create a language that has Red, Green, Blue, there we have listed the
> texts.
>
> But one of my favourite examples is the Name language, which has given
> and family, and those are simply strings. Whether Aaaaa and Aaaa and
> Aaa and Aa are part of the language didn't even occur to me until I
> wrote this. When a processor determines whether a text is in the
> language, it doesn't generate all the texts "in hand" and then compare;
> it will look at the constraints and evaluate without having all the
> texts in hand. I think any constraints are fundamentally part of the
> language.
>
> It seems to me that in some languages, membership is determined by having
> the set of texts, and in others the set of texts can be generated from
> the constraints. So, can we come up with a modelling mechanism that
> allows a language to refer to one thing, rather than the 2 that it
> currently does (texts and constraints)? Perhaps this was what the
> "membership" bucket was an attempt to model.
>
> Now I could flip this around and suggest we should go the opposite way
> and suggest removing Text Set and Information Set: languages have
> semantics and syntax, and texts are in a language if they meet the syntax.
> Languages also have a mapping between any individual text and an
> individual information "item".
>
> I thought about breaking the relationship between language and syntax,
> leaving syntax just connected to text set. If I squint hard enough, I
> can see that could work. But I think that doesn't pass the common view
> of language, which is that language is directly related to syntax rather
> than indirectly via text set; see my name syntax example.
>
> NM>>
> I think we can and should do better in telling a story about whether a
> particular text is compatible as interpreted in L1 or L2, vs. the senses
> in which languages L1 and L2 as a whole are compatible. I think the
> story I would tell would be along the lines of:
>
> Of a particular text written per language L1 and interpreted per
> language L2: "Let I1 be the information conveyed by Text T per language
> L1. Text T is "fully compatible" with language L2 if and only if when
> interpreted per language L2 to yield I2, I1 is the same as I2. Text T is
> "incompatible" if any of the information in I2 is wrong (i.e. was not
> present in I1 or replaces a value in I1 with a different one...this
> rule disallows additional information, because only the information in
> I1 is what the sender thought they were conveying, so anything else is
> at best correct accidentally). There are also intermediate notions of
> compatibility: e.g. it may be that all of the information in I2 is
> correct, but that I2 is a subset of I1. [Not sure whether we should name
> some of these intermediate flavors, but if we do, they should be defined
> precisely.]
>
> Of languages L1 and L2: We say that language L2 is "fully backward
> compatible" with L1 if every text in L1 is fully compatible with L2.
> We say that language L1 is "backwards incompatible" with L2 if any text
> in L1 is incompatible with L2. We say that Language L1 is "fully forwards
> compatible" with L2 if every text in L2 is fully compatible with L1. We
> say that L2 is "forwards incompatible with L1" if any text in L2 is
> incompatible with L1. As with texts, there may be intermediate notions
> of language compatibility for which we do not [or maybe we should?]
> provide names here.
>
> That all seems pretty simple and clean to me, and I think it's a firm
> foundation for much of the rest of the analysis. Notice that it seems
> natural to leave out discussion of the constraints in this layer; the
> story gets simpler without them. The current draft seems to me a bit
> loose in both talking about and defining issues for languages as a whole
> vs. for individual texts.
> <<
>
> I have been moving towards this space as well in my examination of
> partial understanding. It is yet another example of this "class" vs
> "instance" that seems to always come up in modeling and system design.
>
> But I still disagree with the removal of syntax. I could easily suggest
> somewhat alternate wording that makes use of syntax and makes sense to
> me. "Of languages L1 and L2: We say that language L2 is "fully backward
> compatible" with L1 if every text valid under L1's constraints is fully
> compatible with L2". I could even push it further and define the
> Syntactic constraints, S1 and S2, then rephrase as "We say that language
> L2 is "fully backward compatible" with L1 if every text valid under S1
> ..."
>
> NM>>
> Clarify focus on texts vs. documents
> ...
> <<
>
> I agree. I've inserted part of one of your paragraphs.
>
> My comments on your comments end.
>
>> -----Original Message-----
>> From: noah_mendelsohn@us.ibm.com [mailto:noah_mendelsohn@us.ibm.com]
>> Sent: Monday, August 28, 2006 4:22 PM
>> To: David Orchard
>> Cc: www-tag@w3.org
>> Subject: Noah Mendelsohn Comments on July 26 Draft of TAG
>> Versioning Finding
>>
>> First of all, thanks again to Dave for the truly heroic work
>> on the versioning finding. This problem is as tough as they
>> get IMO, and I think the drafts are making really steady
>> progress. Still, as I've mentioned on a number of
>> teleconferences, I have a number of concerns regarding the
>> conceptual layering in the draft versioning finding, and some
>> suggestions that I think will make it cleaner and more
>> effective. Dan Connolly made the very good point that it is
>> really only appropriate to raise such concerns in the context
>> of a detailed review of what has already been
>> drafted. So, I've tried to do that.
>>
>> A copy of my annotated version of the July 26 draft is
>> attached. I've taken quite a bit of trouble over these
>> comments, which are quite extensive, and while I'm sure that
>> they will prove to be only partly on the right track, I hope
>> they will get a detailed review not just from Dave
>> but also from other concerned TAG members. Anyway, what
>> I've done is to take Dave's July 26th draft and add comments
>> marked up using CSS highlighting. These are in two main groups:
>>
>> 1) An introductory section sets out some of the main
>> architectural issues and ideas that I've been trying to
>> convey. I don't expect these will seem entirely justified
>> until you read the rest of the comments (if then), but I
>> think it's important to collect the significant ideas, and to
>> separate them from the smaller editorial suggestions.
>>
>> 2) I've gone through about the first third of Dave's draft,
>> inserting detailed comments.
>> Some of these are purely
>> editorial, but most of them are aimed at motivating and
>> highlighting the concerns that led me to propose the major
>> points in that introductory chapter. Indeed, I've tried to
>> hyperlink back from the running comments to the larger
>> points, as I think that helps to motivate them.
>>
>> No editor working on a large draft entirely welcomes
>> voluminous comments, especially ones that have structural
>> implications. Dave: I truly hope this is ultimately useful,
>> and I look forward to working with you on it.
>> Where possible, I've tried to suggest text fragments you can
>> steal if you like them. I actually am fairly excited,
>> because working through Dave's draft has helped me to
>> crystalize a number of things about versioning in my own
>> mind. I think we're well along to telling a story that's
>> very clean, very nicely layered, and perhaps a bit simpler
>> and shorter than the current draft suggests. I don't think
>> it involves throwing out vast swaths of what Dave has
>> drafted, so much as cleaning up and very carefully relayering
>> some concepts.
>>
>> BTW: I will be around on and off until about Wed. afternoon, then gone
>> until after US Labor Day weekend. Thanks again, Dave.
>> Really nice work!
>>
>> Noah
>>
>> [1] http://www.w3.org/2001/tag/doc/versioning-20060726.html
>>
>> --------------------------------------
>> Noah Mendelsohn
>> IBM Corporation
>> One Rogers Street
>> Cambridge, MA 02142
>> 1-617-693-4036
>> --------------------------------------
>>
>

--
-harry

Harry Halpin, University of Edinburgh
http://www.ibiblio.org/hhalpin 6B522426
Received on Friday, 8 September 2006 15:31:18 UTC