- From: <noah_mendelsohn@us.ibm.com>
- Date: Wed, 20 Jun 2007 12:06:52 -0400
- To: www-tag@w3.org, Tim Berners-Lee <timbl@w3.org>
- Cc: "David Orchard" <dorchard@bea.com>
During our discussions of versioning at the June 2007 TAG F2F I raised concerns about our notions of "defined text set" and "accept text set" [1]. If you're reading the minutes at [2], look for the bit that begins: "NM: I think by the way that the defined set and the accept set is less useful than we thought...".

At the meeting, I challenged the group with a sketch of an example, asking how it would be handled by the defined-set/accept-set model. I didn't hear an answer that satisfied me in the room, but Tim and I happened to sit together on the flight home. He expressed to me some support for the defined-set/accept-set formulation, and he suggested how he would apply it to the example I had in mind. The purpose of this note is to record the example and Tim's explanation, as I understood it, and then to comment a bit. I assume he'll correct any misunderstandings on my part.

The Example Language: PHTML
---------------------------

The example is motivated by HTML, as styled by CSS. To avoid ratholes relating to particular historical details of those languages, I'll here use two mythical languages, PHTML and PCSS (pretend HTML and pretend CSS), pertinent details of which are as follows:

Assume that PHTML is a language of well formed XML documents that must have a root tag <PHTML>. A few HTML-like tags, such as <P> for paragraph and <BODY> for body, are defined in the PHTML version 1 specification. As with HTML, PHTML allows for the appearance of arbitrary tags such as <BANANA> not named explicitly in the specification; it defines any document containing such an extension tag to have the same semantics as a similar document from which that tag has been deleted. Thus, per the PHTML spec., the following two documents have the same meaning:

doc1.phtml:

    <PHTML>
      <BODY>
        <P>Versioning is hard.</P>
      </BODY>
    </PHTML>

-and-

doc2.phtml:

    <PHTML>
      <BODY>
        <BANANA>
          <P>Versioning is hard.</P>
        </BANANA>
      </BODY>
    </PHTML>

This is basically the example language we discussed at the F2F.
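As a rough illustration of the "delete the extension tag" rule, here is a minimal Python sketch. It is only a toy model under stated assumptions: the set of known tags, and the helper names `content`, `meaning` and `in_defined_set`, are invented for this illustration and come from no specification. A document counts as being in the defined set when every tag in it is a known one, and two documents mean the same thing when they agree once unknown tags are deleted and their content kept.

```python
import xml.etree.ElementTree as ET

# Assumption: the tags defined by the (mythical) PHTML version 1 spec.
KNOWN_TAGS = {"PHTML", "BODY", "P"}

def content(elem):
    """Return elem's content as a list of text strings and (tag, content)
    tuples, with every unknown tag deleted but its content kept."""
    items = []
    if elem.text and elem.text.strip():
        items.append(elem.text.strip())
    for child in elem:
        if child.tag in KNOWN_TAGS:
            items.append((child.tag, content(child)))
        else:
            # Extension tag: drop the tag itself, splice its content in place.
            items.extend(content(child))
        if child.tail and child.tail.strip():
            items.append(child.tail.strip())
    return items

def meaning(text):
    """Toy 'meaning' of a PHTML document: its tree with extension tags removed."""
    root = ET.fromstring(text)
    if root.tag != "PHTML":
        raise ValueError("root tag must be <PHTML>")
    return (root.tag, content(root))

def in_defined_set(text):
    """True if every tag in the document is defined directly by the spec."""
    return all(e.tag in KNOWN_TAGS for e in ET.fromstring(text).iter())

DOC1 = "<PHTML><BODY><P>Versioning is hard.</P></BODY></PHTML>"
DOC2 = ("<PHTML><BODY><BANANA><P>Versioning is hard.</P>"
        "</BANANA></BODY></PHTML>")

print(meaning(DOC1) == meaning(DOC2))               # same meaning?
print(in_defined_set(DOC1), in_defined_set(DOC2))   # defined set membership
```

Under these assumptions doc1.phtml and doc2.phtml get the same meaning, even though only doc1.phtml consists entirely of tags the specification names.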
(The application of CSS to this language is discussed later in this note.)

Application of defined text sets and accept text sets to PHTML
--------------------------------------------------------------

The invariant in the defined-set/accept-set formulation is that every document in the accept-set conveys the same meaning as some particular document in the defined set. doc1.phtml above is in the defined set for PHTML, because all of its content has a meaning supplied directly by the specification. doc2.phtml is in the accept set; that document too is in the PHTML language, but the semantics of doc2.phtml are defined by means of its equivalence to a defined-set document, doc1.phtml.

So far, so good. All of this makes sense to me. Defined sets and accept sets do a good job of explaining this extensibility.

The challenge
-------------

Now we come to the interesting part of the example. We allow our pretend CSS language to style the markup in PHTML documents, and crucially, the styles can be applied to <BANANA> elements as well as to paragraphs.

    <PHTML>
      <HEAD>
        <STYLE type="text/pcss">
          P {font-size: 120%}
          BANANA {color:yellow}
        </STYLE>
      </HEAD>
      <BODY>
        <BANANA>
          <P>Versioning is hard.</P>
        </BANANA>
      </BODY>
    </PHTML>

The paragraph will have a large font and will be yellow. My challenge to the TAG was: how do defined and accept sets explain this sort of extensibility? (Note that the equivalent is allowed for real CSS applied to real HTML.)

My understanding of Tim's preferred answer is:

"PHTML as redefined by PCSS is a different language than PHTML on its own, and all of the legal strings (texts) in that new language are in its defined set -- the accept set is the empty set. The PCSS specification is the one that gives a non-vacuous meaning to <BANANA> elements, and indeed in the presence of PCSS, a document with a <BANANA> is no longer equivalent to one without.
Thus, according to PHTML as redefined by PCSS, all PHTML documents are in the defined set, and none are in the accept set."

Some Comments
-------------

I found this analysis to be tremendously helpful. It certainly meets my challenge at the F2F, which was to show how the defined-set/accept-set model can be coherently applied to this example. I hope Tim can confirm that I've correctly captured the essence of his analysis, and I'll be very curious to see whether others who've been advocating the defined-set/accept-set approach would apply it the same way.

As to my own position, I want to give it some more thought. My initial concern (Dave and I discussed this at great length at dinner in Mountain View) was that I didn't really see how to apply the concepts to my example, and Tim has resolved that concern. What remains is a worry that by putting everything into the defined set, our versioning model is no longer saying much about the sense in which the PHTML+PCSS V1 language is indeed extensible. When PHTML+PCSS version 2 comes along and defines semantics for some elements like <BANANA>, I'm not sure how the model's going to help us explain what happened, because everything was in the V1 defined set to begin with.

Many languages in fact provide nontrivial default semantics for their extension content -- our mythical PHTML/PCSS allowed the extension content to be styled, but a PDOM might well have allowed scripts to address the extensions, and many, many non-HTML languages provide interesting default semantics for extension content (store it, print it, etc.). I think these are really interesting use cases, and I'll be disappointed if our formal models don't explain them well.

Nonetheless, I think Tim and others are making a strong point that, in the interesting sub-case where extension content is truly and completely ignored, the defined-set/accept-set model gives a nice, clean, set-oriented explanation. I buy that.

So, I think this is progress.
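For anyone who wants to experiment with the PCSS challenge discussed earlier in this note, here is a minimal Python sketch of why a <BANANA> stops being ignorable once styling is in play. It is a toy model, not real CSS: I assume tag-only selectors and that every property is inherited by descendants, and the names `RULES` and `paragraph_styles` are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: a toy PCSS with tag selectors only, in which every
# property set on an element is inherited by its descendants.
RULES = {"P": {"font-size": "120%"}, "BANANA": {"color": "yellow"}}

def paragraph_styles(text, rules=RULES):
    """Return the computed style of each <P>, in document order."""
    found = []

    def walk(elem, inherited):
        # An element's own rules override whatever it inherits.
        style = {**inherited, **rules.get(elem.tag, {})}
        if elem.tag == "P":
            found.append(style)
        for child in elem:
            walk(child, style)

    walk(ET.fromstring(text), {})
    return found

WITH_BANANA = ("<PHTML><BODY><BANANA><P>Versioning is hard.</P>"
               "</BANANA></BODY></PHTML>")
WITHOUT_BANANA = "<PHTML><BODY><P>Versioning is hard.</P></BODY></PHTML>"

print(paragraph_styles(WITH_BANANA))
print(paragraph_styles(WITHOUT_BANANA))
```

In this toy model the paragraph picks up a color only when the <BANANA> wrapper is present, so deleting the extension tag changes the result, and the two documents can no longer be treated as equivalent.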
Thanks to Tim for being patient in working through this with me.

Noah

[1] http://www.w3.org/2001/tag/doc/versioning-20070518
[2] http://www.w3.org/2001/tag/2007/05/30-minutes#item06

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Wednesday, 20 June 2007 16:06:40 UTC