RE: Defined sets, accept sets, and <banana> elements

Tremendous!  This is wonderful.  Do you think this would be worthwhile
to add to the versioning work, or is it extra material that takes away
from the larger picture?  I could add it using this example or a
different one.

Another idea is that we could do this as a standalone "micro-finding"
that would be linked from the other finding(s).

I'm happy to do whatever editorial work the group agrees on.  My
proposal is that this should be a standalone "micro-finding" that would
be linked, e.g.: "For an example of defined text set and accept text set
applied to an HTML extension, see @@".

Cheers,
Dave

> -----Original Message-----
> From: noah_mendelsohn@us.ibm.com [mailto:noah_mendelsohn@us.ibm.com] 
> Sent: Wednesday, June 20, 2007 9:07 AM
> To: www-tag@w3.org; Tim Berners-Lee
> Cc: David Orchard
> Subject: Defined sets, accept sets, and <banana> elements
> 
> During our discussions of versioning at the June 2007 TAG F2F 
> I raised concerns about our notions of "defined text set" and 
> "accept text set" 
> [1].  If you're reading the minutes at [2], look for the bit 
> that begins: 
> "NM: I think by the way that the defined set and the accept 
> set is less useful than we thought...".  At the meeting, I 
> challenged the group with a sketch of an example, asking how 
> it would be handled by the defined-set/accept-set model.  I 
> didn't hear an answer that satisfied me in the room, but Tim 
> and I happened to sit together on the flight home. He 
> expressed to me some support for the defined-set/accept-set 
> formulation, and he suggested how he would apply it to the 
> example I had in mind.  The purpose of this note is to record 
> the example and Tim's explanation, as I understood it, and 
> then to comment a bit.  I assume he'll correct any 
> misunderstandings on my part.
> 
> The Example Language: PHTML
> ---------------------------
> 
> The example is motivated by HTML, as styled by CSS.  To avoid 
> ratholes relating to particular historical details of those 
> languages, I'll use two mythical languages here, PHTML and 
> PCSS (pretend HTML and pretend CSS), the pertinent details of 
> which are as follows:
> 
> Assume that PHTML is a language of well-formed XML documents 
> that must have a root tag <PHTML>.  A few HTML-like tags, 
> such as <P> for paragraph and <BODY> for body, are defined in 
> the PHTML version 1 specification.  As with HTML, PHTML 
> allows for the appearance of arbitrary tags such as <BANANA> 
> not named explicitly in the specification;  it defines any 
> document containing such an extension tag to have the 
> same semantics as a similar document from which that tag has 
> been deleted.  Thus, per the PHTML spec., the following two 
> documents have the same meaning:
> 
>   doc1.phtml:
> 
>         <PHTML>
>          <BODY>
>           <P>Versioning is hard.</P>
>          </BODY>
>         </PHTML>
> 
> -and-
> 
>   doc2.phtml:
> 
>         <PHTML>
>          <BODY>
>           <BANANA>
>            <P>Versioning is hard.</P>
>           </BANANA>
>          </BODY>
>         </PHTML>
> 
> This is basically the example language we discussed at the 
> F2F.  (The application of CSS to this language is discussed 
> later in this note.)
> 
> Application of defined text sets and accept text sets to PHTML
> --------------------------------------------------------------
> 
> The invariant in the defined-set/accept-set formulation is 
> that every document in the accept-set conveys the same 
> meaning as some particular document in the defined set.  
> doc1.phtml above is in the defined set for PHTML, because all 
> of its content has a meaning supplied directly by the 
> specification.  doc2.phtml is in the accept set;  that 
> document too is in the PHTML language, but the semantics of 
> doc2.phtml are defined by means of its equivalence to a 
> defined-set document, doc1.phtml.
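
[Editorial sketch, not part of Noah's note.  The equivalence described
above can be illustrated in Python, assuming that "deleting" an
extension tag means dropping the tag itself while keeping the elements
it contains.  The names DEFINED_TAGS, normalize, in_defined_set and
same_meaning are hypothetical, and the list of defined tags is a guess:
the note names only PHTML, BODY and P.  Text placed directly inside an
extension tag is ignored here for simplicity.]

```python
import xml.etree.ElementTree as ET

# Assumption: the tags that the PHTML version 1 specification defines.
DEFINED_TAGS = {"PHTML", "BODY", "P"}

DOC1 = "<PHTML><BODY><P>Versioning is hard.</P></BODY></PHTML>"
DOC2 = ("<PHTML><BODY><BANANA><P>Versioning is hard.</P></BANANA>"
        "</BODY></PHTML>")


def normalize(elem):
    """Return a list of (tag, text, children) nodes, extension tags unwrapped."""
    children = []
    for child in elem:
        children.extend(normalize(child))
    if elem.tag in DEFINED_TAGS:
        return [(elem.tag, (elem.text or "").strip(), children)]
    # Extension tag: drop the tag itself, keep the elements it contains.
    return children


def in_defined_set(text):
    """True if every tag in the document is given meaning by the spec."""
    return all(e.tag in DEFINED_TAGS for e in ET.fromstring(text).iter())


def same_meaning(text_a, text_b):
    """True if both documents reduce to the same defined-set document."""
    return normalize(ET.fromstring(text_a)) == normalize(ET.fromstring(text_b))


print(in_defined_set(DOC1), in_defined_set(DOC2), same_meaning(DOC1, DOC2))
```

[Under these assumptions doc2.phtml is outside the defined set, yet it
reduces to doc1.phtml, which is what places it in the accept set.]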
> 
> So far, so good.  All of this makes sense to me.  Defined 
> sets and accept sets do a good job of explaining this extensibility.
> 
> The challenge
> -------------
> 
> Now we come to the interesting part of the example.  We allow 
> our pretend CSS language to style the markup in PHTML 
> documents, and crucially, the styles can be applied to 
> <BANANA> elements as well as to paragraphs. 
> 
> <PHTML>
>  <HEAD>
>   <STYLE type="text/pcss">
>    P {font-size: 120%}
>    BANANA {color:yellow}
>   </STYLE>
>  </HEAD>
>  <BODY>
>   <BANANA>
>    <P>Versioning is hard.</P>
>   </BANANA>
>  </BODY>
> </PHTML>
> 
> The paragraph will have a large font and will be yellow.  My 
> challenge to the TAG was: how do defined and accept sets 
> explain this sort of extensibility?  (Note that the 
> equivalent is allowed for real CSS applied to real HTML.) 
> 
> My understanding of Tim's preferred answer is:  "PHTML as 
> redefined by PCSS is a different language from PHTML on its 
> own, and all of the legal strings (texts) in that new 
> language are in its defined set -- the accept set is the 
> empty set.  The PCSS specification is the one that gives a 
> non-vacuous meaning to <BANANA> elements, and indeed in the 
> presence of PCSS, a document with a <BANANA> is no longer 
> equivalent to one without.  Thus, according to PHTML as 
> redefined by PCSS, all PHTML documents are in the defined 
> set, and none are in the accept set."
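
[Editorial sketch, not part of Noah's note.  A small Python
illustration of why, once PCSS is in play, a document with a <BANANA>
is no longer equivalent to one without.  It assumes PCSS declarations
are inherited by descendant elements, as color and font-size are in
real CSS.  RULES and paragraph_styles are hypothetical names, and the
rules are written out as a dictionary rather than parsed from a
<STYLE> element.]

```python
import xml.etree.ElementTree as ET

# Assumption: the two PCSS rules from the example, as selector -> declarations.
RULES = {"P": {"font-size": "120%"}, "BANANA": {"color": "yellow"}}

WITH_BANANA = ("<PHTML><BODY><BANANA><P>Versioning is hard.</P></BANANA>"
               "</BODY></PHTML>")
WITHOUT_BANANA = "<PHTML><BODY><P>Versioning is hard.</P></BODY></PHTML>"


def paragraph_styles(elem, inherited=None):
    """Return the computed style of every <P> element, in document order."""
    style = dict(inherited or {})          # start from the parent's style
    style.update(RULES.get(elem.tag, {}))  # apply rules matching this tag
    found = [style] if elem.tag == "P" else []
    for child in elem:
        found.extend(paragraph_styles(child, style))
    return found


styled = paragraph_styles(ET.fromstring(WITH_BANANA))
plain = paragraph_styles(ET.fromstring(WITHOUT_BANANA))
print(styled)
print(plain)
print(styled == plain)
```

[The paragraph picks up the yellow color only when it sits inside
<BANANA>, so the two texts differ in meaning under PHTML+PCSS and
neither can serve as the other's defined-set equivalent.]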
> 
> Some Comments
> -------------
> 
> I found this analysis to be tremendously helpful.  It 
> certainly meets my challenge at the F2F, which was to show 
> how the defined-set/accept-set model can be coherently 
> applied to this example.  I hope Tim can confirm that I've 
> correctly captured the essence of his analysis, and I'll be 
> very curious to see whether others who've been advocating the 
> defined-set/accept-set approach would apply it the same way.
> 
> As to my own position, I want to give it some more thought.  
> My initial concern (Dave and I discussed this at great length 
> at dinner in Mountain View) was that I didn't really see how 
> to apply the concepts to my example, and Tim has resolved 
> that concern.  What 
> remains is a worry that by putting everything into the 
> defined-set, our versioning model is no longer saying much 
> about the sense in which the PHTML+PCSS V1 language is indeed 
> extensible.  When PHTML+PCSS version 2 comes along and 
> defines semantics for some elements like <BANANA>, I'm not 
> sure how the model's going to help us explain what happened, 
> because everything was in the V1 defined set to begin with.  
> Many languages in fact provide nontrivial default semantics 
> for their extension content --  our mythical PHTML/PCSS 
> allowed the extension content to be styled, but a PDOM might 
> well have allowed scripts to address the extensions, and 
> many, many non-HTML languages provide interesting default 
> semantics for extension content (store it, print it, etc.).  
> I think these are really interesting use cases, and I'll be 
> disappointed if our formal models don't explain them well.
> 
> Nonetheless, I think Tim and others are making a strong point 
> that, in the interesting sub-case where extension content is 
> truly and completely ignored, the defined-set/accept-set 
> model gives a nice, clean, set-oriented explanation.  I buy 
> that.  So, I think this is progress.  Thanks to Tim for being 
> patient in working through this with me.
> 
> Noah
> 
> [1] http://www.w3.org/2001/tag/doc/versioning-20070518
> [2] http://www.w3.org/2001/tag/2007/05/30-minutes#item06
> 
> --------------------------------------
> Noah Mendelsohn
> IBM Corporation
> One Rogers Street
> Cambridge, MA 02142
> 1-617-693-4036
> --------------------------------------
> 

Received on Wednesday, 20 June 2007 16:12:11 UTC