Program vs. Information Compatibility (was: Re: using the UML terms to define backward, forward compatibility [XMLVersioning-41])

On our call this week I had a much appreciated opportunity to discuss the 
first of my significant comments on the versioning draft (i.e. the first 
one at [1]).  Your recent note [2] is a good opening for me to bring up 
another comment I made, one we did not yet discuss in detail:

Dan Connolly wrote:

> I'm pretty sure that the term "backwards compatible" can be 
> completely defined in terms of intent, text, etc. and a bit of logic.

I want to make the case that we need to discuss compatibility not just at 
the information transfer level, but also in terms of how programs behave 
and in terms of what users consider to be a successful result of 
processing.  First of all, I agree completely with what I think you're 
trying to say, which is that insofar as what we mean is: "the texts are in 
both languages, and the information they convey in the two languages is 
either the same or sufficiently close", you have a good definition of 
backward compatibility at the information level.  Furthermore, I agree 
that we should be able to tell that story in terms of {text, intent, 
logic}.  Indeed, that was exactly my hope in making the comments we 
discussed on Tues.

The concern I want to discuss here is that, while the above is a very 
important building block for us to put in place, it does not define 
compatibility as users in practice worry about it.   There is another 
related view of compatibility that has only indirectly to do with 
languages, texts and information.  Stated informally:  "a given text is 
compatible with my program if that program will produce results that I 
consider acceptable when processing that text".  Similarly, "a given 
language is compatible with my program if that program will produce 
acceptable results from all texts in that language".   That's a very 
different notion of compatibility, and we need to keep them separate. 
Let's call the sort of compatibility you were discussing "information 
compatibility" and this other one "program compatibility".
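
To make the distinction concrete, here is a minimal sketch in Python.  It 
is only an illustration under simplifying assumptions, not a proposal for 
the finding:  a language is modeled as a mapping from texts to the 
information they convey, "sufficiently close" is reduced to plain 
equality, and every name in it (information_compatible, 
program_compatible, clipping_printer, and so on) is hypothetical.

```python
from typing import Callable

# Assumption: a language maps each text it contains to the information
# that text conveys in that language.
Language = dict[str, str]


def information_compatible(text: str, old: Language, new: Language) -> bool:
    """The text is in both languages and conveys the same information in each.

    Equality stands in here for "the same or sufficiently close".
    """
    return text in old and text in new and old[text] == new[text]


def program_compatible(
    text: str,
    program: Callable[[str], str],
    acceptable: Callable[[str, str], bool],
) -> bool:
    """The program produces a result the user accepts for this text."""
    return acceptable(text, program(text))


def language_program_compatible(
    language: Language,
    program: Callable[[str], str],
    acceptable: Callable[[str, str], bool],
) -> bool:
    """The program produces acceptable results for every text in the language."""
    return all(program_compatible(t, program, acceptable) for t in language)


# Hypothetical program: a "printer" that clips anything past 20 characters.
def clipping_printer(text: str) -> str:
    return text[:20]


# Hypothetical user judgment: the result is acceptable if nothing was lost.
def nothing_lost(text: str, result: str) -> bool:
    return result == text


html_v1: Language = {"<td>short</td>": "a cell containing 'short'"}
html_v2: Language = dict(html_v1)
html_v2["<td>" + "x" * 40 + "</td>"] = "a cell containing forty x's"
```

Note that information_compatible never mentions a program, and 
program_compatible never mentions a second language.  The only thing the 
two predicates share is the text.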

First of all, program compatibility is not an all-or-nothing thing. That's 
one of the reasons I want to be careful with glib use of the term 
"backwards compatible" at the program level (it's fine at the information 
level as you use it).  Consider the attached HTML file.  The W3C Validator 
says it's valid HTML transitional.  If you open it in Firefox, it's a bit 
ugly, but it basically works.   You can see the whole thing, but if your 
display isn't pretty wide you'll probably have to scroll.  Now stay in 
Firefox and try "Print Preview".   I think you'll see that information is 
lost.  So, if this file had come from a program writing some newer version 
of HTML would we consider it backwards compatible?  I think we all believe 
that internally Firefox has extracted all the intended information from it 
(I bet that even their printing support isn't internally failing to 
correctly interpret the <td>...</td> and extract the long string), but 
many users would say "That file is not compatible with Firefox's printing 
support", and they'd be right.

My point is that I want to tell the story at both levels.  I think you're 
on the right track with the information compatibility part, but I want to 
talk about program compatibility too, because that's what users really 
care about.  Explaining the difference could be very helpful, I think.

What is the relationship?  Well, as best I can tell, to perform whatever 
function, a program will depend on some or all of the information it 
gleans from the input.  However, even if the program gets all the 
information correctly, it may still not behave well for other reasons, as 
was the case with Firefox.  It may even perform according to 
specification, but just not do what the user wants.  Interestingly, it's 
also possible that information was lost or mangled due to language 
mismatch, but that the program doesn't care about the part that's in 
error.   In that case, the program will behave compatibly for the purpose, 
even though there was some incompatibility at the information transfer 
level.
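
The last case can be sketched as follows.  Again this is only an 
illustration:  I assume the information conveyed by a text can be modeled 
as a set of named fields, and the function and field names are 
hypothetical.

```python
def information_preserved(sent: dict[str, str], received: dict[str, str]) -> bool:
    """Information-level check: everything sent arrived unchanged."""
    return sent == received


def program_unaffected(
    sent: dict[str, str],
    received: dict[str, str],
    fields_used: set[str],
) -> bool:
    """Program-level check: the fields this program depends on arrived intact."""
    return all(sent.get(f) == received.get(f) for f in fields_used)


# Hypothetical transfer in which a language mismatch drops one field.
sent = {"title": "Report", "footnote": "see appendix"}
received = {"title": "Report"}
```

A program that reads only the title behaves compatibly here even though 
information_preserved(sent, received) is false;  a program that also needs 
the footnote does not.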

The draft finding does try to tell a story about program compatibility, 
but I don't think the approach taken really works well.  As I understand 
it, the draft proposes that we characterize each program by the language 
that 
it accepts.  In the Firefox example, we'd have to talk about at least two 
HTML languages:  the one that it can display on the screen, and the one 
that it can successfully print.  The problems I see are that, first of 
all, it's very hard to define at the text level which input strings will 
cause a given version of Firefox to clip its print output.   There also 
would be a tremendous proliferation of versions to discuss.   In fact, you 
can get the attached file to print in Firefox by setting the printing 
scale to 30%:   should we say there's a different version of the HTML 
language for each possible setting of the Firefox Print Scale (and Font, 
etc.)?   It's coherent intellectually, but it's not how people think about 
these things.  I'd rather that the finding admit that there's just one 
version of the HTML language involved in this scenario, that Firefox is 
presumed to be completely compatible with that language at an information 
transfer level, but that for certain purposes it may still not behave 
compatibly at the program level.

By the way, mustUnderstand can be much better modeled at the program level 
than the information transfer level, I think, but that's a subject for a 
different note.

Bottom line:  I'd like the finding to have some sections on information 
transfer compatibility, and I agree that at that level the formalism you're 
looking at is a good start.  I also want the finding to talk about program 
compatibility, and how at that level you need to define compatibility in 
terms of what you consider to be a successful outcome.  Indeed, at that 
level, it's even reasonable to consider an input incompatible when a 
correct result is produced but the time taken to produce it is impractical.

Noah

[1] 
http://lists.w3.org/Archives/Public/www-tag/2006Aug/att-0111/versioning26July2006withNoahComments.html#noahComments
[2] http://lists.w3.org/Archives/Public/www-tag/2006Sep/0043.html


--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Saturday, 9 September 2006 01:03:56 UTC