RE: Versioning and HTML -- version indicators from Marc de Graauw on 2009-05-16 (www-tag@w3.org from May 2009)

From: Marc de Graauw <marc@marcdegraauw.com>
Date: Sat, 16 May 2009 11:46:46 +0200
To: <noah_mendelsohn@us.ibm.com>, "'Larry Masinter'" <masinter@adobe.com>
Cc: <www-tag@w3.org>
Message-ID: <3E49E31E70FB47318671DDB59F1D3D10@Marc>
Noah,

| In particular, I still tend to believe that:
| 
| "If the same document means different things in different 
| versions of a 
| language, then it's very important to indicate which version 
| the author 
| had in mind when creating the document. Putting that version 
| indicator 
| into the document itself is one good way to do it."

I believe the basic notion should not so much be what version the author had
in mind, but what the author expects or requires from the reader. This might
not be a single version indicator, but a collection of required or expected
capabilities, or nothing much at all.

More explicitly: for publication formats, understanding the MIME type and
having something that can at least try to process the document may sometimes
be enough. If I publish something on the web for reading, a 'best effort'
will sometimes do: render what you understand, ignore the rest. (I'm no
expert on HTML, and this is not meant as a position on what versioning
information should be included in HTML.)

For healthcare, or other messaging systems, this is not the case: I expect
you, as a receiver, to understand an EHR XML message about me and the
medication code system it uses. If you do not understand either, you should
not try to process or read the message: you might be messing with my life.

As the medication code example shows, multiple indicators may be needed: for
the base version of the language, and perhaps for code systems, extensions,
localizations etc. For a v2, things may be different. An EHR v2 might
just contain allergy information as an optional add-on. If this is the case,
then for messages without the optional allergy-related parts, understanding
the base version + medication codes will still do. (I wrote an article with
an extended example a while ago [1].)
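
To make the receiver side of this concrete, here is a minimal sketch in
Python. The marker names ("ehr-base-v1", "med-codes-A", "allergy-addon") and
the Message type are hypothetical, made up for illustration; they are not
taken from any real EHR standard. The only point is that a message carries a
set of required markers plus a set of optional ones, and the receiver refuses
to process unless it understands all the required ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    # Markers the receiver MUST understand before touching the message.
    required: frozenset
    # Markers for optional parts that a receiver may safely skip.
    optional: frozenset = frozenset()


def can_process(message: Message, understood: set) -> bool:
    """Return True only if every required marker is understood."""
    return message.required <= understood


# Hypothetical marker names, for illustration only.
v1_receiver = {"ehr-base-v1", "med-codes-A"}

plain = Message(required=frozenset({"ehr-base-v1", "med-codes-A"}))
with_allergies = Message(
    required=frozenset({"ehr-base-v1", "med-codes-A"}),
    optional=frozenset({"allergy-addon"}),
)
other_codes = Message(required=frozenset({"ehr-base-v1", "med-codes-B"}))

for msg in (plain, with_allergies, other_codes):
    print(sorted(msg.required), can_process(msg, v1_receiver))
```

Under these assumptions a v1 receiver accepts both the plain message and the
one with the optional allergy add-on, and rejects the message that uses a
medication code system it does not know.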

You've explained this quite clearly in your blog item [2] with the example of
optional pictures in v2. But in the discussion there you only ask whether
v2 docs without pictures should be marked v1 or v2; you don't consider a
'v1 + pics' versus a 'v1' option: two markers, one optional, which reflects
the language evolution.
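
The 'v1 + pics' versus 'v1' option can be sketched from the author's side as
well. This is only an illustration in Python: the document layout (a dict with
an optional "pictures" entry) and the marker names "recipe-v1" and "pics" are
my assumptions, not anything defined in the blog item. The markers are derived
from what the document actually contains, so the author never has to pick
between 'v1' and 'v2'.

```python
def markers_for(document: dict) -> dict:
    """Derive versioning markers from what the document actually uses."""
    markers = {"required": {"recipe-v1"}, "optional": set()}
    if document.get("pictures"):
        # Pictures are an optional add-on: a v1 reader can ignore them.
        markers["optional"].add("pics")
    return markers


without_pics = markers_for({"title": "Soup"})
with_pics = markers_for({"title": "Soup", "pictures": ["soup.png"]})
print(without_pics)
print(with_pics)
```

A document without pictures ends up with the base marker only, and one with
pictures gets the extra optional marker on top of it.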

In short, I believe a single version indicator does not cover more
complicated cases. I also don't believe the author's intentions matter much;
what matters is what the author expects of the reader. This is very much
context-dependent: in publications for the world this might not be much
aside from some basic reading capability; in other contexts it may be very
constrained. This also bears on the effort the author is willing to make to
provide detailed versioning information: as you write in the blog, "I don't
want to have to go through the specifications for every version of the
recipe language that's ever existed just to find the oldest that works".
True in most cases, not in some. The options the author has are basically:
- provide no specific versioning information at all, other than MIME type,
and leave the rest up to the reader
- provide the version of the language spec used to write the doc (the
laziest option after the previous)
- provide more detailed information on the specific capabilities required or
expected of the receiver.

There's no generic 'best option' here. You proposed in the blog: "If a
language or data format will change in incompatible ways, then indicate the
language version used for each instance." I believe that's too strong as a
rule for all cases. Quite often it's enough for the author if the receiver's
software fails after an incompatible change. In source code, a version
indicator is uncommon and its absence does not matter much: the compiler
will report incompatible code. In other cases, the rule is not strong enough,
since more than a single 'language version' may need to be communicated.
Note that the distinction between a language, sublanguages, incorporated
languages, extensions, localizations and such is blurred - some definition
of what constitutes a language may be needed. On the other hand, if one uses
multiple versioning markers, and the burden placed on the receiver is clear,
it may not matter much whether those multiple markers pertain to one or
several languages.

| I would actually propose that, with respect to explicit 
| version indicators 
| in particular, we take the points in the blog entry as a 
| starting point, 
| and either publicize them, elaborate them, or where necessary correct 
| them.  To me, they look right as far as they go.

I don't know whether this has any relevance to HTML at all, but since this
discussion is also in the context of the TAG Versioning Finding, I hope you
view this as an effort to elaborate those points. Specifically, I think the
case of multiple markers with versioning-related information should be
covered.

Regards,

Marc de Graauw

http://www.marcdegraauw.com

[1] http://www.xml.com/pub/a/2007/04/11/a-smoother-change-to-version-20.html
[2] http://www.w3.org/QA/2007/12/version_identifiers_reconsider.html
Received on Saturday, 16 May 2009 09:47:42 UTC