
RE: Updated DOCTYPE versioning change proposal (ISSUE-4)

From: Larry Masinter <masinter@adobe.com>
Date: Thu, 7 Jan 2010 18:04:03 -0800
To: Adam Barth <w3c@adambarth.com>
CC: "public-html@w3.org" <public-html@w3.org>
Message-ID: <C68CB012D9182D408CED7B884F441D4D309824@nambxv01a.corp.adobe.com>
From the change proposal rationale:
>> While everyone *hopes* there are never going to be any further
>> incompatible changes to HTML in the future, there *is* a
>> possibility that in some unfortunate situation, it will be
>> necessary to introduce incompatible changes. In that case, it will
>> be necessary to introduce a new version indicator, to allow (alas)
>> processors to determine which of the incompatible interpretations
>> was meant.

> If the new version is going to be incompatible anyway, why not add
> the version indicator at that time?

If you have a production workflow that only deals in valid documents,
and you later introduce a version indicator which isn't valid now,
then the existing workflow would reject the new version indicator as
invalid.

>> While this will be unfortunate, it would be doubly unfortunate to
>> have to introduce a new "place" for a version indicator that was
>> previously non-conforming, which would cause even worse uproar,
>> because documents that *didn't* want the new incompatible behavior
>> would have no place to say explicitly which version of the
>> incompatible behavior they wanted.

> We wouldn't need to invent a new "place" for this information, we
> could just resurrect this proposal to use the old place.  The
> documents that don't want the new behavior can just use the HTML5
> doctype of "<!DOCTYPE html>".  If we were certain that this
> eventuality would come to pass, we might want to optimize for it by
> providing a more elegant alternative, but the current indications
> are that this is not a likely course of events.

If you remember from the issue with "Referer", using the absence of an
indicator to indicate something is ambiguous; a user specifying
<!DOCTYPE html> might be saying they want the HTML5 behavior, or they
might be saying they don't care about the behavior because they're not
using the feature that has the incompatible change, or that they are
willing to accept either behavior.
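
A minimal sketch of this ambiguity, in Python, assuming a simplified
doctype syntax. The function name `declared_version` and the regular
expression are illustrative only and are not taken from the change
proposal; the pattern handles just the double-quoted public identifier
form.

```python
import re
from typing import Optional

# Simplified pattern: "<!DOCTYPE html" followed by an optional
# PUBLIC "identifier" part. Single-quoted identifiers are not handled.
_DOCTYPE = re.compile(
    r'<!DOCTYPE\s+html(?:\s+PUBLIC\s+"([^"]*)")?[^>]*>',
    re.IGNORECASE,
)


def declared_version(document: str) -> Optional[str]:
    """Return the public identifier if the doctype carries one, else None."""
    match = _DOCTYPE.match(document.lstrip())
    if match is None:
        return None  # no recognizable doctype at all
    return match.group(1)  # None for a bare <!DOCTYPE html>
```

For a bare `<!DOCTYPE html>` the function can only return None, so a
consumer cannot tell "I want the HTML5 behavior" apart from "I don't
care" or "either behavior is fine".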

>> By *allowing* a version indicator in conforming content today, we
>> can avert more serious damage. Having a location for a version
>> indicator, even if it isn't explicitly used, allows it to be used
>> at some point in the future.

> This is the core of your argument, but if the future version we're
> planning for is incompatible anyway, why does it matter if it
> re-introduces versioned doctypes?

This isn't the core of my argument, by the way; it's only one of
several arguments. And I think "incompatible" is not a binary "yes
it's incompatible" or "no it is not". In most cases, incompatibilities
are minor, only affect edge cases, may not be particularly significant
to most users and only apply in particular cases.

> You're not asking for anything to change for user agent
> implementations, so it's not like user agents will act differently
> in your alternative future.

I think you have to be careful to allow staging of incompatible
changes, basically, to allow for new browsers which implement the
incompatible behavior to be deployed asynchronously with new content
that is explicit that it wants the current behavior and not the new
behavior. The way that can happen is:

NOW:
1. allow current browsers and validators to accept a current version 
   indicator
2. allow (but do not encourage) content to be deployed which 
   explicitly calls for current behavior using a current version indicator

LATER: (after deciding to introduce incompatible change):

3. encourage current content which depends on old behavior
   to identify old version (only allowed by 2)
4. deploy new browsers which implement the new incompatible behavior,
   if there is an explicit new version indicator
5. deploy content which uses the version indicator to call
   for new behavior rather than old

This change proposal basically does 1 and 2.  Step 3 is possible
because conforming validators allow current version indicators in 2.
Steps 3 and 4 are asynchronous, and step 5 (which is what you
need to get any benefit) is delayed until 4 reaches an acceptable
threshold.
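
The staging above can be sketched roughly as follows, in Python. The
two identifier strings and both function names are hypothetical,
invented for this illustration; the change proposal does not define
them.

```python
# Hypothetical public identifiers, used only for this sketch.
CURRENT_ID = "-//EXAMPLE//DTD HTML current//EN"
FUTURE_ID = "-//EXAMPLE//DTD HTML future//EN"


def validator_accepts(public_id, known_ids):
    """Steps 1-2: accept a bare doctype (None) or any known identifier."""
    return public_id is None or public_id in known_ids


def browser_behavior(public_id, implements_new):
    """Step 4: use the new behavior only on an explicit new indicator."""
    if implements_new and public_id == FUTURE_ID:
        return "new"
    return "old"
```

Under these assumptions, content labeled with CURRENT_ID keeps the old
behavior in both old and new browsers, and a validator that already
knows CURRENT_ID needs no redeployment before step 3.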

If you do not allow explicit version indicators now, then 
we would need another step before 3 which would be to deploy
validators which accepted a current version indicator.

This would delay consistent adoption; more likely people would
deploy incompatible content labeled with "Best Viewed With
Firefox 12" instead.

> You explicitly tell authors not to use the extended syntax: "A
> PublicIdentifier SHOULD NOT be used," so the extant documents on the
> world-wide web at this future time when we need an incompatible
> version will likely be the same.

I use "SHOULD NOT" carefully in the RFC 2119 sense, that the practice
is not recommended except when there are good, well-understood
reasons. There are some such justifications now (for use within
editing or polyglot pipelines), but should an incompatible change be
necessary, then the reasons would increase. Well before browsers that
implement the incompatible change are widely deployed, content which
is explicit as to the version intended can be deployed. And plans for
incompatible changes would add to the justification for using
version indicators.

Possibly the SHOULD NOT in fact should be softened
to MAY, or only applied in the context of hand-edited or manually
assembled content, or situations where the actual version is 
unknown. 

> All that seems to change is the conformance status of documents
> produced *after* the new incompatible spec is issued.

No, documents with an explicit version indicator would be conforming
NOW. That's important in staging the deployment of (unfortunate,
hopefully unnecessary) incompatible changes.

> Moreover, it's only the conformance status of those documents
> w.r.t. the *old* specification.  That seems like pedantry in the
> extreme.

Pedantry:
 * the character, qualities, practices, etc., of a pedant,
   esp. undue display of learning.
 * slavish attention to rules, details, etc.

I'm not sure how "pedantry" applies. Is supplying technical 
analysis based on facts and previous experience "undue"?
We're engaged in writing a technical specification for a language
used by millions, and being careful about rules and details.
Is it "slavish" to do so?

I think "pedantry" is inappropriately pejorative in
this context. 

I said:
>> In the history of computer languages, there are no languages that
>> have not evolved, been extended, or otherwise "versioned" as long
>> as the language has been in use.

to which you replied:
> Really?  Where are the version indicators for C++?  The C++
> language has certainly evolved since its inception, but it hasn't
> needed an explicit version indicator.

I didn't say that there are no languages without in-band version
indicators. C++ programs are not self-contained. They come with
make files, version compatibility installers, and other documents
which indicate -- usually pretty clearly -- which version of which
language of which compiler is to be used to compile the program.  C++
versioning is in fact problematic, and there are numerous ad-hoc
solutions to managing the evolution of programming languages and
compiler implementations.

It is not possible to retrieve a C++ program and run it without
getting out of band information about which version of C++ was
intended, or using version information embedded in the README or
configuration files or scripts.

One of the key innovations of the web (in the early '90s) over
previous distributed network information systems was the adoption of
the MIME architecture, previously invented for email, to make
interpretation of message bodies and content self-contained, so that a
large amount of contextual information wasn't necessary in order to
discern the meaning of an exchange.  This allows HTTP to be stateless,
which allows for load balancing, distribution, cloud computing,
caching, and a wide variety of other facilities that were not possible
with other distributed information systems.

And part of the MIME architecture is that the content-type label in a
message indicates the general category of information contained in the
message, while any other versioning information is contained within
the message itself.

The adoption of MIME in HTTP wasn't a slam-dunk; it followed from the
resolution to use MIME in Gopher at GopherCon '93; see
http://prentissriddle.com/trips/gophercon1993.html.

The adoption of the MIME architecture in HTTP between HTTP 0.9 (which
had no content labels at all) and HTTP 1.0 was again one of the major
innovations in the web which has led to its growth and evolution over
such a long period of time.

Most of the MIME types in use on the web for stand-alone content
contain in-band version indicators, whether for the whole file
(as with image/gif, image/jpeg, application/ogg, application/pdf,
Flash, Java), or through version indicators on chunks in an 
unversioned file structure (as with image/png and audio/mp3).
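
As a small illustration of a whole-file, in-band version indicator,
here is a sketch in Python that reads the version from the leading
bytes of GIF and PDF data. The function name `inband_version` is an
assumption made for this example, and only these two formats are
covered.

```python
from typing import Optional


def inband_version(data: bytes) -> Optional[str]:
    """Return a whole-file version label read from the first bytes, if any."""
    # GIF: 3-byte signature "GIF" followed by a 3-byte version.
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "GIF " + data[3:6].decode("ascii")
    # PDF: header of the form "%PDF-1.4"; take the three version characters.
    if data.startswith(b"%PDF-") and len(data) >= 8:
        return "PDF " + data[5:8].decode("ascii", errors="replace")
    # Anything else is not recognized by this sketch.
    return None
```

In both cases the media type names the general category, and the
version travels inside the content itself.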

While CSS and JavaScript don't have versions, they are also
not standalone content -- a CSS style sheet doesn't constitute
a "message" in any meaningful way, and so the type and the
version of the type can be managed as part of the bundle.

> It seems entirely likely that HTML will continue to evolve without a
> version indicator because the mechanism we've been using for
> versioning has been more or less ignored because authors screw it up
> too much.

"More or less ignored" applies to browsers; it seems to me that the
majority of HTML editors use version indicators as part of the
HTML authoring process.

That the web has been successful, but that much current
web content seems to be sloppily constructed, is not evidence
of a causal relationship. By itself, it is not an argument
for reifying sloppy construction or adding a requirement
for sloppy construction (by, for example, not allowing authors
to identify in a standard way which specification(s)
they are attempting to be compatible with). The sloppy 
construction was a result of the success of the web and
the DotCom bubble, not a cause.


When I said:
>> This applies to network protocols, character encoding standards,
>> programming languages, and certainly to every known technology
>> found on the web.
You replied:
> That's quite a bold claim and certainly untrue.

But the "this" was in context of languages evolving without
incompatible changes, and subsequently when I asserted:

>> There are no known cases where a language hasn't gone through some
>> at least minor incompatible change.

you replied:
> Right, but that doesn't mean we need a version indicator.

So I think your "certainly untrue" is contradicted by your "right". If
you had a counter-example of a web language that hasn't had
at least a minor incompatible change, you would have supplied it.

> HTML has gone through a number of minor incompatible changes and the
> world has managed not to end in spite of everyone ignoring the
> version indicator.

I don't think the criteria for continuing an HTML 4 feature in HTML 5
include the requirement that "the world will end if we don't".

And *everyone* does not ignore version indicators. In fact, the change
proposal is much more explicit than before in requiring BROWSERS to
not exhibit different rendering behavior in the face of version
indicators, but to allow validators, editors, and content
production pipelines to use them.

Yes, HTML can survive without a global version indicator, and only
specifying <!DOCTYPE html> may continue to work in the narrow context
of communication between web server and current browser, but leaving
the ability to provide a PublicIdentifier and SystemIdentifier will
allow some current production and editing tools to work better, and,
if used carefully, will cause no harm. I think that's all we ask for
new features, and the threshold for retaining old features should be
lower than the threshold for adding new ones.

> In summary, we don't need to add versioning now to future-proof the
> spec because the effects of this change are felt only after we
> discover an incompatible version is required.  Attempting to prepare
> for that eventuality as described in your change proposal doesn't
> actually do anything substantive to help.

The Change Proposal is not to "add versioning" but to "leave (some
parts of) HTML4 versioning as an HTML5 feature". Couching this as an
"addition" is misleading.

The argument for "future proofing" was just one of several
arguments. And I think I've made the case for why attempting
to add a version indicator later would require an extra step
in staging the deployment of what would be, presumably, a
fix important enough to introduce an incompatibility.

Larry
--
http://larry.masinter.net
      
Received on Friday, 8 January 2010 02:04:50 GMT
