Re: [HTML5] 2.8 Character encodings

Ian Hickson:
> My apologies for top-posting, but I couldn't really work out how to reply
> in context to this e-mail (included in full below).
>
> I think we have some sort of fundamental difference in understanding of
> the purpose of specifications, and I don't understand your view well
> enough to figure out a common ground from which to satisfy your comments
> in the spec.
>
> From my point of view, a language spec's purpose is to ensure
> interoperable behaviour between software products. In this view,
> obsoleting a feature from an earlier version of the language leads to that
> feature not having meaning in the new language other than the
> backwards-compatible processing requirements. Authors of previous versions
> of the language are irrelevant, since (if they acknowledge the new
> language definition) they are now required to use the new version of the
> language, and thus the old language is irrelevant to them.
>

I agree with this.
Even more: if the version of the language is indicated within the
document, the author has defined which version of the language is
relevant, and all others are irrelevant. If it is not indicated and no
other information is available, the version is undefined. Trying the
newest language version can be a good first guess. Analysis of the
content can lead to a better or different guess.
This is more a task for future archaeologists, historians or language
theoreticians to carry out for each document, if it becomes
important (for many documents it will indeed not be important
at all). Why make their work harder than necessary by forcing
authors not to indicate which version they use?
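
To illustrate the lookup described above, here is a minimal sketch in
Python. It is not taken from any specification or user agent. The
function name declared_version and the small table of public
identifiers are assumptions made for this example, and the table is far
from complete. The sketch only reports what the author declared and
returns None where the version is undefined; any guessing is left to
the caller.

```python
import re

# Hypothetical, deliberately incomplete table of public identifiers.
KNOWN_PUBLIC_IDS = {
    "-//W3C//DTD HTML 4.01//EN": "HTML 4.01 Strict",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "HTML 4.01 Transitional",
    "-//W3C//DTD XHTML 1.0 Strict//EN": "XHTML 1.0 Strict",
}

# Matches a doctype declaration and captures the public identifier, if any.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+html(?:\s+PUBLIC\s+"([^"]*)")?[^>]*>', re.IGNORECASE
)


def declared_version(document):
    """Return the version the author declared, or None if it is undefined."""
    match = DOCTYPE_RE.search(document)
    if match is None:
        return None  # no doctype at all: nothing was declared
    public_id = match.group(1)
    if public_id is None:
        return None  # a bare "<!DOCTYPE html>" names no version
    # Unknown identifiers are reported as such instead of being guessed.
    return KNOWN_PUBLIC_IDS.get(public_id, "unknown: " + public_id)
```

A caller that gets None back is in the "undefined" case and has to fall
back on a guess, for example the newest version or an analysis of the
content.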

You claimed that the new version defines the meaning of old versions
too. This can only be true if there are no inconsistencies and the new
version is a true superset of the old.
For some partly good reasons this is not the case for 'HTML5', as
I pointed out with some samples, so we can forget this
variant.
And of course in practice there will always be such minor or major
differences in each new version, simply because the authors of new
version specifications have learned something or have a different view
or understanding of language constructions.

And obviously, if a document was written 10 years ago using
HTML4, it has nothing to do with the meaning of elements/structures
as defined in 'HTML5' now. And if those authors acknowledge
the new version, this does not mean that they update all old
documents each time a new version appears, just because the
new version changes some details in the definition of the meaning
or structure model of elements. If the authors have already died, there
will be neither an acknowledgment of a new version nor any further
change to their documents. New language versions are irrelevant
for their documents.
Indeed, such old documents are irrelevant for the specification of the
new version as well, simply because the old specification applies to them.
And of course it is useful if the new version is at least aligned in
such a way that new user agents are still able to present old
documents.

The main purpose of written text is to conserve information over
time. Written text is not just for one moment, and a document
once written may remain for years or hundreds of years. It does
not change its structure or meaning just because
a new version of a language appears. To assume this would be
very esoteric.

The main problem with 'HTML5' concerning this issue is currently
that it has no version indication. Authors who want
to use it and want to write defined documents therefore cannot use
it, or rather cannot indicate which version they use.
For the future this simply means that 'HTML5' is a version without
documents it applies to. If someone likes, it can be applied to
documents without any version indication (as can many other
language versions), simply because such documents have no
defined relation to any version and no defined meaning of elements and
structures at all. Whether this is useful or not depends strongly
on the date of origin and on the circumstances and attitude of the
author, if known.
If these are not known, the document can simply be considered arbitrary
tag soup.

Olaf

Received on Thursday, 3 September 2009 09:06:08 UTC