- From: Dr. Olaf Hoffmann <Dr.O.Hoffmann@gmx.de>
- Date: Thu, 21 Jan 2010 19:03:59 +0100
- To: public-html@w3.org
Tab Atkins Jr.:
>
> It doesn't mutate, because it isn't, by itself, an XHTML file. It's a
> bag of bits.

This is another abstraction level and a general problem of data saved on hard discs and similar media. Even more, it is a general problem of 'information': it is not information by itself, only by some cultural agreement on how to encode and decode it.

On a filesystem like ext3, reiserfs or vfat, there is some agreement about which bits have to be combined into a file. Due to the sorrowful history of computers, there is no agreement on how to attach metainformation about what the file represents, therefore in practice this ends up somewhere inside the file. And once it has been established that the content is 'readable' text and that there is an XML declaration (the '<?xml ...?>' line) at the beginning, this is already a piece of metainformation to distinguish by - HTML has no such declaration, therefore HTML can already be excluded.

Other cultural agreements relate this declaration to XML and allow one to determine the namespace; then, knowing the namespace, one can typically determine the language and the version of the language. Well, this is it: not nice, but in practice one can do this step by step, in a similar way as one distinguishes between hieroglyphs and celtic runes. And then, if you want, or if there is some mandatory advice, you can try to interpret the hieroglyphs as celtic runes and vice versa - whatever is required and whatever the result. If the server or the author insists on this, why not? But this does not change the simple fact of which language the file is written in, in the case of XHTML (with the help of some minor cultural agreement on how to read text files).

> It can be interpreted as XHTML, or HTML, or plaintext,
> or a bitmap for that matter. Files don't carry around an essential
> identity, they obtain one when you choose to interpret them in a
> particular way.

No, in the end these are magnetic areas on a hard disc or electric charges, and there is some common cultural agreement, up to some abstraction level, on how to interpret these physical phenomena as information. With that agreement the file suddenly has its own identity as information, which is of course lost if all information about the cultural agreement is lost. But as long as this is not lost and there is some chain of agreements, the file has some information as content and therefore its own identity.

The 'miracle' of such digital files is that you can duplicate the information - and this is what happens when something is served to the browser, whether by a file system, a server or whatever. And the duplicate has the same identity as the original, simply because it has the same information as content. The file system or the server sends additional metainformation and advice about the file. This is additional information, not belonging directly to the block of information that we call the file, and therefore it does not change the identity of the file. We can still save the file again and compare it to the original - if nothing went wrong, we will find that both still contain exactly the same information, whatever the metainformation was and whatever happened to it. If this were not possible, computers and the internet would not exist, and information would still be written only on paper or carved into stone (and even these can be used to store and duplicate digital information without losses or changes - for example on punch cards and punched tape).

Even more exciting, and appearing even more like a miracle, is quantum information with entangled states - but this is another issue, one which could help to prevent your money from suddenly vanishing from your digital bank account without your agreement, due to security holes in browsers or similar devices.

>
> That's why there was never any such thing as "XHTML served as
> text/html".
> It was always HTML, albeit with some slightly invalid
> syntax inspired by the XHTML syntax which browsers tolerated/ignored.
> If they served it as application/xhtml+xml, then it would have been
> XHTML.
>

If you do not believe in the identity of digital files, you should not have such a digital bank account, only gold coins and diamonds - and even they depend on a cultural convention that they mean something.

> Unfortunately, I think we've gone far down the rabbit hole of an
> irrelevant sidetrack, so it's probably good to stop now.
>
> ~TJ

With this I agree; it would have been a better idea to have left old-fashioned HTML in the last millennium and to use XHTML in this millennium, without any dirty tricks for outdated browsers. However, doesn't HTML5 do exactly the opposite, describing how to interpret what looks like XHTML as HTML? In the end this seems to 'standardisease' the common bad practice of stirring all this into one soup.

I think the informative section in XHTML 1.0 was only intended to please users of netscape3/4 and msie3/4. It only became a permanent disease because after 10 years there are still browsers in use without much implementation progress compared to netscape4 and msie4 ;o) With some more implementation work this could have been avoided, there would be no need to 'standardisease' in HTML5, and one could have started with something up to date in XHTML2 or 3.

Olaf
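
The step-by-step identification argued for in this message (readable text, then XML, then namespace, then language) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the input is assumed to be UTF-8, Python's standard xml.etree.ElementTree parser stands in for "the cultural agreement" about XML, and the function name identify and the sample byte strings are made up for the example. Because the XML declaration is optional in XML 1.0, the sketch tests well-formedness instead of relying on the declaration alone.

```python
import xml.etree.ElementTree as ET

# Namespace name that XHTML documents declare on their root element.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def identify(data: bytes) -> str:
    """Guess what a bag of bytes is, looking only at the bytes themselves.

    No media type, file name or other external label is consulted.
    """
    # Step 1: is it readable text at all? (UTF-8 is an assumption here.)
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not text"

    # Step 2: does it follow the XML agreement, i.e. is it well-formed?
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        return "text, not well-formed XML"

    # Step 3: read the namespace of the root element.
    # ElementTree spells qualified names as "{namespace}localname".
    if root.tag.startswith("{"):
        namespace, local = root.tag[1:].split("}", 1)
    else:
        namespace, local = "", root.tag

    # Step 4: map the namespace to a language.
    if namespace == XHTML_NS and local == "html":
        return "XHTML"
    return f"XML, root namespace {namespace!r}"


# Hypothetical sample inputs, for illustration only.
XHTML_DOC = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b"<head><title>t</title></head><body><p>hi</p></body></html>"
)
TAG_SOUP = b"<!DOCTYPE html><html><body><p>hi<br></body></html>"

if __name__ == "__main__":
    for sample in (XHTML_DOC, TAG_SOUP, b"\xff\xfe\x00\x01"):
        print(identify(sample))
```

Note that identify takes no media type argument: whatever label a server attaches when sending these bytes, the function sees the same content and gives the same answer. Whether a browser should act on that answer or on the label is exactly the point under dispute in this thread.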
Received on Thursday, 21 January 2010 18:18:28 UTC