- From: Dr. Olaf Hoffmann <Dr.O.Hoffmann@gmx.de>
- Date: Fri, 28 Aug 2009 17:19:35 +0200
- To: public-html-comments@w3.org
Ian Hickson:
> On Sun, 16 Aug 2009, Dr. Olaf Hoffmann wrote:
> > Ian Hickson:
> > > On Wed, 12 Aug 2009, Dr. Olaf Hoffmann wrote:
> > > > The meaning of some elements is different in 'HTML5' as well or is
> > > > defined in a more restrictive way, what excludes some use cases
> > > > possible in HTML4.
> > >
> > > Yes, but in practice that's not an issue since HTML5 describes how
> > > HTML4 UAs actually did things.
> >
> > User agents present the elements somehow, often this does not directly
> > imply a meaning.
>
> I agree, only the spec can imply a meaning.
>
> > And if we take (again, I already discussed this with Anne) the sample
> > of the element small, the presentation implicates no specific meaning,
> > what is ok for HTML4, because the definition does not imply a specific
> > meaning either. The audience has to derive this from relations to the
> > content around it. 'HTML5' defines a meaning for small.
> >
> > Therefore the 'HTML5' definition does not apply for all use cases
> > in HTML4 documents, just to a subset.
>
> By that reasoning, the 'HTML4' definition does not apply for all instances
> of HTML4 documents either. After all, most HTML4 documents have some level
> of conformance error, and an overwhelming number of HTML4 documents use
> elements incorrectly. I don't think this is a useful line of reasoning.

'Not really sure what you're saying here.'
I never had a problem understanding the specified meaning of most HTML4
elements, therefore I cannot see errors in the definitions of the
meaning of those elements, especially because for several of them the
semantic meaning is vague, which often corresponds to vague use cases
in many documents. In general HTML does not have many elements, and
therefore, to cover every possible type of text, the meaning of
elements has to be very broad.
Another approach could of course be to have a larger collection of
elements to mark up text (which is partly available in other formats
for more or less specific use cases).
That some authors use some elements for unintended purposes is not
directly a problem of the specification; this can be a social problem,
a problem of limited intellectual capabilities, of indifference and
ignorance. Such things cannot be changed with another version of a
language. Maybe this cannot be changed at all.

>
> > However, because user agents do not need to care about the meaning,
> > the presentation may not differ. Authors have to care about the meaning
> > and cannot use the element in 'HTML5' for some use cases.
> > This is not necessarily a problem for authors, if there are other
> > elements intended in 'HTML5' for their use case and because HTML4 and
> > 'HTML5' are different versions and looking at the version indication
> > (doctype) one can at least identify, when authors use the HTML4
> > definitions. As long as 'HTML5' has no version indication, there is no
> > simple way to indicate, that the definition in 'HTML5' applies.
>
> Just always assume the HTML5 definition applies.

This would surely be wrong for several use cases which 'HTML5' excludes.
It is no problem for me that 'HTML5' excludes them for 'HTML5'
documents. However, for HTML4 documents they are still possible, and
this is not time dependent and does not depend on other people
specifying other versions of the language.
To believe in 'HTML5' for such documents would imply that some
constructions have no meaning at all, because the elements are used for
the wrong purpose.
Or if I define an HTML version on my own, exchanging the definitions of
the elements p and blockquote, and ensuring that this new HTML version
supersedes any other version, I think this does not really change the
meaning of p and blockquote in HTML4 or in 'HTML5'; it changes only the
meaning for documents written in my own HTML version.
Or more related to the discussion - in 'my version' of 'HTML5' I could
define a version attribute - how does this change the meaning of the
W3C draft? It changes the meaning and usability of my draft, but not
that of the W3C draft. And the changes are not necessarily relevant for
the presentation in a user agent.

> As far as I can tell,
> that won't cause any problems. Can you point to a page where doing this
> causes a practical problem with a software package?
>

Well, authors are no software packages, therefore
'Not really sure what you're saying here.'

> > Or the meaning of cite is defined more precisely, what is ok for
> > a new version, but not applicable for the usage in HTML4 documents.
>
> HTML4's description was apparently vague enough that different people
> consider it to mean different things. My interpretation of HTML4's text is
> that HTML5's definition of <cite> is a superset of HTML4's.
>
> > Ok - what really a proper content is, depends on several things,
> > if simply a private communication or a dictum is quoted, it has no
> > title and one has to note the name of the person.
> > If a work has specific authors, both title and authors and maybe
> > a source or a unique identifier may belong to the citation information.
>
> Not really sure what you're saying here.
>

I just compare what is often included in citations with what is
currently noted about the content of cite in the current draft.
I think you mean that the draft definition is a subset of what is
possible as content in HTML4?
It is no problem for me that the draft defines in more detail what to
note in cite in 'HTML5' documents; however, HTML4 does not do it and
therefore this element can contain other content in HTML4 documents.

For example:
<blockquote>
Not really sure what you're saying here.<br>
<cite>Ian</cite>
</blockquote>
looks ok to me within HTML4, not in 'HTML5'.
Or
<p>
It was demonstrated, that the simultaneous optical excitation
of atoms and molecules within collisions can be observed with
differential detection in a beam experiment. <br>
<cite id="ref1">V. A. Aleskseev, J. Grosser, O. Hoffmann, F. Rebentrost
in JCP <b>129</b> 201102 (2008)</cite>
</p>

Alternatively the cite could contain an element a with a reference:
<cite><a href="#ref1">[1]</a></cite>

There are different styles for citation and different information -
whom, what and which resource, not only the title 'title of a work'.

For 'HTML5' one has to write something like this:
<cite>Simultaneous optical excitation of Na electronic and
CF<sub><small>4</small></sub> vibrational modes in
Na+CF<sub><small>4</small></sub> collisions</cite>

Which is quite different information. This sample includes a problem
with the element small of course, therefore this is again only ok in
HTML4, but not in 'HTML5'; there I think one has to use MathML to mark
up the molecule.

> > acronym - non conforming feature in the current draft, well
> > defined in HTML4.
> > I think, with the instead recommended element abbr there is a problem
> > with other (legacy?) versions of MSIE.
> > Obviously here the 'HTML5' draft does not include an explanation of
> > the meaning of HTML4 documents and does not necessarily
> > do a better job concerning the description of the interpretation of
> > legacy viewers.
>
> I don't understand what you're asking for here. HTML5 says that <acronym>
> should be handled as a synonym for <abbr>.

It is noted under '12.2 Non-conforming features' with:
"acronym
Use abbr instead."

Because every acronym is an abbreviation too, there is no problem in
doing this in 'HTML5' - one can use microdata/RDFa to specify it in
more detail, if required.
However, in HTML4 it is not a 'non-conforming feature' and can have a
slightly different and more specific meaning than abbr.
Therefore this is surely another example of something an author can use
in HTML4 with a meaning, but should not use in 'HTML5', because it is
indicated as a 'non-conforming feature'. Obviously, this definition
does not apply to acronym within HTML4 documents.

>
> > The content model of dl is more restrictive in 'HTML5' - surely
> > it cannot describe uses of the less restrictive model of HTML4.
>
> <dl> hasn't changed as far as I can tell.
>

HTML4:
<!ELEMENT DL - - (DT|DD)+ -- definition list -->

'HTML5':
Content model:
Zero or more groups each consisting of one or more dt elements
followed by one or more dd elements.

> > And viewers have no problem to present such uses, therefore
> > 'HTML5' may have a better definition of definition lists, excluding
> > some not very nice use cases, but it does not describe several
> > really existing HTML4 documents or how they are presented
> > by current viewers.
>
> I don't follow. Could you include some examples maybe?

These are for example the nasty 'poetry' samples discussed a longer
time ago. Because HTML still has no elements for poetry, at least in
HTML4 documents one has to work around this. In XHTML one can use
related elements from other languages, or in XHTML+RDFa one can
indicate the meaning with RDFa; in 'HTML5' this may work with
microdata/RDFa as well. However, for the last two variants one has to
use proper elements with a sufficient structure model for strophes
(stanzas) and strophe lines. For example dl/dd (excluded in 'HTML5') or
div/div or maybe section/div. Because 'HTML5' has other elements and
dl/dd is not applicable, the best possible solution in 'HTML5' looks
different than in HTML4 or XHTML1.x.

Maybe some use dl for recipes, bills or some structures in the bible
for example (this is already discussed in the related wiki). 'HTML5'
does not describe this, which is no problem for documents of previous
HTML versions, and authors of 'HTML5' documents can find other (maybe
better) solutions.
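To make the difference between the two dl content models quoted above
concrete, here is a minimal sketch in Python. It assumes that the child
element names of a dl are simply listed in document order; the names
HTML4_DL, HTML5_DL and dl_children_ok are hypothetical and come from
neither specification, and this is only an illustration, not a
validator.

```python
import re

# The child element names are joined into a string such as "dt dd dd "
# so that both content models can be written as regular expressions.

# HTML4 DTD: (DT|DD)+ -> one or more dt or dd elements, in any order.
HTML4_DL = re.compile(r"(?:(?:dt|dd) )+")

# 'HTML5' draft: zero or more groups, each consisting of one or more dt
# elements followed by one or more dd elements.
HTML5_DL = re.compile(r"(?:(?:dt )+(?:dd )+)*")


def dl_children_ok(children, model):
    """Return True if the list of child names matches the content model."""
    encoded = "".join(name.lower() + " " for name in children)
    return model.fullmatch(encoded) is not None


# A 'poetry' stanza with strophe lines marked up with dd only.
stanza = ["dd", "dd", "dd"]
print(dl_children_ok(stanza, HTML4_DL))  # True
print(dl_children_ok(stanza, HTML5_DL))  # False
```

Under these assumptions the dl/dd-only structure is allowed by the
HTML4 model and excluded by the 'HTML5' model, while an empty dl is the
reverse case.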
This is why 'HTML5' is different from HTML4 and why it does not define
the meaning of some structures in HTML4 documents.
This happens mainly because there is no concept in 'HTML5' either to
just simplify the element collection and use something like RDFa to
provide a semantic meaning, or to define a more complete collection of
semantic elements for specific use cases. In 'HTML5' it is more a
matter-of-taste mixture of changes or improvements, therefore clearly
different from HTML4, and neither of them is a true subset of the
other. They share many common or similar features.

>
> > I think, for object some attributes are missing.
> > Well, some authors used some of them wrong and
> > something like declare was not widely implemented.
> > Both does not indicate directly a problem with the HTML4
> > definition.
>
> I think that's exactly what it indicates, actually.
>
> > I think, there is still no declarative method in
> > the draft to start some time dependent content of object,
> > therefore declare is really missing in 'HTML5', not only
> > for object. However, if an author uses it in a HTML4
> > document, one cannot expect that the behaviour of
> > a browser ignoring this attribute is that, what was
> > intended by the author ;o)
> > The implementation gap simply excludes some use
> > cases of object in practice - maybe one of the reason,
> > why there is currently a lot of strange content around,
> > trying to simulate such functionality somehow to work
> > around the gap.
>
> I'm not sure what you're asking for here.

Well - not related to this discussion here, but SMIL and SVG have
declarative methods to begin and to end for example video and audio.
HTML4 had at least declare to begin such objects. 'HTML5' does not have
it. Maybe the best approach for authors is still to embed SVG or Flash
to do the job.
'HTML5' clearly fails to provide a simple declarative method to allow
authors to specify buttons or selection tools to begin and end such
media. To be able to begin for example a video or audio after a user
interaction is often important to allow a selection between different
options.

>
> > I think, there are several more samples, all of them show, that
> > 'HTML5' does not describe all 'valid' HTML4 documents properly.
>
> Could you list them? I should fix them, if so.
>

From my point of view 'HTML5' is a new version of HTML and can
therefore be different. No need to fix something or to list it.
If an author indicates that the version 'HTML5' is used, these new
definitions and meanings apply. If HTML4 is indicated, the old meanings
apply. No problem at all with version indication. And no need to spend
time to compare and to list differences (which can have good reasons of
course).

> > I do not think, 'HTML5' has to do this, because it is a new version
> > of the language.
>
> I think HTML5 must do this, because it is a new version of the language.
>
> > It is just pretty useless to disclaim such simple
> > facts and incompatibilities.
>
> Not sure what you mean.

;o)

>
> > > > And as far as I have seen, those changes are not mentioned
> > > > in the current draft (as well as maybe some missing attributes).
> > > > If we take the sample of the version attribute itself, it does not
> > > > define what it means, HTML4 for example does.
> > >
> > > HTML4's statements on the matter are inconsistent with actual
> > > implementations and legacy content.
> >
> > I cannot see, what is inconsistent here:
> >
> > "version = cdata [CN]
> > Deprecated. The value of this attribute specifies which HTML DTD version
> > governs the current document. This attribute has been deprecated because
> > it is redundant with version information provided by the document type
> > declaration.
> > "
>
> This is inconsistent, e.g., with the following text in HTML4:
>
> # The document type declaration names the document type definition (DTD)
> # in use for the document [...]
>
> Which is it? The DOCTYPE or the version="" attribute?
>

Not really sure what you're saying here.
If there is no doctype (as in XHTML+RDFa), the indication in version
applies. If there is a doctype, that applies. If doctype and version
information are incompatible, the version seems to be undefined,
because I think there is no information about which takes precedence.
Therefore authors have to avoid such conflicts, as they should in
general.

> > This does not even suggest a specific use of the attribute or that
> > the interpretation or presentation of a simple browser must depend
> > on such an information.
>
> Indeed, the text you quoted is completely empty of normative conformance
> criteria. It doesn't define anything; the spec would lose nothing if that
> text was removed. This is typical of much of HTML4.
>

Another variant of (X)HTML is more specific about this.
In XHTML+RDFa it is noted:
'There SHOULD be a @version attribute on the html element with
the value "XHTML+RDFa 1.0"'
And this is more relevant, because HTML4 documents have the doctype to
indicate the version, while this XHTML variant does not necessarily
have a doctype.

'HTML5' currently has no other version indication, but the XHTML
namespace has one. Therefore at least for the XHTML/XML variant of
'HTML5' one can use a version attribute belonging to the XHTML
namespace, but because 'HTML5' still does not say how to indicate the
version, the value of the attribute is still a question - the best
choice could maybe be the URI of the recommendation, because there are
no two versions with the same URI if nothing went wrong.

> Anyway, I'm not interested in arguing about the flaws of HTML4. It's a
> decade too late for that.
>
> > [...]
>
> I don't really understand what you want me to do, at this point.
> If you could concisely state what problem exists in the HTML5 spec that
> you believe should be addressed, I can try to address it (if it really
> is a problem).

Well, that is simple. Allow authors to indicate 'HTML5' as a version,
for example with
<html version="http://www.w3.org/TR/html5/" ...>
This would already be better than the XHTML+RDFa approach.

This is mainly meta information about the semantic meaning of the
document content, not a requirement for user agents to do something
specific. It is similar to microdata information. For simple
presentation you need not care about it, but if someone is trying to
find out the relation between the current document and the meaning of
the used language and its elements, this is interesting information.
HTML documents often have meta information not relevant for any user
agent, for example meta elements containing descriptions or keywords,
or encoding information when the server already sent the encoding
information. However, it is not completely useless just because it is
not relevant for some user agents or some situations.

> However, this conversation at this point is meandering
> apparently aimlessly and I'm not sure that it will lead to a productive
> conclusion.
>

My personal impression is that this happens quite often with
discussions in the 'HTML5 WG', if semantic issues or issues interesting
for authors are discussed. To change this, maybe one has to find out
what the collective problem of the WG with such issues is ;o)

> > > > A current draft cannot change the meaning of a previous
> > > > specification/recommendation and it does not change the meaning of
> > > > documents written in this previous language version.
> > >
> > > Actually, it can, when the older specification was incorrect.
> >
> > How can it be incorrect, if the semantical meaning of the content of an
> > element is defined?
>
> The spec isn't the final word on the meaning of the language.
> The use of the language is the final word on the meaning of the
> language.
>

This applies more to spoken languages and dictionaries. The
dictionaries only describe how the words of a language are currently
used. With a specified language this is different. It is a technical
terminology with fixed meanings. This is one of the main reasons to use
something like markup languages at all.

> If everyone uses <embed> to embed a plugin, then that's what <embed>
> means. If everyone uses <object codebase=""> to specify the source of the
> plugin, then that's what that attribute means, even if HTML4 says that the
> attribute gives the base URL for the classid="" attribute.

In HTML4 documents 'embed' means nothing. And at least on my Linux
computers not even all browsers interpret this (for example I think
Opera still ignores it, at least in combination with SVG ;o)

And even if millions of people believe in 1+1=1, this does not mean
that this is the meaning or that this is true, or that this implies
changing the convention that '+' typically means addition and not
multiplication. It mainly indicates that millions of people are wrong.
This is not surprising.
And one of the advantages of well defined technical terminologies is
that a minority is still able to express and to share relevant
information, even if the majority is not able to understand or to use
such information at all. And if they refer to a specification of their
terminology, everyone else can at least learn it and can understand
what was intended. On the other hand, it is quite simple to check
whether an assertion within this terminology is meaningful or not.
If the meaning were adjusted to the majority, this would mainly result
in more stupidity.

Olaf
Received on Friday, 28 August 2009 15:32:26 UTC