- From: Dr. Olaf Hoffmann <Dr.O.Hoffmann@gmx.de>
- Date: Thu, 17 Sep 2009 12:30:20 +0200
- To: public-rdf-in-xhtml-tf@w3.org
Manu Sporny:
>Dr. Olaf Hoffmann wrote:
>> I think, it would be more consistent to use
>> something like 'HTML5+RDFa1.0' as a value
>> for the HTML5 superset to avoid confusion, because
>> the semantics, content models and element and attribute
>> collection of HTML5 are different (and partly incompatible) from
>> XHTML1.1 and none of them is a superset of the other,
>> therefore another version indication for the HTML5 variant
>> seems to be essential to be able to distinguish.
>
>Hi Olaf,
>
>This is one of the items that will be covered on one of the upcoming
>RDFa telecons, so the language in that spec is preliminary and hasn't
>been discussed in detail.
>
>I was attempting to find a balance between backwards compatibility and
>semantic accuracy when writing that section.
>
>My reading is that the WHAT WG and HTML WG do not want to be able to
>distinguish between HTML4 and HTML5, as HTML5 is meant to be a
>backwards-compatible, natural progression to HTML4.

Basically I agree with this observation, but as far as I understand it, they do not care about XHTML1.1, only about HTML4 and XHTML1.0. And obviously they failed to produce drafts that are really semantically backwards compatible, because the meanings and content models of several elements have been changed in the current drafts. In some cases there might be good reasons for this, and combined with the new elements and attributes it is in the end big progress, but it is surely not compatible with HTML4 or XHTML1.x on a semantic and structural level.

Personally I think this is a major problem for careful authors: because they cannot indicate the (X)HTML version for 'HTML5', they cannot use it at all, simply because it is not known how to indicate it. For HTML4, XHTML1.0, XHTML1.1 or XHTML+RDFa 1.0 they can indicate what they use (partly with strings and constructions they do not really understand, but with a clear relation to some specification), and therefore these are usable in practice.
With a version indication for HTML5+RDFa1.0 this becomes usable too. This is even more important for the RDFa variant, because authors using it are more likely to care that there is a well-defined relation to a specification of what the elements mean. They can use it in a similar way as they use RDF(a): to indicate a relation to another specification, in order to get a well-defined meaning for their constructions.

Semantic and author issues are not very relevant for the current HTML5 drafts at all, therefore a version indication seems irrelevant to many people for the HTML5-tag-soup variant. But authors who use RDFa at all can be expected to care about semantics and well-defined relations. Therefore such a relation indication becomes much more relevant for the HTML5+RDFa variant than for the HTML5-tag-soup variant.

>Several people have
>asserted that HTML5 shouldn't be versioned, so if we were to put "HTML5"
>in the @version attribute, there would probably be push-back.

Well, authors who do not care about a defined relation to the meaning of elements can simply not use the version attribute. This will happen anyway, independent of the question whether some RDFa appears within the document by copy and paste or intentionally ;o)

Authors who use RDF(a) because they understand the mechanism may want to care about the version they use, and can do this with this extension. If this RDFa draft is designed to be an extension to the HTML5 draft concerning semantics and readability for simple programs, it looks like progress if a simple program is also enabled to identify the HTML version currently used in the document.

>
>The first draft of the HTML+RDFa spec actually specified only one
>version: "XHTML+RDFa 1.0", which seemed inaccurate when the version
>attribute was specified in a non-XML mode document (HTML5 vs. XHTML5).
>

It is even more inaccurate, because this string is already reserved to indicate the relation to XHTML1.1. Maybe a better string for this variant would have been 'XHTML1.1 + RDFa 1.0', but it is now a little bit late to change that. As far as I understand the XHTML+RDFa 1.0 recommendation, version="XHTML+RDFa 1.0" simply identifies XHTML1.1 + RDFa 1.0. Currently a program can, for example, use it to validate the document and to indicate errors and so on. Everything in (X)HTML5 that does not fit XHTML1.1 + RDFa 1.0 should then be indicated as wrong, for example the new elements introduced in HTML5. This can only be avoided with another string for (X)HTML5+RDFa.

>So, we could have two acceptable @version attribute values for RDFa (one
>for non-XML mode and another for XML mode):
>
>version="HTML+RDFa 1.0"
>version="XHTML+RDFa 1.0"
>
>This is a bit annoying because we really only care about the "RDFa 1.0"
>part of the version string.

Another option could be a URI (with fragment identifier) as the value of the version attribute, pointing to the definition of the variant used - or a whitespace-separated list of such pointers, indicating all versions and formats used. Then one does not always have to update the RDFa extension if a new format or version appears ... This almost fits the URI/CURIE approach of RDFa itself ;o)

>So, to be backwards-compatible with
>XHTML+RDFa 1.0 and to provide some degree of future-proofing, we could
>say that the @version string should contain the text "RDFa 1.0" in it
>somewhere. The following regular expression could be used to detect the
>string in the @version attribute in any language employing RDFa:
>
>\+?RDFa 1\.0(\+|\+.*|)$
>
>Basically, if the string "RDFa 1.0" exists in the @version attribute
>(either surrounded by '+' characters or not), then the document contains
>RDFa 1.0 syntax. This allows people to do stuff like:
>
>version="SVGTiny 1.2+RDFa 1.0" or

This is not necessary, because SVGT1.2 already has the necessary attributes defined.
One can simply use them - this would be another good approach for HTML5 as well, especially because it is not a modularised version, unlike XHTML1.1.

SVGT1.2 defines only:
version = "1.0" | "1.1" | "1.2"
and
baseProfile = "none" | "full" | "basic" | "tiny"

RDFa attributes are already covered by version="1.2" baseProfile="tiny". version="SVGTiny 1.2+RDFa 1.0" would be an unsupported value, which means almost the same as if the attribute had not been specified at all.

>version="HTML+RDFa 1.0+CoolLanguageExtension 2.1"
>This is beneficial because we do want RDFa to be easily mixed-in with
>future element/attribute-based languages.

Then the URI/CURIE-list approach for the version attribute looks perfect for this. Typically the URIs of specifications are unique and persistent, therefore there is a meaningful (bijective) relation between the string and the definitions.

>> By the way, 2.1 explains, that the document structure
>> can be changed. Maybe it could be useful to add a
>> note for authors, that they can avoid this, if they note
>> implied elements and other HTML artefacts explicitly
>> to ensure, that such a modification does not change
>> their intents...
>
>I'm not quite sure I understand what you mean completely, so the
>following may not address your concern.
>
>AFAIK, there is currently no way to signal that a document's elements
>shouldn't be re-arranged by an HTML5 parser. Henri, Ian, is this correct?

I think there is no way to prevent this. But if authors do not nest elements wrongly and do not leave out elements which are then added to the DOM automatically (this appears for example in tables, I think with tbody, which already caused some confusion with CSS), they can in practice avoid that the parser manipulates anything. This manipulation does not happen randomly; it has defined causes, which one can avoid. If authors do something stupid, such a DOM manipulation due to implied elements or error fixing may change the intended structure from which the RDFa is extracted.
At least an informational note/hint to authors might help them to be more careful to avoid such nonsense, because they can do this if they want to, and if there is a danger that something is misinterpreted when such RDFa structures are extracted by a simple program. And from an HTML5-tag-soup parser they get no hint that something went wrong, or that the structure had to be manipulated to fix nonsense or to add implied elements before something is extracted.

>Since RDFa usually sits on top of the DOM layer in XHTML and HTML, or is
>provided a SAX-based interface to the document, the RDFa Processor
>doesn't know if the input stream was or wasn't modified by a document
>parser. In short, I don't think there is any way for us to say that
>authors can avoid document restructuring as that happens outside of the
>RDFa Processor's purview.
>
>I've added your primary concern to the wiki:
>
>http://rdfa.info/wiki/Html5-rdfa-wd-issues#3.1_Document_Conformance
>
>-- manu
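
The two detection ideas discussed above, the regular expression from Manu's message and the whitespace-separated URI list, can be tried out with a short script. This is only a minimal sketch in Python, not part of either proposal: the function names declares_rdfa_1_0 and uses_spec are made up for illustration, and the example.org URIs are hypothetical placeholders, not real specification addresses.

```python
import re

# Regular expression quoted from Manu's message, unchanged.
RDFA_1_0 = re.compile(r"\+?RDFa 1\.0(\+|\+.*|)$")


def declares_rdfa_1_0(version: str) -> bool:
    """Return True if the @version string matches the quoted pattern."""
    # search() rather than match(): the pattern is anchored only at the end.
    return RDFA_1_0.search(version) is not None


def uses_spec(version: str, spec_uri: str) -> bool:
    """Return True if spec_uri is one entry of a whitespace-separated URI list."""
    return spec_uri in version.split()


# Hypothetical placeholder URIs (assumption), not real specification addresses.
HTML5_URI = "http://example.org/spec/html5"
RDFA_URI = "http://example.org/spec/rdfa#v1.0"

if __name__ == "__main__":
    for value in (
        "XHTML+RDFa 1.0",
        "HTML+RDFa 1.0",
        "HTML+RDFa 1.0+CoolLanguageExtension 2.1",
        "XHTML 1.1",
    ):
        print(repr(value), declares_rdfa_1_0(value))
    print(uses_spec(f"{HTML5_URI} {RDFA_URI}", RDFA_URI))
```

Note that the pattern is not anchored at the start, so any prefix before "RDFa 1.0" is accepted, whereas the URI-list check only compares whole entries.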
Received on Thursday, 17 September 2009 10:42:15 UTC