- From: Maciej Stachowiak <mjs@apple.com>
- Date: Tue, 01 Sep 2009 23:50:20 -0700
- To: Manu Sporny <msporny@digitalbazaar.com>
- Cc: HTMLWG WG <public-html@w3.org>, RDFa Developers <public-rdf-in-xhtml-tf@w3.org>
- Message-id: <165CBAE1-AE50-4334-88C4-38DCEAB7522A@apple.com>
Hi Manu,

Thanks for your prompt reply. And thanks again for submitting this draft. I think that is a very positive step.

On Sep 1, 2009, at 9:20 PM, Manu Sporny wrote:

> Maciej Stachowiak wrote:
>
>> 4 Modifications to XHTML+RDFa
>> - One concern I have with only applying the changes to HTML: what if an
>> RDFa processor has a parsed DOM, but does not know if the DOM was
>> originally created from parsing HTML or XML?
>
> Hmm... that shouldn't matter. What made you think it does matter?
>
>> It would be better if a single set of rules could be used once you have
>> a DOM, without having to know what kind it is, since the DOM itself does
>> not directly expose that information.
>
> This was the intent of the section, what made you think that the rules
> for an XHTML DOM and an HTML DOM were different?

I may have just failed to understand the spec. Here's what led to my conclusion:

"XHTML+RDFa specifies the attributes and processing rules for extracting RDF from an XHTML document. This section specifies changes to the attributes and processing rules defined in XHTML+RDFa in order to support extracting RDF from HTML documents."

To me, this implies that the changes here apply only to HTML (as in the text/html serialization), but XHTML (even XHTML5) should be processed strictly according to XHTML+RDFa and nothing else. As a concrete example, my reading was that the "lang" attribute is only processed for text/html documents, but "xml:lang" is only processed for XML documents. Thus the recommendation to include both, since that would be the only way to get consistent behavior. That seems like a case where an RDFa processor that works on a DOM would have to know if the DOM came from an HTML or XML serialization. Am I misunderstanding?

>
>> 4.2 Invalid XMLLiteral values
>>
>> - Do XMLLiteral values only need to be well-formed, or do they need to be
>> namespace well-formed?
>
> I believe the current consensus is that they just need to be
> well-formed.
> Did you have a technical reason why they should be one over
> the other?

I figured since namespaces are important to RDF, that ns-well-formed content would be desired. I don't know of a strong reason to prefer one or the other.

However, I have noticed what I think is another problem with this section. The definition for "well-formed XML" points to the definition of a well-formed XML document. But it appears to me that an XMLLiteral is an XML fragment, not an XML document, and in general an XML fragment need not be a well-formed XML document (for example it may have multiple elements and text nodes at top level instead of a single root). Further, serializing an element's DOM children as XHTML5 per the spec will not guarantee a well-formed XML document. I believe it will guarantee a valid XML fragment, but I'm not sure offhand where that is defined.

Conclusion: I think the definition here should point to a definition of well-formed XML fragment.

Additional comment: it seems like the serialization as XHTML5 per the HTML5 spec rules should always be done. A DOM fragment doesn't really have a notion of being well-formed XML or not - it needs to be serialized somehow. And it probably makes sense to use the HTML5 algorithm regardless of whether the source DOM tree was HTML or XML. This might avoid the need to link to any well-formedness definitions (not sure though).

>
>> 4.3 The xmlns: attribute
>> - "CURIE prefix mappings specified using xmlns:" does not clearly
>> specify how attributes starting with xmlns: turn into prefix mappings.
>> The processing model for this should be defined precisely.
>
> The processing rules for converting xmlns: to prefix mappings are
> outlined in the XHTML+RDFa spec, Section 5.5:
>
> http://www.w3.org/TR/rdfa-syntax/#sec_5.5.
>
> Is that sufficient? If not, why not?

Some concerns:

- The draft references [Namespaces in XML], not Section 5.5 of RDFa in XHTML.
Looking at the bit of that section that's relevant:

"Next the [current element] is parsed for [URI mapping]s... Mappings are provided by @xmlns. The value to be mapped is set by the XML namespace prefix, and the value to map is the value of the attribute—a URI."

However, since HTML doesn't really have a notion of XML namespace prefix, the processing rules need to be defined in terms of the textual name of the attribute for HTML DOMs; you can't soundly reference XML-only concepts to define things for HTML.

Also, reading over this, it seems like the processing rule is wrong even for RDFa in XML! The attribute named "xmlns" does not establish any namespace prefix binding, it just gives the default namespace URI. Rather than @xmlns, the spec surely meant to say something like "Mappings are provided by XML namespace declarations - attributes that have the xmlns namespace prefix". Second, the part of the attribute that should define the prefix binding is the local name, not the XML namespace prefix - the XML namespace prefix for all non-default namespace declarations is the string "xmlns", and for the literal attribute name "xmlns" the namespace prefix is the empty string. It seems to me this needs to be errata'd, because the spec taken literally is surely incompatible with what all real RDFa processors do.

>
>> General comments:
>> - I found it very hard to follow this document, since it seems to assume
>> full knowledge of RDFa in XHTML and only defines a delta.
>
> That's correct, this spec does require full knowledge of XHTML+RDFa. The
> document attempts to not duplicate normative content between XHTML+RDFa
> and HTML5+RDFa specifications. There are very few changes needed to put
> RDFa into HTML5, so we didn't see a need to re-state large sections of
> the RDFa specification in this document.
> By duplicating the XHTML+RDFa
> REC language, we create a mechanism where we unnecessarily duplicate
> content at best, and at worst, we could accidentally deviate from the
> pre-existing RDFa REC language (and the test suite).

At the very least, references to the appropriate sections of XHTML+RDFa should be made explicit. Right now it seems there is a lot of implicit linkage. It also seems reviewers will have to study XHTML+RDFa to properly review HTML+RDFa.

>
>> As a result:
>> - It was hard for me to understand the actual processing model, so
>> that I'd understand what I had to do as an implementor.
>
> The processing model is the exact same as XHTML+RDFa, except for section
> 4.1 and 4.2 in the HTML5+RDFa document. Would expressing which steps
> section 4.1 and 4.2 refer to in the XHTML+RDFa document be beneficial?

Definitely. And also stating very clearly what should be done differently, and whether it applies only to HTML DOMs, or to XML DOMs as well.

>
>> - I had no notion of the syntax, so I wouldn't know what to do as an
>> author.
>
> The syntax is covered in detail in the XHTML+RDFa Syntax and
> Processing[1] document as well as the RDFa Primer[2] document. Are these
> not sufficient?

I would tentatively guess it's not sufficient, since those don't cover the syntax to use in HTML at all, and there are apparently some differences.

>
>> - As a reviewer, it was impossible for me to determine if the
>> processing requirements were precisely specified, free of contradictions
>> and sane.
>
> Would making the changes you listed help alleviate this issue?

Only partly. I think what I need to do to give a sufficiently thorough review is to review RDFa+XHTML itself. That may take a while - it is considerably longer than the draft you submitted.

>
>> For example, there was the idea to use a
>> "prefix" attribute instead of xmlns: declarations to define CURIE
>> prefixes, and also the idea to allow full URIs as an alternative to
>> CURIEs.
>> Have these ideas been rejected?
>
> Neither idea has been rejected. We're still discussing @prefix, but we
> cannot add it to XHTML+RDFa without performing a revision of the
> specification -- including the usual LC->REC process.
>
> @prefix is part of a larger set of changes that may be realized in RDFa
> 1.1 (the next version of RDFa, which we hope will unify RDFa expression
> in both HTML and XHTML). We hope that RDFa 1.1 will replace the current
> XHTML+RDFa and HTML5+RDFa FPWD with one specification document. Hence,
> why I personally think it would be better to have RDFa defined outside
> of the HTML5 specification than inside.
>
> We discussed full URI support in @rel/@rev/@property/@resource, and
> there is a technical solution that would allow it to happen, but it was
> met with some pushback in the RDFa community. I prefer to have this
> supported in RDFa, but we haven't attempted to gather consensus around
> the feature and probably won't until RDFa 1.1.
>
> RDFa 1.1 could also have a mechanism to extend the set of keywords,
> allowing more Microformats-like property names (that map to URIs), but
> again... that's a feature that may not be standardized for another year
> or so and would require a full LC->REC process.

Based on what you say, RDFa 1.1 seems potentially more interesting than the posted draft. Folding in text/html support in a primary spec instead of a delta spec, and enabling cross-serialization DOM consistency, both sound like big wins. What's the timeline for RDFa 1.1? Is it necessary to wait a year? Will the work be hosted by an existing Working Group, or will a new one be formed? You mention that a full LC->REC process is needed, but the same is true for the draft you posted.

Regards,
Maciej
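
To illustrate the xmlns: point in section 4.3 above, here is a minimal sketch in Python of deriving CURIE prefix mappings purely from the textual attribute name, which is the kind of rule an HTML DOM would need. The function name and the list-of-(name, value)-pairs representation are assumptions made for illustration; they are not defined by XHTML+RDFa or by the HTML+RDFa draft.

```python
def curie_prefix_mappings(attributes):
    """Return {prefix: uri} for attributes whose textual name is 'xmlns:<prefix>'.

    `attributes` is a hypothetical list of (name, value) string pairs, standing
    in for the attributes of a single element in a parsed DOM.
    """
    mappings = {}
    for name, value in attributes:
        head, sep, local = name.partition(":")
        # Only 'xmlns:<something>' declares a prefix. A bare 'xmlns' sets the
        # default namespace and is skipped, as is an empty local part.
        if head == "xmlns" and sep and local:
            mappings[local] = value
    return mappings


example = [
    ("xmlns", "http://www.w3.org/1999/xhtml"),
    ("xmlns:foaf", "http://xmlns.com/foaf/0.1/"),
    ("class", "vcard"),
]
print(curie_prefix_mappings(example))
```

Note that the prefix comes from the part of the name after "xmlns:", and that no XML namespace machinery is consulted, so the same rule could be applied to a DOM regardless of which parser produced it.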
Received on Wednesday, 2 September 2009 06:52:22 UTC