Re: Review comments on HTML+RDFa (was Re: FPWD Review Request: HTML+RDFa) from Maciej Stachowiak on 2009-09-02 (public-rdf-in-xhtml-tf@w3.org from September 2009)

From: Maciej Stachowiak <mjs@apple.com>
Date: Wed, 02 Sep 2009 12:28:05 -0700
To: Manu Sporny <msporny@digitalbazaar.com>
Cc: HTMLWG WG <public-html@w3.org>, RDFa Developers <public-rdf-in-xhtml-tf@w3.org>
Message-id: <66E94596-98B8-4039-B506-D9A1834207EF@apple.com>

On Sep 2, 2009, at 10:28 AM, Manu Sporny wrote:

> Maciej Stachowiak wrote:
>> I may have just failed to understand the spec. Here's what led to my
>> conclusion:
>>
>> "XHTML+RDFa specifies the attributes and processing rules for
>> extracting RDF from an XHTML document. This section specifies changes
>> to the attributes and processing rules defined in XHTML+RDFa in order
>> to support extracting RDF from HTML documents."
>>
>> To me, this implies that the changes here apply only to HTML (as in
>> the text/html serialization), but XHTML (even XHTML5) should be
>> processed strictly according to XHTML+RDFa and nothing else.
>>
>> As a concrete example, my reading was that the "lang" attribute is
>> only processed for text/html documents, but "xml:lang" is only
>> processed for XML documents. Thus the recommendation to include both,
>> since that would be the only way to get consistent behavior. That
>> seems like a case where an RDFa processor that works on a DOM would
>> have to know if the DOM came from an HTML or XML serialization. Am I
>> misunderstanding?
>
> Ahh, now I see how you came to that conclusion. Thanks.
>
> The intent is to have a unified set of rules for both XHTML and HTML.
> That intent is clearly not conveyed in an effective manner in the text
> that you cite. The current HTML+RDFa FPWD spec is confusing the
> matter. I've added the comment to the wiki as the paragraph should be
> reworded to be more clear.
>
> In general, an RDFa processor should not have to detect whether the
> DOM came from an HTML or XML serialization. The only reason I say "In
> general" is because this may not hold true for retrieving the
> xmlns:<prefix> mappings -- or the RDFa processor implementation may
> need to try multiple calls to the DOM to detect whether or not
> xmlns:<prefix> mappings exist for a particular element.
>
> To be clear, the intent is that RDFa Processors should use the same
> processing rules when processing "lang" and "xml:lang" for both HTML
> and XHTML DOMs.
>
> That being said, using @lang in an XHTML 1.1 document will result in a
> non-conformant document:
>
> http://www.w3.org/TR/xhtml11/changes.html
>
> Since RDFa is defined as operations on a tree-based model (a DOM-like
> structure), we can state rules that may operate on non-conformant
> documents that are translated into a DOM.
>
> Does that clarify the intent? If so, I'll attempt to author language
> that makes this more clear.

Thanks for the clarification. I do think the spec should be updated to  
make this clear.
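
To illustrate the unified rule described in the quoted text, here is a
minimal sketch (not taken from the spec) of a processor that reads the
language from a parsed tree without knowing which serialization produced
it. The function name effective_languages is hypothetical, and the
precedence of "xml:lang" over "lang" when both are present is an
assumption; the thread does not settle that ordering. ElementTree stands
in for a DOM here.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_languages(element, inherited=None):
    """Yield (tag, language) for an element and all of its descendants."""
    # Assumption: xml:lang is consulted first, then lang, then the parent.
    language = element.get(XML_LANG, element.get("lang", inherited))
    yield element.tag, language
    for child in element:
        yield from effective_languages(child, language)


# Hypothetical sample markup mixing both attributes.
root = ET.fromstring(
    '<div lang="en"><p>hello</p><p xml:lang="fr">salut</p></div>'
)
for tag, language in effective_languages(root):
    print(tag, language)
```

The same lookup applies to every element, so the caller never needs to
ask whether the tree came from text/html or from XML.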

>
> The markup above should produce the following triple:
>
> <> ex:markup "<rect width=\"300\" height=\"100\"
>            style=\"fill:rgb(0,0,255);stroke-width:1;
>            stroke:rgb(0,0,0)\"/>"^^rdf:XMLLiteral
>
> Tests 100-103 cover these namespace/whitespace preservation cases:
>
> http://rdfa.digitalbazaar.com/rdfa-test-harness/
>
> However, I'll check with the RDFa TF on this as it seems as if
> xmlns="http://www.w3.org/2000/svg" should be preserved in this case...
> don't remember why we don't preserve it. I may not be remembering a
> spec detail correctly.

If the spec doesn't preserve namespace declarations (whether default  
or prefix), that seems like a problem, since the semantics of the  
XMLLiteral will be changed if it is extracted without its associated  
namespace declarations.
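
The concern can be seen in a small sketch using Python's standard
ElementTree parser (an illustration, not part of the original exchange).
The same "rect" markup is parsed once with the SVG default namespace
declaration and once without it; the attribute values are shortened from
the triple quoted above.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# The literal as it would look with its namespace declaration preserved.
with_ns = ET.fromstring(
    '<rect xmlns="%s" width="300" height="100"/>' % SVG_NS
)
# The literal as extracted without the declaration.
without_ns = ET.fromstring('<rect width="300" height="100"/>')

# ElementTree stores expanded names, so the difference shows in .tag.
print(with_ns.tag)
print(without_ns.tag)
print(with_ns.tag == without_ns.tag)
```

A consumer that re-parses the literal sees an SVG rect in the first case
and an element in no namespace in the second, which is the change in
semantics described above.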

>> Conclusion: I think the definition here should point to a definition
>> of well-formed XML fragment.
>
> I'll make the spec text more clear on this. I don't know if there is a
> definition of a well-formed XML fragment anywhere. My understanding
> was that a well-formed XML fragment is any XML fragment that you can
> encapsulate in a single root element and that passes the test for a
> well-formed document. For example:
>
> <foo>
> YOUR_XMLLITERAL_HERE
> </foo>
>
> If the above passes an XML well-formedness validator, then you should
> generate the XMLLiteral triple. I've added a note to the wiki[1] to
> address this concern.

I'm not enough of an XML expert to know if this is a sound or  
appropriate definition, but it sounds like an improvement.
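
For reference, the wrapping test proposed in the quoted text can be
sketched as follows. This is only an illustration of that informal
definition, not a normative one; the function name
is_well_formed_fragment is hypothetical, and the wrapper element name
"foo" is taken from the example above.

```python
import xml.etree.ElementTree as ET


def is_well_formed_fragment(fragment: str) -> bool:
    """Wrap the fragment in one root element and try to parse it."""
    try:
        ET.fromstring("<foo>" + fragment + "</foo>")
    except ET.ParseError:
        return False
    return True


print(is_well_formed_fragment('<rect width="300"/> and <b>text</b>'))
print(is_well_formed_fragment("<a><b></a>"))
print(is_well_formed_fragment("<svg:rect/>"))
```

One caveat, assuming a namespace-aware parser as here: a fragment that
uses a prefix declared on an ancestor in the source document is rejected,
because that declaration is not in scope inside the bare wrapper. A real
check would probably need to copy the in-scope declarations onto the
wrapper element.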

>
>> Some concerns:
>
> I've added these concerns to the wiki.[1]
>
>> Also, reading over this, it seems like the processing rule is wrong
>> even for RDFa in XML! The attribute named "xmlns" does not establish
>> any namespace prefix binding, it just gives the default namespace URI.
>> Rather than @xmlns, the spec surely meant to say something like
>> "Mappings are provided by XML namespace declarations - attributes
>> that have the xmlns namespace prefix". Second, the part of the
>> attribute that should define the prefix binding is the local name, not
>> the XML namespace prefix - the XML namespace prefix for all
>> non-default namespace declarations is the string "xmlns", and for the
>> literal attribute name "xmlns" the namespace prefix is the empty
>> string. It seems to me this needs to be errata'd, because the spec
>> taken literally is surely incompatible with what all real RDFa
>> processors do.
>
> Nice catch. I've raised this as an errata item for XHTML+RDFa:
>
> http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Sep/0015.html
>
> We didn't have a single implementor or reviewer catch that comment,
> perhaps because the test suite and examples made it clear what was
> being discussed, but that doesn't mean it shouldn't be changed to be
> more accurate.

The people reviewing it probably knew what was intended and so glossed  
over the details.

Side note: it seems like the problem here might be avoided by simply  
referring to the Namespaces in XML definition of namespace scoping.
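
As a rough sketch of the scoping behavior in question (an illustration
only, using Python's minidom as a stand-in DOM), the prefix mappings in
scope for an element can be collected by walking up its ancestors. Only
attributes named "xmlns:<prefix>" contribute a mapping, keyed by the part
after the colon; the bare "xmlns" attribute sets the default namespace
and binds no prefix. The function name prefix_mappings and the
example.org URIs are hypothetical.

```python
from xml.dom import minidom


def prefix_mappings(element):
    """Return the prefix -> URI mappings in scope for a DOM element."""
    mappings = {}
    node = element
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        for name, value in node.attributes.items():
            # "xmlns" alone is the default namespace, not a prefix binding.
            if name.startswith("xmlns:"):
                prefix = name[len("xmlns:"):]
                # Nearer declarations take priority over ancestors.
                mappings.setdefault(prefix, value)
        node = node.parentNode
    return mappings


doc = minidom.parseString(
    '<div xmlns="http://www.w3.org/1999/xhtml"'
    ' xmlns:ex="http://example.org/a#">'
    '<p xmlns:ex="http://example.org/b#"'
    ' xmlns:dc="http://purl.org/dc/terms/"/>'
    '</div>'
)
inner = doc.getElementsByTagName("p")[0]
print(prefix_mappings(doc.documentElement))
print(prefix_mappings(inner))
```

The inner element sees its own "ex" binding rather than the one on the
outer element, and neither result contains an entry for the default
namespace.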

>
>> You mention that a full LC->REC process is needed, but the same is
>> true for the draft you posted.
>
> Correct, but it's simpler and more effective to do the LC->REC process
> with a document that has already gone through REC and thus needs minor
> modifications (XHTML+RDFa) than a completely new, 60+ page document
> (RDFa 1.1) with features that are still being worked out on the
> drawing board.
>
> We want to make sure that, for those who are authoring RDFa in HTML
> today, there is a valid spec for them to do so... sooner rather than
> later.

If the delta spec can in fact be done much faster, then that's fine.  
However, I think there is a lot of work to make a precise and  
technically sound delta spec.

Regards,
Maciej
Received on Wednesday, 2 September 2009 19:28:54 UTC