Re: XMLLiterals and Exclusive XML Canonicalization from Gregg Kellogg on 2010-10-24 (public-rdfa-wg@w3.org from October 2010)

From: Gregg Kellogg <gregg@kellogg-assoc.com>
Date: Sun, 24 Oct 2010 02:42:58 -0400
To: Manu Sporny <msporny@digitalbazaar.com>
CC: RDFa WG <public-rdfa-wg@w3.org>
Message-ID: <8EEAE75C-20FE-4772-8AE1-57E053A545AE@kellogg-assoc.com>

On Oct 23, 2010, at 9:09 PM, Manu Sporny wrote:

On 10/22/2010 07:52 PM, Gregg Kellogg wrote:
* Problems with @xmlns propagation to XMLLiterals given Exclusive
Canonical XML rules
* Potential need to keep @prefix and @xmlns separate in Evaluation
Context
* Need to keep in-scope @profiles in Evaluation Context

Yeah, I think you're right Gregg. The spirit of the XMLLiteral
serialization rules that the RDFa Task Force created a long time ago
ensured that active mappings were placed into the generated XMLLiteral
via @xmlns. @prefix needs to be preserved now as well, but I'm not sure
if we remembered to make that point in the document. Furthermore, the
way that you preserve @xmlns and the way that you preserve @prefix need
to be done very carefully.

Is there any expectation that @xmlns mappings and @prefix mappings be
kept distinct?

Yes, I believe that this is true. You need to now track where certain
mappings came from. That is, did they come from @xmlns, @prefix or a
@profile attribute?

As for keeping track of active profiles, you'll need to do that too in
order to make sure that the generated XMLLiteral preserves all necessary
state information.

Somebody more familiar with Exclusive Canonical XML rules will have to
answer your question about that - I remember there being a specific
reason we chose the form that we did, but can't remember what it was.

My understanding of Exclusive Canonical XML is that only the @xmlns declarations actually required by the XML nodes being output will be emitted, so prefixes used only inside RDFa attribute values would not be covered, unless I misunderstand this. For these, mappings need to be generated using @prefix.

If I'm incorrect in this, then I'm fine with preserving the use of @xmlns. Or, we could just not claim that it is Exclusive Canonical XML.
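
To illustrate the point about Exclusive Canonical XML: a namespace declaration is emitted only for prefixes that an element's own name or attribute names use, and attribute values are not examined. The Ruby sketch below is only an illustration under stated assumptions. The Hash-based element model and the helper name `visibly_used_prefixes` are made up for this example and are not part of any XML library or of the parser discussed here.

```ruby
# Simplified element model (an assumption for illustration, not a real XML API):
#   { name: "span", attributes: { "property" => "foaf:firstName" } }

# Returns the prefixes used by the element's own name and attribute names.
# "" stands for the default namespace. Attribute *values* are never inspected,
# which is why a CURIE prefix such as "foaf" does not count as used.
def visibly_used_prefixes(element)
  names = [element[:name]] + element[:attributes].keys
  prefixes = names.map.with_index do |qname, index|
    prefix, local = qname.split(":", 2)
    if local
      prefix
    elsif index.zero?
      "" # an unprefixed element name uses the default namespace
    end  # unprefixed attributes are in no namespace, so they yield nil
  end
  # The xml and xmlns prefixes never need a declaration of their own.
  prefixes.compact.uniq - %w[xml xmlns]
end

span = {
  name: "span",
  attributes: { "property" => "foaf:firstName", "xml:lang" => "en" }
}

used = visibly_used_prefixes(span)
puts used.inspect
```

Here only the default namespace is reported as used, so a mapping for `foaf` would have to be carried some other way, for example in @prefix.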

So, the rules probably go something like this:

- For all prefixes defined via @xmlns, follow the old XMLLiteral
generation rules.
- Ensure that prefixes defined via @prefix are placed into a @prefix
attribute on the XMLLiteral
- Ensure that all active @profiles are preserved in the @profile
attribute for the XMLLiteral.

Additionally, the active default vocabulary needs to be specified.

My logic is the following, which I think is correct:

Foreach XML Element in the literal:
  Traverse element and all descendant nodes to determine required namespaces
    Add to "defined_mappings" hash and set as @xmlns:xxx attribute on element
  Extract existing @prefix mappings and add to "defined_mappings" and "prefix_mappings" hashes
  Foreach prefix required to be defined on element
    Unless prefix is in "defined_mappings" hash
      add mapping to "prefix_mappings" hash
  Unless "prefix_mappings" hash is empty
    serialize all entries in "prefix_mappings" hash to @prefix

  Add profiles to @profile attribute
  Add default vocabulary to @vocab attribute
  Add any language to @xml:lang attribute

(You can see the Ruby code that implements this here: http://gist.github.com/643221)
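
Independently of the gist, the steps above can be sketched in a few lines of Ruby. This is a simplified illustration, not the code from the gist: the helper name `literal_attributes`, its arguments, and the Hash-based inputs are all assumptions. It takes the list of prefixes already found to be used by the XML nodes ("" meaning the default namespace) and treats every other in-scope mapping as one that has to be carried in @prefix.

```ruby
# Hypothetical helper: builds the attributes to add to one top-level element
# of an XMLLiteral. All names and the data model are assumptions.
def literal_attributes(used_prefixes, existing_prefix_attr, in_scope,
                       profiles: [], vocab: nil, lang: nil)
  attrs = {}
  defined_mappings = {}

  # Namespaces the XML nodes actually use become xmlns declarations.
  used_prefixes.each do |prefix|
    uri = in_scope[prefix]
    next unless uri
    defined_mappings[prefix] = uri
    attrs[prefix.empty? ? "xmlns" : "xmlns:#{prefix}"] = uri
  end

  # Mappings already present in the element's own @prefix are kept.
  prefix_mappings = existing_prefix_attr.to_s.split.each_slice(2)
                                        .map { |p, uri| [p.chomp(":"), uri] }.to_h
  defined_mappings.merge!(prefix_mappings)

  # Every other in-scope mapping is promoted to @prefix.
  in_scope.each do |prefix, uri|
    next if prefix.empty? || defined_mappings.key?(prefix)
    prefix_mappings[prefix] = uri
  end

  unless prefix_mappings.empty?
    # Sorted by prefix so that the output is stable.
    attrs["prefix"] = prefix_mappings.sort.map { |p, uri| "#{p}: #{uri}" }.join(" ")
  end
  attrs["profile"] = profiles.join(" ") unless profiles.empty?
  attrs["vocab"] = vocab if vocab
  attrs["xml:lang"] = lang if lang
  attrs
end

in_scope = {
  "" => "http://www.w3.org/1999/xhtml",
  "xhv" => "http://www.w3.org/1999/xhtml/vocab#",
  "rdf" => "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
  "foaf" => "http://xmlns.com/foaf/0.1/"
}

attrs = literal_attributes(
  [""], nil, in_scope,
  profiles: ["http://rdfa.digitalbazaar.com/test-suite/test-cases/xhtml11/0199.profile.html"],
  vocab: "http://www.w3.org/1999/xhtml/vocab#"
)
attrs.each { |name, value| puts "#{name}=\"#{value}\"" }
```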

This generates something like the following:

"<span xmlns=\"http://www.w3.org/1999/xhtml\"
property=\"foaf:firstName\"
prefix=\"foaf: http://xmlns.com/foaf/0.1/ rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# xhv: http://www.w3.org/1999/xhtml/vocab#\"
profile=\"http://rdfa.digitalbazaar.com/test-suite/test-cases/xhtml11/0199.profile.html\"
vocab=\"http://www.w3.org/1999/xhtml/vocab#\">
Gregg
</span>"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>

I remember it being that generated XMLLiterals didn't have to be the
same... in that, if the XMLLiteral expressed the proper semantics, it
was valid RDFa output.

I believe that validating this requires canonicalizing both the generated and the expected XML so that they can be compared easily. Where an attribute holds a list of values, as @profile and @prefix do, we really need to specify the ordering of those values.

I think that we should deal with your input as a Last Call
comment/issue, would that work for you?

I've updated my Ruby parser to follow these rules. You can try it on your own markup here: http://distiller.kellogg-assoc.com.

In summary, I believe that @prefix needs to be preserved, and @xmlns needs to be promoted to @prefix if it is not preserved through Exclusive Canonical XML mapping (i.e., not used in XML nodes). Furthermore, @prefix should be placed in a canonical order (sorted by prefix), and @profile needs to have its order maintained to preserve the proper processing order.
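
As a rough illustration of why the ordering matters when comparing literals, the Ruby sketch below puts an @prefix value into prefix-sorted form before comparison, and compares @profile as an ordered list. The helper names are hypothetical and the example.org profile URLs are invented for the example.

```ruby
# Parses an @prefix value ("foaf: http://... rdf: http://...") into a Hash.
def parse_prefix_attr(value)
  value.split.each_slice(2).map { |prefix, uri| [prefix.chomp(":"), uri] }.to_h
end

# Re-serializes an @prefix value with its mappings sorted by prefix.
def canonical_prefix_attr(value)
  parse_prefix_attr(value).sort.map { |prefix, uri| "#{prefix}: #{uri}" }.join(" ")
end

a = "rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# foaf: http://xmlns.com/foaf/0.1/"
b = "foaf: http://xmlns.com/foaf/0.1/ rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The same mappings in a different order compare equal once sorted.
same_prefixes = canonical_prefix_attr(a) == canonical_prefix_attr(b)

# @profile is order-sensitive, so it is compared as an ordered list.
p1 = "http://example.org/p1 http://example.org/p2".split
p2 = "http://example.org/p2 http://example.org/p1".split
same_profiles = p1 == p2

puts "prefix equal: #{same_prefixes}, profile equal: #{same_profiles}"
```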

Hopefully, someone else is updating their own parser and can validate the steps I've described, or create simpler ones. I think the processing steps need to be made more explicit.

-- manu

--
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: Saving Journalism - The PaySwarm Developer API
http://digitalbazaar.com/2010/09/12/payswarm-api/

Gregg

Received on Sunday, 24 October 2010 06:44:05 UTC