Re: References to CSS rules in RDFa syntax document from Niklas Lindström on 2007-11-06 (public-rdf-in-xhtml-tf@w3.org from November 2007)

From: Niklas Lindström <lindstream@gmail.com>
Date: Tue, 6 Nov 2007 16:03:33 +0100
To: "Ivan Herman" <ivan@w3.org>
Cc: "Manu Sporny" <msporny@digitalbazaar.com>, "W3C RDFa task force" <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <cf8107640711060703y58d0c49q867581197186abb4@mail.gmail.com>
Hi!

I believe there are more options (and pitfalls) to explore in IE7.

For instance, setting style="white-space: pre" on the h1 element makes
innerHTML return a value with *most* space preserved. Added newlines
do seem to disappear entirely though (e.g. if I replace the space in
"The Most" with only a newline, the innerHTML presents that part as
"TheMost". Not in the rendered page, but in the returned value..).

Furthermore, getting at the content with pure DOM calls (and provided
"white-space" is set to "pre" as above) makes things somewhat cleaner
-- e.g. newlines are preserved. Although the *first* newline in the
first (text) childNode of the h1 is missing.. Also note that
programmatically setting "someElem.style.whiteSpace = 'pre'" happens
asynchronously, so any code that would try that to get at the original
(well, sort of..) white space has to "wait for it".. :/

I just did these quick tests in case the capabilities of IE7 are what
would put an end to any hope of keeping non-canonicalized XMLLiterals.
There seem to be some possibilities, but perhaps not stable enough?
So if nothing else, it should be noted that requiring normalized space
from RDFa parsers in such a case would require manual processing (DOM
walking + normalizing) in some (at least non-XHTML-aware..) client
implementations.
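
To make "DOM walking + normalizing" a bit more concrete, here is a
rough sketch of what a client might do. The function names are made up
for illustration, attributes are ignored entirely, and collapsing every
run of white space to a single space (plus trimming the ends) is only
my assumption about what "normalized" would end up meaning:

```javascript
// Hypothetical helpers, not taken from any RDFa implementation.

// Escape the characters that must not appear raw in serialized text.
function escapeText(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Walk the children of `node` and serialize them, lower-casing element
// names along the way. Attributes are deliberately left out of this sketch.
function serializeChildren(node) {
  var out = '';
  for (var i = 0; i < node.childNodes.length; i++) {
    var child = node.childNodes[i];
    if (child.nodeType === 3) {          // text node
      out += escapeText(child.nodeValue);
    } else if (child.nodeType === 1) {   // element node
      var name = child.nodeName.toLowerCase();
      out += '<' + name + '>' + serializeChildren(child) + '</' + name + '>';
    }
  }
  return out;
}

// Collapse each run of white space to one space and trim both ends.
function normalizedLiteral(node) {
  return serializeChildren(node).replace(/\s+/g, ' ').replace(/^ | $/g, '');
}
```

Something like normalizedLiteral(document.getElementById('dc-title'))
would be the intended use. It only relies on nodeType, nodeName,
nodeValue and childNodes, so it should not care whether the DOM
upper-cased the element names or not.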

Oh, and of course, IE (including IE7) has the bad habit of
upper-casing any HTML in both innerHTML and nodeName values, so this
has to be accounted for as well.
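
If one only has the innerHTML string to work with, the upper-casing
could be undone with something like the following. This is a crude,
regex-based sketch (the function name is hypothetical) that only
touches element names in start and end tags; it does nothing about
attribute names, or about the quotes IE drops around attribute values,
and it would be fooled by a raw "<" inside an attribute value:

```javascript
// Hypothetical helper: lower-case element names in an innerHTML string.
function lowerCaseTagNames(html) {
  // Match "<" or "</" followed by an element name, and lower-case the name.
  return html.replace(/<(\/?)([A-Za-z][A-Za-z0-9]*)/g, function (match, slash, name) {
    return '<' + slash + name.toLowerCase();
  });
}
```

Text content is left alone, since a "<" in text shows up as "&lt;" in
innerHTML and so never matches.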

FWIW, this is a piece of what I added in Manu's test code
(<http://rdfa.digitalbazaar.com/tests/xmlliteral.html>) while testing:
----
   var title = document.getElementById('dc-title');
   alert(title.innerHTML); // upper-cased "SUP"
   alert(title.firstChild.nextSibling.nodeName); // "SUP" here too
   title.style.whiteSpace = 'pre';
   alert("wait for it..");
   alert(escape(title.firstChild.nodeValue));
----

Finally, if one *really* wanted to, I suppose using XMLHttpRequest to
get the current document (i.e. refetching it) as proper XML is also a
workaround for IE. Not exactly a great solution, but it might work..
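
A minimal sketch of that refetching idea, with a made-up function name.
The third parameter exists only so the function can be tried outside a
browser; in a page one would leave it out. Note the assumption that
responseXML is only populated when the server sends an XML media type,
so the raw responseText is what the callback can rely on:

```javascript
// Hypothetical helper: refetch `url` and pass the unmodified source text
// (and the parsed XML document, if there is one) to `callback`.
function refetchSource(url, callback, Xhr) {
  var xhr = new (Xhr || XMLHttpRequest)();
  xhr.open('GET', url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) { return; }  // not finished yet
    var text = xhr.status === 200 ? xhr.responseText : null;
    callback(text, xhr.responseXML || null);
  };
  xhr.send(null);
}
```

Called as refetchSource(window.location.href, function (text, xml) { .. }),
the text argument would still have the white space of the source
document, which then has to be parsed by some other means.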

As for my personal opinion, I think the ideal would be for XMLLiterals in
RDFa to be given as they are in the source document. But if existing
(client) implementations are to be of concern, including IE (at least
IE7), I can understand if this ideal may have to be abandoned
(specifically for XHTML 1.1 + RDFa). However, achieving
canonicalization (of white space, element names as lower case,
attribute ordering..) may be a bit of a headache anyway (in *at least*
IE7).

Best regards,
Niklas


On 11/6/07, Ivan Herman <ivan@w3.org> wrote:
> I think we have to yield on this issue:-( I have updated pyRdfa to
> canonicalize XML Literals, too...
>
> Test #11 is passed now...
>
> Ivan
>
> Ivan Herman wrote:
> > Ouch, ouch, ouch! That hurts...
> >
> > If your findings are confirmed then indeed we have much less choice than
> > before. I hate that!:-)
> >
> > Ivan
> >
> > P.S. I never liked programming in javascript:-(
> >
> > Manu Sporny wrote:
> >> Ivan Herman wrote:
> >>>> In other words, the following XHTML (Test Case #11):
> >>>>
> >>>> <div about="">
> >>>>    Author: <span property="dc:creator">Albert Einstein</span>
> >>>>    <h2 property="dc:title">
> >>>>         E = mc<sup>2</sup>: The Most Urgent Problem of Our Time
> >>>>    </h2>
> >>>> </div>
> >>>>
> >>>> Should produce the following triples:
> >>>>
> >>>> @prefix _5:
> >>>> <http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/0011.>.
> >>>> @prefix dc: <http://purl.org/dc/elements/1.1/>.
> >>>>
> >>>> _5:xhtml dc:creator "Albert Einstein";
> >>>>   dc:title """E = mc<sup>2</sup>: The Most Urgent Problem of Our Time"""
> >>>>           ^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>.
> >>>>
> >>>>> So I believe we should either refer to these two ideas, or even import
> >>>>> the prose as is, if we have to.
> >>> Wait, that is a different issue. It is still undecided whether the
> >>> canonicalization should apply on XML Literals. Mark's proposal is to use
> >>> XPath for the definition of canonicalization, not (yet) on what exactly
> >>> it applies to!
> >> If only we had a choice, Ivan :)
> >>
> >> I took some time last night to do some research on how XMLLiterals could
> >> be implemented in Javascript. Here are the results for RDFa Test Case #11:
> >>
> >> http://rdfa.digitalbazaar.com/tests/xmlliteral.html
> >>
> >> If you use Firefox's DOM and Javascript implementation to get the
> >> contents of the H2 element, here are the results on the node:
> >>
> >> outerHTML: 'undefined'
> >> innerHTML:
> >> '\n        E = mc<sup>2</sup>: The Most Urgent Problem of Our Time\n
> >>  ' (there are extra spaces after the last \n)
> >> innerText: 'undefined'
> >>
> >> If you use Internet Explorer 7's DOM and Javascript implementation to
> >> get the contents of the "E = mc^2: The Most Urgent Problem of Our Time",
> >> here are the results on the node:
> >>
> >> outerHTML: '\r\n<H2 id=dc-title property="dc:title">E = mc<SUP>2</SUP>:
> >> The Most Urgent Problem of Our Time </H2>'
> >> innerHTML: 'E = mc<SUP>2</SUP>: The Most Urgent Problem of Our Time '
> >> innerText: 'E = mc2: The Most Urgent Problem of Our Time '
> >>
> >> In short - Firefox's implementation allows you to retrieve the original
> >> whitespace and line breaks using Javascript. IE7 does not.
> >>
> >> IE7 normalizes all of the whitespace before inserting it into the DOM,
> >> which means that Javascript does not have access to the original text in
> >> the XHTML file.
> >>
> >> This means that the same canonicalization rules should be used for
> >> regular strings and XMLLiterals for RDFa-in-XHTML.
> >>
> >> Somebody please correct me if they have a different understanding of the
> >> IE7 DOM.
> >>
> >> -- manu
> >>
> >
>
> --
>
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> PGP Key: http://www.ivan-herman.net/pgpkey.html
> FOAF: http://www.ivan-herman.net/foaf.rdf
>
>
Received on Tuesday, 6 November 2007 15:03:56 UTC