Re: first pass parseType="Literal" text for primer

From: Dave Beckett <dave.beckett@bristol.ac.uk>
Date: Mon, 28 Jul 2003 14:09:14 +0100
To: Graham Klyne <gk@ninebynine.org>
Cc: "Patrick.Stickler" <Patrick.Stickler@nokia.com>, w3c-rdfcore-wg <w3c-rdfcore-wg@w3.org>
Message-Id: <20030728140914.75764e8f.dave.beckett@bristol.ac.uk>

On Mon, 28 Jul 2003 13:13:01 +0100
Graham Klyne <gk@ninebynine.org> wrote:

> At 14:31 28/07/03 +0300, Patrick.Stickler@nokia.com wrote:
> 
> 
> 
> > > 2. <title rdf:parseType='Literal'>Why the &lt;FONT&gt; Tag is
> > > Bad</title>
> > >
> > > I take the value of this 'title' property to be:
> > >
> > >    "Why the &lt;FONT&gt; Tag is Bad"^^rdf:XMLLiteral

This is correct.

> >
> >Eh? Really?
> >
> >Don't you mean
> >
> >    "Why the <FONT> Tag is Bad"^^rdf:XMLLiteral

This is wrong.

<snip/>

I've quoted this before:

[[
Text Nodes- the string value, except all ampersands are replaced by
&amp;, all open angle brackets (<) are replaced by &lt;, all closing
angle brackets (>) are replaced by &gt;, and all #xD characters are
replaced by &#xD;.
]] -- http://www.w3.org/TR/2001/REC-xml-c14n-20010315#ProcessingModel
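As an illustration, the text-node rule quoted above could be sketched in Python roughly as follows. The function name c14n_escape_text is hypothetical, and this covers only the text-node replacements from the quoted passage, not the rest of canonicalization.

```python
def c14n_escape_text(value: str) -> str:
    """Apply the text-node replacements from the quoted XML C14N rule."""
    # Ampersands must be replaced first; otherwise the '&' introduced by
    # the later replacements would itself be escaped again.
    return (
        value.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace("\r", "&#xD;")
    )


# The string value an XML parser reports for the content of example 2:
string_value = "Why the <FONT> Tag is Bad"
lexical_form = c14n_escape_text(string_value)
print(lexical_form)
```

Applied to the parsed string value of example 2, this puts the escapes back, which matches the first form quoted at the top of this message.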


> [...removes sleeping cat from copy of syntax spec...]
> 
> Looking at the syntax spec, struggling a bit...
> 
> [Dave:  Should section "6.1.2 Element Event" be "Start Element Event",

Yes, it should be, I will rename it.

> and should there be a description of what the "string-value" accessor 
> returns?  Maybe not, but I note section 6.1 says that all events have a 
> string-value accessor.]

Hmm.  string-value is never called on an element or root event.  The
canonicalization algorithm from XC14N is separate.  I should reword 6.1.



> Ah, got it:
> 
> In the syntax spec, we have sections 7.2.17 and 7.2.33 which together claim 
> the literal string value is the exclusive XML canonicalization of the 
> content, which I think means that the escaping of '&', '<' and '>' has to 
> be re-inserted:

Yes.

> [[
> The string used as the lexical form of the XML Literal is the Exclusive XML 
> Canonicalization [XML-XC14N] with comments and with empty 
> InclusiveNamespaces PrefixList of the literal text l, i.e. the entire 
> element content of this property element.
> ]]
> -- 
> http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#parseTypeLiteralPropertyElt
> 
> [Dave:  is it worth adding a note to clarify this point?]

I might move some words around in parseTypeLiteralPropertyElt
so that when "literal l" is discussed, the string value is clearly shown
to be that after the canonicalization.
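
To make the parse-then-canonicalize order concrete, here is a rough Python sketch for example 2. It assumes the property element contains only text (no child elements, comments or namespace handling), so it is nowhere near a full Exclusive XML Canonicalization; the names escape_text and lexical_form are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def escape_text(value):
    # Text-node escaping as in XML C14N; '&' is handled first.
    return (
        value.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace("\r", "&#xD;")
    )


def lexical_form(property_element_xml):
    """Lexical form for a text-only parseType="Literal" property element.

    Assumption: the element content is a single text node.
    """
    elem = ET.fromstring(property_element_xml)
    # The parser has already decoded &lt; and &gt; at this point.
    string_value = elem.text or ""
    # Canonicalization re-inserts the escapes.
    return escape_text(string_value)


source = (
    '<title xmlns:rdf="%s" rdf:parseType="Literal">'
    "Why the &lt;FONT&gt; Tag is Bad</title>" % RDF_NS
)
parsed_text = ET.fromstring(source).text
print(parsed_text)           # what the XML parser sees
print(lexical_form(source))  # what ends up as the literal's string
```

The first line printed is the decoded string value; the second is the string after the escaping has been re-inserted, i.e. the value of "literal l" after canonicalization.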

> >If you wanted/needed
> >
> >     "Why the &lt;FONT&gt; Tag is Bad"^^rdf:XMLLiteral
> >
> >then you'd have to say
> >
> >   <title rdf:parseType='Literal'>Why the &amp;lt;FONT&amp;gt; Tag is 
> > Bad</title>
> >
> >No?

Yes.  That's the so-called double escaping and will give you an XML
literal with the characters 'W', 'h', ... '&', 'l', 't', ';', 'F', 'O', ...
which is what the first string above presents.  This is because
XML escapes such as &amp; do not apply inside N-Triples strings
(or RDF's rdf:XMLLiterals in the graph).

> >
> >If this is not the case, then I'm really missing something
> >major here and am very alarmed!
> 
> I think that may be workable, but it's not how I read the documents we're 
> working on.
> 
> (Note that this formulation of the abstract syntax is for definitional 
> purposes, and does not of itself require that an application do this.  You 
> may have some other way of storing an XML literal which is fine as long as 
> you get the same final answers.)

Indeed.  In the same way as N-Triples are not required.

Dave
Received on Monday, 28 July 2003 09:11:58 EDT
