RE: first pass parseType="Literal" text for primer

From: Graham Klyne <gk@ninebynine.org>
Date: Mon, 28 Jul 2003 13:13:01 +0100
Message-Id: <5.1.0.14.2.20030728125613.02525eb8@127.0.0.1>
To: <Patrick.Stickler@nokia.com>, <w3c-rdfcore-wg@w3.org>
Cc: Dave Beckett <dave.beckett@bristol.ac.uk>

At 14:31 28/07/03 +0300, Patrick.Stickler@nokia.com wrote:



> > 2. <title rdf:parseType='Literal'>Why the &lt;FONT&gt; Tag is
> > Bad</title>
> >
> > I take the value of this 'title' property to be:
> >
> >    "Why the &lt;FONT&gt; Tag is Bad"^^rdf:XMLLiteral
>
>Eh? Really?
>
>Don't you mean
>
>    "Why the <FONT> Tag is Bad"^^rdf:XMLLiteral
>
>Surely the entities are resolved the same as for any
>literal.

Not by my reading of Concepts:

[[
The lexical space
     is the set of all strings which:

         * are well-balanced, self-contained XML data [XML];
         * correspond to exclusive Canonical XML (with comments, with empty
           InclusiveNamespaces PrefixList) [XML-XC14N];
         * when embedded between an arbitrary XML start tag and an end tag
           form a document conforming to XML Namespaces [XML-NS]
]]
-- 
http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-concepts-20030117/#section-XMLLiteral

which would require the '<' and '>' here to be escaped as &lt; and &gt; in 
the lexical form.  When the XML literal is eventually interpreted, you'd 
get the bare '<' and '>' characters back.
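
To illustrate that last point, here is a minimal Python sketch (not taken from either spec). It embeds the lexical form between an arbitrary start tag and end tag, as the third condition quoted above describes, and parses the result. The element name "wrapper" and the function name are assumptions made for illustration, and the sketch only handles literals whose content is plain text with no child elements.

```python
import xml.etree.ElementTree as ET


def interpret_text_only_literal(lexical_form):
    """Hypothetical helper: recover the character data of a text-only XML literal.

    Assumes the lexical form contains no child elements, only escaped text.
    """
    # Embed the lexical form between an arbitrary start tag and end tag.
    doc = ET.fromstring("<wrapper>" + lexical_form + "</wrapper>")
    # The parser resolves &lt; and &gt; back to the bare characters.
    return doc.text or ""


print(interpret_text_only_literal("Why the &lt;FONT&gt; Tag is Bad"))
```

The escaped form is what sits in the lexical space; the bare characters only reappear once something parses it as XML.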

[...removes sleeping cat from copy of syntax spec...]

Looking at the syntax spec, struggling a bit...

[Dave:  Should section "6.1.2 Element Event" be "Start Element Event", and 
should there be a description of what the "string-value" accessor 
returns?  Maybe not, but I note section 6.1 says that all events have a 
string-value accessor.]

Ah, got it:

In the syntax spec, sections 7.2.17 and 7.2.33 together say that the 
literal string value is the exclusive XML canonicalization of the 
content, which I think means that the escaping of '&', '<' and '>' has to 
be re-inserted:

[[
The string used as the lexical form of the XML Literal is the Exclusive XML 
Canonicalization [XML-XC14N]) with comments and with empty 
InclusiveNamespaces PrefixList of the literal text l, i.e. the entire 
element content of this property element.
]]
-- 
http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#parseTypeLiteralPropertyElt
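
A rough Python sketch of this reading, under stated assumptions: it covers only property elements whose content is plain text, and it hand-codes the text-node escaping rule of Canonical XML ('&', '<', '>' and carriage return) rather than calling a full Exclusive Canonicalization implementation. The function names are hypothetical, and the rdf namespace declaration is added to the sample only so that it parses as a standalone fragment.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def c14n_text(text):
    """Escape character data the way Canonical XML writes text nodes."""
    return (text.replace("&", "&amp;")   # must come first
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\r", "&#xD;"))


def lexical_form(property_element_xml):
    """Hypothetical helper: lexical form for a text-only parseType="Literal" element."""
    elem = ET.fromstring(property_element_xml)
    if len(elem):
        # Child elements would need real canonicalization; out of scope here.
        raise ValueError("sketch handles text-only content")
    # elem.text is what the parser reports, with entities already resolved.
    return c14n_text(elem.text or "")


first = ('<title xmlns:rdf="%s" rdf:parseType="Literal">'
         'Why the &lt;FONT&gt; Tag is Bad</title>' % RDF_NS)
second = ('<title xmlns:rdf="%s" rdf:parseType="Literal">'
          'Why the &amp;lt;FONT&amp;gt; Tag is Bad</title>' % RDF_NS)

print(lexical_form(first))
print(lexical_form(second))
```

On this reading the parser resolves the entities once, and canonicalization puts one level of escaping back, so the lexical form of a text-only literal matches what was written in the RDF/XML.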

[Dave:  is it worth adding a note to clarify this point?]



>If you wanted/needed
>
>     "Why the &lt;FONT&gt; Tag is Bad"^^rdf:XMLLiteral
>
>then you'd have to say
>
>   <title rdf:parseType='Literal'>Why the &amp;lt;FONT&amp;gt; Tag is 
> Bad</title>
>
>No?
>
>If this is not the case, then I'm really missing something
>major here and am very alarmed!

I think that may be workable, but it's not how I read the documents we're 
working on.

(Note that this formulation of the abstract syntax is for definitional 
purposes, and does not of itself require that an application do this.  You 
may have some other way of storing an XML literal which is fine as long as 
you get the same final answers.)

#g


-------------------
Graham Klyne
<GK@NineByNine.org>
PGP: 0FAA 69FF C083 000B A2E9  A131 01B9 1C7A DBCA CB5E
Received on Monday, 28 July 2003 08:17:05 EDT
