W3C home > Mailing lists > Public > w3c-rdfcore-wg@w3.org > July 2003

Re: first pass parseType="Literal" text for primer

From: Brian McBride <bwm@hplb.hpl.hp.com>
Date: 29 Jul 2003 10:45:35 +0100
To: Graham Klyne <gk@ninebynine.org>
Cc: Martin Duerst <duerst@w3.org>, rdf core <w3c-rdfcore-wg@w3.org>, i18n <w3c-i18n-ig@w3.org>
Message-Id: <1059471935.2143.30.camel@dhcp-91-3.hpl.hp.com>

On Tue, 2003-07-29 at 00:35, Graham Klyne wrote:

[...]

> >Their representations are different. But why do their denotations
> >have to be different?

[...]

> >But note that we are not speaking about changing the interpretation
> >of something by changing from plain literal to XML literal, we are
> >speaking about two different representations ((1) and (2)) that
> >could/should denote the same string of characters.
> 
> I thought about that, but within the current scheme couldn't see any way to 
> make it work, for the reason noted above.
> 
> I don't think I've anything more constructive to add at this stage, and 
> should back off.  Maybe someone else can see a way past the block that I 
> perceive?

I haven't followed this thread in detail, so may be off base, but in
this last message I see something that may be relevant.

Martin's use of the terms "interpretation" and denotation may be
different to ours, as suggested elsewhere by Pat.  Perhaps we have not
made clear, what I'll loosely call the substition rule.

Martin, the issue is one of round tripping.  Given the following input:

<rdf:Description>
  <eg:prop rdf:parseType="Literal"><em></eg:prop>
</rdf:Description>

If the xml literal *denotes* the character string "<em>", and in,

<rdf:Description>
  <eg:prop>&lt;em&gt;</eg:prop>
</rdf:Description>

the plain literal also denotes the character string "<em>", then it is
legal for an RDF processor to substitute one for another, e.g. an RDF
copy program could read the first of these and write the second because
their semantics, according to RDF, are exactly the same.

I don't think anyone wants that.

Martin: does the "substitution rule" explain why the denotations must be
different?

Now, Martin may have spotted something we have missed.  I suggest a way
to describe this clearly.  We have three concepts, concrete syntax
(rdf/xml), abstract syntax (closest rep is n-triples) and denotation. 
This can be described in a three column table, concrete syntax, abstract
syntax and denotation.

I think what we have at the moment is:

Concrete Syntax                 | Abstract Syntax       | Denotation
-----------------------------------------------------------------
<eg:prop>a</eg:prop>            | "a"                   | "a"
<eg:prop>&lt;em&gt;</eg:prop>   | "<em>"                | "<em>"
<eg:prop pt="L"><em></eg:prop>  | "<em>^^rdf:XMLLiteral | C("<em>")
<eg:prop pt="L">&amp;</eg:prop> | "&"^^rdf:XMLLiteral   | C("&")

I've abbreviated rdf:parseType="Literal" to pt="L" to fit on one line. 
C(x) is cannonicalization of x, encoded as a UTF8 octet sequence, e.g.
C("&") is the octet sequence corresponding to "&amp;".
The relationship between the abstract syntax and the denotation must be
functional - i.e. there is only one denotation (per interpretation) for
any given fragment of abstract syntax.

The point I made above, is that for any two rows with the same value in
the denotation column, it doesn't matter what form of concrete syntax
one uses - they have equivalent meaning and one can be freely
substituted for the other.

Brian
Received on Tuesday, 29 July 2003 05:46:29 EDT

This archive was generated by hypermail pre-2.1.9 : Wednesday, 3 September 2003 09:58:51 EDT