- From: Dan Connolly <connolly@w3.org>
- Date: Thu, 12 Jul 2001 00:58:45 -0500
- To: Ron Daniel <rdaniel@interwoven.com>
- CC: Dave Beckett <dave.beckett@bristol.ac.uk>, RDF Core <w3c-rdfcore-wg@w3.org>
Ron Daniel wrote:
>
> Dan Connolly raised this issue, stating it as:
>
> A statement with a parseType of 'Literal' has as its object
> an XML structure, not a simple string. For example, the first
> character of the literal <foo>bar</foo> is not '<'.
>
> This is an interesting suggestion. It raises several questions.
> I'll confine myself to one (at least for now)...
>
> 1) What evidence is there that this was the intent of the
> M&S 1.0 specification?
Er... this looks like pretty direct evidence:
[[[
If the content of E contains no XML markup or if parseType="Literal"
is specified in the start tag of E then v is the content of E
(a literal).
]]]
-- Resource Description Framework (RDF) Model and Syntax Specification
http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/
Wed, 24 Feb 1999 14:45:07 GMT
E refers to an XML element in that bit of the spec.
The content of an XML element isn't (in general) a string; it's
a sequence of data characters and/or elements, PIs, and comments:
[[[
Content of Elements
[43] content ::= CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*
]]]
-- Extensible Markup Language (XML) 1.0 (Second Edition)
http://www.w3.org/TR/REC-xml#dt-content
Thu, 05 Oct 2000 12:19:51 GMT
Let's take this example from the RDF spec:
-------------
In the following example, the value of the Title property is a literal
containing some MATHML markup.
<rdf:Description
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/metadata/dublin_core#"
    xmlns="http://www.w3.org/TR/REC-mathml"
    rdf:about="http://mycorp.com/papers/NobelPaper1">
  <dc:Title rdf:parseType="Literal">
    Ramifications of
    <apply>
      <power/>
      <apply>
        <plus/>
        <ci>a</ci>
        <ci>b</ci>
      </apply>
      <cn>2</cn>
    </apply>
    to World Peace
  </dc:Title>
  <dc:Creator>David Hume</dc:Creator>
</rdf:Description>
-------------
What do you suggest is the value of the dc:Title (sic) property?
I suggest it's a structured thing, a la the XML infoset
or XPath data model; it's got some characters, a
mathml:apply element, and some more characters. No?
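To make the "structured thing" reading concrete, here is a minimal sketch, assuming Python's standard xml.dom.minidom as a stand-in for an infoset-style data model. The helper name describe_content and the compacted copy of the example (whitespace trimmed, dc:Creator dropped) are illustrative assumptions, not anything defined by the RDF or XML specs.

```python
from xml.dom import minidom

# Compacted copy of the M&S example above (assumption: whitespace trimmed).
DOC = """<rdf:Description
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/metadata/dublin_core#"
    xmlns="http://www.w3.org/TR/REC-mathml"
    rdf:about="http://mycorp.com/papers/NobelPaper1">
  <dc:Title rdf:parseType="Literal">Ramifications of <apply><power/><apply><plus/><ci>a</ci><ci>b</ci></apply><cn>2</cn></apply> to World Peace</dc:Title>
</rdf:Description>"""

DC = "http://purl.org/metadata/dublin_core#"


def describe_content(xml_text):
    """List the top-level items in the content of the dc:Title element."""
    dom = minidom.parseString(xml_text)
    title = dom.getElementsByTagNameNS(DC, "Title")[0]
    title.normalize()  # merge any adjacent text nodes
    items = []
    for node in title.childNodes:
        if node.nodeType == node.TEXT_NODE:
            items.append(("chars", node.data))
        elif node.nodeType == node.ELEMENT_NODE:
            # The namespace name travels with the element, not with its spelling.
            items.append(("element", node.namespaceURI, node.localName))
    return items


if __name__ == "__main__":
    for item in describe_content(DOC):
        print(item)
```

Under this reading the content is three items (characters, an element in the MathML namespace, more characters), and no item begins with the character '<'.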
> Searching through the archives of the w3c-rdf-syntax-wg
> list for 'infoset' turns up VERY few messages.
The infoset hardly existed then.
> Re-reading those messages, IMHO, supports a very different
> interpretation of the WG intent - that parseType="Literal" was a
> stop-gap measure to let us deal with embedded XML content through
> the simple expedient of turning off RDF parsing of that content.
> In fact, the phrase "generates no tuples" is used in the emails
> above in a manner that seems to indicate that the WG wanted to
> completely ignore the content and markup in the Literal, and treat it
> as a simple string. Later applications might do something with the
> markup.
I could live with encoding the structured thing as a string, provided
(1) namespace info isn't lost; in the example above,
the resulting string must capture the namespace
name associated with <apply/> etc.
(2) it remains distinguishable from a string that
happens to have the same characters.
i.e. I'm OK with "delaying" the parsing, so long as we don't
lose information.
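The two conditions above can be sketched as follows. This is only an illustration, assuming Python's standard xml.etree.ElementTree; the XMLLiteral wrapper, the literal_content helper, and the sample SRC document are hypothetical names invented here, not part of M&S 1.0 or any RDF toolkit.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class XMLLiteral:
    """Hypothetical marker type: a string known to be serialized XML content."""
    text: str


def literal_content(prop):
    """Serialize the content of a property element as a tagged string."""
    parts = [prop.text or ""]
    for child in prop:
        # ET.tostring re-declares every namespace the subtree uses on the
        # serialized element and appends the child's trailing text, so the
        # namespace name survives even if the original prefix does not.
        parts.append(ET.tostring(child, encoding="unicode"))
    return XMLLiteral("".join(parts))


# Hypothetical input: the MathML namespace is declared on the parent only.
SRC = '<prop xmlns:m="http://www.w3.org/TR/REC-mathml">x <m:ci>a</m:ci> y</prop>'

if __name__ == "__main__":
    lit = literal_content(ET.fromstring(SRC))
    print(lit.text)          # namespace declaration is carried in the string
    print(lit == lit.text)   # the tagged value never equals a plain string
```

Condition (1) is met because the namespace declaration is written into the serialized string; condition (2) is met because the value carries a type distinct from str, so parsing can be delayed without confusing it with ordinary character data.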
> If that is the case, then the clarification document can't say
> that M&S 1.0 requires the generation of tuples for the infoset of
> the embedded content. That seems the opposite of the intent.
I agree that M&S 1.0 doesn't give the URIs of the relevant
properties, so it would be more than clarification to specify them.
But this seems like another bug in the spec: "anybody can
say anything about anything; but if you want to give the language
of a Literal or model the structure of XML content, don't use
RDF properties to do it!" I don't think that was the intent.
> Dan's suggestion could be within the scope of a 2.0 revisitation
> of M&S, but clearly seems to exceed our chartered tasks.
>
> (At that time, there may be an approach we can take which reconciles
> the views. We might say that in 1.0, a Literal is just a String,
> but that in 2.0, we have some extra info in the model so that
> we not only have the string, we have a URI for it. (We should also
> agree on just what those URIs are). That URI can be used as the
> subject for all sorts of statements. We could use it in statements
> which have a predicate called something like 'rdf2:hasInfoset'. The
> rest is left as an exercise for the future.)
>
> But for now, I think that as far as RDF 1.0 processors are concerned,
> Literals are just strings, and the first character of a string
> like "<foo>bar</foo>" would be '<'.
As long as we somehow distinguish
<prop parseType="Literal"><foo>bar</foo></prop>
from
<prop><foo>bar</foo></prop>
--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Thursday, 12 July 2001 01:58:56 UTC