- From: Dan Connolly <connolly@w3.org>
- Date: Thu, 12 Jul 2001 00:58:45 -0500
- To: Ron Daniel <rdaniel@interwoven.com>
- CC: Dave Beckett <dave.beckett@bristol.ac.uk>, RDF Core <w3c-rdfcore-wg@w3.org>
Ron Daniel wrote:
>
> Dan Connolly raised this issue, stating it as:
>
>   A statement with a parseType of 'Literal' has as its object
>   an XML structure, not a simple string. For example, the first
>   character of the literal <foo>bar</foo> is not '<'.
>
> This is an interesting suggestion. It raises several questions.
> I'll confine myself to one (at least for now)...
>
> 1) What evidence is there that this was the intent of the
>    M&S 1.0 specification?

Er... this looks like pretty direct evidence:

[[[
If the content of E contains no XML markup or if parseType="Literal" is
specified in the start tag of E then v is the content of E (a literal).
]]]
-- Resource Description Framework (RDF) Model and Syntax Specification
http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/
Wed, 24 Feb 1999 14:45:07 GMT

E refers to an XML element in that bit of the spec.

The content of an XML element isn't (in general) a string;
it's a sequence of data characters and/or elements,
PIs, and comments:

[[[
Content of Elements
[43] content ::= CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*
]]]
-- Extensible Markup Language (XML) 1.0 (Second Edition)
http://www.w3.org/TR/REC-xml#dt-content
Thu, 05 Oct 2000 12:19:51 GMT

Let's take this example from the RDF spec:

-------------
In the following example, the value of the Title property is a literal
containing some MATHML markup.

    <rdf:Description
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/metadata/dublin_core#"
      xmlns="http://www.w3.org/TR/REC-mathml"
      rdf:about="http://mycorp.com/papers/NobelPaper1">

      <dc:Title rdf:parseType="Literal">
        Ramifications of
          <apply>
            <power/>
            <apply>
              <plus/>
              <ci>a</ci>
              <ci>b</ci>
            </apply>
            <cn>2</cn>
          </apply>
        to World Peace
      </dc:Title>
      <dc:Creator>David Hume</dc:Creator>
    </rdf:Description>
-------------

what do you suggest is the value of the dc:Title (sic)
property?

I suggest it's a structured thing, ala the XML infoset
or XPath data model; it's got some characters,
a mathml:apply element, and some more characters.

No?

> Searching through the archives of the w3c-rdf-syntax-wg
> list for 'infoset' turns up VERY few messages.

The infoset hardly existed then.

> Re-reading those messages, IMHO, supports a very different
> interpretation of the WG intent - that parseType="Literal" was a
> stop-gap measure to let us deal with embedded XML content through
> the simple expedient of turning off RDF parsing of that content.
> In fact, the phrase "generates no tuples" is used in the emails
> above in a manner that seems to indicate that the WG wanted to
> completely ignore the content and markup in the Literal, and treat it
> as a simple string. Later applications might do something with the
> markup.

I could live with encoding the structured thing as a string
provided

  (1) namespace info isn't lost; in the example above,
      the resulting string must capture the namespace name
      associated with <apply/> etc.

  (2) it remains distinguishable from a string that
      happens to have the same characters.

i.e. I'm OK with "delaying" the parsing, so long as we don't
lose information.

> If that is the case, then the clarification document can't say
> that M&S 1.0 requires the generation of tuples for the infoset of
> the embedded content. That seems the opposite of the intent.

I agree that M&S 1.0 doesn't give the URIs of the relevant
properties, so it would be more than clarification to specify them.

But this seems like another bug in the spec: "anybody can say
anything about anything; but if you want to give the language
of a Literal or model the structure of XML content, don't
use RDF properties to do it!" I don't think that was the intent.

> Dan's suggestion could be within the scope of a 2.0 revisitation
> of M&S, but clearly seems to exceed our chartered tasks.
>
> (At that time, there may be an approach we can take which reconciles
> the views. We might say that in 1.0, a Literal is just a String,
> but that in 2.0, we have some extra info in the model so that
> we not only have the string, we have a URI for it. (We should also
> agree on just what those URIs are). That URI can be used as the
> subject for all sorts of statements. We could use it in statements
> which have a predicate called something like 'rdf2:hasInfoset'. The
> rest is left as an exercise for the future.)
>
> But for now, I think that as far as RDF 1.0 processors are concerned,
> Literals are just strings, and the first character of a string
> like "<foo>bar</foo>" would be '<'.

As long as we somehow distinguish

    <prop parseType="Literal"><foo>bar</foo></prop>

from

    <prop><foo>bar</foo></prop>

--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
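
Illustrative sketch (not part of the original message): the two readings discussed above can be tried out with Python's standard `xml.etree.ElementTree`. The `Literal` class, its `is_xml` flag, and the helpers `content_items` and `property_value` are hypothetical names chosen for this example; neither M&S 1.0 nor the thread defines them. The sketch assumes the "delayed parsing" compromise: the structured content is stored as a string, but the string keeps the namespace name and is flagged so it stays distinguishable from a plain string with the same characters.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from xml.sax.saxutils import escape

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/metadata/dublin_core#"
MATHML = "http://www.w3.org/TR/REC-mathml"

# The example from the RDF spec, with whitespace inside dc:Title trimmed.
RDF_XML = """<rdf:Description
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/metadata/dublin_core#"
  xmlns="http://www.w3.org/TR/REC-mathml"
  rdf:about="http://mycorp.com/papers/NobelPaper1">
  <dc:Title rdf:parseType="Literal">Ramifications of <apply><power/><apply><plus/><ci>a</ci><ci>b</ci></apply><cn>2</cn></apply> to World Peace</dc:Title>
  <dc:Creator>David Hume</dc:Creator>
</rdf:Description>"""


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal value: a string plus a flag recording its origin."""
    text: str
    is_xml: bool = False


def content_items(elem):
    """Return an element's content as a list of text runs and child elements."""
    items = []
    if elem.text:
        items.append(elem.text)
    for child in elem:
        items.append(child)
        if child.tail:
            items.append(child.tail)
    return items


def property_value(elem):
    """Map a property element to a Literal (assumption: only two cases handled)."""
    if elem.get(f"{{{RDF}}}parseType") == "Literal":
        # ET.tostring() emits each child together with its tail text and
        # declares the child's namespace, so the namespace name is kept.
        parts = [escape(elem.text or "")]
        parts += [ET.tostring(child, encoding="unicode") for child in elem]
        return Literal("".join(parts), is_xml=True)
    if len(elem) == 0:
        return Literal(elem.text or "")
    raise NotImplementedError("nested RDF syntax is outside this sketch")


root = ET.fromstring(RDF_XML)
title = root.find(f"{{{DC}}}Title")
creator = root.find(f"{{{DC}}}Creator")

# Structured reading: some characters, one element, some more characters.
title_items = content_items(title)

# Delayed-parsing reading: a flagged string that still names the namespace.
title_literal = property_value(title)
creator_literal = property_value(creator)

# The <foo>bar</foo> case: the first content item is an element, not '<'.
prop = ET.fromstring(
    f'<prop xmlns:rdf="{RDF}" rdf:parseType="Literal"><foo>bar</foo></prop>'
)
first_item = content_items(prop)[0]
```

Here `title_items` holds a text run, an element whose tag carries the MathML namespace name, and a second text run, which matches the "structured thing" reading. `title_literal` is the string encoding; comparing `Literal("<foo>bar</foo>", is_xml=True)` with `Literal("<foo>bar</foo>")` gives unequal values, which is one possible way to keep the two `<prop>` forms apart.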
Received on Thursday, 12 July 2001 01:58:56 UTC