- From: Art Barstow <barstow@w3.org>
- Date: Tue, 16 Oct 2001 09:21:32 -0400
- To: Ron Daniel <rdaniel@interwoven.com>
- Cc: w3c-rdfcore-wg@w3.org
Ron - thanks for providing these notes.

I like the general approach that is outlined and would favor representing a Literal as a bNode. I prefer providing as much info as possible (to assist round-tripping) so I favor generating triples for such things as the parseType itself.

WRT BC, would it help to define an additional parseType value (e.g. "rdf:Literal")? When parsers built on M&S 1.0 encounter this new parseType they would simply treat the property as if it had parseType set to "Literal". New parsers - those that support the new parseType - would do-the-new-thing when they encounter the new parseType and could either do-the-new-thing or do-the-old-thing when the parseType is "Literal" (new parsers could support a switch).

Anyhow, was the owner of rdfms-literal-is-xml-structure present? If so, was he in favor of this approach? Should we expect a more formal proposal based on this outline from someone (or the group that met after the meeting)?

Art
---

On Fri, Oct 12, 2001 at 01:53:50PM -0700, Ron Daniel wrote:
>
> A few of us remained on the call after the official close
> of the 2001-10-12 teleconference. We had some more discussion
> of parseType="Literal" and other non-controversial topics.
> I thought the rest of the group might be interested in one
> suggestion that was made for handling literals (all
> literals, not just XML literals). I do not recall any
> objections being made to this proposal. It was:
>
> 1) With respect to literals, let's define a solution based on where
>    we want to be with RDF 2, then treat RDF 1 M&S as a special case,
>    or simple mapping, down from that.
>
> 2) In RDF 2, let each occurrence of a literal be a prince/b/whatever_node,
>    identified in whatever way we decide to handle the things we used to
>    call anonymous resources.
>
> 3) That node will have an rdf:value property whose value is the
>    literal's character string.
>    (A corollary might be that rdf:value properties are the only ones
>    that actually have character strings as values. That would be
>    conceptually cleanest. However, it might be a pain in practice as
>    things like xml:lang and some other things might be best served
>    with immediate string values. TBD.)
>
> 4) The xml:lang is another property of that node.
>
> 5) If the literal had rdf:parseType="Literal", this will be reflected
>    in the model by giving that node an rdf:type property with an
>    appropriate value, perhaps rdf2:xmlLiteral.
>
> 6) The namespace bindings in effect for XML literals will appear
>    as another property (or set of properties) of that node.
>
> 7) This mechanism will allow 'Literals as subjects' in RDF 2.
>
> 8) Literals as subjects are not part of the RDF 1 M&S.
>
> 9) The 2.0 model can be rendered in 1.0 syntax by:
>    a) Rendering the xml:lang property as an attribute on an element at
>       an appropriate scoping level
>    b) Rendering the rdf:type property whose value is "rdf2:xmlLiteral" as
>       an rdf:parseType attribute with the value "Literal".
>    c) Rendering the xmlns properties as attributes at the appropriate
>       scoping level (which probably means 'as high as possible').
>    d) Not rendering any other properties of the literal.
>       (This means that a 2.0 model cannot be round-tripped through the
>       1.0 syntax. That is OK. A 1.0 syntax will still be
>       round-trippable through the 2.0 model.)
>
> 10) We still have the question of how to express the language and parse
>     type in the 1.0 model (i.e. n-triples). We have at least the following
>     choices:
>     a) Leave n-triples with three fields. Literals can only appear in
>        the third, object, field. Literals follow some grammar like:
>          Literal := QUOTE literal_string (DELIM1 lang_string)?
>                     (DELIM2 xmlns_string)? UNQUOTE
>        and we argue for awhile over the characters we actually use for
>        the terminals QUOTE, DELIM1, DELIM2, and UNQUOTE.
>     b) Let statements in an n-triples document which have literal values
>        contain more than 3 fields (which, to me, seems no different
>        than (a) since we still have to argue over how things will be
>        delimited).
>     c) Say that the 1.0 model was never defined clearly, and just start with
>        the 2.0 model, letting the 1.0 syntax be the thing that requires
>        various restrictions. (In other words, the n-triples representation
>        would use the p/b/anon_nodes to carry xml:lang, rdf:type, and
>        namespace properties as separate statements.)
>     d) something else?
>
> Currently, I'd be OK with 10(c).
>
> As an example, here's an XML document with embedded RDF, followed
> by a possible n-triples representation of the RDF portion.
>
> <?xml version="1.0" encoding="ISO-8859-1"?>
> <m:article xmlns="the XHTML namespace URI"
>            xmlns:rdf="the RDF 1.0 URI"
>            xmlns:dc="the Dublin Core namespace URI"
>            xmlns:prism="the PRISM namespace URI"
>            xmlns:m="a magazine article message namespace"
>            xml:lang="en-US">
>  <rdf:RDF>
>   <rdf:Description rdf:about="">
>    <dc:title rdf:parseType="Literal"><i>CRN</i> Interview: Ellen Hancock, Exodus Communications</dc:title>
>    <dc:subject rdf:resource="http://example.org/subject_codes/networks"/>
>    <prism:releaseTime>2001-10-12</prism:releaseTime>
>   </rdf:Description>
>  </rdf:RDF>
>  <body>
>   <m:headline>Interview: Ellen Hancock, Exodus Communications</m:headline>
>   <p>If this were a real story, there would be lots of stuff here.</p>
>   <p>Some of that stuff would include pithy quotes from Ms. Hancock,
>   such as <quote prism:speaker="Ellen Hancock">Like Mark Twain said,
>   <quote prism:speaker="Samuel Clemens">It's better to keep one's
>   mouth shut and appear a fool, than to open it and remove all doubt</quote>.
>   Too bad that Ron Daniel guy doesn't follow that advice</quote>.</p>
>   <p>But it's not a real story, so there isn't.</p>
>  </body>
> </article>
>
> Assuming the file was called hancock.article, the n-triples might look
> like the following (modulo the use of QNames instead of full URIs because
> I'm lazy and think full URIs are hard to read and harder to type):
>
>
> <hancock.article> <dc:title> _:lit1.
> <hancock.article> <dc:subject> <http://example.org/subject_codes/networks>.
> <hancock.article> <prism:releaseTime> _:lit2.
>
> _:lit2 <rdf:value> "2001-10-12".
>
> _:lit1 <rdf:value> "<i>CRN</i> Interview: Ellen Hancock, Exodus Communications".
> _:lit1 <xml:lang> "en-US".
> _:lit1 <rdf:type> <rdf2:xmlLiteral>
> _:lit1 <rdf2:ns> _:gen3
>
> _:gen3 <rdf:type> <rdf:Bag>
> _:gen3 <rdf:_1> "xmlns=\"the XHTML namespace URI\"".
> _:gen3 <rdf:_2> "xmlns:rdf=\"the RDF 1.0 namespace URI\"".
> _:gen3 <rdf:_3> "xmlns:dc=\"the Dublin Core namespace URI\"".
> _:gen3 <rdf:_4> "xmlns:prism=\"the PRISM namespace URI\"".
>
> (Not sure what to do about the character encoding. I assume
> that we don't specify it, requiring instead that all Unicode
> strings in an n-triples file are carried in some mandatory
> encoding.)
>
> Note that the generated identifiers should distinguish between
> the IDs of the nodes for literal strings and the IDs for generic
> anonymous nodes which happen to contain an rdf:value. Otherwise
> we won't be able to round-trip things like:
>
> <dc:creator>John Smith</dc:creator>
> <dc:subject rdf:parseType="Resource">
>   <rdf:value>Dogs</rdf:value>
> </dc:subject>
>
>
> Ron Daniel Jr.
> Standards Architect
> Tel: +1 415 778 3113
> Fax: +1 415 778 3131
> Email: rdaniel@interwoven.com
>
> Register for GearUp 2001, Oct. 9-12
> The Year's Hottest Content Infrastructure Conference
> Visit www.interwoven.com/gearup2001
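
For anyone who wants to experiment with the outline, here is a rough Python sketch of points 2-6 and option 10(c): each literal becomes its own generated node carrying rdf:value, and optionally xml:lang, rdf:type, and a Bag of namespace bindings. It is only an illustration under stated assumptions, not an agreed design. The function names (quote, literal_as_node, to_ntriples) are made up for this note, rdf2:xmlLiteral and rdf2:ns are the placeholder names from Ron's example, QNames stand in for full URIs as they do there, and the " ." statement terminator is my own choice.

```python
from itertools import count


def quote(text):
    """Wrap text in double quotes, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def literal_as_node(subject, predicate, text, ids,
                    lang=None, is_xml=False, namespaces=None):
    """Return (s, p, o) tuples that model one literal as its own node.

    ids is a shared counter for generated labels. Literal nodes are
    labelled "_:litN" and other generated nodes "_:genN", so the two
    kinds stay distinguishable (an assumption taken from the example).
    """
    node = f"_:lit{next(ids)}"
    triples = [
        (subject, predicate, node),
        (node, "<rdf:value>", quote(text)),
    ]
    if lang:
        triples.append((node, "<xml:lang>", quote(lang)))
    if is_xml:
        # Hypothetical type name, as suggested in point 5.
        triples.append((node, "<rdf:type>", "<rdf2:xmlLiteral>"))
    if namespaces:
        # Namespace bindings in scope, carried as members of a Bag.
        bag = f"_:gen{next(ids)}"
        triples.append((node, "<rdf2:ns>", bag))
        triples.append((bag, "<rdf:type>", "<rdf:Bag>"))
        for i, (prefix, uri) in enumerate(namespaces.items(), start=1):
            attr = f"xmlns:{prefix}" if prefix else "xmlns"
            triples.append((bag, f"<rdf:_{i}>", quote(f'{attr}="{uri}"')))
    return triples


def to_ntriples(triples):
    """Serialize the tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


ids = count(1)
doc = "<hancock.article>"
triples = literal_as_node(
    doc, "<dc:title>",
    "<i>CRN</i> Interview: Ellen Hancock, Exodus Communications",
    ids, lang="en-US", is_xml=True,
    namespaces={"": "the XHTML namespace URI",
                "dc": "the Dublin Core namespace URI"},
)
triples += literal_as_node(doc, "<prism:releaseTime>", "2001-10-12", ids)
print(to_ntriples(triples))
```

The generated labels will not match the numbering in Ron's listing, since this sketch allocates the Bag label while it processes the title. Only the "lit" versus "gen" prefix distinction is meant to carry over.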
Received on Tuesday, 16 October 2001 09:21:33 UTC