Re: after-hours conversation (#literal-as-resources #literal-is-xml-structure #xmllang #graph #identity-anon-resources #literal-subjects)

Ron - thanks for providing these notes.  

I like the general approach that is outlined and would favor
representing a Literal as a bNode.  I prefer providing as much
info as possible (to assist round-tripping) so I favor generating
triples for such things as the parseType itself.
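
As a rough illustration of what "Literal as a bNode" could generate, here is
a minimal Python sketch following points 2 through 5 of the notes quoted
below. The function name, its parameters, and the caller-supplied node label
are assumptions made for the example; the QName-style terms are borrowed
from Ron's sample output and none of this is agreed syntax.

```python
def literal_to_triples(subject, predicate, text, node, lang=None, xml_literal=False):
    """Expand one literal occurrence into triples centred on a bNode.

    `node` is the generated label for the literal's node (e.g. "_:lit1").
    """
    # Escape backslashes first, then double quotes, for the quoted string form.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    triples = [
        f"{subject} {predicate} {node} .",
        f'{node} <rdf:value> "{escaped}" .',
    ]
    if lang is not None:
        # Point 4: the language is just another property of the node.
        triples.append(f'{node} <xml:lang> "{lang}" .')
    if xml_literal:
        # Point 5: parseType="Literal" is recorded as an rdf:type on the node.
        triples.append(f"{node} <rdf:type> <rdf2:xmlLiteral> .")
    return triples


for line in literal_to_triples("<hancock.article>", "<dc:title>",
                               "<i>CRN</i> Interview", "_:lit1",
                               lang="en-US", xml_literal=True):
    print(line)
```

The namespace bindings from point 6 are left out to keep the sketch short.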

With respect to backward compatibility, would it help to define an
additional parseType value (e.g. "rdf:Literal")?  When parsers built
on M&S 1.0 encounter
this new parseType they would simply treat the property as if 
it had parseType set to "Literal".  New parsers - those that 
support the new parseType - would do-the-new-thing when they
encounter the new parseType and could either do-the-new-thing
or do-the-old-thing when the parseType is "Literal" (new parsers 
could support a switch).
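
A small Python sketch of that dispatch, purely to show the intended
behaviour. The function, its flag names, and the "new"/"old" return values
are hypothetical, and "rdf:Literal" is only the example value floated above.

```python
NEW_PARSE_TYPE = "rdf:Literal"  # example value from this message, not an agreed name


def handling(parse_type, new_parser, prefer_new_for_literal=False):
    """Return "new" or "old" for a property carrying the given rdf:parseType."""
    if parse_type == "Resource":
        raise ValueError("parseType='Resource' is outside the scope of this sketch")
    if not new_parser:
        # A parser built on M&S 1.0 treats an unrecognised parseType as "Literal".
        return "old"
    if parse_type == NEW_PARSE_TYPE:
        return "new"
    if parse_type == "Literal" and prefer_new_for_literal:
        # The optional switch: a new parser may also do the new thing here.
        return "new"
    # Assumption: anything else falls back to the old Literal handling.
    return "old"


print(handling("rdf:Literal", new_parser=False))
print(handling("rdf:Literal", new_parser=True))
print(handling("Literal", new_parser=True, prefer_new_for_literal=True))
```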

Anyhow, was the owner of rdfms-literal-is-xml-structure present?  
If so, was he in favor of this approach?  Should we expect a more
formal proposal based on this outline from someone (or the group
that met after the meeting)?

Art
---

On Fri, Oct 12, 2001 at 01:53:50PM -0700, Ron Daniel wrote:
> 
> A few of us remained on the call after the official close
> of the 2001-10-12 teleconference. We had some more discussion
> of parseType="Literal" and other non-controversial topics.
> I thought the rest of the group might be interested in one
> suggestion that was made for handling literals (all
> literals, not just XML literals). I do not recall any
> objections being made to this proposal. It was:
> 
> 1) With respect to literals, let's define a solution based on where
>    we want to be with RDF 2, then treat RDF 1 M&S as a special case,
>    or simple mapping, down from that.
> 
> 2) In RDF 2, let each occurrence of a literal be a prince/b/whatever_node,
>    identified in whatever way we decide to handle the things we used to
>    call anonymous resources.
> 
> 3) That node will have an rdf:value property whose value is the
>    literal's character string.
>    (A corollary might be that rdf:value properties are the only ones
>     that actually have character strings as values. That would be
>     conceptually cleanest. However, it might be a pain in practice as
>     things like xml:lang and some other things might be best served
>     with immediate string values. TBD.)
> 
> 4) The xml:lang is another property of that node.
> 
> 5) If the literal had rdf:parseType="Literal", this will be reflected
>    in the model by giving that node an rdf:type property with an
>    appropriate value, perhaps rdf2:xmlLiteral.
> 
> 6) The namespace bindings in effect for XML literals will appear
>    as another property (or set of properties) of that node.
> 
> 7) This mechanism will allow 'Literals as subjects' in RDF 2.
> 
> 8) Literals as subjects are not part of the RDF 1 M&S.
> 
> 9) The 2.0 model can be rendered in 1.0 syntax by:
>    a) Rendering the xml:lang property as an attribute on an element at
>       an appropriate scoping level
>    b) Rendering the rdf:type property whose value is "rdf2:xmlLiteral" as
>       an rdf:parseType attribute with the value "Literal".
>    c) Rendering the xmlns properties as attributes at the appropriate
>       scoping level (which probably means 'as high as possible').
>    d) Not rendering any other properties of the literal.
>       (This means that a 2.0 model cannot be round-tripped through the
>        1.0 syntax. That is OK. A 1.0 syntax will still be
>        round-trippable through the 2.0 model.)
> 
> 10) We still have the question of how to express the language and parse
>     type in the 1.0 model (i.e. n-triples). We have at least the following
>     choices:
>    a) Leave n-triples with three fields. Literals can only appear in
>       the third, object, field. Literals follow some grammar like:
>          Literal := QUOTE literal_string (DELIM1 lang_string)?
>                       (DELIM2 xmlns_string)? UNQUOTE
>       and we argue for a while over the characters we actually use for
>       the terminals QUOTE, DELIM1, DELIM2, and UNQUOTE.
>    b) Let statements in an n-triples document which have literal values
>       contain more than 3 fields (which, to me, seems no different
>       than (a) since we still have to argue over how things will be
>       delimited).
>    c) Say that the 1.0 model was never defined clearly, and just start with
>       the 2.0 model, letting the 1.0 syntax be the thing that requires
>       various restrictions. (In other words, the n-triples representation
>       would use the p/b/anon_nodes to carry xml:lang, rdf:type, and
>       namespace properties as separate statements.)
>    d) something else?
> 
> Currently, I'd be OK with 10(c).
> 
> As an example, here's an XML document with embedded RDF, followed
> by a possible n-triples representation of the RDF portion.
> 
> <?xml version="1.0" encoding="ISO-8859-1"?>
> <m:article  xmlns="the XHTML namespace URI"
>             xmlns:rdf="the RDF 1.0 URI"
>             xmlns:dc="the Dublin Core namespace URI"
>             xmlns:prism="the PRISM namespace URI"
>             xmlns:m="a magazine article message namespace"
>             xml:lang="en-US">
>  <rdf:RDF>
>    <rdf:Description rdf:about="">
>      <dc:title rdf:parseType="Literal"><i>CRN</i> Interview: Ellen Hancock, Exodus Communications</dc:title>
>      <dc:subject rdf:resource="http://example.org/subject_codes/networks"/>
>      <prism:releaseTime>2001-10-12</prism:releaseTime>
>    </rdf:Description>
>  </rdf:RDF>
>  <body>
>    <m:headline>Interview: Ellen Hancock, Exodus Communications</m:headline>
>    <p>If this were a real story, there would be lots of stuff here.</p>
>    <p>Some of that stuff would include pithy quotes from Ms. Hancock,
>       such as <quote prism:speaker="Ellen Hancock">Like Mark Twain said,
>       <quote prism:speaker="Samuel Clemens">It's better to keep one's
>       mouth shut and appear a fool, than to open it and remove all doubt</quote>.
>       Too bad that Ron Daniel guy doesn't follow that advice</quote>.</p>
>    <p>But it's not a real story, so there isn't.</p>
>  </body>
> </m:article>
> 
> Assuming the file was called hancock.article, the n-triples might look
> like the following (modulo the use of QNames instead of full URIs because
> I'm lazy and think full URIs are hard to read and harder to type):
> 
> 
> <hancock.article> <dc:title> _:lit1.
> <hancock.article> <dc:subject> <http://example.org/subject_codes/networks>.
> <hancock.article> <prism:releaseTime> _:lit2.
> 
> _:lit2 <rdf:value> "2001-10-12".
> 
> _:lit1 <rdf:value> "<i>CRN</i> Interview: Ellen Hancock, Exodus Communications".
> _:lit1 <xml:lang> "en-US".
> _:lit1 <rdf:type> <rdf2:xmlLiteral>.
> _:lit1 <rdf2:ns> _:gen3.
> 
> _:gen3 <rdf:type> <rdf:Bag>.
> _:gen3 <rdf:_1> "xmlns=\"the XHTML namespace URI\"".
> _:gen3 <rdf:_2> "xmlns:rdf=\"the RDF 1.0 namespace URI\"".
> _:gen3 <rdf:_3> "xmlns:dc=\"the Dublin Core namespace URI\"".
> _:gen3 <rdf:_4> "xmlns:prism=\"the PRISM namespace URI\"".
> 
> (Not sure what to do about the character encoding. I assume
> that we don't specify it, requiring instead that all Unicode
> strings in an n-triples file are carried in some mandatory
> encoding.)
> 
> Note that the generated identifiers should distinguish between
> the IDs of the nodes for literal strings and the IDs for generic
> anonymous nodes which happen to contain an rdf:value. Otherwise
> we won't be able to round-trip things like:
> 
>   <dc:creator>John Smith</dc:creator>
>   <dc:subject rdf:parseType="Resource">
>     <rdf:value>Dogs</rdf:value>
>   </dc:subject>
> 
> 
> Ron Daniel Jr.
> Standards Architect
> Tel: +1 415 778 3113
> Fax: +1 415 778 3131
> Email: rdaniel@interwoven.com 
> 

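On Ron's closing note about identifiers: a minimal Python sketch of why the
label prefix matters when going back to the 1.0 syntax. It assumes the
"_:lit" / "_:gen" prefixes from his sample output, a hypothetical
render_property helper, triples held as plain (subject, predicate, object)
tuples, and values that need no XML escaping.

```python
def render_property(prop, obj, triples):
    """Render one property element for the node `obj`."""
    props = [(p, o) for s, p, o in triples if s == obj]
    if obj.startswith("_:lit"):
        # A literal's node: collapse it back to inline character data.
        value = next(o for p, o in props if p == "rdf:value")
        return f"<{prop}>{value}</{prop}>"
    # Any other anonymous node keeps its structure, rdf:value included.
    inner = "".join(f"<{p}>{o}</{p}>" for p, o in props)
    return f'<{prop} rdf:parseType="Resource">{inner}</{prop}>'


triples = [
    ("doc", "dc:creator", "_:lit1"),
    ("_:lit1", "rdf:value", "John Smith"),
    ("doc", "dc:subject", "_:gen1"),
    ("_:gen1", "rdf:value", "Dogs"),
]
print(render_property("dc:creator", "_:lit1", triples))
print(render_property("dc:subject", "_:gen1", triples))
```

Both nodes carry nothing but an rdf:value, so without the prefix the two
cases could not be told apart.
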
Received on Tuesday, 16 October 2001 09:21:33 UTC