
parseType="Literal" as syntactic sugar for infoset description (#rdfms-literal-is-xml-structure)

From: Dan Connolly <connolly@w3.org>
Date: Wed, 10 Oct 2001 15:24:26 -0500
Message-ID: <3BC4AE7A.9AD31884@w3.org>
To: w3c-rdfcore-wg@w3.org
Dan Connolly wrote:
[...]
> Jeremy, as owner of #rdfms-literal-is-xml-structure, I'd like
> you to show what the n-triples form of that document should be.
> 
> I've got a proposal, involving exploding it out as infoset
> properties... I think I'll work out the details and put
> the results on the table.

ok... we've decided that rdf:ID="..." is syntactic sugar
for rdf:about="#..." and that rdf:li is syntactic sugar
for rdf:_N and so on; I suggest that parseType="Literal"
is syntactic sugar for a description of a bunch of infoset
items; sorta like the way '(a b c) is syntactic
sugar, in lisp, for (cons (quote a) (cons (quote b) (cons (quote c)
nil))).
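To make the two existing reductions concrete, here is a rough sketch in Python of what they do to a single node element. It is only an illustration under my own assumptions: desugar_node is a made-up helper name, it looks at one element and its direct children only, and it assumes no explicit rdf:_N children are mixed in with the rdf:li ones.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def desugar_node(elem):
    """Hypothetical helper: expand two kinds of RDF/XML sugar on one node element.

    - rdf:ID="x" on the element becomes rdf:about="#x"
    - each direct rdf:li child becomes rdf:_1, rdf:_2, ... in document order
    The element is modified in place and returned.
    """
    id_attr = "{%s}ID" % RDF
    if id_attr in elem.attrib:
        # Remove rdf:ID and add the equivalent rdf:about with a fragment reference.
        elem.set("{%s}about" % RDF, "#" + elem.attrib.pop(id_attr))
    n = 0
    for child in elem:
        if child.tag == "{%s}li" % RDF:
            n += 1
            child.tag = "{%s}_%d" % (RDF, n)
    return elem

bag = ET.fromstring(
    '<rdf:Bag xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" rdf:ID="b">'
    "<rdf:li>x</rdf:li><rdf:li>y</rdf:li></rdf:Bag>"
)
desugar_node(bag)
```

The parseType="Literal" reduction is meant in the same spirit: a rewrite of the syntax into plainer syntax, with no change to what gets said.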


I've implemented the reduction in

  http://www.w3.org/2001/04rs22/litSugar.xsl
which is a little wrapper around
  http://www.w3.org/XML/2000/04rdf-parse/content.xsl

you can see the mathml example from the RDF spec:
  http://www.w3.org/2001/04rs22/rdf-mathlit.rdf
reduced to have no parseType="Literal"
  http://www.w3.org/2001/04rs22/rdf-mathlit-noLit.rdf
which reduces straightforwardly to ntriples:
  http://www.w3.org/2001/04rs22/rdf-mathlit-noLit.nt

Perhaps a simpler example is in order; let's use one
of ArtB's test cases...

 <rdf:Description rdf:about="http://www.example.org">
   <eg:property rdf:parseType="Literal">well-formed XML</eg:property>
 </rdf:Description>

becomes

 <rdf:Description rdf:about="http://www.example.org">
   <property xmlns="http://example.org/">
    <xi:InfoItemSeq xmlns:xi="http://www.w3.org/2000/07/infoset#">
     <rdf:li 
        xmlns:xia="http://www.w3.org/2000/07/hs78/content.xsl?term=">
      <xia:Characters rdf:value="well-formed XML"/>
     </rdf:li>
    </xi:InfoItemSeq>
   </property>
 </rdf:Description>

or, in ntriples:
 
 _:a0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/07/infoset#InfoItemSeq> .
 _:a0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> _:a1 .
 _:a1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/07/hs78/content.xsl?term=Characters> .
 <http://www.example.org/> <http://example.org/property> _:a0 .
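
For the text-only case, the reduction can be sketched in a few lines of Python. This is not the XSLT linked above, just a hand-written approximation: literal_to_ntriples is a hypothetical name, the blank node labels are fixed to match the example, and only plain character content is handled. It also emits an rdf:value triple for the Characters item, on the assumption that the rdf:value attribute in the RDF/XML form should show up as a triple; the listing above does not include one.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XI = "http://www.w3.org/2000/07/infoset#"
XIA = "http://www.w3.org/2000/07/hs78/content.xsl?term="

def escape(text):
    """Escape backslash, double quote and newline for an N-Triples string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def literal_to_ntriples(subject, prop, text):
    """Hypothetical sketch: describe text-only literal content as infoset items.

    Returns a list of N-Triples lines: an InfoItemSeq whose first member is
    a Characters item carrying the text, attached to subject via prop.
    """
    seq, chars = "_:a0", "_:a1"  # fixed labels, chosen to match the example
    return [
        "%s <%stype> <%sInfoItemSeq> ." % (seq, RDF, XI),
        "%s <%s_1> %s ." % (seq, RDF, chars),
        "%s <%stype> <%sCharacters> ." % (chars, RDF, XIA),
        # Assumption: rdf:value on xia:Characters yields this triple.
        '%s <%svalue> "%s" .' % (chars, RDF, escape(text)),
        "<%s> <%s> %s ." % (subject, prop, seq),
    ]

for line in literal_to_ntriples(
    "http://www.example.org/", "http://example.org/property", "well-formed XML"
):
    print(line)
```

Element and attribute content would need one more infoset item per node, each with its own properties, which is the part the stylesheet actually takes care of.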

The http://www.w3.org/2000/07/infoset# namespace is a sort of
by-product of the XML Infoset spec; we'd have to pick it
up (it's a NOTE) and finish it. The xia:Characters thingy
is a short-cut for individual character information items.

I've done some work on a version of the infoset schema
that exploits DAML+OIL to be more expressive/precise:
  http://www.w3.org/2000/10/swap/infoset/infoset-daml.n3
  http://www.w3.org/2000/10/swap/infoset/infoset-diagram.svg (and .png)

So... this proposal has answers to all the various details,
though if we pursue it, we might want to tweak some of them.

In particular, any xml:lang info in an ancestor of a propertyElement
is lost, but XML namespaces are carried through. (I guess I should
cook up some tests to demonstrate that, but I think I'm gonna
stop here until I hear more support for pursuing this approach).

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Wednesday, 10 October 2001 16:24:27 EDT
