Re: RDF syntax 'improvements'? - was RE: [Fwd: xmlns, uri+name pairs or just uris..? Clarification needed.] from Dan Brickley on 2000-08-05 (www-rdf-interest@w3.org from August 2000)

From: Dan Brickley <Daniel.Brickley@bristol.ac.uk>
Date: Sat, 5 Aug 2000 10:05:38 +0100 (BST)
To: Guha <guha@guha.com>
cc: Dan Brickley <danbri@w3.org>, "Perry A. Caro" <caro@Adobe.COM>, "'www-rdf-interest@w3.org'" <www-rdf-interest@w3.org>
Message-ID: <Pine.GHP.4.21.0008050942000.15091-100000@mail.ilrt.bris.ac.uk>

So here's a quirky strawman I've been chatting about with Dave Beckett.

We use a syntactic subset of RDF 1.0, drawing on the existing constructs
for representing Statement, predicate, subject, object. From an RDF
1.0 point of view this is just a bunch of quoted statements that we've not
said anything about. Legal, just odd: the data is contentfully
contentless. 

To add content to the interpretation of an XML document
containing this stuff, we wrap it in one containing tag,
eg. rdump:StatementSet, which is a purely syntactic XML construct
with meaning "Here's a bunch of RDF statements represented in
the 'rdump' simple syntax for an rdf statement set.". Application
context (or further wrapper tags or factoids carried in the statementset
itself) could supply any additional information, eg. carrying
provenance of those quoted statements, such as 'here are some triples I
got from doing a GET on URI foo at date bar'.

Unlike normal RDF 1.0 syntax, the DTD and/or XML Schema that defined
this syntax would nail down some pretty tight constraints so quick perl scripts
etc could have reasonable expectations about what they're getting.

<rdump:StatementSet xmlns:rdump="http://example.com/rdf-dump-syntax-0.0">

 <RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

 <Statement>
  <predicate resource="http://purl.org/dc/elements/1.1/creator"/>
  <subject resource="http://example.com/danhomepage/"/>
  <object resource="genid:435345353453"/>
 </Statement>

 <Statement>
  <predicate resource="http://example.com/vocab/personalMailbox"/>
  <subject resource="genid:435345353453"/>
  <object resource="mailto:danbri@w3.org"/>
 </Statement>

 <Statement xml:lang="en">
  <predicate resource="http://purl.org/dc/elements/1.1/title"/>
  <subject resource="http://example.com/danhomepage/"/>
  <object>danbri's fictional home page</object>
 </Statement>
[...]

 </RDF>
</rdump:StatementSet>

Whatever we do is gonna look a bit like this; making it also parse as
RDF 1.0 is kind of interesting, but maybe a distraction. A more
interesting question is "how hacky to make it". At the hacky end of the
scale is stuff like tab-separated text files that ignore xml:lang,
charset, markup inside literals etc. While I admit to having used such
things occasionally myself, these feel _too_ hacky. At the other end of
the scale is a proper XML syntax, using namespaces etc. in the expected
manner.

While I can write a nasty perl script to parse the above example data if
we assume (dodgily) that we know what namespace prefixes are being used,
I don't have a sense of how much hassle it would be to have this RDF
dump syntax be namespace aware. I hope it won't be a big deal and we can
find something that is xml namespace friendly while still being
pretty easy to process...
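
As a rough way to gauge that hassle, here is a minimal Python sketch of a
namespace-aware reader for the dump syntax above. It is an illustration
under assumptions, not a proposed implementation: the function names
(read_statements, resource_of, q) and the 'd:' / 'r:' prefixes in the
sample are made up, and the "exactly one predicate, subject and object per
Statement" check is only a guess at what a tight DTD might enforce. It
matches elements by namespace URI rather than by prefix, picks up xml:lang,
and ignores any markup inside literals.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# The 'xml' prefix is always bound to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def q(local):
    """Build ElementTree's '{namespace}local' name for an RDF term."""
    return "{%s}%s" % (RDF_NS, local)


def resource_of(el):
    """Return the resource attribute, unqualified or rdf-qualified, or None."""
    value = el.get("resource")
    return value if value is not None else el.get(q("resource"))


def read_statements(xml_text):
    """Yield (subject, predicate, object, is_literal, lang) tuples.

    Assumed constraint (hypothetical): each Statement has exactly one
    subject, predicate and object child.
    """
    root = ET.fromstring(xml_text)
    for st in root.iter(q("Statement")):
        parts = {}
        for name in ("subject", "predicate", "object"):
            found = st.findall(q(name))
            if len(found) != 1:
                raise ValueError("expected exactly one <%s>" % name)
            parts[name] = found[0]
        subj = resource_of(parts["subject"])
        pred = resource_of(parts["predicate"])
        if subj is None or pred is None:
            raise ValueError("subject and predicate need a resource")
        obj = resource_of(parts["object"])
        is_literal = obj is None
        if is_literal:
            # Text content only; nested markup in the literal is dropped.
            obj = parts["object"].text or ""
        yield subj, pred, obj, is_literal, st.get(XML_LANG)


# Same data as the example above, but with arbitrary prefixes, to show
# that nothing depends on which prefixes the writer happened to choose.
SAMPLE = """<d:StatementSet xmlns:d="http://example.com/rdf-dump-syntax-0.0"
                xmlns:r="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
 <r:RDF>
  <r:Statement>
   <r:predicate resource="http://example.com/vocab/personalMailbox"/>
   <r:subject resource="genid:435345353453"/>
   <r:object resource="mailto:danbri@w3.org"/>
  </r:Statement>
  <r:Statement xml:lang="en">
   <r:predicate resource="http://purl.org/dc/elements/1.1/title"/>
   <r:subject resource="http://example.com/danhomepage/"/>
   <r:object>danbri's fictional home page</r:object>
  </r:Statement>
 </r:RDF>
</d:StatementSet>"""

if __name__ == "__main__":
    for triple in read_statements(SAMPLE):
        print(triple)
```

With an off-the-shelf namespace-aware XML parser doing the prefix
resolution, the reader stays small; the cost is mostly in depending on
such a parser rather than on regular expressions.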

Comments? 

Dan

On Fri, 4 Aug 2000, Guha wrote:

> Amen. It would be nice to have a standard "log" format for  triples.
> 
> guha
> 
> Dan Brickley wrote:
> 
> > On Fri, 4 Aug 2000, Perry A. Caro wrote:
> >
> > > Lee,
> > >
> > > If you look back at the archives, you'll see a long series of messages about
> > > simplifying the RDF syntax.  The most radical proposal was to reduce the RDF
> > > serialization to simple statements, like:
> > >
> > > <srdf:Statement prop="title" res="...URI">Literal Value</srdf:Statement>
> > > <srdf:Statement prop="creator" value="#id001" res="...URI"/>
> > > <srdf:Statement prop="rdf:_1" res="id001">Author 1</srdf:Statement>
> > > <srdf:Statement prop="rdf:_2" res="id001">Author 2</srdf:Statement>
> > > <srdf:Statement prop="rdf:type" value="rdf:Bag" res="id001"/>
> > >
> > > etc.  There were several other proposals, including one from Tim
> > > Berners-Lee.
> > >
> > > The silence may be a way of saying, "Been there, done that." :-)
> >
> > Seems a shame if we've all got tired of the discussion without actually
> > finishing an alternative serialisation spec. There are lots of issues,
> > eg. above you use qnames inside attribute values. Also the issue of how to
> > identify anonymous/transient nodes in such a way as to not confuse
> > generated IDs with 'proper' URIs.
> >
> > I keep finding myself re-inventing variants on the above syntax (for
> > quickie Perl / Javascript work), sometimes just using tab-separated data
> > structures. This suggests to me that a writeup of such a syntax would be a handy
> > thing to have.
> >
> > I'd be interested to hear whether a (say) W3C Note specifying such a
> > simple lowest common denominator 'rdf dump syntax' would be useful to
> > implementors. My own implementation experience suggests 'yes'. Other
> > perspectives would be useful...
> >
> > Dan
> 
>

Received on Saturday, 5 August 2000 05:05:42 UTC