- From: Dan Brickley <Daniel.Brickley@bristol.ac.uk>
- Date: Sat, 5 Aug 2000 10:05:38 +0100 (BST)
- To: Guha <guha@guha.com>
- cc: Dan Brickley <danbri@w3.org>, "Perry A. Caro" <caro@Adobe.COM>, "'www-rdf-interest@w3.org'" <www-rdf-interest@w3.org>
So here's a quirky strawman I've been chatting about with Dave Beckett.

We use a syntactic subset of RDF 1.0, drawing on the existing constructs for representing Statement, predicate, subject, object. From an RDF 1.0 point of view this is just a bunch of quoted statements that we've not said anything about. Legal, just odd: the data is contentfully contentless.

To add content to the interpretation of an XML document containing this stuff, we wrap it in one containing tag, eg. rdump:StatementSet, which is a purely syntactic XML construct with meaning "Here's a bunch of RDF statements represented in the 'rdump' simple syntax for an rdf statement set.". Application context (or further wrapper tags or factoids carried in the statementset itself) could supply any additional information, eg. carrying provenance of those quoted statements, such as 'here are some triples I got from doing a GET on URI foo at date bar'.

Unlike normal RDF 1.0 syntax, the DTD and/or XML Schema that defined this syntax would nail down some pretty tight constraints so quick perl scripts etc could have reasonable expectations about what they're getting.

  <rdump:StatementSet xmlns:rdump="http://example.com/rdf-dump-syntax-0.0">
    <RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <Statement>
        <predicate resource="http://purl.org/dc/elements/1.1/creator"/>
        <subject resource="http://example.com/danhomepage/"/>
        <object resource="genid:435345353453"/>
      </Statement>
      <Statement>
        <predicate resource="http://example.com/vocab/personalMailbox"/>
        <subject resource="genid:435345353453"/>
        <object resource="mailto:danbri@w3.org"/>
      </Statement>
      <Statement xml:lang="en">
        <predicate resource="http://purl.org/dc/elements/1.1/title"/>
        <subject resource="http://example.com/danhomepage/"/>
        <object>danbri's fictional home page</object>
      </Statement>
      [...]
    </RDF>
  </rdump:StatementSet>

Whatever we do is gonna look a bit like this; making it also parse as RDF 1.0 is kind of interesting, but maybe a distraction.
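To get a rough feel for what reading such a dump involves, here is a minimal Python sketch (not the perl scripts mentioned above, and not a reference implementation). It assumes the rdump prefix is bound to the example namespace, that the RDF elements sit in the RDF 1.0 namespace, and that xml:lang carries a value such as "en". The function name read_statements and the tuple layout it returns are invented purely for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Assumed well-formed variant of the strawman example (namespace bindings
# and the xml:lang value are assumptions, not part of any agreed syntax).
SAMPLE = """\
<rdump:StatementSet xmlns:rdump="http://example.com/rdf-dump-syntax-0.0">
  <RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <Statement>
      <predicate resource="http://purl.org/dc/elements/1.1/creator"/>
      <subject resource="http://example.com/danhomepage/"/>
      <object resource="genid:435345353453"/>
    </Statement>
    <Statement xml:lang="en">
      <predicate resource="http://purl.org/dc/elements/1.1/title"/>
      <subject resource="http://example.com/danhomepage/"/>
      <object>danbri's fictional home page</object>
    </Statement>
  </RDF>
</rdump:StatementSet>
"""


def rdf_tag(local_name):
    """Return the namespace-qualified tag ElementTree uses for an RDF element."""
    return "{%s}%s" % (RDF_NS, local_name)


def read_statements(xml_text):
    """Hypothetical helper: return a list of (subject, predicate, object) tuples.

    The object is ("resource", uri, None) or ("literal", text, lang).
    Matching is done on namespace names, so the prefixes used in the
    document do not matter.
    """
    root = ET.fromstring(xml_text)
    triples = []
    for statement in root.iter(rdf_tag("Statement")):
        subject = statement.find(rdf_tag("subject")).get("resource")
        predicate = statement.find(rdf_tag("predicate")).get("resource")
        obj = statement.find(rdf_tag("object"))
        if obj.get("resource") is not None:
            value = ("resource", obj.get("resource"), None)
        else:
            # xml:lang is read from the Statement element, as in the example.
            lang = statement.get("{%s}lang" % XML_NS)
            value = ("literal", obj.text or "", lang)
        triples.append((subject, predicate, value))
    return triples


triples = read_statements(SAMPLE)
for triple in triples:
    print(triple)
```

Because the lookup goes through namespace names rather than prefixes, the same function should cope with a document that writes rdf:Statement instead of relying on a default namespace. It does nothing about markup inside literals or about validating the tight constraints a DTD or schema would impose.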
A more interesting question is "how hacky to make it". At the hacky end of the scale is stuff like tab-separated text files that ignore xml:lang, charset, markup inside literals etc. While I admit to having used such things occasionally myself, these feel _too_ hacky. At the other end of the scale is a proper XML syntax, using namespaces etc. in the expected manner.

While I can write a nasty perl script to parse the above example data if we assume (dodgily) that we know what namespace prefixes are being used, I don't have a sense of how much hassle it would be to have this RDF dump syntax be namespace aware. I hope it won't be a big deal and we can find something that is xml namespace friendly while still being pretty easy to process...

Comments?

Dan

On Fri, 4 Aug 2000, Guha wrote:

> Amen. It would be nice to have a standard "log" format for triples.
>
> guha
>
> Dan Brickley wrote:
>
> > On Fri, 4 Aug 2000, Perry A. Caro wrote:
> >
> > > Lee,
> > >
> > > If you look back at the archives, you'll see a long series of messages about
> > > simplifying the RDF syntax. The most radical proposal was to reduce the RDF
> > > serialization to simple statements, like:
> > >
> > > <srdf:Statement prop="title" res="...URI">Literal Value</srdf:Statement>
> > > <srdf:Statement prop="creator" value="#id001" res="...URI"/>
> > > <srdf:Statement prop="rdf:_1" res="id001">Author 1</srdf:Statement>
> > > <srdf:Statement prop="rdf:_2" res="id001">Author 2</srdf:Statement>
> > > <srdf:Statement prop="rdf:type" value="rdf:Bag" res="id001"/>
> > >
> > > etc. There were several other proposals, including one from Tim
> > > Berners-Lee.
> > >
> > > The silence may be a way of saying, "Been there, done that." :-)
> >
> > Seems a shame if we've all got tired of the discussion without actually
> > finishing an alternative serialisation spec. There are lots of issues,
> > eg. above you use qnames inside attribute values. Also the issue of how to
> > identify anonymous/transient nodes in such a way as to not confuse
> > generated IDs with 'proper' URIs.
> >
> > I keep finding myself re-inventing variants on the above syntax (for
> > quickie Perl / Javascript work), sometimes just using tab-separated data
> > structures. This suggests to me that a writeup of such a syntax would be a handy
> > thing to have.
> >
> > I'd be interested to hear whether a (say) W3C Note specifying such a
> > simple lowest common denominator 'rdf dump syntax' would be useful to
> > implementors. My own implementation experience suggests 'yes'. Other
> > perspectives would be useful...
> >
> > Dan
> >
Received on Saturday, 5 August 2000 05:05:42 UTC