Proposal: "Canonical" RDF/XML

From: Benja Fallenstein <b.fallenstein@gmx.de>
Date: Sun, 29 Jun 2003 04:35:58 +0200
Message-ID: <3EFE508E.5010902@gmx.de>
To: rdf-i <www-rdf-interest@w3.org>

Hi all,

I've been thinking about how to "canonicalize" the RDF/XML syntax, so 
that the same graph (with namespaces/anonymous nodes labeled the same 
way) always produces the same output file. A major application would be 
interoperating with textual 'diff'/'merge' tools and versioning systems 
like CVS: if the RDF is formatted differently on every save, these tools 
lose their value.

Does anybody know whether there are proposals/implementations for 
something like this already?

My idea is to use the following rules:

- All triples with the same subject are collected in a single 
<rdf:Description> element (which is a child of the <rdf:RDF> element). 
Each <rdf:Description> has an rdf:about or rdf:nodeID attribute.
- A triple "a x:prop b" is represented as <x:prop rdf:resource="b"/> 
inside the <rdf:Description> of a. Triples with literal values are 
handled similarly, with the literal as the text content of the property 
element. Blank node values are identified through rdf:nodeID.
- The <rdf:Description> elements are ordered by subject.
- The property elements inside an <rdf:Description> are ordered first by 
property, then by object of the triple.
- Each <rdf:Description> and </rdf:Description> is on its own line, not 
indented. Each property element is on its own, single line (except for 
multiline literals), indented two spaces.
- All namespace declarations are on the <rdf:RDF> element.
- Canonical XML is applied.
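
The rules above can be sketched in a few lines of Python. This is only 
an illustration under stated assumptions, not a reference 
implementation: the tagged-tuple term model, the names qname() and 
serialize(), sorting namespace declarations by prefix, and putting 
blank-node subjects before URI subjects are all choices made for this 
sketch. It also does not run a real Canonical XML pass, and it ignores 
datatypes and xml:lang.

```python
from xml.sax.saxutils import escape, quoteattr

# Assumed term model for this sketch: ("uri", value), ("bnode", label)
# or ("lit", text). Predicates are plain URI strings.

def qname(uri, prefixes):
    """Abbreviate a property URI using the longest matching namespace.
    Raises ValueError if no declared namespace matches."""
    matches = [(ns, p) for p, ns in prefixes.items() if uri.startswith(ns)]
    ns, prefix = max(matches, key=lambda m: len(m[0]))
    return f"{prefix}:{uri[len(ns):]}"

def serialize(triples, prefixes):
    """Write triples in the restricted, sorted RDF/XML form described above."""
    lines = ['<?xml version="1.0"?>']
    # All namespace declarations go on <rdf:RDF>; sorted by prefix (assumption).
    decls = " ".join(f"xmlns:{p}={quoteattr(ns)}"
                     for p, ns in sorted(prefixes.items()))
    lines.append(f"<rdf:RDF {decls}>")

    # Collect all triples with the same subject; set() drops duplicates.
    by_subject = {}
    for s, p, o in set(triples):
        by_subject.setdefault(s, []).append((p, o))

    # Tuple ordering puts "bnode" subjects before "uri" subjects.
    for s in sorted(by_subject):
        attr = "rdf:nodeID" if s[0] == "bnode" else "rdf:about"
        lines.append(f"<rdf:Description {attr}={quoteattr(s[1])}>")
        # Order by full property URI first, then by object.
        for p, (kind, value) in sorted(by_subject[s]):
            tag = qname(p, prefixes)
            if kind == "lit":
                lines.append(f"  <{tag}>{escape(value)}</{tag}>")
            else:
                ref = "rdf:nodeID" if kind == "bnode" else "rdf:resource"
                lines.append(f"  <{tag} {ref}={quoteattr(value)}/>")
        lines.append("</rdf:Description>")

    lines.append("</rdf:RDF>")
    return "\n".join(lines) + "\n"

PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://example.org/stuff/1.0/",
}
LUCIA = ("bnode", "lucia")
DOC12 = ("uri", "http://example.org/DOC/12")
GRAPH = [
    (DOC12, PREFIXES["dc"] + "title", ("lit", "Kitchen Can Openers (II)")),
    (DOC12, PREFIXES["dc"] + "author", LUCIA),
    (LUCIA, PREFIXES["rdf"] + "type", ("uri", PREFIXES["ex"] + "Person")),
    (LUCIA, PREFIXES["ex"] + "age", ("lit", "27")),
]

if __name__ == "__main__":
    print(serialize(GRAPH, PREFIXES), end="")
```

Because subjects, properties and objects are all sorted before writing, 
the order in which triples are handed to serialize() should not affect 
the output, which is the property that matters for 'diff' and CVS.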

For example, the following graph:

     <http://example.org/DOC/12>   dc:author   _:lucia
     <http://example.org/DOC/12>   dc:title    "Kitchen Can Openers (II)"
     <http://example.org/DOC/24>   dc:author   _:lucia
     <http://example.org/DOC/24>   dc:title    "About Frogs"

     _:lucia                       rdf:type    ex:Person
     _:lucia                       ex:age      "27"

would be serialized like this:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
xmlns:dc="http://purl.org/dc/elements/1.1/" 
xmlns:ex="http://example.org/stuff/1.0/">
<rdf:Description rdf:nodeID="lucia">
  <ex:age>27</ex:age>
  <rdf:type rdf:resource="http://example.org/stuff/1.0/Person"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/DOC/12">
  <dc:author rdf:nodeID="lucia"/>
  <dc:title>Kitchen Can Openers (II)</dc:title>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/DOC/24">
  <dc:author rdf:nodeID="lucia"/>
  <dc:title>About Frogs</dc:title>
</rdf:Description>
</rdf:RDF>
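
One way to sanity-check a file in this form is to read it back into 
triples and compare the result with the original graph. The following 
is a minimal sketch using Python's standard xml.etree.ElementTree; the 
function name read_triples() and the tagged-tuple term model are 
assumptions of this example, and it only understands the restricted 
form shown here (flat <rdf:Description> elements, no datatypes, no 
xml:lang, no nested descriptions), not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def read_triples(xml_text):
    """Read the restricted RDF/XML form back into a set of triples.

    Terms are returned as ("uri", value), ("bnode", label) or
    ("lit", text); predicates are full URI strings.
    """
    root = ET.fromstring(xml_text)
    triples = set()
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        # The subject is named by rdf:about (URI) or rdf:nodeID (blank node).
        about = desc.get(f"{{{RDF_NS}}}about")
        if about is not None:
            subject = ("uri", about)
        else:
            subject = ("bnode", desc.get(f"{{{RDF_NS}}}nodeID"))

        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local".
            ns, local = prop.tag[1:].split("}", 1)
            resource = prop.get(f"{{{RDF_NS}}}resource")
            node_id = prop.get(f"{{{RDF_NS}}}nodeID")
            if resource is not None:
                obj = ("uri", resource)
            elif node_id is not None:
                obj = ("bnode", node_id)
            else:
                obj = ("lit", prop.text or "")
            triples.add((subject, ns + local, obj))
    return triples
```

Since the result is a set, two files that differ only in the order of 
their elements read back as equal, so comparing read_triples() of the 
output against the input graph checks that nothing was lost or changed 
on the way out.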


What do you think: is this a sensible approach? (Can it serialize 
everything that can be serialized in RDF/XML? I think so.)

Thanks,
- Benja
Received on Saturday, 28 June 2003 22:37:15 GMT