- From: Benja Fallenstein <b.fallenstein@gmx.de>
- Date: Sun, 29 Jun 2003 04:35:58 +0200
- To: rdf-i <www-rdf-interest@w3.org>
Hi all,
I've been thinking about how to "canonicalize" the RDF/XML syntax, so
that the same graph (with namespaces and anonymous nodes labeled the
same way) always produces the same output file. A major application
would be to interact well with textual 'diff'/'merge' tools and
versioning systems like CVS: if the RDF is formatted differently on
every save, these tools lose their value.
Does anybody know whether there are proposals/implementations for
something like this already?
My idea is to use the following rules:
- All triples with the same subject are collected in a single
<rdf:Description> element (which is a child of the <rdf:RDF> element).
Each <rdf:Description> has an rdf:about or rdf:nodeID attribute.
- A triple "a x:prop b" is represented as <x:prop rdf:resource="b"/>
inside the <rdf:Description> of a. Triples with literal values are
handled similarly, with the literal as the content of the property
element. Blank node values are identified through rdf:nodeID.
- The <rdf:Description> elements are ordered by subject.
- The property elements inside an <rdf:Description> are ordered first by
property, then by object of the triple.
- Each <rdf:Description> and </rdf:Description> is on its own line, not
indented. Each property element is on its own, single line (except for
multiline literals), indented two spaces.
- All namespace declarations are on the <rdf:RDF> element.
- Canonical XML is applied.
For example, the following graph:
<http://example.org/DOC/12> dc:author _:lucia
<http://example.org/DOC/12> dc:title "Kitchen Can Openers (II)"
<http://example.org/DOC/24> dc:author _:lucia
<http://example.org/DOC/24> dc:title "About Frogs"
_:lucia rdf:type ex:Person
_:lucia ex:age "27"
would be serialized like this:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.org/stuff/1.0/">
<rdf:Description rdf:nodeID="lucia">
  <ex:age>27</ex:age>
  <rdf:type rdf:resource="http://www.example.org/stuff/1.0/Person"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/DOC/12">
  <dc:author rdf:nodeID="lucia"/>
  <dc:title>Kitchen Can Openers (II)</dc:title>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/DOC/24">
  <dc:author rdf:nodeID="lucia"/>
  <dc:title>About Frogs</dc:title>
</rdf:Description>
</rdf:RDF>
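As a rough illustration, the rules above could be sketched in Python
along the following lines. This is not an existing implementation: the
(kind, value) representation of terms, the function names, and the
choice to sort blank nodes before literals before URIs are all
assumptions made for this sketch, and the final Canonical XML step is
left out.

```python
from xml.sax.saxutils import escape, quoteattr

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumption of this sketch: a term is a (kind, value) pair where kind is
# "uri", "bnode" or "lit". Sorting such pairs puts blank nodes before
# literals before URIs, which is an arbitrary but fixed order.


def qname(uri, namespaces):
    """Abbreviate a property URI with the longest matching declared namespace."""
    matches = [(len(ns), prefix) for prefix, ns in namespaces.items()
               if uri.startswith(ns) and len(uri) > len(ns)]
    if not matches:
        raise ValueError("no namespace declared for " + uri)
    length, prefix = max(matches)
    return prefix + ":" + uri[length:]


def property_line(pred, obj, namespaces):
    """One property element on a single line, indented two spaces."""
    name = qname(pred, namespaces)
    kind, value = obj
    if kind == "uri":
        return "  <%s rdf:resource=%s/>" % (name, quoteattr(value))
    if kind == "bnode":
        return "  <%s rdf:nodeID=%s/>" % (name, quoteattr(value))
    return "  <%s>%s</%s>" % (name, escape(value), name)


def serialize(triples, namespaces):
    """Serialize (subject, predicate URI, object) triples deterministically."""
    # Declaration order is taken from the caller here; a real canonical
    # form would have to fix this order as well (e.g. sort by prefix).
    decls = ["xmlns:%s=%s" % (p, quoteattr(ns)) for p, ns in namespaces.items()]
    lines = ['<?xml version="1.0"?>',
             "<rdf:RDF " + "\n         ".join(decls) + ">"]
    # Collect all triples with the same subject (duplicates are dropped).
    by_subject = {}
    for s, p, o in set(triples):
        by_subject.setdefault(s, []).append((p, o))
    # Order descriptions by subject, properties by predicate, then object.
    for subject in sorted(by_subject):
        kind, value = subject
        attr = "rdf:nodeID" if kind == "bnode" else "rdf:about"
        lines.append("<rdf:Description %s=%s>" % (attr, quoteattr(value)))
        for pred, obj in sorted(by_subject[subject]):
            lines.append(property_line(pred, obj, namespaces))
        lines.append("</rdf:Description>")
    lines.append("</rdf:RDF>")
    return "\n".join(lines) + "\n"


# The example graph from this message.
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.org/stuff/1.0/"
namespaces = {"rdf": RDF, "dc": DC, "ex": EX}
doc12 = ("uri", "http://example.org/DOC/12")
doc24 = ("uri", "http://example.org/DOC/24")
lucia = ("bnode", "lucia")
triples = [
    (doc12, DC + "author", lucia),
    (doc12, DC + "title", ("lit", "Kitchen Can Openers (II)")),
    (doc24, DC + "author", lucia),
    (doc24, DC + "title", ("lit", "About Frogs")),
    (lucia, RDF + "type", ("uri", EX + "Person")),
    (lucia, EX + "age", ("lit", "27")),
]
output = serialize(triples, namespaces)
print(output, end="")
```

Because the triples are grouped and sorted before anything is written,
feeding the same graph in a different order gives the same text, which
is the property that matters for 'diff' and CVS. The sketch does not
check that a property URI can actually be split into a legal XML
element name, and it ignores datatypes and xml:lang on literals.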
What do you think: is this a sensible approach? (Can it serialize
every graph that can be serialized in RDF/XML? I think so.)
Thanks,
- Benja
Received on Saturday, 28 June 2003 22:37:15 UTC