- From: Benja Fallenstein <b.fallenstein@gmx.de>
- Date: Sun, 29 Jun 2003 04:35:58 +0200
- To: rdf-i <www-rdf-interest@w3.org>
Hi all,

I've been thinking about how to "canonicalize" the RDF/XML syntax, so that the same graph (with namespaces/anonymous nodes labeled the same way) always produces the same output file. A major application would be to interact well with textual 'diff'/'merge' and versioning systems like CVS -- if the RDF is formatted differently on every save, these tools lose their value.

Does anybody know whether there are proposals/implementations for something like this already?

My idea is to use the following rules:

- All triples with the same subject are collected in a single <rdf:Description> element (which is a child of the <rdf:RDF> element). Each <rdf:Description> has an rdf:about or rdf:nodeID attribute.
- A triple "a x:prop b" is represented as <x:prop rdf:resource="b"/> inside the <rdf:Description> of a. Triples with literal values are handled similarly. Blank node values are identified through rdf:nodeID.
- The <rdf:Description> elements are ordered by subject.
- The property elements inside an <rdf:Description> are ordered first by property, then by object of the triple.
- Each <rdf:Description> and </rdf:Description> is on its own line, not indented. Each property element is on its own, single line (except for multiline literals), indented two spaces.
- All namespace declarations are on the <rdf:RDF> element.
- Canonical XML is applied.
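The rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an existing implementation: the tuple representation of nodes, the helper names (attr, qname, serialize), sorting blank nodes before URIs, comparing properties by full URI, and sorting namespace declarations by prefix are all choices made for the sketch. It also skips the final Canonical XML step and assumes every property URI starts with one of the declared namespaces.

```python
from xml.sax.saxutils import escape

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def attr(value):
    """Escape an attribute value and wrap it in double quotes."""
    return '"' + escape(value, {'"': "&quot;"}) + '"'


def qname(uri, namespaces):
    """Abbreviate a property URI with the longest matching namespace.

    Raises ValueError if no declared namespace matches (sketch assumption).
    """
    best = max((p for p, ns in namespaces.items() if uri.startswith(ns)),
               key=lambda p: len(namespaces[p]))
    return best + ":" + uri[len(namespaces[best]):]


def serialize(triples, namespaces):
    """Serialize triples to RDF/XML in a fixed, order-independent layout.

    Hypothetical data model: subjects are ("uri", value) or ("bnode", label),
    properties are full URI strings, objects are ("uri", value),
    ("bnode", label) or ("literal", text). `namespaces` maps prefix to
    namespace URI and must include the "rdf" prefix.
    """
    # Collect all triples with the same subject; duplicates are dropped.
    by_subject = {}
    for s, p, o in set(triples):
        by_subject.setdefault(s, []).append((p, o))

    # All namespace declarations go on the <rdf:RDF> element.
    decls = " ".join("xmlns:%s=%s" % (prefix, attr(ns))
                     for prefix, ns in sorted(namespaces.items()))
    lines = ['<?xml version="1.0"?>', "<rdf:RDF " + decls + ">"]

    # Descriptions are ordered by subject (blank nodes sort before URIs here).
    for subject in sorted(by_subject):
        kind, value = subject
        name = "rdf:about" if kind == "uri" else "rdf:nodeID"
        lines.append("<rdf:Description %s=%s>" % (name, attr(value)))
        # Property elements are ordered by property, then by object.
        for prop, (okind, ovalue) in sorted(by_subject[subject]):
            tag = qname(prop, namespaces)
            if okind == "literal":
                lines.append("  <%s>%s</%s>" % (tag, escape(ovalue), tag))
            else:
                ref = "rdf:resource" if okind == "uri" else "rdf:nodeID"
                lines.append("  <%s %s=%s/>" % (tag, ref, attr(ovalue)))
        lines.append("</rdf:Description>")

    lines.append("</rdf:RDF>")
    return "\n".join(lines) + "\n"
```

Because the triples are grouped and sorted before anything is written, the order in which they are passed in does not affect the output, which is the property that matters for diff and CVS.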

For example, the following graph:

<http://example.org/DOC/12> dc:author _:lucia
<http://example.org/DOC/12> dc:title "Kitchen Can Openers (II)"
<http://example.org/DOC/24> dc:author _:lucia
<http://example.org/DOC/24> dc:title "About Frogs"
_:lucia rdf:type ex:Person
_:lucia ex:age "27"

would be serialized like this:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.org/stuff/1.0/">
<rdf:Description rdf:nodeID="lucia">
  <ex:age>27</ex:age>
  <rdf:type rdf:resource="http://example.org/stuff/1.0/Person"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/DOC/12">
  <dc:author rdf:nodeID="lucia"/>
  <dc:title>Kitchen Can Openers (II)</dc:title>
</rdf:Description>
<rdf:Description rdf:about="http://example.org/DOC/24">
  <dc:author rdf:nodeID="lucia"/>
  <dc:title>About Frogs</dc:title>
</rdf:Description>
</rdf:RDF>

What do you think, is this a sensible approach? (Can it serialize everything that can be serialized in RDF/XML? -- I think so.)

Thanks,
- Benja

Received on Saturday, 28 June 2003 22:37:15 UTC