- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Thu, 23 Aug 2001 20:04:17 +0200
- To: <w3c-rdfcore-wg@w3.org>
- Cc: <www-xml-fragment-comments@w3.org>
I am unconvinced that fragments address RDF needs. While the context stuff is relevant, there is no canonicalisation. Fragments are represented as XML document(s), which approximate to text strings. Hence the following tests are distinguishable.

All should be processed using the same base URL (if this offends, globally replace ":Description" with ":Description rdf:about='http://example.org/parseTypeEqualsLiteral'", where rdf is the appropriately bound namespace).

test0001 shows that an empty element expressed as one tag is distinguishable from one expressed as two;
test0002 shows that attribute order matters;
test0002c shows that whitespace within a tag matters;
test0003 shows that comments are not stripped;
test0004 shows that namespace bindings are relevant (the attribute meaning may be changed by the choice of prefix for the RDF namespace!);
test0005 shows that all namespace bindings are significant, even though neither binding is referred to in the literal.

=== test0001a.rdf
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description>
    <rdf:value rdf:parseType="Literal" >
      <foo></foo>
    </rdf:value>
  </rdf:Description>
</rdf:RDF>

=== test0001b.rdf
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description>
    <rdf:value rdf:parseType="Literal" >
      <foo/>
    </rdf:value>
  </rdf:Description>
</rdf:RDF>

=== test0002a.rdf
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description>
    <rdf:value rdf:parseType="Literal" >
      <foo a="a" b="b"/>
    </rdf:value>
  </rdf:Description>
</rdf:RDF>

=== test0002b.rdf
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description>
    <rdf:value rdf:parseType="Literal" >
      <foo b="b" a="a"/>
    </rdf:value>
  </rdf:Description>
</rdf:RDF>

=== test0002c.rdf
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description>
    <rdf:value rdf:parseType="Literal" >
      <foo a="a" b="b"/>
    </rdf:value>
  </rdf:Description>
</rdf:RDF>

=== test0003a.rdf
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description>
    <rdf:value rdf:parseType="Literal" >
      <foo></foo>
    </rdf:value>
  </rdf:Description>
</rdf:RDF>

=== test0003b.rdf
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description>
    <rdf:value rdf:parseType="Literal" >
      <foo><!-- a comment --></foo>
    </rdf:value>
  </rdf:Description>
</rdf:RDF>

=== test0004a.rdf
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description>
    <rdf:value rdf:parseType="Literal" >
      <foo a="x:b"></foo>
    </rdf:value>
  </rdf:Description>
</rdf:RDF>

=== test0004b.rdf
<x:RDF xmlns:x="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <x:Description>
    <x:value x:parseType="Literal" >
      <foo a="x:b"></foo>
    </x:value>
  </x:Description>
</x:RDF>

=== test0005a.rdf
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description>
    <rdf:value rdf:parseType="Literal" >
      <foo></foo>
    </rdf:value>
  </rdf:Description>
</rdf:RDF>

=== test0005b.rdf
<x:RDF xmlns:x="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <x:Description>
    <x:value x:parseType="Literal" >
      <foo></foo>
    </x:value>
  </x:Description>
</x:RDF>

=========== End of tests.

My belief is that to progress the literal representation issue we need first to consider test cases like these. We can consider them against a number of proposals, e.g.

STRING: the literal is represented precisely by the string in the source document.

FRAG: the literal is represented by a (to be defined) representation conformant with the XML Fragment Interchange specification.

CANON: the literal is represented by the XML Canonicalisation of the string in the source document.

INFOSET: the string is represented by something from which a string can be derived which, when inserted into the source document in place of the original string, leaves the XML Infoset of the source document unchanged.

NODESET: the string is represented by something from which a string can be derived which, when inserted into the source document in place of the original string, leaves the XPath nodeset of the source document unchanged.

CANONINFO: the string is represented by a canonical representation of the infoset of the original string. Note: defining such a representation is quite hard and has not been done.

We note that both STRING and FRAG are special cases of INFOSET, and CANON is a special case of NODESET.

The truth table for the tests above is as follows:

            1    2    3    4    5
STRING      f    f    f    t    t
FRAG        f    f    f    f    f
CANON       t    t    f    f    f
INFOSET     -    -    f    t*   t*
NODESET     -    -    f    -    -
CANONINFO   t    t    f    t*   t*

I am unsure about the four starred entries. A t shows that the test data produces the same model, an f shows that the test data produces different models, and a - means that implementations may produce either result.

If this is seen as a positive way forward, I can produce some more examples in early September, in time for the RDF Core WG teleconference on Sept 7.

An argument against this approach is that the current M&S spec specifically excludes testing for equality on such XML literals; in my view, this is because that spec explicitly ducked doing these properly, and one of the clarifications we are expected to make would allow for equality testing.

As I see it, the heart of the problem is what some XML means. The answer is that it is application dependent, and we should not try to second-guess which parts of the Infoset the application will look at; but the application may not look at things outside the Infoset. However, it is plausible to take a well-defined subset of the Infoset, in particular a subset blessed by some other W3C WG (such as the XPath nodeset).

Jeremy Carroll
HP Labs Bristol
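As a rough illustration of the STRING and CANON rows of the truth table, the literal contents of tests 1 to 3 can be compared mechanically. This is only a sketch under stated assumptions: it uses Python's standard-library canonicalize(), which implements C14N 2.0 rather than the canonicalisation draft under discussion here; it keeps comments, to match the table's reading that comments are not stripped; and the helper names (PAIRS, same_under_string, same_under_canon, truth_rows) are invented for the example. Tests 4 and 5 are left out because they depend on the namespace bindings inherited from the enclosing document, which this fragment-only comparison does not model.

```python
from xml.etree.ElementTree import canonicalize

# Literal contents of the a/b variants of tests 1-3 (hypothetical helper table;
# only the <foo> element from each test file is compared here).
PAIRS = {
    "test0001": ("<foo></foo>", "<foo/>"),
    "test0002": ('<foo a="a" b="b"/>', '<foo b="b" a="a"/>'),
    "test0003": ("<foo></foo>", "<foo><!-- a comment --></foo>"),
}


def same_under_string(a: str, b: str) -> bool:
    """STRING: the literal is exactly the source text."""
    return a == b


def same_under_canon(a: str, b: str) -> bool:
    """CANON (approximated with C14N 2.0, comments retained: an assumption)."""
    return canonicalize(a, with_comments=True) == canonicalize(b, with_comments=True)


def truth_rows(pairs=PAIRS):
    """Return 't'/'f' marks per proposal, one mark per test pair."""
    def mark(same: bool) -> str:
        return "t" if same else "f"

    return {
        "STRING": [mark(same_under_string(a, b)) for a, b in pairs.values()],
        "CANON": [mark(same_under_canon(a, b)) for a, b in pairs.values()],
    }


if __name__ == "__main__":
    for proposal, marks in truth_rows().items():
        print(f"{proposal:<8}", " ".join(marks))
```

Under these assumptions the marks agree with columns 1 to 3 of the STRING and CANON rows; the other proposals would need an agreed representation before they could be checked this way.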
Received on Thursday, 23 August 2001 13:55:11 UTC