- From: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
- Date: Wed, 19 Sep 2001 15:18:22 +0100
- To: "dehora" <dehora@eircom.net>
- Cc: "'W3C Rdfcore'" <w3c-rdfcore-wg@w3.org>
At 01:17 PM 9/19/01 +0100, dehora wrote: > > ** 1st complication: some old parsers may misinterpret > > rdf:Resource and > > Literal; I think that results in loss of functionality rather than > > complete failure of interoperability, but it's still messy. > >In the sense that they'll treat rdf:Literal as a single string? My typo above. I meant to say "some old parsers may misinterpret rdf:Resource *as* Literal" [...] > > > > ** 2nd complication: what does this effect? Does it mean that the > > contained literal must be canonical XML [...]? > >Yes, I mean this. RDF-XML writers that want to mark RDF-XML Literals as >having a canonical parseType _must_ canonicize the Literal on >serialization. It's the writers job to make sense. The meaning of >parseType in my mind is a lexing/parsing clue not a transformation clue. Oh, I didn't take it that way. In which case, why bother? RDF writers can write canonical XML anyway if they so choose. I don't see the value in labelling it as such. >I could live with this, Jeremy has also suggested as much. But it seems >odd that we don't eat our own food. I agree with your sentiment, *but* we are stuck with the unadorned forms for compatibility purposes and there doesn't seem to be a clear migration path so I'm suggesting what seems to be an approach with slightly less room for confusion. > > I find point (3) is more difficult. Why do we want to > > canonicalize XML? I > > think the answer is to define a form of equivalence that can > > be tested by > > string comparison. I'd rather target the equivalence issue > > directly, and I > > think this is an issue separate from rdf:parseType since it > > also arises > > with non-XML literals (e.g. dates, numbers, etc.). Anyway, > > canonicalization doesn't completely solve the problem in the case of > > rdf:parseType="Resource", where the logical definition of > > equivalence would > > be two literals yielding the same RDF graph. > >[I don't consider whatever is at the end of parseType="Resource" to be a >literal with structure, it's expanded as RDF.] True. >Literals will require a canonical form, or if one prefers, a shared >abstraction of a literal, for two reasons I see: > >-equivalence operations as you point out. >-serialization and roundtripping. By "serialization and roundtripping" do you mean "roundtripping via serialization"? I think this matter of round-tripping is somewhat bound up with the question of equivalence. That is, when roundtripping from form A to form B and back to form A, one wants the result to be "equivalent" to the original. Which begs a definition of equivalence. I have a second concern about getting too insistent on roundtripping: in some cases, the alternative forms just doesn't convey all the same information -- some information is bound to be lost. (Again, a suitable definition of equivalence provides us with a way to test whether any information loss is significant to the purpose at hand.) [...] >My approach has been to look at Unicode rather than data structures >(though I'm aware that some people have made good arguments for literal >as structure) for the following reasons: I wasn't suggesting moving away from Unicode strings as the representation, just introducing concepts of equivalence that are not simple string equality. Simple example, for decimal numbers the following might be considered equivalent: "55" "055" "+55" though they're clearly different strings. This can be stated without dictating an alternative representation, though it does presume some syntactic structure and corresponding interpretation. (Maybe subsumed by the "qLiteral -> LV" mapping of the model theory?) [...] >With regard to stuff like dates and numbers, that's another issue again: >datatyping. I observe that if we allow namespaced parseTypes values, >people can simply slot in XML Schema simple types for ad hoc literal >processing. You get XML Schema pretty much for free this way. Again, I like the sentiment here. But RDF M&S requires that any literal with an rdf:parseType attrribute is well-formed XML. And I don't think a simple string like "55" qualifies. Maybe this is fixable? But if we try to fix this in the way you suggest, I think we have to relax the requirement that unrecognized rdf:parseType values must have well-formed XML content. #g ------------------------------------------------------------ Graham Klyne MIMEsweeper Group Strategic Research <http://www.mimesweeper.com> <Graham.Klyne@MIMEsweeper.com> ------------------------------------------------------------
Received on Wednesday, 19 September 2001 10:33:22 UTC