- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Thu, 3 Jul 2003 13:39:24 +0300
- To: <w3c-rdfcore-wg@w3.org>, "Brian_McBride" <bwm@hplb.hpl.hp.com>, <jjc@hplb.hpl.hp.com>, "ext pat hayes" <phayes@ihmc.us>
- Cc: "Martin Duerst" <duerst@w3.org>
----- Original Message -----
From: "ext pat hayes" <phayes@ihmc.us>
To: <w3c-rdfcore-wg@w3.org>; "Brian_McBride" <bwm@hplb.hpl.hp.com>; <jjc@hplb.hpl.hp.com>
Cc: "Martin Duerst" <duerst@w3.org>
Sent: 03 July, 2003 04:52
Subject: Re: Summary of strings, markup, and language tagging in RDF (resend)

> >The wrapper is one solution to carry the language information.
> >Of course you can choose whatever solution fits you best, but
> >you should not forget that there are other solutions. One of
> >them would be to handle XML Literals in the same way as plain
> >literals, carrying the language information separately. If that
> >can be done for plain literals, why can it not be done for
> >XML Literals?
>
> Martin puts his finger on the key point. It could be, but we chose a
> design for XML literals in which the XML 'label' is treated as a
> built-in datatype; which then puts a strong design constraint on us
> to treat it uniformly with the other datatypes, and that in turn
> requires either than it not have lang tags or that all other datatype
> namespaces have lang tags. The latter option is unworkable, so we
> chose the former.
>
> Since this issue seems to be so centrally important, and since our
> design now appears to people like Martin to be so completely
> brain-damaged,

I think that is perhaps a bit strong. It may not be ideal for folks
having an XML-centric view, but it's certainly not brain-damaged.
I think perspective counts for a lot in understanding the tradeoffs
inherent in the present solution. See my recent long post touching
on this.

> let me propose that we re-open this issue

Please, don't.

> and change
> our design slightly, by reverting to an older design. The trouble
> seems to arise from our insisting that XML literals are treated
> uniformly with typed literals: so let us abandon that idea, in spite
> of its being very neat,

It is much more than just being "very neat".
It has direct and substantial gains for distributed, metadata-driven,
modular content management, and it lays the foundation for full
support of XML Schema complex typing in RDF. It's not simply our
favorite color.

> and revert to the state where the XML
> literals as treated as a special syntactic case in the RDF graph, so
> that there would be five kinds of literal: plain and XML with and
> without lang tags, plus datatyped literals.

I agree that this would, in a way, be a bit closer to the original
M&S, but I don't consider the present solution to be contrary to
M&S, and it is a much more useful long-range solution.

Again, there are better ways to model language qualification than
xml:lang (even though at the expense of additional triples), and the
fact that lang tags for plain literals are invisible to generic
inference rules is IMO a far greater shortcoming of the final
solution than not having lang tags on XML literals. But that's
another (and probably needless, at this moment) discussion.

Literals are, after all, literals, so it seems to me to be pretty
shoddy engineering to allow the contextual characteristics of the
serialization syntax to infect *literals* in any way, shape, or form.
That includes plain literals.

IMO, the original M&S decision to allow xml:lang to apply at all,
insofar as the semantics of the RDF graph are concerned, is the real
mistake. That decision is of course understandable, I think,
considering the newness of both XML and RDF at the time and the lack
of a distinct MT, but it is regrettable, and we struggle with that
legacy.

At least, for now, that error is limited to plain literals. Let's
not make the impact of that error broader (again) by reintroducing
it into the treatment of XML literals.

> In detail, the proposal is as follows.
>
> 1. There are five kinds of literal in an RDF graph, indicated in
> Ntriples as follows:
> "string"                      plain
> "string"@tag                  plain plus lang tag
> "string"^^rdf:XMLLiteral      XML
> "string"@tag^^rdf:XMLLiteral  XML plus lang tag
> "string"^^foo:baz             typed, where foo:baz is any
>                               URI other than 'rdf:XMLLiteral'
>
> Notice that the Ntriples way of indicating the XML case is just as it
> is now, but thats just a syntactic decision to save work;
> rdf:XMLLiteral isn't a datatype and XML literals are not typed
> literals in this design, so the possibility of having lang tags in
> its lexical space isn't going to cause any headaches..

See my recent post about why I think this is less optimal than the
present solution, insofar as modular content management is concerned.

The problem with xml:lang has always been that it is intended for
the consumption of XML content by XML applications. The only
real-world XML application that operates on RDF/XML is an RDF
parser -- and for the *parser* the xml:lang scope for XML literals
is visible and relevant, BUT the parser is not required to convey to
the RDF graph all information that might be available regarding the
XML content.

The RDF spec specifies what in the XML instance is considered
relevant to RDF applications, and how that will be organized for RDF
applications, and we are free to discard whatever we like in the
XML, so long as we are clear about what we are doing.

Our specs are very clear about the non-relevance of xml:lang scope
on XML literals insofar as the semantics of *RDF* (not *XML*) are
concerned. XML semantics are relevant only to an RDF parser. End of
story. *RDF* users (as opposed to *XML* users) will be aware of this
(if they've read even the primer) and should act accordingly.

The only argument against this particular solution that may be valid
is that XML users who have no clue about RDF might be confused about
the non-relevance of xml:lang for certain RDF constructs, namely
typed literals and thus, XML literals.
Well, they can grab the specs and learn a bit about the tool they
are using. At the risk of offending, and apologies in advance, this
really is a case of RTFM.

We are not violating the XML specs by disregarding xml:lang in the
way we do. In fact, we are simply making mandatory what was
otherwise optional for RDF applications in M&S: that the xml:lang
scope could be disregarded.

And the decisions are not residue from the WG process. There are
solid reasons for modelling XML Literals as typed literals.

> ...
>
> 4. Regarding Martin's other beef, that some XML without any markup in
> it is 'really' just plain text,

I'm not 100% sure that this is in fact Martin's position, but if it
is, or is anyone else's position, then my reply is that this is
simply wrong.

The difference between a plain literal and an XML literal,
regardless of the presence of markup, is that an RDF application is
free to presume that an XML literal constitutes well-formed XML,
whereas a plain literal need not. Period. It's as simple as that.

An XML literal is a string that conforms to XML well-formedness
conditions. Furthermore, the comparison of XML literals is not based
on string equality. These important distinctions from plain strings
are captured semantically in the definition of the datatype
rdf:XMLLiteral and in the RDF datatyping model.

Though this is not pointed out anywhere explicitly in the RDF specs
(which is a shame, but understandable, since it needs a bit more
testing to ensure there are no major dragons), the present RDF
datatyping solution, with XML Literals modelled as a datatype,
allows us to support the entire range of XML Schema types, including
complex types! We can thereby define property ranges to be e.g.
xhtml:title, asserting that all property values conform to the
content model constraining the lexical space of xhtml:title elements.
Such benefits simply disappear if XML Literals are not modeled as
typed literals -- and we certainly don't want to go back to treating
rdf:XMLLiteral as a special case of datatype with lang tag.

Again, the WG has chosen between many "least of all evils" or "best
of all options" and IMO has chosen well. We can't make everyone
happy, even if we very much want to, and I also think that the
dissatisfaction of some regarding the present solution is mostly a
matter of perspective, or of perception of the relationship between
the RDF graph and its XML representation, and not of actual
technical or practical shortcomings in the solution itself.

While I'm very sympathetic to Martin's concerns about consistency
and agreement between standards and the risk of misunderstanding to
XML-only users, I have not noted any problems with the present
solution that I would consider show stoppers.

I feel that this issue should remain closed, and that we should
wrap up.

Regards,

Patrick

--
Patrick Stickler
Nokia, Finland
patrick.stickler@nokia.com
Received on Thursday, 3 July 2003 06:40:07 UTC
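
The message above distinguishes plain literals from XML literals on two points: an XML literal may be presumed well-formed, and XML literals are not compared by string equality. The following is only a rough sketch of those two points, not anything taken from the RDF specs or from an RDF toolkit. The helper names and the `wrapper` element are hypothetical, and Python's C14N 2.0 canonicalizer (`xml.etree.ElementTree.canonicalize`, Python 3.8 or later) is used here merely as a stand-in for whatever canonical form the specs prescribe.

```python
import xml.etree.ElementTree as ET

# Hypothetical element name, used only so that a fragment of XML content
# can be handed to a parser that expects a single root element.
WRAPPER = "wrapper"


def _wrap(fragment: str) -> str:
    """Enclose XML content in a dummy root element."""
    return f"<{WRAPPER}>{fragment}</{WRAPPER}>"


def is_well_formed_fragment(fragment: str) -> bool:
    """Return True if the text parses as well-formed XML element content."""
    try:
        ET.fromstring(_wrap(fragment))
        return True
    except ET.ParseError:
        return False


def xml_literals_equal(a: str, b: str) -> bool:
    """Compare two XML fragments by canonical form, not character by character.

    Assumption: C14N 2.0 stands in for the canonicalization an RDF
    implementation would actually apply to rdf:XMLLiteral values.
    """
    return ET.canonicalize(_wrap(a)) == ET.canonicalize(_wrap(b))


if __name__ == "__main__":
    first = '<b class="x" id="y">hi</b>'
    second = "<b id='y' class='x'>hi</b>"
    # Plain-literal style comparison: the two strings differ.
    print(first == second)
    # XML-literal style comparison: attribute order and quoting are ignored.
    print(xml_literals_equal(first, second))
    # A plain literal may hold text that is not well-formed XML.
    print(is_well_formed_fragment("<b>unclosed"))
```

Under these assumptions, two strings that differ only in attribute order or quote style are distinct as plain literals but compare equal as XML content, while a string such as `<b>unclosed` is acceptable as a plain literal and fails the well-formedness check.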