- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Thu, 3 Jul 2003 12:29:41 +0300
- To: "ext Brian McBride" <bwm@hplb.hpl.hp.com>, "Martin Duerst" <duerst@w3.org>
- Cc: "Graham Klyne" <gk@ninebynine.org>, "Dan Connolly" <connolly@w3.org>, <w3c-i18n-ig@w3.org>, "Ralph R. Swick" <swick@w3.org>, <misha.wolf@reuters.com>, "Tim Berners-Lee" <timbl@w3.org>, "rdf core" <w3c-rdfcore-wg@w3.org>
----- Original Message ----- From: "ext Brian McBride" <bwm@hplb.hpl.hp.com> To: "Martin Duerst" <duerst@w3.org> Cc: "Graham Klyne" <gk@ninebynine.org>; "Dan Connolly" <connolly@w3.org>; <w3c-i18n-ig@w3.org>; "Ralph R. Swick" <swick@w3.org>; <misha.wolf@reuters.com>; "Tim Berners-Lee" <timbl@w3.org>; "rdf core" <w3c-rdfcore-wg@w3.org> Sent: 02 July, 2003 12:30 Subject: Re: Summary of strings, markup, and language tagging in RDF (resend) > RDFCore considered a number of options, but the relevant ones were that > the object of the statement should either be one of the following xml > fragments: Martin, et. al, In the hopes of offering something useful to this discussion, I'll summarize my own thoughts on this issue, which I'll do with the caveat (or possibly benefit ;-) of just having a month long vacation from all this stuff: It may be useful to consider the following viewpoint regarding the significance of xml:lang in the processing of RDF/XML. RDF/XML is an intermediate tool for the interchange of knowledge between disparate systems, not for the direct modelling and consumption of knowledge. RDF/XML could be seen as providing instructions of a sort for how to construct an RDF graph. It is the resultant graph that matters to RDF applications, not the RDF/XML serialization. XML is not the "model of consumption" for RDF knowledge, and as such, many of the constraints and conventions relevant for the consumption of XML documents where the XML encoding *is* the model of consumption simply do not apply to RDF applications. The model of consumption for RDF knowledge is the RDF graph, not the RDF/XML serialization. Alot of the discussion that I've reviewed since returning from vacation seems to center around a presumed requirement that RDF applications must interpret RDF/XML according to the same conventions as XML applications, and that if there is a consistent means in XML to deal with language qualification of both plain text content and marked up content, then RDF applications must also follow such conventions. This is simply not the case. RDF applications do not interpret RDF/XML at all. Only RDF parsers do so. RDF applications operate on an RDF graph, not on XML. It is clearly specified what parts of an RDF/XML serialization are relevant to, and are reflected in an RDF graph. If some parts of the XML serialization, such as xml:lang scoping, or qualified names, or namespace declarations, or such are not reflected in the RDF graph, that is not "wrong". It is not "wrong" that RDF users would qualify their plain literals for language in one way and yet qualify XML literals for language in a different way. The role of RDF/XML is different from the role of most other XML serializations where language is relevant There are also practical reasons why XML literals should, by default, not be infected with the language scope defined elsewhere in an RDF/XML instance -- primarily because RDF is an ideal tool for modular content management which also has powerful mechanisms for dynamic syndication of knowledge which may be interchanged (via RDF/XML) intensively during processing and the risk for inadvertantly and erroneously infecting XML literals during complex combinations of syndication, interchange serialization, and reserialization can result in huge headaches for both users and developers. I would in fact go so far as to assert that the infection of xml:lang values on plain literals is nearly as bad, though the risk for inadvertant infection is far less in their case; yet since the treatment of xml:lang attributes with regards to plain literals is fairly clear in M&S in contrast to XML literals, it would be too great an affront to the charter to remove the affect of xml:lang entirely from the mapping of RDF/XML to the RDF graph. So it remains. Though it should be noted that there are other (and IMO better) ways to model language scoping than xml:lang tags. In the future, I wouldn't be surprised to see the use of xml:lang in RDF/XML become deprecated entirely, at least in practice if not in future specs (but I digress...) The WG, after *alot* of discussion, decided to model XML literals using an RDF datatype. There are many solid reasons for modelling XML literals as typed literals. Two of the most prevalent are (1) it allows us to define the relation between lexical forms of expression with canonical forms of comparison in an elegant and precise manner, and (2) it provides a solid foundation for the subsequent support of XML Schema complex types as subclasses of rdf:XMLLiteral. The original solution allowed for XML literals to be special and have an associate datatype. But there were dragons... (particularly green and red ones, though also a few blue ones here and there... or was that the tequila...) The WG then decided, based on yet more discussion, and clear community feedback, to treat XML literals as a normal datatype, without lang tag, killing a whole lot of dragons in the MT and elsewhere (ahhh... what a barbaque that was...). The cost for this decision was that content producers would not be able to have global language definitions for all XML literals within an RDF/XML instance. Well, a similar situation exists for datatyping as well, where one cannot have global datatyping definitions for typed literals. That seems to be now then a consistent characteristic of RDF/XML. That one must be explicit about what one means to say. All in all, that's probably a very good thing. Also, consider again my comments above about the inherent problems in global infection of language on XML literals in the context of processing involving a high degree of syndication and interchange between heterogenous agents (the expected norm on the SW). In that regard, the cost for this decision is actually a benefit. And, harking back to discussions in Cannes, there is not really any technical difference between ignoring lang tags for typed literals than for ignoring them for XML literals. After all, XML Schema defines things such as names, tokens, etc. all of which usually contain linguistic data. In the end, it's all about tradeoffs. The benefits derived for modelling XML literals as typed literals, and the benefits of having a consistent datatyping model without any special cases, and the benefits for safe modular content management in a distributed SW environment far outweight IMO any loss of convenience or inconsistency from the general XML world from xml:lang attributes not applying to XML literals. While I am myself a vocal proponent for consistency and unity between standards, and appreciate the fact that the RDF treatment of xml:lang with regards to XML literal content is not fully in sync with the XML world, I think that the present design is more than sufficiently grounded in the real-world (and unique) needs of the RDF community to justify such a departure. And content providers are still able to express language qualification of all XML content in XML literals. We are not preventing the rich and powerful use of language tagging in XML content. We're simply drawing the line about where it can be done a bit more tightly than is done for more traditional XML serializations, but for very good reasons. In these matters, the WG has faced many "least of all evils" decisions, and no solution will be perfect. Cest la vie. I hope that my comments above will serve to help all those who have been dissatisfied with the present solution at least come to a point where they can live with it. We really do need to get this stuff done, and what we have now is a pretty damn good solution, all things considered. Cheers, Patrick -- Patrick Stickler Nokia, Finland patrick.stickler@nokia.com
Received on Thursday, 3 July 2003 06:06:00 UTC