- From: Brian McBride <bwm@hplb.hpl.hp.com>
- Date: 08 Jul 2003 18:48:35 +0100
- To: w3c-i18n-ig@w3.org, w3c-rdfcore-wg@w3.org
I first sent this out on Sunday, but omitted to include the i18n distribution list - sorry folks. Here is a version with minor updates based on feedback from RDFCore.

Updating the arguments as I've understood them ...

[...]

>
> So, if I can try to summarize (at least my understanding) of the details
> you gave (I've included some detailed comments amongst your text below,
> but they are not greatly relevant, I think).
>
> - users familiar with XML will be surprised that the lang tag does not
>   affect an xml literal
> - users will be confused that plain literals are treated differently
>   from XML literals
> - the common case is that the user wishes an enclosing lang tag to
>   apply to an xml literal, so why burden the user with duplicating the
>   information
> - not all XML languages have neutral elements such as <span> that can
>   be added to hold extra type information
> - conversion from other XML languages to RDF/XML will require more
>   complex code

Martin wished to add a concern that RDFCore has exceeded its charter in changing XML literals as defined in M&S.

I have had another attempt at stating a rationale for the current design, based on what Patrick, Pat, Jeremy et al have written recently, and some thoughts of my own. I'm just trying to capture RDFCore thinking in one place, so as usual, please correct, amend, clarify etc.

1. For RDF, its abstract syntax, i.e. the graph, is its primary representation. RDF/XML is a concrete syntax for representing graphs, i.e. from an RDF perspective, the goal is to figure out how best to represent graphs in RDF/XML, not how to represent RDF/XML in graphs. The typical use case for XML literals is where an XML literal will be written into the middle of an XML document by an application. This is simplest if the XML literal is a standalone fragment that can simply be written into the XML document.

2. RDFCore agrees with feedback it received that building an XML specific mechanism into its core model is architecturally inappropriate - it mixes things that should be independent. Accepting this implies that parseType="Literal" values must use one of the existing mechanisms, i.e. either plain literals or typed literals, or that a new, more general mechanism must be invented, e.g. a new triple structure. An XML specific mechanism is undesirable.

3. For the common use case, where applications embed a literal in an XML document, it is preferable to distinguish, in the graph, between plain and XML literals, so that e.g. different escaping conventions can be applied.

4. Taking the datatype approach creates the opportunity for future applications to subclass the datatype XMLLiteral, so that the value of a property may be restricted to a specific form of XML literal, possibly specified using XML Schema.

5. The equality rules are different for plain and XML literals. "<eg:prop eg:a='a' eg:b='b'/>" and "<eg:prop eg:b='b' eg:a='a'/>" are different plain literals, but equal XML literals.

6. The notion that the literal in the RDF/XML fragment below

     <eg:prop xml:lang="en" rdf:parseType="Literal">
       <span xml:lang="fr">chat</span>
     </eg:prop>

   contains the English string "chat" as a substring seems bizarre.

Points 2, 3, 4, 5 and 6 argue for using the datatyping mechanism to represent XML literals.

7. The XSD datatyping model does not support the notion that the value of a literal is affected by a language tag. RDFCore's attempts to introduce this notion caused considerable complexity and difficulty in the model theory and met with strong negative feedback. Thus, if language is to affect the value of an XML literal, it must be part of the members of the lexical space of the datatype. This can be accomplished by the parser generating a wrapper element to hold the lang tag.

8. The generation of a wrapper element is undesirable for the following reasons:

   - it is unhelpful in a primary use case where one wants to simply embed the literal in another XML document - the application has to get rid of the wrapper element, and find another enclosing element on which to hang the lang tag
   - implementation complexity in general, caused by introducing and removing the wrapper element
   - the value of a property cannot be an arbitrary XML fragment - it must always have an outer wrapper
   - the user may be surprised that the XML fragment is not identical to the one represented in the RDF/XML, e.g. XPath expressions won't work as expected

Thus we are left with the current RDFCore proposal.

The practical experience of WG members suggests that thinking of parseType="Literal" values as isolated fragments of XML that do not inherit language from their context, i.e. the current RDFCore design, is appropriate in practice.

It has also been suggested that it is easier to integrate data from different sources when xml:lang is not inherited from context. I am finding that one hard to follow, unless the integration is being done cut and paste by hand, since it is easy to always put an xml:lang="" next to every rdf:parseType="Literal" to ensure language isolation. Perhaps someone can provide an example to show the advantage.

Martin:

- leaving aside whether you agree with the value judgements it makes, would you accept that the above represents close to a coherent rationale for the current RDFCore proposal?
- do you find it at all persuasive?

Brian
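
For anyone who wants to see the equality rule in point 5 concretely, here is a minimal sketch. It is an illustration only, not RDFCore's definition: it assumes Python's standard-library C14N 2.0 canonicalizer as a stand-in for whatever canonical form the XMLLiteral datatype ends up specifying, and the eg: namespace URI is a hypothetical one added so the fragments parse on their own.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical namespace binding for the eg: prefix (not from the original
# message); it is needed only so each fragment is parseable in isolation.
NS = 'xmlns:eg="http://example.org/eg#"'

# The two literals from point 5, differing only in attribute order.
a = f"<eg:prop {NS} eg:a='a' eg:b='b'/>"
b = f"<eg:prop {NS} eg:b='b' eg:a='a'/>"

# Plain literals compare as character strings, so attribute order matters.
plain_equal = (a == b)

# XML literals compare after canonicalization, which fixes attribute order
# and quoting. C14N 2.0 is used here as an assumed stand-in.
xml_equal = (canonicalize(a) == canonicalize(b))

print(plain_equal)  # False
print(xml_equal)    # True
```

The same pair of strings is therefore two distinct values under plain-literal rules and one value under XML-literal rules, which is one reason for keeping the two kinds of literal distinguishable in the graph.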
Received on Tuesday, 8 July 2003 13:49:04 UTC