- From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Date: Mon, 26 Sep 2011 10:50:20 +0200
- To: public-rdf-wg@w3.org
All,

I'd like to discuss two options for language-tagged literals: option 2d (Richard's proposal) and option 4 (one datatype with a non-empty lexical space).

Richard's solution consists in adding a datatype with an empty lexical space and defining the abstract syntax and semantics of these literals in an ad hoc fashion. I don't have a problem with an empty lexical space as such, but I have issues with the proposal because it does not make things more uniform than before, except that tagged literals would now have a datatype.

- First, syntactically, tagged literals are an exception to standard typed literals, since it would be inconsistent to write anything of the form "xxx"^^rdf:LangString.
- Second, semantically, tagged literals are an exception too, since standard typed literals are normally interpreted according to the L2V mapping, which is empty in this case.
- Third, in SPARQL, the DATATYPE keyword would have to be redefined with an exception, since SPARQL currently says nothing about typed literals that cannot be written in the form "xxx"^^dt. I especially dislike it when the RDF working group imposes a change on another WG's specification, particularly at such a late stage.

In contrast, I find that making rdf:LangString a "normal" datatype with a non-empty lexical space makes everything more uniform (option 4).

- Syntactically, xxx@lll would simply be a shortcut for the abstract syntax "xxx@lll"^^rdf:LangString. The only exceptional feature would be that we recommend the concrete syntax xxx@lll, but we already make such an exception for "xxx"^^xsd:string, which we recommend writing as "xxx".
- Semantically, tagged literals would be interpreted as standard typed literals through the L2V mapping.
- In terms of the SPARQL specs, DATATYPE(xxx@lll) would be rdf:LangString, as SPARQL already requires without any change (since it is in fact the same as DATATYPE("xxx@lll"^^rdf:LangString)).
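To make option 4 concrete, here is a minimal Python sketch of how a tagged literal could be handled as an ordinary typed literal. It is only an illustration under assumptions of mine: the names TypedLiteral, langstring_l2v, from_concrete, value and datatype are hypothetical, and splitting the lexical form at the last "@" and lowercasing the tag are my own choices, not something any spec draft prescribes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

RDF_LANGSTRING = "rdf:LangString"  # datatype name as used in this message
XSD_STRING = "xsd:string"


@dataclass(frozen=True)
class TypedLiteral:
    """Abstract syntax of a typed literal: a lexical form plus a datatype."""
    lexical_form: str
    datatype: str


def langstring_l2v(lexical_form: str) -> Tuple[str, str]:
    """Hypothetical L2V mapping for option 4.

    Assumption: the lexical form is 'text@tag' and is split at the LAST '@',
    since a language tag cannot itself contain '@'. Lowercasing the tag is
    also an assumption made here for illustration.
    """
    text, sep, tag = lexical_form.rpartition("@")
    if not sep or not tag:
        raise ValueError(f"not in the lexical space: {lexical_form!r}")
    return (text, tag.lower())


# One L2V mapping per datatype; the tagged datatype is not a special case.
L2V: Dict[str, Callable[[str], object]] = {
    RDF_LANGSTRING: langstring_l2v,
    XSD_STRING: lambda s: s,
}


def from_concrete(text: str, tag: str) -> TypedLiteral:
    """The concrete syntax xxx@lll read as "xxx@lll"^^rdf:LangString."""
    return TypedLiteral(f"{text}@{tag}", RDF_LANGSTRING)


def value(lit: TypedLiteral) -> object:
    """Interpret any typed literal through its datatype's L2V mapping."""
    return L2V[lit.datatype](lit.lexical_form)


def datatype(lit: TypedLiteral) -> str:
    """DATATYPE needs no exception: every literal carries its datatype."""
    return lit.datatype
```

With this model, from_concrete("chat", "fr") is the same abstract literal as TypedLiteral("chat@fr", "rdf:LangString"), and both value and datatype treat it exactly like any other typed literal.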
Now, responding to a concern raised by Andy, who said that no matter which option is chosen, tagged literals are made "special" in some way [1]: I do not think so. Option 4 makes language tags part of the lexical form, such that the language must be accessed by parsing the literal. That is how information from a literal should always be accessed. For instance, the time zone, year, hour and date in xsd:dateTimeStamp are obtained by parsing the lexical form. The same goes for the exponent in xsd:float, and for any component of any typed literal. Why should it be different for language-tagged strings?

In terms of pure specs, I think option 4 is much more elegant and simple. However, I understand that there are practical issues with option 4: in SPARQL, STR(xxx@lll) should return "xxx@lll" instead of "xxx", unless an exception is added to SPARQL. I would not mind getting this, but I understand that it can be unpleasant to some people. Other problems exist with respect to APIs, and this may have consequences for existing implementations. I am not sure to what extent this would cause trouble.

All in all, I would not be against Richard's "minimal" proposal, since it does not imply dramatic changes and could be integrated quite smoothly, but I still have a strong preference for option 4.

[1] http://lists.w3.org/Archives/Public/public-rdf-wg/2011Sep/0041.html

--
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Monday, 26 September 2011 08:50:59 UTC