- From: Sandro Hawke <sandro@w3.org>
- Date: Tue, 27 Sep 2011 01:01:37 -0400
- To: Jan Wielemaker <J.Wielemaker@vu.nl>
- Cc: Pat Hayes <phayes@ihmc.us>, RDF Working Group WG <public-rdf-wg@w3.org>
Jan,

FYI I strongly agree with your intuitions here -- perhaps not
surprisingly given the many long hours I've spent happily coding with
SWI Prolog.

About two weeks ago, I was arguing this position using somewhat
different tactics in email; finally I spent an hour on the phone in
which folks -- mostly Andy and Gavin -- convinced me that while this
option (3a, giving us lang:en) may be architecturally appealing, there
are details that would require a lot of work to get right, to give us
something comfortable and sensible for users, and I gave up. Alas, the
only bit I remember right now is case sensitivity, that
"chat"@en="chat"@EN in SPARQL, but it's probably not practical to make
"chat"^^lang:en="chat"^^lang:EN in SPARQL.

This puts a real (if minor) problem for users up against an
architectural-purity argument, and I don't like to be on the side
against the users.

    -- Sandro

On Mon, 2011-09-26 at 22:09 +0200, Jan Wielemaker wrote:
> Hi Pat,
>
> On 09/26/2011 07:34 PM, Pat Hayes wrote:
> > Perhaps the best way to resolve this interminable debate would be to
> > start from the other end. Rather than implementors pointing out the
> > horribleness of various proposals, could we have a list of what
> > implementors would consider to be the least objectionable behavior? I
>
> I fear there is no single obvious consensus among implementors :-(
>
> > myself have no idea why "xxx@lll" is so very much worse than "xxx"
> > paired with the datatype langbase:tag, but I am quite willing to be
>
> Nice challenge. It strikes me as odd, which might be more of an
> intuition than science. We already have a two-dimensional space,
> consisting of a value and a datatype on one hand and a very similar
> two-dimensional space consisting of a value (string) and a language tag.
> Fortunately, this is not a three dimensional space, but just a two
> dimensional one because all language tagged strings are (implicitly) of
> type xsd:string (or some rdf:langString subtype).
> I.e., there is no such
> thing as "1.0"@en^^xsd:float vs. "1,0"@nl^^xsd:float (oops, forget that;
> I see TONS of WORMS ...).
>
> It is clear that we want to support operations mostly on the plain
> string value, such as search and comparison. That is, I don't want a
> search for @ to succeed on "foo"@en. Also, I don't want "foo"@en to be
> lexically smaller than "foo"@nl. So, from an implementation point of
> view, I probably want to maintain the two-dimensional space where the
> value ("foo") is separated from the tag [1]. It would make me very happy
> if the 2.67 dimensional space (value + datatype/language tag/nothing) is
> reduced to a simple two-dimensional space: value+URL. Changing "foo"
> into "foo"^^xsd:string is a good step here. Changing "foo"@en into
> "foo"^^lang:en would be a nice and consistent second step, putting all
> literals in a nice two-dimensional space without exceptions.
>
> In addition, I believe that having a URL for a language opens some
> nice opportunities to model relations between languages in RDF.
>
> > told that there is a consensus among implementors that this is so (or
> > whatever in fact is the consensus) and then I am sure I can design an
> > RDF modification which will realize that desired behavior and have a
> > reasonably coherent semantics.
> >
> > I would however observe that as tagged literals are exceptional, and
>
> In what sense exceptional? I think there are lots of use-cases where
> language tags play a vital role.
>
> > as we are proposing to make some kind of change to the existing spec,
> > that *some* amount of change to existing code might have to be
> > contemplated. If no changes are allowed at all to any existing
> > deployed code, the WG should just pack up now, define RDF2 to be the
> > same as RDF1 and declare its business done. We all have other things
> > to do, I am sure.
>
> That is far too sceptical to me :-) Just map @tag into ^^lang:tag and
> define some more mappings for related SPARQL constructs and I think that
> everything is much more orthogonal and simple. Yes, we will have infinite
> debates on the relations between @en, @en-US, @en-GB, etc., but we won't
> be able to resolve these anyway. Just declare these out of the scope of
> this working group. At least we provide whoever wants to model languages
> with URLs about which they can make statements.
>
> Cheers --- Jan
>
> [1] I'm part of the camp where operations on strings are considered
> both slow and dangerous ... Data processing systems should try to
> avoid looking into strings as much as possible.
>
> > Pat
> >
> > On Sep 26, 2011, at 4:51 AM, Jan Wielemaker wrote:
> >
> >> On 09/26/2011 11:28 AM, Richard Cyganiak wrote:
> >>> You understate the issues.
> >>>
> >>> Every existing application that uses the Literal.getLexicalForm()
> >>> call of some API to get at the xxx part of xxx@lll would have to
> >>> be changed, because the lexical form of xxx@lll is now xxx@lll.
> >>>
> >>> That's a complete non-starter.
> >>
> >> I fully agree. Also note that APIs for (notably in-core) RDF stores
> >> can now typically work on a single shared representation of the
> >> literal. If we add a tag to the literal many of the operations will
> >> have to create a copy without the tag. I'm not saying this cannot
> >> be solved, but I fear it will be neither natural nor pretty,
> >> especially for existing stores that did not anticipate this in
> >> their design phase.
> >>
> >> I must admit that I'm only following this from the sideline. As an
> >> implementor I'm starting to get worried about some wild ideas
> >> though. The solution I still like best is that foo@tag is the same
> >> as "foo"^^langbase:tag, where langbase is some to be decided prefix
> >> for language identifiers. Any implementation should be fairly
> >> comfortable with that (typically it will just simplify things).
> >>
> >> I understand things get complicated if we want to attach semantics
> >> to these datatypes, so I'd propose not to do that. Most likely
> >> others will make an attempt.
> >>
> >> Regards --- Jan
> >
> > ------------------------------------------------------------
> > IHMC                     (850)434 8903 or (650)494 3973
> > 40 South Alcaniz St.     (850)202 4416   office
> > Pensacola                (850)202 4440   fax
> > FL 32502                 (850)291 0667   mobile
> > phayesAT-SIGNihmc.us     http://www.ihmc.us/users/phayes
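
To make the proposal in this thread concrete, here is a minimal sketch
of a literal modelled as a value plus a datatype URL, with "foo"@en
mapped to "foo"^^lang:en. It is an illustration only, not code from any
poster or from any RDF store: the LANG_BASE prefix is hypothetical (the
thread leaves the prefix undecided), and lowercasing the tag is just one
assumed way to keep "chat"@en equal to "chat"@EN.

```python
from typing import NamedTuple, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
# Hypothetical prefix for language datatypes; the thread does not fix one.
LANG_BASE = "http://example.org/lang/"


class Literal(NamedTuple):
    """A literal in the two-dimensional space: value + datatype URL."""
    lexical: str
    datatype: str


def plain(value: str) -> Literal:
    # "foo" becomes "foo"^^xsd:string
    return Literal(value, XSD_STRING)


def lang_tagged(value: str, tag: str) -> Literal:
    # "foo"@en becomes "foo"^^lang:en.
    # Assumption: the tag is lowercased so that @en and @EN yield the
    # same datatype URL; the thread only notes that this is a problem.
    return Literal(value, LANG_BASE + tag.lower())


def language_of(lit: Literal) -> Optional[str]:
    # Recover the (normalized) language tag, or None for other datatypes.
    if lit.datatype.startswith(LANG_BASE):
        return lit.datatype[len(LANG_BASE):]
    return None
```

With this layout a search or comparison touches only the lexical field,
so a search for "@" does not match "foo"@en, and "foo"@en and "foo"@nl
carry the same lexical value.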
Received on Tuesday, 27 September 2011 05:01:51 UTC