- From: Jan Wielemaker <J.Wielemaker@vu.nl>
- Date: Tue, 27 Sep 2011 09:19:23 +0200
- To: Sandro Hawke <sandro@w3.org>
- CC: Pat Hayes <phayes@ihmc.us>, RDF Working Group WG <public-rdf-wg@w3.org>
Sandro,

On 09/27/2011 07:01 AM, Sandro Hawke wrote:
> Jan, FYI I strongly agree with your intuitions here -- perhaps not

:-) When I saw the first bits of this discussion I was under the
impression there would be a lot of debate, but in the end it would
land around the 3a proposal. I couldn't see it otherwise ... Seems I
was wrong :-(

> surprisingly given the many long hours I've spent happily coding with
> SWI Prolog. About two weeks ago, I was arguing this position using
> somewhat different tactics in email; finally I spent an hour on the
> phone in which folks -- mostly Andy and Gavin -- convinced me that while
> this option (3a, giving us lang:en) may be architecturally appealing,
> there are details that would require a lot of work to get right, to give
> us something comfortable and sensible for users, and I gave up. Alas,
> the only bit I remember right now is case sensitivity, that
> "chat"@en="chat"@EN in SPARQL, but it's probably not practical to make
> "chat"^^lang:en="chat"^^lang:EN in SPARQL. This puts a real (if minor)
> problem for users up against an architectural-purity argument, and I

Does it? AFAIK, XML language specifiers are indeed case insensitive, so
what is wrong with "chat"@EN --> "chat"^^lang:en? Canonicalizing cannot
be a bad idea.

> don't like to be on the side against the users.

As a user of a system where identity is in URIs and which provides a
powerful mechanism to say things about URIs, I would be disappointed to
see language identifiers (!) not being represented as URIs.

Could you, Andy and Gavin get the key counter arguments together?

--- Jan

>
> -- Sandro
>
>
> On Mon, 2011-09-26 at 22:09 +0200, Jan Wielemaker wrote:
>> Hi Pat,
>>
>> On 09/26/2011 07:34 PM, Pat Hayes wrote:
>>> Perhaps the best way to resolve this interminable debate would be to
>>> start from the other end.
>>> Rather than implementors pointing out the
>>> horribleness of various proposals, could we have a list of what
>>> implementors would consider to be the least objectionable behavior? I
>>
>> I fear there is no single obvious consensus among implementors :-(
>>
>>> myself have no idea why "xxx@lll" is so very much worse than "xxx"
>>> paired with the datatype langbase:tag, but I am quite willing to be
>>
>> Nice challenge. It strikes me as odd, which might be more of an
>> intuition than science. We already have a two-dimensional space,
>> consisting of a value and a datatype on one hand and a very similar
>> two-dimensional space consisting of a value (string) and a language tag.
>> Fortunately, this is not a three-dimensional space, but just a
>> two-dimensional one because all language tagged strings are (implicitly)
>> of type xsd:string (or some rdf:langString subtype). I.e., there is no
>> such thing as "1.0"@en^^xsd:float vs. "1,0"@nl^^xsd:float (oops, forget
>> that; I see TONS of WORMS ...).
>>
>> It is clear that we want to support operations mostly on the plain
>> string value, such as search and comparison. That is, I don't want a
>> search for @ to succeed on "foo"@en. Also, I don't want "foo"@en to be
>> lexically smaller than "foo"@nl. So, from an implementation point of
>> view, I probably want to maintain the two-dimensional space where the
>> value ("foo") is separated from the tag [1]. It would make me very happy
>> if the 2.67 dimensional space (value + datatype/language tag/nothing) is
>> reduced to a simple two-dimensional space: value+URL. Changing "foo"
>> into "foo"^^xsd:string is a good step here. Changing "foo"@en into
>> "foo"^^lang:en would be a nice and consistent second step, putting all
>> literals in a nice two-dimensional space without exceptions.
>>
>> In addition, I believe that having a URL for a language opens some
>> nice opportunities to model relations between languages in RDF.
>>
>>> told that there is a consensus among implementors that this is so (or
>>> whatever in fact is the consensus) and then I am sure I can design an
>>> RDF modification which will realize that desired behavior and have a
>>> reasonably coherent semantics.
>>>
>>> I would however observe that as tagged literals are exceptional, and
>>
>> In what sense exceptional? I think there are lots of use-cases where
>> language tags play a vital role.
>>
>>> as we are proposing to make some kind of change to the existing spec,
>>> that *some* amount of change to existing code might have to be
>>> contemplated. If no changes are allowed at all to any existing
>>> deployed code, the WG should just pack up now, define RDF2 to be the
>>> same as RDF1 and declare its business done. We all have other things
>>> to do, I am sure.
>>
>> That is far too sceptical for me :-) Just map @tag into ^^lang:tag and
>> define some more mappings for related SPARQL constructs and I think that
>> everything is much more orthogonal and simple. Yes, we will have infinite
>> debates on the relations between @en, @en-US, @en-GB, etc., but we won't
>> be able to resolve these anyway. Just declare these out of the scope of
>> this working group. At least we provide whoever wants to model languages
>> with URLs about which they can make statements.
>>
>> Cheers --- Jan
>>
>> [1] I'm part of the camp where operations on strings are considered
>> both slow and dangerous ... Data processing systems should try to
>> avoid looking into strings as much as possible.
>>
>>> Pat
>>>
>>> On Sep 26, 2011, at 4:51 AM, Jan Wielemaker wrote:
>>>
>>>> On 09/26/2011 11:28 AM, Richard Cyganiak wrote:
>>>>> You understate the issues.
>>>>>
>>>>> Every existing application that uses the Literal.getLexicalForm()
>>>>> call of some API to get at the xxx part of xxx@lll would have to
>>>>> be changed, because the lexical form of xxx@lll is now xxx@lll.
>>>>>
>>>>> That's a complete non-starter.
>>>>
>>>> I fully agree.
>>>> Also note that APIs for (notably in-core) RDF stores
>>>> can now typically work on a single shared representation of the
>>>> literal. If we add a tag to the literal many of the operations will
>>>> have to create a copy without the tag. I'm not saying this cannot
>>>> be solved, but I fear it will be neither natural nor pretty,
>>>> especially for existing stores that did not anticipate this in
>>>> their design phase.
>>>>
>>>> I must admit that I'm only following this from the sideline. As an
>>>> implementor I'm starting to get worried about some wild ideas
>>>> though. The solution I still like best is that foo@tag is the same
>>>> as "foo"^^langbase:tag, where langbase is some to-be-decided prefix
>>>> for language identifiers. Any implementation should be fairly
>>>> comfortable with that (typically it will just simplify things).
>>>>
>>>> I understand things get complicated if we want to attach semantics
>>>> to these datatypes, so I'd propose not to do that. Most likely
>>>> others will make an attempt.
>>>>
>>>> Regards --- Jan
>>>
>>> ------------------------------------------------------------
>>> IHMC                     (850)434 8903 or (650)494 3973
>>> 40 South Alcaniz St.     (850)202 4416   office
>>> Pensacola                (850)202 4440   fax
>>> FL 32502                 (850)291 0667   mobile
>>> phayesAT-SIGNihmc.us     http://www.ihmc.us/users/phayes
Received on Tuesday, 27 September 2011 07:20:06 UTC
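
The mapping argued for in this message (every literal becomes a value plus a datatype URI, with @tag folded into ^^lang:tag and the tag canonicalized to lower case) can be illustrated with a minimal Python sketch. This is not code from the thread or from any RDF library: the names Literal, make_literal and lang_of are hypothetical, and LANG_BASE is a placeholder URI, since the thread leaves the actual "lang:" / "langbase:" prefix undecided.

```python
from typing import NamedTuple, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
# Assumption: placeholder base URI for language datatypes. The thread only
# speaks of a "to be decided" lang: / langbase: prefix.
LANG_BASE = "http://example.org/lang/"


class Literal(NamedTuple):
    """A literal in the two-dimensional space: plain value + datatype URI."""
    value: str     # the plain string; the language tag is never part of it
    datatype: str  # always a URI, also for plain and language-tagged strings


def make_literal(value: str, lang: Optional[str] = None,
                 datatype: Optional[str] = None) -> Literal:
    """Normalize plain, language-tagged and typed literals to one shape."""
    if lang is not None and datatype is not None:
        raise ValueError("a literal cannot have both a language tag and a datatype")
    if lang is not None:
        # Language tags compare case-insensitively, so store a lower-case form.
        return Literal(value, LANG_BASE + lang.lower())
    # A plain "foo" is treated as "foo"^^xsd:string.
    return Literal(value, datatype or XSD_STRING)


def lang_of(lit: Literal) -> Optional[str]:
    """Return the language tag if the datatype is a language URI, else None."""
    if lit.datatype.startswith(LANG_BASE):
        return lit.datatype[len(LANG_BASE):]
    return None
```

Under these assumptions, make_literal("chat", lang="EN") and make_literal("chat", lang="en") compare equal, and string operations such as search or ordering only ever see the value field, so a search for "@" does not match "foo"@en. How @en relates to @en-US or @en-GB is deliberately left out, as the message suggests.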