- From: Peter F.Patel-Schneider <pfps@research.bell-labs.com>
- Date: Thu, 21 May 2009 14:14:22 -0400
- To: <phayes@ihmc.us>
- CC: <public-rdf-text@w3.org>
From: Pat Hayes <phayes@ihmc.us>
Subject: Re: A modest note on rdf:text from the an OWLer
Date: Thu, 21 May 2009 12:22:53 -0500

>
> On May 21, 2009, at 10:15 AM, Peter F.Patel-Schneider wrote:
>
>> From: Pat Hayes <phayes@ihmc.us>
>> Subject: Re: A modest note on rdf:text from the an OWLer
>> Date: Thu, 21 May 2009 09:53:39 -0500
>>
>>> On May 20, 2009, at 8:01 PM, Peter F.Patel-Schneider wrote:
>>>
>>>> I've been following along on the conversation and not contributing,
>>>> but I'm now going to stick my toe in.
>>>>
>>>>
>>>> Here is a fictitious dialog between several points of view.  You may
>>>> decide, if you wish, to assign human actors to these points of view.
>>>> That is completely up to you, I'm not saying that anyone holds these
>>>> views as I've stated them.
>>>>
>>>> (A well-known Zakim meeting and IRC chat room.)
>>>>
>>>> IH: We need better datatype support in OWL 2!
>>>> IH: But how can it be done?
>>>> BM: Let's use XML Schema datatypes and facets!
>>>> IH: Sounds good, go for it.
>>>> JC: But what about plain literals?  We need to support all RDF
>>>>     graphs!
>>>> BM: Hmm, we need a datatype for them.
>>>
>>> PH: Hey, BM, run that past me again. Why exactly does OWL need to
>>> have a **datatype** for plain literals? That is, why can't it simply
>>> allow RDF-style plain literals? Of course, it would be *ugly*, but
>>> would anything actually break? (What?) And you know the old aphorism:
>>> if it ain't broken, don't fix it. OWL can always invent a class of
>>> all plain literal values and even declare it to be a datatype class
>>> if that makes the document-writers' aesthetic sense happier and saves
>>> them some boring repetition when stating rules and conditions,
>>> without needing to actually change how RDF writes its literals.
>>
>> Well, OWL needs *something* to say that the range of some property is
>> any string (with or without a language tag), i.e., a datatype.
>> The datatype extensibility solution in OWL (borrowed from XML Schema
>> datatypes) requires a datatype to hang facets on, for example to have
>> a range consisting exactly of strings with a US English language tag.
>
> Ah, so this is why just having a class name for the range will not be
> enough, right? (Everyone so far has cited the 'range' issue, which
> seems to be trivially solvable.)

OWL (2) DL needs a datatype name even just for the range.

>> That said, there is nothing in OWL forbidding RDF-style plain literals
>> in its syntaxes.  The functional syntax, for example, allows for
>> literals like "Padre de familia"@es.  There is also nothing in OWL (or
>> in the rdf:text document) that forbids the use of plain literals in
>> RDF graphs or even in any RDF exchange format.
>
> Right, but it kind of devalues it, since OWL users will of course want
> to use the new form, and this will then leak into non-OWL RDF, but
> having lost its meaning; and, more to the point, non-OWL RDF users
> will be writing RDF which OWL will not be able to utilize properly.

This is not the case.  OWL 2 (and OWL 1) can handle plain literals.  In
OWL 1 plain literals without language tags belong to xsd:string.  In
OWL 2 plain literals belong to ---:text.

> When they mix, nobody will be happy.

OWL 2 (and OWL 1) are perfectly happy with both the current (and
envisioned) state of affairs.

> I can almost directly predict the forlorn emails to the lists asking
> why their engine is only finding half the information in the
> billion-sized triple stores, and who do not want to be told that all
> they have to do is rewrite all the literals properly. But in any case,
> seems to me that any 'transcribe in/out' solution isn't going to work
> because there is information loss in either direction, so nobody will
> be able to round-trip.
> So there will be two RDF worlds: the OWL-2-savvy one and the others,
> and they in effect won't be able to communicate, in practice, even if
> they are both technically legal according to the specs.

Maybe so.  But this problem *already* exists.  It exists with
xsd:string, which is an essential part of OWL 1.  It exists with
numbers.

>>>>     I know, we'll just use xsd:string - its extension includes all
>>>>     reasonable plain literals.
>>>> JC: Not so - to satisfy internationalization concerns we need to
>>>>     also handle plain literals with a language tag.
>>>> BM: Then let's have a new datatype, owl:text, that includes both
>>>>     strings without a language tag and strings with a language tag.
>>>
>>> PH. Might have been better, in retrospect, to have restricted it to
>>> the case with a language tag, which is the only case you actually
>>> need - and stuck to xsd:string for the other case. Then at least we
>>> only have two ways to write plain literals (datatyped or not) instead
>>> of three. And, as y'all are constantly pointing out, the world has
>>> already gotten used to the fact that "foo" and "foo"^^xsd:string are
>>> the same, and your new type would just be doing exactly the same
>>> thing for the tagged case, which is a smaller pill to swallow.
>>
>> Well, if owl:text is restricted to require a language tag, then there
>> is a (minor) need for the union of owl:text and xsd:string.  I don't
>> see any pain difference between the two solutions, and the one that
>> OWL chose requires one less new datatype.
>
> ? xsd:string isn't new, surely. But the pain difference between the
> two-choice case (plain vs. xsd:string typed, plain+tag vs. rdf:text
> typed) and the three-choice case (plain vs. xsd:string vs. rdf:text)
> is huge. In the 2-choice case there is one way to get it wrong (so
> you can then try the other); in the three-choice case there are five
> ways to get it wrong.
> It's hard enough to get two-way agreements, but getting three-way
> agreements is just about impossible.

I don't see any new pain.  I don't see any new ways to get things
"wrong".

>>>>     It is just like xsd:string but with complete coverage of all
>>>>     plain literals.  It is a perfectly good RDF datatype, conforms
>>>>     perfectly to XML Schema Datatypes, there are no downsides.
>>>> JC: Sounds like a plan.  I'm happy.
>>>>
>>>> (Lots of on-stage document hacking.)
>>>>
>>>> (AP enters, announced by Zakim.)
>>>>
>>>> AP: What's this owl:text?  This other WG is doing the same thing
>>>>     and so both WGs should use the same name for it!
>>>> IH: Hmm.  OK, let's form a task force and come up with a joint
>>>>     document.
>>>> BM: Let's call the datatype rdf:text, because that is a good
>>>>     description of the purpose of the joint datatype.
>>>> JC: OK by me, but not really necessary, any name will do.
>>>> AP: That's fine.  This other WG just needs to add functions, which
>>>>     you in OWL don't seem to have.
>>>> BM: I don't see any reason not to include a section on functions
>>>>     for rdf:text.
>>>> JC: OK by me, but it doesn't make me any happier.  I don't need the
>>>>     functions.
>>>> IH: Let's include a bit of wording to encourage tools to use plain
>>>>     literals wherever possible, just so that tools that are not
>>>>     aware of the rdf:text datatype work as if they did.
>>>> BM: Why just for rdf:text?  Other datatypes have the same issue, or
>>>>     even worse!  We are not going to require normalization of all
>>>>     literals!  Making bad design choices just to support existing
>>>>     tools is not a good idea in general, and is certainly not a good
>>>>     idea here.
>>>
>>> PH: Actually, following previous design choices, whatever their
>>> perceived merits, is a VERY good idea when writing standards for
>>> interoperability. Such a good idea, in fact, that one should not
>>> follow it only when there is a pressing, urgent, user-driven need to
>>> not do so.
>>> There is a very good chance that a future WG will not agree with
>>> your judgements about 'good' or 'bad' design choices anyway. After
>>> all, you apparently disagree with at least one previous WG.
>>
>> As far as I am concerned, OWL and rdf:text are precisely following
>> previous design choices, *except* for the special rules that try to
>> force a certain way of writing strings.  I would be very happy if this
>> particular violation of previous design choices was removed from OWL,
>> and from rdf:text.
>>
>>>> IH: I know, I know, but making a special case for rdf:text might
>>>>     make it more acceptable.
>>>> BM: OK, but you are going to be sorry you ever tried to be nice.
>>>>
>>>> (A moderate amount of on-stage document hacking.)
>>>>
>>>> IH: Hello world!  The OWL WG and this other WG have this great new
>>>>     thing for you!  A new datatype, called rdf:text, for any sort of
>>>>     internationalized text.
>>>> (Tomatoes being thrown from everywhere in the audience.)
>>>> IH: Did I say "any sort of internationalized text"?  I meant to say
>>>>     "strings with language tags".
>>>> (More tomatoes being thrown, but only from one place in the
>>>> audience.)
>>>> IH: Oh, you don't like rdf:text at all?  Well, the OWL WG will just
>>>>     go back to the previous happy situation and have a new datatype
>>>>     in OWL, called owl:text, to go along with our use of lots of new
>>>>     XML Schema datatypes and datatype facets.  Other WGs can use
>>>>     this new datatype if they want, or not.  Other WGs can even
>>>>     define functions on this new datatype, the OWL WG has nothing to
>>>>     say about this.
>>>
>>>> IH: [Aside to BM] I'm really, really sorry.
>>>> BM: [Aside to IH] I told you so.
>>>>
>>>> (A bit of on-stage document hacking.)
>>>>
>>>> IH: Hello world!  The OWL WG is proud to present its CR documents!
>>>>     Sorry about the previous brouhaha.
>>>> AP: [Inaudible] Grumble, grumble.
>>>>
>>>> (Everyone realizes that if they throw tomatoes at owl:text they have
>>>> to also shoot down the entire idea of RDF datatypes and
>>>> D-entailment.)
>>>
>>> ? Not at all. Only at the idea of insisting that ALL literals must be
>>> datatyped, even the ones that aren't.
>>
>> And neither OWL nor rdf:text does this insisting.
>
> Yes, I overstated the case, you are correct. But I perceive an overall
> tendency to put pressure on the RDF world to move towards this new
> view of all-typed literals, and I don't think it is the right kind of
> pressure to be applying at this point in time. The last thing we want,
> right now, is to break the deployment of 'loose' RDF in a wide-world
> (and therefore messy and unstructured) setting, especially over the
> issue of representations of plain text fragments. See my recent post
> to the list for an alternative suggestion.

I do not see any such pressure.  Current OWL 1 tools already handle
both plain literal strings and xsd:string datatyped strings.  If there
was going to be significant pressure and significant problems, then
they should have already been seen.

> Pat
>
> PS. terrific summary, BTW. I never did like tomatoes.
>
>> In fact, the only violation of previous design choices concerns the
>> suggestion/requirement for one particular way of writing string
>> literals (namely the non-datatyped one).
>>
>>> Pat
>>>
>>>>
>>>> (World peace reigns!)
>>>>
>>>>
>>>> This appears to be the result that is being argued for.
>>>>
>>>>
>>>> Peter F. Patel-Schneider
>>>> Bell Labs Research
>>
>> peter

peter
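
For readers following the "three ways to write a string" argument in the thread, here is a minimal Python sketch of how the three spellings could be mapped to one value. It is an illustration only, not anything from the OWL or rdf:text documents. The names Literal and value are hypothetical, and the "text@tag" lexical form for rdf:text (with an empty tag for untagged strings) is an assumption taken from the rdf:text draft rather than from this message.

```python
from typing import NamedTuple, Optional, Tuple

XSD_STRING = "xsd:string"
RDF_TEXT = "rdf:text"


class Literal(NamedTuple):
    """Hypothetical in-memory form of an RDF literal (illustration only)."""
    lexical: str
    lang: Optional[str] = None      # language tag of a plain literal
    datatype: Optional[str] = None  # datatype name of a typed literal


def value(lit: Literal) -> Tuple[str, str]:
    """Map a literal to a (string, language tag) pair; "" means no tag."""
    if lit.datatype is None:
        # Plain literal, with or without a language tag.
        return (lit.lexical, (lit.lang or "").lower())
    if lit.datatype == XSD_STRING:
        # xsd:string carries no language tag.
        return (lit.lexical, "")
    if lit.datatype == RDF_TEXT:
        # Assumed lexical form: the text, then "@", then a possibly empty tag.
        text, sep, tag = lit.lexical.rpartition("@")
        if not sep:
            raise ValueError("rdf:text lexical form needs an '@'")
        return (text, tag.lower())
    raise ValueError("unsupported datatype: " + lit.datatype)


# Three spellings of the untagged string collapse to one value.
untagged = [
    Literal("foo"),
    Literal("foo", datatype=XSD_STRING),
    Literal("foo@", datatype=RDF_TEXT),
]
same_untagged = len({value(lit) for lit in untagged}) == 1

# Two spellings of the tagged string collapse to one value.
tagged = [
    Literal("Padre de familia", lang="es"),
    Literal("Padre de familia@es", datatype=RDF_TEXT),
]
same_tagged = len({value(lit) for lit in tagged}) == 1
```

Under these assumptions a tool that compares values rather than spellings treats all the variants alike, which is the situation described above for "foo" and "foo"^^xsd:string. A tool that compares spellings does not, which is the interoperability worry raised in the quoted replies.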
Received on Thursday, 21 May 2009 18:14:51 UTC