- From: Harry Halpin <hhalpin@ibiblio.org>
- Date: Thu, 21 May 2009 20:38:31 +0200
- To: Pat Hayes <phayes@ihmc.us>
- Cc: Boris Motik <boris.motik@comlab.ox.ac.uk>, "Eric Prud'hommeaux" <eric@w3.org>, Andy Seaborne <andy.seaborne@hp.com>, Alan Ruttenberg <alanruttenberg@gmail.com>, public-rdf-text@w3.org, Semantic Web <semantic-web@w3.org>, Sandro Hawke <sandro@w3.org>, Axel Polleres <axel.polleres@deri.org>
Quick note - I have not been tracking this issue, but it does seem a very bad idea (well worthy of a formal objection) in general to create an incompatibility with something as basic as a text string, à la plain literal. As this seems to be an issue almost entirely motivated by formal semantics, I see *no* reason why formal semantic motivations should cause pain for users and already existing data.

Why not just have plain literals in RIF and OWL2 default to being treated as having a datatype of "rdf:text" (or whatever is needed in the formal semantics), and never require the explicit addition of any work by the users? In particular, "Family Guy" would then default to "Family Guy@". Why is this option not tenable? Seems rather sensible to me, but I assume there *must* be some reason for not doing it that way.

On Wed, May 20, 2009 at 7:20 PM, Pat Hayes <phayes@ihmc.us> wrote:
>
> On May 20, 2009, at 9:57 AM, Boris Motik wrote:
>
>> Hello,
>>
>> I have to agree that the text in the rdf:text specification might not reflect correctly the intentions I expressed. Quite frankly, we (i.e., the authors of the rdf:text specification) haven't been really aware of all the repercussions and possible interpretations of our spec. The text you refer to at the end of this e-mail has been introduced as a reaction to one of the earlier comments by the SPARQL WG.
>>
>> Nevertheless, here is what the goals of rdf:text are:
>
> Thanks for this summary.
>
>> 1. Both RIF and OWL 2 find the distinction between plain and typed literals painful. This is because, whenever one refers to literals, one needs two subcases: for a plain and for a typed literal.
>
> So?
>
>> Both RIF and OWL 2 have independently come up with exactly the same idea: they opted to represent the "semantic content" of plain literals through typed literals whose value is the same as the corresponding plain literals.
>
> Thanks for making this clear in a public forum.
> OWL 2 and RIF are deliberately, by design, creating a central incompatibility with a basic feature of RDF. This seems to me to be a quite extraordinary and amazing observation, one that deserves to be publicized as widely as possible (which is why I am CCing this to semantic-web@w3.org). Why would two W3C WGs set out to deliberately *create* interoperability problems with other W3C standards, just when those standards are beginning to achieve widespread acceptance?
>
>> This makes the definitions and the semantic treatment of literals in both RIF and OWL 2 much simpler.
>
> It makes it more elegant, yes, but is there really a PROBLEM here that needs to be solved? That is, what actual issues for users or implementations are posed by the presence of two literal forms? Or is this discomfort simply a matter of theoretician's feelings of inelegance or clumsiness? Because if the latter (as I strongly suspect), this is not a sufficient reason to attempt to retroactively undermine the existing RDF standard, and to deliberately create what I believe will be troublesome and awkward problems for an entire generation of implementations, and certainly for a majority of existing ones. Creating problems like this is exactly what W3C WGs should NOT be doing, especially at a critical point in the deployment of SWeb technology. Google just quietly announced their cautious support for RDFa. It is not exactly a great idea for two W3C WGs to be at that very moment deliberately attempting to undermine one of the basic aspects of the RDF design. Elegance is, frankly, not of central importance right now.
>
>> 2. Both RIF and OWL 2 need a mechanism to refer to the set of all plain literals. For example, in OWL 2 you might want to say "the range is a piece of text".
>
> That problem can be trivially solved by introducing a class of such values, and giving it a reserved name.
> RDF plain literals denote themselves, so that the class of plain literal values is also the class of plain literals which is also the class of pieces of text.
>
>> In OWL 2 this is very important because of facets. Using a datatype for this purpose is natural.
>
> Natural, maybe, but not REQUIRED. And given the problems that it causes, maybe it isn't so natural after all. Think of OWL 2 as part of an existing world-wide deployment of SWeb systems, and then ask if it is 'natural'.
>
>> Both RIF and OWL 2 have chosen to follow the definitions of datatypes from XML Schema. Thus, each datatype consists of a set of lexical values, a value space, and an L2V mapping. Plain literals do not follow these principles
>
> Of course they do. Abstractly, the plain literal 'datatype' is as follows: the lexical space is all character strings; the value space is all character strings; and the L2V mapping is the identity map. Obvious extension to the case of tagged literals. Where is the conceptual problem here?
>
>> ; therefore, rdf:text defines lexical values that encode the content of plain literals.
>
> Giving rise immediately, and predictably, to the interoperability nightmare of there being two ways to represent one ubiquitous kind of thing - a piece of unmarked text, with an optional language tag - which require exotic means to establish their equivalence, and different specs requiring different ways to be used. This is an elementary systems-engineering mistake, a decision which could have been designed to create global systemic problems. (Even the "managerial" situation of narrowly focussed WGs working on parts of the problem in isolation is classic. Future systems engineering 101 course textbooks will be able to cite this as an example.)
>
>> Now as I have already said, we have not had the complete story as clear in our minds right from the beginning.
>> Given all the LC comments (which, by the way, have been quite useful and have significantly improved the spec), however, both RIF and OWL 2 have agreed that the view I proposed in my e-mail is the appropriate one (at least from the RIF and OWL 2 points of view).
>
> All SWeb WGs' points of view should be primarily to further the deployment of the SWeb.
>
>> As I've stated in my summary e-mail, to achieve this we simply need to remove from the specification any special treatment of rdf:text: this should be a datatype like any other. This is precisely the part of the document that you are referring to.
>>
>> Thus, the final version of the document would not mention any interoperability problems.
>
> How wonderful. We will not mention them, so they will have gone away. Or, more precisely, they have not gone away, but they aren't OUR problem. We are just doing our job, and making the Semantic Web work isn't in our WG charter: we are just concerned with RIF/OWL.
>
> Sorry about the sarcastic tone, but this really does deserve it.
>
>> Furthermore, we may also rework the introduction to make the intention behind rdf:text clearer.
>
> Certainly, the rdf:text document is very misleading as written. It purports to be about representing internationalized text, which is clearly not even close to the truth, and it does not even mention the apparently real motivation, which is (see above) to create incompatibilities with the RDF plain literal design.
>
> Pat
>
>> Regards,
>>
>> Boris
>>
>>> -----Original Message-----
>>> From: Eric Prud'hommeaux [mailto:eric@w3.org]
>>> Sent: 20 May 2009 16:36
>>> To: Boris Motik
>>> Cc: 'Seaborne, Andy'; 'Alan Ruttenberg'; public-rdf-text@w3.org; 'Sandro Hawke'; 'Axel Polleres'
>>> Subject: Re: A summary of the proposal for resolving the issues with rdf:text --> Could you please check it one more time?
>>>
>>> On Wed, May 20, 2009 at 01:38:29PM +0200, Boris Motik wrote:
>>>>
>>>> Hello,
>>>>
>>>> I fully appreciate the use case and I agree with your observation: this is something that has to be addressed. I don't think, however, that solving this problem is in the domain of rdf:text. The rdf:text specification merely defines yet another datatype by specifying it in exactly the same way as this is done in XML Schema. This datatype is just like any other XML Schema datatype; hence, the job from rdf:text's point of view is done.
>>>
>>> Ahh, perhaps we have different goals for rdf:text. rdf:text was, if I understand, created to address the issue that one could not infer the presence of or the consequences of plain literals. One could fill that hole by creating a datatype that consumes and infers plain literals, or one could create a datatype which bijects to plain literals. Special machinery associated with that datatype is required in either case.
>>>
>>> (I, who was not involved in rdf:text except as an afterthought, argue that it is intended to take the former approach. You, an author, argue something closer to the latter.)
>>>
>>>> Furthermore, the addition of rdf:text to the mix of the supported datatypes adds no new conceptual problems to SPARQL: the situation with rdf:text is no different than with, say, xsd:integer (there are other examples as well). For example, assume that you have an RDF graph
>>>>
>>>> G = { <a, b, "1"^^xsd:integer> }
>>>>
>>>> but you ask the query
>>>>
>>>> Q = { <a, b, "1.0"^^xsd:decimal> }.
>>>>
>>>> Clearly, G D-entails Q, so Q should be answered as TRUE in G. It is not the business of XML Schema to specify how this is to be achieved: XML Schema merely specifies what the correct answer to the above question is.
>>>> It is a SPARQL implementation such as OWLIM that should think of how to support such a definition.
>>>
>>> SPARQL is defined in terms of the graph, so Q will fail to match G. As entailments supplement the graph, a D-entailing system confronted with
>>>   <a, b, "1"^^xsd:integer>
>>> will have a (notional) graph
>>>   G = { <a, b, "1"^^xsd:integer> .
>>>         <a, b, "1.0"^^xsd:decimal> . }.
>>>
>>> I'd say that we're arguing whether <a, b, "bob@en"^^rdf:text> shows up in the graph. You propose something like:
>>>   <a, b, "bob@en"^^rdf:text> D-entails to
>>>   G = { <a, b, "bob@en"^^rdf:text> .
>>>         <a, b, "bob"@en> . }.
>>> while I propose that you never utter <a, b, "bob@en"^^rdf:text> and have the tools that implement the specification produce only <a, b, "bob"@en>.
>>>
>>>> I don't know whether a solution to the above problem (with xsd:integer and xsd:decimal) exists. If not, I agree that one should be developed; however, we would not go to the XML Schema WG and ask them to specify how SPARQL should handle this case, would we?
>>>>
>>>> The problem with rdf:text is *precisely* the same as the one that I outlined above. At an abstract level, it can be stated as "Several syntactic forms of literals get mapped to the semantically identical data values". As demonstrated above, this problem exists without rdf:text, so I don't see how rdf:text brings anything new into the whole picture. Thus, you can apply to the rdf:text case exactly the same solution that you would apply to xsd:integer and xsd:decimal.
>>>
>>> Your proposal is analogous to the D-entailment of numeric types, while I interpret the rdf:text last call wording as attempting to reduce the interop challenges that would stem from spotty coverage with respect to that D-entailment.
>>>
>>>> If such a solution doesn't exist yet, then the SPARQL WG should address these issues, and it should do so in general for all datatypes (xsd:integer, xsd:decimal, and so on), not just for rdf:text.
>>>
>>> I'd argue that it's more of an RDF Core issue (admitting that they don't exist). To solve an entailment especially for SPARQL sidesteps the other folks who want to know what's in the graph, for instance, an RDF graph API (such as exist in Jena, Sesame, ...), other entailment regimes that may or may not stack on top of OWL (imagine an app-directed regime like FOAF smushing), as well as secondary consumers of RDF graphs, for instance, an XSLT which runs on the XML results returned from a SPARQL query service.
>>>
>>>> To summarize, I think that the work from the point of view of the rdf:text WG is *done* and that we should not do anything else in this forum.
>>>
>>> Andy has argued that approach 1 is the only of the 3 that is compatible with this text from the last call document:
>>> [[
>>> Despite the semantic equivalence between typed rdf:text literals and plain literals, the presence of typed rdf:text literals in an RDF graph might cause interoperability problems between RDF tools, as not all RDF tools will support rdf:text. Therefore, before exchanging an RDF graph with other RDF tools, an RDF tool that supports rdf:text MUST replace in the graph each typed rdf:text literal with the corresponding plain literal. The notion of graph exchange includes, but is not limited to, the process of serializing an RDF graph using any (normative or nonnormative) RDF syntax.
>>> ]]. \1 is clarifying the boundaries of the above graph exchange.
>>>
>>>> Regards,
>>>>
>>>> Boris
>>>>
>>>>> -----Original Message-----
>>>>> From: Eric Prud'hommeaux [mailto:eric@w3.org]
>>>>> Sent: 20 May 2009 13:18
>>>>> To: Boris Motik
>>>>> Cc: 'Seaborne, Andy'; 'Alan Ruttenberg'; public-rdf-text@w3.org; 'Sandro Hawke'; 'Axel Polleres'
>>>>> Subject: Re: A summary of the proposal for resolving the issues with rdf:text --> Could you please check it one more time?
>>>>>
>>>>> On Wed, May 20, 2009 at 09:29:00AM +0200, Boris Motik wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I don't see the benefit of option 1, as it makes things unnecessarily complex. The fewer exceptions we have, the easier it will be to actually implement a conformant system. The dichotomy between plain and typed literals is just an example of an exception that just makes implementation difficult. Instead of introducing more special cases, I think we should unify these whenever possible.
>>>>>>
>>>>>> Furthermore, I'm not sure whether sorting out things such as the ones pointed out below is necessary to finalize the rdf:text specification. Please note that rdf:text already has a well-defined lexical and value space, and this is *the only* thing that we need to be able to plug rdf:text into the model theory of RDF. That is, given RDF graphs G1 and G2 possibly containing rdf:text literals and/or plain literals, using the definitions from the present rdf:text specification one can unambiguously answer the question whether G1 D-entails G2.
>>>>>>
>>>>>> For example, if G1 is
>>>>>>
>>>>>> <a, b, "abc@en"^^rdf:text>
>>>>>>
>>>>>> and G2 is
>>>>>>
>>>>>> <a, b, "abc"@en>
>>>>>>
>>>>>> then, according to the existing RDF model theory document, G1 D-entails G2 and vice versa. I don't see what else is there for the rdf:text specification to do: I really think that the specification is complete. If SPARQL or other specifications want to apply rdf:text in a different way and create special cases, they are free to do so; however, I don't think it is in scope of the rdf:text specification to solve all such problems.
>>>>>
>>>>> (Hesitantly re-stating use case), consider the use case of the OWLIM plugin for Sesame. If OWLIM forward chains some triples into the Sesame repository with objects like "bob"@en, existing SPARQL queries on the existing Sesame engine will match them as expected. RIF rules can consume those triples and know that any rules applying to a domain of rdf:text apply.
>>>>>
>>>>> Contrast that with an OWLIM which emits triples with objects like "bob@en"^^rdf:text . These triples will not match conventional queries intended to discover e.g. all the folks named "Bob". The Sesame SPARQL implementation can be extended, but then we are in Pat's scenario of fixing RDF by visiting all the deployed code.
>>>>>
>>>>> I expect that any design of rdf:text would have it reacting to plain literals as if they had a datatype of rdf:text and the appropriate lexical transformation.
>>>>> I propose that the simplest complete design is one where the inference of rdf:text objects results in their expression as plain literals, avoiding a dualism between "bob@en"^^rdf:text and "bob"@en which would lose interoperability with existing queries, graph APIs, XPaths operating on SPARQL Results, non-OWL inferencing systems, ...
>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Boris
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: public-rdf-text-request@w3.org [mailto:public-rdf-text-request@w3.org] On Behalf Of Eric Prud'hommeaux
>>>>>>> Sent: 20 May 2009 03:18
>>>>>>> To: Seaborne, Andy
>>>>>>> Cc: Alan Ruttenberg; public-rdf-text@w3.org; Boris Motik; Sandro Hawke; Axel Polleres
>>>>>>> Subject: Re: A summary of the proposal for resolving the issues with rdf:text --> Could you please check it one more time?
>>>>>>>
>>>>>>> On Tue, May 19, 2009 at 03:57:11PM +0000, Seaborne, Andy wrote:
>>>>>>>>
>>>>>>>> Apologies:
>>>>>>>>
>>>>>>>>> On Fri, May 15, 2009 at 11:50 AM, Seaborne, Andy <andy.seaborne@hp.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Monday PM end before 18:00 (GMT+1)
>>>>>>>>>> Thursday PM.
>>>>>>>>>> Tuesday @17:00 (GMT+1) for a short call; end before 17:30.
>>>>>>>>
>>>>>>>> I can't make the slot.
>>>>>>>>
>>>>>>>> Input: please consider interoperability of data between OWL and RDF. Option 1 is better for that than option 2 as Eric points out.
>>>>>>>>
>>>>>>>> This is also the least change to LC and IMHO is not a substantive change (it follows on from the current graph exchange intent) to add the text needed for SPARQL. Roughly: the scoping graph of an rdf-text aware D-entailment for BGP matching includes the RDF forms and does not include ^^rdf:text.
>>>>>>>> (Non-aware entailment regimes would merely treat as a datatype form.)
>>>>>>>
>>>>>>> does anyone oppose option 1 (plain literals are considered to satisfy entailments constrained to type rdf:text and entailments of type rdf:text are expressed as plain literals in the RDF graph)? (i'm wondering if we can work this out before we work out scheduling this phone call.)
>>>>>>>
>>>>>>>> Andy
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Alan Ruttenberg [mailto:alanruttenberg@gmail.com]
>>>>>>>>> Sent: 19 May 2009 16:01
>>>>>>>>> To: Axel Polleres
>>>>>>>>> Cc: Seaborne, Andy; public-rdf-text@w3.org; Boris Motik; Sandro Hawke; eric@w3.orf
>>>>>>>>> Subject: Re: A summary of the proposal for resolving the issues with rdf:text --> Could you please check it one more time?
>>>>>>>>>
>>>>>>>>> On Mon, May 18, 2009 at 10:03 AM, Axel Polleres <axel.polleres@deri.org> wrote:
>>>>>>>>>>
>>>>>>>>>> Alan, since you were calling for the TC, is that fixed now?
>>>>>>>>>> Otherwise, I am afraid it is not possible before Friday.
>>>>>>>>>
>>>>>>>>> Yes, let's have whoever can make it meet at 5:30 BST = 12:30 Boston time.
>>>>>>>>> Zakim, meet on irc #rdftext for the code. I will send a code earlier if I can.
>>>>>>>>>
>>>>>>>>> -Alan
>>>
>>> --
>>> -eric
>>>
>>> office: +1.617.258.5741 32-G528, MIT, Cambridge, MA 02144 USA
>>> mobile: +1.617.599.3509
>>>
>>> (eric@w3.org)
>>> Feel free to forward this message to any list for any purpose other than email address distribution.
>
> ------------------------------------------------------------
> IHMC (850)434 8903 or (650)494 3973
> 40 South Alcaniz St.
> (850)202 4416 office
> Pensacola (850)202 4440 fax
> FL 32502 (850)291 0667 mobile
> phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
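
To make the two representations discussed in this thread concrete, here is a minimal Python sketch of the conversion in both directions. It is an illustration only, not taken from any of the specifications: the `Literal` type and the function names `to_plain` and `to_rdf_text` are hypothetical, and it assumes the rdf:text lexical form is the text, an "@", and a possibly empty language tag, split at the last "@" (as the examples "bob@en" and "Family Guy@" above suggest).

```python
from typing import NamedTuple, Optional

# Datatype name as written in this thread; a real tool would use the full IRI.
RDF_TEXT = "rdf:text"


class Literal(NamedTuple):
    """Hypothetical stand-in for an RDF literal (not a real library type)."""
    lexical: str
    lang: Optional[str] = None      # language tag of a plain literal, if any
    datatype: Optional[str] = None  # None means the literal is plain


def to_plain(lit: Literal) -> Literal:
    """Replace an rdf:text typed literal with the corresponding plain literal.

    Assumption: the lexical form is split at the LAST '@'; an empty tag
    means "no language tag". Other literals are returned unchanged.
    """
    if lit.datatype != RDF_TEXT:
        return lit
    text, sep, tag = lit.lexical.rpartition("@")
    if not sep:
        raise ValueError("rdf:text lexical form must contain '@'")
    return Literal(text, lang=tag or None)


def to_rdf_text(lit: Literal) -> Literal:
    """Treat a plain literal as if it were typed with rdf:text."""
    if lit.datatype is not None:
        return lit  # already typed; leave untouched
    return Literal(f"{lit.lexical}@{lit.lang or ''}", datatype=RDF_TEXT)
```

Under these assumptions, a tool following the last call wording quoted above would apply something like `to_plain` to every object before exchanging a graph, while the "default" treatment suggested at the top of this message corresponds to applying `to_rdf_text` internally without users ever writing the typed form.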
Received on Thursday, 21 May 2009 18:39:13 UTC