- From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Date: Fri, 26 Apr 2013 16:17:16 +0200
- To: public-rdf-wg@w3.org
A custom D-entailment, like any entailment regime, requires pre-programming. E.g., an RDF system does not adapt to RDFS-entailment without having pre-programmed RDFS entailment rules. It must be understood that for each datatype map D, D-entailment is a different entailment regime that requires its own pre-programmed processing.

The way I see RDF 1.0 Semantics is as follows:

1) there are 3 entailment regimes that are fully defined by the spec (Simple, RDF and RDFS).
2) there is a family of entailment regimes that are not completely defined by the spec but that can simply be defined by providing a datatype map (that is, by providing fixed interpretations in the form of datatypes for a given set of IRIs).

What 2) provides could be changed to a more general notion of "customised entailment regimes" where any of the given regimes could be extended by specifying constraints on certain IRIs. For instance, I could define RDFS+graph where I give special meaning to http://ex.com/Graph, http://ex.com/subGraphOf, http://ex.com/emptyGraph, etc. with special constraints on the interpretation of these IRIs.

But in any case, it seems obvious to me that no such extension can be defined *solely* by the *set* of "recognised" IRIs. At some point, one needs to provide the constraints explicitly.

An example of a custom datatype found in the wild is Virtuoso's Geometry:

http://www.openlinksw.com/schemas/virtrdf#Geometry

It maps to a datatype whose lexical space is of the form POINT(<lat> <long>), where <lat> and <long> are decimals, between -90 and +90 for <lat> and between -180 and +180 for <long>. The value space is the set of geographic points on the idealised Earth sphere. The L2V mapping is the obvious one.

Note that the URL http://www.openlinksw.com/schemas/virtrdf#Geometry leads to a 404, so Linked Data intuitions would not work for this one.

AZ.
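To make the Geometry example concrete, here is a minimal Python sketch of what "pre-programming" one entry of a datatype map could look like. It follows only the description given in this message (lexical form POINT(<lat> <long>) with the stated ranges); the names GeoPoint, geometry_l2v, DATATYPE_MAP and literal_value are hypothetical, the single-space separator is an assumption, and nothing here reflects Virtuoso's actual implementation.

```python
import re
from decimal import Decimal
from typing import NamedTuple, Optional

GEOMETRY_IRI = "http://www.openlinksw.com/schemas/virtrdf#Geometry"


class GeoPoint(NamedTuple):
    """Hypothetical value-space element: a point given by latitude/longitude."""
    lat: Decimal
    long: Decimal


# Decimal lexical form (optional sign, digits, optional fraction).
_DEC = r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)"
# Assumption: exactly one space between the two decimals.
_POINT = re.compile(rf"POINT\(({_DEC}) ({_DEC})\)")


def geometry_l2v(lexical: str) -> GeoPoint:
    """Lexical-to-value mapping; raises ValueError for ill-typed literals."""
    m = _POINT.fullmatch(lexical)
    if m is None:
        raise ValueError(f"not in the lexical space: {lexical!r}")
    lat, long = Decimal(m.group(1)), Decimal(m.group(2))
    if not (-90 <= lat <= 90 and -180 <= long <= 180):
        raise ValueError(f"coordinates out of range: {lexical!r}")
    # Simplification: distinct pairs that denote the same point on the sphere
    # (the poles, longitude -180 vs +180) are not identified here.
    return GeoPoint(lat, long)


# A datatype map in the 2004 sense: a partial mapping from IRIs to datatypes,
# each datatype represented here only by its L2V function.
DATATYPE_MAP = {GEOMETRY_IRI: geometry_l2v}


def literal_value(lexical: str, datatype_iri: str) -> Optional[GeoPoint]:
    """Return the literal's value, or None when the IRI is not in the map
    (the literal then has no special meaning for this regime)."""
    l2v = DATATYPE_MAP.get(datatype_iri)
    if l2v is None:
        return None
    return l2v(lexical)
```

With only the set {GEOMETRY_IRI} and no geometry_l2v, literal_value would have nothing to call, which is the point made above: the constraints have to be supplied explicitly somewhere.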
On 26/04/2013 15:42, Andy Seaborne wrote:
> Eric's description covers what I have seen most of - an unknown datatype URI is treated as an RDF term not a value (Pat's "no special meaning").
>
> I have not seen a datatype map in the wild in machine readable form and hence not encountered an RDF processor that adapts to a new datatype without pre-programming.
>
> Andy
>
> On 26/04/13 12:34, Sandro Hawke wrote:
>> Okay, that makes sense. I'm ambivalent. On the one hand, I prefer the linked data approach (any real spec is going to be built on references to other specs; why not make those references machine readable?) but on the other hand I see your point that we shouldn't change things from the 2004 spec without more evidence than has been presented here.
>>
>> -- Sandro
>>
>> On 04/26/2013 03:36 AM, Antoine Zimmermann wrote:
>>> On 26/04/2013 00:41, Sandro Hawke wrote:
>>>> I think you're saying that in the 2004 semantics, one can't just say
>>>>
>>>> (a) In addition to some XSD stuff, this system also implements datatype http://example.org/daterange
>>>
>>> "http://example.org/daterange" is an IRI, not a datatype. So what datatype does this system implement? Intuitively, this should mean that it implements the datatype identified by "http://example.org/daterange". But what is this datatype? It's not an XSD datatype, so the standards do not say. There is no datatype map given, so my RDF-2004 assumptions do not allow me to decide. Still, I have my Linked-data assumptions that tell me I just have to look up and figure out. All right, let's do that. This URL is redirecting to http://example.iana.org/, which does not tell me anything useful.
>>> So your entailment regime is incompletely defined.
>>>
>>>> Instead, by the 2004 spec, one has to say:
>>>>
>>>> (b) In addition to some XSD stuff, this system also implements datatype http://example.org/daterange as meaning the datatype such that the value space is all pairs of time instants, with the first element < the second element, and the lexical space which is the concatenation of two elements from the lexical space of xs:dateTime, separated by a "..", and the mapping between the two is such that....etc, etc.
>>>>
>>>> Is that right?
>>>
>>> That's pretty much it. This could take another form, and informed Linked Data specialists would take care that the IRI dereferences to a description of the datatype, which would suffice as a way to indicate what the IRI maps to (that is, in practice, D *can* be specified by simply providing the set of IRIs and indicating that the actual datatype is described in the document to which the IRIs dereference).
>>>
>>>> And Pat's proposal would make it so people would be saying (a) instead of (b)?
>>>>
>>>> -- Sandro
>>>>
>>>> On 04/25/2013 11:05 AM, Antoine Zimmermann wrote:
>>>>> On 25/04/2013 15:37, Sandro Hawke wrote:
>>>>>> On 04/24/2013 10:06 AM, Antoine Zimmermann wrote:
>>>>>>> It seems to me that this problem is due to the removal of the notion of datatype map. In 2004, applications could implement the D-entailment they liked, with D being a partial mapping from IRI to datatypes.
>>>>>>> Now, there are just IRIs in D. The association between the IRI and the datatype it should denote is completely unspecified. The only indication that the application can have to implement a datatype map is that XSD URIs must denote the corresponding XSD datatypes.
>>>>>>>
>>>>>>> I have trouble understanding why datatype maps should be removed.
>>>>>>> I don't remember any discussions saying that they should be changed to a set. This change, which now creates issues, suddenly appears in RDF Semantics ED, with no apparent indication that it was motivated by complaints about the 2004 design.
>>>>>>>
>>>>>>> Currently, I see a downside of having a plain set, as it does not specify what datatype the IRIs correspond to, while I do not see the positive side of having a plain set. Can someone provide references to evidence that this change is required or has more advantages than it has drawbacks?
>>>>>>
>>>>>> You seem to have a very different usage scenario in mind than I do.
>>>>>
>>>>> I do not have any scenario or use case in mind. In RDF 1.0, given an entailment regime and a set of triples, it was possible to determine what are the valid entailments and what are non-entailments wrt the given regime, regardless of anybody's usage scenario. In particular, given a datatype map D, anybody who's given a set of triples and uses the D-entailment regime would derive exactly the same triples because D says how to interpret the datatype IRIs. It is not related to scenarios or use cases.
>>>>>
>>>>> In the current RDF Semantics, if you have a D, you just know what IRIs are recognised as datatypes, but you have no indication about what datatypes they denote. So, say D = {http://example.com/dt}, it is not possible to know what the following triple entails:
>>>>>
>>>>> [] <http://ex.com/p> "a"^^<http://example.com/dt> .
>>>>>
>>>>> To be able to entail anything from it, you would need to know what datatype the IRI maps to. That's why we need somewhere, somehow, a mapping. And the mapping is affecting the entailment regime, so it makes sense to have it as a parameter of the regime.
>>>>>
>>>>> This is very different from the case where an application is making a certain usage of an IRI. For instance, displaying instances of foaf:Person in a certain way in a webpage does not change anything in the conclusions you can normatively draw from the set of triples in any entailment regime.
>>>>>
>>>>>> My primary use case (and I'm sorry I sometimes forget there are others) is the situation where n independent actors publish data in RDF, on the web, to be consumed by m independent actors. The n publishers each make a choice about which vocabulary to use; the m consumers each get to see what vocabularies are used and then have to decide which IRIs to recognize. There are market forces at work, as publishers want to be as accurate and expressive as possible, but they also want to stick to IRIs that will be recognized. Consumers want to make use of as much data as possible, but every new IRI they recognize is more work, sometimes a lot more work, so they want to keep the recognized set small.
>>>>>>
>>>>>> In this kind of situation, datatype IRIs are just like every other IRI; all the "standardization" effects are the same.
>>>>>
>>>>> That would be true if we did not have the D-entailment machinery. Applications can apply specific treatments to specific IRIs, including datatype IRIs (for instance, display dates using French conventions). But if we introduce the D-entailment regime, it means we want to impose more precise constraints on how to interpret the IRIs (that is, more than just "I recognise this set of IRIs").
>>>>>
>>>>>> It's great for both producers and consumers if we can pick a core set of IRIs that producers can assume consumers will recognize.
>>>>>> Things also work okay if a closed group of producers and consumers agree to use a different set. But one of the great strengths of RDF is that the set can be extended without a need for prior agreement. A producer can simply start to use some new IRI, and consumers can dereference it, learn what it means, and change their code to recognize it. Of course, it's still painful (details, details), but it's probably not as painful as switching to a new data format with a new media type. In fact, because it can be done independently for each class, property, individual, and datatype, and data can be presented many ways at once, I expect it to be vastly less painful.
>>>>>
>>>>> What you say is perfectly true and I agree with it wholeheartedly. However, I do not think it is relevant to the D-entailment debate (or maybe only marginally).
>>>>>
>>>>>> So, given this usage scenario, I can't see how D helps anybody except as a shorthand for saying "the IRIs which are recognized as datatype identifiers".
>>>>>
>>>>> In 2004, it says more: it says "These are the datatype IRIs of my custom D-entailment regime, and these non-XSD datatype IRIs are interpreted in this way, according to these datatypes". It could be done independently of the D-entailment machinery, in the internal specificities of an application, but having it in the standard allows one to refer to the normative mechanism.
>>>>>
>>>>>> Pat, does this answer the question of how RDF gets extended to a new datatype? I'm happy to try to work this through in more detail, if anyone's interested.
>>>>>
>>>>> So, to summarise what I understand about your position, you say that the D-entailment machinery isn't that useful at all, or only in a weak version of it. Fair enough.
>>>>> As I said during the meeting, I'm not strongly resisting the change but in general, I am reluctant to make any change to a standard that is not motivated by clear evidence that it improves the existing situation. If any criticism arises from our design of D-entailment, it is far easier to justify a no-change ("we want to keep backward compatibility, persistence of definitions, avoid changes to implementations, etc") rather than a change.
>>>>>
>>>>> AZ.
>>>>>
>>>>>> -- Sandro
>>>>>>
>>>>>>> AZ.
>>>>>>>
>>>>>>> On 24/04/2013 05:09, Pat Hayes wrote:
>>>>>>>> I think we still have a datatype issue that needs a little thought.
>>>>>>>>
>>>>>>>> The D in D-entailment is a parameter. Although RDF is usually treated as having its own special datatypes and the compatible XSD types as being the standard D, it is quite possible to use RDF with a larger D set, so that as new datatypes come along (eg geolocation datatypes, or time-interval datatypes, or physical unit datatypes, to mention three that I know have been suggested) and, presumably, get canonized by appropriate standards bodies (maybe not the W3C, though) for use by various communities, they can be smoothly incorporated into RDF data without a lot of fuss and without re-writing the RDF specs.
>>>>>>>>
>>>>>>>> Do we want to impose any conditions on this process? How can a reader of some RDF know which datatypes are being recognized by this RDF? What do we say about how to interpret a literal whose datatype IRI you don't recognize? Should it be OK to throw an error at that point, or should it *not* be OK to do that? Should we require that RDF extensions with larger D's only recognize IRIs that have been standardly specified in some way? How would we say this?
>>>>>>>>
>>>>>>>> The current semantic story is that a literal "foo"^^unknown:datatypeIRI is (1) syntactically OK (2) not an error but (3) has no special meaning and is treated just like an unknown IRI, ie it presumably denotes something, but we don't know what. Is this good enough?
>>>>>>>>
>>>>>>>> Pat
>>>>>>>>
>>>>>>>> ------------------------------------------------------------
>>>>>>>> IHMC (850)434 8903 or (650)494 3973
>>>>>>>> 40 South Alcaniz St. (850)202 4416 office
>>>>>>>> Pensacola (850)202 4440 fax
>>>>>>>> FL 32502 (850)291 0667 mobile
>>>>>>>> phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes

--
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tel:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Friday, 26 April 2013 14:17:52 UTC