- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Fri, 26 Apr 2013 14:42:46 +0100
- To: public-rdf-wg@w3.org
Eric's description covers what I have seen most of - an unknown datatype
URI is treated as an RDF term, not a value (Pat's "no special meaning").

I have not seen a datatype map in the wild in machine-readable form and
hence have not encountered an RDF processor that adapts to a new datatype
without pre-programming.

	Andy

On 26/04/13 12:34, Sandro Hawke wrote:
> Okay, that makes sense. I'm ambivalent. On the one hand, I prefer
> the linked data approach (any real spec is going to be built on
> references to other specs; why not make those references machine
> readable?) but on the other hand I see your point that we shouldn't
> change things from the 2004 spec without more evidence than has been
> presented here.
>
> -- Sandro
>
> On 04/26/2013 03:36 AM, Antoine Zimmermann wrote:
>> On 26/04/2013 00:41, Sandro Hawke wrote:
>>> I think you're saying that in the 2004 semantics, one can't just say
>>>
>>> (a) In addition to some XSD stuff, this system also implements
>>> datatype http://example.org/daterange
>>
>> "http://example.org/daterange" is an IRI, not a datatype. So what
>> datatype does this system implement? Intuitively, this should mean
>> that it implements the datatype identified by
>> "http://example.org/daterange". But what is this datatype? It's not an
>> XSD datatype, so the standards do not say. There is no datatype map
>> given, so my RDF-2004 assumptions do not allow me to decide. Still, I
>> have my Linked-data assumptions that tell me I just have to look it up
>> and figure it out. All right, let's do that. This URL is redirecting to
>> http://example.iana.org/, which does not tell me anything useful.
>> So your entailment regime is incompletely defined.
>>
>>>
>>> Instead, by the 2004 spec, one has to say:
>>>
>>> (b) In addition to some XSD stuff, this system also implements
>>> datatype http://example.org/daterange as meaning the datatype such
>>> that the value space is all pairs of time instants, with the
>>> first element < the second element, and the lexical space is
>>> the concatenation of two elements from the lexical space of
>>> xs:dateTime, separated by a "..", and the mapping between the two is
>>> such that....etc, etc.
>>>
>>> Is that right?
>>
>> That's pretty much it. This could take another form, and informed
>> Linked Data specialists would take care that the IRI dereferences to a
>> description of the datatype, which would suffice as a way to indicate
>> what the IRI maps to (that is, in practice, D *can* be specified by
>> simply providing the set of IRIs and indicating that the actual
>> datatype is described in the document to which the IRIs dereference).
>>
>>
>>> And Pat's proposal would make it so people would be
>>> saying (a) instead of (b)?
>>>
>>> -- Sandro
>>>
>>>
>>>
>>> On 04/25/2013 11:05 AM, Antoine Zimmermann wrote:
>>>>
>>>> On 25/04/2013 15:37, Sandro Hawke wrote:
>>>>> On 04/24/2013 10:06 AM, Antoine Zimmermann wrote:
>>>>>> It seems to me that this problem is due to the removal of the notion
>>>>>> of datatype map. In 2004, applications could implement the
>>>>>> D-entailment they liked, with D being a partial mapping from IRI to
>>>>>> datatypes.
>>>>>> Now, there are just IRIs in D. The association between the IRI and
>>>>>> the datatype it should denote is completely unspecified. The only
>>>>>> indication that the application can have to implement a datatype map
>>>>>> is that XSD URIs must denote the corresponding XSD datatypes.
>>>>>>
>>>>>> I have trouble understanding why datatype maps should be removed. I
>>>>>> don't remember any discussions saying that they should be changed
>>>>>> to a set.
>>>>>> This change, which now creates issues, suddenly appears in the RDF
>>>>>> Semantics ED, with no apparent indication that it was motivated by
>>>>>> complaints about the 2004 design.
>>>>>>
>>>>>> Currently, I see a downside of having a plain set, as it does not
>>>>>> specify what datatype the IRIs correspond to, while I do not see
>>>>>> the positive side of having a plain set. Can someone provide
>>>>>> references to evidence that this change is required or has more
>>>>>> advantages than it has drawbacks?
>>>>>>
>>>>>
>>>>> You seem to have a very different usage scenario in mind than I do.
>>>>
>>>> I do not have any scenario or use case in mind. In RDF 1.0, given an
>>>> entailment regime and a set of triples, it was possible to determine
>>>> what are the valid entailments and what are non-entailments wrt the
>>>> given regime, regardless of anybody's usage scenario. In particular,
>>>> given a datatype map D, anybody who's given a set of triples and uses
>>>> the D-entailment regime would derive exactly the same triples because
>>>> D is saying how to interpret the datatype IRIs. It is not related to
>>>> scenarios or use cases.
>>>>
>>>> In the current RDF Semantics, if you have a D, you just know what IRIs
>>>> are recognised as datatypes, but you have no indication about what
>>>> datatypes they denote. So, say D = {http://example.com/dt}, it is not
>>>> possible to know what the following triple entails:
>>>>
>>>> [] <http://ex.com/p> "a"^^<http://example.com/dt> .
>>>>
>>>> To be able to entail anything from it, you would need to know what
>>>> datatype the IRI maps to. That's why we need somewhere, somehow, a
>>>> mapping. And the mapping is affecting the entailment regime, so it
>>>> makes sense to have it as a parameter of the regime.
>>>>
>>>> This is very different from the case where an application is making a
>>>> certain usage of an IRI.
>>>> For instance, displaying instances of
>>>> foaf:Person in a certain way in a webpage does not change anything in
>>>> the conclusions you can normatively draw from the set of triples in
>>>> any entailment regime.
>>>>
>>>>
>>>>> My primary use case (and I'm sorry I sometimes forget there are
>>>>> others) is the situation where n independent actors publish data in
>>>>> RDF, on the web, to be consumed by m independent actors. The n
>>>>> publishers each make a choice about which vocabulary to use; the m
>>>>> consumers each get to see what vocabularies are used and then have
>>>>> to decide which IRIs to recognize. There are market forces at work,
>>>>> as publishers want to be as accurate and expressive as possible, but
>>>>> they also want to stick to IRIs that will be recognized. Consumers
>>>>> want to make use of as much data as possible, but every new IRI they
>>>>> recognize is more work, sometimes a lot more work, so they want to
>>>>> keep the recognized set small.
>>>>>
>>>>> In this kind of situation, datatype IRIs are just like every other
>>>>> IRI; all the "standardization" effects are the same.
>>>>
>>>> That would be true if we did not have the D-entailment machinery.
>>>> Applications can apply specific treatments to specific IRIs, including
>>>> datatype IRIs (for instance, display dates using French conventions).
>>>> But if we introduce the D-entailment regime, it means we want to
>>>> impose more precise constraints on how to interpret the IRIs (that is,
>>>> more than just "I recognise this set of IRIs").
>>>>
>>>>> It's great for both producers and consumers if we can pick a core
>>>>> set of IRIs that producers can assume consumers will recognize.
>>>>> Things also work okay if a closed group of producers and consumers
>>>>> agree to use a different set. But one of the great strengths of RDF
>>>>> is that the set can be extended without a need for prior agreement.
>>>>> A producer can simply start to use some new
>>>>> IRI, and consumers can dereference it, learn what it means, and change
>>>>> their code to recognize it. Of course, it's still painful (details,
>>>>> details), but it's probably not as painful as switching to a new data
>>>>> format with a new media type. In fact, because it can be done
>>>>> independently for each class, property, individual, and datatype, and
>>>>> data can be presented many ways at once, I expect it to be vastly less
>>>>> painful.
>>>>
>>>> What you say is perfectly true and I agree with it wholeheartedly.
>>>> However, I do not think it is relevant to the D-entailment debate (or
>>>> maybe only marginally).
>>>>
>>>>
>>>>> So, given this usage scenario, I can't see how D helps anybody
>>>>> except as a shorthand for saying "the IRIs which are recognized as
>>>>> datatype identifiers".
>>>>
>>>> In 2004, it says more: it says "These are the datatype IRIs of my
>>>> custom D-entailment regime, and these non-XSD datatype IRIs are
>>>> interpreted in this way, according to these datatypes". It could be done
>>>> independently of the D-entailment machinery, in the internal
>>>> specificities of an application, but having it in the standard allows
>>>> one to refer to the normative mechanism.
>>>>
>>>>>
>>>>> Pat, does this answer the question of how RDF gets extended to a new
>>>>> datatype? I'm happy to try to work this through in more detail, if
>>>>> anyone's interested.
>>>>
>>>> So, to summarise what I understand about your position, you say that
>>>> the D-entailment machinery isn't all that useful, or only in a
>>>> weak version of it. Fair enough. As I said during the meeting, I'm not
>>>> strongly resisting the change but in general, I am reluctant to
>>>> make any change to a standard that is not motivated by clear evidence
>>>> that it improves the existing situation.
>>>> If any criticism arises from
>>>> our design of D-entailment, it is far easier to justify a no-change
>>>> ("we want to keep backward compatibility, persistence of definitions,
>>>> avoid changes to implementations, etc") rather than a change.
>>>>
>>>>
>>>> AZ.
>>>>
>>>>>
>>>>> -- Sandro
>>>>>
>>>>>
>>>>>>
>>>>>> AZ.
>>>>>>
>>>>>> On 24/04/2013 05:09, Pat Hayes wrote:
>>>>>>> I think we still have a datatype issue that needs a little thought.
>>>>>>>
>>>>>>> The D in D-entailment is a parameter. Although RDF is usually
>>>>>>> treated as having its own special datatypes and the compatible XSD
>>>>>>> types as being the standard D, it is quite possible to use RDF with
>>>>>>> a larger D set, so that as new datatypes come along (eg geolocation
>>>>>>> datatypes, or time-interval datatypes, or physical unit datatypes,
>>>>>>> to mention three that I know have been suggested) and, presumably,
>>>>>>> get canonized by appropriate standards bodies (maybe not the W3C,
>>>>>>> though) for use by various communities, they can be smoothly
>>>>>>> incorporated into RDF data without a lot of fuss and without
>>>>>>> re-writing the RDF specs.
>>>>>>>
>>>>>>> Do we want to impose any conditions on this process? How can a
>>>>>>> reader of some RDF know which datatypes are being recognized by
>>>>>>> this RDF? What do we say about how to interpret a literal whose
>>>>>>> datatype IRI you don't recognize? Should it be OK to throw an error
>>>>>>> at that point, or should it *not* be OK to do that? Should we
>>>>>>> require that RDF extensions with larger D's only recognize IRIs
>>>>>>> that have been standardly specified in some way? How would we say
>>>>>>> this?
>>>>>>>
>>>>>>> The current semantic story is that a literal
>>>>>>> "foo"^^unknown:datatypeIRI is (1) syntactically OK (2) not an error
>>>>>>> but (3) has no special meaning and is treated just like an unknown
>>>>>>> IRI, ie it presumably denotes something, but we don't know what.
>>>>>>> Is this good enough?
>>>>>>>
>>>>>>> Pat
>>>>>>>
>>>>>>> ------------------------------------------------------------
>>>>>>> IHMC                     (850)434 8903 or (650)494 3973
>>>>>>> 40 South Alcaniz St.     (850)202 4416   office
>>>>>>> Pensacola                (850)202 4440   fax
>>>>>>> FL 32502                 (850)291 0667   mobile
>>>>>>> phayesAT-SIGNihmc.us     http://www.ihmc.us/users/phayes
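
The thread contrasts a 2004-style datatype map (datatype IRI to datatype) with a bare set of recognised IRIs. That contrast, along with the "no special meaning" treatment of an unknown datatype IRI, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular RDF processor works. The names Literal, parse_daterange and interpret are hypothetical. The daterange lexical form follows Sandro's informal description in (b). Python's datetime.fromisoformat stands in for the xs:dateTime lexical space, which it only approximates.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Tuple

# Hypothetical datatype IRI taken from the thread's example.
DATERANGE = "http://example.org/daterange"


@dataclass(frozen=True)
class Literal:
    """A typed literal: a lexical form paired with a datatype IRI."""
    lexical: str
    datatype: str


def parse_daterange(lexical: str) -> Tuple[datetime, datetime]:
    """Lexical-to-value mapping for the hypothetical daterange datatype.

    Assumption: datetime.fromisoformat approximates xs:dateTime parsing.
    """
    start_text, sep, end_text = lexical.partition("..")
    if not sep:
        raise ValueError("expected two dateTimes separated by '..'")
    start = datetime.fromisoformat(start_text)
    end = datetime.fromisoformat(end_text)
    if not start < end:
        raise ValueError("first instant must precede the second")
    return (start, end)


# 2004-style datatype map: datatype IRI -> lexical-to-value mapping.
DatatypeMap = Dict[str, Callable[[str], object]]


def interpret(lit: Literal, datatype_map: DatatypeMap) -> object:
    """Return the literal's value if its datatype IRI is in the map;
    otherwise return the literal unchanged, as an opaque RDF term."""
    to_value = datatype_map.get(lit.datatype)
    if to_value is None:
        return lit
    return to_value(lit.lexical)


lit = Literal("2013-04-24T05:09:00..2013-04-26T14:42:46", DATERANGE)
with_map = interpret(lit, {DATERANGE: parse_daterange})
without_map = interpret(lit, {})
print(with_map)     # a pair of datetime values
print(without_map)  # the Literal itself, uninterpreted
```

With only the set {DATERANGE} and no mapping, a processor could note that the IRI is meant to be a datatype but would have no way to compute the pair of instants. That is the gap Antoine describes.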
Received on Friday, 26 April 2013 13:43:31 UTC