- From: Sandro Hawke <sandro@w3.org>
- Date: Thu, 25 Apr 2013 18:41:40 -0400
- To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- CC: public-rdf-wg@w3.org
- Message-ID: <5179B124.7060700@w3.org>
I think you're saying that in the 2004 semantics, one can't just say

   (a) In addition to some XSD stuff, this system also implements
       datatype http://example.org/daterange

Instead, by the 2004 spec, one has to say:

   (b) In addition to some XSD stuff, this system also implements
       datatype http://example.org/daterange as meaning the datatype
       such that the value space is all pairs of time instants, with
       the first element < the second element, and the lexical space
       is the concatenation of two elements from the lexical space of
       xs:dateTime, separated by a "..", and the mapping between the
       two is such that....etc, etc.

Is that right? And Pat's proposal would make it so people would be
saying (a) instead of (b)?

     -- Sandro

On 04/25/2013 11:05 AM, Antoine Zimmermann wrote:
>
> On 25/04/2013 15:37, Sandro Hawke wrote:
>> On 04/24/2013 10:06 AM, Antoine Zimmermann wrote:
>>> It seems to me that this problem is due to the removal of the notion
>>> of datatype map. In 2004, applications could implement the
>>> D-entailment they liked, with D being a partial mapping from IRI to
>>> datatypes.
>>> Now, there are just IRIs in D. The association between the IRI and the
>>> datatype it should denote is completely unspecified. The only
>>> indication that the application can have to implement a datatype map
>>> is that XSD URIs must denote the corresponding XSD datatypes.
>>>
>>> I have trouble understanding why datatype maps should be removed. I
>>> don't remember any discussions saying that they should be changed to a
>>> set. This change, which now creates issues, suddenly appears in the
>>> RDF Semantics ED, with no apparent indication that it was motivated
>>> by complaints about the 2004 design.
>>>
>>> Currently, I see a downside of having a plain set, as it does not
>>> specify what datatype the IRIs correspond to, while I do not see
>>> the positive side of having a plain set. Can someone provide
>>> references to evidence that this change is required or has more
>>> advantages than it has drawbacks?
>>>
>>
>> You seem to have a very different usage scenario in mind than I do.
>
> I do not have any scenario or use case in mind. In RDF 1.0, given an
> entailment regime and a set of triples, it was possible to determine
> what are the valid entailments and what are non-entailments wrt the
> given regime, regardless of anybody's usage scenario. In particular,
> given a datatype map D, anybody who is given a set of triples and uses
> the D-entailment regime would derive exactly the same triples, because
> D says how to interpret the datatype IRIs. It is not related to
> scenarios or use cases.
>
> In the current RDF Semantics, if you have a D, you just know what IRIs
> are recognised as datatypes, but you have no indication about what
> datatypes they denote. So, say D = {http://example.com/dt}, it is not
> possible to know what the following triple entails:
>
> [] <http://ex.com/p> "a"^^<http://example.com/dt> .
>
> To be able to entail anything from it, you would need to know what
> datatype the IRI maps to. That's why we need somewhere, somehow, a
> mapping. And the mapping affects the entailment regime, so it
> makes sense to have it as a parameter of the regime.
>
> This is very different from the case where an application is making a
> certain usage of an IRI. For instance, displaying instances of
> foaf:Person in a certain way in a webpage does not change anything in
> the conclusions you can normatively draw from the set of triples in
> any entailment regime.
>
>
>> My primary use case (and I'm sorry I sometimes forget there are others)
>> is the situation where n independent actors publish data in RDF, on
>> the web, to be consumed by m independent actors. The n publishers each
>> make a choice about which vocabulary to use; the m consumers each get
>> to see what vocabularies are used and then have to decide which IRIs to
>> recognize. There are market forces at work, as publishers want to be as
>> accurate and expressive as possible, but they also want to stick to IRIs
>> that will be recognized. Consumers want to make use of as much data as
>> possible, but every new IRI they recognize is more work, sometimes a lot
>> more work, so they want to keep the recognized set small.
>>
>> In this kind of situation, datatype IRIs are just like every other IRI;
>> all the "standardization" effects are the same.
>
> That would be true if we did not have the D-entailment machinery.
> Applications can apply specific treatments to specific IRIs, including
> datatype IRIs (for instance, display dates using French conventions).
> But if we introduce the D-entailment regime, it means we want to
> impose more precise constraints on how to interpret the IRIs (that is,
> more than just "I recognise this set of IRIs").
>
>> It's great for both
>> producers and consumers if we can pick a core set of IRIs that producers
>> can assume consumers will recognize. Things also work okay if a closed
>> group of producers and consumers agree to use a different set. But one
>> of the great strengths of RDF is that the set can be extended without a
>> need for prior agreement. A producer can simply start to use some new
>> IRI, and consumers can dereference it, learn what it means, and change
>> their code to recognize it. Of course, it's still painful (details,
>> details), but it's probably not as painful as switching to a new data
>> format with a new media type. In fact, because it can be done
>> independently for each class, property, individual, and datatype, and
>> data can be presented many ways at once, I expect it to be vastly less
>> painful.
>
> What you say is perfectly true and I agree with it wholeheartedly.
> However, I do not think it is relevant to the D-entailment debate (or
> maybe only marginally).
>
>
>> So, given this usage scenario, I can't see how D helps anybody except as
>> a shorthand for saying "the IRIs which are recognized as datatype
>> identifiers".
>
> In 2004, it says more: it says "These are the datatype IRIs of my
> custom D-entailment regime, and these non-XSD datatype IRIs are
> interpreted in this way, according to these datatypes". It could be
> done independently of the D-entailment machinery, in the internal
> specificities of an application, but having it in the standard allows
> one to refer to the normative mechanism.
>
>>
>> Pat, does this answer the question of how RDF gets extended to a new
>> datatype? I'm happy to try to work this through in more detail, if
>> anyone's interested.
>
> So, to summarise what I understand about your position, you say that
> the D-entailment machinery isn't that useful at all, or only in a
> weak version of it. Fair enough. As I said during the meeting, I'm not
> strongly resisting the change but in general, I am reluctant to
> make any change to a standard that is not motivated by clear evidence
> that it improves the existing situation. If any criticism arises from
> our design of D-entailment, it is far easier to justify a no-change
> ("we want to keep backward compatibility, persistence of definitions,
> avoid changes to implementations, etc") rather than a change.
>
>
> AZ.
>
>>
>> -- Sandro
>>
>>
>>>
>>> AZ.
>>>
>>> On 24/04/2013 05:09, Pat Hayes wrote:
>>>> I think we still have a datatype issue that needs a little thought.
>>>>
>>>> The D in D-entailment is a parameter. Although RDF is usually treated
>>>> as having its own special datatypes and the compatible XSD types as
>>>> being the standard D, it is quite possible to use RDF with a larger D
>>>> set, so that as new datatypes come along (eg geolocation datatypes,
>>>> or time-interval datatypes, or physical unit datatypes, to mention
>>>> three that I know have been suggested) and, presumably, get canonized
>>>> by appropriate standards bodies (maybe not the W3C, though) for use
>>>> by various communities, they can be smoothly incorporated into RDF
>>>> data without a lot of fuss and without re-writing the RDF specs.
>>>>
>>>> Do we want to impose any conditions on this process? How can a reader
>>>> of some RDF know which datatypes are being recognized by this RDF?
>>>> What do we say about how to interpret a literal whose datatype IRI
>>>> you don't recognize? Should it be OK to throw an error at that point,
>>>> or should it *not* be OK to do that? Should we require that RDF
>>>> extensions with larger D's only recognize IRIs that have been
>>>> standardly specified in some way? How would we say this?
>>>>
>>>> The current semantic story is that a literal
>>>> "foo"^^unknown:datatypeIRI is (1) syntactically OK (2) not an error
>>>> but (3) has no special meaning and is treated just like an unknown
>>>> IRI, ie it presumably denotes something, but we don't know what. Is
>>>> this good enough?
>>>>
>>>> Pat
>>>>
>>>> ------------------------------------------------------------
>>>> IHMC                     (850)434 8903 or (650)494 3973
>>>> 40 South Alcaniz St.     (850)202 4416   office
>>>> Pensacola                (850)202 4440   fax
>>>> FL 32502                 (850)291 0667   mobile
>>>> phayesAT-SIGNihmc.us     http://www.ihmc.us/users/phayes
>>>>
>>>
>>
>
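
To make the (a) versus (b) distinction in this thread concrete, here is a
rough Python sketch, not taken from any RDF spec or library. The names
Datatype, parse_daterange, literal_value and UNKNOWN are hypothetical, and
datetime.fromisoformat is only a simplified stand-in for the real
xs:dateTime lexical space. It contrasts a D that is a plain set of
recognized IRIs with a 2004-style datatype map from IRI to datatype.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Optional, Set

# Example IRI from the thread; the datatype definition below is illustrative only.
DATERANGE = "http://example.org/daterange"

# Sentinel: the literal denotes something, but we do not know what.
UNKNOWN = object()


@dataclass(frozen=True)
class Datatype:
    """Hypothetical datatype: a lexical-to-value mapping.

    The mapping returns None for strings outside the lexical space.
    """
    lexical_to_value: Callable[[str], Optional[object]]


def parse_daterange(lexical: str) -> Optional[tuple]:
    """Two dateTimes separated by '..', with the first strictly earlier.

    Assumption: ISO 8601 parsing approximates the xs:dateTime lexical space.
    """
    start_text, sep, end_text = lexical.partition("..")
    if not sep:
        return None
    try:
        start = datetime.fromisoformat(start_text)
        end = datetime.fromisoformat(end_text)
        return (start, end) if start < end else None
    except (ValueError, TypeError):
        # ValueError: not a dateTime; TypeError: mixed naive/aware comparison.
        return None


# (a) D as a plain set: says which IRIs are recognized, not what they denote.
recognized_iris: Set[str] = {DATERANGE}

# (b) D as a 2004-style datatype map: each IRI is paired with its datatype.
datatype_map: Dict[str, Datatype] = {DATERANGE: Datatype(parse_daterange)}


def literal_value(lexical: str, iri: str, dmap: Dict[str, Datatype]) -> object:
    """Value of "lexical"^^<iri> under dmap.

    Returns UNKNOWN for an IRI outside the map (not an error, just no
    special meaning) and None for an ill-typed lexical form.
    """
    datatype = dmap.get(iri)
    if datatype is None:
        return UNKNOWN
    return datatype.lexical_to_value(lexical)
```

With the map in (b), any two implementations sharing datatype_map compute
the same value for a given literal. With only the set in (a), membership
in recognized_iris gives no way to compute a value at all, which is the
gap Antoine describes.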
Received on Thursday, 25 April 2013 22:41:47 UTC