Re: How does RDF get extended to new datatypes? from Sandro Hawke on 2013-04-25 (public-rdf-wg@w3.org from April 2013)

From: Sandro Hawke <sandro@w3.org>
Date: Thu, 25 Apr 2013 09:37:07 -0400
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
CC: public-rdf-wg@w3.org
Message-ID: <51793183.8000004@w3.org>

On 04/24/2013 10:06 AM, Antoine Zimmermann wrote:
> It seems to me that this problem is due to the removal of the notion 
> of datatype map. In 2004, applications could implement the 
> D-entailment they liked, with D being a partial mapping from IRI to 
> datatypes.
> Now, there are just IRIs in D. The association between the IRI and the 
> datatype it should denote is completely unspecified. The only 
> indication that the application can have to implement a datatype map 
> is that XSD URIs must denote the corresponding XSD datatypes.
>
> I have trouble understanding why datatype maps should be removed. I 
> don't remember any discussions saying that they should be changed to a 
> set. This change, which now creates issues, suddenly appears in the RDF 
> Semantics ED, with no apparent indication that it was motivated by 
> complaints about the 2004 design.
>
> Currently, I see a downside of having a plain set, as it does not 
> specify which datatype each IRI corresponds to, while I do not see 
> the positive side of having a plain set. Can someone provide 
> references to evidence that this change is required or has more 
> advantages than drawbacks?
>

You seem to have a very different usage scenario in mind than I do.

My primary use case (and I'm sorry I sometimes forget there are others) 
is the situation where n independent actors publish data in RDF, on 
the web, to be consumed by m independent actors.   The n publishers each 
make a choice about which vocabulary to use; the m consumers each get 
to see what vocabularies are used and then have to decide which IRIs to 
recognize.  There are market forces at work, as publishers want to be as 
accurate and expressive as possible, but they also want to stick to IRIs 
that will be recognized.  Consumers want to make use of as much data as 
possible, but every new IRI they recognize is more work, sometimes a lot 
more work, so they want to keep the recognized set small.

In this kind of situation, datatype IRIs are just like every other IRI; 
all the "standardization" effects are the same.   It's great for both 
producers and consumers if we can pick a core set of IRIs that producers 
can assume consumers will recognize.   Things also work okay if a closed 
group of producers and consumers agree to use a different set.   But one 
of the great strengths of RDF is that the set can be extended without a 
need for prior agreement.  A producer can simply start to use some new 
IRI, and consumers can dereference it, learn what it means, and change 
their code to recognize it.   Of course, it's still painful (details, 
details), but it's probably not as painful as switching to a new data 
format with a new media type.   In fact, because it can be done 
independently for each class, property, individual, and datatype, and 
data can be presented many ways at once, I expect it to be vastly less 
painful.

So, given this usage scenario, I can't see how D helps anybody except as 
a shorthand for saying "the IRIs which are recognized as datatype 
identifiers".
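
To make that reading of D concrete, here is a minimal sketch in Python of
one consumer's view. Everything in it is an assumption for illustration,
not something taken from the RDF specs: the `recognized` table, the
`interpret` function, the result tags, and the example.org datatype IRI
are all hypothetical names, and Python's `int` and `float` only
approximate the XSD lexical spaces. The key set of the table plays the
role of D; an unrecognized datatype IRI is not an error, the literal is
just kept as an opaque term.

```python
from typing import Any, Callable, Dict, Tuple

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical consumer-side table: datatype IRI -> lexical-to-value function.
# Its key set stands in for D, "the IRIs which are recognized as datatype
# identifiers" by this particular consumer.
recognized: Dict[str, Callable[[str], Any]] = {
    # int() is only an approximation of the xsd:integer lexical space.
    XSD + "integer": int,
    XSD + "boolean": lambda s: {"true": True, "1": True,
                                "false": False, "0": False}[s],
}


def interpret(lexical: str, datatype_iri: str) -> Tuple[Any, ...]:
    """Interpret a typed literal from this consumer's point of view."""
    parse = recognized.get(datatype_iri)
    if parse is None:
        # Unrecognized datatype IRI: syntactically fine, not an error,
        # but no special meaning. Keep the literal as an opaque term.
        return ("opaque", lexical, datatype_iri)
    try:
        return ("value", parse(lexical))
    except (ValueError, KeyError):
        # Recognized datatype, but the lexical form is not in its lexical space.
        return ("ill-typed", lexical, datatype_iri)


# A hypothetical new datatype that some publisher has started using.
CELSIUS = "http://example.org/dt#celsius"

before = interpret("21.5", CELSIUS)   # consumer does not recognize it yet

# The consumer decides the IRI is worth the work and extends its own set;
# no prior agreement with the publisher is involved.
recognized[CELSIUS] = float

after = interpret("21.5", CELSIUS)
```

Under these assumptions, `before` is the opaque triple and `after` carries
the parsed value; the only thing that changed is which IRIs this consumer
chose to recognize.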

Pat, does this answer the question of how RDF gets extended to a new 
datatype?    I'm happy to try to work this through in more detail, if 
anyone's interested.

      -- Sandro


>
> AZ.
>
> Le 24/04/2013 05:09, Pat Hayes a écrit :
>> I think we still have a datatype issue that needs a little thought.
>>
>> The D in D-entailment is a parameter. Although RDF is usually treated
>> as having its own special datatypes and the compatible XSD types as
>> being the standard D, it is quite possible to use RDF with a larger D
>> set, so that as new datatypes come along (eg geolocation datatypes,
>> or time-interval datatypes, or physical unit datatypes, to mention
>> three that I know have been suggested) and, presumably, get canonized
>> by appropriate standards bodies (maybe not the W3C, though) for use
>> by various communities, they can be smoothly incorporated into RDF
>> data without a lot of fuss and without re-writing the RDF specs.
>>
>> Do we want to impose any conditions on this process? How can a reader
>> of some RDF know which datatypes are being recognized by this RDF?
>> What do we say about how to interpret a literal whose datatype IRI
>> you don't recognize? Should it be OK to throw an error at that point,
>> or should it *not* be OK to do that? Should we require that RDF
>> extensions with larger D's only recognize IRIs that have been
>> standardly specified in some way? How would we say this?
>>
>> The current semantic story is that a literal
>> "foo"^^unknown:datatypeIRI  is (1) syntactically OK (2) not an error
>> but (3) has no special meaning and is treated just like an unknown
>> IRI, ie it presumably denotes something, but we don't know what. Is
>> this good enough?
>>
>> Pat
>>
>> ------------------------------------------------------------
>> IHMC                     (850)434 8903 or (650)494 3973
>> 40 South Alcaniz St.     (850)202 4416   office
>> Pensacola                (850)202 4440   fax
>> FL 32502                 (850)291 0667   mobile
>> phayesAT-SIGNihmc.us     http://www.ihmc.us/users/phayes
>>
>
Received on Thursday, 25 April 2013 13:37:17 UTC