Re: how to define that a relation is a dataype?

From: Pat Hayes <phayes@ihmc.us>
Date: Mon, 22 Feb 2010 13:33:40 -0600
Cc: Semantic Web <semantic-web@w3.org>, Dan Connolly <connolly@w3.org>, foaf-protocols@lists.foaf-project.org
Message-Id: <2B3A1598-4807-4615-AA49-52514DE8069E@ihmc.us>
To: Story Henry <henry.story@bblfish.net>

On Feb 22, 2010, at 3:47 AM, Story Henry wrote:

>
> On 22 Feb 2010, at 06:57, Pat Hayes wrote:
>
>>
>> On Feb 21, 2010, at 6:15 PM, Story Henry wrote:
>>
>>> I have a relation :hex defined as
>>>
>>> @prefix : <http://www.w3.org/ns/auth/cert#> .
>>>
>>> :hex a owl:DatatypeProperty,
>>>    owl:InverseFunctionalProperty;
>>>  rdfs:label "hexadecimal"@en;
>>>  rdfs:domain :Integer;
>>>  rdfs:range :String;
>>>  vs:term_status "unstable" .
>>>
>>> This relates a number to a string.
>>
>> Fair enough. But be clear: that is *not* a datatype. It is the  
>> inverse of a datatype mapping, in fact. Datatypes always map FROM  
>> strings TO values. They are a way of having a fixed interpretation  
>> mapping for a string.
>
> Ah! If I told you that I thought so, you would not believe me. So  
> luckily I can point
> to an irc log :-)
> http://chatlogs.planetrdf.com/swig/2010-02-21.html#T19-33-52
>
>
>> [snip]
>>>
>>> :x :dollarValue "1234"^^xsd:int .
>>>
>>> if this WERE equivalent to the two relations:
>>>
>>> :x :dollarValue "1234".
>>> "1234" xsd:int 1234 .
>>
>> No, it's certainly not. The literal denotes the value, not the
>> string. So the right way to split that up into two triples would be
>>
>> :x :dollarValue :y .
>> "1234" xsd:int :y .
>>
>> where :y is the literal value, ie in this case, the integer one  
>> thousand two hundred and thirty four. Not a string. You could use a  
>> bnode for :y, of course.
>> [snip]
>>> So it is clear from that the xsd:int is a relation from a number  
>>> to a string.
>>
>> On the contrary, it's clear that it is exactly the other way
>> round :-). See above. Less contentiously, it all depends on which
>> way you write the triples. Do it this way, everything works out as  
>> it should.
>
> Great. Thanks for helping me clarify this.
>
>>>
>>> So now, back to the original question: how do I write in my  
>>> ontology that cert:hex is such a literal type?
>>
>> Well, if you used the cert:xhex form, you can just say
>>
>> cert:xhex rdf:type rdfs:Datatype .
>
> ok so
>
> @prefix cert: <http://www.w3.org/ns/auth/cert#> .
>
> cert:easyHex rdf:type rdfs:Datatype;
>     owl:inverseOf cert:hex;
>     rdfs:range :Integer .
>
>
>>>
>>> :hex a owl:DatatypeProperty,
>>>    owl:InverseFunctionalProperty;
>>>  rdfs:label "hexadecimal"@en;
>>>  rdfs:domain :Integer;
>>>  rdfs:range :String;
>>>  vs:term_status "unstable" .
>>>
>>> There are a number of other questions.
>>>
>>> - What is the set of all numbers? is it xsd:integer ?
>>
>> Yes, a datatype name can be used as the class of all the datatype  
>> values.
>>
>>> - what is the set of all strings?
>>
>> xsd:string
>>
>> Though you should say 'class of..'
>
> I thought that:
> - a property is a set of ordered pairs
> - a type is a set of things
>
>>> If xsd:integer is the set of all numbers, then how can it also be  
>>> a map from numbers to strings?
>>
>> In RDFS, the same name can be used to mean a mapping and a class  
>> and an individual.
>
> But all of these names are URIs, so they can only refer to one thing  
> right? Are you saying it is a weird union object of all these things  
> in rdfs?

Yes, that is exactly what it is. To be highly technical, "it" is the  
individual thing, and then this thing has an "associated" class and  
also a property, which are what the name denotes when it - the name -  
is used in those ways.
(In OWL 2, things are arranged slightly differently under the hood -  
there are three denotation mappings - but it's almost exactly
equivalent.)

If you want to get into the details of the machinery, read the  
semantics section of the ISO Common Logic standard, which works it out  
in excruciating detail. Or you could just take my word for it: it  
really does work.
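
A toy sketch of that arrangement, in Python, in case it helps. It is only an illustration under loud assumptions: the names I, IEXT and ICEXT follow the usual RDF semantics vocabulary, but the "thing:..." tokens and the tiny finite extensions are made up for the example, and nothing here is a real reasoner.

```python
# Toy model: a name denotes ONE individual; that individual has an
# associated class extension and an associated property extension.
# All tokens and the small finite sets below are illustrative assumptions.

# I: name -> the individual thing it denotes (opaque tokens here)
I = {"xhex": "thing:xhex", "rdfs:Datatype": "thing:Datatype"}

# IEXT: individual -> set of (subject, object) pairs, used when the
# name appears in property position
IEXT = {"thing:xhex": {("1", 1), ("a", 10), ("1234", 4660)}}

# ICEXT: individual -> set of members, used when the name appears as a class
ICEXT = {
    "thing:xhex": {1, 10, 4660},      # the class of values
    "thing:Datatype": {"thing:xhex"},  # asserted: xhex is a datatype
}


def holds_as_property(name, subj, obj):
    """True if (subj, obj) is in the property extension of what `name` denotes."""
    return (subj, obj) in IEXT.get(I[name], set())


def holds_as_class(name, member):
    """True if `member` is in the class extension of what `name` denotes."""
    return member in ICEXT.get(I[name], set())
```

The same name "xhex" is looked up once, and the position it is used in decides which of the associated extensions is consulted.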

> Let me expand on this in what follows...
>
>> Same is true in OWL 2 and ISO Common Logic. Saves a lot of name- 
>> inventing. We used this in the datatyping. Used as an individual,  
>> the dtype name means the actual datatype
>
> xhex a rdfs:Datatype .
>
> No trouble for me, as xhex is the set of ordered pairs from string  
> to value, ie:
>
> xhex refers to { <"1" 1> <"2" 2> <"3" 3> ... <"1234" 4660> ....}

Well, not *quite*. 'xhex' refers to a(n individual thing called a)  
datatype, and that datatype/thing has an associated property mapping  
which is that set of pairs. So when you use the name 'xhex' in a place  
that means a property mapping, this is what you are referring to.
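
To make the string-to-value direction concrete, here is a minimal Python sketch. The function names xhex_l2v and literal_value are hypothetical, and reading the lexical form with plain base-16 parsing is an assumption about what xhex is meant to do.

```python
def xhex_l2v(lexical: str) -> int:
    """Hypothetical lexical-to-value mapping for xhex: string in, integer out."""
    return int(lexical, 16)


def literal_value(lexical: str, datatype_map) -> int:
    """The value denoted by the typed literal "lexical"^^datatype."""
    return datatype_map(lexical)


# "1234"^^:xhex denotes an integer, not the string "1234"
value = literal_value("1234", xhex_l2v)

# ... and (lexical form, value) is one pair of the datatype's mapping
pair = ("1234", value)
```

Here the pair comes out as the string "1234" together with the integer 4660, which matches the pair in the list quoted above.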

>
> and presumably an rdfs:Datatype is just a set of such
> sets of
> ordered pairs. ie
>
> rdfs:Datatype refers to
> { ...
> { <"0" 0> <"1" 1> <"10" 2> <"11" 3> ... }
> { <"0" 0> <"1" 1> <"2" 2> <"3" 3> ... <"1234" 4660> .... }...
> ... }

Well, yes, kinda. But not exactly.  Actually, strictly, it refers to a  
thing which has an associated class extension which is a set of  
things, each of which has an associated mapping which is a set of  
pairs like this. The RDF/S semantics works this way throughout.  
Usually you don't need to think about this level of detail in the  
semantic machinery, however.

>
> and indeed the set referred to by xhex is an element of the  
> rdfs:Datatype set.

The datatype is in the rdfs:Datatype class when you assert that it is.  
Until then, it may or may not be.

>
>> as a thing, so we can say that "it" is a datatype for example.
>
> ok, so you mean when using like this
>
> x :dollarValue "1234"^^:xhex .
>
> we are saying
>
> x dollarValue 4660 .

Yes, though that's not quite what I meant. I was thinking more that

:xhex rdf:type rdfs:Datatype .

is about the datatype itself, not (directly) about the datatype mapping.

>
> I have no problem with that, because as you pointed out above, we  
> can think of ^^ as just a shorthand for
>
> x :dollarValue 4660 .
> "1234" :xhex 4660 .
>
> And indeed the second sentence is true since the pair <"1234", 4660>  
> is an element of the xhex set. And the first sentence is also true,  
> because the  dollarValue relates x to a number, and the value of x  
> (let's assume) is indeed 4660 dollars .
>
> So we have not had to do any magic yet: xhex still points to our  
> relationship.
>
>> Used as a class name, its the class of all the values.
>
> that's odd! When do we need this?
> Oh yes, because we want to say that the domain of something
>
> xhex a owl:DatatypeProperty;
>   rdfs:domain xsd:string ;
>   rdfs:range xhex .
>
> But for the above to make sense either:
>
> [A] xsd:string would have to refer to
>
> { "a string", "123", "another string", ... }
>
> and xhex would have to refer to
>
> { .... 0 1 2 3 4 5 6 7 8 .... }
>
> but just earlier we had xhex refer to a set of pairs, and xsd:string  
> too...

It's OK, relax. It refers to them all. But each use of it - the name -
refers to just one of them, depending on how the name is used. The
surrounding syntax completely determines whether it's appropriate to
use the individual, class or property interpretation.

>
> [B] the rdfs:range relation be a relation from sets-of-ordered-pairs  
> (properties) to sets:  -
> - where usually the range of the rdfs:range relation is a superset  
> of the unions of all the elements in second position in the set of  
> ordered pairs
> - but IF the range is literal, then the range of domain can be the  
> exact same set as the domain
>
>    xhex rdfs:range xhex .
>
> is true because xhex being a set of pairs, and rdfs:range just have  
> been defined in this odd way, it's true.
>
> But why make something simple complicated just because we want to  
> say that?

It's not complicated, unless you keep tinkering with the machinery
under the hood :-). Just go ahead and use the URI in any way you want  
to use it, as long as you do it consistently.

> In order to save just one new URL?

It saves millions of URLs, and more importantly, it saves having to worry
about how they are related to one another. Look, if this really  
bothers you, you are free to invent all those extra URIs and use them  
consistently. So you could have

:xsd-integer-datatype
:xsd-integer-class
:xsd-integer-datatypemap

and write things like

:xsd-integer-datatype rdf:type rdfs:Datatype .
:xsd-integer-datatypemap rdfs:range :xsd-integer-class .

and presumably you will also need connecting assertions such as

:xsd-integer-datatypemap :isDatatypeMapOf :xsd-integer-datatype .

After a while, you will notice patterns like:

:xsd-<foo>-datatypemap :isDatatypeMapOf :xsd-<foo>-datatype .
:xsd-<foo>-datatypemap rdfs:range :xsd-<foo>-class .

for any <foo>, but you won't be able to write these patterns in RDF
(or indeed OWL. Maybe in RIF, I haven't checked, but I doubt it.) But
in the current semantics the first one isn't needed at all, and the
second is:

<foo> rdfs:range <foo> .


for any <foo> in rdfs:Datatype.  Much simpler; and more to the point,  
you don't have to remember it, because it's written into RDFS as a
built-in semantic condition.
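
As a rough illustration of what "<foo> rdfs:range <foo>" amounts to, here is a small Python check. Everything in it is an assumption made for the example: the two toy datatypes, their membership tests, and the handful of sample lexical forms stand in for the real (infinite) mappings.

```python
# Each toy datatype pairs a lexical-to-value function (the property use of
# the name) with a membership test for its class of values (the class use).
# Both entries are illustrative assumptions, not definitions from any spec.
DATATYPES = {
    "xsd:integer": (lambda s: int(s, 10), lambda v: isinstance(v, int)),
    "xhex": (lambda s: int(s, 16), lambda v: isinstance(v, int) and v >= 0),
}

# A few sample lexical forms per datatype, standing in for the full mapping
SAMPLES = {
    "xsd:integer": ["0", "-7", "1234"],
    "xhex": ["0", "ff", "1234"],
}


def range_holds(name: str) -> bool:
    """Check '<name> rdfs:range <name>' on the samples: every value the
    mapping produces must be a member of the class with the same name."""
    l2v, in_class = DATATYPES[name]
    return all(in_class(l2v(s)) for s in SAMPLES[name])
```

The one function covers every datatype in the table, which is the point: the pattern is stated once, for any <foo>, with no extra names to keep in step.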

>
> Much easier to say :String is the set of all strings, and :Integer  
> is the set of all integers, like this
>
> xsd:integer rdfs:domain :String;
>            rdfs:range :Integer .
>
> Then it's much easier to understand what we are speaking about.

Why is it easier to say :String than xsd:string? And the latter is in  
the public domain already.

>
>> Used as a property, it is the string-to-value mapping.
>
> as I pointed out above, a property is a set of pairs, so there is no  
> problem here.
>
> So it looks like we can be much clearer just by defining classes for  
> literals instead of using xsd:integer and cert:easyHex both as a  
> mapping and as a set of instances. [ but I imagine there is  
> something I have missed here]

I guess it depends on what counts as clarity for you. I've been using
this RDF/Common Logic style myself now for years, and it's as clear as
day. I recommend trying to wean yourself away from the idea that there
has to be a single thing that each name 'means', and moreover that
this single thing is always an extension of some kind. (A set, a set
of pairs.) Neither is true in RDFS.

>
> This is then how one could define cert:easyHex:
>
> @prefix cert: <http://www.w3.org/ns/auth/cert#> .
>
> cert:easyHex rdf:type rdfs:Datatype;
>     owl:inverseOf cert:hex;
>     rdfs:range :Integer;
>     rdfs:domain :String;
>     rdfs:comment """
>   An encoding of a positive integer (from 0 to infinity) as a  
> hexadecimal string that makes it easy to read and/or fun to present  
> on the web.
>   The purpose of this way of representing hexadecimals is to enable  
> users to copy and paste hexadecimal notations as shown by most  
> browsers, keychains or tools such as opensso, into their rdf  
> representation of choice.  There are a wide variety of ways in which  
> such strings can be presented. One finds the following
>
>  e1 dc d5 e1 00 8f 21 5e d5 cc 7c 7e c4 9c ad 86
>  64 aa dc 29 f2 8d d9 56 7f 31 b6 bd 1b fd b8 ee
>  51 0d 3c 84 59 a2 45 d2 13 59 2a 14 82 1a 0f 6e
>  d3 d1 4a 2d a9 4c 7e db 90 07 fc f1 8d a3 8e 38
>  25 21 0a 32 c1 95 31 3c ba 56 cc 17 45 87 e1 eb
>  fd 9f 0f 82 16 67 9f 67 fa 91 e4 0d 55 4e 52 c0
>  66 64 2f fe 98 8f ae f8 96 21 5e ea 38 9e 5c 4f
>  27 e2 48 ca ca f2 90 23 ad 99 4b cc 38 32 6d bf
>
> Or the same as the above, with ':' instead of spaces. We can't  
> guarantee that these are the only ways such tools will present  
> hexadecimals, so we are very lax.
> The letters can be uppercase or lowercase, or mixed.
> Some strings may start with initial 00's which would be very  
> important if the number were in complement of 2 notation, where in  
> some cases this could be the difference between a positive and a  
> negative number, in particular if the number starts with one of [8- 
> f].  But as we interpret this string as a hexadecimal number leading  
> 00s are not important  (Complement of 2 notation and hexadecimal  
> overlap for positive numbers)
> In order to make this fun, we allow any unicode characters in the  
> string. A parser should
>  1. remove all non hexadecimal characters
>  2. treat the resulting as a hexadecimal representation of a number
> This will allow people to make an ascii - better yet a UTF-8 -  
> picture of their public key when publishing it on the web.
>   """@en .
>

Looks good to me :-)
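
For what it's worth, the two parser steps in that comment can be sketched in a few lines of Python. The function name parse_easy_hex is hypothetical, and raising an error when no hexadecimal digits remain is an assumption, since the comment does not say what an empty result should mean.

```python
import re

# Anything that is not an ASCII hexadecimal digit (spaces, colons,
# newlines, other Unicode characters) is dropped in step 1.
_NON_HEX = re.compile(r"[^0-9a-fA-F]")


def parse_easy_hex(text: str) -> int:
    """Sketch of the two steps described in the rdfs:comment."""
    # Step 1: remove all non-hexadecimal characters
    digits = _NON_HEX.sub("", text)
    if not digits:
        # Assumption: a string with no hex digits at all is an error
        raise ValueError("no hexadecimal digits found")
    # Step 2: treat the result as a hexadecimal number; case and
    # leading 00s make no difference to the value
    return int(digits, 16)
```

For example, parse_easy_hex("e1 dc d5 e1") and parse_easy_hex("E1:DC:D5:E1") give the same integer, and "00 ff" gives 255. Note that letters a-f inside any decorative text are hex characters too, so they stay in and count as digits.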

Pat


>
>
>> Pat Hayes
>>
>>>
>>> Henry
>>>
>>>
>>> [1] http://www.w3.org/TR/rdf-mt/#dtype_interp
>>>
>>> Social Web Architect
>>> http://bblfish.net/
>
>

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Monday, 22 February 2010 19:34:13 GMT
