- From: Garret Wilson <garret@globalmentor.com>
- Date: Wed, 01 Aug 2007 12:32:54 -0700
- To: Story Henry <henry.story@bblfish.net>
- CC: Semantic Web <semantic-web@w3.org>
Henry,
I promise I'm going to try to take a few hours' break from this
discussion and get some work done. Too bad we can't sit down, have a
beer, and throw french fries at each other while we talk about this. :)
Just a couple more comments until later this evening or tomorrow:
Story Henry wrote:
>>
>> You still didn't give me an example of what could *not* be a literal,
>> even though you stated that "there are in fact limitations on what
>> can be a Literal."
>
> George Bush can not be a literal. I think that is clear. Even if he
> thinks literally, that is without looking at the world.
I will not comment on George Bush's thinking ;) , but I know that I can
A) create a controlled set of lexical representations to represent US
presidents, B) identify that set of lexical representations using the
datatype eg:uspresidents, and C) identify George Bush using "George W.
Bush"^^eg:uspresidents. I have identified George Bush using RDF, and in
the RDF model he has become an instance of rdfs:Literal, no more and no
less than the number 123 is an instance of rdfs:Literal in my model when
I use "123"^^xsd:integer.
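To make steps A through C concrete, here is a minimal Java sketch of a datatype treated as a mapping from a controlled set of lexical forms to the things they denote. The class name PresidentsDatatype, the second map entry, and the choice of a URI string as the denoted value are assumptions made for illustration only; nothing here is part of RDF or of any real library.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: a datatype modeled as a lexical-to-value mapping.
// The entries and the URI strings are illustrative assumptions.
final class PresidentsDatatype {
    // Controlled set of lexical forms, each mapped to the thing it denotes.
    private static final Map<String, String> LEXICAL_TO_VALUE = Map.of(
        "George W. Bush", "http://example.com/us/presidents/GeorgeWBush",
        "Bill Clinton", "http://example.com/us/presidents/BillClinton");

    // Returns the denoted value, or empty if the lexical form is not in the set.
    static Optional<String> valueOf(String lexicalForm) {
        return Optional.ofNullable(LEXICAL_TO_VALUE.get(lexicalForm));
    }
}
```

A lexical form outside the controlled set, such as "Nobody", maps to nothing, which is the sense in which the set is controlled.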
>
> In which model does it have to be presented that way? You mean in the
> spec right? But that is just a way to make sure we all agree on
> something. A bit like a Java reference implementation for say the
> Servlet API. Everyone can create more efficient ones later.
I think that this discussion is having problems because you don't
recognize that there is something in between your database and your N3
notation called the "RDF model". It specifies the canonical way in which
resources and their properties are understood with respect to a
particular framework (RDF), independent of how it is stored in the
database, independent of how you specify it in a text file using N3, and
independent of the syntax you use to query it. The RDF data model is
analogous to the W3C Document Object Model (DOM) for XML.
Most of your discussion has been about how I specify numbers in N3, or
how I store them in the database. I'm talking about how resources are
represented in the RDF model. Yes, the RDF model is *described* by the
RDF specification, but the specification is not the model. See
http://www.w3.org/TR/rdf-primer/#rdfmodel for an introduction. Then see
http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-data-model
. These specifications define how RDF views the things that exist in the
world, independent of the serialization syntax or database
representation. (The model is different from the Java servlet API
because the Java servlet specification only specifies the interface; it
doesn't specify what a servlet engine does inside. The RDF
specification, on the other hand, is very interested in what your N3
representation *means*---what model is formed of the world by your
serialization.)
The conflation of serialization with the RDF model, and/or the
conflation of the database representation with the RDF model, would
certainly lead you to say that I can just "fix the problem" by getting
enough people to use URIs for integers. That won't solve the problem.
The RDF data model sees things identified by strings as different types
of things than those identified by URIs. This is a fundamental problem
with the RDF data model, and the only solution is to change the
model---the way that RDF characterizes the world.
>
> Did you go and read http://www.w3.org/TR/swbp-xsch-datatypes/
> or any of the other pointers people have been giving you?
>
> I know you have not because you are responding within minutes of the
> reply reaching you.
Now, let's not make personal attacks or rash assumptions. You gave me
the link above. I went there. I skipped the section "Status of this
Document". Then I skipped "Table of Contents". I skipped "Introduction"
and "Reading this Document" (although if I had fully read
"Reading this Document" I would have seen that it says, "many readers
will benefit from skipping sections."). Then I skipped "Namespaces Used
in this Document."
I went straight to the section which is relevant to our discussion: "3.
Comparison of Values", which states the problem you raised: "What is the
relationship between the value spaces of the various XML Schema built-in
simple types when used within RDF and OWL? Or in other words, when do
two literals, which are written down differently, refer to the same
value. For example, "10"^^xsd:integer and "010"^^xsd:integer both denote
the integer ten." And that's the relevant portion. You asserted that, if
two xsd:integers have different lexical forms, they represent distinct
resources, unlike URIs, which may identify the same resource. I read
through the example, which gave concrete examples exactly contrary to
your assertion.
So no, I didn't read the section about the namespaces being used, or the
20% of the document that the "References" take up. But I did read the
relevant sections of the document, and I cited examples from that
document that dispute what you claimed to be a benefit of rdfs:Literal
over normal resources identified by URIs.
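The point that different lexical forms can denote the same value can be checked with a small Java sketch. It uses java.math.BigInteger and java.math.BigDecimal as stand-ins for the xsd:integer and xsd:decimal value spaces; that substitution, and the class name LexicalForms, are assumptions for illustration, and Java's parsing rules are not identical to the XML Schema lexical rules.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper: compares lexical forms by the value they denote.
final class LexicalForms {
    // True when two integer lexical forms denote the same integer,
    // so "10" and "010" are treated as the same value.
    static boolean sameInteger(String a, String b) {
        return new BigInteger(a).equals(new BigInteger(b));
    }

    // True when two decimal lexical forms denote the same number.
    // compareTo ignores scale, so "15" and "15.0" compare as equal;
    // BigDecimal.equals would not, because it also compares scale.
    static boolean sameDecimal(String a, String b) {
        return new BigDecimal(a).compareTo(new BigDecimal(b)) == 0;
    }
}
```

Calling LexicalForms.sameInteger("10", "010") and LexicalForms.sameDecimal("15", "15.0") both yield true, mirroring the two examples quoted from the document.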
>> But I point out that you have exactly the same situation with
>> literals---just because the lexical form is different doesn't mean
>> that they refer to different resources. Let me quote from the same
>> document http://www.w3.org/TR/swbp-xsch-datatypes/ you point to above:
>>
>> "For example, "10"^^xsd:integer and "010"^^xsd:integer both denote
>> the integer ten."
>>
>> But it gets even worse for your case. Quoting again:
>>
>> "|"15"^^xsd:byte| and |"15.0"^^xsd:decimal| both denote the same
>> value, fifteen. This follows because xsd:byte has primitive base
>> datatype xsd:decimal."
>
> This is not a problem:
> - How each of them map to each other is well known.
> - How you store it is up to you, so the fact that you can name them
> differently is not a problem
> - you know all you need to know about them when you have their name.
Yes, I know all this---but you were saying that somehow rdfs:Literal
brings some benefits over normal resources with URIs. You have not
demonstrated that, and the one example you gave, about how you don't
need to have OWL statements to say that two literals are distinct, could
be said about URIs that are specified to be integers as well.
> The question can really be turned around: what are you gaining by just
> having URIs.
By having URIs I have a consistent data model without some strange type
of resources that need extra types of querying or decisions in my
program. I can treat everything the same.
>
> Can you let us know how this is causing you any trouble?
Let's say I use my RDF library in Java to read in some N3:
RDFProcessor rdfProcessor=new RDFProcessor();
RDFDataModel rdf=rdfProcessor.process(new FileInputStream("my.n3"));
Then I find all the books in the data model:
Resource[] books=rdf.getResourcesByType("eg:Book");
Which is maybe a convenience method for this:
Resource[] books=rdf.getResourcesByPropertyValue("rdf:type", "eg:Book");
So if I call books[0].getProperty("rdf:type"), it gives me "eg:Book". No
problem.
So what if I call
books[0].getProperty("dc:title").getProperty("rdf:type")? It doesn't
give me anything, because the object of the first book's dc:title
property is not a resource---well, RDF says (hands waving) that it's a
resource, but it treats it differently from other resources, making it
some strange rdfs:Literal thingy. I want to call
books[0].getProperty("dc:title").getProperty("rdf:type") and get
"xsd:string", just like other resources.
So let's take that further. Let's get the number of pages of the book.
Resource bookPageCount=books[0].getProperty("eg:pageCount");
So now I have the book's page count. The page count is a resource---so
RDF would tell you. But really, I have to do something like this:
if(bookPageCount instanceof Literal)
{
  //do something if the resource is a literal
}
else
{
  //do something if the resource is a non-literal resource
}
This is just crazy. And unnecessary.
And it just gets worse from here. Let's say that I want to get all the
books in the data model with an author of
<http://example.com/us/presidents/GeorgeWBush>. That's easy:
Resource[] books=rdf.getResourcesByPropertyValue("dc:author",
URI.create("http://example.com/us/presidents/GeorgeWBush"));
But how do I get all the books with exactly 100 pages? Well, if RDF were
consistent, I could do the same thing:
Resource[] books=rdf.getResourcesByPropertyValue("eg:pageCount",
URI.create("http://example.com/integers/100"));
But I can't do that. Why? Because "100" is a literal. So? Isn't it a
resource as well? Well, yes---sort of.
So let me make it easier. Let me just get all the books authored by a
resource of type foaf:Person:
Resource[] books=rdf.getResourcesBySubPropertyValue("dc:author",
"rdf:type", "foaf:Person");
No problem. Whee! This is easy. So now let's find all the books with
page counts of type xsd:integer:
Resource[] books=rdf.getResourcesBySubPropertyValue("eg:pageCount",
"rdf:type", "xsd:integer");
What? I can't do that? But isn't 100 a resource, too? Shouldn't it have
an rdf:type, just like any other resource? Why do I have to treat it
differently?
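For comparison, here is a rough Java sketch of what a uniform model might look like, in which the integer 100 is an ordinary node that can carry an rdf:type like anything else. Everything in it is hypothetical: the Node and UniformModel classes are invented for this illustration, the method name getResourcesBySubPropertyValue is borrowed from the snippets above, and the integer URI is the example one used earlier. It is a toy in-memory structure, not the RDF data model and not an existing library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy node: every value, including a number, is a Node
// identified by a string id and able to hold properties of its own.
final class Node {
    final String id;
    private final Map<String, Node> properties = new HashMap<>();

    Node(String id) { this.id = id; }

    // Sets a property and returns this node so calls can be chained.
    Node set(String property, Node value) {
        properties.put(property, value);
        return this;
    }

    // Returns the property value, or null if the property is absent.
    Node getProperty(String property) { return properties.get(property); }
}

// Hypothetical toy model holding the nodes that can be queried.
final class UniformModel {
    private final List<Node> nodes = new ArrayList<>();

    Node add(Node node) {
        nodes.add(node);
        return node;
    }

    // Finds nodes whose `property` value itself has `subProperty`
    // pointing at a node with id `subValueId`.
    List<Node> getResourcesBySubPropertyValue(String property,
            String subProperty, String subValueId) {
        List<Node> result = new ArrayList<>();
        for (Node node : nodes) {
            Node value = node.getProperty(property);
            Node subValue = value == null ? null : value.getProperty(subProperty);
            if (subValue != null && subValue.id.equals(subValueId)) {
                result.add(node);
            }
        }
        return result;
    }
}
```

In this sketch the same query method serves both the author query and the page-count query, with no instanceof check, because the page count is stored as a node rather than as a separate kind of object.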
The treatment of literals in RDF is a pain, it is strange, it is
inconsistent, and it is completely unnecessary. This is a problem with
the RDF model---the way that RDF characterizes the world---and only
changing the RDF model will fix the problem.
Garret
Received on Wednesday, 1 August 2007 19:33:06 UTC