- From: Garret Wilson <garret@globalmentor.com>
- Date: Wed, 01 Aug 2007 12:32:54 -0700
- To: Story Henry <henry.story@bblfish.net>
- CC: Semantic Web <semantic-web@w3.org>
Henry,

I promise I'm going to try to take a few hours' break from this discussion and get some work done. Too bad we can't sit down, have a beer, and throw french fries at each other while we talk about this. :) Just a couple more comments until later this evening or tomorrow:

Story Henry wrote:
>> You still didn't give me an example of what could *not* be a literal,
>> even though you stated that "there are in fact limitations on what
>> can be a Literal."
>
> George Bush can not be a literal. I think that is clear. Even if he
> thinks literally, that is without looking at the world.

I will not comment on George Bush's thinking ;) , but I know that I can A) create a controlled set of lexical representations to represent US presidents, B) identify that set of lexical representations using the datatype eg:uspresidents, and C) identify George Bush using "George W. Bush"^^eg:uspresidents. I have identified George Bush using RDF, and in the RDF model he has become an instance of rdfs:Literal, no more and no less than the number 123 is an instance of rdfs:Literal in my model when I use "123"^^xsd:integer.

> In which model does it have to be presented that way? You mean in the
> spec right? But that is just a way to make sure we all agree on
> something. A bit like a Java reference implementation for say the
> Servlet API. Everyone can create more efficient ones later.

I think that this discussion is having problems because you don't recognize that there is something in between your database and your N3 notation called the "RDF model". It specifies the canonical way in which resources and their properties are understood with respect to a particular framework (RDF), independent of how they are stored in the database, independent of how you specify them in a text file using N3, and independent of the syntax you use to query them. The RDF data model is analogous to the W3C Document Object Model (DOM) for XML.
Most of your discussion has been about how I specify numbers in N3, or how I store them in the database. I'm talking about how resources are represented in the RDF model. Yes, the RDF model is *described* by the RDF specification, but the specification is not the model. See http://www.w3.org/TR/rdf-primer/#rdfmodel for an introduction. Then see http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-data-model . These specifications define how RDF views the things that exist in the world, independent of the serialization syntax or database representation. (The model is different from the Java servlet API because the Java servlet specification only specifies the interface; it doesn't specify what a servlet engine does inside. The RDF specification, on the other hand, is very interested in what your N3 representation *means*---what model is formed of the world by your serialization.)

Conflating the serialization with the RDF model, or the database representation with the RDF model, would certainly lead you to say that I can just "fix the problem" by getting enough people to use URIs for integers. That won't solve the problem. The RDF data model sees things identified by strings as different types of things than those identified by URIs. This is a fundamental problem with the RDF data model, and the only solution is to change the model---the way that RDF characterizes the world.

> Did you go and read http://www.w3.org/TR/swbp-xsch-datatypes/
> or any of the other pointers people have been giving you?
>
> I know you have not because you are responding within minutes of the
> reply reaching you.

Now, let's not make personal attacks or rash assumptions. You gave me the link above. I went there. I skipped the section "Status of this Document". Then I skipped "Table of Contents".
I skipped "Introduction" and "Reading this Document" (although if I had fully read "Reading this Document" I would have seen that it says, "many readers will benefit from skipping sections."). Then I skipped "Namespaces Used in this Document." I went straight to the section which is relevant to our discussion: "3. Comparison of Values", which states the problem you raised:

"What is the relationship between the value spaces of the various XML Schema built-in simple types when used within RDF and OWL? Or in other words, when do two literals, which are written down differently, refer to the same value. For example, "10"^^xsd:integer and "010"^^xsd:integer both denote the integer ten."

And that's the relevant portion. You asserted that, if two xsd:integers have different lexical forms, they represent distinct resources, unlike URIs, which may identify the same resource. I read through the section, which gave concrete examples exactly contrary to your assertion. So no, I didn't read the section about the namespaces being used, or the 20% of the document that the "References" take up. But I did read the relevant sections of the document, and I cited examples from that document that dispute what you claimed to be a benefit of rdfs:Literal over normal resources identified by URIs.

>> But I point out that you have exactly the same situation with
>> literals---just because the lexical form is different doesn't mean
>> that they refer to different resources. Let me quote from the same
>> document http://www.w3.org/TR/swbp-xsch-datatypes/ you point to above:
>>
>> "For example, "10"^^xsd:integer and "010"^^xsd:integer both denote
>> the integer ten."
>>
>> But it gets even worse for your case. Quoting again:
>>
>> ""15"^^xsd:byte and "15.0"^^xsd:decimal both denote the same
>> value, fifteen. This follows because xsd:byte has primitive base
>> datatype xsd:decimal."
>
> this is not a problem:
> - How each of them map to each other is well known.
> - How you store it is up to you, so the fact that you can name them
>   differently is not a problem
> - you know all you need to know about them when you have their name.

Yes, I know all this---but you were saying that somehow rdfs:Literal brings some benefits over normal resources with URIs. You have not demonstrated that, and the one example you gave, about how you don't need to have OWL statements to say that two literals are distinct, could be said about URIs that are specified to be integers as well.

> The question can really be turned around: what are you gaining by just
> having URIs.

By having URIs I have a consistent data model without some strange type of resources that need extra types of querying or decisions in my program. I can treat everything the same.

> Can you let us know how this is causing you any trouble?

Let's say I use my RDF library in Java to read in some N3:

    RDFProcessor rdfProcessor=new RDFProcessor();
    RDFDataModel rdf=rdfProcessor.process(new FileInputStream("my.n3"));

Then I find all the books in the data model:

    Resource[] books=rdf.getResourcesByType("eg:Book");

Which is maybe a convenience method for this:

    Resource[] books=rdf.getResourcesByPropertyValue("rdf:type", "eg:Book");

So if I call books[0].getProperty("rdf:type"), it gives me "eg:Book". No problem. So what if I call books[0].getProperty("dc:title").getProperty("rdf:type")? It doesn't give me anything, because the object of the first book's dc:title property is not a resource---well, RDF says (hands waving) that it's a resource, but it treats it differently from other resources, making it some strange rdfs:Literal thingy. I want to call books[0].getProperty("dc:title").getProperty("rdf:type") and get "xsd:integer", just like other resources.

So let's take that further. Let's get the number of pages of the book.

    Resource bookPageCount=books[0].getProperty("eg:pageCount");

So now I have the book's page count. The page count is a resource---so RDF would tell you.
But really, I have to do something like this:

    if(bookPageCount instanceof Literal)
    {
      //do something if the resource is a literal
    }
    else
    {
      //do something if the resource is a non-literal resource
    }

This is just crazy. And unnecessary. And it just gets worse from here. Let's say that I want to get all the books in the data model with an author of <http://example.com/us/presidents/GeorgeWBush>. That's easy:

    Resource[] books=rdf.getResourcesByPropertyValue("dc:author", URI.create("http://example.com/us/presidents/GeorgeWBush"));

But how do I get all the books with exactly 100 pages? Well, if RDF were consistent, I could do the same thing:

    Resource[] books=rdf.getResourcesByPropertyValue("eg:pageCount", URI.create("http://example.com/integers/100"));

But I can't do that. Why? Because "100" is a literal. So? Isn't it a resource as well? Well, yes---sort of.

So let me make it easier. Let me just get all the books authored by a resource of type foaf:Person:

    Resource[] books=rdf.getResourcesBySubPropertyValue("dc:author", "rdf:type", "foaf:Person");

No problem. Whee! This is easy. So now let's find all the books with page counts of type xsd:integer:

    Resource[] books=rdf.getResourcesBySubPropertyValue("eg:pageCount", "rdf:type", "xsd:integer");

What? I can't do that? But isn't 100 a resource, too? Shouldn't it have an rdf:type, just like any other resource? Why do I have to treat it differently?

The treatment of literals in RDF is a pain, it is strange, it is inconsistent, and it is completely unnecessary. This is a problem with the RDF model---the way that RDF characterizes the world---and only changing the RDF model will fix the problem.

Garret
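
The asymmetry described in this message can be shown in a minimal, self-contained sketch. Every name in it (TypeLookupSketch, Node, UriNode, LiteralNode, typeOf) is hypothetical and invented for illustration; none of it comes from a real RDF library. It assumes only that a typed literal carries its datatype inside the node, while a URI-identified resource gets its type from an ordinary rdf:type property.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not a real RDF API: every name below is invented for illustration.
public class TypeLookupSketch {

    // A node in the graph: either a URI-identified resource or a typed literal.
    interface Node {}

    // A resource identified by a URI; its rdf:type (if any) is stored as an ordinary property.
    static final class UriNode implements Node {
        final String uri;
        final Map<String, Node> properties = new HashMap<>();

        UriNode(String uri) {
            this.uri = uri;
        }
    }

    // A typed literal such as "100"^^xsd:integer; the datatype is part of the node itself.
    static final class LiteralNode implements Node {
        final String lexicalForm;
        final String datatype;

        LiteralNode(String lexicalForm, String datatype) {
            this.lexicalForm = lexicalForm;
            this.datatype = datatype;
        }
    }

    // Returns the type of a node, or null if none is known.
    // The question is the same for both kinds of node, but the lookup
    // has to branch on the node's Java class to answer it.
    static String typeOf(Node node) {
        if (node instanceof LiteralNode) {
            return ((LiteralNode) node).datatype;
        }
        if (node instanceof UriNode) {
            Node type = ((UriNode) node).properties.get("rdf:type");
            return (type instanceof UriNode) ? ((UriNode) type).uri : null;
        }
        return null;
    }

    public static void main(String[] args) {
        UriNode book = new UriNode("http://example.com/books/1");
        book.properties.put("rdf:type", new UriNode("eg:Book"));
        book.properties.put("eg:pageCount", new LiteralNode("100", "xsd:integer"));

        System.out.println(typeOf(book));                                // prints eg:Book
        System.out.println(typeOf(book.properties.get("eg:pageCount"))); // prints xsd:integer
    }
}
```

If every node were modeled the same way, typeOf would presumably shrink to the single rdf:type lookup, which is the simplification argued for above.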
Received on Wednesday, 1 August 2007 19:33:06 UTC