- From: Butler, Mark <Mark_Butler@hplb.hpl.hp.com>
- Date: Mon, 8 Dec 2003 15:23:56 -0000
- To: SIMILE public list <www-rdf-dspace@w3.org>
- Message-ID: <E864E95CB35C1C46B72FEA0626A2E808206335@0-mail-br1.hpl.hp.com>
Hi Stefano

> > I think there are two reasons why getable / ungetable has been
> <snip>
>
> Ok. Question: is that so bad? I mean, URIs might be designed to be
> gettable, but not there yet... or contain stuff that Haystack could
> parse, but not understand. What is the default behaviour of Haystack
> when it encounters a 404? what is Haystack expecting? or what would be
> better to serve from those resource dereference?

I think people from the Haystack team would be the best people to answer
this question, especially as I may have described Haystack's behavior
incorrectly, so perhaps they could give some background?

I would characterise the Haystack behavior as unusual, in that other RDF
applications encounter RDF where the URLs do not exist, so they simply do
not try to retrieve those URLs. That's not to say those applications have
got it right - it is arguable that Haystack has got it right and the
others have got it wrong. The difference here is that Haystack has
adopted a particular processing model.

One thing I've noted before is that there is no standard processing model
for the Semantic Web, so I worry this could make some of the
interoperability and seamless discovery described in some of the SW
scenarios hard to achieve. By processing model, I mean what you do once
you've received a piece of RDF, e.g.:

- Do you retrieve the schema from the namespace?
- Do you retrieve resources from URLs used in the RDF?
- If so, how do you treat those resources - as additional subgraphs of
RDF, or as resources, such as HTML pages or JPEGs, that are described by
the RDF? Do you distinguish by MIME type, etc.?
- Do you query some kind of web service to determine additional
information about a URL?

Processing models like this may be necessary to allow discovery of
metadata about resources where the metadata has not been created by the
resource owner.
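To make the questions above concrete, here is a minimal sketch of such a
processing model - a hypothetical illustration only, not Haystack's
actual behaviour. The fetcher is injected so the policy can be exercised
without network access; all names here are made up:

```python
# Sketch of a "processing model": given RDF triples, decide which URIs
# to dereference and how to treat what comes back. Illustrative only.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Fetched:
    mime_type: str   # e.g. "application/rdf+xml" or "text/html"
    body: str

# A fetcher returns None to simulate a 404 / unresolvable URI.
Fetcher = Callable[[str], Optional[Fetched]]

def process(triples, fetcher: Fetcher):
    """Classify every URI mentioned in the triples.

    Returns a dict mapping each URI to one of:
      "skipped"   - not a retrievable scheme (e.g. a urn:)
      "missing"   - dereference attempted, nothing there (the 404 case)
      "subgraph"  - RDF came back; merge it as an additional subgraph
      "described" - a non-RDF resource (HTML, JPEG, ...) described by the RDF
    """
    decisions = {}
    uris = {term for t in triples for term in t}
    for uri in sorted(uris):
        if not uri.startswith(("http://", "https://")):
            decisions[uri] = "skipped"       # nothing to GET, e.g. a urn:
            continue
        resource = fetcher(uri)
        if resource is None:
            decisions[uri] = "missing"
        elif resource.mime_type == "application/rdf+xml":
            decisions[uri] = "subgraph"
        else:
            decisions[uri] = "described"
    return decisions
```

For example, feeding it one triple whose terms are an HTML page, an RDF
schema, and a urn: would classify them as "described", "subgraph" and
"skipped" respectively.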
> > Assuming we stick with URLs then there are two ways to overcome this
> > problem a) put something behind the URL or b) make the URL to a URN,
> > so Haystack doesn't try to retrieve anything so the revised data
> > tries to take approach a).
>
> I'm sorry, I can parse the above but I can't get around the logic
> behind it. Can you elaborate more on the alternatives you envision?

I'll try to separate the how from the why. First the how:

a) We have metadata URLs like
http://web.mit.edu/simile/metadata/metadatasubset#instance1 and
http://web.mit.edu/simile/metadata/metadatasubset#instance2, so to avoid
the problem of not being able to retrieve these resources we place a
piece of HTML at http://web.mit.edu/simile/metadata/metadatasubset.

b) We change the metadata URLs, e.g. from
http://web.mit.edu/simile/metadata/metadatasubset#instance1 to
urn:x-mit-hp-w3c-simile/metadatas/metadatasubset:instance1, so software
knows there is no further information available.

As for the why: the advantage of approach a) over b) is that we can put
some data at http://web.mit.edu/simile/metadata/metadatasubset#instance1
in the future. However, the advantage of b) over a) is that a) can be
regarded as a cheat, as the data at
http://web.mit.edu/simile/metadata/metadatasubset is human readable
rather than machine readable. At least with b) we have unambiguously
indicated there is no information available at that URL.

> RE: spreading data over multiple namespaces
>
> I'm not sure I follow here, either. if you are looking up the concept,
> you might just want the RDFSchema for that concept, maybe with all the
> RDF references that that concept builds upon or references. Doesn't
> have to be the entire infoset.

Well, I re-read my argument, and you are right, it doesn't really stand
up, because as you note it might be sufficient to retrieve a schema
subset.
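The distinction option b) relies on can be sketched in a few lines: the
scheme of the identifier alone tells software whether a GET is even
worth attempting. The URN mapping below is made up for illustration and
does not reproduce the exact urn string used above:

```python
# Sketch of option (b): rewrite hash URIs into URNs so that software
# can tell, purely from the scheme, that there is nothing to retrieve.
# The urn namespace and mapping are illustrative, not a real scheme.
from urllib.parse import urlparse

def is_dereferenceable(uri: str) -> bool:
    """True if software could at least attempt an HTTP GET on this URI."""
    return urlparse(uri).scheme in ("http", "https")

def to_urn(hash_uri: str,
           urn_prefix: str = "urn:x-mit-hp-w3c-simile") -> str:
    """Rewrite http://host/path#frag into urn_prefix/path:frag."""
    parts = urlparse(hash_uri)
    path = parts.path.lstrip("/")
    return f"{urn_prefix}/{path}:{parts.fragment}"
```

So a hash URI is reported as dereferenceable, while its rewritten URN
form is not - which is exactly the signal option b) is meant to send.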
From my previous use of RDF, I've built up a prejudice that namespaces
are a source of complexity, so I was trying to argue that we should
follow Occam's Razor here and reduce the number of namespaces where
possible. However, as you note, really this is about the role people
assign to separators like hashes, and when you look at it from that
point of view it becomes very arbitrary.

> whether "/" or "#" or "?" is better than the other, well, it's highly
> debatable and this is probably not the right place either.

agreed :)

> Don't get me wrong, I'm a strong advocate of semi-structured repository
> and you know that, but still I think that RDF is a perfect candidate
> for relational technology.... where general XML is definitely not.

I'm not quite sure what you mean here by "RDF is a perfect candidate for
relational technology" - can you give more details?

> Can you explain the rationale behind this? RDF is XML, XML is a tree,
> so you need a tree-oriented database? is that the syllogism in place?

I would strongly avoid the use of the word syllogism when discussing RDF
at the moment, after the Shirky article :) But yes, the argument I was
trying to advance was:

- Libraries deal with heterogeneous, semi-structured data.
- They have a history of using hierarchical databases.
- Due to fashion, although hierarchical databases were popular in the
60's and 70's, they were largely superseded by relational databases.
- Recently there has been renewed interest in hierarchical databases, as
they can be used to store XML.
- Therefore some projects in the library community are using XML
databases, as they are modern hierarchical databases, but also because
the library community is increasingly working with XML.

For some relevant links, see:

OpenIsis: Open source database for libraries.
http://openisis.org/Doc/

UCAI GORT project.
http://gort.ucsd.edu/ucai/

Harvard TED: Templated Database.
http://hul.harvard.edu/ois/systems/ted/index.html

Of course, the relationship between XML and RDF is another story.

> What do you mean with "persistent RDF approach"?

We use an RDF model to store all the data structures of the application,
but instead of holding that model in main memory we persist it in a
relational database.

> > For example Jena uses JDBC to persist RDF models using databases like
> > MySQL or Postgres, then Joseki makes those databases queryable via
> > the web.
>
> That's what I was thinking. I think it makes perfect sense to use
> relational technology for RDF, given its nature.

You might want to take a look at how Jena stores RDF models in
databases - see http://jena.sourceforge.net/DB/ - as the resulting
databases look quite different from a conventional relational schema,
even though they are stored in one.

Dr Mark H. Butler
Research Scientist
HP Labs Bristol
mark-h_butler@hp.com
Internet: http://www-uk.hpl.hp.com/people/marbut/
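P.S. The basic idea of persisting an RDF model in a relational database
can be sketched in a few lines. This is a toy illustration using an
in-memory SQLite table - it is not Jena's actual schema (see the Jena DB
link above for that), just a demonstration that a statement table of
(subject, predicate, object) rows is enough to store and query a model:

```python
# Toy triple store: persist RDF statements as rows in one relational
# table. Not Jena's schema - an illustration of the general approach.
import sqlite3

class TripleStore:
    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS triples ("
            "  subject TEXT, predicate TEXT, object TEXT,"
            "  PRIMARY KEY (subject, predicate, object))")

    def add(self, s, p, o):
        # INSERT OR IGNORE gives RDF's set semantics: no duplicate triples.
        self.db.execute(
            "INSERT OR IGNORE INTO triples VALUES (?, ?, ?)", (s, p, o))

    def match(self, s=None, p=None, o=None):
        """Query by pattern; None acts as a wildcard, as in an RDF match."""
        clauses, args = [], []
        for col, val in (("subject", s), ("predicate", p), ("object", o)):
            if val is not None:
                clauses.append(f"{col} = ?")
                args.append(val)
        where = " WHERE " + " AND ".join(clauses) if clauses else ""
        return self.db.execute(
            "SELECT subject, predicate, object FROM triples" + where,
            args).fetchall()
```

The point made above still holds: the table is really an edge list of a
graph, so even though it lives in a relational database it looks quite
unlike a conventionally normalised relational design.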
Received on Monday, 8 December 2003 10:29:25 UTC