- From: Seaborne, Andy <Andy_Seaborne@hplb.hpl.hp.com>
- Date: Fri, 13 Jun 2003 15:52:41 +0100
- To: "'www-rdf-dspace@w3.org'" <www-rdf-dspace@w3.org>
A passing comment elsewhere led to a request for me to write up my experiments with a slightly higher level abstraction than the RDF triple.

To be clear: this is out of scope for the current SOW of the History Store; it is just a discussion note.

Comments etc most welcome. It's ongoing work.

    Andy

---------------------------------------------------------

RDF Data Objects
----------------
An ongoing experiment

Andy Seaborne
June 2003

Context
=======
In a networked environment, granularity of operation is important for practical systems. Too small a grain size and network overhead is too high; too large a grain size and the impact on server performance is noticeable.

Background
==========
Previous work "RDF Objects" [1] has looked at describing a local part of the RDF graph so that a selection of RDF statements is the primary abstraction, not a single triple. This is a client-side definition of the RDF object.

The TAP system [2] has a single access function, GetData, that can return any RDF - in particular, the exact form (graph shape, namespaces etc) is not prescribed by the client access operation but is determined by the server.

In EJB systems, an efficient design avoids very small entity beans because the database overhead reduces performance, analogous to the network design issues here.

RDF Data Objects
================
The idea is that the notion of the appropriate RDF to return on a request isn't just a matter known only to the client. An example would be getting a vCard [3]: a query might find the resource for one, but the abstraction of the vCard isn't just a single statement or one level of the RDF graph.

An example RDF/vCard (N3):

    <andy>
        vcard:FN "Andy Seaborne" ;    # Formatted name
        vcard:N                       # Structured name - a bNode with further properties
            [ vcard:Family "Seaborne" ;
              vcard:Given  "Andy" ] ;
        .

So the definition of a complete vCard is part of the design of the vCard schema; it is not something the client should have to know.

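To make the "one level of the graph" point concrete, here is a minimal sketch in Python. It is not Joseki code: the triples are plain tuples, `_:n1` is an assumed label for the blank node in the vCard example, and `one_level` is a hypothetical helper.

```python
# Triples as (subject, predicate, object) tuples.
# "_:n1" is an assumed label for the blank node holding the structured name.
TRIPLES = [
    ("<andy>", "vcard:FN", "Andy Seaborne"),
    ("<andy>", "vcard:N", "_:n1"),
    ("_:n1", "vcard:Family", "Seaborne"),
    ("_:n1", "vcard:Given", "Andy"),
]


def one_level(triples, subject):
    """Return only the statements whose subject is the given node."""
    return [t for t in triples if t[0] == subject]


direct = one_level(TRIPLES, "<andy>")
missing = [t for t in TRIPLES if t not in direct]

print(len(direct), "direct statements;", len(missing), "statements missed")
for t in missing:
    print("  missed:", t)
```

Taking only the statements about `<andy>` returns the link to the structured name but not the family and given names behind it, so a client would need to know the schema to ask for the rest.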
RDF allows optional triples, so the rigid triple patterns of most query languages (e.g. RDQL), used for locating a graph node, aren't good at extracting the optional RDF and structured values associated with the node found.

Observation: In this case, the extraction algorithm could also be a transitive closure over bNodes, but some RDF schemas (e.g. FOAF) have top-level abstractions which are usually bNodes. Indeed, typically, all FOAF resources are bNodes, so the closure is everything.

"Fetch"
=======
Joseki has the notion of a repository of RDF. It can only answer questions about resources based on metadata in the repository - there may be other places to go to find out things.

A building block operation for the Joseki approach is to have a "fetch" operation which is a "get me everything you know about <X>", and it is a server-side decision as to the RDF statements to return. This is a sort of simple query and fits with HTTP GET:

    GET http://host/repository?op=fetch&uri=%encodedURI

The choice of algorithm to apply depends on the URI specified. Doing a fetch on a vCard gets the properties and the compound structure of the vCard: specifically, the vcard:N structure as well as the plain vcard:FN statement. The client is expected to navigate the subgraph returned and work out what it wants to do with the information.

Choosing the RDF data object in the server can be based on, say, RDF type (and hence, with OWL, characteristic properties). Further arguments could also be useful if a thing is of several types, or if the RDF gets too big for practical use.

Reference and Containment
=========================
What is really going on is that there are data objects and two kinds of link: reference links, where one object links to another object, and containment links, where one object contains subsidiary portions of the graph. If properties were marked as containment or reference links, then a single algorithm could be used that traversed containment links (cycles need to be handled).

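A minimal sketch of such a single traversal, in Python, might look like the following. It is an illustration under stated assumptions, not the Joseki implementation: triples are plain tuples, the `CONTAINMENT` set stands in for whatever marking of properties a schema would provide, and `ex:partOf` is an artificial back-link included only to exercise the cycle handling.

```python
from collections import deque

# Hypothetical marking of containment properties; every other property is
# treated as a reference link. "ex:partOf" is artificial (see TRIPLES below).
CONTAINMENT = {"vcard:N", "ex:partOf"}

TRIPLES = [
    ("<andy>", "vcard:FN", "Andy Seaborne"),
    ("<andy>", "vcard:N", "_:n1"),
    ("_:n1", "vcard:Family", "Seaborne"),
    ("_:n1", "vcard:Given", "Andy"),
    ("_:n1", "ex:partOf", "<andy>"),    # containment cycle back to the root
    ("<andy>", "foaf:knows", "<bob>"),  # reference link: kept, not followed
    ("<bob>", "vcard:FN", "Bob"),
]


def fetch_object(triples, root, containment=CONTAINMENT):
    """Collect the statements of the data object rooted at `root`.

    Statements about a node are always included; the traversal continues
    only across containment links, and each node is visited at most once.
    """
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((s, p, o))

    seen = {root}
    queue = deque([root])
    result = []
    while queue:
        node = queue.popleft()
        for s, p, o in by_subject.get(node, []):
            result.append((s, p, o))
            if p in containment and o not in seen:
                seen.add(o)
                queue.append(o)
    return result


result = fetch_object(TRIPLES, "<andy>")
for t in result:
    print(t)
```

The statement linking `<andy>` to `<bob>` is returned, but the statements about `<bob>` are not, because `foaf:knows` is not in the containment set. The `seen` set is what stops the back-link from looping.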
Making properties either a subPropertyOf :reference or :contains (or a subClass of :ReferenceLink or :ContainmentLink) works, but I also want to handle schemas where this is not designed in. Hence the association of different algorithms in the server.

Experimental Status
===================
My prototyping version of Joseki does bNode closure of RDQL queries and also provides a fetch operation. Tying into the Joseki configuration system has not yet been done. As of June 2003, these features have been tested with a demo app currently under development.

[1] http://www.hpl.hp.com/techreports/2002/HPL-2002-315.html
[2] http://tap.stanford.edu/
[3] http://www.w3.org/TR/vcard-rdf
Received on Friday, 13 June 2003 10:53:09 UTC