- From: Paul A Houle <devonianfarm@gmail.com>
- Date: Wed, 16 Sep 2009 15:20:42 -0400
- To: public-lod@w3.org
- Message-ID: <ad20490909161220s6e5d720bnbc891e1bec09171d@mail.gmail.com>
I think there are a few scenarios here. In my mind, dbpedia.org is a site for tripleheads. I use it all the time when I'm trying to understand how my systems interact with data from dbpedia -- for that purpose, it's useful to see a reasonably formatted list of triples associated with an item, so a view that's isomorphic to the triples works for me there. Yes, better interfaces for browsing dbpedia/wikipedia ought to be built -- navigation along axes of type, time, and space would be obviously interesting -- but making a usable interface for this involves challenges that are outside the scope of dbpedia.org. The point of linked data is that anybody who wants to can build a better browsing interface for dbpedia.

Another scenario is a site that's ~primarily~ a site for humans and secondarily a site for tripleheads and machines, for instance http://carpictures.cc/ . That particular site is built on an object-relational system which has some (internal) RDF features. The site was created by merging dbpedia, freebase and other information sources, so it exports linked data that links dbpedia concepts to images with very high precision. The primary vocabulary is SIOC, and the RDF content for a page is ~nearly~ isomorphic to the content of the main part of the page (excluding the sidebar).

However, there is content that's currently exclusive to the human interface. The UI is highly visual: for every automobile make and model, there are heuristics that try to pick a "better than average" image, one that is both striking and representative of the brand, and that selection is materialized in the database. There's also information designed to give humans an "information scent" to help them navigate, a concept which isn't so well-defined for webcrawlers. Then there's the sidebar, which has several purposes, one of them being a navigational system for humans that just isn't so relevant for machines.

There really are two scenarios I see for linked data users relative to this system at the moment: (i) a webcrawler crawls the whole site, or (ii) I provide a service that, given a linked data URL, returns what ontology2 knows about that URL. For instance, this could be used by a system that's looking for multimedia connected with anything in dbpedia or freebase. Perhaps I should be offering an NT dump of the whole site, but I've got no interest in offering a SPARQL endpoint.

As for friendly interfaces, I'd say take an analytical look at a page like http://carpictures.cc/cars/photo/car_make/21/Chevrolet -- what's going on here? This is being done on a SQL-derivative system that has a query builder, but you could do the same thing with SPARQL. We'd imagine there are predicates like hasCarModel, hasPhotograph and hasPreferredThumb. Starting with a URL that represents a make of car (a nameplate, like Chevrolet), we'd traverse the hasCarModel relationship to enumerate the models, and then do a COUNT(*) over hasPhotograph relationships for the cars to get a count of pictures for each model.

Generically, the construction of a page like this involves doing "joins" and traversing the graph to show not just the triples that are linked to a named entity, but information that can be found by traversing the graph. People shouldn't be shy about introducing their own predicates; the very nature of inference in RDF points to "creating a new predicate" as the basic solution to most problems. In this case, hasPreferredThumb is a perfectly good way to materialize the result of a complex heuristic.
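To make that concrete, here's roughly the query I have in mind, written as SPARQL rather than the SQL-derivative the site actually runs on. The namespace and predicate URIs are invented for the sketch, the make URI is just the page URL reused as an identifier, and the aggregates are the sort of thing SPARQL 1.1 is adding (several stores already support them as extensions):

    PREFIX ex: <http://example.org/vocab#>   # hypothetical namespace, for illustration only

    # For one make (the nameplate "Chevrolet"), enumerate its models,
    # count the photographs reachable from each model, and pull out the
    # materialized "preferred" thumbnail where one exists.
    SELECT ?model (COUNT(?photo) AS ?photoCount) (SAMPLE(?thumb) AS ?thumbnail)
    WHERE {
      <http://carpictures.cc/cars/photo/car_make/21/Chevrolet> ex:hasCarModel ?model .
      # The real data may route this through an intermediate car resource;
      # it's collapsed here to keep the sketch short.
      ?model ex:hasPhotograph ?photo .
      OPTIONAL { ?model ex:hasPreferredThumb ?thumb }
    }
    GROUP BY ?model
    ORDER BY DESC(?photoCount)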
(One reason I'm sour about public SPARQL endpoints is that I don't want to damage my brand by encouraging amnesic mashups of my content. A quality site really needs a copy of its own data so it can make additions, corrections, etc. One major shortcoming of Web 2.0 has been self-serving API TOS that forbid systems from keeping a memory -- for instance, eBay doesn't let you build a price tracker or a system that keeps dossiers on sellers, and Del.icio.us makes it easy to put data in, but you can't get anything interesting out. Web 3.0 has to make a clean break from this.)

Database-backed sites traditionally build pages like this with a mixture of declarative SQL and procedural code that creates a view... It would be interesting to see RDF systems where the graph traversal is specified and transformed into a website declaratively.
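To gesture at what I mean by that last point (with the same invented predicates as above, and no claim that this is how carpictures.cc actually works): a CONSTRUCT query can specify the traversal declaratively, so the procedural layer is reduced to rendering a small, page-shaped graph.

    PREFIX ex: <http://example.org/vocab#>   # same hypothetical vocabulary as above

    # Assemble exactly the graph a make page needs; a template layer
    # would only have to walk this result, never the whole store.
    CONSTRUCT {
      ?make  ex:hasCarModel       ?model .
      ?model ex:hasPreferredThumb ?thumb ;
             ex:photoCount        ?photoCount .
    }
    WHERE {
      ?make ex:hasCarModel ?model .
      OPTIONAL { ?model ex:hasPreferredThumb ?thumb }
      {
        SELECT ?model (COUNT(?photo) AS ?photoCount)
        WHERE { ?model ex:hasPhotograph ?photo }
        GROUP BY ?model
      }
    }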