RDF for sitemapping: Biz/ed testbed data from Dan Brickley on 2000-09-27 (www-rdf-interest@w3.org from September 2000)

From: Dan Brickley <Daniel.Brickley@bristol.ac.uk>
Date: Wed, 27 Sep 2000 16:46:00 +0100 (BST)
To: www-rdf-interest@w3.org
Message-ID: <Pine.GHP.4.21.0009271622500.2084-100000@mail.ilrt.bris.ac.uk>
thought this (even in rough and ready format) might be a handy dataset
for anyone developing sitemap-style metadata applications using RDF,
Dublin Core, HTML/XLink links etc...


	Biz/ed RDF Metadata Testbed
	http://www.ilrt.bris.ac.uk/discovery/2000/08/bized-meta/

	This document provides an overview of
         a small RDF testbed we're working on
         at ILRT. It is intended primarily for
         developers and members of the
         metadata community. It describes work
         in progress, and has been made
         available before the testbed itself is
         complete. 


There are some preliminary data dumps available, using an experimental
triple-dump syntax (and possibly some errors). Basic idea was to take
"real world" metadata from Biz/ed, a Web site for Business and Economics
educators and students, and make it available as a developer testbed and
use case scenario.

Biz/ed was the first "proper" web site I worked on and the reason I got
involved in all this RDF silliness in the first place. We wanted a way
of managing metadata for the site such that a page could
"inherit" properties etc. from more general pages in the site
hierarchy. PICS (and the rumoured PICS-NG, as of summer 97) seemed to
provide facilities (rdf:aboutEachPrefix being RDF 1.0's version of the
PICS "Generic label" concept) that suggested ways in which we might
better organise our embedded metadata. The site (like most) is a mix of
invalid and valid HTML. Most pages have embedded meta tags and
typed links hooking them into a next/previous/up/down structure. It's not
perfect but I think a good example of the sort of data that's out
there. Embedded metadata uses the old DC.* meta tags, so that gets
extracted with a Perl script.
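
For anyone wanting to try the extraction step themselves, here is a rough
sketch of the idea in Python. It is not the Perl script mentioned above
(which isn't shown here); the class and function names, the example.org
URI and the sample page are all made up for illustration. It only
assumes that the pages carry <meta name="DC.something" content="...">
tags.

```python
from html.parser import HTMLParser


class DCMetaExtractor(HTMLParser):
    """Collect (name, content) pairs from <meta name="DC.*"> tags."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us,
        # but leaves attribute values untouched.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        if name.lower().startswith("dc.") and content is not None:
            self.pairs.append((name, content))


def dc_triples(page_uri, html):
    """Return (subject, property, value) triples for one page's DC.* tags."""
    parser = DCMetaExtractor()
    parser.feed(html)
    parser.close()
    return [(page_uri, name, content) for name, content in parser.pairs]


if __name__ == "__main__":
    # Invented sample page, not real Biz/ed markup.
    sample = ('<html><head><META NAME="DC.Title" CONTENT="Sample page">'
              '<meta name="DC.Publisher" content="Biz/ed"></head>')
    for triple in dc_triples("http://example.org/sample.htm", sample):
        print(triple)
```

html.parser is fairly forgiving of invalid markup, which matters for a
site that mixes valid and invalid HTML.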

While all the RDF logic/inference stuff is great fun (and IMHO has huge
potential), there's a danger in running before we can walk. I think it's
about time we nailed down some tools for the simpler scenarios like this. In 
http://www.ilrt.bris.ac.uk/discovery/2000/08/bized-meta/ I sketch a few
goals and scenarios that we'd like to implement. See also
http://www.ilrt.bris.ac.uk/discovery/2000/08/bized-meta/pic-bmw1.gif
for a (nearly readable!) RDFViz graph of one part of the site structure.

Excerpt:

	[[[
	Worked Example

         The potential uses of an RDF database
         that describes an entire site can be
         illustrated by a simple example, based
         on the Web of inter-relationships that
         connect the pages on our site. This
         example walks up the logical structure
         of the site, starting with a very detailed
         page within a specialised subsection of
         Biz/ed. 
	[...]

	The sample shown here also illustrates
         a classic problem of web metadata
         management: redundancy. Each page
         has a separate 'publisher' property with
         identical content. It might be more
         practical to exploit the known
         relationships amongst Biz/ed pages
         and associate 'publisher' information
         only with the higher levels of the Web
         site structure. RDF-based applications
         could explore the data and find the
         most accurate 'publisher' information
         for any given page, by traversing the
         'parent' and 'child' connections between
         documents. 
	[...]

	]]]
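
The traversal described in the excerpt could look roughly like the
Python sketch below. It assumes the data is available as simple
(subject, property, value) triples and that each page has at most one
'parent' link; the function name, the property names and the
example.org URIs are hypothetical, not taken from the testbed dumps.

```python
def nearest_property(triples, page, prop, parent_prop="parent"):
    """Walk up 'parent' links from page until some page carries prop.

    Returns (page_where_found, value), or None if no ancestor has it.
    """
    values = {}
    parents = {}
    for subject, predicate, obj in triples:
        if predicate == prop:
            values.setdefault(subject, obj)    # keep the first value seen
        elif predicate == parent_prop:
            parents.setdefault(subject, obj)   # assume a single parent

    seen = set()  # guards against cycles in the link structure
    node = page
    while node is not None and node not in seen:
        if node in values:
            return node, values[node]
        seen.add(node)
        node = parents.get(node)
    return None


if __name__ == "__main__":
    # Invented three-level hierarchy with 'publisher' stated only at the top.
    data = [
        ("http://example.org/", "publisher", "Biz/ed"),
        ("http://example.org/section/", "parent", "http://example.org/"),
        ("http://example.org/section/page.htm", "parent",
         "http://example.org/section/"),
    ]
    print(nearest_property(data, "http://example.org/section/page.htm",
                           "publisher"))
```

A value stated on the page itself wins over anything further up, so
the nearest statement is treated as the most accurate one.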

This last point raises the question of how we reflect (X)HTML typed
links into RDF. See http://www.w3.org/2000/02/rdf-xlink/ for more on
this. Perhaps it is just a matter of picking a set of URIs for the
link types in http://www.w3.org/TR/html401/struct/links.html ? I for one
would find it useful to have an agreed way of URI naming these
relationships, though I'm not yet convinced they're adequate/ideal for
sitemap metadata...
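
To make the "pick a set of URIs" idea concrete, here is a rough Python
sketch that turns rel="..." links into triples. The namespace URI is a
placeholder invented for this example (no such agreed namespace
exists), as are the class name and sample markup; only the list of
link types comes from the HTML 4.01 spec.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Placeholder namespace, purely hypothetical.
LINKTYPE_NS = "http://example.org/html401-linktypes#"

# Link types listed in HTML 4.01 (they are case-insensitive).
HTML401_LINK_TYPES = {
    "alternate", "stylesheet", "start", "next", "prev", "contents",
    "index", "glossary", "copyright", "chapter", "section",
    "subsection", "appendix", "help", "bookmark",
}


class TypedLinkExtractor(HTMLParser):
    """Collect (page, link-type URI, target) triples from <link> and <a>."""

    def __init__(self, base_uri):
        super().__init__()
        self.base_uri = base_uri
        self.triples = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attributes = dict(attrs)
        href, rel = attributes.get("href"), attributes.get("rel")
        if not href or not rel:
            return
        target = urljoin(self.base_uri, href)
        # rel is a space-separated list; unknown types are skipped.
        for link_type in rel.lower().split():
            if link_type in HTML401_LINK_TYPES:
                self.triples.append(
                    (self.base_uri, LINKTYPE_NS + link_type, target))


if __name__ == "__main__":
    extractor = TypedLinkExtractor("http://example.org/a/page1.htm")
    extractor.feed('<link rel="next" href="page2.htm">'
                   '<link rel="up" href="index.htm">')
    for triple in extractor.triples:
        print(triple)
```

Note that rel="up" is dropped here because "up" is not among the
HTML 4.01 link types, which is one small illustration of why that
list may not be adequate for sitemap data.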

Hope someone finds this useful,

Dan
Received on Wednesday, 27 September 2000 11:46:03 UTC