Introduction: Tom Baker from Thomas Baker on 2004-04-01 (public-swbp-wg@w3.org from April 2004)

From: Thomas Baker <thomas.baker@izb.fraunhofer.de>
Date: Thu, 1 Apr 2004 19:53:07 +0200
To: SW Best Practices <public-swbp-wg@w3.org>
Message-ID: <20040401175307.GA2344@Octavius>

Dear all,

I joined the group yesterday and will attend the teleconference
twenty minutes from now.  I posted this Introduction three
or four hours ago, or so I thought until I saw I was using
the wrong email address, so here it is again...

I have been involved since 1996 with the Dublin Core Metadata
Initiative and currently serve as Chair of the DCMI Usage
Board, which means I am also a member of the DCMI Directorate.
I also helped start the DELOS network in Europe and have been
involved in various European projects over the years.

My background is in the social sciences.  I grew interested in
the Internet in the 1980s while at Stanford (and on The Well).
I tend to approach things like metadata from a linguistic
angle and can take some credit for suggesting the notion of
Dublin Core as a pidgin.  Deep down, I suspect that pidgins
or cores, fuzzy and lossy as they are, represent the best we
can reasonably expect in terms of "semantic interoperability"
on a really broad scale.

To me, DCMI is about making the pidgin idea work -- with
a core vocabulary; a process for growing the vocabulary;
policies for declaring and versioning the vocabulary; an
etiquette for playing well with complementary vocabularies;
conventions for documenting the vocabulary; a simple grammar;
and methods for dumbing down to a Core.
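
As a rough illustration of "dumbing down", the sketch below maps
refined properties onto the core element they refine and drops
anything that has no core equivalent.  It assumes refinements are
declared as sub-properties of core elements; the "ex:" terms, the
REFINES table, and the dumb_down() helper are hypothetical names
made up for this example, not part of any DCMI specification.

```python
# Hypothetical refinements: each key is treated as a sub-property of its value.
# The "ex:" names are illustrative only, not drawn from a published schema.
REFINES = {
    "ex:illustrator": "dc:contributor",
    "ex:translator": "dc:contributor",
    "ex:dateIssued": "dc:date",
}

# A deliberately tiny "core" for the purposes of the example.
CORE = {"dc:title", "dc:contributor", "dc:date"}


def dumb_down(record):
    """Return a core-only copy of record, a list of (property, value) pairs.

    Refined properties are replaced by the core element they refine;
    properties with no core equivalent are dropped (the lossy part).
    """
    result = []
    for prop, value in record:
        core_prop = REFINES.get(prop, prop)
        if core_prop in CORE:
            result.append((core_prop, value))
    return result


record = [
    ("dc:title", "An Example Title"),
    ("ex:illustrator", "A. Illustrator"),
    ("ex:shelfMark", "X 123"),
]
print(dumb_down(record))
# [('dc:title', 'An Example Title'), ('dc:contributor', 'A. Illustrator')]
```

The illustrator survives only as a generic contributor and the
shelf mark is lost entirely, which is the fuzziness and lossiness
that a pidgin accepts in exchange for broad interoperability.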

In the spirit of "walk before we run" (danbri), then, I am
most interested in the plain-vanilla issues around managing
and using small vocabularies.  Some issues of possible interest
to BPD WG:

1) URI policy: DCMI has a formal "namespace policy" [1],
   and the CORES Project brokered a "first-step" agreement
   among maintainers of some key standards regarding such
   policies, raising more general issues [2].  How far could
   BPD WG go in formulating best practice in this area?

2) Versioning terms and term sets: DCMI has a de facto method
   for versioning terms, though it is not yet formally
   supported by DCMI policy.  (It is an event-based model which
   uses URIs to link changes in Term Versions to Decisions,
   which in turn are linked to supporting documentation.)
   Is the model good and might it be generalized?

3) Assertion etiquette and "good neighbor" policies: DCMI is
   working with Library of Congress on developing an RDF
   schema in which LC asserts a set of MARC Relator terms
   to be subPropertyOf dc:contributor.  DCMI wants to then
   endorse these assertions.  Might we consider such formalities
   in BPD WG?

4) Vocabulary documentation (see also Dan [3]): DCMI is looking
   to the SW community for guidance on what to publish at
   its namespace URIs (it currently publishes RDF schemas).
   In terms of work flow, DCMI generates the RDF schemas along
   with ready-reference Web pages from a common source using
   XSLT scripts, though surely more sophisticated editing
   and validation environments are available.

5) Declaring versus reusing (see also Libby [4]): "Should I
   use an existing term, get DCMI to declare one, or declare
   my own?  Where would I put it?  Should I make an Application
   Profile -- ...whatever that is?"

6) Syntax and interoperability.  In the DCMI context, Andy
   Powell and others have developed an Abstract Model for
   clarifying the extent to which different syntaxes (e.g.,
   XHTML vs XML Schema vs RDF) support distinctions between
   multiple entities or between URI-identified "resources"
   as opposed to "string values" [5].  Lots of issues there...

7) Scalability and complexity: From DCMI one can see how much
   effort it takes to maintain, grow, document, and explain
   a vocabulary of just 90 terms.  As DCMI moves beyond the
   start-up phase, we hope this will turn into a well-oiled
   routine.  However, it does raise the more general issue of
   how much effort "best practice" will ultimately cost, and
   how that effort will scale to vocabularies and ontologies
   much larger and more complex than Dublin Core.
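
The event-based versioning model in item 2 might be pictured
roughly as follows.  This is only a sketch under stated
assumptions: the class names, the "replaces" link, the history()
helper, and all example.org URIs are hypothetical and do not
reflect DCMI's actual data or policy.  It shows each Term Version
pointing at the Decision that produced it, and each Decision
pointing at its supporting documentation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Decision:
    """A recorded decision, identified by URI, with links to supporting documents."""
    uri: str
    documentation: Tuple[str, ...]


@dataclass(frozen=True)
class TermVersion:
    """One version of a term, linked to the decision that produced it."""
    term_uri: str
    version_uri: str
    label: str
    decision: Decision
    replaces: Optional[str] = None  # version_uri of the previous version, if any


def history(versions: List[TermVersion], term_uri: str) -> List[TermVersion]:
    """Return the versions of one term, newest first, following 'replaces' links."""
    mine = [v for v in versions if v.term_uri == term_uri]
    by_uri = {v.version_uri: v for v in mine}
    replaced = {v.replaces for v in mine if v.replaces is not None}
    # The newest version is the one that no other version replaces.
    heads = [v for v in mine if v.version_uri not in replaced]
    chain = []
    node = heads[0] if heads else None
    while node is not None:
        chain.append(node)
        node = by_uri.get(node.replaces)
    return chain


# Hypothetical data: every URI below is a placeholder under example.org.
d1 = Decision("http://example.org/decisions/1",
              ("http://example.org/docs/proposal-1",))
d2 = Decision("http://example.org/decisions/2",
              ("http://example.org/docs/proposal-2",
               "http://example.org/docs/minutes-2"))

versions = [
    TermVersion("http://example.org/terms/creator",
                "http://example.org/history/creator-001",
                "Author", d1),
    TermVersion("http://example.org/terms/creator",
                "http://example.org/history/creator-002",
                "Creator", d2,
                replaces="http://example.org/history/creator-001"),
]

for v in history(versions, "http://example.org/terms/creator"):
    print(v.version_uri, "<-", v.decision.uri, v.decision.documentation)
```

Keeping the term URI stable while giving each version and each
decision its own URI is what lets a change be traced back to its
rationale; whether that generalizes to larger vocabularies is
the open question raised in item 2.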

Tom

P.S. I would be most grateful if someone could send me an
archive of this mailing list in the native mbox format.

[1] http://dublincore.org/documents/dcmi-namespace/
[2] http://www.dlib.org/dlib/july03/baker/07baker.html
[3] http://lists.w3.org/Archives/Public/public-swbp-wg/2004JanMar/0016.html
[4] http://lists.w3.org/Archives/Public/public-swbp-wg/2004JanMar/0017.html
[5] http://www.ukoln.ac.uk/metadata/dcmi/abstract-model/


-- 
Dr. Thomas Baker                        Thomas.Baker@izb.fraunhofer.de
Institutszentrum Schloss Birlinghoven         mobile +49-160-9664-2129
Fraunhofer-Gesellschaft                          work +49-30-8109-9027
53754 Sankt Augustin, Germany                    fax +49-2241-144-2352
Personal email: thbaker79@alumni.amherst.edu
Received on Thursday, 1 April 2004 12:50:52 UTC