RE: Referencing FOAF from Makx Dekkers on 2012-12-18 (public-gld-wg@w3.org from December 2012)

From: Makx Dekkers <mail@makxdekkers.com>
Date: Tue, 18 Dec 2012 21:30:28 +0100
To: "'Gillman, Daniel - BLS'" <Gillman.Daniel@bls.gov>, "'Public GLD WG'" <public-gld-wg@w3.org>
Cc: "'Sandro Hawke'" <sandro@w3.org>, "'Phil Archer'" <phil@philarcher.org>
Message-ID: <005301cddd5e$85fbd2f0$91f378d0$@makxdekkers.com>
Thanks, Dan.

 

You make very valid points, but I think those relate mostly to what I
would call "value vocabularies", things you use in the Object position
in a triple, for example sets of concepts like the Library of Congress
Subject Headings, EuroVoc terms, GeoNames, etc.

 

My message was more narrowly focused on predicate vocabularies, like
DCMI terms, FOAF, DCAT, ADMS and was triggered by the discussion in the
call on 6 December on whether a W3C specification could rely on an
external predicate vocabulary like FOAF that is not managed according to
W3C rules and procedures.

 

But for value vocabularies, I agree that there is no sensible way to
define "good" in the sense of "fit for purpose". However, there could
still be scope for defining "good" in terms of "well-defined and
well-maintained" but maybe that's not a subject that is in scope for
this group.

 

Makx.

 

 

 

From: Gillman, Daniel - BLS [mailto:Gillman.Daniel@bls.gov] 
Sent: Tuesday, December 18, 2012 8:38 PM
To: Makx Dekkers; Public GLD WG
Cc: Sandro Hawke; Phil Archer
Subject: RE: Referencing FOAF

 

Makx et al,

 

This issue is one that is near and dear to my heart, so I want to weigh
in.  I think the only option that makes sense is #1 for the following
reasons:

 

1)      Trying to establish some authority or process for declaring
"good" vocabularies is a non-starter.  One person's "good" won't be
everyone's, and there are always going to be quirky local reasons for
choosing one vocabulary over another.  An example from my business
(a national statistical office) is industrial classification systems.
These are vocabularies in the GLD sense, and there are many of them,
including ISIC (International Standard Industrial Classification).  A
determination that ISIC should be used for all applications where an
industrial classification is required would ignore the special needs of
regional and national systems geared towards the specifics of those
economies.  Same with declaring any one of the regional systems "good".
So declaring "good" is problematic.

2)      When a vocabulary is referenced or used, it is mostly with the
concepts as they were defined at the time of use.  But any popular
vocabulary will change over time.  Concepts are redefined, split, or
combined.  New concepts are incorporated.  None of those changes
should invalidate the use that was made before the changes.

3)      It is rare that no existing vocabulary will suffice for the
needs of a given problem, so creating one's own vocabulary is rarely
needed.

 

If we were to recommend the creation of a catalog of vocabularies, I
believe this would solve many of the problems described below.  First,
any time-stamped vocabulary could be registered, rendering the problem
of updates moot: a catalog would substantially reduce or eliminate the
risk that vocabularies are overwritten and data invalidated.  Second, a
catalog would support discovery, making it much easier to know which
vocabulary to use and whether any relevant candidates exist.  Third,
conceptual overlap among similar concepts across vocabularies could be
recorded, enabling developers to map across vocabularies and provide
deeper semantic links across data.
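
A minimal Python sketch of such a catalog might look like the
following. It is an illustration under stated assumptions, not the
model from the documents referenced below: the class names, the
`example-ic` vocabulary and its concepts are all hypothetical. It shows
the three points in turn: time-stamped snapshots that are never
overwritten, discovery by concept, and recorded overlaps between
concepts in different vocabularies.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class VocabularyVersion:
    """One time-stamped snapshot of a vocabulary (hypothetical model)."""
    name: str
    issued: date
    concepts: frozenset


@dataclass
class Catalog:
    """Append-only register: existing snapshots are never overwritten."""
    versions: dict = field(default_factory=dict)  # (name, issued) -> snapshot
    overlaps: set = field(default_factory=set)    # recorded concept overlaps

    def register(self, version):
        key = (version.name, version.issued)
        if key in self.versions:
            raise ValueError(f"{key} is already registered")
        self.versions[key] = version

    def as_of(self, name, when):
        """Return the latest snapshot of `name` issued on or before `when`."""
        candidates = [v for (n, d), v in self.versions.items()
                      if n == name and d <= when]
        if not candidates:
            raise LookupError(f"no version of {name} on or before {when}")
        return max(candidates, key=lambda v: v.issued)

    def discover(self, concept):
        """Names of vocabularies that contain `concept` in any version."""
        return sorted({v.name for v in self.versions.values()
                       if concept in v.concepts})

    def record_overlap(self, a, b):
        """Record that two (vocabulary, concept) pairs overlap."""
        self.overlaps.add(frozenset((a, b)))


# Invented data: a classification whose "farming" concept is later split.
catalog = Catalog()
catalog.register(VocabularyVersion(
    "example-ic", date(2008, 1, 1), frozenset({"mining", "farming"})))
catalog.register(VocabularyVersion(
    "example-ic", date(2012, 1, 1),
    frozenset({"mining", "crop farming", "livestock farming"})))

# Data coded in 2010 still resolves against the 2008 definitions.
old = catalog.as_of("example-ic", date(2010, 6, 30))
found = catalog.discover("mining")
```

Because `register` refuses to replace an existing snapshot, data coded
against the 2008 version stays interpretable after the 2012 split.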

 

I submitted three documents early in the GLD process that address these
issues.  I updated the available versions, and they are located at:

1) http://www.w3.org/2011/gld/wiki/File:Open_Government_Vocabualries_-_Registration_Procedure_v03.doc

2) http://www.w3.org/2011/gld/wiki/File:Open_Government_Vocabualries_-_Content_Model_v02.doc

3) http://www.w3.org/2011/gld/wiki/File:Open_Government_Vocabualries_-_Registration_Model_v04.doc

 

The first and third documents are especially relevant for this
discussion.

 

Unfortunately, I will not be able to join the call on Thursday, but I
can join during January (3rd or 10th).

 

Yours,

Dan

 

 

 

From: Makx Dekkers [mailto:mail@makxdekkers.com] 
Sent: Tuesday, December 18, 2012 11:03 AM
To: Public GLD WG
Cc: Sandro Hawke; Phil Archer
Subject: Referencing FOAF

 

Dear all,

 

There was a short discussion in the GLD call two weeks ago
(http://www.w3.org/2011/gld/meeting/2012-12-06) about referring to other
people's vocabularies. Here are some of my thoughts on this issue.

 

In earlier work that I was involved in, this issue has come up a couple
of times. From those discussions, I remember three potential strategies:

 

1.       Re-use existing vocabularies irrespective of how they were
developed and who is responsible for them. This has the advantage of
maximum re-use, but of course the risk of using stuff that will
disappear or be modified in uncontrolled ways, making your instance data
invalid or undefined. 

 

2.       Re-use existing vocabularies that are somehow deemed to be
"good", e.g. well-defined, well-maintained or owned by a trusted entity.
For this, you need a set of criteria that determine what is "good" and
what is not. This could be purely a set of local criteria, but you might
also consider a set of globally accepted criteria. One of the ideas that
I've heard was that there could be a Community of Vocabulary Owners that
would agree on good practice in the form of a common set of maintenance
and persistence policies, which could include the kind of commitments
like the one between DCMI and FOAF
(http://dublincore.org/documents/dcmi-foaf/). The advantage is that you
would have some level of confidence that the vocabularies involved in
this community would not disappear or break; the disadvantage is that
such an approach takes time and effort in consensus building.

 

3.       Don't directly re-use anything, but create parallel classes and
properties in your own namespace with appropriate sameAs or subClass and
subProperty declarations referring to other vocabularies as the
alternative to re-use. I've heard the argument "Our project/service is
going to be around longer than <fill in an organisation that maintains a
vocabulary>" to argue for this approach. The advantage is that you're
not dependent on someone else's policies or credibility, but I don't
think it will help the wider objectives of Linked Data. You're moving
the pain to the consumers who will need to resolve all these sameAs etc.
relationships for incoming data to figure out that abc:title is really
the same as xyz:title because they are both sameAs dc:title.
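
The extra work this puts on consumers can be sketched in a few lines of
Python. This is a rough illustration, not a description of any existing
tool: the `abc:` and `xyz:` prefixes come from the example above, while
the `ex:` subjects, the literal values and the function names are
hypothetical. It assumes every declaration is a symmetric, transitive
equivalence (as with sameAs); subClass and subProperty links run in one
direction only and would need separate handling.

```python
# Hypothetical equivalence declarations a consumer has harvested, as
# (local term, term it is declared equivalent to).
DECLARED_SAME = [
    ("abc:title", "dc:title"),
    ("xyz:title", "dc:title"),
]


def build_canonical_map(pairs):
    """Group terms linked by equivalence declarations (union-find)."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            # The right-hand (referenced) term becomes the representative.
            parent[root_left] = root_right

    return {term: find(term) for term in list(parent)}


def normalize(triples, canonical):
    """Rewrite each predicate to the representative of its group."""
    return [(s, canonical.get(p, p), o) for s, p, o in triples]


incoming = [
    ("ex:doc1", "abc:title", "Annual report"),
    ("ex:doc2", "xyz:title", "Budget 2012"),
]
canonical = build_canonical_map(DECLARED_SAME)
normalized = normalize(incoming, canonical)
```

Every consumer has to harvest the declarations and run a step like this
before two datasets can be queried together, which is the cost that
direct re-use of dc:title would avoid.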

 

I had the impression that in the meeting it was suggested that W3C
specifications may not want to refer to FOAF because it is outside of
W3C, which feels like going for option (3) - not re-doing FOAF of
course, but in the sense of bringing existing FOAF under the W3C
umbrella and associated policies. I hope that is not the general answer.

 

Makx.

 

 


Makx Dekkers

makx@makxdekkers.com

+34 639 26 11 46
Received on Tuesday, 18 December 2012 20:31:04 UTC