- From: Thomas Baker <tbaker@tbaker.de>
- Date: Sun, 13 Mar 2011 19:13:40 -0400
- To: public-lld@w3.org
Context: after Corey wrote...
I've been thinking a lot about this question of the pros
and cons of unconstrained, generalized properties, and
am increasingly convinced that hard-coding domains and
ranges into things is a significant barrier to reuse. I
very much like the superclass / generalized superproperty
approach used in the RDA vocabs and suggested by Jeff &
others on this list.
One of the things I like about this approach is that it
*could* allow multiple views of the same bibliographic
data to co-exist without any of the underlying assertions
contradicting each other.
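To make the domain concern concrete, here is a minimal sketch in plain Python, with triples as tuples and no RDF library. It applies the RDFS domain entailment rule (rdfs2) to the same description twice: once using a property with a declared domain and once using an unconstrained one. Every ex: name is hypothetical and chosen only for illustration; none of them come from the RDA vocabularies.

```python
# Triples are plain (subject, predicate, object) tuples.
# All ex: identifiers are hypothetical, invented for this sketch.
RDFS_DOMAIN = "rdfs:domain"
RDF_TYPE = "rdf:type"


def apply_domain_rule(schema, data):
    """RDFS rule rdfs2: (p rdfs:domain C) and (s p o) entail (s rdf:type C)."""
    domains = {}
    for s, p, o in schema:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    inferred = set()
    for s, p, o in data:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred


# A property with a hard-coded domain, and a schema that declares none.
constrained_schema = {
    ("ex:titleOfManifestation", RDFS_DOMAIN, "ex:Manifestation"),
}
open_schema = set()  # the generalized ex:title has no declared domain

# A system that models ex:book1 as a Work reuses each property in turn.
data_constrained = {
    ("ex:book1", RDF_TYPE, "ex:Work"),
    ("ex:book1", "ex:titleOfManifestation", "Moby Dick"),
}
data_open = {
    ("ex:book1", RDF_TYPE, "ex:Work"),
    ("ex:book1", "ex:title", "Moby Dick"),
}

forced = apply_domain_rule(constrained_schema, data_constrained)
unforced = apply_domain_rule(open_schema, data_open)

print(forced)    # the reuser's Work is now also typed as a Manifestation
print(unforced)  # nothing extra is asserted about ex:book1
```

With the constrained property, a reasoner types the reuser's resource as ex:Manifestation whether or not that fits the reuser's model. With the unconstrained property, the reuser's own view stands unchanged.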
Jon Phipps questioned the usefulness of blank nodes for LOD
and explained the rationale for open superproperties:
Just two comments:
1. I _hate_ blank nodes in public-facing RDF, especially
RDF intended to be published and consumed as LOD, largely
because those nodes only provide a system-local identifier
for the thing being described. This has no utility beyond
the specific graph that 'contains' them and obfuscates
the nature of the thing as well. RDF and RDF-based LOD are
about knowledge transfer and not just data publishing. A
blank node says I have data about this thing, I can't
identify it, and you can't make any inferences about it
beyond the properties I've provided, and neither you nor
I know what it is. If you know enough about something to
give it properties, then you know enough to give it an
identifier, even (especially) if you add a significant
amount of data to it later.
2. The notion that somehow there's a cost to instantiating
an explicitly inferred superproperty when aggregating
public LOD flies in the face of much of the purpose of RDF
and its utility in navigating an open world of data where
the data model presumes that you don't _ever_ have all
of the available data and you can expand the data you do
have and dramatically increase interoperability through
intelligent inferencing guided by the publisher. The
_point_ of RDF LOD is publication of domain-specific,
system-specific knowledge in a way that can be consumed
and _understood_ in the open world of data, by systems
that have no other knowledge of the domain supplying the
data.
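As a rough illustration of point 2, the sketch below materializes inferred superproperty statements in plain Python, again with triples as tuples and no RDF library. It follows the RDFS subproperty rules (rdfs5 for transitivity, rdfs7 for applying a superproperty). The lib: and ex: names are hypothetical stand-ins for a publisher's local properties and an open superproperty; they are not actual RDA identifiers.

```python
# Triples are plain (subject, predicate, object) tuples.
# lib: and ex: identifiers are hypothetical, invented for this sketch.
SUBPROP = "rdfs:subPropertyOf"


def superproperties(schema, prop):
    """Every property reachable from prop via rdfs:subPropertyOf."""
    found, stack = set(), [prop]
    while stack:
        current = stack.pop()
        for s, p, o in schema:
            if s == current and p == SUBPROP and o not in found:
                found.add(o)
                stack.append(o)
    return found


def materialize(schema, data):
    """Return data plus each statement restated with its superproperties."""
    result = set(data)
    for s, p, o in data:
        for sup in superproperties(schema, p):
            result.add((s, sup, o))
    return result


# The publisher declares how its local property relates to open ones.
schema = {
    ("lib:titleOfWork", SUBPROP, "ex:title"),
    ("ex:title", SUBPROP, "ex:label"),
}
data = {("lib:work1", "lib:titleOfWork", "Moby Dick")}

aggregated = materialize(schema, data)

# A consumer that only knows ex:title can still find the statement.
titles = {o for s, p, o in aggregated if p == "ex:title"}
print(titles)
```

The consumer never needs to know lib:titleOfWork. The publisher's subproperty declaration is enough for an aggregator to add the more general statements, at the cost of one extra triple per superproperty.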
The RDA RDF vocabularies were designed to enable the
communication of library data to systems that have no or
limited understanding of 'library' data with as little
loss of meaning as possible for systems that might have a
clue. This is an entirely different purpose than simply
re-serializing MARC data in a different 'format', and
is the primary reason for the open superproperties. The
design is deliberately intended to support the kind of
recombinant metadata that you're suggesting -- there's no
reason why systems that have a different notion of WEMI
or WMI or W(EMI) can't describe their metadata properties
in a way that makes sense for their system and create a
relationship to the RDA superproperties that will allow
consuming systems to exploit that relationship to better
understand the data. It's about embracing the inevitable
chaos and working with it in creative and constructive
ways, rather than trying to legislate it out of existence.
Jeez, that was more rant than comment, eh?
But still just my $.03 and I hope maybe helpful.
Jon
--
Tom Baker <tbaker@tbaker.de>
Received on Sunday, 13 March 2011 23:14:21 UTC