Re: Other names for 'property groups' [was: Re: RDFa DOM API - New Editor's Draft] from Dan Brickley on 2010-06-01 (public-rdfa-wg@w3.org from June 2010)

From: Dan Brickley <danbri@danbri.org>
Date: Tue, 1 Jun 2010 19:08:24 +0200
To: Mark Birbeck <mark.birbeck@webbackplane.com>
Cc: Shane McCarron <shane@aptest.com>, Manu Sporny <msporny@digitalbazaar.com>, RDFa WG <public-rdfa-wg@w3.org>
Message-ID: <AANLkTilACF36PBUQxcamUkBemakTy2tZvx-BQsd97_4P@mail.gmail.com>
On Mon, May 31, 2010 at 9:49 PM, Mark Birbeck
<mark.birbeck@webbackplane.com> wrote:
> Hi Shane,
>
> On Mon, May 31, 2010 at 8:10 PM, Shane McCarron <shane@aptest.com> wrote:
>> I don't mind Group - but I also like "Collection".
>
> Yes, me too.
>
> As with 'property group' the distance between the term used and what
> the thing actually is, is very short:
>
>  Mum: What's a 'property collection'?
>  Mark: Oh, it's just a collection of properties that all apply to the
> same thing.
>  Mum: Ah...I see. I knew this semantic web business wasn't that difficult.

Just as a well-wisher's aside: the first time I read this, I figured 'oh,
a property collection must be something like a template', i.e. for
MP3File it might be a list like "dc:title", "dc:description",
"mo:artist", "foaf:sha1". Which is sort of useful in some RDF
contexts, and related to the argots and application profiles work. But
it's not what you mean here, I realised. It's a collection of specific
instantiated properties.

I should say that, historically, RDF has hopped around a bit on this
point. In ancient versions, what we now call rdf:Property used to be
called PropertyType; see http://www.w3.org/TR/WD-rdf-syntax-971002/

In this revision http://www.w3.org/TR/1998/WD-rdf-syntax-19981008/
that usage was changed, see this from the appendix:

"Several pieces of basic vocabulary have been changed to clarify
terminology. Of particular note in this regard is the term property
which previously meant the combination of three things; a resource
identifier, an attribute (formally called a PropertyType in the
previous draft), and the value of that attribute for that resource.
Now this combination is formally called a statement and the old term
PropertyType is shortened to just Property. This usage of "property"
conforms more closely to common usage."

I don't cite this to say the '98 usage is wrong or right, just that
this is a distinction that's always haunted us. Both uses seem somehow
intuitive, to my RDF-addled brain. We can say that foaf:primaryTopic
is a property of things that are in the foaf:Document class; but we
can also say that I have a foaf:homepage property whose value is the
thing w/ URI <http://danbri.org/>.

RDF has always tried to balance a statement-centric view with a
thing/attribute/value view. The statement-centric terminology
generally comes to the fore when we are being somehow skeptical,
emphasising for example that we have different statements from
different sources, or selecting amongst sets of statements. The
thing-centric view comes when we immerse ourselves in the viewpoint of
some set of (temporarily, pragmatically) adopted statements. Once
we've picked a set of statements, we can ask -from their point of
view- what Dan's age is, or where he works. When we're at the meta-
level, we can only ask "which sets of statements say that Dan works in
Amsterdam?", rather than "where does he work". SPARQL of course allows
for both styles, via GRAPH. The core RDF specs have never really made
these levels explicit (beyond the awkward RDF'99 reification
mechanism, now largely abandoned). But the concept is well established
in practice through tooling and via SPARQL.
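
To make the two levels concrete, here is a rough Python sketch (not
SPARQL, and not anything from the specs). It assumes statements are
stored as (source, subject, predicate, object) tuples; the source
names, the "worksIn" predicate and the Bristol claim are all invented
for illustration.

```python
# Hypothetical source-attributed statements: (source, subject, predicate, object).
# Source names and the "worksIn" predicate are made up for this sketch.
quads = [
    ("source-a", "dan", "worksIn", "Amsterdam"),
    ("source-b", "dan", "worksIn", "Amsterdam"),
    ("source-c", "dan", "worksIn", "Bristol"),
]

def sources_saying(subject, predicate, obj, data):
    """Meta-level question: which sets of statements make this claim?"""
    return sorted({g for (g, s, p, o) in data
                   if (s, p, o) == (subject, predicate, obj)})

def values_within(source, subject, predicate, data):
    """Thing-centric question, asked from inside one adopted source."""
    return [o for (g, s, p, o) in data
            if g == source and s == subject and p == predicate]

# "Which sets of statements say that Dan works in Amsterdam?"
print(sources_saying("dan", "worksIn", "Amsterdam", quads))

# Having adopted source-c, "where does he work?"
print(values_within("source-c", "dan", "worksIn", quads))
```

The first function never commits to any source; the second only makes
sense once one has been picked.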

I've seen a few comments go past here which suggest that the
statement-centric perspective is somehow *more* RDF-y. I don't think
that's quite right. RDF is designed to allow you to dive in and see
the world from the point of view of some chunk of useful data, in
which case words like class, property and value get used. And it also
allows you to zoom out, and see a world composed of competing and
complementary claims. That's when our universe consists of nothing
more than source-attributed statements. The ability to move in a fluid
way between these styles is at the heart of RDF, but also at the heart
of our terminology problems. The word 'property' being the most
awkward.

Test case:

Consider a toy world containing Alice and her two brothers, Bob and
Charlie. An RDF description of Alice might list a :name "Alice", and
then mention a :brother with name "Bob", and another :brother with
name "Charlie". So to be clear the data is:

<http://example.com/#alice> <http://ns.example.com/vocab#name> "Alice".
<http://example.com/#bob> <http://ns.example.com/vocab#name> "Bob".
<http://example.com/#charlie> <http://ns.example.com/vocab#name> "Charlie".
<http://example.com/#alice> <http://ns.example.com/vocab#brother>
<http://example.com/#bob>.
<http://example.com/#alice> <http://ns.example.com/vocab#brother>
<http://example.com/#charlie>.

Give that scenario to a dozen RDF experts, OWL gurus, JSON hackers etc
and ask them "So, how many properties of Alice are given in that
description?". One might say "three!" as there are three triples, er,
statements with her in subject role. Or you might say "two!" as there
are two different predicates used in claims about her. And let's not
get into implied properties, e.g. if :brother has a super-property
:sibling or if OWL inverses, property chains etc. are defined.

My intuition is that we won't get a clean answer from the community;
some will say "two!", others will say "three!", still others will
check to see whether you mean asserted triples in the data, or implied
triples.
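
The two asserted-data answers can be sketched in a few lines of Python.
This is only an illustration: it assumes the toy data is held as plain
(subject, predicate, object) tuples, and the helper names are made up
here. It ignores implied triples entirely.

```python
# The toy data above as (subject, predicate, object) tuples.
EX = "http://example.com/#"
VOCAB = "http://ns.example.com/vocab#"

triples = [
    (EX + "alice", VOCAB + "name", "Alice"),
    (EX + "bob", VOCAB + "name", "Bob"),
    (EX + "charlie", VOCAB + "name", "Charlie"),
    (EX + "alice", VOCAB + "brother", EX + "bob"),
    (EX + "alice", VOCAB + "brother", EX + "charlie"),
]

def statements_about(subject, data):
    """Every asserted triple with the given subject."""
    return [t for t in data if t[0] == subject]

def predicates_used_for(subject, data):
    """The distinct predicates appearing in triples about the subject."""
    return {p for (s, p, o) in data if s == subject}

alice = EX + "alice"
print(len(statements_about(alice, triples)))     # the "three!" reading
print(len(predicates_used_for(alice, triples)))  # the "two!" reading
```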

http://www.w3.org/TR/rdf-concepts/ has the usage "a predicate (also
called a property) that denotes a relationship".

http://www.w3.org/TR/rdf-mt/ is similar, eg. "and also, if it is used
to indicate a property, what values that property has for each thing
in the universe".

So I think two is the 'correct' RDF answer to the 'how many
properties' question, and this is something shared between the
1998/1999 RDF specs, and the 2004 RDFCore revisions. Whether 'two'
fits others' intuitions, I don't know.

Re the use of 'dictionary' from Python etc., I suggest keeping this
test case in mind. Simple associative arrays often don't allow two
different values for the same key. If the word you choose is going to
give programmers the idea RDF also works that way, we might end up
confusing them.
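
A small Python sketch of that pitfall, using Alice's two :brother
statements. The variable names are invented for the example, and the
list-valued mapping is just one possible workaround, not a proposal
for the API.

```python
from collections import defaultdict

BROTHER = "http://ns.example.com/vocab#brother"

# Alice's two :brother statements as (predicate, object) pairs.
pairs = [
    (BROTHER, "http://example.com/#bob"),
    (BROTHER, "http://example.com/#charlie"),
]

# A plain dictionary keeps one value per key, so the second
# assignment silently replaces the first and Bob disappears.
naive = {}
for predicate, value in pairs:
    naive[predicate] = value

# Mapping each predicate to a list of values keeps both statements.
multi = defaultdict(list)
for predicate, value in pairs:
    multi[predicate].append(value)

print(naive[BROTHER])
print(multi[BROTHER])
```

With the plain dictionary only Charlie survives; the list-valued
version still holds both brothers.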

Hope this helps, even if I don't offer any answers. I'm aware I'm not
on the WG so will be missing some context.

cheers,

Dan
Received on Tuesday, 1 June 2010 17:08:57 UTC