Re: Semantic web formalisms (was Re: Requirements for a possible "RDF 2.0") from Dan Brickley on 2010-01-16 (semantic-web@w3.org from January 2010)

From: Dan Brickley <danbri@danbri.org>
Date: Sat, 16 Jan 2010 10:13:27 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: Jiří Procházka <ojirio@gmail.com>, Semantic Web <semantic-web@w3.org>
Message-ID: <eb19f3361001160113k7352ba56ndcd0721a0d4a4860@mail.gmail.com>
2010/1/16 Pat Hayes <phayes@ihmc.us>:
>
> On Jan 15, 2010, at 4:21 PM, Jiří Procházka wrote:

>> Look at FOAF for example... Undoubtedly there is demand for such
>> lightweight semantics.

Actually not lighter, just different. In FOAF we care about things
like where the information came from, when it was true, which RDF
properties can reasonably take different values later in time, etc.
None of which is in OWL or RDFS.  Having such things would help us
better document the meanings of the terms in FOAF, and perhaps do some
automated sanity checking.

Re formalising things like 'friend', ... well the idea of a standards
committee sitting around a table debating the definition (in English
or in a logic language) of "friend" always struck me as ridiculous,
which is one of the reasons I always resisted requests to
'standardise' FOAF through somewhere like W3C.

As an aside, there are some mappings to FOAF in Cyc now; see
http://wiki.foaf-project.org/w/Cyc for details (or go to
http://sw.opencyc.org/ and search for FOAF).

> It relies upon the RDF (and bits of RDFS) semantics, though. Not of course
> to DEFINE its meanings,

From the start, I always tried to anchor FOAF in natural language
categories and terms - as a documentation technique, I wouldn't use
the word 'definition' for that either. For a long time, each class was
defined as a subclass of one generated from Wordnet hypernyms
(http://lists.w3.org/Archives/Public/www-rdf-interest/1999Dec/0002.html
and successors). That stuff is broken currently but we'll probably
bring it back, again as documentation rather than definition.

There is a paragraph in http://www.w3.org/TR/rdf-schema/ since the
RDFCore revisions, which I'll still stand by:

    """This specification does not attempt to enumerate all the
possible forms of vocabulary description that are useful for
representing the meaning of RDF classes and properties. Instead, the
RDF vocabulary description strategy is to acknowledge that there are
many techniques through which the meaning of classes and properties
can be described. Richer vocabulary or 'ontology' languages such as
DAML+OIL, W3C's [OWL] language, inference rule languages and other
formalisms (for example temporal logics) will each contribute to our
ability to capture meaningful generalizations about data in the Web.
RDF vocabulary designers can create and deploy Semantic Web
applications using the RDF vocabulary description language 1.0
facilities, while exploring richer vocabulary description languages
that share this general approach."""

The original FOAF writeup - http://www.foaf-project.org/original-intro
- described FOAF as a 'utility vocabulary', and we sketched some
family tree terms like foaf:uncle which it would be tempting to try to
formalise in a rule language. Ten years later, RIF is nearly finished,
OWL2 is done, and we could probably define some cartoon structures
that better capture things like "something is something else's uncle
if [blah blah blah male blah sibling blah parent blah]". And to the
extent those kinds of OWL2/RIF definitions shadow what real users of
the Web want to say about themselves or their families, we can happily
use any new RDF-based vocabulary technology to improve this
documentation.  And I do expect we'll see much more use of OWL2 in a
FOAF context in the coming months, particularly to describe groups of
people picked out by ad hoc characteristics.
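
To make the "cartoon structure" above concrete, here is a minimal sketch
in plain Python rather than OWL2 or RIF. The facts, the relation names
(parent_of, sibling_of, male) and the uncles() helper are all
hypothetical illustrations, not FOAF terms, and the rule is deliberately
the naive one: a male sibling of a parent.

```python
# Hypothetical toy facts; none of these names are FOAF terms.
parent_of = {("alice", "carol"), ("bob", "carol")}    # (parent, child)
sibling_of = {("dave", "alice"), ("alice", "dave")}   # (person, sibling)
male = {"bob", "dave"}

def uncles(parent_of, sibling_of, male):
    """Return (uncle, child) pairs: a male sibling of one of the child's parents."""
    return {
        (sib, child)
        for (sib, person) in sibling_of
        if sib in male
        for (parent, child) in parent_of
        if parent == person
    }

print(uncles(parent_of, sibling_of, male))
```

With these toy facts the only pair derived is dave as uncle of carol; bob
is male and a parent, but not a sibling of a parent.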

*However* ... when we have these sorts of tool in our toolkit, it is
very tempting and appealing to use them, over-use them, in situations
where they don't really fit. And FOAF is right there as the canonical
example. To a tidy-minded ontologist, it is tempting to define
foaf:gender as an enumeration with 2 values; but doing so excludes
thousands who don't feel comfortable picking from that short list. To
a tidy-minded ontologist, it's natural to define 'foaf:father' and
'foaf:mother' as functional properties; excluding millions who might
protest that their family history is a little more complicated than
that. It is always possible to write a more complex - aka pedantic -
ontology, to protest that we should be distinguishing
foaf:biologicalParent from other forms of parenting, so that we can
axiomatise the tidier parts. But doing so forces people to make
descriptive distinctions that they may well not care to make; sometimes
that fuzziness is there for good reason.
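
As a rough illustration of what a "tidy" functional constraint does to
messy data, here is a minimal Python sketch of a closed-world sanity
check. It is an assumption-laden toy: the ex: names and the
functional_violations() helper are hypothetical, and the check is
stricter than OWL's own semantics, under which two values for a
functional property would instead be inferred to denote the same
individual.

```python
from collections import defaultdict

def functional_violations(triples, prop):
    """Return subjects that have more than one distinct value for prop."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            values[s].add(o)
    return {s: vs for s, vs in values.items() if len(vs) > 1}

# Hypothetical data: one person describes two people as their father.
triples = [
    ("ex:kim", "ex:father", "ex:lee"),
    ("ex:kim", "ex:father", "ex:sam"),
    ("ex:pat", "ex:father", "ex:lee"),
]

print(functional_violations(triples, "ex:father"))
```

The check flags ex:kim, whose description is exactly the kind of
"a little more complicated" family history that the constraint would
rule out.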

So when the FOAF stuff seems to lean towards the fuzzy, messy side of
things, it isn't for lack of respect for the richness of the formal
tools we have; just a concern that we don't over-apply formality in
areas it isn't needed. If FOAF ever acquires family tree properties,
they'll lean towards definitions that defer to the views of the
parties involved; this is quite natural from a social perspective, but
something akin to rocket science if we tried to capture it all
formally.

(I have some fondness for
http://geeks-bearing-gifts.com/gbgContents.html here, if it's not
obvious)


> ...                  but to support what little inference that engines
> might need to draw; and they do: inverse functional properties for example
> play a critical role in FOAF deployment.

Yes. Actually what we really wanted was something like 'inverse
functional static', since FOAF apps merge multiple descriptions of the
same person, which might not ever have been simultaneously true. A
page about me from 2004 and a page about me from 2010 could at
authorship time both fairly ascribe me different ages. I can put all
that quite happily into a SPARQL db, but I need to be careful when
putting it into certain kinds of reasoner system. You can't merge both
documents (and the schema) into a common set of triples without
generating a contradiction; however you can reasonably expect
computers to figure out that they are nevertheless both descriptions
of the self-same person. Doing this by solely relying on the
owl:InverseFunctionalProperty nature of foaf:homepage doesn't quite
cut it; if a person can have different ages in different -
contextually reasonable - descriptions, why can't different things
have been the thing that had that homepage, at different times? In the
absence of formal languages to describe this kind of real world
practical mess, we get by with procedural code. Life goes on! But
formally speaking, there are useful kinds of data merging we want to
do over FOAF-based RDF documents which aren't sanctioned by owl:IFP.
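
The kind of procedural merging described above might be sketched as
follows. This is a minimal Python illustration under stated assumptions,
not how any particular FOAF application works: the records, the
example.org homepages, the ages and the merge_by_homepage() helper are
all hypothetical. Descriptions are grouped by homepage, and conflicting
values are kept side by side with their source rather than treated as a
contradiction.

```python
from collections import defaultdict

# Hypothetical descriptions harvested from pages written at different times.
descriptions = [
    {"source": "page-2004", "homepage": "http://example.org/a", "age": 31},
    {"source": "page-2010", "homepage": "http://example.org/a", "age": 37},
    {"source": "page-2010", "homepage": "http://example.org/b", "age": 25},
]

def merge_by_homepage(descriptions):
    """Group descriptions by homepage, keeping each value with its source."""
    merged = defaultdict(lambda: defaultdict(set))
    for d in descriptions:
        key = d["homepage"]
        for prop, value in d.items():
            if prop in ("homepage", "source"):
                continue
            # Record (value, source) so differing claims can coexist.
            merged[key][prop].add((value, d["source"]))
    return merged

people = merge_by_homepage(descriptions)
for homepage, props in sorted(people.items()):
    print(homepage, {p: sorted(v) for p, v in props.items()})
```

Note that this still treats the homepage as a timeless key, so it does
nothing about the objection that different things might have had the
same homepage at different times.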

>> If it was to be defined by some descriptive
>> language... to be frank, I can't imagine that - it would probably be
>> some upper ontology stuff which is subject of many disagreements.
>
> Agreed, we don't need that. I'm all for lightweight ontologies. But they
> still use the semantics of the formalism they are written in.

They do. Reminds me to add a section to the FOAF spec declaring
support for the real meaning of owl:sameAs; I've added mention of the
other bits of OWL and RDFS we use -
http://xmlns.com/foaf/spec/#sec-extrefs - , but forgot sameAs since it
only appears in instance data, not in our RDFS/OWL file...

>> That
>> is why we shouldn't embrace one specific descriptive language as main
>> RDF formalism, we should be able to choose it, therefore defining the
>> level our ontologies can be reasoned about by machines.
>
> No, here you go off the rails. Look, RDF does not define the meanings of
> FOAF, of course. But FOAF itself does use the formally defined (and very
> weak) meanings of things like rdf:type and owl:inverseFunctionalProperty;
> and it would actually break if these were arbitrary or not defined with some
> precision.

It's still reasonable to have various more (initially experimental)
descriptive languages built over the RDF base. For example, one that
dealt with provenance / sourcing of claims, temporal issues etc might
be attractive in some contexts but overly-costly in others. In an
ideal world these variations would all play well together, but I
wouldn't hold my breath.

>> As I understand it, you expect to teach your machine RDF, and be able to "understand" whole semantic web.

I don't think anyone seriously proposed this as a requirement for RDF
or the W3C SemWeb specs. When we talk of machine understanding and
(more often) partial understanding around here, it is pretty
metaphorical. Generally, people handle the 'understand' bit, computers
get the boring job of processing, merging, comparing, querying,
transforming and not-screwing-up our data files. To call that
'understanding' risks getting us the reputation of being wide-eyed
idealists who are trying to make the Web achieve full consciousness.
I'd settle for making computers slightly less annoying and somewhat
more useful in day to day information management tasks. On top of that
modest end, there's a lot that can be achieved. Full machine
understanding can wait for version 3... :)

cheers,

Dan
Received on Saturday, 16 January 2010 09:14:02 UTC