Use case fodder: 'better photo metadata' design issues

I'm soon to vanish for the winter break, and wanted to note some design
issues here before the new year. My apologies that these are not formatted
in a standard way nor assigned to one of the sub-group categories.

Use case fodder: better photo metadata
--------------------------------------

Some feedback from having worked on an RDF-based application (in
collaboration with Libby Miller and others in Bristol). In RDFWeb (our
semantic web vapourware 'fun and hacking' side project) we have been
describing people, documents, organisations, photographs etc., and through
doing so have bumped into some issues that relate to schema/ontology
language design.

Our current prototype explores the use of RDF to describe something of the
content of digital images; that this photo 'depicts the person whose
mailbox is xyz' and so on. We have a database of photo descriptions that
describe people (and other parts of the images) in this fashion. Since
people tend not to have well known Web identifiers (URIs) we have used a
convention for 'identification through description' in RDF, based on a
language feature from DAML+OIL, namely the dpo:UnambiguousProperty
construct. Experience with harvesting and indexing such data has led me to
conclude that the formal semantics of dpo:UnambiguousProperty are not
strong enough to support some very common use cases (such as identifying
people). Rather than couch my comments in general terms, I'm going to
stick to the particular goal of 'better photo metadata'. Similarly, we
have in pursuit of the same goal found a need to describe things which
(for lack of better terminology) *don't exist*. We want, for example, to
describe the content of cartoons and other images. Again this need goes
beyond the 'simple photo metadata' scenario, but I'm resisting the urge to
generalise.

So, for better photo metadata we want
(i) to describe digital images (identified by URI) in RDF, where the RDF
vocabularies are defined in RDF Schema augmented with a Web Ontology
language.
(ii) to be able to conclude (through understanding the formal semantics of
the RDFS and ontology language) interesting things about the content of
the images.
(iii) such as that the image depicts some identified person or other agent
(iv) or that it depicts some scene (described in RDF) whose elements may or
may not correspond to things we believe to actually exist.
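
To make (i)-(iii) a little more concrete, here is a rough Python sketch of
'identification through description'. The property names ('depicts',
'mailbox') and the example.org URIs are made up for illustration; they are
not taken from our actual schema or data. Triples are plain tuples and
'_:' marks a node that has no URI of its own.

```python
# Triples are (subject, predicate, object); "_:" marks a blank node.
# Property names and URIs below are illustrative assumptions only.
triples = [
    ("http://example.org/photos/1.jpg", "depicts", "_:p1"),
    ("_:p1", "mailbox", "mailto:alice@example.org"),
    ("http://example.org/photos/2.jpg", "depicts", "_:p2"),
    ("_:p2", "mailbox", "mailto:alice@example.org"),
    ("http://example.org/photos/3.jpg", "depicts", "_:p3"),
    ("_:p3", "mailbox", "mailto:bob@example.org"),
]


def photos_depicting(triples, mailbox):
    """Return the sorted photo URIs that depict something with this mailbox."""
    # Nodes described as having the given mailbox.
    holders = {s for s, p, o in triples if p == "mailbox" and o == mailbox}
    # Photos whose 'depicts' value is one of those nodes.
    return sorted(s for s, p, o in triples if p == "depicts" and o in holders)
```

Note that this only finds photos by matching descriptions. Concluding that
_:p1 and _:p2 are the *same* person is a further step, and depends on what
the schema says about the mailbox property.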

I have some longer notes on this use case online at...

	http://rdfweb.org/people/danbri/2001/12/puzzle/unicorny.html

...which I won't recycle in full here. I've resisted the temptation to go on
about Quine or temporal logics or whatever, and tried to give a plain story about
something we're trying to build using RDF and ontology technology. At some
point I might add some more technical detail or at least better references
and examples.


The main conclusions I draw from the photo metadata app are:

(i) dpo:UnambiguousProperty is of limited use for applications that need
to cope with a changing environment. We found a need for a property such
as wol:StaticUnambiguousProperty that guarantees to pick out an individual
across time, given some property/value pair. This is because we want to
distinguish two cases:

 - where there is AtMostOne entity with some given prop/value pair
   *at any given point in time*

 - where there is AtMostOne entity with some given prop/value pair
   *time invariant*

Practically, we want our RDF system to be able to read our schema and
understand from its WOL annotations that a pair such as
'mailbox=president@whitehouse.gov' could name different individuals at
different points in time. I hope that we can find a way to do such a thing
within our Web Ontology language without a huge leap in complexity, but
suspect that will come at the cost of having language features not readily
captured by the formal techniques used in DAML+OIL. In my (limited)
experience of using DAML+OIL, dpo:UnambiguousProperty is by far the most
useful feature of the language. Despite that, the conclusions my app is
drawing from the dataset go beyond those that are really supported by
DAML+OIL (eg. that these two photo descriptions describe images that
depict the same person).
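
The distinction between the two cases can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not what our
prototype does: 'personalMailbox' and 'roleMailbox' are hypothetical
property names, and the two sets stand in for schema annotations along the
lines of wol:StaticUnambiguousProperty and dpo:UnambiguousProperty. The
sketch merges photo descriptions only when the identifying property is
declared time-invariant, since the descriptions carry no timestamps.

```python
from collections import defaultdict

# Hypothetical property names standing in for schema annotations.
STATIC_UNAMBIGUOUS = {"personalMailbox"}  # at most one entity, across all time
UNAMBIGUOUS = {"roleMailbox"}             # at most one entity at any one time


def group_by_individual(descriptions):
    """Group photos believed to depict the same individual.

    Each description is (photo, property, value), read as "photo depicts
    the thing whose property is value". Without timestamps, only a
    time-invariant property justifies merging two descriptions.
    """
    groups = defaultdict(list)
    for index, (photo, prop, value) in enumerate(descriptions):
        if prop in STATIC_UNAMBIGUOUS:
            key = (prop, value)        # same pair => same individual
        else:
            key = ("unmerged", index)  # keep each description separate
        groups[key].append(photo)
    return sorted(sorted(group) for group in groups.values())
```

With this, two photos sharing a personalMailbox value end up in one group,
while two photos sharing roleMailbox=president@whitehouse.gov stay apart,
because that mailbox could have named different people when each photo was
taken.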

(ii) We will need to establish conventions for avoiding confusion when
describing (what I clumsily call) non-existent entities. This crops up in
photo metadata (or other descriptions of fiction, artwork etc). But also
in some other familiar areas: consider Events. We want to exchange
precise descriptions of events that may never happen (eg. the rdf-calendar
work; or mechanised negotiations in a B2B context), just as we want to
exchange descriptions of cartoons that depict entities that never existed.
Or for that matter any state of affairs that can be associated with an
agent via a propositional attitude ('believes that...', 'hopes that...',
'fears that...'). RDF, being pretty simple minded, doesn't provide a lot
'out of the box' to help with such tasks. We could use reification, or
hypertext (see above URL for an example). It may be that dealing with
non-existents is something that can be punted off to a WebOnt/RDF Primer
or best practice note. My experience with RDF is that we'll bump into
these problems sooner rather than later.
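
One possible convention, sketched here in Python rather than RDF/XML, is to
keep the description of a depicted scene 'quoted' and apart from the triples
we actually assert about the world. This is an assumption for illustration,
not an agreed practice: the 'depictsScene' property, the 'scene:1' label and
the example.org URI are all hypothetical.

```python
# Triples we assert as true of the world.
asserted = {
    ("http://example.org/img/cartoon.png", "depictsScene", "scene:1"),
}

# Quoted scene descriptions, keyed by a scene label. These are described,
# not asserted: nothing here claims that a unicorn exists.
scenes = {
    "scene:1": {("_:u", "type", "Unicorn"), ("_:u", "colour", "white")},
}


def exists_in_world(asserted, cls):
    """True if some asserted triple types a thing as cls."""
    return any(p == "type" and o == cls for s, p, o in asserted)


def scene_triples(asserted, scenes, image):
    """Return the quoted triples for every scene the image is said to depict."""
    found = set()
    for s, p, o in asserted:
        if s == image and p == "depictsScene":
            found |= scenes.get(o, set())
    return found
```

The point of the separation is that a harvester can index what the cartoon
depicts without ever merging the unicorn into its set of believed facts.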


These notes are a bit rough; hope they are of some use. At some point I'll
provide some RDF/XML test case files associated with the points made
above (you can find some similar example data by following the links from
[1]).

cheers,

Dan



[1] http://rdfweb.org/people/danbri/2001/12/puzzle/unicorny.html

-- 
mailto:danbri@w3.org
http://www.w3.org/People/DanBri/

Received on Tuesday, 18 December 2001 13:16:00 UTC