Newbie frustrations

I have a small application which I use to generate photo galleries for
my Web site.  I've been meaning for some time to add semantic metadata
to the galleries, as having that information would greatly assist a
search application.  I thought this would be
easy to accomplish -- the gallery structure and exposition are already
in an XML representation of my own devising, and the individual pages
in the gallery are generated using make and XSLT; it would not be
difficult to add a <metadata> element to the DTD and write another
XSLT script to extract the properties of each image and format the
result as RDF/XML.

It turned out to be far more difficult than I had expected.

My photo galleries are fairly typical affairs: for each gallery, there
is an index page, with a description of the gallery as a whole.  Each
photo is provided in multiple resolutions, and for each pair (photo
number, resolution) there is a photo description page in HTML which
embeds the photo and contains navigational links.

My first difficulty was in how to contain the information explosion.
In a typical 75-photo gallery, there are 376 distinct resources: one
index page, 75 thumbnails, and a description page and image file for
each of two resolutions.  But in the abstract "photo gallery"
semantics, there are only 151 actual *things*: an index page, possibly
with extended narrative, 75 photos, and 75 photo captions.  It's not
at all clear how to represent this.  I ended up (after trying several
unsatisfactory alternatives) representing each abstract photo and each
abstract caption as a named blank node, with a
dcterms:hasFormat property indicating each of the available sizes.  (I
never figured out how to use rdf:Bag or rdf:Alt for this last bit, so
each hasFormat property is written separately.)  Then the abstract
photo can refer to the abstract caption for its dc:description
property, and at least some of the implicit semantic structure is made
explicit.
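
A minimal sketch of the structure described above, written in Python
purely for illustration (the gallery itself is generated with make and
XSLT).  The base URL, directory layout, size names, and blank-node
labels are all assumptions, not the author's actual scheme.  It emits
Turtle-style statements, with prefix declarations omitted, and it
assumes the thumbnail is recorded as one more format of the photo.

```python
# Sketch only: BASE, SIZES, and the file layout are hypothetical.
BASE = "http://example.org/gallery/"
SIZES = ["medium", "large"]  # the two resolutions, names assumed


def photo_triples(n):
    """Return Turtle-style lines for abstract photo n and its caption."""
    photo, caption = f"_:photo{n}", f"_:caption{n}"
    # The abstract photo points at the abstract caption.
    lines = [f"{photo} dc:description {caption} ."]
    # Assumption: the thumbnail counts as a format of the photo.
    lines.append(f"{photo} dcterms:hasFormat <{BASE}thumb/{n}.jpg> .")
    for size in SIZES:
        # One hasFormat statement per concrete resource, written
        # separately rather than grouped in an rdf:Bag or rdf:Alt.
        lines.append(f"{photo} dcterms:hasFormat <{BASE}{size}/{n}.jpg> .")
        lines.append(f"{caption} dcterms:hasFormat <{BASE}{size}/{n}.html> .")
    return lines


if __name__ == "__main__":
    print("\n".join(photo_triples(1)))
```

Under these assumptions each photo accounts for five concrete
resources (one thumbnail, two image files, two description pages), so
75 photos plus the index page give the 376 resources counted above,
hanging off 150 blank nodes.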

The next problem was also an information explosion, and I see from the
archives of this list that it's a well-traveled road.  Each photo
description in my source file has a photographer attribute.
Originally, this was only used to automatically generate copyright
notices on each page, so all I have is the name of the photographer in
conversational order.  This is an obvious candidate for inclusion in
the gallery metadata.  My first implementation simply output the
photographer name as a literal in the dc:creator property, but I felt
like I ought to be able to do better.  I knew, in particular, that users
might want to use the foaf vocabulary to describe individuals depicted
in their photos, so I decided to represent photographers as instances
of foaf:Person.  The naive approach of using a foaf:Person as a value
of the dc:creator property failed to represent the important
underlying expectation that two photographers with the same name in
the same gallery are the same person.  So again I used the
named-blank-node approach (which obviously only works because I
keep the metadata for the whole gallery in a single document).  But in
this case it's rather less than satisfactory; I was forced to use the
generate-id() XPath function to create the names, which means that the
author of the gallery has no way of adding additional properties to
one of these automatically-generated foaf:Person instances.  I may end
up removing this function entirely and requiring the user to handle
this herself -- which I already do for the textual elements, since
my DTD doesn't represent authorship of the descriptions.
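
One possible way around the generate-id() limitation, sketched in
Python for illustration only, is to derive each blank-node label from
the photographer's name, so that the label is predictable and the
gallery author can attach further properties to it by hand.  The
labelling scheme and the function names here are hypothetical, and the
sketch does no escaping of the name literal.

```python
import re


def person_node(name):
    """Derive a predictable blank-node label from a name (assumed scheme)."""
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return f"_:person_{slug}"


def creator_triples(photos):
    """photos: list of (photo_node, photographer_name) pairs.

    Emits one foaf:Person description per distinct label, so the same
    name within one document always refers to the same node.
    """
    lines, seen = [], set()
    for photo, name in photos:
        node = person_node(name)
        if node not in seen:
            seen.add(node)
            lines.append(f'{node} a foaf:Person ; foaf:name "{name}" .')
        lines.append(f"{photo} dc:creator {node} .")
    return lines
```

The caveat is that two different names can collapse to the same label
(names differing only in punctuation or in non-ASCII characters, for
instance), so this trades one problem for another; it still only works
within a single document.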

Having made an initial proof-of-concept hack, I started to annotate an
existing photo gallery with metadata, and quickly ran aground.  There
are four obvious categories of metadata one might be interested in for
an individual photograph:

a) Technical: how the photo was taken, at what resolution, in what
orientation, etc.  I am mostly not concerned with this, since it is of
no value to my application.

b) Temporal: when the photo was taken.  This is easy to accomplish and
the choice of representation is obvious.  (I used dcterms:created and
represented the date in DTF.)

c) Geographic: where the photo was taken.  This was much more
difficult; the obvious schemas all took a very computer-oriented
approach to geocoding, representing locations as grid coordinates --
information I do not have.  I searched for hours looking for a good
representation of an ordinary street address (the only kind of
geographic location I might have access to for my photos) and didn't
find anything I would describe as "good".

d) Subject matter: what is this a photo of?  It seems that here I have
to develop my own ontology, since I don't tend to take photos of
airports and only rarely take photos of people (where foaf provides
everything that's required).

I take pictures of radio towers.  Most towers have a number, assigned
by the FCC, but some don't.  All I wanted was a way to represent
"photo X shows tower Y" in a way which would allow me to answer
queries like "show me all the photos of tower Y in temporal order".
It shouldn't be this hard!
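
To make the desired query concrete, here is a small sketch in Python
that treats the metadata as plain (subject, predicate, object) tuples.
The sample data is made up, the node labels are hypothetical, and
foaf:depicts is only one candidate property for "photo X shows tower
Y"; whether it suits towers is a modelling decision this sketch does
not settle.

```python
# Made-up sample data: labels and dates are illustrative only.
TRIPLES = [
    ("_:photo1", "foaf:depicts", "_:towerY"),
    ("_:photo1", "dcterms:created", "2005-07-04"),
    ("_:photo2", "foaf:depicts", "_:towerZ"),
    ("_:photo2", "dcterms:created", "2005-08-01"),
    ("_:photo3", "foaf:depicts", "_:towerY"),
    ("_:photo3", "dcterms:created", "2004-11-20"),
]


def photos_of(thing, triples):
    """Return the photos depicting `thing`, oldest first."""
    created = {s: o for s, p, o in triples if p == "dcterms:created"}
    hits = [s for s, p, o in triples if p == "foaf:depicts" and o == thing]
    # Dates written as YYYY-MM-DD sort chronologically as strings;
    # photos with no date sort first.
    return sorted(hits, key=lambda s: created.get(s, ""))


if __name__ == "__main__":
    print(photos_of("_:towerY", TRIPLES))  # ['_:photo3', '_:photo1']
```

The string sort relies on every date having the same precision; mixed
precisions (a bare year next to a full date) would need real date
parsing.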

Sorry for the long-winded rant.  Am I being unreasonable or is it
really expected to be this difficult?

-GAWollman

Received on Wednesday, 4 January 2006 03:28:08 UTC