- From: Richard Newman <r.newman@reading.ac.uk>
- Date: Thu, 10 Feb 2005 00:07:57 +0000
- To: Karl Dubost <karl@w3.org>
- Cc: semantic-web@w3.org
Karl,
I've had a few thoughts on this (some of which have percolated into
my iPhoto RDF exporter[1], but by no means all! I should do some more
work on that...). I'll reply in-line.
[1] <http://www.holygoat.co.uk/applications/iphoto-rdf/iphoto-rdf>
On Feb 9, 2005, at 21:46, Karl Dubost wrote:
> a. Is the good strategy one RDF file for one image? or an EXIF file?
> and an geo file? and a description file?
I'm not convinced that it matters. I'd expect it all to go into a
store, really, but you could also separate 'factual' from user-centric
data (EXIF vs. rating, for example).
> b. How do I identify the image ? a urn (which kind?) or an http uri?
> <http://example.org/photo/2005/02/06/foo.jpg> even if the image is
> not online.
If the EXIF contains a precise date, that's possible, but you would
need some way of mapping back to a file somewhere. I made up a scheme
that uses iPhoto's unique IDs, but some URN based on a timestamp might
be better. Giving it an http URI would only work if the user of the
software has a URI prefix that they control. Tough one, that. See
later answers for more.
> c. Images are between 1 and 3 Mo each, I keep them on DVDs or external
> hard drive.
> Can I use the previous identifier?
I would hope so, though the mapping between URI and a file to load may
be more complex than you would like! I'd prefer to see a more complex
schema (see answer to next question).
> d. I have many versions of the same image.
> * Original Image (2000 x 3000 px)
> * Thumbnail (75 x 75 px)
> * small version (400 x 600 px)
> * cropped version
> * published version in different context (different HTML pages, web
> sites)
> * Sampling (different images or part of the images associated with
> others)
> All these versions share one part or all parts of the information
> which is about the image. How do I define the model to identify it and
> gives information about it?
I took/take an FRBR-esque view of things: the photo is an abstract
entity (probably a RAW image that existed between CCD and
CompactFlash). It has a canonical representation (the JPEG you dragged
off the CompactFlash), which has at least one location
(file:///Users...), and a number of other representations (thumbnails,
web exports, etc.). Furthermore, there are a number of other derived
works which also have canonical representations, thumbnails, etc. ---
these would be crops, colour-corrected versions, etc.
Then there would be format changes and so on, and there are also
multiple copies (each JPEG version is linked to multiple instances by
various properties, so you could have the original JPEG
(file:///...orig.jpg), a backup burned to DVD, a copy on your Web site,
etc.).
Each of these corresponds to one of FRBR's four layers (Work,
Expression, Manifestation and Item), and has its own set of properties.
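As a rough sketch of the structure described above (not the model used in the iPhoto exporter), the abstract photo, its representations, their locations and any derived works could be held like this. The class and field names are made up for illustration, and the sketch deliberately does not commit to which class sits on which FRBR layer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Representation:
    """A concrete encoding of a photo, e.g. the original JPEG or a thumbnail."""
    media_type: str
    width: int
    height: int
    # Every known copy of this encoding: local file, DVD backup, Web copy.
    locations: List[str] = field(default_factory=list)


@dataclass
class Photo:
    """The abstract picture; it carries no bytes of its own."""
    canonical: Representation
    others: List[Representation] = field(default_factory=list)
    # Derived works such as crops or colour-corrected versions.
    derived: List["Photo"] = field(default_factory=list)


def all_locations(photo: Photo) -> List[str]:
    """Collect every copy of a photo, including copies of its derived works."""
    found: List[str] = []
    for rep in [photo.canonical, *photo.others]:
        found.extend(rep.locations)
    for child in photo.derived:
        found.extend(all_locations(child))
    return found
```

Properties such as a rating or a caption would hang off Photo, while pixel dimensions and file locations belong to a Representation, which is the point of separating the layers.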
I'd highly recommend digging up some material on FRBR; it makes this
kind of thing much clearer. Note that it still doesn't make it easier
to identify the original picture. However, if you know the original
import location (iPhoto does), you can use that as an inverse
functional property (IFP) to identify the abstract entity, skilfully
avoiding the problem of giving it a URI! E.g.
pic:originalImportLocation a owl:InverseFunctionalProperty .
_:b1 a pic:Photo ;
    pic:originalImportLocation <file:///...> .
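To show why the IFP removes the need for a URI, here is a minimal sketch of "smushing": two blank nodes that share the same import location are treated as the same photo and merged. It works on plain (subject, predicate, object) tuples rather than a real RDF store, assumes each node has at most one import location, and uses made-up file paths and made-up pic:rating and pic:caption properties.

```python
IFP = "pic:originalImportLocation"  # the property from the example above


def smush(triples, ifp=IFP):
    """Merge subjects that share a value for an inverse functional property."""
    canonical = {}  # IFP value -> first subject seen with that value
    rename = {}     # later subject -> the subject it is merged into
    for s, p, o in triples:
        if p == ifp:
            first = canonical.setdefault(o, s)
            if first != s:
                rename[s] = first
    # Rewrite every triple so merged nodes use the surviving identifier.
    return {(rename.get(s, s), p, rename.get(o, o)) for s, p, o in triples}


# Hypothetical data: _:a and _:b come from two different exports.
example = [
    ("_:a", IFP, "file:///import/1.jpg"),
    ("_:a", "pic:rating", "5"),
    ("_:b", IFP, "file:///import/1.jpg"),
    ("_:b", "pic:caption", "Beach"),
    ("_:c", IFP, "file:///import/2.jpg"),
]
merged = smush(example)
```

After merging, the caption recorded against _:b is attached to _:a, while _:c stays separate because its import location differs.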
I've done a bit of work on modelling FRBR in RDFS and OWL, which I
should also get round to finishing. It might be nice to cast it down to
images as an exemplar.
> e. Is it better to have a large RDF file with information of all
> images? Or a small individual RDF file for each image?
Whatever works best for communication. If you're sharing 10 pics, smush
their RDF together. It'll probably hit a store before it's used,
anyway.
> * RDF inside or outside the image.
>
> RDFPic recommends to put metadata inside the comment zone of JPEG. XMP
> does it in the binary. In both case, I don't think it's always a good
> idea, for privacy reason. Many softwares do not propose to wipe
> metadata before publication on the Web. Problems will arise with cell
> phones and GPS information.
There are file strippers available, but I quite understand the concern.
I'd keep it separate.
> You may have information you don't want to publish on the Web.
> Personal comments on the image, geo-localization of the image.
>
> Scenario: Someone is at a party at your place and likes very much your
> painting or you computer. Cool. He takes a picture of it and send it
> on his moblog, which displays the GPS information, then the latitude
> and longitude with a comment "We are having so much fun at Peter's
> place. He will write something about it on his weblog".
>
> Well no problems :))) Peter is leaving for holidays in Africa for one
> month. Some people have noticed that. It's time for robbery !!! We
> know the stuff inside, we know that Peter is not there. Let's go.
Good scenario. That's why I don't let it leave my machine ;)
Keeping things out of the file also avoids having to re-write files
whenever the info changes, and means you don't have to read the file to
get information about it (which is great for query servers). EXIF can
stay, as that's supposed to be representative of the basic facts about
the image, but RDF annotation should probably hang about elsewhere.
Interesting post, Karl, thanks.
-R
Received on Thursday, 10 February 2005 00:08:46 UTC