- From: Richard Newman <r.newman@reading.ac.uk>
- Date: Thu, 10 Feb 2005 00:07:57 +0000
- To: Karl Dubost <karl@w3.org>
- Cc: semantic-web@w3.org
Karl,

I've had a few thoughts on this (some of which have percolated into my
iPhoto RDF exporter[1], but by no means all! I should do some more work
on that...). I'll reply in-line.

[1] <http://www.holygoat.co.uk/applications/iphoto-rdf/iphoto-rdf>

On Feb 9, 2005, at 21:46, Karl Dubost wrote:

> a. Is the good strategy one RDF file for one image? or an EXIF file?
> and an geo file? and a description file?

I'm not convinced that it matters. I'd expect it all to go into a
store, really, but you could also separate 'factual' from user-centric
data (EXIF vs. rating, for example).

> b. How do I identify the image ? a urn (which kind?) or an http uri?
> <http://example.org/photo/2005/02/06/foo.jpg> even if the image is
> not online.

If the EXIF contains a precise date, that's possible, but you would
need some way of mapping back to a file somewhere. I made up a scheme
that uses iPhoto's unique IDs, but some URN based on a timestamp might
be better. Giving it a URI would only work if you (or rather, the user
of the software) have a prefix that you control. Tough one, that. See
later answers for more.

> c. Images are between 1 and 3 Mo each, I keep them on DVDs or external
> hard drive.
> Can I use the previous identifier?

I would hope so, though the mapping between URI and a file to load may
be more complex than you would like! I'd prefer to see a more complex
schema (see answer to next question).

> d. I have many versions of the same image.
> * Original Image (2000 x 3000 px)
> * Thumbnail (75 x 75 px)
> * small version (400 x 600 px)
> * cropped version
> * published version in different context (different HTML pages, web
> sites)
> * Sampling (different images or part of the images associated with
> others)
> All these versions share one part or all parts of the information
> which is about the image. How do I define the model to identify it and
> gives information about it?

I took/take an FRBR-esque view of things: the photo is an abstract
entity (probably a RAW image that existed between CCD and
CompactFlash). It has a canonical representation (the JPEG you dragged
off the CompactFlash), which has at least one location
(file:///Users...), and a number of other representations (thumbnails,
web exports, etc.).

Furthermore, there are a number of other derived works which also have
canonical representations, thumbnails, etc.; these would be crops,
colour-corrected versions, and so on. Then there would be format
changes, and there are also multiple copies (each JPEG version is
linked to multiple instances by various properties, so you could have
the original JPEG (file:///...orig.jpg), a backup burned to DVD, a copy
on your Web site, etc.).

Each of these corresponds to one of FRBR's 4 layers (work, expression,
manifestation, item), and has its own set of properties. I'd highly
recommend digging up some stuff on FRBR; it makes this kind of thing
much clearer.

Note that it still doesn't make it easier to identify the original
picture. If you know the original import location (iPhoto does),
though, you can use that as an IFP (inverse functional property) to
identify the abstract entity, skilfully avoiding the problem of giving
it a URI! E.g.:

    pic:originalImportLocation a owl:InverseFunctionalProperty .
    _:b1 a pic:Photo;
         pic:originalImportLocation <file:///...> .

I've done a bit of work on modelling FRBR in RDFS and OWL, which I
should also get round to finishing. It might be nice to cast it down to
images as an exemplar.

> e. Is it better to have a large RDF file with information of all
> images? Or a small individual RDF file for each image?

Whatever works best for communication. If you're sharing 10 pics, smush
their RDF together. It'll probably hit a store before it's used,
anyway.

> * RDF inside or outside the image.
>
> RDFPic recommends to put metadata inside the comment zone of JPEG. XMP
> does it in the binary.
> In both case, I don't think it's always a good
> idea, for privacy reason. Many softwares do not propose to wipe
> metadata before publication on the Web. Problems will arise with cell
> phones and GPS information.

There are file strippers available, but I quite understand the concern.
I'd keep it separate.

> You may have information you don't want to publish on the Web.
> Personal comments on the image, geo-localization of the image.
>
> Scenario: Someone is at a party at your place and likes very much your
> painting or you computer. Cool. He takes a picture of it and send it
> on his moblog, which displays the GPS information, then the latitude
> and longitude with a comment "We are having so much fun at Peter's
> place. He will write something about it on his weblog".
>
> Well no problems :))) Peter is leaving for holidays in Africa for one
> month. Some people have noticed that. It's time for robbery !!! We
> know the stuff inside, we know that Peter is not there. Let's go.

Good scenario. That's why I don't let it leave my machine ;)

Keeping things out of the file also avoids having to re-write files
whenever the info changes, and means you don't have to read the file to
get information about it (which is great for query servers). EXIF can
stay, as that's supposed to be representative of the basic facts about
the image, but RDF annotation should probably hang about elsewhere.

Interesting post, Karl, thanks.

-R
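
For anyone wanting to try the "smushing" idea from the reply to (d) and (e), here is a minimal sketch in plain Python rather than a real RDF store. It merges nodes that share a value for an inverse functional property, which is the role pic:originalImportLocation plays in the message. Everything else is an assumption made up for illustration: the tuple representation of triples, the smush function, the pic:rating and pic:caption properties, and the file:///import/... locations.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
# IFP is the property named in the message; all other names are hypothetical.
IFP = "pic:originalImportLocation"


def smush(triples, ifps):
    """Merge nodes that share a value for any inverse functional property."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the lexicographically smaller name as the representative.
            parent[max(ra, rb)] = min(ra, rb)

    first_subject = {}
    for s, p, o in triples:
        if p in ifps:
            key = (p, o)
            if key in first_subject:
                union(first_subject[key], s)
            else:
                first_subject[key] = s

    # Rewrite every triple in terms of the representative nodes.
    return {(find(s), p, find(o)) for s, p, o in triples}


store = smush(
    [
        ("_:a", IFP, "file:///import/one.jpg"),   # made-up location
        ("_:a", "pic:rating", "5"),               # made-up property
        ("_:b", IFP, "file:///import/one.jpg"),
        ("_:b", "pic:caption", "Party"),          # made-up property
        ("_:c", IFP, "file:///import/two.jpg"),
    ],
    {IFP},
)
```

Here _:a and _:b describe the same photo because they share an import location, so the caption attached to _:b ends up on _:a, while _:c stays separate. A real store would do this through its OWL support instead of hand-rolled code.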
Received on Thursday, 10 February 2005 00:08:46 UTC