
Re: XMP RDF extractors?

From: Pierre-Antoine Champin <swlists-040405@champin.net>
Date: Tue, 13 Apr 2010 17:31:26 +0100
Message-ID: <4BC49C5E.1070206@champin.net>
To: Dan Brickley <danbri@danbri.org>
CC: Leigh Dodds <leigh.dodds@talis.com>, Linking Open Data <public-lod@w3.org>
Even more of a tangent, but when I read the XMP spec in detail last year
(in relation to the Media Annotation WG), I came to two conclusions:

- XMP specifies RDF at the level of the XML serialization, which is
*ugly* (emphasis on *ugly*). Furthermore, it makes it unsafe to use
standard RDF/XML serializers, as those may not enforce those syntactic
constraints.

- XMP interprets RDF/XML in a non-standard way, considering the
following two tags as non-equivalent
  <ns1:bar    xmlns:ns1="http://example.com/foo">...
  <ns2:foobar xmlns:ns2="http://example.com/">...
(in RDF both expand to the same property IRI, http://example.com/foobar;
treating them as different is, again, a syntax-only perspective). So it
is not safe to use standard RDF/XML parsers, as they will produce a model
which may be inconsistent with that of other XMP parsers.
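
To make the second point concrete, here is a minimal sketch in Python (standard library only). It assumes the intended declarations are xmlns:ns1="http://example.com/foo" and xmlns:ns2="http://example.com/"; the <root> wrapper and the helper names are hypothetical. It contrasts the RDF/XML reading (namespace and local name concatenated into one IRI) with a syntax-level reading that keeps the pair separate.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper document; only the two child elements matter.
DOC = """<root xmlns:ns1="http://example.com/foo"
      xmlns:ns2="http://example.com/">
  <ns1:bar/>
  <ns2:foobar/>
</root>"""


def split_qname(tag):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    ns, _, local = tag[1:].partition("}")
    return ns, local


def rdf_property_iri(tag):
    """RDF/XML reading: the property IRI is namespace + local name."""
    ns, local = split_qname(tag)
    return ns + local


children = list(ET.fromstring(DOC))

# Syntax-level reading: (namespace, local name) pairs, which differ.
pairs = [split_qname(el.tag) for el in children]

# RDF reading: concatenated IRIs, which coincide.
iris = [rdf_property_iri(el.tag) for el in children]

print(pairs)
print(iris)
```

A standard RDF/XML parser only ever sees the concatenated IRIs, so it cannot preserve a distinction that an XMP-specific parser might rely on.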

So you can use neither standard serializers nor standard parsers to
handle XMP's RDF safely. As far as I'm concerned, XMP is therefore not
really RDF -- and Dan's problems extracting it strengthen this opinion
of mine...

That being said, the risks of inconsistency are minimal, especially for
parsing. So I guess there is some value in "pretending" XMP is RDF ;)
and using an RDF parser to extract it...
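
As a rough illustration of that "pretend it is RDF" approach, the sketch below locates an XMP packet in raw bytes and lists its simple text-valued properties. It uses only the Python standard library rather than a real RDF parser. It assumes the packet is stored uncompressed and uses the "x" prefix for xmpmeta; the function name and the sample bytes are hypothetical. Nested structures, rdf:resource values and property attributes are ignored.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumption: the packet is uncompressed and uses the "x" prefix.
XMPMETA = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)


def extract_xmp_properties(data):
    """Return (property IRI, text) pairs from the first XMP packet in data."""
    match = XMPMETA.search(data)
    if match is None:
        return []
    root = ET.fromstring(match.group(0))
    found = []
    for desc in root.iter("{%s}Description" % RDF_NS):
        for prop in desc:
            # ElementTree tags look like '{namespace}local'.
            ns, _, local = prop.tag[1:].partition("}")
            text = (prop.text or "").strip()
            if text:
                # RDF reading: namespace + local name gives the property IRI.
                found.append((ns + local, text))
    return found


# Hypothetical sample standing in for the bytes of a PDF file.
SAMPLE = (
    b'%PDF-1.4 ... <?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>'
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    b'<rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="">'
    b'<xmp:CreateDate>2010-04-12T23:01:36+01:00</xmp:CreateDate>'
    b'</rdf:Description></rdf:RDF></x:xmpmeta><?xpacket end="r"?> ...'
)

print(extract_xmp_properties(SAMPLE))
```

For a real file one would pass the result of open(path, "rb").read(). A packet inside a compressed stream will not be found this way.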


On 13/04/2010 16:04, Dan Brickley wrote:
> On Tue, Apr 13, 2010 at 3:56 PM, Leigh Dodds <leigh.dodds@talis.com> wrote:
>> Hi,
>> Yes.
>> PDF: http://patterns.dataincubator.org/book/linked-data-patterns.pdf
>> EPUB: http://patterns.dataincubator.org/book/linked-data-patterns.epub
> Something of a tangent but this reminds me, what's the latest on RDF
> extractors for Adobe XMP? I always used to use 'strings' and a regex
> but I haven't tracked the spec and have found this trick working
> *less* well over time, not better.
> strings linked-data-patterns.pdf | grep -i xmp
> " id="W5M0MpCehiHzreSzNTczkc9d"?><x:xmpmeta xmlns:x="adobe:ns:meta/">
> <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="">
> <xmp:CreateDate>2010-04-12T23:01:36+01:00</xmp:CreateDate>
> </x:xmpmeta><?xpacket end="r"?>
> By contrast, downloading the .epub file and unzipping you find this in
> content.opf:
> <?xml version="1.0" encoding="utf-8" standalone="no"?>
> <package xmlns="http://www.idpf.org/2007/opf" version="2.0"
> unique-identifier="bookid">
>   <metadata>
>     <dc:identifier xmlns:dc="http://purl.org/dc/elements/1.1/"
> id="bookid">_id2880071</dc:identifier>
>     <dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">Linked Data
> Patterns</dc:title>
>     <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/"
> xmlns:opf="http://www.idpf.org/2007/opf" opf:file-as="Dodds,
> Leigh">Leigh Dodds</dc:creator>
>     <dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/"
> xmlns:opf="http://www.idpf.org/2007/opf" opf:file-as="Davis, Ian">Ian
> Davis</dc:creator>
>     <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">This
> book lives at http://patterns.dataincubator.org. Check that website
> for the latest version. This work is licenced under the Creative
> Commons Attribution 2.0 UK: England &amp; Wales License. To view a
> copy of this licence, visit
> http://creativecommons.org/licenses/by/2.0/uk/. Thanks to members of
> the Linked Data mailing list for their feedback and input, and Sean
> Hannan for contributing some CSS to style the online
> book.</dc:description>
>     <dc:language xmlns:dc="http://purl.org/dc/elements/1.1/">en</dc:language>
>   </metadata>
>   <manifest>
>     <item id="ncxtoc" media-type="application/x-dtbncx+xml" href="toc.ncx"/>
>     <item id="htmltoc" media-type="application/xhtml+xml" href="bk01-toc.html"/>
>     <item id="id2880071" href="index.html" media-type="application/xhtml+xml"/>
> Wouldn't it be nice if there were easy conventions for books about RDF
> to have Webby linked RDF bundled in the files? Both seem nearly there
> but not quite... (this is not a complaint Leigh, I love this work btw!)
> cheers,
> Dan
> ps. re epub see also
> http://lists.w3.org/Archives/Public/public-lod/2010Jan/0121.html
Received on Tuesday, 13 April 2010 16:32:12 UTC
