
Re: Different results of different RDFa extractors

From: Mark Birbeck <mark.birbeck@x-port.net>
Date: Thu, 27 Mar 2008 17:32:50 +0000
Message-ID: <a707f8300803271032w7831fd21p6a9e197fef3cb3d8@mail.gmail.com>
To: "Sébastien Laborie" <Sebastien.Laborie@inrialpes.fr>
Cc: public-rdf-in-xhtml-tf@w3.org, Faisal.Alkhateeb@inrialpes.fr

Hi Sébastien,

>  We have tested the RDFa Distiller (http://www.w3.org/2007/08/pyRdfa/)
>  and the following Java extractors: RDFa extractor and SweetWiki.
>  The tested XHTML+RDFa web page is the following:
>  http://www.inrialpes.fr/exmo/people/laborie/SPARQLMM/PageWeb-CapitalEurope/index.xhtml
>  This web page has been validated with the W3C XHTML+RDFa validator.

I think RDFa Distiller is the only one that is correct. All of the
others are wrong in various ways (at least from my cursory
examination).

It might help if I clarify what a parser _should_ generate.

You have cities laid out like this:

  <span instanceof="ex:city" href="Roma">
    <span property="ex:name" content="Roma"/>
    <span rel="foaf:depiction">
      <span instanceof="foaf:Image" href="Rome.jpg">
        <span property="dc:format" content="image/jpeg"/>
      </span>
    </span>

    <div class="feature">
      <img src="Rome.jpg" alt="Rome" width="107" height="167"/>
      <p>
        Rome is the capital city of Italy and of the Lazio region, as well
        as the country's largest and most populous city, with more than 2.7
        million residents. The Colosseum or Coliseum, originally the Flavian
        Amphitheatre, is an elliptical amphitheatre in the centre of
        the city of Rome.
      </p>
    </div>
  </span>

This should generate:

  <Roma> a ex:city .
  <Roma> ex:name "Roma"@en .
  <Roma> foaf:depiction <Rome.jpg> .
  <Rome.jpg> a foaf:Image .
  <Rome.jpg> dc:format "image/jpeg"@en .

Just in passing, you can avoid the repetition of the information
relating to the image for each city, as follows:

  <span href="Roma" instanceof="ex:city">
    <span property="ex:name" content="Roma"/>

    <span rel="foaf:depiction">
      <img src="Rome.jpg"
           instanceof="foaf:Image"
           property="dc:format" content="image/jpeg"
           alt="Rome" width="107" height="167"
      />
    </span>

    <div class="feature">
      <p>
        Rome is the capital city of Italy and of the Lazio region, as well
        as the country's largest and most populous city, with more than 2.7
        million residents. The Colosseum or Coliseum, originally the Flavian
        Amphitheatre, is an elliptical amphitheatre in the centre of
        the city of Rome.
      </p>
    </div>
  </span>

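This works because @src now acts like @about: the img element itself
sets the subject for its own @instanceof and @property, so the separate
span describing the image is no longer needed.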
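
To make the comparison concrete, here is a rough Python sketch of only the
subset of the processing rules that these two layouts rely on. It is not a
conforming RDFa parser: relative IRIs are left unresolved, language tags are
ignored, and the subject order (@about, then @src, then @href when the element
has no @rel) is my simplification. The names extract, NESTED_SPANS and
IMG_WITH_SRC are made up for the illustration, and the markup is trimmed to the
RDFa-bearing elements.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the two layouts (hypothetical constant names).
NESTED_SPANS = """
<span instanceof="ex:city" href="Roma">
  <span property="ex:name" content="Roma"/>
  <span rel="foaf:depiction">
    <span instanceof="foaf:Image" href="Rome.jpg">
      <span property="dc:format" content="image/jpeg"/>
    </span>
  </span>
</span>
"""

IMG_WITH_SRC = """
<span href="Roma" instanceof="ex:city">
  <span property="ex:name" content="Roma"/>
  <span rel="foaf:depiction">
    <img src="Rome.jpg" instanceof="foaf:Image"
         property="dc:format" content="image/jpeg" alt="Rome"/>
  </span>
</span>
"""


def extract(markup):
    """Return (subject, predicate, object) tuples for a simplified RDFa subset."""
    triples = []

    def walk(el, subject, hanging_rel):
        rel = el.get("rel")
        # Assumed subject order: @about, then @src, then @href (only when
        # the element carries no @rel of its own).
        new_subject = el.get("about")
        if new_subject is None:
            new_subject = el.get("src")
        if new_subject is None and rel is None:
            new_subject = el.get("href")

        if new_subject is not None:
            # A @rel left open by an ancestor is completed by this subject.
            if hanging_rel is not None:
                triples.append((subject, hanging_rel, new_subject))
                hanging_rel = None
            subject = new_subject

        if el.get("instanceof") is not None:
            triples.append((subject, "rdf:type", el.get("instanceof")))
        if el.get("property") is not None and el.get("content") is not None:
            # Literals are wrapped in quotes to tell them apart from IRIs.
            triples.append((subject, el.get("property"), '"%s"' % el.get("content")))

        if rel is not None:
            if el.get("href") is not None:
                # @rel with its own @href is a complete triple.
                triples.append((subject, rel, el.get("href")))
                hanging_rel = None
            else:
                # @rel without @href waits for a descendant subject.
                hanging_rel = rel

        for child in el:
            walk(child, subject, hanging_rel)

    walk(ET.fromstring(markup), None, None)
    return triples


for triple in extract(IMG_WITH_SRC):
    print(triple)
```

Under those assumptions both layouts give the same five statements, which is
the point of moving the image description onto the img element.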

Regards,

Mark

-- 
  Mark Birbeck

  mark.birbeck@x-port.net | +44 (0) 20 7689 9232
  http://www.x-port.net | http://internet-apps.blogspot.com

  x-port.net Ltd. is registered in England and Wales, number 03730711
  The registered office is at:

    2nd Floor
    Titchfield House
    69-85 Tabernacle Street
    London
    EC2A 4RR
Received on Thursday, 27 March 2008 17:33:29 UTC
