
Re: rdfa parsing issue -- was: fixed https://foafssl.org/test/WebId

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Fri, 06 Jan 2012 09:22:57 -0500
Message-ID: <4F0703C1.80306@openlinksw.com>
To: public-xg-webid@w3.org

On 1/6/12 7:49 AM, Henry Story wrote:
> On 6 Jan 2012, at 02:10, Henry Story wrote:
>
>> On 6 Jan 2012, at 01:53, Henry Story wrote:
>>>> note : still i get "keys don't match"
>>> Mhh. Well perhaps another error now, or perhaps I just did not fix the error correctly.
>>> Perhaps if you send me your public/private key, then I can continue debugging it from here without going over the list.
>> I am starting to think it is my rdfa parser that has a problem. So I'll look into that in more detail first thing tomorrow.
> Yes, the problem comes from an issue with parsing your content. I am using Damian Steer's RDFa parser, which he himself describes as quick and dirty: https://github.com/shellac/java-rdfa
>
> In fact, Shellac's parser handles the document correctly when it is parsed as xhtml, but when the html parser is used it comes to a different conclusion.

And as I said to you in another post, the problems are just getting 
started. RDFa is still about xhtml squatting in the html realm re. MIME 
types. The DOCTYPE hack just makes RDFa a nightmare, since it encourages 
really lousy HTML that is hard to parse and process properly re. Linked 
Data.

People don't care about DOCTYPE when they publish, and they just opt for 
text/html.

> This is I suppose the huge problem of dealing with metadata in dirty old html.

See my comment above: the dirty only gets dirtier with RDFa coming from 
xhtml into html. Again, I encourage you to look at GoodRelations and its 
evolution re. eventual support of both Microdata and RDFa.

>   Now you may ask: why am I parsing it as html when it is quite clearly xhtml? Well, that is because you are serving your content as html.

Again, see my comments above about this problem. And this is just the start!

> --------------8<----------------------------------
> $ curl -I  http://2sea.org/sea.jsp
> HTTP/1.1 200 OK
> Server: Apache-Coyote/1.1
> Set-Cookie: JSESSIONID=737470C327A5268D1F28F146049BE510; Path=/
> Content-Type: text/html;charset=UTF-8
> Content-Length: 1996
> Date: Fri, 06 Jan 2012 12:44:05 GMT
> ------------------------------------------------
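
The choice Henry describes, letting the served Content-Type decide which parser mode is used, can be sketched roughly as follows. This is only an illustration in Python, not the code behind the test service: the function name `rdfa_format_for` is hypothetical, and the return values simply mirror the `--format XHTML` / `--format HTML` options shown in this thread.

```python
def rdfa_format_for(content_type_header: str) -> str:
    """Pick a parser mode ("XHTML" or "HTML") from a Content-Type header value.

    Hypothetical helper: the mode names mirror the --format values used
    with rdfa.sh in this thread, nothing more.
    """
    # Drop parameters such as ";charset=UTF-8" and normalise case.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    if media_type == "application/xhtml+xml":
        return "XHTML"
    if media_type == "text/html":
        return "HTML"
    raise ValueError(f"unsupported media type for RDFa: {media_type!r}")


# The header returned by http://2sea.org/sea.jsp in the curl output above:
print(rdfa_format_for("text/html;charset=UTF-8"))
```

With the header served here, a consumer following this rule ends up in HTML mode even though the markup itself is xhtml.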
>
> Here the output in xhtml seems to be correct
>
> --------------8<----------------------------------
> $ rdfa.sh --format XHTML http://2sea.org/sea.jsp
>      @prefix : <http://www.w3.org/ns/auth/cert#> .
>      @prefix sea: <http://2sea.org/sea.jsp#> .
>
>      <http://2sea.org/sea.jsp>      a <http://xmlns.com/foaf/0.1/PersonalProfileDocument>;
>           <http://xmlns.com/foaf/0.1/maker>  sea:j .
>
>      sea:j     a <http://xmlns.com/foaf/0.1/Person>;
>           :key  [
>               a :RSAPublicKey;
>               :exponent 65537;
>               :modulus "986534d06d133821f40157c15891857d537d20af028656ddca1caf93cd3cc9104d4f172cf14d6102ddf13b16852c09b3fccb0a2fe2a2e895b8f5993fd87321d1a03b656cac78726715f7198f7c5d539b8197fd35bafe274ceade6694ec38c86609a25d6988f6e749f401c37145ac11142d84d775f4f929dbcd6ba809ab4e39b3dc36087062efcf73050e313f60929b7f969b8b6bc80e25ef6000bbe66d6925aba09aed8a16271d6d9651edb27c6bb50a1ffc6bc7d8bfe8346965cf0b5993385352157fb7df1b143a97ac7d642428c1f87dd7988364115dcfa05cfb020b595417feb54febaa8ac4a81c40ba9ac1dc6a097f53b379ba9850e0a45e2f1c452f3743"^^<http://www.w3.org/2001/XMLSchema#hexBinary>  ];
>           <http://xmlns.com/foaf/0.1/depiction>  <http://2sea.org/2sealogo.png>;
>           <http://xmlns.com/foaf/0.1/name>  "jürgen" .
> ------------------------------------------------
>
>
> But as you can see, with html the key is attached to the depiction image <http://2sea.org/2sealogo.png>, not to the user anymore
>
>
> --------------8<----------------------------------
> $ rdfa.sh --format HTML http://2sea.org/sea.jsp
>
>      @prefix : <http://www.w3.org/ns/auth/cert#> .
>      @prefix sea: <http://2sea.org/sea.jsp#> .
>
>      <http://2sea.org/2sealogo.png>      :key  [
>               a :RSAPublicKey;
>               :exponent 65537;
>               :modulus "986534d06d133821f40157c15891857d537d20af028656ddca1caf93cd3cc9104d4f172cf14d6102ddf13b16852c09b3fccb0a2fe2a2e895b8f5993fd87321d1a03b656cac78726715f7198f7c5d539b8197fd35bafe274ceade6694ec38c86609a25d6988f6e749f401c37145ac11142d84d775f4f929dbcd6ba809ab4e39b3dc36087062efcf73050e313f60929b7f969b8b6bc80e25ef6000bbe66d6925aba09aed8a16271d6d9651edb27c6bb50a1ffc6bc7d8bfe8346965cf0b5993385352157fb7df1b143a97ac7d642428c1f87dd7988364115dcfa05cfb020b595417feb54febaa8ac4a81c40ba9ac1dc6a097f53b379ba9850e0a45e2f1c452f3743"^^<http://www.w3.org/2001/XMLSchema#hexBinary>  ] .
>
>      <http://2sea.org/sea.jsp>      a <http://xmlns.com/foaf/0.1/PersonalProfileDocument>;
>           <http://xmlns.com/foaf/0.1/maker>  sea:j .
>
>      sea:j     a <http://xmlns.com/foaf/0.1/Person>;
>           <http://xmlns.com/foaf/0.1/depiction>  <http://2sea.org/2sealogo.png>;
>           <http://xmlns.com/foaf/0.1/name>  "jürgen" .
> ------------------------------------------------
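
To make the difference between the two outputs concrete, here is a minimal Python sketch that looks up the key the way a verifier presumably would, by asking for `cert:key` on the WebID itself. The triples are reduced by hand from the two rdfa.sh outputs above; the blank-node label `_:key` and the helper names are assumptions made for illustration.

```python
CERT_KEY = "http://www.w3.org/ns/auth/cert#key"
FOAF_DEPICTION = "http://xmlns.com/foaf/0.1/depiction"
WEBID = "http://2sea.org/sea.jsp#j"
LOGO = "http://2sea.org/2sealogo.png"

# Hand-reduced (subject, predicate, object) triples from the two outputs above.
xhtml_triples = [
    (WEBID, CERT_KEY, "_:key"),
    (WEBID, FOAF_DEPICTION, LOGO),
]
html_triples = [
    (LOGO, CERT_KEY, "_:key"),  # the key hangs off the image in HTML mode
    (WEBID, FOAF_DEPICTION, LOGO),
]


def key_nodes_for(webid, triples):
    """Return the objects of cert:key statements whose subject is the WebID."""
    return [o for s, p, o in triples if s == webid and p == CERT_KEY]


def key_subjects(triples):
    """Return every subject that carries a cert:key statement."""
    return sorted({s for s, p, _ in triples if p == CERT_KEY})


print(key_nodes_for(WEBID, xhtml_triples))  # key found on the WebID
print(key_nodes_for(WEBID, html_triples))   # no key found on the WebID
print(key_subjects(html_triples))           # the key sits on the logo instead
```

In HTML mode the lookup on the WebID comes back empty, which would be consistent with the "keys don't match" report earlier in the thread.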
>
> RDFa 1.0 is defined for xhtml only, as I understand it, so it is true that we are going beyond what the spec covers by trying to parse html too.

Again, see my comments.

>   Perhaps this will be a lot simpler with RDFa 1.1, which can be made to work with html5.

But why not Microdata, which is natural and native to html5, as another option?

>
> Perhaps this is a simple parsing bug that can be fixed by Damian, perhaps not....

No, it's one of the real-world problems that will hit WebID. 
GoodRelations has already been there, learned, and adapted.

>
>
> Henry
>


-- 

Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen

Received on Friday, 6 January 2012 14:26:03 GMT