- From: Henry Story <henry.story@bblfish.net>
- Date: Fri, 6 Jan 2012 21:10:58 +0100
- To: Damian Steer <pldms@mac.com>
- Cc: Jürgen Jakobitsch <j.jakobitsch@semantic-web.at>, "public-xg-webid@w3.org XG" <public-xg-webid@w3.org>
Thanks Damian, that was very helpful. I have now fixed a couple of issues on my side, and I see that Jürgen has even updated his xhtml to be closer to xhtml. So the foafssl.org tester should work with that resource in any case.

Btw, I get the following in the logs:

INF: [console logger] dispatch: 2sea.org GET /sea.jsp HTTP/1.1
ERROR [pool-3-thread-5] (RDFDefaultErrorHandler.java:40) - http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd(line 106 column 22): {E213} Unexpected end of file from server

It looks like the RDFa parser is following the DTDs. Is there a way to stop that? I guess the W3C does not serve those files.

Henry

On 6 Jan 2012, at 15:57, Damian Steer wrote:

> Hi Henry and Jürgen,
>
> On 06/01/12 12:49, Henry Story wrote:
>
>> Shellac's parser parses the xhtml correctly as xhtml in fact, but
>> when the html parser is used it comes to a different conclusion.
>
> Yes, this is becoming a classic issue, and has nothing to do with RDFa
> (although RDFa obscures the issue horribly).
>
>> RDFA 1 is defined in xhtml only I understand, so it is true that we
>> are going beyond what the spec by trying to parse html too. Perhaps
>> this will be a lot simplified with rdfa1.1 which can be made to work
>> with html5.
>
> Yes, RDFa 1.0 is only really defined for xhtml, although useful work was
> done on html 5 at the time (there are some html 5 tests). RDFa 1.1 does
> address html 5, but note that it doesn't change anything here.
>
> The problem is this:
>
> <div rel="foaf:depiction" href="http://2sea.org/2sealogo.png"/>
> <div rel="cert:key">
> ...
> </div>
>
> An xml parser sees a closed div, followed by another div. An html parser
> sees a broken div so repairs it as follows:
>
> <div rel="foaf:depiction" href="http://2sea.org/2sealogo.png">
>   <div rel="cert:key">
>   ...
>   </div>
> </div> <!-- close that div -->
>
> i.e. one div contains another now, and thus you find
>
> <http://2sea.org/2sealogo.png> cert:key ....
>
> I ought to add a utility to switch the parser based on content type,
> however in practice there's so much broken xhtml out there that tag soup
> parsing is much safer (although it does lead to issues like this).
>
> My advice would be to expect tag soup parsing in the wild and change the
> html:
>
> <div rel="foaf:depiction" href="http://2sea.org/2sealogo.png"></div>
>
> Hope this makes sense,
>
> Damian

Social Web Architect
http://bblfish.net/
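For anyone who wants to see the nesting difference Damian describes, here is a minimal Python sketch. It is not the java-rdfa parser or the foafssl.org tester. Python's html.parser is only a tokenizer, so the SoupParents class below imitates just the one HTML rule that matters here (a "/>" on a non-void element such as div is ignored). The body wrapper and the names xml_parent, html_parent and SoupParents are made up for the illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# The markup from the thread, wrapped in a hypothetical <body> so it has one root.
MARKUP = (
    '<body>'
    '<div rel="foaf:depiction" href="http://2sea.org/2sealogo.png"/>'
    '<div rel="cert:key"></div>'
    '</body>'
)

# Elements that HTML really does treat as empty.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


def xml_parent(markup, rel):
    """Return (tag, rel) of the element enclosing the div with this rel, XML rules."""
    root = ET.fromstring(markup)
    for parent in root.iter():
        for child in parent:
            if child.get("rel") == rel:
                return (parent.tag, parent.get("rel"))
    return None


class SoupParents(HTMLParser):
    """Rough imitation of tag soup nesting: records each rel's enclosing element."""

    def __init__(self):
        super().__init__()
        self.stack = []     # currently open elements as (tag, rel)
        self.parents = {}   # rel -> (parent tag, parent rel)

    def handle_starttag(self, tag, attrs):
        rel = dict(attrs).get("rel")
        if rel is not None and self.stack:
            self.parents[rel] = self.stack[-1]
        if tag not in VOID:
            self.stack.append((tag, rel))

    def handle_startendtag(self, tag, attrs):
        # HTML ignores the trailing slash on non-void elements: the div stays open.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, and anything inside it.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def html_parent(markup, rel):
    parser = SoupParents()
    parser.feed(markup)
    parser.close()
    return parser.parents.get(rel)


if __name__ == "__main__":
    print("xml :", xml_parent(MARKUP, "cert:key"))
    print("html:", html_parent(MARKUP, "cert:key"))
```

Under XML rules the cert:key div is a child of body; under the imitated HTML rules it ends up inside the foaf:depiction div, which is how the logo URL becomes the subject of cert:key. Replacing the "/>" with an explicit "></div>", as Damian suggests, makes both readings agree.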
Received on Friday, 6 January 2012 20:18:28 UTC