- From: Wouter Beek <wouter@triply.cc>
- Date: Tue, 7 May 2019 07:20:19 +0200
- To: KANZAKI Masahide <mkanzaki@gmail.com>
- Cc: SW-forum Web <semantic-web@w3.org>
Dear Kanzaki,

Thank you for your information about the change in the W3C RDF/XML validator. IIUC the absence of the `rdf:RDF' root is valid, as long as there is one parent node (which is usually the case in HTML: the `<html>' tag). So that would mean that the W3C validator is not so useful ATM, until the "Extended interface" is added back.

Thank you also for pointing towards a criterion that may determine whether or not an (X)HTML document is also an RDF/XML document:

> RDF/XML has some constraints to be a "striping"
> (elements should represent node -- property -- node pattern).

I understand your hypothesis as saying that (X)HTML documents with 3 levels of nesting are RDF/XML, while (X)HTML documents with more or fewer levels of nesting are not. I tested this hypothesis by adding more nesting to my test document:

```
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>
    </title>
  </head>
  <body>
    <table>
      <tbody>
        <tr>
          <td>some <em>col</em> 1</td>
        </tr>
      </tbody>
    </table>
  </body>
</html>
```

But this still parses as 100% correct RDF/XML with Rapper: it adds more blank nodes in order to tie the additional levels of nesting together.

I'm looking for the specific criterion in the RDF/XML 1.1 specification that is implemented incorrectly by Rapper and that causes it to parse this XHTML document as RDF/XML. If there is no such criterion, then RDF is a far more popular language than many might have previously believed.

---
Best,
Wouter.

Email: wouter@triply.cc
WWW: https://triply.cc
Tel: +31647674624

On Tue, May 7, 2019 at 5:21 AM KANZAKI Masahide <mkanzaki@gmail.com> wrote:
>
> Hello Wouter,
>
> Some XHTML codes could be parsed as RDF/XML while others cause parse
> errors, because RDF/XML has some constraints to be a "striping"
> (elements should represent node -- property -- node pattern).
>
> W3C RDF Validator can parse your example XHTML by wrapping it with
> <rdf:RDF..> and </rdf:RDF>. Actually, it had "Extended interface"
> (existed until 2013, but now missing), where an option "RDF is NOT
> enclosed in <RDF>...</RDF> tags (optional since 2004, eg. see many
> DOAP files)" was provided.
>
> Note the result shows a parse error claiming 'String data "some col 1"
> not allowed.' (Other common parse errors include mixed content and
> unexpected attribute on property element.)
>
> If the <td> element is something like
>
> <td><span>some col 1</span></td>
>
> then the validator accepts it and returns triples as you expected.
>
>
> Ignoring some parse errors (good or bad), it would be possible to
> interpret arbitrary XHTML as RDF/XML. My RDF visualizer ignores them
> and seems to be able to handle most XHTML as RDF/XML [1] (check
> XML/RDF option, otherwise it would be interpreted as Microdata HTML).
>
> cheers,
>
> [1] https://www.kanzaki.com/works/2009/pub/graph-draw
>
> On Tue, May 7, 2019 at 5:03, Wouter Beek <wouter@triply.cc> wrote:
> >
> > Dear SW community,
> >
> > The RDF/XML 1.1 specification contains the following two phrases:
> >
> >   When there is only one top-level node element inside rdf:RDF, the
> >   rdf:RDF can be omitted although any XML namespaces must still be
> >   declared.
> >
> >   The XML specification also permits an XML declaration at the top
> >   of the document with the XML version and possibly the XML content
> >   encoding. This is optional but recommended.
> >
> > Does this mean that many/all (X)HTML documents are also RDF/XML
> > documents? If so, there is much more RDF out there than I had
> > previously thought. In fact, RDF would be at least as popular as HTML
> > (contrary to common complaints from the SW community about RDF's
> > popularity).
> >
> > Specifically, does the above mean that the following document should
> > be parsed by a standards-compliant RDF/XML parser:
> >
> > ```xml
> > <?xml version="1.0" encoding="utf-8"?>
> > <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
> >   "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
> > <html xmlns="http://www.w3.org/1999/xhtml">
> >   <head>
> >     <title>
> >     </title>
> >   </head>
> >   <body>
> >     <table>
> >       <tr>
> >         <td>some col 1</td>
> >       </tr>
> >     </table>
> >   </body>
> > </html>
> > ```
> >
> > , resulting in the following RDF triples (serialized in N-Triples):
> >
> > ```
> > _:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> >   <http://www.w3.org/1999/xhtmlhtml> .
> > _:genid2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> >   <http://www.w3.org/1999/xhtmltitle> .
> > _:genid1 <http://www.w3.org/1999/xhtmlhead> _:genid2 .
> > _:genid3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> >   <http://www.w3.org/1999/xhtmltable> .
> > _:genid4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> >   <http://www.w3.org/1999/xhtmltd> .
> > _:genid3 <http://www.w3.org/1999/xhtmltr> _:genid4 .
> > _:genid1 <http://www.w3.org/1999/xhtmlbody> _:genid3 .
> > ```
> >
> > ---
> > Best regards,
> > Wouter Beek.
> >
> > Email: wouter@triply.cc
> > WWW: https://triply.cc
> > Tel: +31647674624
> >
> >
>
> --
> @prefix : <http://www.kanzaki.com/ns/sig#> . <> :from [:name
> "KANZAKI Masahide"; :nick "masaka"; :email "mkanzaki@gmail.com"].

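The "striping" constraint discussed in this thread can be illustrated with a small Python sketch. It is only a rough approximation under stated assumptions, not the RDF/XML grammar itself and not how Rapper or the W3C validator work: it treats the root as a node element, alternates node and property levels, and ignores attributes, `rdf:parseType`, and the `rdf:RDF` wrapper entirely. The function name `striping_violations` and the message strings are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def striping_violations(xml_text):
    """Return (tag, problem) pairs found by a simplified striping check.

    Assumption: the root is a node element, and levels alternate
    node -> property -> node. Attributes and rdf:parseType are ignored.
    """
    problems = []

    def has_text(s):
        # Whitespace-only character data is allowed between elements.
        return bool(s and s.strip())

    def check_node(elem):
        # A node element may hold only property elements and whitespace.
        if has_text(elem.text) or any(has_text(c.tail) for c in elem):
            problems.append((elem.tag, "string data in node element"))
        for child in elem:
            check_property(child)

    def check_property(elem):
        children = list(elem)
        if not children:
            return  # empty or plain-literal property element
        if has_text(elem.text) or any(has_text(c.tail) for c in children):
            problems.append((elem.tag, "mixed content in property element"))
        if len(children) > 1:
            problems.append((elem.tag, "more than one node in property element"))
        for child in children:
            check_node(child)

    check_node(ET.fromstring(xml_text))
    return problems


# The first test document from the thread (XML declaration and DOCTYPE omitted).
EXAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>
  </title></head>
  <body><table><tr><td>some col 1</td></tr></table></body>
</html>"""

# The variant with <span> inside <td>, as suggested in the reply.
EXAMPLE_WITH_SPAN = EXAMPLE.replace(
    "<td>some col 1</td>", "<td><span>some col 1</span></td>"
)

print(striping_violations(EXAMPLE))
# [('{http://www.w3.org/1999/xhtml}td', 'string data in node element')]
print(striping_violations(EXAMPLE_WITH_SPAN))
# []
```

Under these simplifying assumptions, `<td>` sits at a node level in the first document, so its text is flagged, which is consistent with the 'String data "some col 1" not allowed' error reported above. Wrapping the text in `<span>` moves it to a property level, where a plain literal is acceptable.
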
Received on Tuesday, 7 May 2019 05:21:21 UTC