- From: Wouter Beek <wouter@triply.cc>
- Date: Tue, 7 May 2019 07:03:10 +0200
- To: "Charles 'chaals' (McCathie) Nevile" <chaals@yandex.ru>
- Cc: "semantic-web@w3.org" <semantic-web@w3.org>
Dear Charles,

I'm not referring to RDFa, JSON-LD, or Microformats in my question. I'm specifically interested in which (X)HTML-only documents are also RDF/XML documents.

If you say that most (X)HTML documents are not RDF/XML documents, do you mean that there is a specific criterion that most (X)HTML documents fail to meet that makes them not RDF/XML documents? I am interested in making such a criterion explicit.

---
Best,
Wouter.

Email: wouter@triply.cc
WWW: https://triply.cc
Tel: +31647674624

On Tue, May 7, 2019 at 1:02 AM Charles 'chaals' (McCathie) Nevile <chaals@yandex.ru> wrote:
>
> Then the answer is clear: No, for the most part they are not.
>
> The exception is data (mostly schema.org) encoded as RDFa or JSON-LD.
>
> There is also a sense in which a reasonable amount of microformats and
> microdata (the latter has been most of the entire included data in the
> wild, but I think JSON-LD might catch it one day), is reasonably
> straightforwardly RDF.
>
> Collectively all of that is not uncommon - reasonable claims suggest
> double-digit percentages of modern web content and *maybe* as much as a
> quarter or more.
>
> For example schema.org's *model* for the data is RDF, whatever the
> encoding. On the other hand microdata was specifically designed as an
> anti-RDF, so a certain amount of it isn't RDF by any stretch, and in any
> event you have to process it so I am not sure how that counts in what you
> are looking for (you have to process JSON-LD and RDFa, both of which are
> explicitly RDF, to match one to another...)
>
> cheers
>
> Chaals
>
> On Tue, 07 May 2019 00:46:49 +0200, Wouter Beek <wouter@triply.cc> wrote:
>
> > Dear Martynas,
> >
> > I am not interested in generating RDF/XML from non-RDF input. I'm
> > asking whether all/most/some regular HTML documents are also RDF
> > documents (without applying additional transformations).
> >
> > ---
> > Best,
> > Wouter.
> >
> > Email: wouter@triply.cc
> > WWW: https://triply.cc
> > Tel: +31647674624
> >
> > On Tue, May 7, 2019 at 12:03 AM Martynas Jusevičius
> > <martynas@atomgraph.com> wrote:
> >>
> >> You could generate the desired RDF/XML output with XSLT quite easily.
> >> This is what GRDDL is about:
> >> https://www.w3.org/TR/grddl/#grddl-xhtml
> >>
> >> On Mon, May 6, 2019 at 10:02 PM Wouter Beek <wouter@triply.cc> wrote:
> >> >
> >> > Dear SW community,
> >> >
> >> > The RDF/XML 1.1 specification contains the following two phrases:
> >> >
> >> >   When there is only one top-level node element inside rdf:RDF, the
> >> >   rdf:RDF can be omitted although any XML namespaces must still be
> >> >   declared.
> >> >
> >> >   The XML specification also permits an XML declaration at the top
> >> >   of the document with the XML version and possibly the XML content
> >> >   encoding. This is optional but recommended.
> >> >
> >> > Does this mean that many/all (X)HTML documents are also RDF/XML
> >> > documents? If so, there is much more RDF out there than I had
> >> > previously thought. In fact, RDF would be at least as popular as HTML
> >> > (contrary to common complaints from the SW community about RDF's
> >> > popularity).
> >> >
> >> > Specifically, does the above mean that the following document should
> >> > be parsed by a standards-compliant RDF/XML parser:
> >> >
> >> > ```xml
> >> > <?xml version="1.0" encoding="utf-8"?>
> >> > <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
> >> >   "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
> >> > <html xmlns="http://www.w3.org/1999/xhtml">
> >> >   <head>
> >> >     <title>
> >> >     </title>
> >> >   </head>
> >> >   <body>
> >> >     <table>
> >> >       <tr>
> >> >         <td>some col 1</td>
> >> >       </tr>
> >> >     </table>
> >> >   </body>
> >> > </html>
> >> > ```
> >> >
> >> > , resulting in the following RDF triples (serialized in N-Triples):
> >> >
> >> > ```
> >> > _:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/xhtmlhtml> .
> >> > _:genid2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/xhtmltitle> .
> >> > _:genid1 <http://www.w3.org/1999/xhtmlhead> _:genid2 .
> >> > _:genid3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/xhtmltable> .
> >> > _:genid4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/xhtmltd> .
> >> > _:genid3 <http://www.w3.org/1999/xhtmltr> _:genid4 .
> >> > _:genid1 <http://www.w3.org/1999/xhtmlbody> _:genid3 .
> >> > ```
> >> >
> >> > ---
> >> > Best regards,
> >> > Wouter Beek.
> >> >
> >> > Email: wouter@triply.cc
> >> > WWW: https://triply.cc
> >> > Tel: +31647674624
> >> >
>
> --
> Using Opera's mail client: http://www.opera.com/mail/
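
The reading proposed in the original question can be sketched in a few lines of Python. This is only an illustration of how the seven triples above would arise if the XHTML elements were read alternately as node elements and property elements; it is not a standards-compliant RDF/XML parser and does not settle whether such a parser should accept the document. The names `iri` and `striped_triples` are made up for this sketch, the sample omits the XML declaration and DOCTYPE, and text content such as "some col 1" is ignored, as it is in the triples listed in the question.

```python
import xml.etree.ElementTree as ET
from itertools import count

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Reduced version of the XHTML document from the question (assumption:
# the XML declaration and DOCTYPE do not matter for this illustration).
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title></title></head>
  <body><table><tr><td>some col 1</td></tr></table></body>
</html>"""


def iri(tag):
    # ElementTree spells a qualified name as "{namespace}local".
    # The IRI is the plain concatenation of both parts, with no separator,
    # which is why the question shows ".../xhtmlhtml".
    # Assumes every element is in a namespace.
    namespace, local = tag[1:].split("}")
    return namespace + local


def striped_triples(root):
    """Read elements alternately as node elements and property elements."""
    ids = count(1)
    triples = []

    def node(element):
        subject = f"_:genid{next(ids)}"
        triples.append((subject, RDF_TYPE, iri(element.tag)))
        for prop in element:        # children: read as property elements
            for child in prop:      # grandchildren: node elements again
                triples.append((subject, iri(prop.tag), node(child)))
        return subject              # text content is deliberately ignored

    node(root)
    return triples


if __name__ == "__main__":
    for s, p, o in striped_triples(ET.fromstring(XHTML)):
        obj = o if o.startswith("_:") else f"<{o}>"
        print(f"{s} <{p}> {obj} .")
```

Under these assumptions the sketch emits one `rdf:type` triple per node element (`html`, `title`, `table`, `td`) and one linking triple per property element (`head`, `tr`, `body`), which is the pattern in the N-Triples listing above.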
Received on Tuesday, 7 May 2019 05:04:11 UTC