
Re: Are many (X)HTML documents also RDF/XML documents?

From: Wouter Beek <wouter@triply.cc>
Date: Tue, 7 May 2019 00:46:49 +0200
Message-ID: <CAEh2WcODcBTcs-OnvnVXUSmdqTdqPiVi4ADo8RQjjUVML6Nsbw@mail.gmail.com>
To: Martynas Jusevičius <martynas@atomgraph.com>
Cc: SW-forum Web <semantic-web@w3.org>
Dear Martynas,

I am not interested in generating RDF/XML from non-RDF input.  I'm
asking whether all, most, or some regular (X)HTML documents are
themselves RDF/XML documents, as-is and without applying any
additional transformation.
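
To make the question concrete, here is a minimal Python sketch of the
naive "striped" reading I have in mind, applied to the XHTML document
quoted below.  It is an illustration only, not a conformant RDF/XML
parser: the function names (`iri`, `naive_triples`) and the `genidN`
blank node labels are my own assumptions, and the sketch silently
ignores attributes and text content, which a real parser is not
obliged to do.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# The XHTML document from the original question, unchanged.
XHTML = b"""<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>
  </title>
</head>
<body>
  <table>
    <tr>
      <td>some col 1</td>
    </tr>
  </table>
</body>
</html>
"""


def iri(tag):
    # ElementTree spells qualified names as "{namespace}local".  RDF/XML
    # forms the IRI by concatenating namespace and local name directly,
    # which is why the IRIs end in "xhtmlhtml", "xhtmlhead", and so on.
    return tag[1:].replace("}", "", 1) if tag.startswith("{") else tag


def naive_triples(xml_bytes):
    """Read elements alternately as node elements and property elements."""
    triples = []
    counter = 0

    def node(elem):
        # Node element: a fresh blank node typed by the element name.
        nonlocal counter
        counter += 1
        subject = f"_:genid{counter}"
        triples.append((subject, f"<{RDF_TYPE}>", f"<{iri(elem.tag)}>"))
        for prop in elem:
            # Property element: only the case of exactly one nested node
            # element is handled; everything else is skipped (assumption).
            children = list(prop)
            if len(children) == 1:
                obj = node(children[0])
                triples.append((subject, f"<{iri(prop.tag)}>", obj))
        return subject

    node(ET.fromstring(xml_bytes))
    return triples


for s, p, o in naive_triples(XHTML):
    print(s, p, o, ".")
```

Under these assumptions the sketch emits seven triples with the same
shape as the ones in my original message.  Whether a
standards-compliant parser is required to accept the document at all
is exactly what I am asking.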

---
Best,
Wouter.

Email: wouter@triply.cc
WWW: https://triply.cc
Tel: +31647674624

On Tue, May 7, 2019 at 12:03 AM Martynas Jusevičius
<martynas@atomgraph.com> wrote:
>
> You could generate the desired RDF/XML output with XSLT quite easily.
> This is what GRDDL is about:
> https://www.w3.org/TR/grddl/#grddl-xhtml
>
> On Mon, May 6, 2019 at 10:02 PM Wouter Beek <wouter@triply.cc> wrote:
> >
> > Dear SW community,
> >
> > The RDF/XML 1.1 specification contains the following two phrases:
> >
> >     When there is only one top-level node element inside rdf:RDF, the
> > rdf:RDF can be omitted although any XML namespaces must still be
> > declared.
> >
> >     The XML specification also permits an XML declaration at the top
> > of the document with the XML version and possibly the XML content
> > encoding. This is optional but recommended.
> >
> > Does this mean that many/all (X)HTML documents are also RDF/XML
> > documents?  If so, there is much more RDF out there than I had
> > previously thought.  In fact, RDF would be at least as popular as HTML
> > (contrary to common complaints from the SW community about RDF's
> > popularity).
> >
> > Specifically, does the above mean that the following document should
> > be parsed by a standards-compliant RDF/XML parser:
> >
> > ```xml
> > <?xml version="1.0" encoding="utf-8"?>
> > <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
> > "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
> > <html xmlns="http://www.w3.org/1999/xhtml">
> > <head>
> >   <title>
> >   </title>
> > </head>
> > <body>
> >   <table>
> >     <tr>
> >       <td>some col 1</td>
> >     </tr>
> >   </table>
> > </body>
> > </html>
> > ```
> >
> > and that parsing it results in the following RDF triples (serialized
> > in N-Triples):
> >
> > ```
> > _:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/xhtmlhtml> .
> > _:genid2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/xhtmltitle> .
> > _:genid1 <http://www.w3.org/1999/xhtmlhead> _:genid2 .
> > _:genid3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/xhtmltable> .
> > _:genid4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/xhtmltd> .
> > _:genid3 <http://www.w3.org/1999/xhtmltr> _:genid4 .
> > _:genid1 <http://www.w3.org/1999/xhtmlbody> _:genid3 .
> > ```
> >
> > ---
> > Best regards,
> > Wouter Beek.
> >
> > Email: wouter@triply.cc
> > WWW: https://triply.cc
> > Tel: +31647674624
> >
Received on Monday, 6 May 2019 22:47:48 UTC
