Re: xhtml+rdfa parsing as html Re: RDFa is a Candidate Recommendation from Mark Birbeck on 2008-06-23 (public-rdf-in-xhtml-tf@w3.org from June 2008)

From: Mark Birbeck <mark.birbeck@webbackplane.com>
Date: Mon, 23 Jun 2008 08:47:47 +0100
To: "Karl Dubost" <karl@w3.org>
Cc: "Ben Adida" <ben@adida.net>, "Shane McCarron" <shane@aptest.com>, "Ivan Herman" <ivan@w3.org>, "Ralph R. Swick" <swick@w3.org>, public-rdf-in-xhtml-tf@w3.org, public-swd-wg@w3.org, "olivier Thereaux" <ot@w3.org>
Message-ID: <ed77aa9f0806230047r1b956efdl8e50f3ddc8a11624@mail.gmail.com>

Hi Karl,

> If a document is sent as text/html, it will be processed following the HTML
> 5 parsing algorithm and will possibly give a very different DOM.

Which browsers are you thinking of here? We hear a lot about
'standardising the web', and currently the only DOMs supported in
browsers are the HTML DOM and the XHTML DOM. And the RDFa parsing
algorithm has been structured carefully so that it can cope with
either; there are server-side implementations that work on XHTML using
XML parsers, but there are also client-side JavaScript implementations
that work just as well on an HTML DOM.


> An XHTML document sent as text/html is not defined by [XHTML Media Types -
> Second Edition][1]. It is not said how it should be parsed, except
> referencing HTML 4.01 which recommends to parse as SGML. SGML parsers are
> not implemented in most user agents.

But RDFa is a collection of attributes, so we _should_ only really
need to define the attributes and their relationship to each other,
along with their semantics. The definition could then apply to any
language that can contain attributes. In other words, we shouldn't
need to define the underlying parsing model for the host language,
only the 'get attribute' part.

However, that was always going to be a bit too radical, so instead we
have defined the very specific situation where RDFa is used in XHTML.

Of course, if a user agent handles XHTML as if it were HTML, that
behaviour obviously falls outside the scope of our standard. But if an
implementer wants to apply the same algorithms to a document that is
now running in HTML mode, they will find that it can easily be made to
work.
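
As a rough illustration of the 'get attribute' point, here is a minimal
Python sketch. It is not the RDFa processing algorithm from the
specification; the names pick_rdfa, HtmlCollector, collect_from_html,
collect_from_xml and SAMPLE are invented for this example, and the
attribute list is only an illustrative subset. It assumes nothing
beyond the standard library, and simply shows one attribute-level
routine being driven by both an HTML parser and an XML parser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Illustrative subset of RDFa attributes (an assumption for this sketch);
# the real attribute set and rules are defined by the RDFa specification.
RDFA_ATTRS = ("about", "property", "content", "rel", "href", "typeof")


def pick_rdfa(get_attribute):
    """Collect RDFa attributes using only a 'get attribute' callback."""
    found = {}
    for name in RDFA_ATTRS:
        value = get_attribute(name)
        if value is not None:
            found[name] = value
    return found


class HtmlCollector(HTMLParser):
    """Records RDFa attributes per element from an HTML-style parse."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names.
        found = pick_rdfa(dict(attrs).get)
        if found:
            self.results.append((tag, found))


def collect_from_html(markup):
    """Parse markup with the HTML parser and collect RDFa attributes."""
    parser = HtmlCollector()
    parser.feed(markup)
    parser.close()
    return parser.results


def collect_from_xml(markup):
    """Parse markup with an XML parser and collect RDFa attributes."""
    results = []
    for element in ET.fromstring(markup).iter():
        found = pick_rdfa(element.attrib.get)
        if found:
            # Drop the '{namespace}' prefix ElementTree puts on tag names.
            tag = element.tag.rsplit("}", 1)[-1]
            results.append((tag, found))
    return results


# Hypothetical sample document, well-formed enough for both parsers.
SAMPLE = (
    '<div xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/" about="/doc">'
    '<span property="dc:title">RDFa in HTML</span>'
    '<a rel="dc:creator" href="http://example.org/mark">Mark</a>'
    '</div>'
)

if __name__ == "__main__":
    print(collect_from_html(SAMPLE))
    print(collect_from_xml(SAMPLE))
```

Both collectors return the same three (tag, attributes) pairs for
SAMPLE, since pick_rdfa never looks at how the element was parsed. The
sketch deliberately leaves the 'dc:' prefix unresolved: the XML parser
treats xmlns:dc as a namespace declaration while the HTML parser sees
it as an ordinary attribute, so prefix handling would need its own
host-language-specific step.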

Regards,

Mark

-- 
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)

Received on Monday, 23 June 2008 07:48:26 UTC