Re: Discussion with Ian and Henri about HTML5+RDFa (part 2/2) from Henri Sivonen on 2009-01-20 (public-rdf-in-xhtml-tf@w3.org from January 2009)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Tue, 20 Jan 2009 11:30:47 +0200
To: Manu Sporny <msporny@digitalbazaar.com>
Cc: RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>, Ian Hickson <ian@hixie.ch>
Message-Id: <CC463B15-94CF-4E71-A86E-632CBAF5AC48@iki.fi>

On Jan 20, 2009, at 02:01, Manu Sporny wrote:

> Education about RDFa would also be an issue with the majority of web
> authors who don't care about web semantics and just want to get their
> page operational.

> The problem is with less-than-guru web authors who don't necessarily
> care about web semantics and thus generate bad semantic data out of
> ignorance. The common mis-use of @rev was cited as one possible outcome
> - mis-used so badly that it is commonly not trusted by search engines.

My concern was not so much about educating people to produce
non-garbage triples per se as about people getting to make an educated
decision about publishing RDFa data. If RDFa is evangelized to people  
outside the SemWeb community as something they should do by cargo-cult  
copy and paste without really understanding it, it's very likely that  
people will end up sinking their limited resources into doing  
something that complicates their pages but doesn't really help them.  
And they'll become frustrated later when they find they didn't get a  
benefit for their effort. (Of course, if people publish garbage  
triples, it's clearly wasted effort.)

I believe spec developers have a duty to avoid making people expend  
effort without a concrete benefit (to themselves, in a reasonable time  
frame). They could be expending that effort towards something more  
concretely useful. (See also the discussions about whether the  
longdesc attribute should be conforming in HTML5.)

> The example of the W3C
> serving up many, many gigabytes of the same HTML4.01 DTD every day was
> cited as an example of what happens when your "vocabulary" becomes
> popular.

(To be precise, this is more of an issue with the XML DTDs than the  
HTML ones, because there are actual DTD-loading XML parsers out there.)

It isn't only about the ability of the vocabulary server to serve a  
lot of data. It's a problem when communication between two parties on  
the network is reliant on a third party keeping a service running.

See http://hsivonen.iki.fi/no-dtd/
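
As a rough illustration of the third-party dependency described above, here is a minimal Python sketch of one possible mitigation on the consumer side: resolving DTD requests from a local table instead of the network. This is not a technique proposed in this message; the names LOCAL_DTDS and LocalOnlyResolver and the placeholder DTD bytes are assumptions made up for the example.

```python
import io
from xml.sax.handler import EntityResolver
from xml.sax.xmlreader import InputSource

# Hypothetical local catalog: public identifier -> locally stored DTD bytes.
# The bytes here are a placeholder, not the real DTD.
LOCAL_DTDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": b"<!-- local copy of the DTD would go here -->",
}


class LocalOnlyResolver(EntityResolver):
    """Answer entity requests from LOCAL_DTDS; never hand back a network URL."""

    def resolveEntity(self, publicId, systemId):
        source = InputSource(systemId)
        source.setPublicId(publicId)
        # Unknown identifiers get an empty stream rather than a remote fetch.
        data = LOCAL_DTDS.get(publicId, b"")
        source.setByteStream(io.BytesIO(data))
        return source
```

A SAX parser could be given such a resolver with setEntityResolver(); whether it is consulted depends on the parser and its feature settings. Note that this only moves the dependency into a local copy that has to be kept around, whereas a format that needs no DTD avoids the problem entirely.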

> He cited that using something of the form of a pseudo-namespaced
> "foaf-foo" where each token was specified in a spec somewhere, but
> there was no way to follow your nose to it, or validate against it,
> would solve the "failure-due-to-popularity" issue.

You could validate against it if you had a validator that also knew  
about foaf-foo a priori.
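
A minimal sketch of what "knowing about foaf-foo a priori" could look like, in Python. The token list and the function name validate_tokens are hypothetical; a real validator would copy its list from whatever spec defines the tokens, and nothing is fetched at validation time.

```python
# Hypothetical built-in vocabulary, shipped with the validator itself.
KNOWN_TOKENS = frozenset({"foaf-foo", "foaf-name"})


def validate_tokens(value):
    """Return the tokens of a space-separated attribute value that are not known."""
    return [token for token in value.split() if token not in KNOWN_TOKENS]
```

For example, validate_tokens("foaf-foo foaf-typo") reports only "foaf-typo", without any network access.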

> Henri was certainly sympathetic to embedding semantics in HTML for
> everyone that needed the functionality (not just the 80% that
> Microformats addresses) in HTML.

I did mention a notable concern about this that I don't know how to  
address except by avoiding experimentation with RDFa in the first place:

For any specific use case, an RDF vocabulary expressed in RDFa is an  
inferior syntax and data model for that use case compared to a  
solution developed specifically for the use case. So if you use RDFa  
to experiment whether a use case has merit, by the time you have shown  
that the use case does have merit, you are stuck with a syntax and a  
data model that suck for the use case and will hinder further growth  
and/or be wasteful in terms of the complexity cost on both producers  
and consumers.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Tuesday, 20 January 2009 09:31:29 UTC