Re: RDFa and Web Directions North 2009

On 17/2/09 07:47, Ian Hickson wrote:
> This is a bulk reply to several e-mails on this thread. I apologise for
> its length.

Thanks for this. Just a couple of points for now.



[...]

> Oh don't get me wrong, if there are solutions that have commonalities,
> then obviously we should reuse solutions where possible. For example, we
> had several use cases -- offline Hotmail, offline Google Spreadsheets,
> offline Flickr -- and we came up with a single solution that covers all of
> these. But we evaluated the solution against each case independently.
>
> In other words, to get a good result, instead of:
>
>   1. Find problems.
>   2. Extract commonalities of problems.
>   3. Adopt a solution that solves the commonalities.
>
> ...the process needs to be:
>
>   1. Find problems.
>   2. Propose solutions that solve one or more of those problems.
>   3. Evaluate the solutions against each problem.
>   4. If a solution is found that addresses many of the problems, adopt it.

As an aside here, this is an approach to data exchange very much in the 
spirit of what we built with RDF. From a schema-sharing perspective, 
probably the most distinctive (or even 'odd', or maybe 'innovative') 
feature of RDF is the granularity of re-use. Instead of schema authors 
writing big things that are re-used wholesale to solve other people's 
problems, they contribute definitions of classes, and of properties, 
which may each individually be reused. So instead of suggesting everyone 
writes People-description-documents or Group-description-documents that 
follow some exact doc format, the RDF approach is to say "OK, we already 
have a schema that defines classes for Person and Group. Are these, and 
the associated properties, and associated public data, any use for this 
current problem? Can we fix what's missing by defining new properties 
and classes that reference these existing contributions?". So the idea 
is to build things out, bit by bit, and allow each dataset to meet the 
needs of its creators while using whichever fragments of prior work make 
sense for them.
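
To make the granularity point concrete, here is a minimal sketch in 
Python. It is illustrative only: the example.org URIs, the CLUB 
namespace and its hardestGrade property are invented for this example. 
Only the FOAF and RDF term URIs are real. The dataset reuses 
individual FOAF terms (Person, Group, name, member) and adds a single 
new property of its own, and the helper counts which vocabularies the 
data draws on.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"      # existing, widely reused schema
CLUB = "http://example.org/club-terms#"  # hypothetical new schema (assumption)

# Hypothetical data: each row is a (subject, predicate, object) triple.
triples = [
    ("http://example.org/people#alice", RDF_TYPE, FOAF + "Person"),
    ("http://example.org/people#alice", FOAF + "name", "Alice"),
    ("http://example.org/groups#climbers", RDF_TYPE, FOAF + "Group"),
    ("http://example.org/groups#climbers", FOAF + "member",
     "http://example.org/people#alice"),
    # The one thing FOAF does not cover is filled in by a new property.
    ("http://example.org/people#alice", CLUB + "hardestGrade", "6b"),
]


def namespace_of(term):
    """Split a term URI at its last '#' or '/' and return the namespace part."""
    cut = max(term.rfind("#"), term.rfind("/"))
    return term[: cut + 1]


def vocabularies_used(data):
    """Count schema terms per namespace: every property, plus every class
    that appears as the object of rdf:type."""
    counts = Counter()
    for _subject, predicate, obj in data:
        terms = [predicate]
        if predicate == RDF_TYPE:
            terms.append(obj)
        for term in terms:
            counts[namespace_of(term)] += 1
    return counts


for namespace, count in vocabularies_used(triples).most_common():
    print(count, namespace)
```

Nothing here requires the new schema to copy or wrap FOAF; it simply 
sits alongside the reused terms in the same set of triples.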





> On Fri, 13 Feb 2009, Ben Adida wrote:
>> [...] we're not asking browsers to implement any specific features other
>> than make those attributes officially available in the DOM.
>
> You presumably do want some user agents some where at some time to do
> something with these triples, otherwise what's the point? Whether this is
> through extensions, or through browsers in ten years when the state of the
> art is at the point where something useful can be done with any RDFa, or
> through search engines processing RDFa data, there has to be _some_ user
> agent somewhere that uses this data, otherwise what's the point?

Well, this little exchange is a bit peculiar.

Ben is saying that we are not asking _browsers_ to do anything beyond 
exposing the data. Ian responds by talking about various ways in which 
different kinds of user agent (including e.g. search engines) might 
eventually use RDFa. Ben is talking about the minimum we need from 
browser makers before this effort can move forward, while Ian's response 
focuses on the fact that we must clearly be wanting someone to do 
*something* eventually with all this data.

I wouldn't be surprised if Ben feels he's not being listened to in this 
exchange. And I'm pretty sure Ian would say the same about various 
previous exchanges. Can we collectively try to fix this please?


Of course, Ben and other RDFa enthusiasts would be delighted if browsers 
did innately start doing interesting things with this RDFa-encoded data. 
And re "whether this is through extensions", yes absolutely, browser 
extensions also ought to be able to do nice things with this data. 
Firefox addons, Opera widgets, Ubiquity scripts, ... are all ways of 
exploring future browser designs. The better ideas may find their way 
slowly into the core UI we all expect of a Web browser.

But it seems the "talking past each other" effect comes in strongly 
here. As I re-read the above excerpt, I don't see a conversation.

Ian's reply is couched in terms of _user agents_. Often enough people 
use this as a near synonym for 'browser', but taken broadly it can 
include other agents that act in service to end users. By listing 
"through search engines processing RDFa data" it's clear Ian is using 
this broader sense. The distinction is also mentioned in the HTML5 spec:
http://www.w3.org/TR/html5/interactive-elements.html#requirements-for-interactive-user-agents 


So yes, when search engine "(non-interactive) user agents" (like Yahoo 
SearchMonkey / BOSS) start doing things with this data, RDFa enthusiasts 
are pleased. But the makers of those search systems are not being 
requested to do anything; they chose to. Interactive user agents 
(roughly, browsers) are in a somewhat different situation, since their 
software stands between 3rd party .js and many kinds of potential 
in-page or in-browser features. Which is why the question of proper DOM 
support is so important to everyone here.

A few questions.

What's the minimum needed from browser makers before others can do 
innovative things using RDFa triples parsed and consumed within the 
browser environment?

What checks can we make to ensure we're making it easier rather than 
harder for browser makers and others working within the browser UI to 
exploit RDFa efficiently?

Is there anything beyond "can parse the current document into triples 
from JavaScript" that is necessary or very, very useful for RDFa in the 
browser environment? Any aspects that relate directly to user 
experience, such as speed of parsing, or consistency of behaviour after 
DOM updates?

Can the advanced facilities in HTML5 (e.g. SQL persistence) be usefully 
combined with RDFa usage scenarios? For example, can we load/store/cache 
parsed RDFS/OWL schemas within the browser? Can we use the browser's 
crypto APIs to check the schema hasn't been maliciously interfered with? 
Can we serialize the in-page RDFa triples into the browser's SQL store 
and perform SPARQL queries on it, either (i) within the SQL environment 
through query rewriting or (ii) using in-memory .js SPARQL 
implementations?
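
As a rough illustration of option (i), here is a minimal sketch of 
query rewriting over a triples table. It makes several assumptions: 
Python's sqlite3 module stands in for the browser's SQL store, the 
table layout (one "triples" table with s, p, o columns), the function 
name bgp_to_sql and the sample data are all hypothetical, and only the 
simplest part of SPARQL is covered (a list of triple patterns with ?x 
style variables, no FILTER, OPTIONAL or datatypes).

```python
import sqlite3


def bgp_to_sql(patterns):
    """Rewrite a list of (s, p, o) patterns into one SQL self-join.

    Terms starting with '?' are variables; anything else is a constant.
    Returns (sql, params, variable_names).
    """
    columns = ("s", "p", "o")
    first_seen = {}  # variable -> "tN.col" where it first appeared
    where, params = [], []
    for i, pattern in enumerate(patterns):
        for col, term in zip(columns, pattern):
            ref = f"t{i}.{col}"
            if not term.startswith("?"):
                # Constant: compare against a bound parameter.
                where.append(f"{ref} = ?")
                params.append(term)
            elif term in first_seen:
                # Repeated variable: becomes a join condition.
                where.append(f"{ref} = {first_seen[term]}")
            else:
                first_seen[term] = ref
    tables = ", ".join(f"triples AS t{i}" for i in range(len(patterns)))
    select = ", ".join(first_seen.values()) or "1"
    sql = f"SELECT {select} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params, list(first_seen)


# Hypothetical triples, as they might look after parsing a page's RDFa.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
db.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("#alice", "rdf:type", "foaf:Person"),
    ("#alice", "foaf:name", "Alice"),
    ("#bob", "rdf:type", "foaf:Person"),
    ("#bob", "foaf:name", "Bob"),
    ("#climbers", "rdf:type", "foaf:Group"),
])

# Roughly: SELECT ?who ?name WHERE { ?who a foaf:Person ; foaf:name ?name }
sql, params, variables = bgp_to_sql([
    ("?who", "rdf:type", "foaf:Person"),
    ("?who", "foaf:name", "?name"),
])
rows = sorted(db.execute(sql, params).fetchall())
print(variables, rows)
```

Each triple pattern becomes one alias of the same table, and a variable 
shared between patterns becomes an equality between two aliases. Whether 
this performs acceptably inside a browser is exactly the kind of thing 
that would need testing.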

I hope as always we can understand disputes as at least partially 
grounded in miscommunication. I'd also like to see a bit more 
collaborative hacking, to see if we can move beyond the current tone of 
"what's the minimum we need to agree so that we can stop talking to each 
other". But I suspect terminology is the first problem.

So...

Ian - can you suggest terminology that will keep these conversations 
mutually intelligible? Does the term "Web browser" basically cover the 
HTML5 notion of "interactive user agent" here? And can "non-interactive 
user agent" cover scenarios like search engines indexing and 
re-presenting HTML5/RDFa in ways that require nothing beyond basic HTML 
from user's browsers?

cheers,

Dan

--
http://danbri.org/

Received on Tuesday, 17 February 2009 09:45:14 UTC