Re: RDFa Use Cases from Ian Hickson on 2009-02-17 (public-rdf-in-xhtml-tf@w3.org from February 2009)

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 17 Feb 2009 22:22:08 +0000 (UTC)
To: Manu Sporny <msporny@digitalbazaar.com>
Cc: RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>, Sam Ruby <rubys@intertwingly.net>, Dan Brickley <danbri@danbri.org>, Dan Connolly <connolly@w3.org>
Message-ID: <Pine.LNX.4.62.0902172211580.6186@hixie.dreamhostps.com>

On Tue, 17 Feb 2009, Manu Sporny wrote:
> 
> The exact code snippets that are used have been added, does this help to 
> explain how RDFa is applied to solve the problem? If not, what is 
> missing?
> 
> http://rdfa.info/wiki/rdfa-use-cases#Pseudo-code_Example_Using_Markup

Wow, that's awesome, yes. I am very impressed with the detail you are 
providing here, that's really great.

> > My understanding was that people wanted RDF data to be persisted 
> > across multiple sessions, which would lead to bad data "poisoning the 
> > well" in a way that no other feature in Web browsers has yet had to 
> > deal with.
> 
> Some people do, some don't. I think we should assume that the RDF triple 
> store may be more akin to the browser cache (can be cleared on a whim) 
> than to a traditional database (clearing the data is bad).

If we allow any persistence without some solution to the trust/spam 
problem, the store will quickly become useless (in the same way that the 
various features to open a new window quickly became useless once sites 
found ways to use them for doing popup ads).

This is one example of what I meant by having to evaluate each use case, 
by the way. If we decide that "RDFa" means "a per-tab triple store with a 
lifetime equal to the page and that is not affected by cross-origin 
iframes", then that wouldn't address the "collect lots of data and then 
query it" use case, despite still being "RDFa". It is IMHO important for 
the RDFa community to agree on exactly which use cases are the ones that 
are intended to be addressed, so that we can make sure that what we come 
up with actually does address exactly those cases. (Is there documentation 
anywhere on what the existing RDFa specification is attempting to solve 
along these lines? e.g. what is the storage semantic for the current RDFa 
specification? Does it have persistence? How does it deal with 
cross-origin data load?)

> > (Search engine vendors spend gigantic amounts of resources dealing 
> > with this problem with today's HTML -- if the goal is to have the same 
> > kind of processing happening client-side, then the technology needs to 
> > be resilient to this kind of thing or else it will just collapse under 
> > its own weight.)
> 
> The solution to the "poisoning the well" problem seems to be to use 
> digital signatures to verify data that goes into any particular "well".

If we're just partitioning data stores on a per-origin basis, then there's 
no need for signatures at all; we can just use the existing origin data. 
The question is whether that is enough.

(This still doesn't address the problem of sites like Wikipedia or blogs 
that accept input from multiple users, though.)
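
To make the per-origin idea concrete, here is a minimal sketch of a triple 
store partitioned by origin. It is illustrative only and not taken from any 
RDFa specification or browser: the names origin_of and 
OriginPartitionedStore are hypothetical, and the origin is assumed to be 
the usual (scheme, host, port) tuple.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Default ports, so "https://example.org" and "https://example.org:443"
# map to the same origin.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url):
    """Return a (scheme, host, port) tuple for the given URL (hypothetical helper)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


class OriginPartitionedStore:
    """Hypothetical triple store keeping one isolated set of triples per origin."""

    def __init__(self):
        self._partitions = defaultdict(set)

    def add(self, page_url, triple):
        # A triple is filed under the origin of the page that supplied it.
        self._partitions[origin_of(page_url)].add(triple)

    def query(self, page_url):
        # A page only ever sees triples contributed by its own origin.
        return set(self._partitions.get(origin_of(page_url), set()))

    def clear(self, page_url):
        # Cache-like semantics: a partition can be dropped at any time.
        self._partitions.pop(origin_of(page_url), None)


store = OriginPartitionedStore()
store.add("https://example.org/a", ("ex:alice", "foaf:knows", "ex:bob"))
store.add("https://spam.example/x", ("ex:alice", "foaf:knows", "ex:spammer"))
same_origin_view = store.query("https://example.org:443/b")
other_origin_view = store.query("https://other.example/")
```

In this sketch the triple from spam.example never reaches pages on 
example.org, so no signatures are involved. Everything served from a 
single origin still shares one partition, regardless of which user wrote 
it, so the multi-user case above is not helped.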

> http://rdfa.info/wiki/Developer-faq#How_does_one_prevent_bad_triples_from_corrupting_a_local_triple_store.3F

There needs to be some mechanism for determining what's in the white 
lists. (Black lists wouldn't work since an attacker could just come up 
with an infinite number of alternative site names.)
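
The asymmetry between the two kinds of list can be shown with a small 
sketch. The function names and host names below are made up for 
illustration, and how a white list would actually be populated is exactly 
the open question.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lower-cased host name of a URL (hypothetical helper)."""
    return (urlsplit(url).hostname or "").lower()


def allowed_by_white_list(url, white_list):
    # Only hosts that were explicitly approved are accepted.
    return host_of(url) in white_list


def allowed_by_black_list(url, black_list):
    # Everything is accepted unless it was explicitly rejected.
    return host_of(url) not in black_list


WHITE_LIST = {"trusted.example"}
BLACK_LIST = {"spam1.example"}

# An attacker who has been blocked simply registers a fresh name.
new_attacker_url = "http://spam2.example/triples"

passes_black_list = allowed_by_black_list(new_attacker_url, BLACK_LIST)
passes_white_list = allowed_by_white_list(new_attacker_url, WHITE_LIST)
```

Under these assumptions the fresh host name gets past the black list but 
not the white list, which is why the burden shifts to deciding who 
maintains the white list and how.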

I don't really understand how the digital signature mechanism would work. 
In SSL, the user selects a single site, and the browser can verify that 
that site is who the user thinks it is. It doesn't prevent hostile sites 
that the user intended to go to from interacting with the user. How would 
digital signatures help here? Attackers can sign stuff just like anyone 
else can, no?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 17 February 2009 22:22:47 UTC