Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation

+cc: Norm Walsh

On 25/6/09 19:39, Juan Sequeda wrote:
> So... then from what I understand.. why bother with content negotiation,
> right?
>
> Just do everything in RDFa, right?
>
> We are planning to deploy soon the linked data version of Turn2Live.com.
> And we are in the discussion of doing the content negotiation (a la
> BBC). But if we can KISS, then all we should do is RDFa, right?

Does every major RDF toolkit have an integrated RDFa parser already?

And yep, the conneg idiom isn't mandatory. You can use # URIs, at least 
if the .rdf extension is mapped to the application/rdf+xml MIME type. I 
believe that's in the Apache defaults now. At least checking here, a 
fresh Ubuntu installation has 
"application/rdf+xml                             rdf" in 
/etc/mime.types (a file which I think comes via Apache, but I'm not 
100% sure).
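As a minimal sketch of what that mapping means in practice, here's how a 
mime.types-style line can be parsed into an extension-to-type table (the 
sample lines are illustrative, in the same format as /etc/mime.types):

```python
# Minimal sketch: parse mime.types-style lines into an extension -> MIME type map.
# Assumes the simple "type ext1 ext2 ..." format used by /etc/mime.types.

def parse_mime_types(lines):
    """Map file extensions to MIME types, skipping comments and blank lines."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        mime_type, extensions = parts[0], parts[1:]
        for ext in extensions:
            mapping[ext] = mime_type
    return mapping

sample = [
    "# MIME type mappings (excerpt)",
    "application/rdf+xml                             rdf",
    "text/turtle                                     ttl",
]
print(parse_mime_types(sample)["rdf"])  # application/rdf+xml
```

With that mapping in place, Apache serves foo.rdf with the right 
Content-Type and no conneg configuration is needed for # URIs.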

But yes - this is a major problem and headache. Not just around the 
conneg piece, but in general. I've seen similar results to those 
reported here with "write yourself a FOAF file" exercises. Even if 
people use Leigh Dodds's handy foaf-a-matic webform to author a file, 
at the end of the session they are left with a piece of RDF/XML in 
their hands and an instruction to "upload it to their site". Even 
people with blogs and Facebook profiles and Twitter accounts etc. can 
find this daunting. And not many people know what FTP is (or was).

My suggestion here is that we look into something like OAuth for 
delegating permission tokens for uploading files. OAuth is a protocol 
that uses a Web/HTML flow to let site A request that some user of site 
B allow it to perform certain constrained tasks on site B. The 
canonical example is "site A (a printing company) wants to see 
non-public photos on site B (a photo-sharing site)". I believe this 
model works well for writing/publishing, as well as for mediating 
information access.

If site A is an RDF-generating site, and site B is a generic hosting 
site, then the idea is that we write or find a generic OAuth-enabled 
utility that B could use, such that the users of site B could give sites 
like A permission to publish documents automatically. At a protocol 
level, I would expect this to use AtomPub but it could also be WebDAV or 
another mechanism.
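To make the AtomPub leg concrete, here's a hedged sketch of what site A's 
publish step could look like: wrap generated RDFa/HTML in an Atom entry 
and POST it to site B's collection with an OAuth-derived token in the 
Authorization header. The collection URL and token below are placeholders 
I've invented for illustration, not a real service:

```python
# Hedged sketch: publish an Atom entry carrying RDFa/HTML content to a
# hypothetical AtomPub collection, authorising with an OAuth-derived token.
# The endpoint URL and "placeholder-token" below are invented placeholders.
import urllib.request
from xml.sax.saxutils import escape

def build_atom_entry(title, html_content):
    """Serialise a minimal Atom entry whose content is escaped HTML/RDFa."""
    return (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<entry xmlns="http://www.w3.org/2005/Atom">'
        f"<title>{escape(title)}</title>"
        f'<content type="html">{escape(html_content)}</content>'
        "</entry>"
    )

def build_publish_request(collection_url, token, entry_xml):
    """Prepare (but do not send) the AtomPub POST with an Authorization header."""
    return urllib.request.Request(
        collection_url,
        data=entry_xml.encode("utf-8"),
        headers={
            "Content-Type": "application/atom+xml;type=entry",
            # In a real deployment this would be a signed OAuth request,
            # not a bare token; simplified here.
            "Authorization": "OAuth " + token,
        },
        method="POST",
    )

entry = build_atom_entry("Music profile", '<div typeof="foaf:Person">...</div>')
req = build_publish_request("http://alice.example.com/atompub/musicprofile",
                            "placeholder-token", entry)
```

The point is just that the publishing side is a plain HTTP POST once the 
token exists; all the interesting machinery is in the OAuth grant.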

But how to get all those sites to implement such a thing? Well firstly, 
this isn't limited to FOAF. Or to any flavour of RDF. I think there is a 
strong story for why this will happen eventually. Strong because there 
are clear benefits for many of the actors:

* a data-portability and user control story: I don't want all my music 
profile info to be on last.fm; I want last.fm to maintain 
http://danbri.org/music for me.
* a benefits-the-data-source story: I'm sure the marketing teams of 
various startups would be very happy at the ability to directly push 
content into 1000s of end-user sites. For the Google/link karma, 
traffic etc.
* a benefits-the-hosts story: rather than having users share their FTP 
passwords, they share task-specific tokens that can be managed and 
revoked on a finer-grained basis

So a sample flow might be:

1. User Alice is logged into her blog, which is now AtomPub+OAuth enabled.
2. She clicks on a link somewhere for "generate a FOAF file from your 
music interests", which takes her to a site that asks some basic 
information (name, homepage) and about some music-related sites she uses.
3. That site's FOAF generator site scans her public last.fm profile 
(after asking her username), and then does the same for her Myspace and 
YouTube profiles.
4. It then says "OK, generated music profile! May we publish this to 
your site?" It then scans her homesite, blog etc. via some 
auto-discovery protocol(s), to see which of them have a writable 
AtomPub + OAuth endpoint. It finds her wordpress blog supports this.
5. Alice is bounced to an OAuth permissioning page on her blog, which 
says something like:
	"The Music Profile site at example.com  would like to
	have read and write permission for an area of your site: 
once/always/never or for 6 months?"
6. Alice gives permission for 6 months. Some computer stuff happens in 
the background, and the Music site is given a token it can use to post 
data to Alice's site.
7. http://alice.example.com/blog/musicprofile then becomes a page (or 
mini-blog or activity stream) maintained entirely, or partially, by the 
remote site using RDFa markup sent as AtomPub blog entries, or maybe as 
AtomPub attachments.
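The auto-discovery in step 4 could work much like existing feed 
discovery: scan the page's head for a link to an AtomPub service 
document. A sketch, assuming the blog advertises it with rel="service" 
and the application/atomsvc+xml type (the sample HTML is invented; real 
blogs vary in how, or whether, they expose this):

```python
# Hedged sketch of step 4's auto-discovery: scan a page for <link> elements
# pointing at an AtomPub service document (rel="service",
# type="application/atomsvc+xml"). The HTML below is illustrative only.
from html.parser import HTMLParser

class ServiceLinkFinder(HTMLParser):
    """Collect hrefs of <link> elements advertising an AtomPub service doc."""
    def __init__(self):
        super().__init__()
        self.endpoints = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if (tag == "link" and a.get("rel") == "service"
                and a.get("type") == "application/atomsvc+xml"):
            self.endpoints.append(a.get("href"))

page = """<html><head>
<link rel="service" type="application/atomsvc+xml"
      href="http://alice.example.com/blog/atompub/service"/>
</head><body>Alice's blog</body></html>"""

finder = ServiceLinkFinder()
finder.feed(page)
print(finder.endpoints)  # the writable endpoint(s), if any
```

If no such link is found, the FOAF-generator site would simply fall back 
to "here's your file, upload it yourself", i.e. the status quo.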

OK, I'm glossing over some details here, such as configuration, choice 
of URIs etc. I may be over-simplifying some OAuth aspects, and 
forgetting details of what's possible. But I think there is real 
potential in this sort of model, and would like a sanity check on that!

There's also the detail of whether different sites could/would write to 
the same space or feed, and the question of how we can use this as a 
page-publishing model instead of a blog-entry publishing model.

I've written about this before, see 
http://markmail.org/message/gplslpe2k2zjuliq

Re prototyping ...

There is some wordpress+atompub code around, see 
http://74.125.77.132/search?q=cache:KcriYA9UohcJ:singpolyma.net/2008/05/atompub-oauth-for-wordpress/+wordpress+oauth&cd=3&hl=en&ct=clnk&client=firefox-a

Also I just found http://rollerweblogger.org/roller/entry/oauth_for_roller

And in http://danbri.org/words/2008/10/15/380 there is a pointer to 
some Google announcements from last year.


I hope someone on these lists will find the OAuth/AtomPub combination 
interesting enough to explore (and report back on). I've also Cc:'d Norm 
Walsh here who knows AtomPub inside out now.

Is this making sense to anyone but me? :)

cheers,

Dan

Received on Friday, 26 June 2009 07:36:14 UTC