Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation

Hi Martin,

On 25.06.2009, at 17:44, Martin Hepp (UniBW) wrote:

> Hi all:
>
> After about two months of helping people generate RDF/XML metadata  
> for their businesses using the GoodRelations annotator [1],
> I have quite some evidence that the current best practices of  
> using .htaccess are a MAJOR bottleneck for the adoption of Semantic  
> Web technology.
>
> Just some data:
> - We have several hundred entries in the annotator log - most people  
> spend 10 or more minutes to create a reasonable description of  
> themselves.
> - Even though they all operate some sort of Web site, fewer than
> 30% of them manage to upload/publish a single *.rdf file in their
> root directory.
> - Of those 30%, only a fraction manage to set up content negotiation  
> properly, even though we provide a step-by-step recipe.

These are interesting statistics. Maybe you want to blog about them or
publish them in some other way?

> The effects are
> - URIs that are not dereferenceable,
> - incorrect media types, and
> - other problems.
>
> When investigating the causes and trying to help people, we  
> encountered a variety of configurations and causes that we did not  
> expect. It turned out that helping people just manage this tiny
> step of publishing Semantic Web data would turn into a full-time
> job for 1-2 administrators.
>
> Typical causes of problems are
> - Lack of privileges for .htaccess (many cheap hosting packages give  
> limited or no access to .htaccess)
> - Users without a Unix background had trouble naming a file so that
> it begins with a dot
> - Microsoft IIS requires completely different recipes
> - Many users have access just at a CMS level
>
> Bottom line:
> - For researchers in the field, it is a doable task to set up an  
> Apache server so that it serves RDF content according to current  
> best practices.
> - For most people out there in reality, this is regularly a
> prohibitively difficult task, both because of a lack of skills and
> because the variety of technical environments turns what is easy at
> the textbook level into an engineering challenge.

For the cases where people still want to serve RDF documents, it would
be neat if the various CMSes had a simple way of handling content
negotiation. What I'm thinking of is e.g. a module for Drupal which
would allow the Drupal admin to specify that, if RDF/XML is requested
for node X (a page), RDF document Y is served instead. The content
negotiation would be handled by PHP code in the module, hence no
fiddling with .htaccess required.
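
A minimal sketch of that idea, in Python rather than PHP purely for
illustration (a real Drupal module would of course be PHP). The node
path, the RDF file name and all function names below are made up for
the example, and the Accept parsing is deliberately simplified: it
ignores media type parameters other than q.

```python
# Sketch of application-level content negotiation (no .htaccess involved).
# All names and paths here are hypothetical, chosen only for illustration.

AVAILABLE = ["text/html", "application/rdf+xml"]  # server preference order
RDF_DOCUMENTS = {"node/42": "node/42.rdf"}        # admin-defined mapping: node X -> RDF document Y


def parse_accept(header):
    """Map each media range in an Accept header to its q-value."""
    prefs = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs[media_range] = q
    return prefs


def quality_for(media_type, prefs):
    """q-value the client assigns to media_type; the most specific range wins."""
    main_type = media_type.split("/")[0]
    for candidate in (media_type, main_type + "/*", "*/*"):
        if candidate in prefs:
            return prefs[candidate]
    return 0.0


def choose_representation(accept_header, available):
    """Return the acceptable type with the highest q, or None if there is none."""
    if not accept_header.strip():
        return available[0]  # no Accept header: serve the server's default
    prefs = parse_accept(accept_header)
    best, best_q = None, 0.0
    for media_type in available:
        q = quality_for(media_type, prefs)
        if q > best_q:  # strict ">" keeps server order on ties
            best, best_q = media_type, q
    return best


def document_for(node, accept_header):
    """Decide which document and media type to serve for a node."""
    chosen = choose_representation(accept_header, AVAILABLE)
    if chosen == "application/rdf+xml" and node in RDF_DOCUMENTS:
        return RDF_DOCUMENTS[node], chosen
    # Simplification: fall back to the HTML page instead of answering 406.
    return node, "text/html"
```

With this, document_for("node/42", "application/rdf+xml") would hand
back the RDF document, while a typical browser Accept header would
still get the HTML page.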

> As a consequence, we will modify our tool so that it generates  
> "dummy" RDFa code with span/div that *just* represents the meta-data  
> without interfering with the presentation layer.
> That can then be inserted as code snippets via copy-and-paste to any  
> XHTML document.

I like it! It's similar to what our Shift tool [2] does for other  
kinds of data. However, this might lead to other problems: many CMSes  
only allow a subset of HTML in their input forms, so some of the RDFa  
could get lost. I remember this was a problem with Blogger in the past  
(not sure if this problem persists).
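
To illustrate what I mean, here is a rough sketch of the kind of
whitelist filter a CMS input form might apply. The allowed tag and
attribute lists are invented for the example (I don't know what any
particular CMS actually allows), and the RDFa snippet is just an
illustrative fragment. The point is only that a filter which does not
know the RDFa attributes will silently drop them.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; real CMSes differ and may be stricter or laxer.
ALLOWED_TAGS = {"p", "a", "b", "i", "em", "strong", "span", "div"}
ALLOWED_ATTRS = {"href", "title"}  # note: no RDFa attributes in here

# RDFa attributes that are not also ordinary HTML attributes.
RDFA_ATTRS = {"about", "property", "rel", "rev", "resource",
              "typeof", "content", "datatype"}


class WhitelistFilter(HTMLParser):
    """Re-emit only whitelisted tags and attributes, recording dropped RDFa."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []      # pieces of the filtered markup
        self.dropped = []  # RDFa attribute names that were removed

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name in ALLOWED_ATTRS:
                kept.append(' %s="%s"' % (name, escape(value or "", quote=True)))
            elif name in RDFA_ATTRS:
                self.dropped.append(name)
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(fragment):
    """Return (filtered_markup, dropped_rdfa_attribute_names)."""
    parser = WhitelistFilter()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out), parser.dropped


# Illustrative RDFa fragment of the "dummy span/div" kind discussed above.
SNIPPET = ('<div about="#company" typeof="gr:BusinessEntity">'
           '<span property="gr:legalName">Example Ltd.</span></div>')
```

Running sanitize(SNIPPET) leaves the visible text and the div/span
structure intact but removes every attribute that carried the
metadata, and nothing on the page tells the user it happened.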

Cheers,
Knud

[2] http://kantenwerk.org/shift/

>
>
> Any opinions?
>
> Best
> Martin
>
> [1]  http://www.ebusiness-unibw.org/tools/goodrelations-annotator/
>
> Danny Ayers wrote:
>> Thank you for the excellent questions, Bill.
>>
>> Right now IMHO the best bet is probably just to pick whichever format
>> you are most comfortable with (yup "it depends") and use that as the
>> single source, transforming perhaps with scripts to generate the
>> alternate representations for conneg.
>>
>> As far as I'm aware we don't yet have an easy templating engine for
>> RDFa, so I suspect having that as the source is probably a good  
>> choice
>> for typical Web applications.
>>
>> As mentioned already GRDDL is available for transforming on the fly,
>> though I'm not sure of the level of client engine support at present.
>> Ditto providing a SPARQL endpoint is another way of maximising the
>> surface area of the data.
>>
>> But the key step has clearly been taken, that decision to publish  
>> data
>> directly without needing the human element to interpret it.
>>
>> I claim *win* for the Semantic Web, even if it'll still be a few  
>> years
>> before we see applications exploiting it in a way that provides real
>> benefit for the end user.
>>
>> my 2 cents.
>>
>> Cheers,
>> Danny.
>>
>>
>>
>
> -- 
> --------------------------------------------------------------
> martin hepp
> e-business & web science research group
> universitaet der bundeswehr muenchen
>
> e-mail:  mhepp@computer.org
> phone:   +49-(0)89-6004-4217
> fax:     +49-(0)89-6004-4620
> www:     http://www.unibw.de/ebusiness/ (group)
>        http://www.heppnetz.de/ (personal)
> skype:   mfhepp twitter: mfhepp
>
> Check out the GoodRelations vocabulary for E-Commerce on the Web of  
> Data!
> ========================================================================
>
> Webcast:
> http://www.heppnetz.de/projects/goodrelations/webcast/
>
> Talk at the Semantic Technology Conference 2009: "Semantic Web-based  
> E-Commerce: The GoodRelations Ontology"
> http://tinyurl.com/semtech-hepp
>
> Tool for registering your business:
> http://www.ebusiness-unibw.org/tools/goodrelations-annotator/
>
> Overview article on Semantic Universe:
> http://tinyurl.com/goodrelations-universe
>
> Project page and resources for developers:
> http://purl.org/goodrelations/
>
> Tutorial materials:
> Tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day: A  
> Hands-on Introduction to the GoodRelations Ontology, RDFa, and  
> Yahoo! SearchMonkey
>
> http://www.ebusiness-unibw.org/wiki/GoodRelations_Tutorial_ESWC2009
>

-------------------------------------------------
Knud Möller, MA
+353 - 91 - 495086
Smile Group: http://smile.deri.ie
Digital Enterprise Research Institute
   National University of Ireland, Galway
Institiúid Taighde na Fiontraíochta Digití
   Ollscoil na hÉireann, Gaillimh

Received on Thursday, 25 June 2009 17:10:11 UTC