Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Fri, 26 Jun 2009 10:04:04 -0400
Message-ID: <4A44D554.7060908@openlinksw.com>
CC: public-lod@w3.org, semantic-web at W3C <semantic-web@w3c.org>
Melvin Carvalho wrote:
> On Thu, Jun 25, 2009 at 6:44 PM, Martin Hepp
> (UniBW)<martin.hepp@ebusiness-unibw.org> wrote:
>> Hi all:
>> After about two months of helping people generate RDF/XML metadata for their
>> businesses using the GoodRelations annotator [1],
>> I have quite some evidence that the current best practices of using
>> .htaccess are a MAJOR bottleneck for the adoption of Semantic Web
>> technology.
>> Just some data:
>> - We have several hundred entries in the annotator log - most people spend
>> 10 or more minutes to create a reasonable description of themselves.
>> - Even though they all operate some sort of Web site, fewer than 30% of
>> them manage to upload/publish a single *.rdf file in their root directory.
>> - Of those 30%, only a fraction manage to set up content negotiation
>> properly, even though we provide a step-by-step recipe.
>> The effects are
>> - URIs that are not dereferenceable,
>> - incorrect media types, and
>> - other problems.
>> When investigating the causes and trying to help people, we encountered a
>> variety of configurations and causes that we did not expect. It turned out
>> that helping people manage just this tiny step of publishing Semantic Web
>> data would turn into a full-time job for 1 - 2 administrators.
>> Typical causes of problems are
>> - Lack of privileges for .htaccess (many cheap hosting packages give limited
>> or no access to .htaccess)
>> - Users without a Unix background had trouble naming a file so that it
>> begins with a dot
>> - Microsoft IIS requires completely different recipes
>> - Many users have access just at a CMS level
>> Bottom line:
>> - For researchers in the field, it is a doable task to set up an Apache
>> server so that it serves RDF content according to current best practices.
>> - For most people out there in reality, this is regularly a prohibitively
>> difficult task, both because of a lack of skills and because the variety of
>> technical environments turns what is easy at the textbook level into an
>> engineering challenge.
>> As a consequence, we will modify our tool so that it generates "dummy" RDFa
>> code with span/div that *just* represents the meta-data without interfering
>> with the presentation layer.
>> That can then be inserted as code snippets via copy-and-paste to any XHTML
>> document.
>> Any opinions?
> Been thinking about this issue for the last 6 months, and I've changed
> my mind a few times.
> Inclined to agree that RDFa is probably the ideal entry point for
> bringing existing businesses onto Good Relations.
> For a read/write web (which is the goal of commerce, right?), you're
> probably back to .htaccess, though, with, say, a controller that will
> manage POSTed SPARUL inserts.
> I think taking it "one step at a time", in this way, seems a sensible
> approach, though as a community, we'll need to put a bit of weight
> behind getting the RDFa tool set up to the state of the art.

.htaccess is a sad and unnecessary technical detail that assumes we have 
an Apache mono-culture, and that said mono-culture is immutable.
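
To make the point concrete: the content negotiation that the .htaccess
recipes configure is just a choice of representation based on the HTTP
Accept header, and nothing in it is specific to Apache. Below is a
minimal, server-agnostic sketch in Python. The function name
pick_media_type and the two offered media types are assumptions made
for illustration, and the Accept parsing is deliberately simplified
(only q-values are honoured), so treat it as a sketch rather than a
complete implementation.

```python
def pick_media_type(accept_header, offered=("text/html", "application/rdf+xml")):
    """Pick the offered media type the client prefers, by q-value.

    Simplified on purpose: media-type parameters other than q are
    ignored. Ties go to the earlier entry in `offered`.
    """
    # A missing or empty Accept header means any representation will do.
    if not accept_header.strip():
        return offered[0]

    prefs = []
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0  # default quality when no q parameter is given
        for f in fields[1:]:
            if f.lower().startswith("q="):
                try:
                    q = float(f[2:])
                except ValueError:
                    q = 0.0  # an unparsable q is treated as "not acceptable"
        prefs.append((media, q))

    def quality(offer):
        # Most specific match wins: exact type, then type/*, then */*.
        major = offer.split("/")[0]
        for pattern in (offer, major + "/*", "*/*"):
            for media, q in prefs:
                if media == pattern:
                    return q
        return 0.0

    best = max(offered, key=quality)
    return best if quality(best) > 0 else None


if __name__ == "__main__":
    # An RDF-aware client asks for RDF/XML first and HTML as a fallback.
    print(pick_media_type("application/rdf+xml, text/html;q=0.5"))
```

With RDFa the question largely disappears, since a single HTML document
serves both browsers and RDF-aware clients.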

For GoodRelations-based product, service, and offering descriptions,
the workflow should be as follows:

1. Describe your products and services using terms from GR (ontology
bound annotators help here irrespective of source and location);
2. Get an HTML doc as output from #1 (with embedded RDFa for the product
and services description data);
3. Optionally, publish doc from #2 to your public Web Server;
4. Optionally, notify the broader Web via pinger services (PTSW,
Sindice, etc.).
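
Steps 1 and 2 can be sketched roughly as follows, in Python. This is
only an illustration of the kind of output an annotator might produce,
not the actual GoodRelations annotator: the function name
business_rdfa and the fragment identifier "company" are hypothetical,
and only gr:BusinessEntity, gr:legalName, and the GoodRelations
namespace are taken from the vocabulary itself.

```python
from html import escape

GR_NS = "http://purl.org/goodrelations/v1#"  # GoodRelations namespace


def business_rdfa(legal_name, fragment_id="company"):
    """Return an XHTML snippet that describes a business in RDFa.

    The function name and fragment_id are hypothetical; the snippet is
    meant to be pasted into an existing XHTML page.
    """
    # escape() also escapes quotes, so the values are safe in attributes.
    return (
        f'<div xmlns:gr="{GR_NS}" about="#{escape(fragment_id)}" '
        f'typeof="gr:BusinessEntity">\n'
        f'  <span property="gr:legalName">{escape(legal_name)}</span>\n'
        f'</div>'
    )


if __name__ == "__main__":
    print(business_rdfa("Example Bakery & Sons"))
```

The resulting snippet goes into the page as-is, so step 3 needs nothing
beyond the ability to publish an HTML document.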

If you couldn't publish docs to your Web Server before you encountered 
GoodRelations, RDFa, and Linked Data, then we are dealing with a totally 
different matter, one that isn't specific to Linked Data deployment.

I think having a third party relay inaccurate opening and closing hours
is a feature re. the GoodRelations, RDFa, Linked Data, and pinger
services combo; it makes the "opportunity cost" of not putting the
RDFa-embellished HTML doc (from #3) on the server palpable :-) Thus, we
end up with a closed loop that simply lets the Web do the REST
(including social and political cajoling re. doc publishing).

>> Best
>> Martin
>> [1]  http://www.ebusiness-unibw.org/tools/goodrelations-annotator/
>> Danny Ayers wrote:
>>> Thank you for the excellent questions, Bill.
>>> Right now IMHO the best bet is probably just to pick whichever format
>>> you are most comfortable with (yup "it depends") and use that as the
>>> single source, transforming perhaps with scripts to generate the
>>> alternate representations for conneg.
>>> As far as I'm aware we don't yet have an easy templating engine for
>>> RDFa, so I suspect having that as the source is probably a good choice
>>> for typical Web applications.
>>> As mentioned already GRDDL is available for transforming on the fly,
>>> though I'm not sure of the level of client engine support at present.
>>> Ditto providing a SPARQL endpoint is another way of maximising the
>>> surface area of the data.
>>> But the key step has clearly been taken, that decision to publish data
>>> directly without needing the human element to interpret it.
>>> I claim *win* for the Semantic Web, even if it'll still be a few years
>>> before we see applications exploiting it in a way that provides real
>>> benefit for the end user.
>>> my 2 cents.
>>> Cheers,
>>> Danny.
>> --
>> --------------------------------------------------------------
>> martin hepp
>> e-business & web science research group
>> universitaet der bundeswehr muenchen
>> e-mail:  mhepp@computer.org
>> phone:   +49-(0)89-6004-4217
>> fax:     +49-(0)89-6004-4620
>> www:     http://www.unibw.de/ebusiness/ (group)
>>        http://www.heppnetz.de/ (personal)
>> skype:   mfhepp twitter: mfhepp
>> Check out the GoodRelations vocabulary for E-Commerce on the Web of Data!
>> ========================================================================
>> Webcast:
>> http://www.heppnetz.de/projects/goodrelations/webcast/
>> Talk at the Semantic Technology Conference 2009: "Semantic Web-based
>> E-Commerce: The GoodRelations Ontology"
>> http://tinyurl.com/semtech-hepp
>> Tool for registering your business:
>> http://www.ebusiness-unibw.org/tools/goodrelations-annotator/
>> Overview article on Semantic Universe:
>> http://tinyurl.com/goodrelations-universe
>> Project page and resources for developers:
>> http://purl.org/goodrelations/
>> Tutorial materials:
>> Tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day: A Hands-on
>> Introduction to the GoodRelations Ontology, RDFa, and Yahoo! SearchMonkey
>> http://www.ebusiness-unibw.org/wiki/GoodRelations_Tutorial_ESWC2009



Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Friday, 26 June 2009 14:04:45 UTC
